Python Profiling - Part 1

I gave a talk on profiling Python code at the 2012 Utah Open Source Conference. Here are the slides and the accompanying code.

There are three parts to this profiling talk:

  • Standard Lib Tools - cProfile, Pstats
  • Third Party Tools - line_profiler, memory_profiler
  • Commercial Tools - New Relic

This is Part 1 of that talk. It covers:

  • cProfile module - usage
  • Pstats module - usage
  • RunSnakeRun - GUI viewer

Why Profiling:

  • Identify the bottlenecks.
  • Optimize intelligently.

In God we trust, everyone else bring data

cProfile:

cProfile is a profiling module included in Python's standard library. It instruments the code and reports the time spent in each function and the number of times each function is called.
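
Besides the command-line usage shown below, cProfile can also be invoked from within a program. Here is a minimal sketch; profile_me is just a placeholder function I made up, not part of the talk's code:

# profile_from_code.py - sketch
    import cProfile

    def profile_me():
        # stand-in for whatever code you actually want to measure
        return sum(i * i for i in range(100000))

    # Run the statement under the profiler and print the report to stdout.
    # Passing a filename as the second argument, e.g.
    # cProfile.run('profile_me()', 'out.prof'), saves the stats to a file
    # for later analysis with pstats.
    cProfile.run('profile_me()')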

Basic Usage:

The sample code I'm profiling finds the lowest common multiple of two numbers (lcm.py).

# lcm.py - ver1
    def lcm(arg1, arg2):
        i = max(arg1, arg2)
        while i < (arg1 * arg2):
            if i % min(arg1, arg2) == 0:
                return i
            i += max(arg1, arg2)
        return arg1 * arg2

    lcm(21498497, 3890120)

Let's run the profiler.

$ python -m cProfile lcm.py 
     7780242 function calls in 4.474 seconds
    
    Ordered by: standard name
   
    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
         1    0.000    0.000    4.474    4.474 lcm.py:3(<module>)
         1    2.713    2.713    4.474    4.474 lcm.py:3(lcm)
   3890120    0.881    0.000    0.881    0.000 {max}
         1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
   3890119    0.880    0.000    0.880    0.000 {min}

Output Columns:

  • ncalls - number of calls to a function.
  • tottime - total time spent in the function without counting calls to sub-functions.
  • percall - tottime/ncalls
  • cumtime - cumulative time spent in a function and its sub-functions.
  • percall - cumtime/ncalls

It's clear from the output that the built-in functions max() and min() are each called nearly four million times. That can be optimized by calling each of them once, saving the results in variables, and using those variables inside the loop instead of calling max() and min() on every iteration, as sketched below.
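
One possible rewrite along those lines (my sketch, not the version from the talk) hoists the min() and max() calls out of the loop:

# lcm.py - ver2 (sketch)
    def lcm(arg1, arg2):
        larger = max(arg1, arg2)     # computed once, outside the loop
        smaller = min(arg1, arg2)
        i = larger
        while i < (arg1 * arg2):
            if i % smaller == 0:
                return i
            i += larger
        return arg1 * arg2

    lcm(21498497, 3890120)

Re-running the profiler on this version should show the {max} and {min} rows drop to a single call each.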

Pstats:

Pstats is also part of the standard library; it is used to analyze profiles saved with the cProfile module.

Usage:

For bigger scripts it's not feasible to analyze the output of the cProfile module on the command line. The solution is to save the profile to a file and use Pstats to analyze it like a database. Example: let's analyze shorten.py.

$ python -m cProfile -o shorten.prof shorten.py   # saves the output to shorten.prof

$ ls
shorten.py shorten.prof

Let's analyze the profiler output to list the top 5 most frequently called functions.

$ python
>>> import pstats
>>> p = pstats.Stats('shorten.prof')   # Load the profiler output
>>> p.sort_stats('calls')              # Sort the results by the ncalls column
>>> p.print_stats(5)                   # Print top 5 items

    95665 function calls (93215 primitive calls) in 2.371 seconds

   Ordered by: call count
   List reduced from 1919 to 5 due to restriction <5>

       ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    10819/10539    0.002    0.000    0.002    0.000 {len}
           9432    0.002    0.000    0.002    0.000 {method 'append' of 'list' objects}
           6061    0.003    0.000    0.003    0.000 {isinstance}
           3092    0.004    0.000    0.005    0.000 /lib/python2.7/sre_parse.py:182(__next)
           2617    0.001    0.000    0.001    0.000 {method 'endswith' of 'str' objects}
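
Pstats has a few other handy methods beyond the ones above. A short sketch, assuming the same shorten.prof session:

>>> p.strip_dirs()              # drop the long directory prefixes from file names
>>> p.sort_stats('cumulative')  # sort by cumtime instead of call count
>>> p.print_stats(5)            # top 5 functions by cumulative time
>>> p.print_callers(5)          # show who calls those top functions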

This is quite tedious and not a lot of fun. Let's introduce a GUI so we can easily drill down.

RunSnakeRun:

This cleverly named GUI, written in wxPython, makes life a lot easier.

Install it from PyPI (it requires wxPython):

$ pip install SquareMap RunSnakeRun
$ runsnake shorten.prof     # load the profile in the GUI

The output is displayed as a SquareMap that clearly highlights the bigger pieces of the pie that are worth optimizing.

It also lets you sort by clicking the columns or drill down by double-clicking on a piece of the SquareMap.

Conclusion:

That concludes Part 1 of the profiling series. All the tools except RunSnakeRun are available as part of the standard library. It is essential to profile the code before we start shooting in the dark in the hope of optimizing it.

We'll look at line_profiler and memory_profiler in Part 2. Stay tuned.

You are welcome to follow me on twitter (@amjithr).