Launching pgcli

I've been developing pgcli for a few months now. 

It is now finally live at http://pgcli.com

It all started when Jonathan Slenders sent me a link to his side project, python-prompt-toolkit.

I started playing around with it to write some toy programs. Then I wrote a tutorial on getting started with prompt_toolkit: https://github.com/jonathanslenders/python-prompt-toolkit/tree/master/examples/tutorial

Finally I started writing something more substantial to scratch my own itch. I was dealing with Postgres databases a lot at that time. The default Postgres client, 'psql', is a great tool, but it lacked auto-completion as I typed and it was quite bland (no syntax highlighting). So I decided to take this as my opportunity to write an alternative.

Thus the creatively named project 'pgcli' was born.

Details about pgcli.com:

It is built using Pelican, a static site generator written in Python.

It is hosted on GitHub Pages.

The content is written in reStructuredText.

Inspiration:

The design inspiration for the tool comes from my favorite Python interpreter, bpython.

Goodbye Utah

The time has come to part ways. I'm leaving Utah to move to Portland on May 25th, 2012. This July would have marked the 10-year anniversary of my living in Utah. I can't believe I've spent one third of my life so far in Utah. Some of the best memories of my life were formed here.

True Love: 

I met my beautiful wife here in Utah. She kicked my ass in TaeKwonDo, I asked her out, and the rest is history. Happily married for four years, with a baby on the way.

First Job:

My first real programming job was with Delcam USA. I still have my first paystub from Delcam. Tom, my boss at Delcam, is still the best boss I've had so far.

Higher Education:

University of Utah. So many memories, so many sleepless nights at the computer lab. I still get nostalgic when I walk through the campus.

Parksvan:

Eight clueless kids from India got together to share accommodation while going to school and ended up sharing the best parts of our lives. Although we have all parted ways since our college days, I can't help but feel they are part of my family.

Aikido:

When I went to my first class, I thought I was going there to give my wife some company. Four years and 6 belts later, it has become a dominating force in my life. I never knew getting thrown around was the way to make new friends. :)

Along the way I discovered the wonderful art of Iaido. I call it the art of playing with Japanese swords.

Python:

I couldn't believe there was a group of people who met every month to geek out about their favorite language. My heartfelt thanks to herlo, mharrison, smcquay, travis and seth. You guys welcomed me into the group and helped me shape my future in the world of Python.

Outdoor:

I took up mountain biking, hiking and snowboarding, and I've loved every minute of it. I will miss the Rocky Mountains, the Uintas and the snowy hills.

Leaving all of this behind makes me sad. Utah made me into what I am today.

But now I'm headed to Portland to work for New Relic. I'm told Oregon is a wonderful place, but I'll always have fond memories of Utah wherever I go. 

Python Profiling - Part 1

I gave a talk on profiling Python code at the 2012 Utah Open Source Conference. Here are the slides and the accompanying code.

There are three parts to this profiling talk:

  • Standard Lib Tools - cProfile, pstats
  • Third Party Tools - line_profiler, memory_profiler
  • Commercial Tools - New Relic

This is Part 1 of that talk. It covers:

  • cProfile module - usage
  • pstats module - usage
  • RunSnakeRun - GUI viewer

Why Profiling:

  • Identify the bottlenecks.
  • Optimize intelligently. 

In God we trust, everyone else bring data

cProfile:

cProfile is a profiling module included in Python's standard library. It instruments the code and reports how long each function takes to run and how many times each function is called.

Basic Usage:

The sample code I'm profiling finds the least common multiple of two numbers (lcm.py).

    # lcm.py - ver1
    def lcm(arg1, arg2):
        i = max(arg1, arg2)
        while i < (arg1 * arg2):
            if i % min(arg1,arg2) == 0:
                return i
            i += max(arg1,arg2)
        return(arg1 * arg2)

    lcm(21498497, 3890120)

Let's run the profiler.

$ python -m cProfile lcm.py 
     7780242 function calls in 4.474 seconds
    
    Ordered by: standard name
   
    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
         1    0.000    0.000    4.474    4.474 lcm.py:3(<module>)
         1    2.713    2.713    4.474    4.474 lcm.py:3(lcm)
   3890120    0.881    0.000    0.881    0.000 {max}
         1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
   3890119    0.880    0.000    0.880    0.000 {min}

Output Columns:

  • ncalls - number of calls to a function.
  • tottime - total time spent in the function without counting calls to sub-functions.
  • percall - tottime/ncalls
  • cumtime - cumulative time spent in a function and its sub-functions.
  • percall - cumtime/ncalls
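To see how these columns relate, here is a small sketch (written for Python 3) that profiles a function in-process and reads the raw numbers back. The functions inner() and outer() are made up for illustration, and the tuple unpacking assumes the layout that pstats keeps in its stats dictionary, so treat it as a peek under the hood rather than a stable interface.

```python
import cProfile
import pstats


def inner(n):
    # Does the real work, so the time shows up in its own tottime.
    total = 0
    for i in range(n):
        total += i * i
    return total


def outer(n):
    # Mostly waits on inner(), so its cumtime is larger than its tottime.
    return inner(n) + inner(n) + inner(n)


profiler = cProfile.Profile()
profiler.runcall(outer, 10000)

# Each entry maps (filename, lineno, function name) to
# (primitive calls, ncalls, tottime, cumtime, callers).
stats = pstats.Stats(profiler).stats
rows = {}
for (_file, _line, name), (_prim, ncalls, tottime, cumtime, _callers) in stats.items():
    rows[name] = {"ncalls": ncalls, "tottime": tottime, "cumtime": cumtime}

for name in ("outer", "inner"):
    print(name, rows[name])
```

outer() is called once and inner() three times, and outer()'s cumtime includes the time spent inside inner().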

It's clear from the output that the built-in functions max() and min() are each called nearly 3.9 million times. That could be avoided by saving their results in variables instead of calling them on every iteration.
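One way to apply that, as a sketch rather than the code from the talk, is to compute max() and min() once, before the loop. The names larger, smaller and product are mine.

```python
# lcm.py - ver2 (a sketch, not the code from the talk)
def lcm(arg1, arg2):
    # Call max() and min() once instead of once per loop iteration.
    larger = max(arg1, arg2)
    smaller = min(arg1, arg2)
    product = arg1 * arg2
    i = larger
    while i < product:
        if i % smaller == 0:
            return i
        i += larger
    return product


print(lcm(4, 6))
```

The loop itself is unchanged, so the number of iterations stays the same; only the repeated built-in calls go away. For real use with positive integers, arg1 * arg2 // math.gcd(arg1, arg2) gives the same answer without looping at all.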

    pstats:

    pstats, also included in the standard library, is used to analyze profiles saved by the cProfile module.

    Usage:

    For bigger scripts it's not feasible to analyze the output of the cProfile module on the command line. The solution is to save the profile to a file and use pstats to analyze it like a database. Example: let's analyze shorten.py.

    $ python -m cProfile -o shorten.prof shorten.py   # saves the output to shorten.prof
    
    $ ls
    shorten.py shorten.prof

    Let's analyze the profiler output to list the top 5 frequently called functions.

    $ python 
    >>> import pstats
    >>> p = pstats.Stats('shorten.prof')   # Load the profiler output
    >>> p.sort_stats('calls')              # Sort the results by the ncalls column
    >>> p.print_stats(5)                   # Print top 5 items
    
        95665 function calls (93215 primitive calls) in 2.371 seconds
        
       Ordered by: call count
       List reduced from 1919 to 5 due to restriction <5>
        
           ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        10819/10539    0.002    0.000    0.002    0.000 {len}
               9432    0.002    0.000    0.002    0.000 {method 'append' of 'list' objects}
               6061    0.003    0.000    0.003    0.000 {isinstance}
               3092    0.004    0.000    0.005    0.000 /lib/python2.7/sre_parse.py:182(__next)
               2617    0.001    0.000    0.001    0.000 {method 'endswith' of 'str' objects}
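The same save-then-analyze workflow can also be driven from a script. This is a minimal Python 3 sketch: busy_work() is a made-up workload standing in for shorten.py, and the profile goes to a temporary file instead of shorten.prof.

```python
import cProfile
import io
import os
import pstats
import tempfile


def busy_work(n):
    # Hypothetical workload standing in for shorten.py.
    items = []
    for i in range(n):
        items.append(str(i))
    return len("".join(items))


# Save the profile to a file, like `python -m cProfile -o` does.
prof_path = os.path.join(tempfile.mkdtemp(), "busy_work.prof")
profiler = cProfile.Profile()
profiler.runcall(busy_work, 1000)
profiler.dump_stats(prof_path)

# Load the file, sort by call count and print the top 5 entries.
report = io.StringIO()
stats = pstats.Stats(prof_path, stream=report)
stats.sort_stats("calls").print_stats(5)
print(report.getvalue())
```

Passing a stream to pstats.Stats keeps the report in a string, which is handy if you want to log it or search it instead of reading it in a terminal.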

    This is quite tedious and not a lot of fun. Let's introduce a GUI so we can easily drill down.

    RunSnakeRun:

    This cleverly named GUI, written in wxPython, makes life a lot easier.

    Install it from PyPI (requires wxPython):

    $ pip install SquareMap RunSnakeRun
    $ runsnake shorten.prof     #load the profile using GUI

    The output is displayed as a square map that clearly highlights the bigger pieces of the pie that are worth optimizing.

    It also lets you sort by clicking the columns or drill down by double-clicking on a piece of the square map.

    Conclusion:

    That concludes Part 1 of the profiling series. All the tools except RunSnakeRun are available as part of the standard library. It is essential to measure the code first, rather than shooting in the dark in the hope of optimizing it.

    We'll look at line_profiler and memory_profiler in Part 2. Stay tuned.

    You are welcome to follow me on Twitter (@amjithr).

    PyCon 2012 Review

    PyCon 2012 was held in Santa Clara, California.

    Tutorial:

    I was there on Thursday to attend a tutorial called Python Epiphanies. The tutorial was educational in understanding some of the inner workings of Python, but I have a hard time figuring out how to use the knowledge I gained there.

    Opening Ceremony:

    We had ROBOTS.

    And they were dancing... How cool was that? It was a splendid opening ceremony.

    Socializing:

    Thursday evening was bag stuffing, where we helped out by stuffing the swag bags. I got to work side by side with some well-known figures in the community (Jesse Noller, Pydanny). After that I hung out with some Heroku folks and learned about their awesome work culture.

    Later that night, Yannick and Bryan gave a PyCon newbie orientation. I took their advice and gave a Lightning Talk about bpython (my talk is 10min into the video).

    I socialized plenty and got a lot of useful contacts from different companies. I got to meet the founders of Octopart, one of my favorite electrical engineering startups.

    I also met Kenneth Reitz, who is famous for his requests library and his awesome talk, Python for Humans.

    I was quite thrilled when I first saw Guido in the lunch hall, sitting at the table right next to mine. I was too shy to talk to him, but I managed to get a picture with him (in the frame).

    Real gutsy! Maybe next year I'll actually shake his hand and get a picture with him. 

    Keynotes:

    • I enjoyed Paul Graham's keynote quite a bit. He talked about daring startup ideas. His keynote is summarized in these two essays.
    • David Beazley's keynote was a walk-through (demo) of tinkering with PyPy. It looked hard as balls and I kept hoping for a happy ending where he declared victory. But it ended up being one of those art movie endings that leaves the audience in a confused and inconclusive state.
    • Guido's keynote, on the other hand, was interesting. His talk was sprinkled with unintended hilarity that ensued due to Google's presentation software. He was sporting an awesome T-shirt that read "Python is for girls" and talked about dealing with Python trolls.

    Talks:

    I knew that all the talks were videotaped and posted online, so I didn't worry too much about missing some when I had conflicts.

    The following talks piqued my interest and will make me go exploring a little bit. 

    Permission or Forgiveness - Quaint. Applying Grace Murray Hopper's logic to Python programming. 

    Webserver Performance Tuning - Sounded like a sales pitch for New Relic, but not in a bad way.

    Angry Birds playing Robot - Hilarious and Informative.

    Capoeira:

    I went to the open space organized by Pydanny, where he brought his Capoeira instructor, who taught us some awesome moves. By the end of the class, we were all breathing heavily and energized. At one point I tried to do a handstand and lost my balance, but Aikido training kicked in and I gracefully rolled out of my fall with just a carpet burn on my elbow.

    Open Spaces:

    I didn't get a chance to go to many of them, but I did attend the open space for SaltStack and sat with Seth House to try to get Salt running on my MacBook. After a few failed attempts, I decided to give Salt a chance on my Linux desktop once I got home.

    Babbage Difference Engine: (Not PyCon related)

    On Sunday afternoon, Stephen McQuay (a fellow Utah Pythonista) and I decided to take up Vijay's offer to go visit the Computer History Museum, where they were doing a live demo of the Babbage Difference Engine. OMG! It was awesome to watch history come alive.

    Memoization Decorator

    Recently I had the opportunity to give a short 10-minute presentation on the memoization decorator at our local UtahPython Users Group meeting.

    Memoization: 

    • Every time a function is called, save the result in a cache (map).
    • Next time the function is called with the exact same args, return the value from the cache instead of running the function.

    The code for a memoization decorator for Python is here: http://wiki.python.org/moin/PythonDecoratorLibrary#Memoize
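To make the two bullet points above concrete, here is a minimal sketch of such a decorator in Python 3. It is not the code from the wiki page: it only handles hashable positional arguments, and the cache attribute and the square() example are mine, added so the cache can be inspected.

```python
import functools


def memoized(func):
    """Cache the results of func, keyed by its positional arguments."""
    cache = {}

    @functools.wraps(func)
    def wrapper(*args):
        if args not in cache:
            # First call with these args: run the function and store the result.
            cache[args] = func(*args)
        return cache[args]

    wrapper.cache = cache  # exposed only so this sketch can show the cache
    return wrapper


@memoized
def square(n):
    print("computing square(%d)" % n)
    return n * n


square(4)  # runs the function
square(4)  # served from the cache, nothing is printed
print(square.cache)
```

The second call never reaches the function body, which is the whole trick.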

    Example:

    The typical recursive implementation of the Fibonacci calculation is pretty inefficient: O(2^n).

    def fibonacci(num):
        print('fibonacci(%d)' % num)  # trace every call
        if num in (0, 1):
            return num
        return fibonacci(num - 1) + fibonacci(num - 2)

    >>> math_funcs.fibonacci(4)  # 9 function calls
        fibonacci(4)
        fibonacci(3)
        fibonacci(2)
        fibonacci(1)
        fibonacci(0)
        fibonacci(1)
        fibonacci(2)
        fibonacci(1)
        fibonacci(0)
        3

    But the memoized version makes it ridiculously efficient, O(n), with very little effort.

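    from memoized import memoized  # assumes the wiki's decorator is saved as memoized.py

    @memoized
    def mfibonacci(num):
        print('fibonacci(%d)' % num)  # trace every call that actually runs
        if num in (0, 1):
            return num
        return mfibonacci(num - 1) + mfibonacci(num - 2)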
        
    >>> math_funcs.mfibonacci(4)  # 5 function calls
        fibonacci(4)
        fibonacci(3)
        fibonacci(2)
        fibonacci(1)
        fibonacci(0)
        3

    We just converted an algorithm from exponential complexity to linear complexity by simply adding the memoization decorator.
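The difference is easy to measure. The sketch below counts calls for a larger input, using functools.lru_cache from the Python 3 standard library in place of the wiki's decorator (a swapped-in equivalent, not what the talk used). The calls dictionary and the cached_fibonacci name are mine.

```python
import functools

calls = {"plain": 0, "cached": 0}


def fibonacci(num):
    calls["plain"] += 1  # count every invocation
    if num in (0, 1):
        return num
    return fibonacci(num - 1) + fibonacci(num - 2)


@functools.lru_cache(maxsize=None)
def cached_fibonacci(num):
    calls["cached"] += 1  # only counted when the cache misses
    if num in (0, 1):
        return num
    return cached_fibonacci(num - 1) + cached_fibonacci(num - 2)


fibonacci(20)
cached_fibonacci(20)
print(calls)
```

The cached version runs its body once per distinct argument (0 through 20), while the plain version repeats the same sub-problems thousands of times.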

    Slides:

    Presentation:

    I generated the slides using LaTeX Beamer. But instead of writing raw LaTeX code, I wrote reStructuredText (rst) and used the rst2beamer script to generate the .tex file.

    Source:

    The rst and tex files are available on GitHub.

    https://github.com/amjith/User-Group-Presentations/tree/master/memoization_de...

     

    Productive Meter

    A few weeks ago I decided that I should suck it up and start learning how to develop for the web. After asking around among my faithful community brethren, I decided to learn Django from its docs.

    ::Django documentation is awesome::

    Around this time I came across this post about Waking up at 5am to code. I tried it a few times and it worked wonders. I've been working on a small project that can keep track of my productivity on the computer. The concept is really simple, just log the window that is on top and find a way to display that data in a meaningful way. 
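As a rough illustration of that concept (not the app's actual code), the sketch below turns a log of focused-window samples into time per window. The sample format, the 5-second interval and the window titles are all hypothetical.

```python
from collections import Counter

# Hypothetical log: one (seconds since start, focused window title) sample
# every SAMPLE_INTERVAL seconds. The format and titles are made up.
SAMPLE_INTERVAL = 5
samples = [
    (0, "Terminal"),
    (5, "Terminal"),
    (10, "Firefox"),
    (15, "Terminal"),
    (20, "Firefox"),
    (25, "Terminal"),
]


def time_per_window(samples, interval):
    """Credit each sample's window with one interval of time."""
    totals = Counter()
    for _timestamp, title in samples:
        totals[title] += interval
    return totals


totals = time_per_window(samples, SAMPLE_INTERVAL)
for title, seconds in totals.most_common():
    print("%-10s %3ds" % (title, seconds))
```

Totals like these are what a graph would then be drawn from.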

    Today's 5am session got me to a milestone on my project. I am finally able to visualize the time I spend using a decent-looking graph, which is a huge milestone for someone who learned how to display HTML tables 3 weeks ago.

    Tools:

    A huge thanks to my IRC friends and the random geeks who wrote awesome blog posts and SO answers on every problem I encountered.

    I will be open-sourcing the app pretty soon. Stay tuned.

    Too Many Classes Too Little Time

    I'm taking a couple of the free online classes offered by Stanford: one on Artificial Intelligence and one on Machine Learning.

    I haven't had so much fun since kindergarten. Actually that's not fair, I didn't enjoy kindergarten this much. I'm listening to the classes during my lunch, after work, during weekends. I'm working on my assignment with so much enthusiasm, I dread the day when this class ends. 

    Stanford just announced a slew of new online classes starting in Jan 2012. I was way too excited when I first read their descriptions. Now I'm a little sad, because I want to take 8 out of the 11 courses that are being offered and I don't have enough time. :(

    Woe is me. 

    PS: If you are not taking any of these classes, you are missing out big time. Please do yourself a favor and sign up.

    Picking 'k' items from a list of 'n' - Recursion

    Let me preface this post by saying I suck at recursion. But it never stopped me from trying to master it. Here is my latest (successful) attempt at an algorithm that required recursion. 

    Background: 

    You can safely skip this section if you're not interested in the back story behind why I decided to code this up. 

    I was listening to Khan Academy videos on probability. I was particularly intrigued by the combinatorics video. The formula for the number of combinations, nCr, was simple, but I wanted to print all the possible combinations.

    Problem Statement:

    Given 'ABCD', what are the possible outcomes if you pick 3 letters from it to form a combination without repetition (i.e. 'ABC' is the same as 'BAC')?

    At first I tried to solve this using an iterative method and gave up pretty quickly. It was clearly designed to be a recursive problem. After 4 hours of breaking my head I finally got a working algorithm using recursion. I was pretty adamant about not looking it up online, but I sought some help from IRC (Thanks jtolds).

    Code: 

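    def combo(w, l):
        """Return all l-item combinations of the string w."""
        lst = []
        if l < 1:
            return lst  # nothing left to pick
        for i in range(len(w)):
            if l == 1:
                lst.append(w[i])  # base case: each item is a combination
            # Prefix w[i] to every (l-1)-combination of the items after it.
            for c in combo(w[i+1:], l-1):
                lst.append(w[i] + c)
        return lst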

    Output:

    >>> combinations.combo('abcde',3)
        ['abc', 'abd', 'abe', 'acd', 'ace', 'ade', 'bcd', 'bce', 'bde', 'cde']
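For a cross-check, the Python 3 standard library can produce the same list. This sketch uses itertools.combinations to enumerate the picks and math.comb (the nCr formula) to confirm how many there should be. The picks variable is mine.

```python
from itertools import combinations
from math import comb

# combinations() yields tuples of letters; join each one back into a string.
picks = ["".join(c) for c in combinations("abcde", 3)]
print(picks)

# The nCr formula gives the expected number of combinations.
print(len(picks) == comb(5, 3))
```

This is handy for testing a hand-written recursive version against bigger inputs.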

    Thoughts:

    • It helps to think about recursion with the assumption that an answer for step n-1 already exists.
    • If you are getting partial answers check the condition surrounding the return statement.
    • Recursion is still not clear (or easy). 

    I have confirmed that this works for bigger data sets and am quite happy with this small victory.