Pycast - Python screencasts

Pycast - Weekly screencasts on Python and Data Science by Matt Harrison.

Matt is bootstrapping Pycast through Kickstarter. I'm excited about it because I've attended Matt's tutorials and come away feeling leveled up on my Python chops.

Nearly 5 years ago I was getting started in Python, learning on my own by writing small scripts to automate silly stuff. I wasn't writing anything adventurous, and I was looking for a way to improve my skills.

Right around that time I started getting involved in the open source community in Utah and decided to go to a local conference. Matt was doing a three-hour tutorial that covered beginner to intermediate Python. When the session was over I felt empowered. I couldn't wait to get back home to do the exercises he had laid out during the training. After working through them I felt like I really knew the language. I was writing generators and decorators by the end of it. It was an accelerated learning experience that took me from novice to journeyman.

The beauty of his training is that it wasn't merely a brain dump; he was teaching me how to learn: where to look up the docs, how to recognize idiomatic Python, and the best practices of programming.

I eventually landed a job doing full time Python at an awesome company.

That's why I'm excited about his new venture. This is a great opportunity for me to dive into Data Science, and I can't wait to see his videos and work out the exercises.

If you're still on the fence about it, leave a comment on his Kickstarter page with your question. He's a friendly and responsive person.

Conversations with a 2 yo

Sempi insisted on doing laundry, helping put away the clothes and sweeping the floor. 

Yoshi: When did you become such a big boy?

Sempi: Three minutes ago. 

Continues sweeping the floor with a big smile and a song. 

Sempi: I'm a street sweeper with a broom in my hand.  

Montreal Bagels - 2yo review

I was in Montreal for PyCon 2015. I was told that Montreal was famous for its bagels. So I brought home a half-dozen bagels.

I made my son a delicious toasted bagel with cream cheese in the morning. He ate it with gusto, and the following conversation ensued:

Me: Sempi, did you like the Montreal bagels?

2yo: It was ok (as he proceeds to lick his fingers clean). 

This kid is hard to impress. :)

Kickstarter: mysql-cli

I'm starting a project called mysql-cli.

mysql-cli will be a command-line client for MySQL, with auto-completion and syntax highlighting: an equivalent of pgcli for the MySQL database.

I'm raising funds for the project through Kickstarter. The goal is to compensate for the development time and resources (hosting, testing, etc.) as well as to motivate me to keep going.

When I launched pgcli earlier this year I had high hopes for it. I anticipated that I might reach a hundred users and maybe a couple of contributions. I announced it on Twitter and HackerNews, and it took about an hour to reach 100 stars. By the second day it was a top trending repo on all of GitHub. Right now it is hovering near 1,600 stars, with more than 70 merged pull requests.

During the first week of launch, I slept about 3 hours each night because the pull requests and issues came flooding in. I made a personal resolution to answer every communiqué within 24 hours. This meant answering personal emails, responding to filed issues, reviewing pull requests, etc. I vowed to be polite and respectful to my users and contributors, and I've had nothing but pleasant interactions with them.

My hope is to provide the same level of dedication and support for mysql-cli. There is definitely a need for it: every time I use the default MySQL client I want to scream obscenities at my computer, and I can't be the only one. :)

The plan is to launch mysql-cli in July 2015 and open the repo to the public. If you want to get involved sooner, please back the project on Kickstarter and I'll add you to the early access list. :)

Launching pgcli

I've been developing pgcli for a few months now. 

It is now finally live at http://pgcli.com.

It all started when Jonathan Slenders sent me a link to his side project called python-prompt-toolkit.

I started playing around with it to write some toy programs. Then I wrote a tutorial on how to get started with prompt_toolkit: https://github.com/jonathanslenders/python-prompt-toolkit/tree/master/examples/tutorial

Finally I started writing something more substantial to scratch my own itch. I was dealing with Postgres databases a lot at that time. The default Postgres client, psql, is a great tool, but it lacked auto-completion as I typed and it was quite bland (no syntax highlighting). So I decided to take this as my opportunity to write an alternative.

Thus the creatively named project 'pgcli' was born.

Details about pgcli.com:

It is built using Pelican, a static site generator written in Python.

It is hosted on GitHub Pages.

The content is written using reStructuredText.

Inspiration:

The design inspiration for the tool comes from my favorite Python interpreter, bpython.

Goodbye Utah

The time has come to part ways. I'm leaving Utah and moving to Portland on May 25th, 2012. This July would have marked my 10-year anniversary of living in Utah. I can't believe I've spent one third of my life so far in Utah. Some of the best memories of my life were formed here.

True Love: 

I met my beautiful wife here in Utah. She kicked my ass in TaeKwonDo, I asked her out, and the rest is history. Happily married for four years with a baby on the way.

First Job:

My first real programming job was with Delcam USA. I still have my first paystub from Delcam. Tom, my boss at Delcam, is still the best boss I've had so far.

Higher Education:

University of Utah. So many memories, so many sleepless nights at the computer lab. I still get nostalgic when I walk through the campus.

Parksvan:

Eight clueless kids from India got together to share accommodation while going to school and ended up sharing the best parts of our lives. Although we have all parted ways since our college days, I can't help but feel they are part of my family.

Aikido:

When I went to my first class, I thought I was going there to keep my wife company. Four years and six belts later, it has become a dominant force in my life. I never knew getting thrown around was the way to make new friends. :)

Along the way I discovered the wonderful art of Iaido. I call it the art of playing with Japanese swords.

Python:

I couldn't believe there was a group of people who met every month to geek out about their favorite language. My heartfelt thanks to herlo, mharrison, smcquay, travis, and seth. You guys welcomed me into the group and helped me shape my future in the world of Python.

Outdoor:

I took up mountain biking, hiking, and snowboarding, and I've loved every minute of it. I will miss the Rocky Mountains, the Uintas, and the snowy hills.

Leaving all of this behind makes me sad. Utah made me into what I am today.

But now I'm headed to Portland to work for New Relic. I'm told Oregon is a wonderful place, but I'll always have fond memories of Utah wherever I go. 

Python Profiling - Part 1

I gave a talk on profiling python code at the 2012 Utah Open Source Conference. Here are the slides and the accompanying code.

There are three parts to this profiling talk:

  • Standard Lib Tools - cProfile, Pstats
  • Third Party Tools - line_profiler, memory_profiler
  • Commercial Tools - New Relic

This is Part 1 of that talk. It covers:

  • cProfile module - usage
  • Pstats module - usage
  • RunSnakeRun - GUI viewer

Why Profiling:

  • Identify the bottlenecks.
  • Optimize intelligently. 

In God we trust, everyone else bring data

cProfile:

cProfile is a profiling module included in Python's standard library. It instruments the code and reports the time spent in each function and the number of times each function is called.
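Besides the command-line invocation shown below, cProfile can also be driven programmatically via its Profile object. Here's a minimal sketch (Python 3; the work() function is just a made-up workload for illustration):

```python
import cProfile
import io
import pstats

def work():
    # A stand-in workload for illustration.
    return sum(i * i for i in range(100000))

profiler = cProfile.Profile()
profiler.enable()
work()
profiler.disable()

# Load the collected stats and print the top entries by cumulative time.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats('cumulative').print_stats(5)
print(stream.getvalue())
```

This is handy when you want to profile only one hot section of a larger program instead of the whole script.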

Basic Usage:

The sample code I'm profiling finds the least common multiple of two numbers: lcm.py

    # lcm.py - ver1
    def lcm(arg1, arg2):
        i = max(arg1, arg2)
        while i < (arg1 * arg2):
            if i % min(arg1, arg2) == 0:
                return i
            i += max(arg1, arg2)
        return arg1 * arg2

    lcm(21498497, 3890120)

Let's run the profiler.

$ python -m cProfile lcm.py 
     7780242 function calls in 4.474 seconds
    
    Ordered by: standard name
   
    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
         1    0.000    0.000    4.474    4.474 lcm.py:3(<module>)
         1    2.713    2.713    4.474    4.474 lcm.py:3(lcm)
   3890120    0.881    0.000    0.881    0.000 {max}
         1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
   3890119    0.880    0.000    0.880    0.000 {min}

Output Columns:

  • ncalls - number of calls to a function.
  • tottime - total time spent in the function without counting calls to sub-functions.
  • percall - tottime/ncalls
  • cumtime - cumulative time spent in a function and its sub-functions.
  • percall - cumtime/ncalls

It's clear from the output that the built-in functions max() and min() are called a few million times each, which could be optimized by saving the results in variables instead of calling them every time.
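For instance, a hypothetical ver2 (my illustration, not from the original talk) that hoists the max()/min() calls out of the loop eliminates those millions of calls while keeping the same behavior:

```python
# lcm.py - ver2: cache max()/min() results outside the loop
def lcm(arg1, arg2):
    larger = max(arg1, arg2)
    smaller = min(arg1, arg2)
    i = larger
    while i < (arg1 * arg2):
        if i % smaller == 0:
            return i
        i += larger
    return arg1 * arg2

lcm(21498497, 3890120)
```

Re-profiling this version should show the {max} and {min} rows drop to a single call each, with the remaining time in the loop itself.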

Pstats:

Pstats, also included in the standard library, is used to analyze profiles that were saved using the cProfile module.

Usage:

For bigger scripts it's not feasible to analyze the output of the cProfile module on the command line. The solution is to save the profile to a file and use Pstats to analyze it like a database. Example: let's analyze shorten.py.

    $ python -m cProfile -o shorten.prof shorten.py   # saves the output to shorten.prof
    
    $ ls
    shorten.py shorten.prof

Let's analyze the profiler output to list the top 5 most frequently called functions.

    $ python
    >>> import pstats
    >>> p = pstats.Stats('shorten.prof')   # Load the profiler output
    >>> p.sort_stats('calls')              # Sort the results by the ncalls column
    >>> p.print_stats(5)                   # Print top 5 items
    
        95665 function calls (93215 primitive calls) in 2.371 seconds
        
       Ordered by: call count
       List reduced from 1919 to 5 due to restriction <5>
        
           ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        10819/10539    0.002    0.000    0.002    0.000 {len}
               9432    0.002    0.000    0.002    0.000 {method 'append' of 'list' objects}
               6061    0.003    0.000    0.003    0.000 {isinstance}
               3092    0.004    0.000    0.005    0.000 /lib/python2.7/sre_parse.py:182(__next)
               2617    0.001    0.000    0.001    0.000 {method 'endswith' of 'str' objects}
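Pstats supports other sort keys and helpers as well, such as 'cumulative' and strip_dirs(). A quick self-contained sketch (it generates a throwaway example.prof on the fly rather than assuming shorten.prof exists):

```python
import cProfile
import pstats

# Generate a small profile on the fly so the sketch is self-contained
# (the post uses an existing shorten.prof instead).
cProfile.run('sum(range(10**5))', 'example.prof')

p = pstats.Stats('example.prof')   # Load the saved profile
p.strip_dirs()                     # Trim long directory prefixes
p.sort_stats('cumulative')         # Sort by the cumtime column
p.print_stats(5)                   # Show the top 5 entries
```

Sorting by 'cumulative' is usually the quickest way to find the high-level functions worth drilling into, while 'calls' (as above) surfaces the chattiest ones.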

This is quite tedious and not a lot of fun. Let's introduce a GUI so we can easily drill down.

RunSnakeRun:

This cleverly named GUI written in wxPython makes life a lot easier.

Install it from PyPI (requires wxPython):

    $ pip install SquareMap RunSnakeRun
    $ runsnake shorten.prof     #load the profile using GUI

The output is displayed using SquareMap, which clearly highlights the bigger pieces of the pie that are worth optimizing.

It also lets you sort by clicking the columns or drill down by double-clicking on a piece of the SquareMap.

Conclusion:

That concludes Part 1 of the profiling series. All the tools except RunSnakeRun are available as part of the standard library. It is essential to introspect the code before we start shooting in the dark in the hope of optimizing it.

We'll look at line_profiler and memory_profiler in Part 2. Stay tuned.

You are welcome to follow me on Twitter (@amjithr).