Programming - A Gateway Drug to Math

I decided to try my hand at the Stanford's AI Class. The pre-requisites mentioned Probability and Linear Algebra. So I started watching Probability videos on KhanAcademy

Sal Khan was teaching how to find the probability of 2 heads when you toss a coin 5 times.

A classic nCk problem: 

The probability of getting 2 heads while tossing a coin 5 times is:

But I wanted to find out the probability of getting at least 2 heads when I toss 5 coins.
Its really simple. All I had to do is P(2) + P(3) + P(4) + P(5). 
But then computingby hand (or a calculator) was painfully slow, let alone do it 4 times.
So I wrote two little functions in Python that will calculate factorial (yes I reinvented the wheel) and
Nothing teaches you math faster than trying to write a program to do the math for you. 
Writing a program is the same as teaching the computer how to do a certain task. The only way you can teach someone to do a task is to become a master at doing that task yourself.

Bonus: It also teaches you corner cases like 0! = 1 and  that you wouldn't think of otherwise.

Rant about C++ dependency hell

When was the last time I vented about C++? The answer for that is always:

"TOO LONG AGO". 

The initial friction to setup a substantial project using C++ is unfucking bearable.

When we started code revamp at work recently, I decided to be a good citizen and decided to incorporate cpptest, a unit testing framework in our project.

It made me realize how unreasonably complicated Makefiles can be. After 3 hours of peeling away at the complexity I managed to add cpptest to the build dependency of the project. 

Now time to write a few tests and check it out. I'm thinking "We are almost there". 

FALSE!

Compilation gives me a gazillion error messages that make absolutely no sense. After about 30mins of StackOverflowing and Googling, I find out that its not enough to include string.h and map.h in my  header files, but I also need to namespace it. Of course there is no indication (not even a hint) of that in the error messages. So I add 'using namespace std' and get past it.

Awesome my first test is compiling successfully. Time to run this baby and declare victory. 

Close! But no cigar.

The executable was unable to load the CppTest library during runtime. Argh!

I set my LD_LIBRARY_PATH env variable and now it's running. But I can't ask everyone in my team to do that, so I have to figure out how to statically link that library. 

It's already 6pm and I'm hungry. That'll have to wait for another day. 

TL;DR - C++ and Makefile can burn in a fire of thousand suns.

Rapid Prototyping in Python

I was recently assigned to a new project at work. Like any good software engineer I started writing the pseudocode for the modules. We use C++ at work to write our programs.

I quickly realized it's not easy to translate programming ideas to English statements without a syntactic structure. When I was whining about it to Vijay, he told me to try prototyping it in Python instead of writing pseudocode. Intrigued by this, I decided to write a prototype in Python to test how various modules will come together.

Surprisingly it took me a mere 2 hours to code up the prototype. I can't emphasize enough, how effortless it was in Python.

What makes Python an ideal choice for prototyping:

Dynamically typed language:

Python doesn't require you to declare the datatype of a variable. This lets you write a function that is generic enough to handle any kind of data. For eg:

def max_val(a,b):
    return a if a >b else b

This function can take integers, floats, strings, a combination of any of those, or lists, dictionaries, tuples, whatever.

A list in Python need not be homogenous. This is a perfectly good list:

[1, 'abc', [1,2,3]]

This lets you pack data in unique ways on the fly which can later be translated to a class or a struct in a statically typed language like C++.

class newDataType
{
    int i;
    String str;
    Vector vInts;
};

Rich Set to Data-Structures:

Built-in support for lists, dictionaries, sets, etc reduces the time involved in hunting for a library that provides you those basic data-structures.

Expressive and Succinct:

The algorithms that operate on the data-structures are intuitive and simple to use. The final code is more readable than a pseudocode.

For example: Lets check if a list has an element

>>> lst = [1,2,3]    # Create a list
>>> res = 2 in lst   # Check if 2 is in 'lst'
True

If we have to do it in C++.

list lst;
lst.push_back(3);
lst.push_back(1);
lst.push_back(7);
list::iterator result = find(lst.begin(), lst.end(), 7); 
bool res = (result != lst.end())

Python Interpreter and Help System:

This is a huge plus. The presence of interpreter not only aids you in testing snippets of code, but it acts as an help system. Lets say we want to look up the functions that operate on a List.

>>> dir([])
['__add__', '__class__', '__contains__', '__delattr__', '__delitem__',
'__delslice__', '__doc__', '__eq__', '__format__', '__ge__', 
'__getattribute__', '__getitem__', '__getslice__', '__gt__', '__hash__',
'__iadd__', '__imul__', '__init__', '__iter__', '__le__', '__len__',
'__lt__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__',
'__repr__', '__reversed__', '__rmul__', '__setattr__', '__setitem__',
'__setslice__', '__sizeof__', '__str__', '__subclasshook__', 'append',
'count', 'extend', 'index', 'insert', 'pop', 'remove', 'reverse', 'sort']

>>> help([].sort)
Help on built-in function sort:
     
sort(...)
    L.sort(cmp=None, key=None, reverse=False) -- stable sort *IN PLACE*;
    cmp(x, y) -> -1, 0, 1

Advantages of prototyping instead of pseudocode:

  • The type definition of the datastructures emerge as we code.
  • The edge cases start to emerge when you prototype.
  • A set of required supporting routines.
  • A better estimation of the time required to complete a task.

Scripting Tmux Layouts

Tmux is an awesome replacement for Screen. I have a couple of standard terminal layouts for programming. One of them is show below.

  • Vim editor on the left.
  • Top right pane has the bpython interpreter. 
  • Bottom right pane has the bash prompt. 

I have a small tmux script in my ~/.tmux/pdev file that has the following lines

selectp -t 0              # Select pane 0
splitw -h -p 50 'bpython' # Split pane 0 vertically by 50%
selectp -t 1              # Select pane 1
splitw -v -p 25           # Split pane 1 horizontally by 25%
selectp -t 0              # Select pane 0

In my tmux.conf file I have bound <prefix>+P to sourcing this file. So now anytime I want to launch my python dev layout, I hit <prefix>+<shift>+p. 

bind P source-file ~/.tmux/pdev

Contributing to Open Source

Last week I successfully submitted my first patch to an open source project and it was accepted. 

I like the bpython interpreter for all my python needs. It is quite handy for a python newbie like me. A few weeks ago I was in the middle of building an elaborate datastructure to learn list comprehension in python, when bpython crashed and took all the history with it. I whined about it on twitter and one of the developers of the project prompted me to submit a bug report. I was quite impressed by the fact that a core developer of bpython replied to my bitching on twitter.

After I filed the bug report, I decided to get the source code and poke around. I finally implemented a feature that saved the history after each command instead of waiting till the end of a session. 

The following factors were the main impetus that led me to contribute to the project. 

Project Hosting: 

The project was hosted on bit bucket which is a Github equivalent for mercurial. This makes it so easy to fork a project and issue pull requests, compared to the traditional source forge model of submitting patches in a mailing list. The social coding sites like Github and BitBucket have reduced much of the initial friction in starting an open source project.

Project Size:

This one has a huge impact when I decide to dive into the code. Traditional C projects tend to have a ton of files that are too big which is daunting for a beginner. The bpython project was written in python and had a total of 13 .py files. This makes it dead simple to make a quick change and run the project without compiling it. Again the choice of language has a lot to do with this. 

IRC:

The welcoming nature of the community around a project does a lot to encourage a new comer. The IRC channels are a great way to interact with the developers compared to a passive form of communication such as emails. I jumped on #bpython irc channel and started asking questions when I ran into an issue with bpython source code. People on that channel are really helpful and prompt in answering questions.

Persistence:

My first pull request was scrutinized by the core developers and some suggestions for improvements were given. During that process I learned a lot about code review and how to check for corner cases. Finally after I made all those improvements the pull request was accepted and merged with the main repo. So having a beginners mind (no ego) is an absolute must when getting started on any project. Don't be discouraged if your first attempt is unsuccessful. 

Now I'm proud to say my name is listed in the AUTHORS file of bpython project.

Synchronize Panes in Tmux

Tmux is an alternative for screen. For anyone who doesn't know screen, it is a terminal multiplexer which means, it allow multiple windows in terminal. It can split your window into multiple panes (vertical/horizontal), detach a session which can be attached at a later time. Detach/Attach is very useful for running a job in a remote server without having to keep the ssh open the whole time. 

Tmux can be configured by  ~/.tmux.conf file.
My prefix key is Ctrl-q.
Synchronizing panes:
If you want to send your keystrokes to all the panes in your tmux window: 
<prefix> :setw synchronize-panes
In my case I do:
Ctrl-q:setw synchronize-panes
This is immensely useful if you want to execute the same set of commands on multiple servers.

Five hours of Aikido and Javascript

It was a good day for Aikido. We started at 10am and went till 4pm with an hour break. My joints are sore but my mind is exuberant. 

I actually managed to get more than 2 hours of Javascript learning after that. I'm quite impressed by the features of Javascript. Passing a function to another function as a parameter, returning a function, a function inside a function, anonymous functions (lambdas), wow, its a surprisingly powerful language for doing web development. I haven't even reached the section about closures yet. 

I'm a little blown away by Javascript.  

Utah Python Users Group - 11/11/10

I've been messing around with Python for the past 6 months and I'm loving it. Today I went to my second UtahPython users group meeting and had a lot of fun and learned a ton of stuff.

Chronological order of things I learned:
  • Supy bot - an IRC bot written in Python. 
  • supybot-doxygen - A plugin for supy bot that can provide api documentation for any software that uses doxygen.
    • This could be really useful at work if I can setup an internal IRC server for the developers to hang out. 
  • Objectify is a module in python for parsing XML files. 
  • doctest - a python module for TDD that is super simple. I'm really excited about this. Thanks to Matt for showing me how to use this.
Matt suggested that we do some pair programming during the meetup.
Here's the task: Write a simple python program that can take page numbers as user input and convert it to a list of numbers. 
User Input: 0, 1, 5, 7-10
Output: 0,1,5,7,8,9,10
Here is the code I wrote with the doc test based unit test in the doc-string:
###### PrintParser.py #######
    
    #!/usr/bin/env python

    def convert(inp):
        """
        * Get the input from user.
        * Parse the input to extract numbers
        ** Split by comma
        *** Each item in the list will then be split by '-'
        **** Populate the number between a-b using range(a,b)

        >>> convert("")
        []
        >>> convert("1")
        [1]
        >>> convert("1,2")
        [1, 2]
        >>> convert("1,2-5")
        [1, 2, 3, 4, 5]
        >>> convert("1-3,2-5,8,10,15-20")
        [1, 2, 3, 2, 3, 4, 5, 8, 10, 15, 16, 17, 18, 19, 20]
        """
        if not inp:
            return []
        pages = []
        comma_separated = []
        comma_separated = inp.split(",")
        for item in comma_separated:
            if "-" in item:
                a = item.split("-")
                pages.extend(range(int(a[0]),int(a[1])+1))
            else:
                pages.append(int(item))

        return pages

    if __name__ == '__main__' :
        import doctest
        doctest.testmod()

Cows and Bulls Game

A typical Saturday evening conversation between me and my inner voice. 

Setting: Salt Lake Roasting Co., after an intense Aikido training. 

Me: I'm bored. 
Inner Voice: Go take a nap.
Me: Meh! I think I'm gonna do some Python coding. 
Inner Voice: What are you going to code? 
Me: Remember the Cows and Bulls word game that I used to play when I was little.
Inner Voice: What? That sounds kinda lame. Work on something productive.
Me: Screw you, Inner Voice!!
Inner Voice: I don't like you very much. 
Me: Oh yeah, well the feeling is mutual. 

So I fired up Vim and coded up the Cows and Bulls game :)

It is text-based right now and still in its primitive stage. I'll probably add more features when I get into a fight with my Inner Voice again.