Throttling Your Network Connection on Mac OS X

April 11th, 2012

Sometimes you just need to sloooooow doooooooown to test how your software behaves when your internet connection is crappy.

Linux has tc to do this, but what about Mac OS X?

That’s where ipfw comes in. It does a lot of stuff. I mean a lot, but we’re just going to use it to slow down our internet connection today.

Here’s an example that throttles your web browsing experience to 50 KBytes/second:

sudo ipfw pipe 1 config bw 50KByte/s >/dev/null
sudo ipfw add 1 pipe 1 src-port 80
sudo ipfw add 1 pipe 1 dst-port 80

And to turn it off (this is an important step!):

sudo ipfw delete 1
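
If you want to drive these commands from a test harness, here is a minimal Python 3 sketch that only builds the argument lists shown above and prints them. Nothing is executed. The function names and defaults are hypothetical; the ipfw arguments are copied from the commands in this post.

```python
def build_throttle_commands(port=80, rate="50KByte/s", pipe_number=1):
    """Return the three ipfw command lines from the post as argument lists."""
    pipe = str(pipe_number)
    return [
        # Configure the pipe's bandwidth limit.
        ["sudo", "ipfw", "pipe", pipe, "config", "bw", rate],
        # Send traffic from and to the port through the pipe.
        ["sudo", "ipfw", "add", pipe, "pipe", pipe, "src-port", str(port)],
        ["sudo", "ipfw", "add", pipe, "pipe", pipe, "dst-port", str(port)],
    ]


def build_unthrottle_command(pipe_number=1):
    """Return the command that deletes the rule again."""
    return ["sudo", "ipfw", "delete", str(pipe_number)]


# Dry run: print the commands instead of running them.
for command in build_throttle_commands():
    print(" ".join(command))
print(" ".join(build_unthrottle_command()))
```

Passing each list to subprocess.check_call would run it for real, but that needs sudo and a system that still ships ipfw.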

To make this super easy to use, I wrote a handy little shell script called network-throttle, which you can put in your PATH and run like this:

network-throttle on --port 80 --rate 50KByte/s

And to turn it off:

network-throttle off

You can download the shell script below. Put it in your PATH and name it network-throttle.

Or, if you like things shiny, pointy, and clicky, you can use the Apple Network Link Conditioner by installing Xcode.

Here’s the magical shell script:

#!/bin/bash
#
# Throttles your Mac OS X internet connection on one port.
# Handy for testing

set -e

RATE=15KByte/s
PORT=80
PIPE_NUMBER=1
ACTION=

function usage()
{
    echo "$1"
    echo
    echo "Usage: `basename "$0"` <action> [options]"
    echo "  Action:"
    echo "     on"
    echo "     off"
    echo
    echo "  Options:"
    echo "  --rate <rate>"
    echo "      Example: --rate 100KByte/s"
    echo "  --port <port> (default is 80 if you don't specify --port)"
    echo "      Example: --port 80"
    exit 1
}

function turn_throttling_off()
{
    echo "Turning off network throttling"
    sudo ipfw delete $PIPE_NUMBER || echo "Is it already turned off?"
}

function turn_throttling_on()
{
    echo "Throttling traffic to port $PORT: $RATE"
    sudo ipfw pipe $PIPE_NUMBER config bw $RATE >/dev/null
    sudo ipfw add $PIPE_NUMBER pipe $PIPE_NUMBER src-port $PORT >/dev/null
    sudo ipfw add $PIPE_NUMBER pipe $PIPE_NUMBER dst-port $PORT >/dev/null
}

# Grab command line args:
while [ -n "$1" ]; do
  case $1 in
    --rate)
      shift
      RATE=$1
      ;;
    --port)
      shift
      PORT=$1
      ;;
    *)
      ACTION=$1
  esac
  shift
done

[ -n "$ACTION" ] || usage "Error: no action specified"

case $ACTION in
  on)
    turn_throttling_off >/dev/null 2>&1 # in case it's already on, clear out the old one
    turn_throttling_on
    ;;
  off)
    turn_throttling_off
    ;;
  *)
    usage "Error: Bad action specified"
    ;;
esac

Dangerous Python Default Arguments

April 1st, 2012

Dangerous? Really? Well, not if you understand how it works.

Note: I’m not the first to write about this subject.

When writing a function in Python, it’s handy to use default argument values like this:

def do_something(some_list=[]):
    some_list.append('some item')
    print some_list

And you think this will provide an easy way to let lazy callers pass no arguments to your function, and it will simply be called as if they had passed an empty list, printing ['some item'].

But you would be wrong.

What actually happens is this (interactive shell output):

>>> do_something()
['some item']
>>> do_something()
['some item', 'some item']
>>> do_something()
['some item', 'some item', 'some item']
>>> do_something()
['some item', 'some item', 'some item', 'some item']
>>>

That’s right. Python evaluates the default value once, when the function is defined, and the resulting list persists as an attribute of the function. If callers don’t pass their own some_list object, this one, the same one, is used each time your function is called.

Weird, huh?

If users pass their own some_list object, then sanity is restored:

>>> do_something(['my item'])
['my item', 'some item']
>>> do_something(['my item'])
['my item', 'some item']
>>> do_something(['my item'])
['my item', 'some item']
>>>

So why is this? Python stores each default argument value in a special attribute on the function called func_defaults (Python 3 renames it to __defaults__). You can inspect any function’s default arguments like this:

>>> do_something.func_defaults
(['some item'],)
>>>
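
To check that the stored default really is the same object on every call, here is a small sketch written for Python 3, where the attribute is spelled __defaults__ rather than func_defaults. It returns the list instead of printing it so the identity can be inspected.

```python
def do_something(some_list=[]):
    some_list.append('some item')
    return some_list


first = do_something()
second = do_something()

# Both calls returned the very same list object...
print(first is second)
# ...and it is the object stored in the function's defaults tuple.
print(do_something.__defaults__[0] is first)
print(do_something.__defaults__)
```

Both identity checks print True, and the defaults tuple shows the list that has been growing across calls.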

Because functions in Python are just objects, you can store arbitrary attributes on them. And indeed, you can actually modify the default function arguments at run time, like this:

>>> do_something.func_defaults = ([42],)
>>> do_something()
[42, 'some item']
>>>

But remember: Just because you can doesn’t mean you should. I would not recommend making a habit of stuff like this, especially if you like not having your co-workers hate you.

It’s always good to know how your tools work. Inside and out.

I came across this behavior while investigating the pylint W0102 message, and found that this message actually inspired an entire wiki of Pylint message descriptions.
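
The usual way to avoid the shared default is to use None as a sentinel and create the list inside the function. This is a minimal Python 3 sketch of that idiom, reusing the do_something name from above and returning the list rather than printing it:

```python
def do_something(some_list=None):
    # A fresh list is created on every call that omits the argument.
    if some_list is None:
        some_list = []
    some_list.append('some item')
    return some_list


print(do_something())
print(do_something())
print(do_something(['my item']))
```

Each call without arguments now starts from its own empty list, and callers who pass a list still get it back with the item appended.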

The Code Quality Continuum

March 5th, 2012

What does it take to write high quality code?

Here’s a little table that I use to judge my own code quality:

Code Quality Continuum Chart

Future axes to add:

  1. Ratio of the initial development cost to the long term maintenance cost
  2. Code testability: Code can be manually tested, code can be tested with automation, code can be tested via continuous integration

iPhone + Microscope = Awesome

January 3rd, 2012

I was trying to repair my TV, and I wasn’t happy with the crummy iPhone camera resolution of this chip:

So I took the part to work, put it under the microscope, lined up the iPhone camera with the eyepiece, and voila!

I put my left index finger in the frame for size reference.

Note: in the second photo, the chip has been removed. :)

Book Review: Treading on Python

December 28th, 2011

Treading on Python by Matt Harrison provides a basic introduction to the Python programming language for programming novices.

Background of the reviewer

I have been writing code professionally for 10 years. I’ve spent most of my time in C++, but I’ve written a handful of small Python scripts (less than 100 lines) and a couple medium-sized Python applications (hundreds of lines with multi-threading).

In Treading on Python, I was looking to shore up my Python foundation before jumping into my first big Python project.

I was not disappointed.

Who is this book for?

If you are interested in learning how to write computer programs as a beginner, this book is probably a pretty good place to start. The book starts by persuading you to choose Python as your first programming language for two main reasons:

  1. Python is used widely in industry
  2. Python is easy to learn

This book is written primarily for brand new programmers. It provides practical advice for getting started at the very early stages of programming:

  • How to edit Python code
  • How to run Python programs
  • How to use the Python interactive shell
  • What a variable is (complete with cattle analogy)
  • How to use strings, integers, and lists

However, even if you have, like me, written some small to medium sized Python programs, you will still probably benefit from the following useful information:

  • Python’s handy dir and help functions
  • The enumerate() function
  • The dictionary setdefault() method
  • Python’s concept of None and object id
  • List slicing
  • import and from...import semantics
  • And a surprisingly good list of pitfalls to avoid
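
For a taste of a few of these, here is a short Python 3 sketch (not taken from the book; the word list is made up) that uses enumerate(), setdefault(), and list slicing together:

```python
words = ["spam", "eggs", "spam", "ham"]

# enumerate() yields (index, item) pairs; setdefault() returns the existing
# value for a key, or stores and returns the given default if it is missing.
positions = {}
for index, word in enumerate(words):
    positions.setdefault(word, []).append(index)

print(positions)
print(words[1:3])   # slice: items at index 1 and 2
print(words[-1])    # negative index: last item
```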

Early in the book, Matt spends a lot of time explaining basic programming concepts (like variables). He does this by providing real world analogies (like cattle) that will probably seem superfluous to the experienced programmer, but that may be beneficial to the programming novice. I was tempted to skip the first few chapters but I’m glad I read them completely. The book is peppered with little gems that reveal what writing Python code is all about, and even the most basic topics still provide these insights.

Who is this book not for?

If you are looking for a book that delves deep into Python, this is not the book for you. Notably absent concepts from this book include:

I don’t offer this list as a criticism of the book. The book’s stated purpose is clearly not to provide a comprehensive Python treatise for the experienced programmer. But if you are considering this book as a way to delve into any of these concepts, this is not the book for you.

Opinion of the book

Matt clearly knows his Python. He has peppered the book with helpful tips that compelled me to whip out my Python interpreter to experiment. Many of the tips were very handy, even for a semi-experienced Python programmer such as myself.

Matt is pleasingly frank in his recommendations to avoid certain approaches, and after reading the book, I feel like I have a better eye for assessing how “Pythonic” something is. In fact, now that I have finished the book, I can look back on Python code I wrote before reading the book, and critique the heck out of it. Prior to reading this book, my Python code looked a lot like my C++ code, which is just a shame. This book can help inoculate you against such behavior.

The book reads smoothly and quickly. Matt is very careful to keep his explanations succinct and clear, such that you don’t feel like you’re reading a college text book or a reference manual. Even still, the book does contain a high information density.

On my iPad, with the default font size, the book is 243 pages in landscape mode and 147 pages in portrait mode.

I finished the book in fewer than a dozen 15-30 minute sittings.

Conclusion

If you can already crank out Python list comprehensions and lambda expressions, this is probably not the book for you. If you are an experienced programmer and want to learn Python, this is a fast way to start. If you are a total programming novice, this may be a good way to begin, but I’m not a great judge for this audience.

Fun Python Solution to Euler Problem 79

December 27th, 2011

Nothing says Merry Christmas like Project Euler.

Here’s a nifty solution to Problem 79 that uses Python and Graphviz.

The problem is to identify a user’s password given a bunch of successful logins (taken from some kind of nefarious keylogger–those crafty devils). The catch is that each login contains only some of the password’s digits, in their original order. Of course, they give you a sample of 50 successful logins.

So I decided to whip up some Python code to ingest each login and store a list of digit precedence. Then, the code prints out a Dot file which we turn into an image using the “dot” command. I ran this on Mac OS X, but it should work just fine on any Linux box (hint: install the “graphviz” package).

Here’s the Python code:

problem79.py:

# Map each digit to the set of digits seen immediately after it in a login.
digit_precedence = dict()

with open("./keylog.txt") as keylog:
    for line in keylog:
        line = line.strip()
        for index, char in enumerate(line):
            digits = digit_precedence.setdefault(int(char), set())
            if index < len(line)-1:
                digits.add(int(line[index+1]))

# Emit the precedence graph in Dot format, one edge per "comes before" pair.
print "digraph problem79 {"
for (digit, subsequent_digits) in digit_precedence.iteritems():
    for subsequent_digit in subsequent_digits:
        print "   %d -> %d;" % (digit, subsequent_digit)
print "}"

Save that as problem79.py, download keylog.txt into the same folder, and run the program like this:

python problem79.py | dot -Tpng > problem79.png

Then view problem79.png, and voila! Graphviz just put the answer right in front of your nose: 73162890.
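
If you would rather have the computer read the graph for you, a topological sort of the same precedence data does the trick. The sketch below is written for Python 3 and uses Kahn's algorithm. It assumes no digit repeats in the password (otherwise the graph has a cycle), and the four sample logins are hypothetical, not taken from keylog.txt.

```python
from collections import defaultdict


def derive_passcode(attempts):
    """Order the digits so every 'comes before' pair in the attempts holds."""
    successors = defaultdict(set)
    digits = set()
    for attempt in attempts:
        digits.update(attempt)
        # Record each adjacent pair as an edge: earlier -> later.
        for earlier, later in zip(attempt, attempt[1:]):
            successors[earlier].add(later)

    # Count incoming edges for every digit.
    incoming = {digit: 0 for digit in digits}
    for earlier in list(successors):
        for later in successors[earlier]:
            incoming[later] += 1

    # Kahn's algorithm: repeatedly take a digit with no remaining predecessors.
    ready = sorted(digit for digit in digits if incoming[digit] == 0)
    order = []
    while ready:
        digit = ready.pop(0)
        order.append(digit)
        for later in sorted(successors[digit]):
            incoming[later] -= 1
            if incoming[later] == 0:
                ready.append(later)

    if len(order) != len(digits):
        raise ValueError("precedence graph has a cycle (repeated digit?)")
    return "".join(order)


# Hypothetical sample: four partial logins for the made-up password "1234".
sample_attempts = ["124", "134", "123", "234"]
print(derive_passcode(sample_attempts))
```

To run it against the real data, pass in the stripped lines of keylog.txt instead of the sample list.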

Cocoa Noob Pitfalls

September 12th, 2011

I just finished writing my first iPhone app. I have a background in Java, C++, Python, and a smattering of other programming languages on Linux and Windows in both embedded and desktop environments, so that hopefully explains my brain damaged context.

Here are the pitfalls I stumbled upon while climbing up the Cocoa learning curve:

Retain/Release Memory Leaks

It wasn’t immediately obvious to me that adding an NSObject pointer to an NSMutableArray (and other containers) would actually increment the NSObject’s retain count. Not knowing this right away, I got into the (bad) habit of double-retaining objects. It took some investigation to find out what was happening, and thanks to Apple’s ingenious profiler and analyzer, I was able to identify the problem. After learning this, I came to find out that, by convention, lots of containers do this, but it’s not too apparent from the documentation.

No Container Static Type Checking?

Apple provides several container classes, including NSArray, NSDictionary, and several others. Much like old versions of Java (like the Java 1.3 stone age), you are free to add instances of any NSObject* child class. As a result, it’s possible to end up with objects of types you did not expect, and the compiler cannot prevent you or even warn you about this.

Sub-Class Insanity

It seems you have to sub-class to do very basic things, like adding items to a UIPickerView. I would have expected a trivial method call to do something like this.

Verbose Names

Stuff like appending one string to another is usually pretty trivial and terse in most languages. For example, in C++, you might see this:

myString += "suffix";

But in Cocoa, it looks like this:

myString = [myString stringByAppendingString:@"suffix"];

That is a seriously verbose method name. It uses the word “String” twice, and that’s not even counting my variable name.

No Static Function Checking

Because Objective-C’s message-passing system is evaluated at runtime, the compiler won’t complain (much) if you try to send a message to an object that does not respond to that message (much like calling a member function in C++ that does not exist). The compiler does warn you, but the code will still compile, run, and fail with a run-time exception. Some people consider this an advantage of Objective-C, so I can’t hold it against Apple (Smalltalk fans, I’m looking at you).

Bogus Compiler Warnings

If I define a couple messages like this in my class, without putting them in the .h file:

-(void) foo
{
    [self bar];  // compiler warning here
}

-(void) bar
{
}

The compiler will issue a warning, even though this is perfectly safe and will run without exceptions. The compiler is apparently very eager to warn you about this perfectly safe usage, and yet still allows you to send messages that definitely won’t be responded to at run time.

Weird Autorelease Pool Crashes

The first time I misused the autorelease message, I discovered that my application would crash in the event loop processing context. The stack gave no indication where I had gone wrong, because it contained none of my own code. The autorelease pool object itself would crash because it was calling release on an object that had already been freed. The runtime exception was so counter-intuitive that I ended up reverting my code (using git) back to a prior state. I eventually isolated the cause of the problem to my misuse of autorelease, which prompted me to do a more thorough exploration of the feature. Now I recognize those kinds of crashes for what they are, but the first time I encountered it, I was confused for a while before I figured it out.

Managing Multiple Sub-Views

If you have multiple UIPickerView sub-views within your view, managing their contents can be difficult. This is because you have to write a class to implement the UIPickerViewDataSource protocol, which is usually easiest to do right inside your view controller. However, when you add a second or third UIPickerView sub-view to your view, it gets difficult to manage. The UIPickerViewDataSource protocol sends you a pointer to the UIPickerView so you can keep track of which one is which, but it just feels cumbersome. I ended up using this guy’s code to make it easier.

Permissive Compiler Stuff

The Objective-C compiler will allow you to do this:

NSMutableDictionary *myDictionary =
    [[NSDictionary alloc] init];

I would expect it to work the other way around, but not this way. This leads to runtime exceptions when you try to add an element to “myDictionary”, which can be surprising until you realize you have an instance of an immutable NSDictionary. The runtime exception is pretty vague too: All you get is “invalid selector”.

Disappearing Outlets

If you connect an outlet or action in Interface Builder and then later delete the outlet or action object in your Objective-C code, you will get a very cryptic error when you try to instantiate the view:

'NSUnknownKeyException', reason: '[<UIView 0x48a3e0>
setValue:forUndefinedKey:]: this class is not
key value coding-compliant

What it should say is “This xib file is trying to reference item ‘foo’ which does not exist”

Working with UINavigationController

UINavigationController can only have one delegate, even though all it does is notify the delegate when the current view changes. If you want to notify more than one view controller that navigation has moved, then you have to do some juggling to save and restore the UINavigationController’s delegate. It should be easier to notify interested parties when navigation changes.

Good Stuff

Cocoa has a lot of good stuff about it too. Here are some of the highlights from my experience:

Consistent Time Units

Times are always expressed in seconds (and fractions of seconds) instead of having to guess whether it’s seconds, milliseconds, or something else. This is extremely gratifying having come from a world where you pretty much always have to guess (although Google Go’s concept of typing is still far superior).

Analysis and Profiling Tools

The Xcode 4 analysis and profiling tools are excellent. I mostly learned how retain/release worked by following the memory leaks that the code analyzer told me about. It even draws little arrows on your code to show you the code path you screwed up. Also, the memory leak finder is fantastic. It shows you when your application leaks, and where each leaked object was originally allocated. This made it trivial to track down and fix my memory leaks.

Easy Animation

Animation is as easy as giving a view a destination size and position, and telling it to go. Animation is integrated into the very core of Cocoa, and it is both easy to use and beautiful on screen.

Conclusion

So there you have it. Cocoa has some pitfalls and some good stuff. That’s all I have to say about it.

Linode: 7 years of awesome

April 18th, 2011

It just occurred to me that I’ve been using Linode as my hosting provider for 7 years this week. I’ve been so happy with their service that I thought I’d offer a review.

The top 4 reasons you should consider Linode for your hosting needs (I’m sure there are 6 more reasons, but I’m time limited today):

Reason 1. Automatic Upgrades

Every year or so, I wake up to an email from Linode telling me I can have more disk space or RAM on my server. Adding it is super easy: just log in to their web site and click a few buttons. Today I’ve got 490 MB of RAM and 16 GB of disk, and I’m still only using their least expensive option.

Reason 2. Huge Internet Speeds

Linode servers have great Internet connection speeds. I regularly get over 50 Mbits/second.

Reason 3. Customer Service

The Linode support team is very sharp. They know what they’re talking about, and they are very helpful. This is more than can be said for many other support organizations. I’ve only needed to open a support ticket a couple times, and on both occasions, the Linode support staff responded within an hour and gave me all the information I needed, not only to solve my problem but to prevent my problem from happening again in the future.

Reason 4. Cost

At $20/month, Linode is very affordable for what you get: Full root access, nearly any Linux distribution you want, lots of RAM and disk, and good CPU speed (my Linode has 4 CPUs).

After 7 years, I very highly recommend Linode as a Virtual Private Server hosting service.

LDS General Conference Podcast Updated

April 3rd, 2011

There were some fantastic messages this conference (April 2011). I plan on re-listening to all of them.

Get the podcast: General Conference Podcast page.

When faster is actually slower

February 12th, 2011

This is a follow-up to my post on code optimization.

Today I did some experimenting with hashes and binary-search-trees, specifically playing with QHash and QMap in C++.

According to the Qt containers algorithmic complexity documentation, QHash provides “significantly faster lookups” than QMap. Your intuition may confirm this, because you’ve heard that QHash uses a constant-time algorithm for lookups, while QMap uses a logarithmic algorithm. However, when it comes to performance questions, intuition is not the best guide.

When considering the QMap vs. QHash speed question, we have to consider everything these classes do when you are looking up an entry. Otherwise, you may be led to assume that QHash lookups are faster than QMap lookups in all cases.

How does QMap work?

Like an associative array or dictionary, the QMap class allows you to store values identified by keys. You can insert key/value pairs into the QMap and retrieve values by key. You can use any type as the value, but the key type must implement the < operator. This is because QMap keeps its keys in sorted order and finds a key by comparing it against a logarithmic number of stored keys, much like a binary search.

How does QHash work?

QHash also provides the ability to insert and lookup values like QMap, but its implementation is quite different. Keys used in a QHash must implement the == operator and there must exist a global qHash(key) function for the key type. The QHash class hashes each key on lookup to find the position for each key’s value. This is a constant time operation (amortized). This means that the time it takes to lookup a key does not depend on the number of keys already stored in the QHash.

When is QHash slower than QMap?

Consider the following example:

QHash<QString, int> myHash;
QMap<QString,  int> myMap;

To insert the key “my favorite number” with the value 42 into myHash, the QHash has to first generate a hash code for “my favorite number”. To do so, it uses the qHash() function, which reads every character of the string “my favorite number”. This generates an index that is used to locate the position for this entry. Notice that qHash() must read every character of the string. The QMap, on the other hand, uses the < operator to compare the string “my favorite number” with a logarithmic number of the other keys in the QMap (roughly 20 comparisons for a million keys).

Now let’s suppose we filled myHash and myMap with 1 million short string keys (like a few characters each), and then did a few lookups.

I measured this case using QTime, and found that QHash lookups are about 3 times faster than QMap lookups in this scenario.

This is what I would expect after reading the documentation, but what if the key strings are bigger, like 200 characters each, and all quite distinct from each other?

I measured this case and found that QHash lookups are about 3 times slower than QMap lookups.

How can this be? The documentation says that QHash lookups are supposed to be faster.

The reason is this: Because QMap uses the < operator, it does not need to inspect every character of each key string. Instead, it compares the key string to other key strings, one character at a time, and stops if it discovers a difference before reaching the end of the string. This results in much less computation when locating a key’s value. So much less computation in fact, that even though QMap’s lookup algorithm is logarithmic, it is actually faster than QHash’s constant-time lookup algorithm on the same dataset.
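
To put rough numbers on that argument, here is a back-of-the-envelope cost model in Python 3. It is not a Qt benchmark: it only counts characters read, the function names are hypothetical, and it assumes every comparison hits a difference at the same early position while ignoring the final equality check, memory layout, and cache effects.

```python
import math


def chars_read_by_compare(a, b):
    """Characters a lexicographic comparison reads before it can stop."""
    count = 0
    for x, y in zip(a, b):
        count += 1
        if x != y:
            break  # first difference found, no need to read further
    return count


def estimated_tree_lookup_reads(num_keys, chars_per_compare):
    """About log2(n) comparisons, each reading chars_per_compare characters."""
    comparisons = math.ceil(math.log2(num_keys))
    return comparisons * chars_per_compare


# Two made-up 200-character keys that differ at the fourth character.
key_a = "abcX" + "z" * 196
key_b = "abcY" + "z" * 196

reads_per_compare = chars_read_by_compare(key_a, key_b)
tree_reads = estimated_tree_lookup_reads(1000000, reads_per_compare)
hash_reads = len(key_a)  # a hash function has to read the whole key

print(reads_per_compare, tree_reads, hash_reads)
```

Under these assumptions the sorted lookup reads 80 characters against 200 for hashing the key once. Keys that share long common prefixes would push the comparison cost up and tip the balance back toward the hash.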

I should offer the disclaimer that QHash will likely be faster than QMap for larger quantities of entries with that key size, but I was unable to test larger quantities due to the limited amount of RAM in my laptop.

Conclusion

When optimizing for speed, always consider all factors. Just because an algorithm is constant time does not mean that it is always faster than a non-constant time algorithm. And always remember to measure execution speed before assuming that one approach is faster than another.