Puzzling Python

Because many programmers I respect are fond of Python, I’m forcing myself to learn it (again). On my previous attempts, I ran into some annoyances and dropped it. This time, I’m trying to solve some Facebook puzzles by gluing together existing libraries. I tried the Peak Traffic puzzle, which is to enumerate the maximal cliques in a directed graph. Some searching led me to an excellent library for graph problems called NetworkX. Some tinkering with the library hit the first problem: on very large complete graphs, I kept blowing the stack. That’s because v0.99 uses (perfectly normal) recursion to solve the problem but hits a stack limit in Python. In v1.0, the devs were forced to implement their own stack to get around this annoying limit. The second issue is that this is a branch & bound optimization problem, which can easily make use of many cores. Instead, the program maxes out at 50% cpu. This would have been trivial to parallelize in .NET 4.0 or Fortress with parallel blocks.

So I thought maybe Jython might solve both problems. Unfortunately, Jython (v2.5) is 4X slower than CPython running the exact same script on a very large input file. Next, Jython also sets the recursion limit to 1000 stack frames, so I ran into the same issue. I know I can raise it manually, but that doesn’t help when I write a program and don’t know how far I’m going to recurse. Why set an arbitrary limit? Just throw the same runtime exception when the platform runs out of space like every other language on the planet. Finally, I imported jsr166, an excellent threadpool library for Java. But on a simple fibonacci test, the performance was so much slower than the equivalent Java program that it simply wasn’t worth considering. It did, however, make use of more threads, though it rarely went over 80%. Maybe there’s a lock somewhere that’s slowing things down. In summary, Jython sucks and CPython is acceptable.

Advertisements

2 comments

  1. Hans

    “That’s because v0.99 uses (perfectly normal) recursion to solve the problem but hits a stack limit in Python.”

    Hm, did you try increasing the limit using sys.setrecursionlimit()? Assuming the library doesn’t do that already…

    • Pinku

      As I said in the second paragraph, I know I can raise the limit with setrecursionlimit, but that doesn’t help when I write a general purpose library and don’t know when people might blow the stack. If I set the stack limit to 1500, what happens when someone uses my library for an even bigger problem size? Python should use all the stack available and fail when the OS raises an error.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s