Archive for March, 2008

Programmer productivity differences

March 20, 2008

Steve McConnell defends the idea that there’s a 10X difference in productivity between programmers. Unfortunately, we don’t have a clear idea of what “productivity” in programming really means. It turns out even economists who study labor productivity have only a crude, high-level definition: output divided by input. For a business, this is fairly easy: revenue divided by expenses. If we narrow this down to programmers, we can say the input is a business requirement doc and the output is version 1 of the software. In this sense, productivity is defined as the cost to deliver software, where cost ($) is a proxy for time, number of devs, free sodas and massages, and whatever else is required. The steps for converting requirements into software are basically design and code. Design stands for stuff like gathering requirements, designing code, choosing technologies, etc. Code stands for programming, testing, configuring, etc. Of course, bad design leads to bad code, so design is more important than code.

To measure the “code” stage, one could give (and explain) detailed pseudo-code to a group of developers and time how long they take to write bug-free code. It’s fairly mechanical; people working like “sufficiently smart compilers”. Measuring the design phase is far more complicated. It depends on a person’s familiarity with a domain. There might be a way to grade people roughly (good, mediocre, bad), but not to measure them precisely or consistently. Furthermore, I suspect there’s something akin to Amdahl’s Law at work here. If, as I believe, design is far more important than code, then optimizing the code stage will have very little impact on overall project time. If coding from specs takes 20% of the project time, making that stage infinitely fast would cut project time by at most 20%, which is a speedup of at most 1.25X (totalTime/designTime).
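The Amdahl’s Law arithmetic can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, and the 20% coding share is the hypothetical figure from the paragraph above, not a measurement.

```python
import math


def amdahl_speedup(fraction, factor):
    """Overall speedup when `fraction` of total time runs `factor` times faster.

    `fraction` is the share of project time spent in the optimized stage
    (here, coding from specs); `factor` may be math.inf.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    if factor <= 0:
        raise ValueError("factor must be positive")
    # Time left over, as a share of the original total.
    remaining = (1.0 - fraction) + fraction / factor
    if remaining == 0.0:
        return math.inf
    return 1.0 / remaining


def time_saved(fraction, factor):
    """Share of the original project time that the optimization removes."""
    return 1.0 - 1.0 / amdahl_speedup(fraction, factor)


# Assumed numbers from the text: coding is 20% of the project.
best_case = amdahl_speedup(0.20, math.inf)  # coding becomes free
doubling = amdahl_speedup(0.20, 2.0)        # coding becomes twice as fast
```

With coding made free, the speedup is totalTime/designTime = 1/0.8 = 1.25. Merely doubling coding speed yields about 1.11, which is the point: the design share sets the ceiling.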

Of course, devs rarely “design” software up front. Instead, they use their programming language as an executable specification language with which to explore the design space. That’s a polite way of saying that most devs just hack up a half-assed solution. The design and code phases are mashed together, making it difficult to figure out which phase was really the bottleneck. But I’ll bet it would take nearly the same effort if devs were simply asked for a moderately detailed design. The point, therefore, is that good programmers are probably better designers, but that’s difficult to measure objectively. And it’s hard to tell that they’re doing something different because they’re just writing code like everyone else. Plus, it’s unlikely they’ll be as effective if you drop them into a brand new, unrelated domain: graphics to planning, for example. While it’s true that some programmers are better than others, I doubt the difference is 10X, or that it’s a consistently large multiple.

Eager lazy languages

March 14, 2008

The great thing about the Internet is that CS luminaries might wander by your little blog and tell you you’re wrong. I thought there might be a way to exploit laziness to make more efficient use of the memory hierarchy on conventional machines. I was thinking about how the FFTW compiler gets great performance. At a high level, a planner figures out how best to execute many “codelets” (small chunks of optimized code) for an instance of a problem (FFTs in this case) to achieve maximum performance. (SPIRAL does something similar.) One insight is to use a cache-oblivious memory model rather than a conventional RAM model. The planner breaks a large problem down into pieces that fit into a cache, then applies the appropriate codelets to each chunk to yield a solution.
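The “break it down until it fits, then run a codelet” idea can be sketched in Python. This is not how FFTW works internally: its planner chooses among real FFT decompositions, while this toy applies an element-wise codelet to a list. The names run_chunked, scale_codelet and leaf_size are all made up for illustration. The recursion never asks how big the cache is; it halves the range down to a small base case, on the assumption that at some depth the pieces fit in whatever cache exists.

```python
def run_chunked(data, codelet, leaf_size):
    """Recursively halve [0, len(data)) and run `codelet` on each small piece.

    `codelet(data, lo, hi)` is expected to update data[lo:hi] in place.
    `leaf_size` is a hypothetical base-case size, not a measured cache size.
    """
    if leaf_size < 1:
        raise ValueError("leaf_size must be at least 1")

    def recurse(lo, hi):
        if hi - lo <= leaf_size:
            codelet(data, lo, hi)  # piece is small enough: do the real work
            return
        mid = (lo + hi) // 2
        recurse(lo, mid)           # finish the left half before touching
        recurse(mid, hi)           # the right half, keeping accesses local

    recurse(0, len(data))


def scale_codelet(data, lo, hi):
    """Toy codelet: double every element in data[lo:hi]."""
    for i in range(lo, hi):
        data[i] *= 2


values = list(range(10))
run_chunked(values, scale_codelet, leaf_size=4)
```

For a plain map like this, the chunking changes nothing about the result, only the order in which memory is touched. The payoff would come from codelets that make several passes over their piece.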

But what about laziness? In a lazy language, you get a graph of possible computations and the compiler determines which node to evaluate next. Conventional lazy languages evaluate only those nodes that are, or will soon be, needed, i.e. lazily. Eager Haskell, on the other hand, is designed to evaluate most things immediately, and to suspend computation only when it runs out of resources (heap or stack space). We can change the resource constraint to be cache space instead. That is, only compute a subset of an array that fits in a single page of cache, then maybe suspend computation. What Haskell gives you is a clean operational semantics with much more flexibility in scheduling computations.
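The “evaluate eagerly until a resource runs out, then suspend” policy can be mimicked in Python with a generator. This is only an analogy, not Eager Haskell’s actual mechanism: the thunks are plain zero-argument functions, and the name eager_with_budget and its budget parameter (standing in for “how many results fit in cache”) are assumptions made for illustration.

```python
def eager_with_budget(thunks, budget):
    """Force thunks eagerly, suspending after every `budget` results.

    `thunks` is an iterable of zero-argument callables (delayed computations).
    Each yielded list is one batch that was computed without interruption.
    """
    if budget < 1:
        raise ValueError("budget must be at least 1")
    batch = []
    for thunk in thunks:
        batch.append(thunk())      # eager: evaluate as soon as we reach it
        if len(batch) == budget:   # resource exhausted: suspend here
            yield batch
            batch = []
    if batch:                      # flush the final, partial batch
        yield batch


# Five delayed computations, with room for two results at a time.
squares = [lambda i=i: i * i for i in range(5)]
batches = list(eager_with_budget(squares, budget=2))
```

The consumer decides when to resume, so nothing beyond the current batch is computed until it is asked for. Within a batch, though, everything is forced immediately.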

Combine both ideas and you might get something useful. Write your program in a form of Eager Haskell (no infinite data structures, though) that uses optimized low-level functions (i.e. codelets). Take the lazy graph and use FFTW’s techniques to determine a good execution order. Compile and enjoy. If you look at what these guys did to get good performance from the Cell processor [1, 2], you’d prefer some help from the compiler. A simple 60-line C program became a 1200-line optimized mess for the Cell. That’s not fun.

Drunks know they are asses

March 10, 2008

From NYTimes story:

In a series of studies in the 1970s and ’80s, psychologists at the University of Washington put more than 300 students into a study room outfitted like a bar with mirrors, music and a stretch of polished pine. The researchers served alcoholic drinks, most often icy vodka tonics, to some of the students and nonalcoholic ones, usually icy tonic water, to others. The drinks looked and tasted the same, and the students typically drank five in an hour or two.

The studies found that people who thought they were drinking alcohol behaved exactly as aggressively, or as affectionately, or as merrily as they expected to when drunk. “No significant difference between those who got alcohol and those who didn’t,” Alan Marlatt, the senior author, said. “Their behavior was totally determined by their expectations of how they would behave.”

In a repeat of the session performed for a coming documentary, one participant insisted that she could not have been drinking because alcohol always made her flush.

“We told her that, yes, in fact she was drinking it,” Dr. Marlatt said. “She immediately flushed.”

Homemade Coherence

March 5, 2008

I’m running Xubuntu in VMware Server on XP Pro. I’ve turned off X11 and all other GUI stuff in Ubuntu, which brings base memory usage down to 130MB. I’m running Cygwin/X on XP, which allows me to run Emacs in Ubuntu (via ssh) and display it on my desktop along with my Windows apps much like Parallels’ Coherence. The important thing is to change the GRUB entry for Linux to “nosplash”; otherwise, it runs the XFCE graphical splash page and eats up way more memory. All I really need is Emacs; the rest is just eye candy.