Archive for June, 2008

Composable Parallel.Foreach loops

June 26, 2008

Microsoft has an excellent library for parallel programming in .NET. However, there’s a strange problem with nested parallel loops. Consider this sample code:

      void MultiplyMatrices(int size, double[,] m1, 
                            double[,] m2, double[,] result)
      {
          Parallel.For(0, size, i =>  // only this outer loop runs in parallel
          {
              for (int j = 0; j < size; j++)
              {
                  result[i, j] = 0;
                  for (int k = 0; k < size; k++)
                  {
                      result[i, j] += m1[i, k] * m2[k, j];
                  }
              }
          });
      }

Why aren’t all the loops turned into parallel loops? The documentation says that you should not convert the inner loops into parallel loops because it will degrade performance. This makes sense because there’s a significant cost to run the loop body through the scheduler, particularly in this example where it’s doing a cheap multiply. But what happens if the work in the inner loop is expensive? Then you would want to run the inner loops in parallel, too. So it really depends on how expensive the work is relative to the cost of using Parallel.For.

I would like to write a library of matrix operations that I can reuse in my code. So this MultiplyMatrices method should take a delegate parameter and invoke that in the loop body. But now, when writing the library, I can’t know whether the loops should be parallel, because I don’t know how expensive the delegate will be. Or, what if I compose several matrix operations, each of which uses some parallel loops? Performance may actually degrade because I’m using too much parallelism.
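
For instance, the reusable version might look something like this sketch (MatrixOp and the combine parameter are names I made up, not a real API). Notice the library has no way of knowing how cheap or expensive combine is:

      void MatrixOp(int size, double[,] m1, double[,] m2,
                    double[,] result,
                    Func<double, double, double> combine)
      {
          Parallel.For(0, size, i =>
          {
              for (int j = 0; j < size; j++)
              {
                  result[i, j] = 0;
                  for (int k = 0; k < size; k++)
                  {
                      // The library can't tell whether this call is a cheap
                      // multiply or an expensive computation.
                      result[i, j] += combine(m1[i, k], m2[k, j]);
                  }
              }
          });
      }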

The problem with the parallel library is that the operations aren’t composable. To be fair, I would have shipped the same parallel library because I don’t know how to solve this problem either. One crude idea is to cap parallelism at a small number based on the count of available CPUs: track the nesting level of parallel loops in the environment somehow (like nested transactions), and past some depth it’s no longer worth parallelizing, so the parallel loops degrade into normal, serial loops. I assume the Fortran folks have already figured out some solution. I’ll post it when I find it.
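
Here is a minimal sketch of that idea, assuming a hypothetical SmartParallel wrapper (not part of Microsoft’s library) that tracks nesting depth in thread-local state; the cutoff of one level is arbitrary:

      static class SmartParallel
      {
          // Each thread remembers how deeply it is nested in parallel loops.
          [ThreadStatic]
          static int nestingDepth;

          public static void For(int from, int to, Action<int> body)
          {
              // Already inside a parallel loop? The CPUs are presumably
              // saturated, so run this level serially.
              if (nestingDepth >= 1)
              {
                  for (int i = from; i < to; i++) body(i);
                  return;
              }

              Parallel.For(from, to, i =>
              {
                  nestingDepth++;            // increments on the worker thread
                  try { body(i); }
                  finally { nestingDepth--; }
              });
          }
      }

If every library called SmartParallel.For, composed operations would at least degrade gracefully; picking the right cutoff is exactly the part nobody knows how to do.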

Invest like an economist

June 22, 2008

The American Economic Association invests its $18M account in a small collection of index funds. The main allocations are 45% in the S&P 500, 25% in international markets, and 15% in bonds.

Peer-to-peer energy storage

June 17, 2008

One problem with electricity is that it’s difficult to store. A power plant has to be capable of generating sufficient power for peak demand on a few summer afternoons, even though most of the year the plant is underutilized. One wild idea is to store energy in electric car batteries at night, then draw it back during the day to shave those demand peaks. This was described by Ed Legge of the Edison Electric Institute:

Legge: Plug-in hybrids actually are one way that could give us a giant virtual battery.

Legge says some utilities eventually hope to borrow electricity from electric car batteries during peak energy hours. The cars would recharge at night, when demand is at its lowest.

Need BitTorrent for iTunes

June 17, 2008

This American Life, a fantastic radio show on NPR, reports that it costs $150k a year in bandwidth charges to deliver the show as a podcast on iTunes. A solution is to use the BitTorrent protocol to spread the load across many peers, sharply reducing the show’s bandwidth bill. Sadly, iTunes doesn’t support it, but there is a web proxy tool called iTorrent that can fake it. I think it was abandoned and then someone else cleaned it up a bit. If NPR could convince its audience to start using something like this tool, it would cut these costs, and I’m sure many other podcasters would like to reduce their bandwidth bills too. So if you’re looking for something to do, here’s an open problem that needs some work.

Bionic vision

June 13, 2008

I just got my vision tested 6 months after LASIK surgery. I think my prescription lenses were at -2.5 diopters, with almost no astigmatism. I can now see at 20/15, which is better than “normal” vision, with no itching, halos, or blurred vision. There was mild dryness for the first 2 months after surgery, but one morning it completely disappeared. When I was researching LASIK I ran across lots of scary stories on anti-LASIK forums, but the studies I read showed a very high success rate, especially for a simple prescription like mine. The neighborhood kids no longer call me a “four-eyed, skinny dork”. Now it’s just “skinny dork”, but I’m working on that.

32-bit OS and 4GB memory

June 11, 2008

I helped a friend buy a new laptop that came with 4GB of memory and 32-bit Vista. [ed: I've twice bought refurbished laptops from Dell and have not had any problems. This one was $600 less than a new model!] Unfortunately, the OS reports that only 3.5GB is available. So what happened to that last 0.5GB? Here’s an excellent explanation. To summarize, the computer pretends that all memory and devices (graphics card, PCI cards, BIOS, etc.) live in one flat 32-bit address space. The OS reserves the top portion of that address space (the last 0.5GB in this case) for communicating with those devices (read this), so it cannot be mapped to RAM. Apparently, XP once supported PAE, an Intel hack that fakes support for up to 64GB, but Microsoft disabled it in SP2 and in 32-bit Vista because poorly written drivers were causing system crashes. 32-bit Linux, on the other hand, supports PAE and can use up to 64GB of memory. Why didn’t MS let people turn on PAE at their own risk? Probably so they could charge more for server OSes.
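
The arithmetic, spelled out in code (the 0.5GB hole is what this particular laptop reserved; the exact size varies with the installed devices):

      long addressSpace = 1L << 32;             // 2^32 bytes = 4GB, all a 32-bit OS can address
      long deviceHole   = 512L * 1024 * 1024;   // ~0.5GB mapped to video memory, PCI, BIOS, ...
      long visibleRam   = addressSpace - deviceHole;
      Console.WriteLine(visibleRam / (1024.0 * 1024 * 1024));  // prints 3.5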

Information cascades

June 4, 2008

Robert Shiller, an economist at Yale, explains why the experts missed the housing bubble. “Were all these people stupid?” he asks. “It can’t be. We have to consider the possibility that perfectly rational people can get caught up in a bubble. In this connection, it is helpful to refer to an important bit of economic theory about herd behavior.” He points to this paper (PDF) on information cascades, which explains why people follow fads. Assume that three people (A, B, and C) want to bid on a house and no one has perfect information on prices (no one ever does). Let’s say A decides to bid 20% over the list price. Even if B believes that is a high price, he doesn’t know that A is wrong. Instead, the fact that A paid more can influence B into thinking his own hunch is wrong and maybe A was right. Let’s say B decides to bid 25% over the list price. Now C, who also thinks this is a high price, sees that both A and B believe the house is worth much more. That’s a much stronger push for C to conclude his original estimate is wrong and follow the herd. So a few bad decisions can snowball and sway the herd, creating a housing bubble. Even the “experts” fall for this. The paper generalizes the idea and adds some mathematical rigor, but the gist is that when we aren’t sure about something, we are easily influenced by other people’s decisions. That’s also why we had the Internet bubble.
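
To see how little it takes, here is a toy simulation loosely inspired by the cascade model in that paper (the 60% signal accuracy and the herd threshold of 2 are numbers I made up): each bidder gets a noisy private hunch, but once the observed bids lean far enough one way, everyone ignores their own hunch. Depending on the first few draws, a cascade can lock in even when most private signals are right:

      var rng = new Random();
      bool trulyOverpriced = true;    // the state nobody observes directly
      int bidHigh = 0, bidLow = 0;

      for (int bidder = 0; bidder < 8; bidder++)
      {
          // A private signal that matches reality only 60% of the time.
          bool signalSaysHigh = rng.NextDouble() < 0.6 ? trulyOverpriced : !trulyOverpriced;

          bool bidsLow;
          if (Math.Abs(bidHigh - bidLow) >= 2)
              bidsLow = bidLow > bidHigh;   // cascade: follow the herd, ignore the signal
          else
              bidsLow = signalSaysHigh;     // otherwise trust the private hunch

          if (bidsLow) bidLow++; else bidHigh++;
          Console.WriteLine("Bidder " + bidder + ": signal says " +
              (signalSaysHigh ? "overpriced" : "fair") + ", bids " + (bidsLow ? "low" : "high"));
      }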

Xubuntu on Sony Z505

June 4, 2008

I finally got Xubuntu 8.04 running on a 10-year-old Sony Z505RX laptop using the instructions found here and here. It runs OK on the laptop’s meager 128MB of RAM and Pentium II processor. Unfortunately, there’s a weird high-pitched tone (capacitor buzz?), but it goes away when I stick a Wi-Fi card in the slot. It is usable, but it takes a very long time to run updates or compile anything. Maybe it’ll run better on my XO laptop.

VHS was better than Betamax

June 4, 2008

It is commonly believed that Betamax was the superior video recording technology but was beaten by VHS due to better marketing and pornography. This is not true. Wikipedia does a good job of describing the history of the format war. In short, consumers cared more about cost and recording time than about picture quality. Recording a TV movie required two expensive one-hour Betamax tapes, but only one cheaper VHS tape. Hollywood adopted VHS because it could rent movies on a single tape. In fact, VHS quickly matched regular Betamax in quality anyway. And regardless, porn had little if any influence on the format war.

WebKit’s JS Interpreter was an AST walker

June 4, 2008

I was surprised to discover that the previous JavaScript interpreter in Apple’s WebKit, the engine behind the Safari browser, was very slow. Go here and scroll down to “Why It’s Fast”: they were interpreting the syntax tree directly rather than compiling down to bytecodes. The first line says “Like the interpreters for many scripting languages…”, which implies that other interpreters use this awful technique. If anyone knows of a popular interpreter that does this, please tell me in the comments.

A better way is to compile to bytecode and interpret that. Most people implement a bytecode interpreter as a big switch statement, but an obvious improvement is to use direct threading (described in that post). Most interpreters use a stack to store operands, so here’s a paper that tries to remove the redundant loads/stores in a stack-based VM, and another that reduces the code bloat from that technique. Better yet, this paper describes a register-based layout that reduces the number of bytecode dispatches. Another improvement is “selective inlining”, described here by the guy working on the Squeak interpreter; the trick is complicated, but it improves performance substantially. The point is that there’s been lots of work on efficient interpreters, yet WebKit used a technique taught in Intro to Compilers classes because it’s easy for undergrads to understand. I can’t believe they used this in a real product.
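
For contrast, here is a minimal sketch of the switch-dispatch style (the opcodes are invented for illustration). Each instruction costs one dispatch, while an AST walker re-traverses node objects, chasing child pointers and making virtual calls at every step; direct threading goes further and replaces the switch with computed gotos, which requires C or assembly rather than C#:

      using System;
      using System.Collections.Generic;

      enum Op { Push, Add, Mul, Print, Halt }

      static class Vm
      {
          public static void Run((Op op, int arg)[] code)
          {
              var stack = new Stack<int>();
              for (int pc = 0; ; pc++)
              {
                  var (op, arg) = code[pc];
                  switch (op)   // one dispatch per bytecode, no tree to re-walk
                  {
                      case Op.Push:  stack.Push(arg); break;
                      case Op.Add:   stack.Push(stack.Pop() + stack.Pop()); break;
                      case Op.Mul:   stack.Push(stack.Pop() * stack.Pop()); break;
                      case Op.Print: Console.WriteLine(stack.Peek()); break;
                      case Op.Halt:  return;
                  }
              }
          }
      }

Evaluating (2 + 3) * 4 then means compiling it once to a flat instruction array: Vm.Run(new[] { (Op.Push, 2), (Op.Push, 3), (Op.Add, 0), (Op.Push, 4), (Op.Mul, 0), (Op.Print, 0), (Op.Halt, 0) }).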

[ed: Apparently, WebKit's previous interpreter was still faster than most other interpreters (IE, Mozilla, Opera, etc).]

