Databases layout tables carefully on the disk because disk I/O is insanely slow relative to memory. Scientific programmers organize data structures (DTs) carefully to maximize cache utilization because memory is insanely slow relative to the cache. Since writing cache conscious (CC) DTs is hard, the folks in MIT’s Cilk group came up with cache oblivious DTs (CO). This post explains the idea, albeit verbosely. The basic idea is to pack data into arrays so it will all load into the cache. These guys show that CC still beats CO by a mile, but it’s soooo difficult to implement. Some dudes at IBM show that a garbage collector can improve performance on some benchmarks simply by improving memory locality. It seems to me that if the GC documentation told me that it will usually group objects in a particular order, then I can write my DT code to take advantage of that locality. The GC is constructing CO DTs for free. But since I don’t know how the GC groups things, I could inadvertently be jumping all over main memory and suffering zillions of cache misses. That ain’t cool.
Archive for October, 2007
Cache oblivious data structures for free
October 19, 2007Examining .NET’s JIT output
October 18, 2007This post explains how to look at the optimized x86 generated by the CLR.
F# hits the big time
October 18, 2007F# is moving out of research into a first-class language running on .NET. F# is a derivative of OCaml, a strongly-typed functional language with imperative and OO features. I’ve had the great fortune of working with Don Syme on Project 7 (a largely failed attempt to port “academic” languages to .NET) and at MSR Cambridge a long time ago. He’s a very sharp guy who also contributed to the design and implementation of generics in C# and the CLR. Who says nothing ever comes from research groups?
Send code, not data
October 18, 2007There are times when it is more appropriate to send a chunk of code to another machine rather than receive a large amount of data. One would think this would be more practical with portable languages running on the CLR or JVM. One should be able to serialize a method, then regenerate the method on another machine. Fortunately, this post describes how to generate a DynamicMethod from a MethodInfo in .NET. The next step is to serialize the MethodInfo instance, which looks a bit tricky. I’m still looking into that.
Compiler generator using method inlining
October 18, 2007It should be possible to write a language interpreter on top of Java or .NET and rely on method inlining to instantly get compiled code. Implement most of your interpreter opcodes as small static methods: Add, Subtract, LessThan, etc. Translate your language into Java/C# methods that call these opcodes. When the JIT sees all these small static methods, it should inline and optimize them. Presto! You’ve just written a compiler. It’s a bit like using partial evaluation to generate a compiler from an interpreter. I think Peter Sestof did this the hard way a long time ago.