Rerunning an old benchmark

In 2011, Robert Hundt published this paper testing the performance of several languages on a loop detection task typically found in compilers. The code is available here. The Go code had to be tweaked a bit. I ran the benchmarks on an EC2 m3.medium instance with Ubuntu 14.04 (ami-1d8c9574). For some benchmarks there’s a Pro version where someone hand-optimized the code. Here are the results:

Language     Runtime (sec)
C++          48
Java         93
Java Pro     40
Scala        74
Scala Pro    28
Go           92
Go Pro       63
Python       891
PyPy         138

 

The following lessons can be learned: PyPy is awesome. Java, Scala, and Go are in the same league. Finding an expert to optimize your hotspots might double performance.
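
To put numbers on those lessons, here is a small sketch that derives the ratios from the table. The dictionary and variable names are my own and only illustrative; the runtimes are the ones listed above.

```python
# Runtimes in seconds, copied from the results table.
runtime = {
    "C++": 48, "Java": 93, "Java Pro": 40, "Scala": 74, "Scala Pro": 28,
    "Go": 92, "Go Pro": 63, "Python": 891, "PyPy": 138,
}

# Slowdown of each entry relative to C++.
relative_to_cpp = {name: secs / runtime["C++"] for name, secs in runtime.items()}

# Speedup from the hand-optimized ("Pro") version, for languages that have one.
pro_speedup = {
    name: runtime[name] / runtime[name + " Pro"]
    for name in runtime
    if name + " Pro" in runtime
}

# Speedup of PyPy over the plain Python run.
pypy_speedup = runtime["Python"] / runtime["PyPy"]

for name, factor in pro_speedup.items():
    print(f"{name}: {factor:.2f}x faster after hand optimization")
print(f"PyPy: {pypy_speedup:.2f}x faster than plain Python")
```

By this arithmetic the Pro versions of Java and Scala are a bit more than twice as fast as the originals, Go Pro is roughly 1.5x faster, and PyPy is around 6.5x faster than plain Python.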

Versions:

  • g++ (Ubuntu 4.8.2-19ubuntu1) 4.8.2
  • java version "1.8.0_05"
  • go version go1.3 linux/amd64
  • Python 2.7.6
  • Python 2.7.6 (2.3.1+dfsg-1~ppa1, Jun 20 2014, 09:27:47) [PyPy 2.3.1 with GCC 4.6.3]
  • Scala code runner version 2.11.1 — Copyright 2002-2013, LAMP/EPFL

 

My Next Language

I have a decent background in programming languages. In a professional context (i.e., for real money) I’ve used C#, Scheme (my favorite), Lisp, and C. For projects I’ve also used Java, C++, and Python. For tinkering around I’ve used Prolog, Haskell, OCaml, Factor and JavaScript. I’ve studied many more, including Go, F#, Ruby, Scala, Clojure, and on and on.

I want to pick a new language for large-scale production software. Mostly backend data processing and some web development. Here are my criteria, in order of importance:

Static typing w/ inference: Though I still love Scheme, I want the compiler to catch as many errors as possible. Static types are great but verbose. Inference means I only need to add a few declarations to get type safety. Other safety features would be great, like dependent types and C#’s Contracts. This eliminates all dynamically typed languages from consideration. Languages without inference are less likely to get picked.

Performance within 5x of C: If a language is slower than this it can become a hindrance for some projects. This eliminates all languages that run on an interpreter.

Vibrant community: This is a painful one, because it eliminates lots of interesting languages that aren’t mainstream. If you want to ask questions, get help, get hired or hire people, a popular language is necessary.

Libraries & Tools: A major benefit of a big community is access to lots of libraries and tools. Time to market is critical, and it’s better to grab a library than roll your own. IDEs, debuggers, and other nifty tools (like an interpreter!) really help. Also, I’d like support for running on different cloud services, like Heroku or Google Cloud.

Ease of use: I love Scheme because it is small and simple. Even C is still a brilliant diamond. C++ is just too much. I really like Scheme’s macro system, so some kind of language extensibility would be great. Unfortunately most languages pile on features rather than simplify.

Platform neutral: I want to be able to use this language on Linux and Windows, but also iOS and Android if possible.

Support for parallelism: CPUs are only going to add more cores. It is important that the language/runtime not disallow parallelism (e.g. the GIL in Python). Even better would be support for distributed programming, like Erlang.

The contenders, ranked by meeting my criteria:

  1. Scala: Stands on top of the gigantic Java ecosystem. The Akka library offers good parallelism. A bit more complex than I’d like. The only thing missing is support for writing iOS apps.
  2. F#: With Xamarin it really runs everywhere, including iOS and Android. It stands on the less gigantic .NET ecosystem.
  3. Go: No mobile dev, a small but growing community. Go’s type inference is limited to local := declarations, but explicit declarations don’t look too bad. Support for parallelism is excellent.
  4. D: No mobile dev and a small community. Facebook could push D more into the mainstream. Less complex than C++, but still more complex than I’d like. Why isn’t D more popular?
  5. Haskell: No mobile, small but influential community. Probably the best language here. But integration with cloud services would be poor, and connecting with various middleware products might be complex.
  6. Swift: It’s only been a month, but Apple suggested it could be used for server-side code as well. With Apple and iOS behind it, there’s no doubt this will have a large community. It probably won’t run on Windows or Android. No support for parallelism beyond GCD.
  7. Rust: In the same space as D, this language is certainly buzzword compliant. It’s too early to tell if it will become popular.

The finalists are Scala and F#, which are really in the same space. The difference is the JVM is excellent everywhere, but Mono can be quite a bit slower. Also, lots of big companies are using Scala, which means the community will grow bigger over time.

The winner is Scala. 

 

Apple’s Continuity

The people at Apple read my old blog post and answered with Continuity. This is a far better approach than Microsoft’s Surface Pro, which attempts to physically merge 2 devices into 1. Apple will let each device do what it’s good at, but integrate those devices where possible. It’s a terrific idea, but it’s likely Apple will not open this feature up to developers, limiting integration to that which Apple deigns to give the masses. Microsoft generally does the opposite, granting developers wide access to do as they please. This usually degenerates into a cacophony of half-assed broken implementations by morons. But at least it would be open. 

Microsoft still has an opportunity to leapfrog Apple with its Nokia arm. Imagine a new Windows Phone 9 running on a Surface Phone designed by Nokia. You could have integration between desktop & phone apps like WhatsApp, Skype, Twitter, etc. And MS could have deep integration between Office for phone & desktop, something like Tempo. The thing is, MS Research has been working on this stuff for years. For example, take a look at Eric Horvitz’s early publications in HCI. IIRC they had a smarter version of Mac’s Notification Center in the late 90s using techniques similar to those found in Google Now. They’ve got lots of papers about working seamlessly across devices. They have the know-how, but it’s not making it into their products.

First time Mac User

The defining characteristic of the Macs I used in the 80s is they always lost my homework. Therefore, I’ve always owned a Windows PC for personal use and used Unix for development. But this year I bought a 13″ Macbook Pro Retina w/ 16GB RAM and 512GB SSD. The reason I went Apple is because the PC laptops were underwhelming and cost the same or more than the rMBP. No other laptop gives me 16GB of memory in this small package. The Macbook is a fantastic piece of hardware. Battery life is amazing! The trackpad is better than Windows 8 touchscreens. 

On the other hand, Mac OS X is underwhelming. The big issue is my laptop crashes from a Sleep Wake Failure almost every week. The Apple Genius was clueless, and the forums are filled with people stuck with the same issue. I’ve never had a machine that crashed this much, Windows or Linux. Other recent Mac converts have also suffered random crashing for unknown reasons.

I think a lot of devs get Macs because it’s Unix underneath. Why isn’t there a decent package manager then? Homebrew is ok, but nowhere near apt or rpm. I do all my dev stuff in a Linux VM, which is much much better. Many devs use Vagrant, which is the same thing. In both cases the host OS doesn’t matter. So I think the Unix underpinnings are moot. On the other hand, you need Macs for iOS, just like you need Windows for commercial .NET work. There’s no getting around that.

Here are some small annoyances I’ve run into:

  • No presentation mode. 
  • Doesn’t automatically change security based on network. On Windows, I can set locations as Home, Work or Public. I use ControlPlane to simulate this, but it’s wonky. I also use it to simulate Presentation mode. 
  • Often when I resume after the screen turns off, it takes a very long time before the windows appear again. The screen remains black  (though I can see my pointer) and the screen turns on briefly when I switch desktops. I have no idea what’s going on here.
  • Clamshell mode turns on when I have power & an external monitor plugged in, but no keyboard. The docs say all 3 are required. The Apple guy had no idea. 
  • The filesystem sucks. Backups require HFS+. It does whole file copy rather than block copy, so VMware tells you to turn off backups of images because it will copy the entire 50GB file when a single bit changes. 
  • Integration with Windows networks hasn’t worked well. Linux does a much better job. Windows has 90% of the PC market, so Apple should make this work smoothly. 
  • External monitor support is wonky. When I’ve given presentations, the OS freaked out a few times. I lost access to the projector and couldn’t click on anything. Had to reboot in the middle of a talk. 
  • If I close all the windows for an app, why doesn’t the app shutdown? I have to hit Command-Q. 
  • I can’t Cmd-Tab between windows in an app, only between apps. Cmd-` works sometimes. 
  • Activity Monitor is not as good as Win8’s Task Manager. 
  • I use BetterTouchTool to get Windows 7’s Snap feature. 
  • Virtual desktops aren’t handled as well as Linux desktops. 
  • Copy & paste files is weird. I must be doing it wrong. On Windows I right-click & drag. On Mac I think it’s click, drag and hold Command. It’s like Twister for my fingers.
  • Connecting HDMI to my TV wasn’t easy. I use an old Windows laptop instead. 
  • No fine-grained volume control like Windows’ audio mixer. Also, Windows will automatically reduce audio when you use a voice app like Skype. I don’t think OSX offers that kind of audio control, else I’d do it with ControlPlane.
  • Is there a way to tweak power use like Windows? I didn’t see anything. Thank goodness the defaults give me great battery life. 
  • Everyone talks about the free apps that come with OSX: Calendar, iPhoto, iMovie, etc. I don’t use these and don’t really care. 
  • I still don’t like the menu bar. On a laptop every vertical pixel is precious. On Windows the task bar can autohide. I hate that Unity and Gnome copied this look. 

If a PC manufacturer just copied the rMBP at a good price I’d switch back to PCs. Windows 8 with Classic Shell is fine. I live in VMware anyway. The Mac hardware is so good right now that I can live with these minor inconveniences. The combination of 16GB RAM and PCIe SSD means I can run 3 VMs without any slowdown. Overall, it’s a great machine and an okay OS. 

I Use This

At usesthis.com they interview a bunch of people to ask about their current computer setup and their dream setup. Most of them are using aged machines, ultraportables (Mac Air is popular), and Emacs/VIM for editing (a few slickedit fans). Most of them have little to add in their dream setup (more battery life, better cloud sync). Are our tools finally good enough? Not for me.

Current setup: A 4 year old Sony Z laptop, 8GB RAM, SSD. I have a 6 year old homebrew desktop w/ a few terabytes of storage and 2 old monitors. I use Windows 7 and do most development in Ubuntu within VMware player (using Unity). When I need it, I’ve got instances running in AWS. I’ve got an ancient jailbroken iPhone, an iPad 1, and an older Kindle. Everything I own is old and works just fine. I use Emacs, bash, Chrome, and the usual grab-bag of software. I even use Visual Studio.

Dream setup: We all own a suite of devices that work OK in isolation. Each device does some things well for certain use cases. But when those devices are together, they should work together much better. Bluetooth and AirPlay are steps in the right direction, but ad-hoc and incomplete. When I sit in my car, my music and phone calls are routed through the car’s sound system. However, I can’t sync the map, listen to or send text messages with my voice, or ask Siri for help. AirPlay is really nice if you have Apple TV. You can send music, video and photos to the TV when you wish. You can send music to the stereo w/ Apple’s AirPort. Google Voice has a feature that allows you to move a call to another phone without hanging up. The Kindle can sync my current location in a book to every device, but I have to move papers and articles over manually.

I want more of this sort of slick integration between my devices. I’d like to group my work into virtual workspaces that have the same intention. Development stuff in one workspace, entertainment in another, communication in another, and so on. If I have 3 monitors, it should spread my workspaces across them. If I get a call while at my computer, I should be able to route the call to my computer’s speakerphone. When I’m reading something in Chrome on my desktop, I should be able to move it to my iPad. When I set my laptop down next to my desktop, it should allow me to move my work seamlessly to my desktop (and vice versa). File syncing is only the first step. I want my activities synced and moved across devices based on what I want, what the device is capable of, and what my work modality is (reading, typing, watching, listening, talking). Some of this can be done now in an ad-hoc way, and I’ve tried to glue the pieces together. But there’s enough friction to still make it fail in irritating ways.

We’re all going to accumulate more and more smart devices. We need them to work together better.

Combinations

Python code to generate k-combinations from N numbers. Included are a recursive and an iterative version. It’s easy.

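def choose_rec(n, k):
  # Print every k-combination of range(n) in lexicographic order.
  a = list(range(k))
  choose_rec_aux(0, n, k, 0, a)

def choose_rec_aux(col, n, k, start, a):
  # Fill position col with each value that still leaves room for the rest.
  if col == k:
    print(a)
    return
  for i in range(start, n-k+col+1):
    a[col] = i
    choose_rec_aux(col+1, n, k, i+1, a)

def choose_iter(n, k):
  a = list(range(k))
  # f[i] is the largest value position i can hold: n-k+i.
  f = list(range(n-k, n))

  print(a)
  while True:
    # Find the rightmost position that can still be incremented.
    i = k-1
    while i >= 0 and a[i] == f[i]:
      i = i - 1
    if i < 0:
      return
    # Bump it, then reset everything to its right to the smallest valid values.
    a[i] = a[i] + 1
    for j in range(i+1, k):
      a[j] = a[j-1] + 1
    print(a)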
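
If you want to check the output rather than eyeball it, here is a sketch of the iterative idea as a generator, compared against the standard library’s itertools.combinations. The name iter_combinations and the early exit for k > n are my own additions, not part of the code above.

```python
from itertools import combinations

def iter_combinations(n, k):
    """Yield every k-combination of range(n) in lexicographic order."""
    if k > n:
        return  # assumption: no combinations exist, so yield nothing
    a = list(range(k))
    yield tuple(a)
    while True:
        # Find the rightmost position that is below its maximum value n-k+i.
        i = k - 1
        while i >= 0 and a[i] == n - k + i:
            i -= 1
        if i < 0:
            return
        # Bump it, then reset everything to its right to the smallest valid values.
        a[i] += 1
        for j in range(i + 1, k):
            a[j] = a[j - 1] + 1
        yield tuple(a)

def matches_stdlib(n, k):
    """True if the generator agrees with itertools.combinations for (n, k)."""
    return list(iter_combinations(n, k)) == list(combinations(range(n), k))
```

Calling matches_stdlib(n, k) for a handful of small n and k is a cheap sanity check. Yielding tuples instead of printing also makes the function usable from other code.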

Bloom Filter in C#

Here’s a simple implementation of a Bloom Filter.

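    using System;
    using System.Collections;
    using System.Collections.Generic;
    using System.Security.Cryptography;

    public class BloomFilter
    {
        HashAlgorithm hash;
        int m, n, k;   // m = bits in the table, n = expected item count, k = probes per item
        BitArray table;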

        public BloomFilter(HashAlgorithm h, int size, float falsePositiveRate)
        {
            hash = h;
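            // Standard sizing: m = -n ln(p) / (ln 2)^2 bits, k = (m / n) ln 2 probes.
            double bits = -(size * Math.Log(falsePositiveRate)) / Math.Pow(Math.Log(2), 2);
            double hashes = Math.Log(2) * bits / size;
            n = size;
            k = Math.Max(1, (int)Math.Round(hashes));
            m = (int)Math.Ceiling(bits);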
            table = new BitArray(m);
        }

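        IEnumerable<int> Probe(byte[] input)
        {
            int chunks = hash.HashSize / 32;   // number of 32-bit words in one digest
            byte[] val = input;
            int i = 0;
            while (i < k)
            {
                // Rehash the previous digest so each round yields fresh words.
                val = hash.ComputeHash(val);
                for (int j = 0; j < chunks && i < k; j++, i++)
                    // Read non-overlapping words; unsigned math avoids Math.Abs(int.MinValue).
                    yield return (int)(BitConverter.ToUInt32(val, j * 4) % (uint)m);
            }
        }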

        public void PutValue(byte[] input)
        {
            foreach (int index in Probe(input))
                table.Set(index,true);
        }

        public bool Exists(byte[] input)
        {
            foreach (int index in Probe(input))
            {
                if (!table.Get(index)) return false;
            }
            return true;
        }

    }
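
For a quick feel for the sizing math, here is a rough Python sketch of the same idea. It uses the standard formulas m = -n ln(p) / (ln 2)^2 and k = (m / n) ln 2, and it assumes SHA-256 with chained digests as the probe source. The names bloom_parameters and BloomFilter are just for this example, and it is not a port of the C# class.

```python
import hashlib
import math

def bloom_parameters(n, p):
    """Return (m, k): table bits and probe count for n items at false-positive rate p."""
    m = math.ceil(-n * math.log(p) / math.log(2) ** 2)
    k = max(1, round(m / n * math.log(2)))
    return m, k

class BloomFilter:
    def __init__(self, n, p):
        self.m, self.k = bloom_parameters(n, p)
        self.bits = bytearray((self.m + 7) // 8)

    def _probe(self, data):
        # Chain SHA-256 digests and slice each one into 32-bit words, one word per probe.
        digest = data
        produced = 0
        while produced < self.k:
            digest = hashlib.sha256(digest).digest()
            for off in range(0, len(digest), 4):
                if produced == self.k:
                    return
                yield int.from_bytes(digest[off:off + 4], "big") % self.m
                produced += 1

    def add(self, data):
        for idx in self._probe(data):
            self.bits[idx // 8] |= 1 << (idx % 8)

    def __contains__(self, data):
        # True means "possibly present"; False means "definitely absent".
        return all(self.bits[idx // 8] & (1 << (idx % 8)) for idx in self._probe(data))
```

For 1000 items at a 1% false-positive rate, bloom_parameters(1000, 0.01) works out to 9586 bits and 7 probes, so a little over 1KB of table. Anything added with add() will always test as present. Things never added will usually, but not always, test as absent.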