I want to pick a new language for large-scale production software. Mostly backend data processing and some web development. Here are my criteria, in order of importance:
Static typing w/ inference: Though I still love Scheme, I want the compiler to catch as many errors as possible. Static types are great but verbose. Inference means I only need to add a few declarations to get type safety. Other safety features would be great, like dependent types and C#’s Contracts. This eliminates all dynamically typed languages from consideration. Languages without inference are less likely to get picked.
Performance within 5x of C: If a language is slower than this it can become a hindrance for some projects. This eliminates all languages that run on an interpreter.
Vibrant community: This is a painful one, because it eliminates lots of interesting languages that aren’t mainstream. If you want to ask questions, get help, get hired or hire people, a popular language is necessary.
Libraries & Tools: A major benefit of a big community is access to lots of libraries and tools. Time to market is critical, and it’s better to grab a library than roll your own. IDEs, debuggers, and other nifty tools (like an interpreter!) really help. Also, I’d like support for running on different cloud services, like Heroku or Google Cloud.
Ease of use: I love Scheme because it is small and simple. Even C is still a brilliant diamond. C++ is just too much. I really like Scheme’s macro system, so some kind of language extensibility would be great. Unfortunately most languages pile on features rather than simplify.
Platform neutral: I want to be able to use this language on Linux and Windows, but also iOS and Android if possible.
Support for parallelism: CPUs are only going to add more cores. It is important that the language/runtime not disallow parallelism (e.g. the GIL in Python). Even better would be support for distributed programming, like Erlang.
The contenders, ranked by meeting my criteria:
- Scala: Stands on top of the gigantic Java ecosystem. The Akka library offers good parallelism. A bit more complex than I’d like. The only thing missing is support for writing iOS apps.
- F#: With Xamarin it really runs everywhere, including iOS and Android. It stands on the less gigantic .NET ecosystem.
- Go: No mobile dev, a small but growing community. Go has only limited type inference (`:=` for local variables), but declarations don’t look too bad. Support for parallelism is excellent.
- D: No mobile dev and a small community. Facebook could push D more into the mainstream. Less complex than C++, but still more complex than I’d like. Why isn’t D more popular?
- Haskell: No mobile, small but influential community. Probably the best language here. But integration with cloud services would be poor, and connecting with various middleware products might be complex.
- Swift: It’s only been a month, but Apple suggested it could be used for server-side code as well. With Apple and iOS behind it, there’s no doubt this will have a large community. It probably won’t run on Windows or Android. No support for parallelism beyond GCD.
- Rust: In the same space as D, this language is certainly buzzword compliant. It’s too early to tell if it will become popular.
The finalists are Scala and F#, which are really in the same space. The difference is the JVM is excellent everywhere, but Mono can be quite a bit slower. Also, lots of big companies are using Scala, which means the community will grow bigger over time.
The winner is Scala.
I reported earlier that the N-Queens problem on F# was too slow. I translated the benchmark into C# and it ran 2X faster than F#. It was essentially the same program… lots of recursion and list processing. So how can F# be 2X slower than C# for the same code? One simple instruction: isinst. I learned long ago that type checking on the CLR is too slow for dynamic languages. Though F# is statically typed, it still performs type checks on discriminated union types. Specifically, operations on lists must always check whether the list is empty. F# does this by checking whether the object is an instance of a Nil class: “obj is Nil” in C#. Type checking is extremely slow and should be avoided whenever possible in hot spots. List processing is definitely a hot spot.
When I rewrote the N-Queens problem I used my own list class which used null to represent Nil. Checking “obj == null” is extremely fast. I then modified my list to use a Nil class and performance dropped down to F# speed. The solution I used with my Scheme compiler was to use null to represent the end of the list. An alternative is to skip the nil check and wrap the list access in a try-catch. It only fails in the exceptional situation anyway, since properly written code will check for nil before it calls hd. This should speed up list performance dramatically.
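To make the two list encodings concrete, here is a minimal sketch in Java rather than F# or C#, so it runs on the JVM and says nothing about CLR timings. The names (`ListEncodings`, `TList`, `TNil`, `TCons`, `NCons`, `sumTagged`, `sumNull`) are made up for illustration and are not the classes from my benchmark. The first encoding ends the list with an instance of a Nil class, so every step needs a type test (`instanceof` here, the analogue of isinst). The second ends the list with null, so every step is a plain reference comparison.

```java
public class ListEncodings {
    // Encoding 1: the empty list is an instance of a dedicated Nil class.
    static abstract class TList {}
    static final class TNil extends TList {}
    static final class TCons extends TList {
        final int head;
        final TList tail;
        TCons(int head, TList tail) { this.head = head; this.tail = tail; }
    }

    // Walks the list with a type test per step ("obj is Nil" in C#).
    static int sumTagged(TList xs) {
        int total = 0;
        while (!(xs instanceof TNil)) {
            TCons cell = (TCons) xs;   // the cast is a second type check
            total += cell.head;
            xs = cell.tail;
        }
        return total;
    }

    // Encoding 2: the empty list is simply null.
    static final class NCons {
        final int head;
        final NCons tail;
        NCons(int head, NCons tail) { this.head = head; this.tail = tail; }
    }

    // Walks the list with a null comparison per step ("obj == null").
    static int sumNull(NCons xs) {
        int total = 0;
        while (xs != null) {
            total += xs.head;
            xs = xs.tail;
        }
        return total;
    }

    public static void main(String[] args) {
        TList tagged = new TCons(1, new TCons(2, new TCons(3, new TNil())));
        NCons nullEnded = new NCons(1, new NCons(2, new NCons(3, null)));
        System.out.println(sumTagged(tagged));    // prints 6
        System.out.println(sumNull(nullEnded));   // prints 6
    }
}
```

Both loops compute the same sum; the only difference is the test that ends the traversal, which is the check that sits in the hot path of every list operation.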
I’m writing a bunch of Scheme benchmarks in F#. The N-Queens problem is determining how to place 8 queens on an 8×8 chess board so that no queen attacks another. Computing all 92 possibilities involves a fair number of recursive function calls and a lot of list processing. To do this 2000 times with F# takes about 17 seconds. For comparison, it takes 26 seconds with Scheme48 (a Scheme interpreter). Compiled implementations of Scheme can solve this problem in 2 to 3 seconds. Why is F# only 2X faster than Scheme48? I’ll report the rest of the benchmarks later.
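For readers who have not seen the benchmark, here is a rough sketch of the kind of work it does, written in Java rather than F# or Scheme. It is not the benchmark code itself: the names (`NQueens`, `Cons`, `safe`, `count`, `solutions`) are hypothetical, and the column loop is a plain `for` where the Scheme version would recurse over a list. It does keep the important part, which is that each partial placement is an immutable cons list that is walked on every safety check.

```java
public class NQueens {
    // Immutable cons list of queen columns, most recent row first.
    // null marks the empty list.
    static final class Cons {
        final int col;
        final Cons rest;
        Cons(int col, Cons rest) { this.col = col; this.rest = rest; }
    }

    // True if a queen placed in `col` on the next row is not attacked
    // by any queen already in `placed`.
    static boolean safe(int col, Cons placed) {
        int dist = 1; // row distance to the queen being examined
        for (Cons q = placed; q != null; q = q.rest, dist++) {
            if (q.col == col || Math.abs(q.col - col) == dist) {
                return false; // same column or same diagonal
            }
        }
        return true;
    }

    // Counts the complete placements reachable from a partial placement.
    static int count(int n, int row, Cons placed) {
        if (row == n) {
            return 1;
        }
        int total = 0;
        for (int col = 0; col < n; col++) {
            if (safe(col, placed)) {
                total += count(n, row + 1, new Cons(col, placed));
            }
        }
        return total;
    }

    static int solutions(int n) {
        return count(n, 0, null);
    }

    public static void main(String[] args) {
        System.out.println(solutions(8)); // prints 92
    }
}
```

Wrapping the `solutions(8)` call in a loop of 2000 iterations gives the shape of the timing run described above, though numbers from the JVM are not comparable to the F# and Scheme48 figures.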
F# is moving out of research into a first-class language running on .NET. F# is a derivative of OCaml, a strongly-typed functional language with imperative and OO features. I’ve had the great fortune of working with Don Syme on Project 7 (a largely failed attempt to port “academic” languages to .NET) and at MSR Cambridge a long time ago. He’s a very sharp guy who also contributed to the design and implementation of generics in C# and the CLR. Who says nothing ever comes from research groups?