The other day, I was reading yet another benchmark: “Lambda Land: Functional Languages Need Not Be Slow”. (Is it a benchmark? Nice article anyway.) It compares Racket, Python, Rust, Julia and JavaScript trying the Collatz conjecture up to 500,000.
First, two disclaimers:
- There are lies, damned lies, statistics and benchmarks.
- The ultimate way of cheating in a benchmark is changing the compiler.
The idea is to try to improve the time of the Racket program following the spirit of the benchmark, and then extract some ideas to improve the compiler to automate the improvements when possible. The Racket compiler transforms the program to Chez Scheme code, so most of the optimization pass improvements will be actually in the Chez Scheme compiler and they will hopefully make programs in both languages faster.
(I don’t know enough of the other languages to try something similar, so I won’t even try. I’d like to read any similar analysis.)
Each benchmark is different. In my opinion, the spirit of this benchmark is to write nice code and see what the compiler can do. I’m more used to benchmarks that try to squeeze every single millisecond and use curse primitives like #3%$unsafe-fl*+! so this one has a very different spirit, only allowing nice code.
With the changes, the run times are:
Program Version Seconds
Original from Lambda Land (Racket 9.0) 17.9
Faster now, but not so nice (Racket 9.0) 5.3
Faster in the future and nice (Racket 9.0) 17.8
Faster in the future and nice (Racket 9.3?) 4.9 (expected?)
(All times measured at the command line, outside DrRacket. By default DrRacket has “debugging” enabled and that adds a lot of additional time in exchange for better error reports.)
As a baseline, this is the original code with two highlighted expressions that will be the slow ones:
(define (count-collatz n [cnt 1])
(cond
[(= n 1) cnt]
[(even? n) (count-collatz (/ n 2) (+ 1 cnt))]
[else (count-collatz (+ (* 3 n) 1) (+ 1 cnt))]))
Fast but not so nice code
My first step was to write not nice code that is fast. But not very ugly code, just a little ugly. The idea is to have a tradeoff between nice and fast code. I tried a lot of variants until I understood these two expressions were the most important to improve the speed.
The changes are
Split the function in a part for fixnums and another for bignums.
The division / is slow, so let’s replace it with unsafe-quotient. (green part)
Also even? is slow, so let’s split the logic and use another unsafe (blue part)
(require racket/fixnum)
(require racket/unsafe/ops)
(define (count-collatz n [cnt 1])
(if (fixnum? n)
(cond
[(= n 1) cnt]
[(zero? (unsafe-fxremainder n 2)) (count-collatz (unsafe-fxquotient n 2) (+ 1 cnt))]
[else (count-collatz (+ (* 3 n) 1) (+ 1 cnt))])
(cond
[(= n 1) cnt]
[(even? n) (count-collatz (/ n 2) (+ 1 cnt))]
[else (count-collatz (+ (* 3 n) 1) (+ 1 cnt))])))
Not so ugly in my opinion, and the run time goes from 17.9 seconds to 5.3 seconds on my computer. Note that the first branch for fixnums has a few tricks, but the second is just to the original code.
Division and quotient
We can measure the run time when we change the green expression:
Blue Green Seconds
(even? n) (/ n 2) 17.9
(even? n) (quotient n 2) 17.8
(even? n) (fxquotient n 2) 17.3
(even? n) (unsafe-fxquotient n 2) 7.6
Let’s think of them like a chain of transformations.
Changing / to quotient is super difficult for the compiler. The optimization pass doesn't know enough algebra to make this transformation. (I think someday this is possible in very specific cases like this one, that uses modulo 2 and a predicate, but just assume it’s impossible.) Some of the examples in the other languages use the integer division, so I think it’s “fair” to use quotient here.
After this first change the time improves a little and the difference with fxquotient is small too. But the last version with unsafe-fxquotient is much faster. The good news is that from the expression that uses quotient it’s possible for an improved version of the compiler to do the replacements during the compilation and get the faster code.
(I also tried arithmetic-shift. There are a few weird things to check there too.)
The even? predicate
Now we can measure the run time when we change the blue expression:
Blue Green Seconds
(even? n) (unsafe-fxquotient n 2) 7.6
(zero? (remainder n 2)) (unsafe-fxquotient n 2) 20.8
(zero? (fxremainder n 2)) (unsafe-fxquotient n 2) 19.6
(zero? (unsafe-fxremainder n 2)) (unsafe-fxquotient n 2) 5.3
(zero? (bitwise-and n 1)) (unsafe-fxquotient n 2) 4.9
(zero? (fxand n 1)) (unsafe-fxquotient n 2) 4.9
(not (bitwise-bit-set? n 0)) (unsafe-fxquotient n 2) 5.4
Again, let’s think of them like a chain of transformations.
Changing even? to (zero? (remainder _ 2)) is super bad, I’m not sure why. It’s worth checking in the future. Some of the other languages use similar expressions, so it’s a “fair” change but also it’s worrying that it’s so slow because someone may use it.
Anyway, after the initial change, it is possible for an improved version of the compiler to change the primitive during compilation to fxremainder and then to unsafe-fxremander and would make the code faster than the original version.
The next two versions with bitwise-and and fxand are slightly faster, but they are too different from the original code, so I don’t classify them as “fair”. In both cases the compiler changes them to use the equivalent of unsafe-fxand.
The last version with (not (bitwise-bit-set? _ 0)) is as fast as the version that uses (zero? (unsafe-fxremainder _ 2)), but it uses a predicate. This is more friendly for the optimization pass that understands better predicates than functions with binary results. This may be relevant in a distant future to allow the compiler to replace / with quotient.
Anyway, it’s better to make the optimization pass just transform even? to the unsafe version of the Chez Scheme primitive cs:fxeven? that is as fast as (zero? (fxand _ 1)).
A macro for the happy path
One of the nice features of Racket is that you can use macros to make weird transformations in the code. (If you don’t want your coworkers and future you to hate you, then use macros wisely, document them, use syntax-parse to get nice error messages and avoid using macros when there is another option.)
So we will define a macro happy-path
(define-syntax-rule (happy-path test
body ...)
(if test
(begin body ...)
(begin body ...)))
It’s a very silly macro. When test is true then it runs body ... and when test is false it runs body ... too! The interesting part is that the compiler can apply different optimizations to each branch. In particular, in our case test will be (fixnum? n) so an improved version of the compiler can make in the first branch all the changes we discussed before and so we get nice code that is fast. The final version is
(define (count-collatz n [cnt 1])
(happy-path (fixnum? n)
(cond
[(= n 1) cnt]
[(even? n) (count-collatz (quotient n 2) (+ 1 cnt))]
[else (count-collatz (+ (* 3 n) 1) (+ 1 cnt))])))
Conclusions
If you know that all the arguments and the result are integers, use quotient instead of /. I think this is a good recommendation for all languages and compilers.
Try happy-path to give an opportunity for the optimizer to make improvements in the most expected path, or both if you are very lucky. Probably (fixnum? n) and (flonum? n) are useful tests in general. It would be nice that the compiler can do that for you, but it’s difficult without making all the executable code twice as big and perhaps not getting any speed improvements.
I expect these improvements to land on Racket 9.3 (mid 2026) so use your time machine and upgrade. I made a proof of concept branch [for Racket, for Chez Scheme], but the idea is to make more general versions of the improvements. (For example, if even? is magic, it’s nice that odd? is magic too, to keep the balance of the universe and avoid surprising users.)
Code
Original from Lambda Lang Blog
#lang racket
(define (count-collatz n [cnt 1])
(cond
[(= n 1) cnt]
[(even? n) (count-collatz (/ n 2) (+ 1 cnt))]
[else (count-collatz (+ (* 3 n) 1) (+ 1 cnt))]))
(define (count-collatz-upto n)
(for/fold
([max-seen 0])
([i (in-range 1 n)])
(max max-seen (count-collatz i))))
(displayln (format "\nDone ~a" (count-collatz-upto 5000000)))
Faster now, but not so nice (Racket 9.0)
#lang racket
(require racket/fixnum)
(require racket/unsafe/ops)
(define (count-collatz n [cnt 1])
(if (fixnum? n)
(cond
[(= n 1) cnt]
[(zero? (unsafe-fxremainder n 2)) (count-collatz (unsafe-fxquotient n 2) (+ 1 cnt))]
[else (count-collatz (+ (* 3 n) 1) (+ 1 cnt))])
(cond
[(= n 1) cnt]
[(even? n) (count-collatz (/ n 2) (+ 1 cnt))]
[else (count-collatz (+ (* 3 n) 1) (+ 1 cnt))])))
Faster in the future and nice (Racket 9.3?)
#lang racket
(define-syntax-rule (happy-path test
body ...)
(if test
(begin body ...)
(begin body ...)))
(define (count-collatz n [cnt 1])
(happy-path (fixnum? n)
(cond
[(= n 1) cnt]
[(even? n) (count-collatz (quotient n 2) (+ 1 cnt))]
[else (count-collatz (+ (* 3 n) 1) (+ 1 cnt))])))
(define (count-collatz-upto n)
(for/fold
([max-seen 0])
([i (in-range 1 n)])
(max max-seen (count-collatz i))))
(displayln (format "\nDone ~a" (count-collatz-upto 5000000)))