Gustavo

2026/01/13

A happy path for Racket in the Collatz benchmark

The other day, I was reading yet another benchmark: “Lambda Land: Functional Languages Need Not Be Slow”. (Is it a benchmark? Nice article anyway.) It compares Racket, Python, Rust, Julia and JavaScript by running the Collatz sequence for every starting number up to 5,000,000 (the limit used in the code below).

First, two disclaimers:

- There are lies, damned lies, statistics and benchmarks.

- The ultimate way of cheating in a benchmark is changing the compiler.

The idea is to try to improve the time of the Racket program following the spirit of the benchmark, and then extract some ideas to improve the compiler so it can automate the improvements when possible. The Racket compiler transforms the program to Chez Scheme code, so most of the improvements to the optimization pass will actually be in the Chez Scheme compiler, and they will hopefully make programs in both languages faster.

(I don’t know enough of the other languages to try something similar, so I won’t even try. I’d like to read any similar analysis.)

Each benchmark is different. In my opinion, the spirit of this benchmark is to write nice code and see what the compiler can do. I’m more used to benchmarks that try to squeeze every single millisecond and use primitives that look like curses, such as #3%$unsafe-fl*+!, so this one has a very different spirit: it only allows nice code.

With the changes, the run times are:

Program version                                Seconds
Original from Lambda Land (Racket 9.0)         17.9
Faster now, but not so nice (Racket 9.0)       5.3
Faster in the future and nice (Racket 9.0)     17.8
Faster in the future and nice (Racket 9.3?)    4.9 (expected?)

(All times measured at the command line, outside DrRacket. By default DrRacket has “debugging” enabled and that adds a lot of additional time in exchange for better error reports.)

As a baseline, this is the original code. Two expressions in it will turn out to be the slow ones: the test (even? n), the blue part, and the division (/ n 2), the green part.

(define (count-collatz n [cnt 1])
  (cond
    [(= n 1) cnt]
    [(even? n) (count-collatz (/ n 2) (+ 1 cnt))]
    [else (count-collatz (+ (* 3 n) 1) (+ 1 cnt))]))
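To make the counting convention concrete, here is a small worked example. Note that cnt starts at 1, so the function counts the terms of the sequence, not the steps. The two sample inputs are just illustrations.

```racket
#lang racket

;; The original function, unchanged.
(define (count-collatz n [cnt 1])
  (cond
    [(= n 1) cnt]
    [(even? n) (count-collatz (/ n 2) (+ 1 cnt))]
    [else (count-collatz (+ (* 3 n) 1) (+ 1 cnt))]))

;; 6 -> 3 -> 10 -> 5 -> 16 -> 8 -> 4 -> 2 -> 1 has nine terms.
(displayln (count-collatz 6))   ; prints 9

;; Starting at 1 the sequence has a single term.
(displayln (count-collatz 1))   ; prints 1
```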

Fast but not so nice code

My first step was to write code that is fast but not nice. Not very ugly code, just a little ugly. The idea is to have a tradeoff between nice and fast code. I tried a lot of variants until I understood that these two expressions were the most important ones for improving the speed.

The changes are:

- Split the function into a part for fixnums and another for bignums.

- The division / is slow, so let’s replace it with unsafe-fxquotient. (green part)

- Also even? is slow, so let’s split the logic and use another unsafe primitive. (blue part)

(require racket/fixnum)
(require racket/unsafe/ops)

(define (count-collatz n [cnt 1])
  (if (fixnum? n)
      (cond
        [(= n 1) cnt]
        [(zero? (unsafe-fxremainder n 2)) (count-collatz (unsafe-fxquotient n 2) (+ 1 cnt))]
        [else (count-collatz (+ (* 3 n) 1) (+ 1 cnt))])
      (cond
        [(= n 1) cnt]
        [(even? n) (count-collatz (/ n 2) (+ 1 cnt))]
        [else (count-collatz (+ (* 3 n) 1) (+ 1 cnt))])))

Not so ugly in my opinion, and the run time goes from 17.9 seconds to 5.3 seconds on my computer. Note that the first branch, for fixnums, has a few tricks, but the second is just the original code.
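A quick sketch of why the split needs both branches: fixnum? only accepts integers that fit in a machine word, and generic arithmetic on fixnums can produce a bignum. The specific numbers are just illustrations, and the comment about (expt 2 50) assumes a 64-bit build.

```racket
#lang racket

;; Small integers are fixnums.
(displayln (fixnum? 27))                          ; prints #t

;; 2^100 does not fit in a machine word, so it is a bignum.
(displayln (fixnum? (expt 2 100)))                ; prints #f

;; Generic * silently moves to bignums when the result is too large.
;; (On a 64-bit build, (expt 2 50) itself is still a fixnum.)
(displayln (fixnum? (* (expt 2 50) (expt 2 50)))) ; prints #f
```

This is one reason the fixnum branch above still uses the generic + and * for 3n+1: the result may leave the fixnum range, and the next call then takes the second branch.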

Division and quotient

We can measure the run time when we change the green expression:

Blue         Green                      Seconds
(even? n)    (/ n 2)                    17.9
(even? n)    (quotient n 2)             17.8
(even? n)    (fxquotient n 2)           17.3
(even? n)    (unsafe-fxquotient n 2)    7.6

Let’s think of them like a chain of transformations.

Changing / to quotient is super difficult for the compiler. The optimization pass doesn't know enough algebra to prove that n is even at that point, which is what makes the two operations give the same result. (I think this may become possible someday in very specific cases like this one, which uses modulo 2 and a predicate, but let’s just assume it’s impossible.) Some of the examples in the other languages use integer division, so I think it’s “fair” to use quotient here.
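A short illustration of the difference, using standard Racket arithmetic: / on exact integers returns an exact rational, so it only agrees with quotient when the division is exact.

```racket
#lang racket

;; When n is even, both operations agree.
(displayln (/ 8 2))          ; prints 4
(displayln (quotient 8 2))   ; prints 4

;; When n is odd, / returns an exact fraction and quotient truncates.
(displayln (/ 7 2))          ; prints 7/2
(displayln (quotient 7 2))   ; prints 3
```

In the benchmark the division only runs after (even? n) succeeded, so the replacement is safe there, but the compiler would have to connect those two facts by itself.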

After this first change the time improves a little and the difference with fxquotient is small too. But the last version with unsafe-fxquotient is much faster. The good news is that from the expression that uses quotient it’s possible for an improved version of the compiler to do the replacements during the compilation and get the faster code.

(I also tried arithmetic-shift. There are a few weird things to check there too.)
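For reference, this is how arithmetic-shift relates to quotient on a few small values. The negative case only illustrates that the two operations are not interchangeable in general; this sketch does not claim it is one of the weird things mentioned above.

```racket
#lang racket

;; Shifting right by one bit halves a non-negative integer.
(displayln (arithmetic-shift 8 -1))    ; prints 4
(displayln (arithmetic-shift 7 -1))    ; prints 3

;; For negative odd numbers the shift rounds down, quotient rounds toward zero.
(displayln (arithmetic-shift -7 -1))   ; prints -4
(displayln (quotient -7 2))            ; prints -3
```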

The even? predicate

Now we can measure the run time when we change the blue expression:

Blue                                 Green                      Seconds
(even? n)                            (unsafe-fxquotient n 2)    7.6
(zero? (remainder n 2))              (unsafe-fxquotient n 2)    20.8
(zero? (fxremainder n 2))            (unsafe-fxquotient n 2)    19.6
(zero? (unsafe-fxremainder n 2))     (unsafe-fxquotient n 2)    5.3
(zero? (bitwise-and n 1))            (unsafe-fxquotient n 2)    4.9
(zero? (fxand n 1))                  (unsafe-fxquotient n 2)    4.9
(not (bitwise-bit-set? n 0))         (unsafe-fxquotient n 2)    5.4

Again, let’s think of them like a chain of transformations.

Changing even? to (zero? (remainder _ 2)) is super bad (20.8 seconds instead of 7.6), and I’m not sure why. It’s worth checking in the future. Some of the other languages use similar expressions, so it’s a “fair” change, but it’s also worrying that it’s so slow, because someone may use it.

Anyway, after the initial change, it is possible for an improved version of the compiler to change the primitive during compilation to fxremainder and then to unsafe-fxremainder, which would make the code faster than the original version.

The next two versions with bitwise-and and fxand are slightly faster, but they are too different from the original code, so I don’t classify them as “fair”. In both cases the compiler changes them to use the equivalent of unsafe-fxand.

The last version with (not (bitwise-bit-set? _ 0)) is about as fast as the version that uses (zero? (unsafe-fxremainder _ 2)), but it uses a predicate. This is more friendly for the optimization pass, which understands predicates better than functions with binary results. This may be relevant in a distant future to allow the compiler to replace / with quotient.

Anyway, it’s better to make the optimization pass just transform even? to the unsafe version of the Chez Scheme primitive cs:fxeven? that is as fast as (zero? (fxand _ 1)).
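As a sanity check that the variants in the table really compute the same thing, here is a small sketch that compares each safe variant against even? on a range of fixnums. The range and the helper names (parity-tests, agrees-with-even?) are arbitrary choices for this example.

```racket
#lang racket
(require racket/fixnum)

;; Each entry pairs a label with one of the safe parity tests from the table.
(define parity-tests
  (list (cons "remainder"        (lambda (n) (zero? (remainder n 2))))
        (cons "fxremainder"      (lambda (n) (zero? (fxremainder n 2))))
        (cons "bitwise-and"      (lambda (n) (zero? (bitwise-and n 1))))
        (cons "fxand"            (lambda (n) (zero? (fxand n 1))))
        (cons "bitwise-bit-set?" (lambda (n) (not (bitwise-bit-set? n 0))))))

;; #t when the test gives the same answer as even? on every n in the range.
(define (agrees-with-even? test)
  (for/and ([n (in-range -1000 1001)])
    (eq? (even? n) (test n))))

;; Print one line per variant, with #t when it agrees with even?.
(for ([entry (in-list parity-tests)])
  (printf "~a: ~a\n" (car entry) (agrees-with-even? (cdr entry))))
```

The unsafe variants are left out on purpose: they give the same answers on fixnums, but they skip the checks that make a mistake visible.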

A macro for the happy path

One of the nice features of Racket is that you can use macros to make weird transformations in the code. (If you don’t want your coworkers and future you to hate you, then use macros wisely, document them, use syntax-parse to get nice error messages and avoid using macros when there is another option.)

So we will define a macro happy-path:

(define-syntax-rule (happy-path test
                      body ...)
  (if test
      (begin body ...)
      (begin body ...)))

It’s a very silly macro. When test is true it runs body ... and when test is false it runs body ... too! The interesting part is that the compiler can apply different optimizations to each branch. In particular, in our case test will be (fixnum? n), so an improved version of the compiler can make all the changes we discussed before in the first branch, and so we get nice code that is fast. The final version is:

(define (count-collatz n [cnt 1])
  (happy-path (fixnum? n)
    (cond
      [(= n 1) cnt]
      [(even? n) (count-collatz (quotient n 2) (+ 1 cnt))]
      [else (count-collatz (+ (* 3 n) 1) (+ 1 cnt))])))
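Following the advice above about syntax-parse, this is a sketch of how the same macro could be written with it. The expansion is the same as in the define-syntax-rule version; requiring at least one body expression (the ...+) is an extra assumption of this sketch, not part of the original macro.

```racket
#lang racket
(require (for-syntax racket/base syntax/parse))

;; Same expansion as the define-syntax-rule version: the body is duplicated
;; in both branches, and test is evaluated only once.
(define-syntax (happy-path stx)
  (syntax-parse stx
    [(_ test:expr body:expr ...+)
     #'(if test
           (begin body ...)
           (begin body ...))]))

(define (count-collatz n [cnt 1])
  (happy-path (fixnum? n)
    (cond
      [(= n 1) cnt]
      [(even? n) (count-collatz (quotient n 2) (+ 1 cnt))]
      [else (count-collatz (+ (* 3 n) 1) (+ 1 cnt))])))

;; Both paths return the same answers: a fixnum start and a bignum start.
(displayln (count-collatz 6))            ; prints 9
(displayln (count-collatz (expt 2 70)))  ; prints 71
```

With this version, a use like (happy-path (fixnum? n)) with no body is rejected at expansion time with a syntax error that names happy-path.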

Conclusions

- If you know that all the arguments and the result are integers, use quotient instead of /. I think this is a good recommendation for all languages and compilers.

- Try happy-path to give the optimizer an opportunity to improve the most expected path, or both paths if you are very lucky. Probably (fixnum? n) and (flonum? n) are useful tests in general. It would be nice if the compiler could do that for you, but it’s difficult without making all the executable code twice as big, perhaps without getting any speed improvement.

- I expect these improvements to land in Racket 9.3 (mid 2026), so use your time machine and upgrade. I made a proof of concept branch [for Racket, for Chez Scheme], but the idea is to make more general versions of the improvements. (For example, if even? is magic, it’s nice that odd? is magic too, to keep the balance of the universe and avoid surprising users.)
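As an illustration of the (flonum? n) suggestion, here is a sketch that applies happy-path to a small numeric function. The function hypot2 is a made-up example, and nothing here has been measured: whether a given compiler version actually specializes the flonum branch is an open question.

```racket
#lang racket

(define-syntax-rule (happy-path test
                      body ...)
  (if test
      (begin body ...)
      (begin body ...)))

;; Hypothetical example function, not part of the benchmark.
;; The first copy of the body only ever sees flonums.
(define (hypot2 x y)
  (happy-path (and (flonum? x) (flonum? y))
    (sqrt (+ (* x x) (* y y)))))

(displayln (hypot2 3.0 4.0))   ; prints 5.0
(displayln (hypot2 3 4))       ; prints 5
```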

Code

Original from Lambda Land (Racket 9.0)

#lang racket

(define (count-collatz n [cnt 1])
  (cond
    [(= n 1) cnt]
    [(even? n) (count-collatz (/ n 2) (+ 1 cnt))]
    [else (count-collatz (+ (* 3 n) 1) (+ 1 cnt))]))

(define (count-collatz-upto n)
  (for/fold ([max-seen 0])
            ([i (in-range 1 n)])
    (max max-seen (count-collatz i))))

(displayln (format "\nDone ~a" (count-collatz-upto 5000000)))
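A minimal sketch of how a listing like this one can be timed from inside the program with the standard time form. The name limit and its value are assumptions of this example, chosen so that it finishes quickly; they are not the settings used for the table at the top.

```racket
#lang racket

(define (count-collatz n [cnt 1])
  (cond
    [(= n 1) cnt]
    [(even? n) (count-collatz (/ n 2) (+ 1 cnt))]
    [else (count-collatz (+ (* 3 n) 1) (+ 1 cnt))]))

(define (count-collatz-upto n)
  (for/fold ([max-seen 0])
            ([i (in-range 1 n)])
    (max max-seen (count-collatz i))))

;; Hypothetical smaller limit, only to keep the example short-running.
(define limit 100000)

;; time prints the cpu, real and gc times in milliseconds
;; and returns the value of the expression.
(define result (time (count-collatz-upto limit)))
(displayln result)
```

Run it with racket from a shell rather than from DrRacket, for the reason given near the top of the post.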

Faster now, but not so nice (Racket 9.0)

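#lang racket

(require racket/fixnum)
(require racket/unsafe/ops)

(define (count-collatz n [cnt 1])
  (if (fixnum? n)
      (cond
        [(= n 1) cnt]
        [(zero? (unsafe-fxremainder n 2)) (count-collatz (unsafe-fxquotient n 2) (+ 1 cnt))]
        [else (count-collatz (+ (* 3 n) 1) (+ 1 cnt))])
      (cond
        [(= n 1) cnt]
        [(even? n) (count-collatz (/ n 2) (+ 1 cnt))]
        [else (count-collatz (+ (* 3 n) 1) (+ 1 cnt))])))

;; Same driver as in the other two listings.
(define (count-collatz-upto n)
  (for/fold ([max-seen 0])
            ([i (in-range 1 n)])
    (max max-seen (count-collatz i))))

(displayln (format "\nDone ~a" (count-collatz-upto 5000000)))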

Faster in the future and nice (Racket 9.3?)

#lang racket

(define-syntax-rule (happy-path test
                      body ...)
  (if test
      (begin body ...)
      (begin body ...)))

(define (count-collatz n [cnt 1])
  (happy-path (fixnum? n)
    (cond
      [(= n 1) cnt]
      [(even? n) (count-collatz (quotient n 2) (+ 1 cnt))]
      [else (count-collatz (+ (* 3 n) 1) (+ 1 cnt))])))

(define (count-collatz-upto n)
  (for/fold ([max-seen 0])
            ([i (in-range 1 n)])
    (max max-seen (count-collatz i))))

(displayln (format "\nDone ~a" (count-collatz-upto 5000000)))

