I’m now having a job at supplyframe Inc., and luckily I can use Clojure for work! Clojure is a young language created by Rich Hickey on 2007. It uses Lisp syntax, immutable data structures by default, and supports both strict and lazy evaluations. As Christ Okasaki suggested:
Strict evaluation is useful in implementing worst-case data structures and lazy evaluation is useful in implementing amortized data structures.
It’s really cheap to define lazy or strict data structures in Clojure that has low amortized cost even in a persistent manner. Let’s dig into the source code and see how does Clojure implement it.
Before we go into Clojure’s java source code, we can first look at the memoize
function.
Memoize
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 |
|
The function is elegant; it captures whatever arguments that pass in the function, and pairs the arguments and the returned value to a persistent map. In order to make it thread safe, the persistent map is cast into an atom
and can be modified via swap!
.
This is nice. And since Clojure uses $\log_{32}(N)$ hash map, it is also fast enough to do a memoized lookup. For the implementation of Clojure’s persistent hash map, you can check out this post.
lazy-seq
In contrast to memoize
, which is implemented in Clojure, lazy-seq
is implemented in java. It contains three fields:
1 2 3 4 5 |
|
fn
is an un-evaluated thunk (function without arguments), and sv
is the captured value after executing the thunk. The ISeq s
is the realized version of the sequence.
When the program tries to realize the lazy sequence, it calls seq()
and sval()
functions.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 |
|
In the sval()
function, Clojure handles the caching elegantly. If fn
is not null, execute it and stores the value to sv
. If the LazySeq
is dereferenced, the whole object will be recycled by the garbage collector, else the object will hold the value and is thread-safe to be accessed by other threads.
The seq()
function is the wrapper around sval()
. It realizes all LazySeq
objects recursively, and wrap it into a seq
object that implements ISeq
interface.
With the realized seq
, it can support common sequence functions like first
and next
:
1 2 3 4 5 6 7 8 9 10 11 12 13 |
|
Conclusion
In Clojure’s documentation, it said that lazy-seq
is cached after realizing it, but it didn’t document how it does it. Luckily the source code is pretty easy to understand. Clojure uses lazy sequences a lot, so knowing that it handles lazy sequence efficiently is important for all Clojure programmers. :)