
ELS’19, April 01–02 2019, Genova, Italy Didier Verna
[Figure 10: Solution 3, stage 1 — main thread feeding library batches through a shared buffer to the Declt threads]
in the current Quicklisp distribution. We got 11 batches, ranging
from 460 libraries down to only 1, from first to last.
In order to adapt solution 2 to this new scheme, a new shared
buffer is created (see Figure 10). The Declt threads pick libraries
from it instead of from the original libraries pool. The main thread
sends successive batches of standalone libraries to this buffer, and
waits for them to have been exhausted before sending the next
batch in. The rest of solution 2 is unchanged (in particular, the
Html generation code can be re-used without modification).
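The batch-feeding logic just described can be sketched with SBCL's native threading primitives. This is a minimal, hypothetical illustration rather than the actual Quickref code; the variable and function names (`*buffer*`, `*pending*`, `feed-batches`) are invented for the example.

```lisp
;; Hypothetical sketch of the stage-1 batch feeding, using SBCL's
;; sb-thread layer. The main thread refills the shared buffer one
;; batch at a time, and blocks until the whole batch is exhausted.
(defvar *buffer* '())                      ; shared buffer of libraries
(defvar *pending* 0)                       ; libraries not yet fully processed
(defvar *lock* (sb-thread:make-mutex))
(defvar *todo* (sb-thread:make-waitqueue)) ; signaled when work is available
(defvar *done* (sb-thread:make-waitqueue)) ; signaled when a batch is exhausted

(defun feed-batches (batches)
  "Send each batch in, waiting for its completion before the next one."
  (dolist (batch batches)
    (sb-thread:with-mutex (*lock*)
      (setf *buffer* (copy-list batch)
            *pending* (length batch))
      (sb-thread:condition-broadcast *todo*) ; wake the Declt threads
      ;; Wait for full completion of the batch, not mere pickup.
      (loop until (zerop *pending*)
            do (sb-thread:condition-wait *done* *lock*)))))
```

The `condition-wait` call atomically releases the mutex while waiting, so the Declt threads can acquire it to pop libraries and decrement the counter.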
5.3.1 Advantages. At the expense of a slightly more complicated
synchronization logic, this solution may be used in any of our 3
scenarios. In the current status of Quicklisp, the dependency graph
is relatively small (less than two thousand nodes), which means that
the additional computation time required to handle it is negligible
compared to the 21m 47s of our current most optimistic situation.
5.3.2 Drawbacks. Before sending the next batch in, the main thread
must wait for all libraries in the current batch to have been entirely
processed by a Declt thread; not just have been picked up by one
of them. At first glance, this may not appear to be a serious issue
because we only have 11 batches and a few threads handling them.
However, remember again from Section 4.2 that some libraries will
take a very long time to process. If, for example, such a “long”
library is part of a small batch, the batch will be quickly emptied,
and all Declt threads will essentially become dormant until the
“long” library is treated. This is yet another form of accumulation
effect that can potentially hinder the parallelization.
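The drawback becomes explicit when sketching a corresponding Declt worker: the batch counter is decremented only after a library has been fully processed, not when it is picked up, so a single "long" library keeps the whole batch open. Again a hedged illustration with invented names; `run-declt` stands in for the actual processing, and the shared variables are redeclared here for self-containment.

```lisp
;; Hypothetical Declt worker loop for the batched scheme.
(defvar *buffer* '())
(defvar *pending* 0)
(defvar *lock* (sb-thread:make-mutex))
(defvar *todo* (sb-thread:make-waitqueue))
(defvar *done* (sb-thread:make-waitqueue))

(defun declt-worker ()
  (loop
    (let ((library
            (sb-thread:with-mutex (*lock*)
              (loop while (null *buffer*)
                    do (sb-thread:condition-wait *todo* *lock*))
              (pop *buffer*))))
      (run-declt library)              ; potentially very long
      (sb-thread:with-mutex (*lock*)
        ;; Decrement only once processing is finished; the last
        ;; library of the batch unblocks the waiting main thread.
        (when (zerop (decf *pending*))
          (sb-thread:condition-broadcast *done*))))))
```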
5.3.3 Experimentation. Because the time required to maintain the
dependency graph is negligible, this solution is not expected to
make much difference in scenarios 1 (no compilation) and 3 (local
cache), as it would boil down to handling the libraries in a different
order. For scenario 2, the best result was obtained with an equal
number of threads for Declt and Makeinfo, namely, 4 of each (again,
corresponding to the hyper-threaded quad-core hardware configuration
used in the experiments). There, the overall computation
time fell to 29m 21s, that is, 26% of the original sequential
time. Given the time distribution in Figure 2, we also tried matching
that proportion, for example with 5 Declt threads and 3 Makeinfo
ones. We only got similar (inconclusive) results, differing by less
than 5%.
6 CONCLUSION
As mentioned in the introduction, the absolute worst case scenario
for Quickref, which is to build the complete Quicklisp documenta-
tion from scratch, takes around 7 hours on our test machine. Even
if such a duration may appear reasonable for batch processing,
we still believe that parallelization is not a vain endeavor. First of
all, the ability to use Quickref interactively (creating for example
one’s own local documentation website) makes it worth improving
its efficiency as much as possible. Secondly, Quicklisp itself is an
ever-growing repository (monthly updates usually add at least a
dozen new libraries to the pool), and so is the time to generate the
documentation for it.
In this paper, we have devised a set of parallel algorithms, and
experimented with them in different scenarios corresponding to
the typical use-cases of Quickref. On our test machine, we were
able to reduce the required processing time roughly by a factor of
4 compared to the naive sequential version, which is already quite
satisfactory. The absolute worst-case scenario fell under 2 hours,
and the most frequent one under half an hour. For all that, and in
spite of the fact that gracefully handling concurrency is always
a tricky business, our parallel solutions remain quite simple. The
implementation of solution 3, for example, requires only 3 shared
resources (2 buffers and a counter), 2 mutexes and 3 condition
variables. It was implemented directly with Sbcl’s multi-threading
layer, without resorting to higher level libraries.
This work also led us to perform various preliminary measurements
and analyses on Common Lisp libraries (compilation and
load time, Declt and Makeinfo run time, dependency graphs, etc.).
As mentioned before, the collected experimental data and their
interpretation are publicly available. We think this data could be useful
for other projects, and we already know for a fact that the current
Texinfo maintainers are interested. Only a small part of those results
has been presented in this paper. We are confident the rest
will be extremely useful for future refinements. Indeed, there are
still many things that can be done to improve the situation even
more.
7 DISCUSSION & PERSPECTIVES
7.1 Alternative Solution
Yet another parallel solution exists, depicted in its entirety
in Figure 11. This solution consists in processing the libraries
in parallel, yet without breaking the Declt / Makeinfo chain. Multiple
threads (8 would probably be an appropriate number on our
test machine) pick libraries to process, and sequentially run Declt
followed by Makeinfo on them. Like solution 2 (Section 5.2), this
algorithm can be made to work on scenarios 1 and 3 only; or, like
solution 3 (Section 5.3), it can be combined with library batches in
order to also work on scenario 2. This is what Figure 11 depicts. In
the future, and mostly out of curiosity, we may experiment with
this solution.
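This alternative scheme amounts to a plain pool of worker threads, each running the whole Declt-then-Makeinfo chain on one library at a time. The following sketch illustrates the idea for scenarios 1 and 3, where no batching is needed; `run-declt` and `run-makeinfo` are hypothetical placeholders for the actual processing steps.

```lisp
;; Hypothetical sketch of the alternative solution: N workers, each
;; picking a library and running the full chain on it sequentially.
(defvar *pool* '())
(defvar *pool-lock* (sb-thread:make-mutex))

(defun chain-worker ()
  (loop
    (let ((library (sb-thread:with-mutex (*pool-lock*)
                     (pop *pool*))))
      (unless library (return))      ; pool exhausted: worker exits
      (run-declt library)            ; produce the Texinfo file
      (run-makeinfo library))))      ; convert it to Html

(defun process-pool (libraries &optional (nthreads 8))
  (setf *pool* (copy-list libraries))
  (mapc #'sb-thread:join-thread
        (loop repeat nthreads
              collect (sb-thread:make-thread #'chain-worker))))
```

Since each worker holds the lock only while popping a library, contention on the pool is minimal; the bulk of the time is spent in the unsynchronized Declt and Makeinfo calls.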
Note however that we don’t expect it to make much difference
compared to solution 3. In solution 3, we have indeed fewer threads
picking libraries up for Declt processing, but on the other hand,
these threads also return more quickly to the library pool / batch,
since they are not in charge of Makeinfo. In fact, our gut feeling is