Some recent comp.lang.lisp and IRC discussions prompted me to test the performance effect of different SBCL gencgc page sizes. Smaller page sizes give better write-barrier granularity and should thereby make the average garbage collection faster. The conservativeness of the GC works at the page level, so smaller pages would also result in somewhat more precise GCs. On the other hand, I can think of some effects that might favour using larger pages:
- The x86-64 port does inline allocation by checking at the allocation site whether the allocation pointer would move over a page boundary. If not, it just increments the pointer. Otherwise it makes a call to the somewhat expensive alloc() function written in C. With a larger page size we take the fast path more often.
- The page table eats up a lot of memory. With larger pages the table will have fewer elements.
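To make these two effects a bit more concrete, here is a toy model in Python. It is not how SBCL actually allocates (the real fast path is generated machine code and the slow path is C); it just assumes a bump allocator whose region is exactly one page, and that any allocation crossing the page boundary takes the slow path. The function names, the 16-byte object size and the 512 MB heap are made up for illustration.

```python
def count_slow_paths(page_size, object_sizes):
    """Count allocations that miss the inline fast path in a toy bump allocator.

    Assumption: a region is exactly one page, and the slow path simply
    starts a fresh page. The real gencgc is more involved than this.
    """
    free = 0   # offset of the allocation pointer within the current page
    slow = 0
    for size in object_sizes:
        if size > page_size:
            raise ValueError("toy model only handles objects that fit in a page")
        if free + size <= page_size:
            free += size    # fast path: just bump the pointer
        else:
            slow += 1       # slow path: this is where alloc() would be called
            free = size     # the object goes at the start of a new page
    return slow


def page_table_entries(heap_bytes, page_size):
    """Number of page table entries needed to cover the heap (ceiling division)."""
    return -(-heap_bytes // page_size)


if __name__ == "__main__":
    sizes = [16] * 100_000        # hypothetical workload of 16-byte objects
    heap = 512 * 1024 * 1024      # hypothetical 512 MB dynamic space
    for page_size in (4096, 8192, 32768):
        print(page_size,
              count_slow_paths(page_size, sizes),
              page_table_entries(heap, page_size))
```

In this model both the number of slow-path calls and the number of page table entries shrink roughly in proportion to the page size. It says nothing about the costs on the other side (coarser write barrier, less precise conservativeness), which is why measuring is still needed.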
To test this I compiled SBCL with 5 different page sizes and ran cl-bench on them.
Instead of the verbose and informative cl-bench output, I'm going to present a compact and cryptic plot of the data. The x-axis shows the cl-bench tests (for example, 30 is ACKERMANN), and the y-axis shows the runtime relative to the reference build (page size 4096).
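For clarity, the normalization behind the y-axis can be sketched like this in Python. The data layout (a dict from page size to per-benchmark seconds) and the function name are my own assumptions for the example, not something cl-bench produces directly.

```python
def relative_runtimes(runtimes, reference_page_size=4096):
    """Divide each benchmark's runtime by the reference build's runtime.

    runtimes maps page_size -> {benchmark_name: seconds} (assumed layout).
    A value below 1.0 means that build was faster than the reference.
    """
    reference = runtimes[reference_page_size]
    return {
        page_size: {name: secs / reference[name] for name, secs in results.items()}
        for page_size, results in runtimes.items()
    }
```

With this definition the reference build is a flat line at 1.0, and every other page size shows up as a deviation above or below it.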
The performance effects on small benchmarks can be pretty dramatic, but the larger ones (like COMPILER) aren't affected all that much. Of course you can't tell this from the plot, since the tests are helpfully labeled with integers. (I'm aiming for a "bad graph of the week" award.)
Based on the complete data it looks like using a page size of 8192 might be better than the current 4096. It wins on more benchmarks than it loses on, its wins tend to be larger than its losses, and it saves some memory on the page tables.
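The "wins more often, and by more" criterion can be written down as a small Python helper. This is only a sketch of one way to tally it; the function name and the choice of the mean as the summary statistic are assumptions of mine, and the input is a dict of relative runtimes for a single page size (1.0 being the 4096 build).

```python
def summarize(relative):
    """Tally wins and losses for one build from its relative runtimes.

    relative maps benchmark_name -> runtime / reference runtime.
    Returns counts plus the mean size of the wins and of the losses,
    both expressed as fractions of the reference runtime.
    """
    wins = [1.0 - r for r in relative.values() if r < 1.0]
    losses = [r - 1.0 for r in relative.values() if r > 1.0]
    return {
        "wins": len(wins),
        "losses": len(losses),
        "mean_win": sum(wins) / len(wins) if wins else 0.0,
        "mean_loss": sum(losses) / len(losses) if losses else 0.0,
    }
```

A page size looks attractive under this criterion when "wins" exceeds "losses" and "mean_win" exceeds "mean_loss". Benchmarks that come out at exactly 1.0 count as neither.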