Garbage collection is evil.
For several days I've been wrestling with a peculiar performance problem in Maple. In the Quadratic Sieve code I'm currently writing, I use external C code to perform the sieving -- that is, I have a QuadraticSieveSieveInterval() function which I wrote in C and call from Maple -- and the relations are collected and filtered in Maple. This allows me to keep the amount of C code needed to a minimum by using Maple for some of the messy initialization (e.g., computing modular square roots).The performance problem arose in the "collecting relations in Maple" part. My code is roughly as follows:
while (Nrels < NumberOfRelationsWanted) do rels := QuadraticSieveSieveInterval( ... ); for rel in [rels] do Nrels := Nrels + 1; rtab[Nrels] := rel; od; od;
With NumberOfRelationsWanted equal to 30000, I noticed something very odd: If I commented out the "rtab[Nrels] := rel" line -- that is, if I counted the relations, but didn't store them -- then the code would be faster by roughly 150 seconds. However, after collecting all the relations, I could copy them all into a new hash table in under one second. Somehow adding the relations to a table while they were being generated was 200 times slower than adding the relations to a table after they are generated.
After some exploration of Maple's profiling capabilities, I noticed an unexpected function was (according to the profiler) using 15% of the total CPU time: the garbage collector. This made me immediately suspicious, since the most significant difference (aside from the very much increased time taken) between throwing the relations away and collecting them in a table is that collecting them means that the total memory usage increases over time. With a bit more searching, I found that a kernel option "gcfreq" which controls the frequency with which Maple's garbage collector is called. The default value is "every million words allocated"; I changed this to "every hundred million words allocated", and suddenly my code was 160s faster -- even with the "store the relation in a table" operation which had been peculiarly slow, my code was now faster than it had been without that operation before.
I'm not sure quite why my code was causing the garbage collector to perform so poorly, but it might be related to the combination of very small memory allocations (used by Maple) and rather large memory allocations (in the sieving code itself). Whatever the cause, it's worth remembering that while garbage collection isn't always slow, it certainly can be slow and should be investigated as a possible cause of unexplained poor performance. J.K. Rowling remarked, via a character in the second Harry Potter book, that one should "never trust anything that can think for itself, if you can't see where it keeps its brain"; in much the same vein, I would suggest that one should never trust a programming language if you can't see where and how it allocates and deallocates memory.