Ruby Garbage Collection


Last week, I came across a very interesting tidbit of information: the Ruby garbage collector was not meant to run.

Garbage collection is one of the Achilles’ heels of MRI Ruby performance. It combines a fairly naive algorithm with a ‘stop the world’ assumption: it does not run in the background, and any other Ruby activity can bloody well wait until it is done.
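To get a feel for that pause, here is a minimal sketch that times one forced collection. The helper names `make_garbage` and `forced_gc_pause` are my own for illustration; `GC.start` and `Benchmark.realtime` come with Ruby. The number it prints depends heavily on the Ruby version and the size of the heap, so treat it as a rough indication rather than a benchmark.

```ruby
require 'benchmark'

# Allocate `count` short-lived strings so the heap holds plenty of garbage.
# (Hypothetical helper, only here to give the collector something to do.)
def make_garbage(count)
  count.times { |i| "garbage-#{i}" }
  nil
end

# Time one forced, full collection in seconds.
# While GC.start runs, the calling code is simply waiting for it.
def forced_gc_pause
  Benchmark.realtime { GC.start }
end

make_garbage(200_000)
puts format('forced GC took %.2f ms', forced_gc_pause * 1000)
```

In a server process with a large heap, a pause like this lands in the middle of whatever request happens to be running at the time.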

I found this thread on the ruby-lang mailing list very interesting. The following quotes are enlightening:

In particular, the GC algorithms in MRI and YARV are specifically designed with the assumption that they will never actually run in 99.999% of all cases. They are designed for scripting, where a script doesn’t even allocate enough memory to trigger a collection, runs for a couple of seconds and then exits, after which the OS simply reclaims the memory: no GC needed.
That’s why YARV and especially MRI are so exceptionally bad for server loads. It’s also why REE can never be merged into mainline. (Jorg W. Mittag Aug 21 2010)
99.999% is a bit over-exaggerated, but it is true that garbage collection algorithm of YARV and MRI focus for throughput on non-memory extensive short-running programs, and GC of REE is not suitable for those programs. (Matz Aug 21 2010)

Scripting was the original use case for Ruby, so that’s understandable. It explains a lot.
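A quick way to check how close a given script comes to that "never runs" scenario is to compare `GC.count` before and after the work. This is only a sketch: `gc_runs_during` is a name I made up, and `GC.count` is available from Ruby 1.9 onwards, so it will not work on 1.8.

```ruby
# Return how many collections ran while the block executed.
# (Hypothetical helper; GC.count is the interpreter's running total.)
def gc_runs_during
  before = GC.count
  yield
  GC.count - before
end

# A small, script-sized workload: a thousand short-lived strings.
runs = gc_runs_during do
  1_000.times { |i| "line #{i}".upcase }
end
puts "collections during the script body: #{runs}"
```

For a tiny workload like this the count will often be zero. Wrap a request handler in the same helper and the picture tends to look rather different.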

In other related news, I asked on Twitter whether there was an Enterprise Edition for Ruby 1.9 in the works. Ruby Enterprise Edition (REE) fixes some of the worst evils by, amongst other things:

  • exposing some garbage collection parameters, and tuning it a little out of the box
  • making the garbage collector copy-on-write friendly, so that several forked processes can share common memory in a read-only way instead of each keeping its own copy.
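The copy-on-write point is easier to see with a small fork example. This is a sketch under a few assumptions: it needs a platform where `fork` is available, `SHARED_TABLE` and `sizes_seen_by_children` are names invented for illustration, and the `GC.copy_on_write_friendly=` switch is, as far as I know, only present on REE, hence the `respond_to?` guard. The script shows children reading data that was loaded once in the parent. It does not measure how many memory pages actually stay shared.

```ruby
# On REE this asks the collector to keep its mark bits out of the objects,
# so a GC run in a child does not dirty (and thereby copy) shared pages.
# On other interpreters the method is absent and the line is skipped.
GC.copy_on_write_friendly = true if GC.respond_to?(:copy_on_write_friendly=)

# Hypothetical shared data, loaded once in the parent before forking.
SHARED_TABLE = (1..100_000).map { |i| "entry-#{i}" }.freeze

# Fork `workers` children. Each one reads the parent's table without
# reloading it and reports the size it sees through a pipe.
def sizes_seen_by_children(workers)
  reader, writer = IO.pipe
  pids = Array.new(workers) do
    fork do
      reader.close
      writer.puts SHARED_TABLE.size
      writer.close
      exit!(0) # leave immediately, skipping at_exit handlers
    end
  end
  writer.close
  pids.each { |pid| Process.wait(pid) }
  sizes = reader.read.split.map(&:to_i)
  reader.close
  sizes
end

p sizes_seen_by_children(3)
```

Each child reports the full table size even though only the parent built it. Whether those pages remain shared after the first collection in a child is exactly what the REE patches are about.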

Unfortunately, the Phusion people told me there are no immediate plans to make an Enterprise Edition of Ruby 1.9, since they are now going full throttle on an overhaul of Passenger. Additionally, MRI Ruby 1.9 needs some patching to make it copy-on-write friendly, and that hasn’t happened yet.

I’d like to have a look at Rubinius, where garbage collection is done differently. And of course JRuby benefits from the JVM memory handling, which has been optimized to death by many PhDs and all the developer budgets Sun and IBM could spend on it.