Patching Mongrel Cluster Restart

The following is commonly used to restart a Rails application that uses mongrel and the mongrel_cluster gem:

mongrel_rails cluster::restart

This usually works fine, but if you have a very active application that’s getting hammered with requests, it’s not guaranteed that your mongrel instance will stop and start as you expect.  Let’s take a look at the restart method to see why that is:

sudo pico /usr/lib/ruby/gems/1.8/gems/mongrel_cluster-1.0.5/lib/mongrel_cluster/init.rb

Search for restart and look for the run method a few lines under.  Here’s what it looks like:

    def run
      stop
      start
    end

This is a problem because the stop method doesn’t make sure the mongrel instance is down before returning, so start executes immediately after stop returns.  If your mongrel instance is hung up for whatever reason, start may be invoked before the mongrel has actually shut down, and it fails with the “mongrel already started” error.  When the mongrel eventually does stop, nothing starts it again, so it stays down until you explicitly run another start.  Not cool.
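
To make the race concrete, here is a toy simulation in plain Ruby.  FakeMongrel, tick and naive_restart are made-up names for illustration only; none of this is mongrel_cluster code.  It just models a stop that returns immediately and a start that refuses to run while the old process is still alive.

```ruby
# Toy model of the restart race. FakeMongrel is hypothetical and is not
# part of mongrel or mongrel_cluster.
class FakeMongrel
  attr_reader :log

  def initialize(requests_in_flight)
    @running = true
    @stop_requested = false
    @requests_in_flight = requests_in_flight
    @log = []
  end

  def running?
    @running
  end

  # Only asks the process to exit, then returns immediately.
  def stop
    @stop_requested = true
    @log << "stop requested"
  end

  # Refuses to start while the old process still exists.
  def start
    if @running
      @log << "start failed: already started"
      false
    else
      @running = true
      @log << "started"
      true
    end
  end

  # Simulates time passing: finish one request, and exit once a stop
  # has been requested and no requests are left.
  def tick
    @requests_in_flight -= 1 if @requests_in_flight > 0
    if @running && @stop_requested && @requests_in_flight.zero?
      @running = false
      @log << "old process exited"
    end
  end
end

# Same shape as the unpatched run method: stop, then start, no waiting.
def naive_restart(mongrel)
  mongrel.stop
  mongrel.start
end

busy = FakeMongrel.new(3)   # three requests still being served
naive_restart(busy)
3.times { busy.tick }       # the old process drains and exits afterwards
puts busy.log
puts "running after restart? #{busy.running?}"
```

The log ends with the old process exiting after the failed start, and nothing is left running, which is the situation described above.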

So, what’s the solution?  I’m not sure what most people do, but I like to patch the mongrel run method with the following:

    def run
      # Original body, kept for reference:
      # stop
      # start
      read_options
      @force, @clean = [false, true]
      # Restart one port at a time, waiting for each mongrel to die first.
      @ports.each do |port|
        @only = port
        stop
        check_wait
        start
      end
    end

    private
      # Poll once a second for up to wait_time seconds; force kill if still alive.
      def check_wait(wait_time = 10)
        wait_time.times do
          return unless check_process(@only)
          sleep 1
        end
        log "*Slept #{wait_time} seconds but still not dead, force kill"
        @force = true
        stop
        @force = false
      end

Just save the file and you’re done.  I’m not sure how a purist would rate this on an elegance scale, but it works like a charm.  Now when you execute mongrel_rails cluster::restart and your mongrel happens to be hanging, it will wait up to wait_time seconds (10 by default) before forcibly shutting it down.  It will also log a message telling you what’s going on.
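
If you want to see the same stop, wait, then force kill pattern outside of mongrel_cluster, here is a rough standalone sketch using plain Ruby signals.  The helper names stop_with_timeout and process_alive? are my own and not part of any gem, and the demo assumes a Unix-like system where fork is available.  The child process ignores TERM to stand in for a hung mongrel.

```ruby
# Sketch only: stop_with_timeout and process_alive? are hypothetical helpers,
# not part of mongrel_cluster. Assumes a Unix-like OS (uses fork and signals).

# True while the OS still knows a process with this pid.
def process_alive?(pid)
  Process.kill(0, pid)   # signal 0 only checks that the process exists
  true
rescue Errno::ESRCH
  false
rescue Errno::EPERM
  true                   # exists, but belongs to another user
end

# Send TERM, poll up to wait_time times, then fall back to KILL.
def stop_with_timeout(pid, wait_time = 10, interval = 1)
  Process.kill("TERM", pid)
  wait_time.times do
    return :graceful unless process_alive?(pid)
    sleep interval
  end
  Process.kill("KILL", pid)
  :forced
rescue Errno::ESRCH
  :graceful              # the process was already gone when we signalled it
end

# Demo: a child that ignores TERM, standing in for a hung mongrel.
reader, writer = IO.pipe
pid = fork do
  reader.close
  trap("TERM", "IGNORE")
  writer.write("ready")  # tell the parent the trap is installed
  writer.close
  sleep 60
end
writer.close
reader.read              # blocks until the child has written and closed its end
reader.close
reaper = Process.detach(pid)  # reap the child in the background (no zombie)

result = stop_with_timeout(pid, 3, 0.2)
reaper.join              # wait until the child has really exited
puts "result: #{result}, alive afterwards: #{process_alive?(pid)}"
```

Because the child ignores TERM, the demo should fall through to KILL and report a forced stop.  A child that exits on TERM would return :graceful instead and never be killed.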

If the mongrel(s) of your application are forcefully shut down frequently, you may want to consider running more mongrel instances or optimizing/scaling your application; it’s not ideal to just forcibly kill your mongrel as common practice since it might be in the middle of processing an “important” request (say, some payment request in your application).  I’ve never run into a problem though.

Anyway, I use this patch on every server where I have Rails installed.  Even if you don’t think you’ll need it, it doesn’t hurt.  Saves a lot of headache if and when the time comes.