星之一角: Rails with threaded and multi-process server on Heroku

We have been using this setup for a while now, and I was asked to write a
document for it. Instead, I'll simply write it up here as a blog post,
so that I can forward it to anyone outside the company if I want to
share this setup.

You can also read the mails I sent to Heroku:
2238. 04-08 Regarding the recommended Ruby app server
2240. 04-13 Re: Regarding the recommended Ruby app server

A few points I want to make:

* We want requests and responses fully buffered. Heroku didn't do this.

* We want multiple threads to reduce memory usage compared to multiple
processes, saving us some (maybe a lot of) money.

* If memory isn't an issue, we would also like multiple processes in order
to utilize CPU resources on a Ruby VM which has a GVL. Besides, since
Rainbows! seems much better tested than Zbatery, using Rainbows! is still
preferable to using Zbatery.

* There's no easy way to set up a custom database.yml on Heroku, so we'll
need some hacks for it. Heroku's documentation is a bit wrong here. I'll
cover this later.

* We might be running out of Postgres connections soon; using PgBouncer
could probably help. We're not using it yet because it's experimental on
Heroku, but we might need it later as we scale.

* The Unicorn family doesn't come with Rack handlers, making `rails s` (or
`rails server`) unusable with them. We'll need a few tweaks to make it
work. It would be much better not to use `rails s` to launch the server,
but some people are very used to it, so it's still good if we can
make it work.

* Unicorn uses SIGQUIT for graceful shutdown, whereas Heroku sends
SIGTERM to ask for graceful shutdown. We'll need to translate one into
the other, otherwise we might drop requests randomly. Heroku's
documentation for this part is correct.

* We want a significantly smaller backlog on Heroku. The default is
1024, and we're running with 20. With 1024, it would be much harder for
Heroku's load balancers to detect whether the underlying dyno is busy.
Generally we want a lower number here when running on multiple
machines.

Ok, let's simply run through the setup. In Gemfile:

gem 'rainbows'     , :require => false
gem 'eventmachine' , :require => false
gem 'rainbows-emtp', :require => false
gem 'rack-handlers', :require => false # make `rails s` work for unicorn

So we're going to use EventMachine for request/response buffering,
and rack-handlers for `rails s`. If you're not using `rails s` then you
don't need rack-handlers.

As for :require => false, it reduces app loading time and saves some
memory when we're not using these gems (e.g. we don't need them in the
rails console, and probably not in rake tasks either). Bundler would also
try to do some funny lookup if you don't specify what to require.

In script/rails, or bin/rails, add this line:

require 'rack/handler/rails-server'

Here's the diff:

 APP_PATH = File.expand_path('../../config/application',  __FILE__)
 require File.expand_path('../../config/boot',  __FILE__)
+require 'rack/handler/rails-server'
 require 'rails/commands'

By default, Rack picks a server in the following order:

pick ['thin', 'puma', 'webrick']

I should probably override this in Rack::Handler instead of
trying to patch `rails s`. Anyway, this is how it works now.
I might fix it later, so that you don't have to add this line and can
just require 'rack-handlers'.
Updated 2013-08-22 21:04: This is now done in rack-handlers 0.7.0.
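
To illustrate the fallback order above, here is a minimal sketch of the
"take the first server that loads" idea. It is not Rack's actual
implementation: pick_first_loadable is a hypothetical helper, and the
library names are stand-ins chosen so that it runs without any server gem
installed.

```ruby
# Hypothetical helper, not Rack's code: return the first name in
# +candidates+ that can be required, or nil if none of them can.
def pick_first_loadable(candidates)
  candidates.find do |name|
    begin
      require name
      true
    rescue LoadError
      false # not installed, try the next candidate
    end
  end
end

# 'no_such_server_lib' stands in for a server gem that isn't installed;
# 'socket' and 'set' ship with Ruby and stand in for installed ones.
picked = pick_first_loadable(%w[no_such_server_lib socket set])
puts "picked: #{picked.inspect}"
```

This is why a server which ships no Rack handler never gets picked unless
something registers it, which is the gap rack-handlers fills.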

Here's how we really launch the process in Procfile:

web: ruby -Ilib -r bundler/setup -S rainbows -c config/rainbows.rb -p $PORT -E $RACK_ENV -N

We're not using `bundle exec` because it's generally slower and only there
for convenience. `bundle exec` would spawn an extra process, and it's
better to avoid that to reduce launching time.

Here we also use the option -N, which means that we don't want any of
the middleware implied by RACK_ENV.

Note that you would probably want to use Zbatery instead of Rainbows! here
if you're not using Heroku's 2X dynos, which have double the memory but
also cost double. This depends heavily on how much memory your application
needs, and how much money you can spend.

To configure ActiveRecord's connection pool:

We need to put this in a file in our initializers.

Rails.application.config.after_initialize do
  ActiveRecord::Base.connection_pool.disconnect!

  ActiveSupport.on_load(:active_record) do
    # This is where Heroku's missing:
    ActiveRecord::Base.configurations =
      Rails.application.config.database_configuration

    config = ActiveRecord::Base.configurations[Rails.env]
    config['reaping_frequency'] = Integer(ENV['DB_REAP_FREQ'] || 10)
    config['pool']              = Integer(ENV['DB_POOL']      || 10)
    ActiveRecord::Base.establish_connection(config)
  end
end

Without the line Heroku is missing, we would end up with an inconsistent
database configuration somewhere. I would also blame Rails'
implementation here. It's highly inconsistent internally.
I won't explain more here, as Rails changes quickly and what
I know might be wrong soon.

Note that we could also change DATABASE_URL in the Heroku config,
but that would make switching databases much harder, because
we would need to do it each time we switch to another database,
or we would end up with the default configuration (i.e. a pool size of 5).
Heroku should really fix this ultra annoying issue, so that
we don't have to hack it this way, causing potential inconsistency
issues in Rails.
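
For reference, the DATABASE_URL route mentioned above could look roughly
like this. It is only a sketch: with_db_params is a hypothetical helper,
the URL is made up, and it assumes the stack in use actually reads pool
settings from the URL's query string, which I haven't verified for every
Rails and Heroku combination.

```ruby
require 'uri'

# Hypothetical helper: return +url+ with +params+ merged into its query
# string, keeping any parameters that were already there.
def with_db_params(url, params)
  uri    = URI.parse(url)
  merged = URI.decode_www_form(uri.query.to_s).to_h.merge(params)
  uri.query = URI.encode_www_form(merged)
  uri.to_s
end

# Made-up URL, for illustration only.
url = 'postgres://user:secret@db.example.com:5432/app_production'
puts with_db_params(url, 'pool' => '10', 'reaping_frequency' => '10')
```

The resulting value would have to be set again by hand whenever the
database changes, which is exactly the annoyance described above.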

Here comes the config/rainbows.rb:

We're splitting unicorn.rb and rainbows.rb so that we could easily
switch between them. If you don't want to do this, you could merge
the two configs into one.

Updated 2013-08-24 00:54 Now that rainbows-emtp is released, you
could simply do this:

eval File.read('config/unicorn.rb')

require 'rainbows-emtp'

Rainbows! do
  use :EventMachineThreadPool, :pool_size => 10
end

Old approach:

eval File.read('config/unicorn.rb')

Rainbows! do
  use :EventMachine, :em_client_class => lambda{
    require 'eventmachine'
    EM.threadpool_size = 12
    EM.error_handler do |e|
      l.info("Error: EM.error_handler: #{e.inspect} #{e.backtrace.inspect}")
    end
    class ::RainbowsEventMachineThreadPoolClient <
            Rainbows::EventMachine::Client
      def app_call input
        set_comm_inactivity_timeout 0
        @env[RACK_INPUT] = input
        @env[REMOTE_ADDR] = @_io.kgio_addr
        @env[ASYNC_CALLBACK] = method(:write_async_response)
        @env[ASYNC_CLOSE] = EM::DefaultDeferrable.new
        @hp.hijack_setup(@env, @_io)
        @deferred = true
        EM.defer do
          begin
            status, headers, body = catch(:async) {
              APP.call(@env.merge!(RACK_DEFAULTS))
            }
            if nil == status || -1 == status
              @deferred = true
            else
              @deferred = nil
              ev_write_response(status, headers, body, @hp.next?)
            end
          rescue Exception => e
            EM.instance_variable_get(:@error_handler).call(e)
            handle_error(e)
          end
        end
      end
    end

    ::RainbowsEventMachineThreadPoolClient
  }
  worker_connections        Integer(ENV['UNICORN_CONNECTIONS'] || 20)

  client_max_body_size      5*1024*1024 # 5 megabytes
  client_header_buffer_size 8*1024      # 8 kilobytes
end if defined?(Rainbows)

You might wonder, why RainbowsEventMachineThreadPoolClient?
Basically there's no threaded EventMachine model in official Rainbows!.
All we have are CoolioThreadSpawn and CoolioThreadPool. However, in my
last attempt, cool.io had random assertion failures (though Eric
has never seen them, and I never tried to reproduce them), and Tony has
moved on to celluloid-io, putting cool.io aside.

I tried to contribute RainbowsEventMachineThreadPoolClient back to
Rainbows!, however there were a few test cases which I couldn't get to
pass, namely pipelined requests with large chunks of data, etc.
Since I don't think Heroku supports pipelined requests, and they might
be a bad idea anyway, I guess it's ok to ignore them on Heroku.

You can find some related discussion on the Rainbows! mailing list.
See this thread:
[PATCH] close_connection_after_writing only if not deferred, as in cool.io
Or I guess I should put this into a gem...
Probably also check out:
negative timeout in Rainbows::Fiber::Base

What about config/unicorn.rb:

worker_processes Integer(ENV['UNICORN_WORKERS'] || 2)
preload_app      true

l = ::Logger.new($stderr)
l.level = case ENV['RACK_ENV'] || ENV['RAILS_ENV']
          when 'production'
            ::Logger::WARN
          else
            ::Logger::DEBUG
          end
logger l # Try to always output to $stderr or $stdout for Heroku

timeout 30 # Heroku has 30 seconds limit, no point to have more here.
listen(ENV['PORT'] || 8080,
       :backlog => Integer(ENV['UNICORN_BACKLOG'] || 20))

before_fork do |_, _|
  Signal.trap 'TERM' do
    l.info('Unicorn master intercepting TERM and sending myself QUIT instead')
    Process.kill 'QUIT', Process.pid
  end

  l.info("Disconnecting PostgreSQL")
  ActiveRecord::Base.connection_pool.disconnect!
end

after_fork do |_, _|
  Signal.trap 'TERM' do
    l.info('Unicorn worker intercepting TERM and doing nothing. Wait for master to send QUIT')
  end
end

Note that we *don't* connect to Postgres in after_fork hook,
because if our hack for database.yml for Heroku didn't work
correctly, we might end up with the default database.yml,
which would be wrong. You might see how I was annoyed by
those behaviours in the past... spending tons of time
figuring it out.

What about PgBouncer?

Ok, here's the last one. Note that we're *not* using it.
I've tried it, but since it looks highly experimental to me
(read: it's very likely to change without notice), I am a bit
anxious about using it in production. However, if we're running out
of connections, we should probably still use it, as there
might not be any other alternative.

Heroku buildpack: pgbouncer

Follow the instructions there as things might change.
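
To get a feeling for when connections would run out, here's a rough
estimate. It's only a sketch: it assumes every worker process can fill
its whole ActiveRecord pool at once, it uses the defaults from the
configs above (2 workers, pool of 10), and the dyno count and the
connection limit are made-up numbers, not anything from our setup or
from a real Heroku plan.

```ruby
# Rough upper bound on Postgres connections, assuming each worker
# process may open up to +pool_per_worker+ connections.
def max_db_connections(dynos:, workers_per_dyno:, pool_per_worker:)
  dynos * workers_per_dyno * pool_per_worker
end

workers = 2  # default of UNICORN_WORKERS in config/unicorn.rb
pool    = 10 # default of DB_POOL in the initializer
dynos   = 4  # hypothetical number of web dynos
limit   = 60 # hypothetical connection limit of the database plan

needed = max_db_connections(dynos: dynos,
                            workers_per_dyno: workers,
                            pool_per_worker: pool)
puts "up to #{needed} connections, limit is #{limit}"
puts 'might run out of connections' if needed > limit
```

Scaling dynos multiplies the bound, which is why a pooler in front of
Postgres becomes interesting as we scale.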

After all

RainbowsEventMachineThreadPoolClient is the threaded
EventMachine client for Rainbows!. Everything else is small tweaks to
make it run smoother.

Additionally, I've heard that there are some issues with EM.defer.
We could probably replace EM.defer with Thread.new there,
effectively making it RainbowsEventMachineThreadSpawnClient.
In that case, though, we might also need a larger connection pool or
a lower number of connections.

Last note. Since eventmachine itself is not threadsafe, we need
to wrap eventmachine calls in EM.schedule or
EM.next_tick, depending on the context.

Some personal thoughts

* I hope I don't have to work with EventMachine anymore.
* Probably I should start doing a celluloid-io client for Rainbows!.
* I hope I don't have to work with Rails anymore.
* Probably I should... no, I don't want to make a Rails clone.
* I've talked about the above stuff a thousand times,
I hope this is the last time.


2013-08-21
