What have you found for these years?

2011-06-30

Run Ruby tests inside the main process. No shells.

Another thing I don't like about Rails is that it runs tests
in separate processes. It tends to do a lot of shell programming,
and some commands are even hard-coded into the source.

For example (I don't know whether Rails 3 has improved here), when
you run ./script/console in Rails 2, it actually invokes
an `irb` command instead of using the ruby process that is running
./script/console.

Here's the background.

I don't like RVM because it plays around with various environment
variables. It also didn't work with fish, which is the shell I prefer.
So how do I use multiple versions of Ruby? I have MRI 1.9.2,
MRI 1.8.7, JRuby, and Rubinius installed.

I installed them via Homebrew:

brew install ruby
brew install jruby
brew install rubinius

By running `ruby -S rake` I use MRI 1.9.2 to run my rake tasks.
By running `jruby -S rake` I use JRuby to run my rake tasks.
And so on.

-S is really convenient: it looks the script up on the PATH, so you can
switch between different rubies with no magic involved. But this won't
work: `rbx ./script/console`, because Rails 2 hard-codes `irb`
and invokes it to run the console. Running irb directly means
I end up with whichever ruby is written in irb's shebang. So I ran
Rubinius for ./script/console, but ended up with an MRI 1.9.2 console.

What if I don't have irb at all? What if irb was installed with
--program-suffix, as something like irb19? Hard-coding `irb` is not
acceptable unless --program-suffix and similar options never existed
in the first place.

But I can understand that shell programming is easier, because
that's what we would normally do. I used to do this in my `rake test`:

sh "#{Gem.ruby} -I lib -S bacon test/test_*.rb"

That way, even though I am starting a new ruby process, I start it with
the same ruby that is running the rake task. In the case of
./script/console, it should be "#{Gem.ruby} -S irb ..."
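
The `Gem.ruby` idea can be wrapped in a small helper. This is only a sketch: `same_ruby_command` is a name I made up for illustration, not something Rails or Rake provides. It builds an argv array that runs a script found on the PATH (via -S) with the same interpreter that is running the current process.

```ruby
require 'rubygems'

# Hypothetical helper (not part of Rails or Rake): build an argv array that
# runs `script`, looked up on PATH via -S, with the ruby that is running us.
def same_ruby_command(script, *args)
  [Gem.ruby, '-S', script, *args]
end

# Print the command that would start irb under the current interpreter.
cmd = same_ruby_command('irb', '--simple-prompt')
puts cmd.join(' ')
```

Passing the array to `system(*cmd)` would start irb under whichever ruby ran this snippet, regardless of irb's shebang.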

This worked for me for a while. But... it seems this is still a bad idea.

The problem is that now that Bundler has been introduced, things have
become more and more complicated... Consider this:

bundle exec rake test

Here we have two ruby scripts: one is bundle, the other is rake.
I can use `rbx -S bundle` to make bundler run on Rubinius, but not
rake! `bundle exec` uses whichever ruby is written in the shebang
of rake. Remember? Yes, once again, the same problem as the hard-coded
irb.
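
To see which interpreter a script's shebang names, something like the following could be used. It is a rough sketch: `shebang_interpreter` is a hypothetical helper, and the temporary file only stands in for an installed `rake` or `irb` executable.

```ruby
require 'tempfile'

# Hypothetical helper: return the interpreter named on a script's shebang
# line, or nil when the file does not start with "#!".
def shebang_interpreter(path)
  first_line = File.open(path, &:gets).to_s
  first_line.start_with?('#!') ? first_line[2..-1].strip : nil
end

# Demo with a throwaway script standing in for an installed executable.
Tempfile.create('fake_rake') do |f|
  f.write("#!/usr/local/bin/ruby\nputs 'hi'\n")
  f.flush
  puts shebang_interpreter(f.path)
end
```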

I think JRuby tries to solve this with a kind of hack I don't like...
It peeks into the shell command being invoked, and if the program
name ends with ruby, JRuby fakes the shell command by
using the existing ruby (JVM) process to run the script.

I think this solves the performance problem where running Rails
tests starts ruby many times, and so loads Rails many times, and
so loads our application many times. But then we can't
run any shell program whose name ends with ruby, for example, sam-ruby:

> jruby -ve 'puts `sam-ruby -v`'
jruby 1.6.2 (ruby-1.8.7-p330) (2011-05-23 e2ea975) (Java HotSpot(TM) 64-Bit Server VM 1.6.0_24) [darwin-x86_64-java]
jruby 1.6.2 (ruby-1.8.7-p330) (2011-05-23 e2ea975) (Java HotSpot(TM) 64-Bit Server VM 1.6.0_24) [darwin-x86_64-java]

Joke aside, this hack might solve the issue, but then I can't run the
test task above:

sh "#{Gem.ruby} -I lib -S bacon test/test_*.rb"

Because JRuby bypasses the shell, the glob doesn't work. JRuby tries
to load a file literally called "test/test_*.rb". I thought this was a
JRuby bug and reported it here:

Kernel#` and Kernel#system didn't correctly glob when using ruby
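
One way to avoid depending on the shell for the glob is to expand it in Ruby before building the command. The sketch below assumes a hypothetical helper name, `bacon_argv`, and uses a temporary directory only so the example is self-contained.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: expand the pattern ourselves and return an argv array,
# so no shell is needed to interpret the "*".
def bacon_argv(pattern)
  [Gem.ruby, '-I', 'lib', '-S', 'bacon', *Dir[pattern].sort]
end

# Demo: create two empty test files and show the expanded command.
Dir.mktmpdir do |dir|
  FileUtils.touch(%w[test_a.rb test_b.rb].map { |n| File.join(dir, n) })
  puts bacon_argv(File.join(dir, 'test_*.rb')).inspect
end
```

When `Kernel#system` is given several arguments like this (`system(*argv)`), MRI runs the program directly without a shell, so the result no longer depends on who expands the glob.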

On the other hand, Rails does not suffer from this, because it
passes all the files along in the command instead of relying on a glob
as I do.

Good and bad. The good: it isn't affected by this issue, and
it does not depend on the shell, while I do. The bad: when I have
hundreds and hundreds of test files, it lists ALL of them in the
terminal, which is not desired because it's tooooooo much.

2.5 pages of this:
Yuk.

So in any case, calling shells or running separate ruby processes is
a bad thing unless we really have to, for example when some operation
has side effects that cannot be undone. We can't activate
ActiveRecord 2 and ActiveRecord 3 in one process, and we do want to test
whether cerealize runs under different versions of ActiveRecord. Another
example: in rest-graph, I want to test that whether the user loads only
yajl-ruby, or json, or json_pure, any combination works. We
can't unload libraries, so this must be done in separate processes.
Or we fork?
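
A fork-based version might look roughly like this. It is a sketch under assumptions: `in_fork` is a made-up helper name, and `Process.fork` is not available everywhere (JRuby and Windows do not support it), so this only applies to rubies that can fork.

```ruby
# Hypothetical helper: run the block in a forked child and report whether it
# finished without raising. Whatever the child loads or defines stays there.
def in_fork
  pid = Process.fork do
    status = begin
      yield
      0
    rescue StandardError => e
      warn "child failed: #{e.message}"
      1
    end
    exit!(status) # skip at_exit hooks inherited from the parent
  end
  _, result = Process.wait2(pid)
  result.success?
end

# The constant is defined only in the child; the parent never sees it.
puts in_fork { Object.const_set(:LoadedInChild, true) }
puts Object.const_defined?(:LoadedInChild)
```

A test for each library combination could then `require` its libraries inside the block without affecting the other combinations.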

(Actually JRuby's approach still works here, because it's
another instance of the JRuby virtual machine, even though it's reusing
the JVM process. I guess this is the so-called MVM.)

As a result, if we don't really need separate processes as above,
we should not invoke shells to run another ruby. Instead, just load
everything into memory and do the job there. This is not hard at all.

So I changed my test task to:

desc 'Run tests in memory'
task :test do
  require 'bacon'
  # Same output format and exit-time summary the `bacon` command sets up.
  Bacon.extend(Bacon::TestUnitOutput)
  Bacon.summary_on_exit
  $LOAD_PATH.unshift('lib')
  # Load every test file into the current process instead of spawning ruby.
  Dir['test/test_*.rb'].each{ |file| load file }
end

This is essentially what the `bacon` command does. Just replicate it,
and no matter which ruby we use to run the command, the
tests will run with that ruby.
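
The point that `load` keeps everything in the current process can be checked with a tiny experiment. This is just an illustrative sketch: `load_all` and the `$loaded_pids` global are names I made up, and the temporary directory stands in for a real test/ directory.

```ruby
require 'tmpdir'

# Hypothetical helper: load every file matching the pattern into this process.
def load_all(pattern)
  Dir[pattern].sort.each { |file| load file }
end

$loaded_pids = []
Dir.mktmpdir do |dir|
  # A fake "test file" that records which process executed it.
  File.write(File.join(dir, 'test_pid.rb'), "$loaded_pids << Process.pid\n")
  load_all(File.join(dir, 'test_*.rb'))
end

# The loaded file ran in the same process (and thus the same ruby) as us.
puts $loaded_pids == [Process.pid]
```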

What if we really want a separate process? For example, we might
want to pass -Xprofile to Rubinius (rbx) to use the profiler, and we
only want to see the profile of the test cases, not of any rake task.
Then I'll do this: `rbx -S rake test:shell[-Xprofile]`

desc 'Run tests with shell'
task 'test:shell', :RUBY_OPTS do |t, args|
  # JRuby bypasses the shell, so expand the glob ourselves in that case.
  files = unless defined?(RUBY_ENGINE) && RUBY_ENGINE == 'jruby'
            'test/test_*.rb'
          else
            Dir['test/test_*.rb'].join(' ')
          end

  # Gem.ruby is the interpreter running this rake task; RUBY_OPTS may be nil.
  cmd = [Gem.ruby, args[:RUBY_OPTS],
         '-I', 'lib', '-S', 'bacon', '--quiet', files]

  sh(cmd.compact.join(' '))
end

This works like before, invoking a shell with the RUBY_OPTS
we passed. Note that JRuby cannot use the glob, so expanding all files is
still needed... just as in Rails.

All the code can be found in my personal gemgem project, which holds
my rake tasks for building gems. Previously I used bones to do this,
but I finally realized that gem tasks should be simple and controllable.
It should Just Work™; even if it's ugly, that's ok.
(Sorry bones, after struggling with you a bit, I decided to write my own.)

I use git submodules to pull them in, so there's no need to publish them
as a gem that pollutes other projects.

Yeah, you can see I still have `gem:install` and other tasks invoking
shells, but they are ok for now. I only rewrite the things that cause
headaches...

I am not sure whether Rails would accept such a solution, but if it did,
it would greatly reduce my pain working with Rails. Any thoughts?



All texts are licensed under CC Attribution 3.0