Kerry Buckley

What’s the simplest thing that could possibly go wrong?


Managing gems in a Rails project


Over the years I’ve tried a number of approaches for managing gem dependencies in a Rails project. Here’s a quick round-up of what I’ve tried, and the pros and cons of each.

Just use what’s on the system

This is probably most people’s default approach when first starting with Rails. Just sudo gem install whatever you need, require the appropriate gems (either in environment.rb or in the class that uses them), and you’re away.

This mostly works OK for small projects where you’re the only developer, but you still need to make sure the right gems are installed on the machine you’re deploying the application to.

Worse, though, is what happens when you come back to the project after a while, various gems have been updated, and things mysteriously don’t work any more. Not only do you have to mess around getting the code to work with the latest gem versions, but you probably don’t even know exactly which versions it used to work with.

Freeze (unpack) gems

I think I first came across this technique in Err the Blog’s Vendor Everything post. The idea is to install copies of all your gems into the project’s vendor/gems directory, meaning that wherever the code is running, you can guarantee that it has the correct versions of all its dependencies.

This got much easier in Rails 2.1, which allowed you to specify all your gems using config.gem lines in environment.rb (you can also put gems only needed in specific environments in the appropriate file, e.g. you might only want to list things like rspec and cucumber in config/environments/test.rb). You can then run sudo rake gems:install to install any gems that aren’t on your system, and rake gems:unpack to freeze them into vendor/gems, and be sure that wherever you check out or deploy the code, you’ll be running the same versions of the gems. There’s even a gems:build task to deal with gems that have native code (but more on that later).
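As a rough sketch, the config.gem lines might look something like the following. SketchConfig is a made-up stand-in so that the snippet runs without Rails; in a real project the calls would sit inside the Rails::Initializer.run block in config/environment.rb, or in a file under config/environments/. The gem names and versions are illustrative only (foo is hypothetical).

```ruby
# Stand-in for the Rails 2.x configuration object (an assumption for this
# sketch, not part of Rails). It only records what was asked for.
class SketchConfig
  attr_reader :gems

  def initialize
    @gems = []
  end

  # Same call shape as config.gem: a gem name plus an options hash.
  def gem(name, options = {})
    @gems << [name, options]
  end
end

config = SketchConfig.new

# config/environment.rb: needed everywhere, pinned to an exact version.
config.gem 'foo', :version => '1.2.3'

# :lib names the file to require when it differs from the gem name,
# and :source says where gems:install should fetch the gem from.
config.gem 'mislav-will_paginate',
           :version => '2.3.11',
           :lib     => 'will_paginate',
           :source  => 'http://gems.github.com'

# config/environments/test.rb: test-only gem; :lib => false skips the
# automatic require when the environment loads.
config.gem 'cucumber', :version => '0.4.3', :lib => false

config.gems.each do |name, options|
  puts "#{name} #{options.inspect}"
end
```

Pinning an exact :version is what makes the unpacked copy reproducible; leaving it out means gems:install takes whatever is newest at the time.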

Subsequent versions of Rails have improved on the original rake tasks – dependencies are now handled much better, for example – but there are still a few problems. The main one is the handling of gems that are required by rake tasks in your project, rather than just from your application code.

When you call a rake task in your Rails project, this is more-or-less what happens (I may have got some of the details slightly wrong):

  1. The top-level Rakefile is loaded.
  2. This in turn requires config/boot.rb, but not config/environment.rb.
  3. It then requires some standard rake stuff, and finally tasks/rails (which is part of Rails – specifically railties). This finds and requires all the .rake files in your plugins and your project’s lib/tasks directory.

The problems start when you have a task that depends on the Rails environment task, and also requires a gem which is listed in environment.rb. Because the gem-loading magic only happens when the environment is loaded, the rake task will be blissfully unaware of your frozen gems, and will load them from the system instead.

If the system gem is newer than the frozen one, you get errors like this:

can't activate foo (= 1.2.3, runtime) for [], already activated foo-1.2.4 for []

If you work on two projects that use different versions of a gem like this, you end up having to uninstall and reinstall them as you switch from one to the other, which gets tedious fairly quickly.
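The ordering problem behind that error can be modelled in a few lines. This is a toy model only: ToyActivator and its methods are invented for illustration and are not how RubyGems is implemented. The version comparison does use the real Gem::Requirement and Gem::Version classes.

```ruby
require 'rubygems'

# Toy model of gem activation (hypothetical class, not a RubyGems internal).
class ToyActivator
  def initialize(installed)
    @installed = installed # gem name => version strings visible on this machine
    @activated = {}        # gem name => Gem::Version already loaded
  end

  def activate(name, requirement = '>= 0')
    req = Gem::Requirement.new(requirement)
    current = @activated[name]
    if current
      return current if req.satisfied_by?(current)
      raise "can't activate #{name} (#{req}), already activated #{name}-#{current}"
    end
    versions = @installed.fetch(name).map { |v| Gem::Version.new(v) }
    # With no conflict, the newest version satisfying the requirement wins.
    @activated[name] = versions.select { |v| req.satisfied_by?(v) }.max
  end
end

# 1.2.3 is the frozen copy, 1.2.4 is the newer system gem.
gems = ToyActivator.new('foo' => ['1.2.3', '1.2.4'])

# Step 1: a .rake file requires foo at load time, before the environment
# has been loaded, so the newest version is picked.
first = gems.activate('foo')

# Step 2: the environment task loads environment.rb, which pins 1.2.3.
conflict = nil
begin
  gems.activate('foo', '= 1.2.3')
rescue RuntimeError => e
  conflict = e.message
end

puts "activated first: foo-#{first}"
puts conflict
```

Swapping the two steps makes the conflict disappear in this model, which is why the problem only bites tasks that load a gem before the environment.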

Specify gems, but don’t freeze

You can get round the wrong-version problem to some extent by specifying version numbers in environment.rb as ‘>= x.y.z’ (or by not specifying them at all). If you’re doing that, though, there’s not really much benefit in unpacking the gems, and you may as well just use rake gems:install to make sure they’re on the system. Of course the downside of this approach is that you can’t be sure that everyone’s running the exact same versions of the gems. Worse still, you can’t be sure that what’s on your production box matches your development and test environments.
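To make the trade-off concrete, here is a small sketch using RubyGems’ own Gem::Requirement and Gem::Version classes. The candidate version numbers are made up; the point is only how many of them each style of requirement accepts.

```ruby
require 'rubygems'

loose = Gem::Requirement.new('>= 1.2.3')
exact = Gem::Requirement.new('= 1.2.3')

# Versions that might be installed on different machines (illustrative).
candidates = %w[1.2.2 1.2.3 1.2.4 2.0.0].map { |v| Gem::Version.new(v) }

loose_matches = candidates.select { |v| loose.satisfied_by?(v) }.map(&:to_s)
exact_matches = candidates.select { |v| exact.satisfied_by?(v) }.map(&:to_s)

puts "'>= 1.2.3' accepts: #{loose_matches.join(', ')}"
puts "'= 1.2.3' accepts: #{exact_matches.join(', ')}"
```

A loose requirement is happy with any of several versions, so two machines can both satisfy it while running different code.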

GemInstaller

GemInstaller solves most of the problems with the built-in Rails gem management by running as a preinitializer, meaning it gets loaded before the other boot.rb gubbins.

GemInstaller uses the gems installed on the system rather than freezing them into the project, but because it gets to run first it ensures that the correct versions are used, even if there are newer versions installed. By default it checks your project’s gem list and installs anything that’s missing every time it runs (which is whenever you start a server, run the console, execute a rake task etc). You create a YAML file listing the gems you need (dependencies are handled automatically), and other options such as an HTTP proxy if necessary.

Of course on Unix-like systems, which is most of them (although I hear there are still people developing Rails projects on Windows), gems are generally installed as root. GemInstaller can get round this in two ways – either by setting the --sudo option and setting a rule in /etc/sudoers to allow the appropriate user(s) to run the gem commands as root without having to provide a password, or by using the built-in gem behaviour that falls back to installing in ~/.gem.

Personally I like to keep all my gems in one place, accessible to any user, so I went for the sudo approach. The only problem with this is that it uses sudo for all gem commands, rather than just install or update, which means it runs a sudo gem list every time your app starts up. Depending on the way you have Apache and Passenger set up this may mean granting sudo access to what should be a low-privileged user.

I ended up disabling the automatic updating of gems, and just warning when they’re missing instead. In fact later versions of GemInstaller don’t try to handle the update automatically anyway.

I created a separate script to do the update, which can be run manually, on a post-merge git hook, or as part of the Capistrano deployment task.
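The warn-only check could be sketched roughly like this. It is not GemInstaller’s code: missing_gems is a hypothetical helper, and the wanted and installed lists are hard-coded assumptions where a real script would read the project’s gem list and ask RubyGems what is installed.

```ruby
require 'rubygems'

# Hypothetical helper: returns the names of wanted gems for which no
# installed version satisfies the requirement.
def missing_gems(wanted, installed)
  wanted.reject do |name, requirement|
    req = Gem::Requirement.new(requirement)
    installed.fetch(name, []).any? { |v| req.satisfied_by?(Gem::Version.new(v)) }
  end.keys
end

# Illustrative data only.
wanted    = { 'rails' => '= 2.3.4', 'cucumber' => '= 0.4.3' }
installed = { 'rails' => ['2.3.2', '2.3.4'], 'cucumber' => ['0.4.2'] }

missing = missing_gems(wanted, installed)

# Warn rather than install, so starting the app never needs sudo.
missing.each do |name|
  warn "Missing gem: #{name} (#{wanted[name]}) - run the update script"
end
```

Keeping the check free of side effects means it is safe to run on every startup, while the actual install happens only in the separate script.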

Because GemInstaller needs to go out to the network to fetch any new or updated gems, things get a bit more painful (as always) if you are unfortunate enough to be stuck behind a corporate HTTP proxy. Actually it’s easy enough to configure if you’re always behind a proxy, but it gets slightly trickier if your web access is sometimes proxied and sometimes direct. Nothing that can’t be solved of course.

Unfortunately you can still end up with version conflicts if one of the gems you have specified pulls in a dependency and you then explicitly require an older version of that same dependency, but these can usually be resolved by shuffling the order of the gems in geminstaller.yml.

Bundler

Bundler is the newest kid on the gem management block, and looks to have solved pretty much all the problems faced by the other approaches. It’s based on the gem management approach from Merb, and can be used in any Ruby project (not just Rails).

Bundler works by unpacking gems into the project (I recommend using a directory other than the default vendor/gems to avoid confusing Rails – this can be configured by setting bundle_path and bin_path in the Gemfile), but the intention is that you only commit the .gem files in the cache directory to source control. Gems are then installed locally within the project, including any platform-specific native code as well as the commands in bin.

Because Bundler resolves all dependencies up-front, you only need to specify the gems you’re using explicitly, and let it handle the rest, which hopefully means an end to version conflicts at last.

Here’s an example Gemfile:

[ruby]
source 'http://gemcutter.org'
source 'http://gems.github.com'
bundle_path 'vendor/bundled_gems'
bin_path 'vendor/bundled_gems/bin'

gem 'rails', '2.3.4'
gem 'bundler', '0.6.0'

gem 'capistrano', '2.5.8'
gem 'capistrano-ext', '1.2.1'
gem 'cucumber', '0.4.3', :except => :production
# [more gems here]

disable_system_gems
[/ruby]

Note the two additional sources (rubyforge.org is configured by default), the path overrides, and the last line, which removes the system gems from the paths, avoiding any potential confusion.

I’ve put this in config/preinitializer.rb to update from the cached gems on startup (this doesn’t hit the network):

[ruby]
$stderr.puts 'Updating bundled gems…'
system 'gem bundle --cached'
require "#{RAILS_ROOT}/vendor/bundled_gems/environment"
[/ruby]

To avoid any startup delays after an upgrade, I also call system 'gem bundle --cached' from the after_update_code hook in the Capfile.

Finally, to make sure only the .gem files are checked in, add these lines to .gitignore (you’ll still need to explicitly git add the bundled_gems/cache directory):

vendor/bundled_gems/*
!vendor/bundled_gems/cache

[Update 3 November] Yehuda Katz just posted an article all about Bundler, including features coming in the imminent 0.7 release.


Written by Kerry

November 2nd, 2009 at 5:24 pm

Posted in Rails,Ruby