Categories
git Software

Build and push to upstream git repository in the background

[Updated 11 Feb 2011 – I’m not sure the original version actually worked properly.]

Quite often I’ll make small changes to a project’s code which I’m 99% sure won’t break the build. Of course thanks to Murphy’s Law that probability falls to about 10% if I decide to risk it push the changes upstream without running the full build locally.

Now even if the build only takes a few minutes to run, that’s a few minutes which I can’t spend doing anything else with the code, because if I save a file I can’t be sure whether the original or changed version will be used in the tests. What I’d like to be able to do is to run the build in the background, with changes automatically pushed upstream if everything’s green, or an alert shown if either the build fails or the push is rejected (because someone else has pushed a change while the build was running).

Fortunately with a distributed version control system like git, this is fairly easy.

Firstly, clone the local working repository and set up a remote pointing to the clone:

git clone . ../myproject-staging
git remote add staging ../myproject-staging

Now set up the cloned copy:

cd ../myproject-staging

# Allow pushing into the checked-out branch
git config receive.denyCurrentBranch ignore

# Create a remote for the upstream server you want to push to (change the URI!)
git remote add upstream git@your.upstream.server/your/repository/path.git

Do anything that needs doing to allow the cloned project to build (if you have a test database, create a separate instance to avoid tests interacting with your main workspace), and verify that the build runs.

Now add the following to .git/hooks/post-receive:

#!/bin/bash

cd ..
export GIT_DIR=.git
git reset --hard
nohup .git/hooks/post-reset 1> build_output 2>&1 &

This will reset your checked-out working copy to the pushed master, then run .git/hooks/post-reset in the background. Create that file with the following content:

#!/bin/bash

function error {
  growlnotify -s -t `basename $PWD` -m "$1"
  exit 1
}

# Replace this with your build command
bundle install --local && rake || error "Rake failed"

git push upstream master || error "Git push failed"
growlnotify -s -t `basename $PWD` -m "Built and pushed successfully"

Make sure both these hooks are executable:

chmod u+x .git/hooks/post*-receive

Now, back in your normal working repository you should just be able to type git push staging (add --force if you’ve rebased your normal master and get an error about the push not being a fast-forward), and the project will build in the staging repo. If everything works, it’ll push upstream and you’ll get a Growl notification (assuming you have Growl and growlnotify, or the equivalents for your OS, installed). If it fails, you’ll also get a Growl notification. I’ve made the alerts sticky, so I don’t miss them – if you don’t want that, just remove the -s flag. The build errors will be in the build_output file.

Categories
Software

The Wandering Book

After procrastinating and holding on to the wandering book for much too long, I finally made my entry. You can read it on the site, and here’s the text if you can’t decipher my handwriting:

I work in a large “enterprisey” company where craftsmanship is, unsurprisingly,
not the dominant model of software development. However, a craftsmanship ethos
still exists in some of the remaining pockets of in-house development, and if
anything appears to be strengthening.

For the first few years of my software career, I treated it very much as a day
job. I turned up; I wrote some code (or, for a horrendous proportion of the
time, some documents); I went home. I learned just enough to enable me to
complete whatever I was working on, but no more. I used the tools I was given,
or those I was familiar with, regardless of their suitability for the task at
hand.

Then I started working with someone who was trying cool new stuff. Dependency
injection. Automated testing. Continuous integration. I realised how
out-of-touch I was in my ideas of how software should be developed, and started
trying to put that right. I read about XP and Agile, and started following
blogs and reading articles. I went to XPDay, and turned up at the odd XTC pub
night when I happened be in London on a Tuesday.

More importantly, I started caring about my code. Instead of thousand-line
procedures, I started separating concerns and following OO principles. I wrote
unit tests, and after a while I got the hang of TDD. Month-long manual testing
and bug fixing phases gave way to an automated nightly build and a quick
confidence check, and eventually to something approaching continuous
deployment.

Of course, the more widely I look at what others are doing, the more aware I am
that even when I think I might be doing something reasonably well, there are
always people out there doing it much, much better. I may never approach their
level, but I can learn from their work and thus improve mine. In turn I hope I
can pass something on to others.

So, to finally get to the point, I agree with earlier contributors that
craftsmanship is about caring, learning, practicing and sharing. Underpinning
all this though is community. Not ‘the’ software craftsmanship community, but
the community of other developers you work with; the communities around the
languages and tools you use; the agile, lean and XP communities; the community
of random people you met once at a conference – or maybe have never met at all
– and started following on Twitter.

Without these communities, we become isolated, and may start believing that the
way we currently work is ‘best practice’. We need our fellow professionals to
compare ourselves against and to learn from, otherwise we can lapse into
complacency and let our skills (and our profession) stagnate.

[tags]software,craftsmanship,wanderingbook[/tags]

Categories
Rails Ruby

Memoised remote attribute readers for ActiveRecord

I was recently working with an ActiveRecord class that exposed some attributes retrieved from a remote API, rather than from the database. The rules for handling the remote attributes were as follows:

  • If the record is unsaved, return the local value of the attribute, even if it’s nil.
  • If the record is saved and we don’t have a local value, call the remote API and remember and return the value.
  • If the record is saved and we already have a local value, return that.

Here’s the original code (names changed to protect the innocent):

class MyModel < ActiveRecord::Base
  attr_writer :foo, :bar

  def foo
    (new_record? || @foo) ? @foo : remote_object.foo
  end

  def bar
    (new_record? || @bar) ? @bar : remote_object.bar
  end

  def remote_object
    @remote_object ||= RemoteService.remote_object
  end
end

The remote_object method makes a call to the remote service, and memoises the returned object (which contains all the attributes we are interested in).

I didn't really like the duplication in all these accessor methods – we had more than the two I've shown here – so decided to factor it out into a common remote_attr_reader class method. Originally I had the method take a block which returned the remote value, but that made the tests more complicated, so I ended up using convention over configuration and having the accessor for foo call a remote_foo method.

Here's the new code in the model:

class MyModel < ActiveRecord::Base
  remote_attr_reader :foo, :bar

  def remote_foo
    remote_object.foo
  end

  def remote_bar
    remote_object.bar
  end

  def remote_object
    @remote_object ||= RemoteService.remote_object
  end
end

Here's the RemoteAttrReader module that makes it possible:

module RemoteAttrReader
  def remote_attr_reader *names
    names.each do |name|
      attr_writer name
      define_method name do
        if new_record? || instance_variable_get("@#{name}")
          instance_eval "@#{name}"
        else
          instance_eval "remote_#{name}"
        end
      end
    end
  end
end

To make the module available to all models, I added an initialiser containing this line:

ActiveRecord::Base.send :extend, RemoteAttrReader

Here's the spec for the module:

require File.dirname(__FILE__) + '/../spec_helper'

class RemoteAttrReaderTestClass
  extend RemoteAttrReader
  remote_attr_reader :foo

  def remote_foo
    "remote value"
  end
end

describe RemoteAttrReader do
  let(:model) { RemoteAttrReaderTestClass.new }

  describe "for an unsaved object" do
    before do
      model.stub(:new_record?).and_return true
    end

    describe "When the attribute is not set" do
      it "returns nil" do
        model.foo.should be_nil
      end
    end

    describe "When the attribute is set" do
      before do
        model.foo = "foo"
      end

      it "returns the attribute" do
        model.foo.should == "foo"
      end
    end
  end

  describe "for a saved object" do
    before do
      model.stub(:new_record?).and_return false
    end

    describe "When the attribute is set" do
      before do
        model.foo = "foo"
      end

      it "returns the attribute" do
        model.foo.should == "foo"
      end
    end

    describe "When the attribute is not set" do
      it "returns the result of calling remote_" do
        model.foo.should == "remote value"
      end
    end
  end
end

To simplify testing of the model, I created a matcher, which I put into a file in spec/support:

class ExposeRemoteAttribute
  def initialize attribute
    @attribute = attribute
  end

  def matches? model
    @model = model
    return false unless model.send(@attribute).nil?
    model.send "#{@attribute}=", "foo"
    return false unless model.send(@attribute) == "foo"
    model.stub(:new_record?).and_return false
    return false unless model.send(@attribute) == "foo"
    model.send "#{@attribute}=", nil
    model.stub("remote_#{@attribute}").and_return "bar"
    model.send(@attribute) == "bar"
  end

  def failure_message_for_should
    "expected #{@model.class} to expose remote attribute #{@attribute}"
  end

  def failure_message_for_should_not
    "expected #{@model.class} not to expose remote attribute #{@attribute}"
  end

  def description
    "expose remote attribute #{@attribute}"
  end
end

def expose_remote_attribute expected
  ExposeRemoteAttribute.new expected
end

Testing the model now becomes a simple case of testing the remote_ methods in isolation, and using the matcher to test the behaviour of the remote_attr_reader call(s).

require File.dirname(__FILE__) + '/../spec_helper'

describe MyModel do
  it { should expose_remote_attribute(:name) }
  it { should expose_remote_attribute(:origin_server) }
  it { should expose_remote_attribute(:delivery_domain) }

  describe "reading remote foo" do
    # test as a normal method
  end
end

[tags]ruby,rails,activerecord,metaprogramming,rspec,matcher,refactoring,dry[/tags]

Categories
Rails Ruby

Managing gems in a Rails project

Over the years I’ve tried a number of approaches for managing gem dependencies in a Rails project. Here’s a quick round-up of what I’ve tried, and the pros and cons of each.

Just use what’s on the system

This is probably most people’s default approach when first starting with Rails. Just sudo gem install whatever you need, require the appropriate gems (either in environment.rb or in the class that uses them), and you’re away.

This mostly works OK for small projects where you’re the only developer, but you still need to make sure the right gems are installed on the machine you’re deploying the application to.

Worse, though, is what happens when you come back to the project after a while, various gems have been updated, and things mysteriously don’t work any more. Not only do you have to mess around getting the code to work with the latest gem versions, but you probably don’t even know exactly which versions it used to work with.

Freeze (unpack) gems

I think I first came across this technique in Err the Blog’s Vendor Everything post. The idea is to install copies of all your gems into the project’s vendor/gems directory, meaning that wherever the code is running, you can guarantee that it has the correct versions of all its dependencies.

This got much easier in Rails 2.1, which allowed you to specify all your gems using config.gem lines in environment.rb (you can also put gems only needed in specific environments in the appropriate file, eg you might only want to list things like rspec and cucumber in config/enviroments/test.rb). You can then run sudo rake gems:install to install any gems that aren’t on your system, and rake gems:unpack to freeze them into vendor/rails, and be sure that wherever you check out or deploy the code, you’ll be running the same versions of the gems. There’s even a gems:build task to deal with gems that have native code (but more on that later).

Subsequent versions of Rails have improved on the original rake tasks – dependencies are now handled much better, for example – but there are still a few problems. The main one is the handling of gems that are required by rake tasks in your project, rather than just from your application code.

When you call a rake task in your Rails project, this is more-or-less what happens (I may have got some of the details slightly wrong):

  1. The top-level Rakefile is loaded.
  2. This in turn requires config/boot.rb, but not config/environment.rb.
  3. It then requires some standard rake stuff, and finally tasks/rails (which is part of Rails – specifically railties). This finds and requires all the .rake files in your plugins and your project’s lib/rake directory.

The problems start when you have a task depends on the rails environment task, and also requires a gem which is listed in environment.rb. Because the gem-loading magic only happens when the environment is loaded, the rake task will be blissfully unaware of your frozen gems, and will load them from the system instead.

If the system gem is newer than the frozen one, you get errors like this:

can't activate foo (= 1.2.3, runtime) for [], already activated foo-1.2.4 for []

If you work on two projects that use different versions of a gem like this, you end up having to uninstall and reinstall them as you switch from one to the other, which gets tedious fairly quickly.

Specify gems, but don’t freeze

You can get round the wrong-version problem to some extent by specifying version numbers in environment.yml as ‘>=x.z.y’ (or by not specifying them at all). If you’re doing that, though, there’s not really much benefit in unpacking the gems, and you may as well just use rake gems:install to make sure they’re on the system. Of course the downside of this approach is that you can’t be sure that everyone’s running the exact same versions of the gems. Worse still, you can’t be sure that what’s on your production box matches your development and test environments.

GemInstaller

GemInstaller solves most of the problems with the built-in Rails gem management by running as a preinitializer, meaning it gets loaded before the other boot.rb gubbins.

GemInstaller uses the gems installed on the system rather than freezing them into the project, but because it gets to run first it ensures that the correct versions are used, even if there are newer versions installed. By default it checks your project’s gem list and installs anything that’s missing every time it runs (which is whenever you start a server, run the console, execute a rake task etc). You create a YAML file listing the gems you need (dependencies are handled automatically), and other options such as an HTTP proxy if necessary.

Of course on Unix-like systems, which is most of them (although I hear there are still people developing Rails projects on Windows), gems are generally installed as root. GemInstaller can get round this in two ways – either by setting the --sudo option and setting a rule in /etc/sudoers to allow the appropriate user(s) to run the gem commands as root without having to provide a password, or by using the built-in gem behaviour that falls back to installing in ~/.gem.

Personally I like to keep all my gems in one place, accessible to any user, so I went for the sudo approach. The only problem with this is that it uses sudo for all gem commands, rather than just install or update, which means it runs a sudo gem list every time your app starts up. Depending on the way you have Apache and Passenger set up this may mean granting sudo access to what should be a low-privileged user.

I ended up disabling the automatic updating of gems, and just warning when they’re missing instead. In fact later versions of GemInstaller don’t try to handle the update automatically anyway.

I created a separate script to do the update, which can be run manually, on a post-merge git hook, or as part of the Capistrano deployment task.

Because GemInstaller needs to go out to the network to fetch any new or updated gems, things get a bit more painful (as always) if you are unfortunate enough to be stuck behind a corporate HTTP proxy. Actually it’s easy enough to configure if you’re always behind a proxy, but it gets slightly trickier if your web access is sometimes proxied and sometimes direct. Nothing that can’t be solved of course.

Unfortunately you can still end up with version conflicts if a gem is required by one you have specified, then you explicitly require an older version, but these can usually be resolved by shuffling the order of the gems in geminstaller.yml.

Bundler

Bundler is the newest kid on the gem management block, and looks to have solved pretty much all the problems faced by the other approaches. It’s based on the gem management approach from Merb, and can be used in any Ruby project (not just Rails).

Bundler works by unpacking gems into the project (I recommend using a directory other than the default vendor/gems to avoid confusing Rails – this can be configured by setting bundle_path and bin_path in the Gemfile), but the intention is that you only commit the .gem files in the cache directory to source control. Gems are then installed locally within the project, including any platform-specific native code as well as the commands in bin.

Because Bundler resolves all dependencies up-front, you only need to specify the gems you’re using explicitly, and let it handle the rest, which hopefully means an end to version conflicts at last.

Here’s an example Gemfile:

source 'http://gemcutter.org'
source 'http://gems.github.com'
bundle_path 'vendor/bundled_gems'
bin_path 'vendor/bundled_gems/bin'

gem 'rails', '2.3.4'
gem 'bundler', '0.6.0'

gem 'capistrano', '2.5.8'
gem 'capistrano-ext', '1.2.1'
gem 'cucumber', '0.4.3', :except => :production
# [more gems here]

disable_system_gems

Note the two additional sources (rubyforge.org is configured by default), the path overrides, and the last line, which removes the system gems from the paths, avoiding any potential confusion.

I’ve put this in config/preinitializer.rb to update from the cached gems on startup (this doesn’t hit the network):

$stderr.puts 'Updating bundled gems...'
system 'gem bundle --cached'
require "#{RAILS_ROOT}/vendor/bundled_gems/environment"

To avoid any startup delays after an upgrade, I also call system 'gem bundle --cached' from the after_update_code hook in the capfile.

Finally, to make sure only the .gem files are checked in, add these lines to .gitignore (you’ll still need to explicitly git add the bundled_gems/cache directory):

vendor/bundled_gems
!vendor/bundled_gems/cache

[Update 3 November] Yehuda Katz just posted an article all about Bundler, including features coming in the imminent 0.7 release.

[tags]ruby,rails,gems,rubygems,geminstaller,bundler[/tags]

Categories
git Software

Maintaining a Read-Only svn Mirror of a git Repository

Our team been using git at work for the past couple of years, but there’s now a corporate push to keep everything in a centrally-managed subversion repository. We lost the battle to get corporate approval for git (apparently we’re happy to employ people to write our code that we wouldn’t trust to be able to use a DVCS), but at least we have agreement that we can continue using git as long as we mirror the code into subversion.

This isn’t entirely trivial to do using git-svn (which is really intended for use with subversion acting as the master), but I found a few sets of instructions on the web. The simplest one was this Nano Rails blog post, which is what the steps below are based on.

The end result should be a subversion trunk which is a mirror of the git repository’s master branch. This is only intended to be a one-way mirror – I wouldn’t recommend also trying to commit into subversion and merge those commits back upstream into git.

You need to have git-svn installed (this comes with the default installation, or you can use the +svn option when installing via MacPorts).

Create subversion repository

Create the subversion repository in the usual way, using svnadmin.

Once you’ve got an empty repository to point to (we’ll imagine it’s at http://svn.example.com/foo), you also need to commit an initial version (I also created a trunk directory in this step, in case we later decide to mirror branches too):

svn co http://svn.example.com/foo
cd myproj
svn mkdir trunk
svn commit -m'Created trunk directory'

Once this is done, you can throw away the directory you checked out of subversion.

Set up the subversion remote

This step, and subsequent ones, need to be performed on whichever git repository you want to mirror from.

In our case, we have a central repository running on a local installation of Gitorious. This is a bare repository, which makes things a little tricker, as git-svn requires a working copy. To get round this, we create a clone, which we’ll use as an intermediate step in the mirroring process. If you’re not mirroring a bare repository, you can omit this step.

The repositories we want to mirror are in ~git/repositories, and we’ve created a directory ~git/repositories/svn-mirror where we’ll put the clones. For this example, we’ll use a repository called foo/mainline.git.

Create the clone:

git clone ~git/repositories/foo/mainline.git ~git/repositories/svn-mirror/foo
cd ~git/repositories/svn-mirror/foo

Now add the following to .git/config (with the correct svn URI, of course):

[svn-remote "svn"]
	url = http://svn.example.com/foo/trunk
	fetch = :refs/remotes/git-svn

Now do an initial fetch of the empty subversion remote, and check it out as a new git branch (called svn):

git svn fetch svn
git checkout -b svn git-svn

You can now merge in all your commits from master, and push them to subversion. You’ll probably want to go and make a coffee or something while the dcommit runs – if you haven’t used subversion for a while you’ve probably forgotten just how much slower it is than git.

git merge master
git svn dcommit

To allow pushing to svn from master, rebase master to the svn branch (which can then be deleted):

git checkout master
git rebase svn
git branch -d svn

At this point you should be able to manually update subversion at any time by running git svn dcommit from the master branch.

Automating subversion updates

In theory it should be possible to set up post-receive hooks to push from the gitorious repository to the clone and from there to subversion, but I decided to separate the two by just periodically polling for changes, as we don’t really care about subversion being right up-to-the-minute. To poll hourly, add something like this to the git user’s crontab:

0 * * * * (cd /usr/local/git/repositories/svn-mirror/foo;/usr/local/bin/git pull origin master;/usr/local/bin/git svn dcommit) >>/usr/local/git/gitorious/log/svn-mirror.log 2>&1

[tags]svn, subversion, git, mirror[/tags]

Categories
Ruby Uncategorized

Comments aren’t always evil

I tend to agree that comments are, in most cases, evil (or at least mildly malevolent), but I did come across one of the exceptions to the rule today.

While doing a bit of drive-by refactoring while fixing a bug, I reflexively changed this line:

unless instance_response.nil?

to this:

if instance_response

Then reading the comment above the line, expecting to delete it, it all came flooding back:

# Use instance_response.nil? to check if the HTTParty
# response's inner hash is empty.
# If you use 'if instance_response', it is always true.

Now you could maybe argue that this unexpected behaviour is because httparty uses just a little too much of that old method missing proxy magic (which of course isn’t really magic at all), but that’s not the point of this post.

In the end I pulled it out into a private method to make it clearer what was going on, but decided to leave the comment in.

def self.instance_returned? instance_response
  # Use instance_response.nil? to check if the HTTParty
  # response's inner hash is empty.
  # If you use 'if instance_response', it is always true.
  !instance_response.nil?
end
Categories
Ruby

Naresh Jain’s refactoring teaser

Naresh Jain recently posted a refactoring teaser. The original code was in Java, but I thought I’d have a go at refactoring it in Ruby instead. I deliberately didn’t look at Naresh’s solution beforehand, so mine goes in a rather different direction.

My code’s here (with the specs in the same file for simplicity), and you can step through the history to see the state at each step of the refactoring. The links in the text below take you to the relevant version of the file, and the Δs next to them link to the diffs.

Firstly, I converted the code to Ruby, and made sure the tests (which I converted to RSpec) still passed. It’s a fairly straight conversion, although I made a couple of changes while I was at it – mainly turning it into a module which I mixed into String. I changed the name of the method to phrases, to avoid conflicting with the built-in String#split. The regular expression to split on is much simpler too, because Ruby didn’t understand the original, and I had no idea what it was supposed to do anyway.

Here’s my initial Ruby version:

#!/usr/bin/env ruby
#
#See http://blogs.agilefaqs.com/2009/07/08/refactoring-teaser-part-1/

require 'spec'

module StringExtensions
  REGEX_TO_SPLIT_ALONG_WHITESPACES = /\s+/
 
  def phrases(number)
    list_of_keywords = ""
    count = 0
    strings = split(REGEX_TO_SPLIT_ALONG_WHITESPACES)
    all_strings = single_double_triple_words(strings)
    size = all_strings.size
    all_strings.each do |phrase|
      break if count == number
      list_of_keywords += "'" + phrase + "'"
      count += 1
      if (count < size && count < number)
        list_of_keywords += ", "
      end
    end
    return list_of_keywords
  end
 
  private
 
  def single_double_triple_words(strings)
    all_strings = []
    num_words = strings.size
 
    return all_strings unless has_enough_words(num_words)
 
    # Extracting single words. Total size of words == num_words
 
    # Extracting single-word phrases.
    (0...num_words).each do |i|
      all_strings << strings[i]
    end
 
    # Extracting double-word phrases
    (0...num_words - 1).each do |i|
      all_strings << "#{strings[i]} #{strings[i + 1]}"
    end
 
    # Extracting triple-word phrases
    (0...num_words - 2).each do |i|
      all_strings << "#{strings[i]} #{strings[i + 1]} #{strings[i + 2]}"
    end
    return all_strings
  end
  
  def has_enough_words(num_words)
    num_words >= 3
  end
end

String.send(:include, StringExtensions)

describe StringExtensions do
  it 'finds all phrases' do
    'Hello World Ruby'.phrases(6).should == "'Hello', 'World', 'Ruby', 'Hello World', 'World Ruby', 'Hello World Ruby'"
  end

  it 'returns all phrases when asked for more than exist' do
    'Hello World Ruby'.phrases(10).should == "'Hello', 'World', 'Ruby', 'Hello World', 'World Ruby', 'Hello World Ruby'"
  end

  it 'returns the first n phrases when asked for fewer than exist' do
    'Hello World Ruby'.phrases(4).should == "'Hello', 'World', 'Ruby', 'Hello World'"
  end

  it 'returns the first word when asked for one phrase' do
    'Hello World Ruby'.phrases(1).should == "'Hello'"
  end
end

I didn’t change the specs at all during the refactoring (because I didn’t change the API or add any new public methods or classes), and made sure they all passed at each step.

The first thing Δ I changed was to simplify that big iterator in phrases that loops through the list of phrases, formatting the output string. Basically all this does is to put each phrase in quotes, then stitch them all together separated by a comma and a space. The first of those tasks is a simple map, and the second is a join. The whole method collapses down to this (ruby methods automatically return the result of their last statement):

def phrases(number)
  strings = split(REGEX_TO_SPLIT_ALONG_WHITESPACES)
  all_strings = single_double_triple_words(strings)
  all_strings[0, number].map {|s| "'#{s}'"}.join(', ')
end

Next Δ I remembered that by default String#split splits at whitespace anyway, so I did away with the regular expression. Then Δ I renamed the strings variable to words to make its purpose a little clearer, leaving the phrases method looking like this:

def phrases(number)
  words = split
  all_strings = single_double_triple_words(words)
  all_strings[0, number].map {|s| "'#{s}'"}.join(', ')
end

The section of single_double_triple_words that extracted the single words seemed redundant, as we already had that list – the original words. I removed it Δ, and initialised all_strings to the word list instead (not forgetting to rename single_double_triple_words to match its new behaviour):

module StringExtensions
  def phrases(number)
    words = split
    all_strings = words
    all_strings += double_triple_words(words)
    all_strings[0, number].map {|s| "'#{s}'"}.join(', ')
  end
 
  private
 
  def double_triple_words(strings)
    all_strings = []
    num_words = strings.size
 
    return all_strings unless has_enough_words(num_words)
 
    # Extracting double-word phrases
    (0...num_words - 1).each do |i|
      all_strings << "#{strings[i]} #{strings[i + 1]}"
    end
 
    # Extracting triple-word phrases
    (0...num_words - 2).each do |i|
      all_strings << "#{strings[i]} #{strings[i + 1]} #{strings[i + 2]}"
    end
    return all_strings
  end
  
  def has_enough_words(num_words)
    num_words >= 3
  end
end

That has_enough_words method seemed a bit odd – particularly the way it was only called once, rather than after extracting each set of phrases. I decided it was probably a premature and incomplete attempt at optimisation, and removed it Δ for now.

My next target was the duplication in the blocks that calculate double- and triple-word phrases. First Δ I extracted them into separate methods:

module StringExtensions
  def phrases(number)
    words = split
    all_strings = words
    all_strings += double_words(words)
    all_strings += triple_words(words)
    all_strings[0, number].map {|s| "'#{s}'"}.join(', ')
  end
 
  private
 
  def double_words(strings)
    all_strings = []
    num_words = strings.size
 
    # Extracting double-word phrases
    (0...num_words - 1).each do |i|
      all_strings << "#{strings[i]} #{strings[i + 1]}"
    end
    return all_strings
  end
 
  def triple_words(strings)
    all_strings = []
    num_words = strings.size
 
    (0...num_words - 2).each do |i|
      all_strings << "#{strings[i]} #{strings[i + 1]} #{strings[i + 2]}"
    end
    return all_strings
  end
end

I decided that the num_words variable in the two new methods wasn't really necessary (it was only used once, and I think strings.size expresses intent perfectly clearly), so I inlined it Δ (and the same in triple_words):

def double_words(strings)
  all_strings = []

  # Extracting double-word phrases
  (0...strings.size - 1).each do |i|
    all_strings << "#{strings[i]} #{strings[i + 1]}"
  end
  return all_strings
end

Most of the code in double_words and triple_words was obviously very similar, so I created Δ a general extract_phrases method, and called it from both. The new method uses the start position and length version of Array#[] to extract the appropriate number of words, then Array#join to string them together separated by spaces:

def double_words(strings)
  extract_phrases(strings, 2)
end

def triple_words(strings)
  extract_phrases(strings, 3)
end

def extract_phrases(strings, number_of_words)
  result = []
  (0...strings.size - number_of_words + 1).each do |i|
    phrase = strings[i, number_of_words].join(' ')
    result << phrase
  end
  result
end

At this point double_words and triple_words have become just dumb wrappers around extract_phrases, so I removed them Δ and just called extract_phrases directly:

def phrases(number)
  words = split
  all_strings = words
  all_strings += extract_phrases(words, 2)
  all_strings += extract_phrases(words, 3)
  all_strings[0, number].map {|s| "'#{s}'"}.join(', ')
end

Rather than hardcoding the calls for two and three words, I changed it Δ to use a loop:

def phrases(number)
  words = split
  all_strings = words
  (2..words.size).each do |number_of_words|
    all_strings += extract_phrases(words, number_of_words)
  end
  all_strings[0, number].map {|s| "'#{s}'"}.join(', ')
end

I decided this was the point to put the optimisation back Δ and stop looking for phrases once we had enough:

def phrases(number)
  words = split
  all_strings = words
  (2..words.size).each do |number_of_words|
    break if all_strings.length >= number
    all_strings += extract_phrases(words, number_of_words)
  end
  all_strings[0, number].map {|s| "'#{s}'"}.join(', ')
end

I decided that number_of_words wasn't particularly clear, so changed it Δ to phrase_length, then Δ made the iterator in extract_phrases more ruby-like by using map:

def extract_phrases(strings, phrase_length)
  (0...strings.size - phrase_length + 1).map do |i|
    strings[i, phrase_length].join(' ')
  end
end

I then noticed that I hadn't been consistent in changing strings to words, so fixed that Δ.

Lastly Δ, I decided that even though it was only one line, the code that formats the output deserved pulling out into a method.

Here's my final version of the code:

module StringExtensions
  def phrases(number)
    words = split
    all_strings = words
    (2..words.size).each do |phrase_length|
      break if all_strings.length >= number
      all_strings += extract_phrases(words, phrase_length)
    end
    format_output(all_strings[0, number])
  end
 
  private
 
  def extract_phrases(words, phrase_length)
    (0...words.size - phrase_length + 1).map do |i|
      words[i, phrase_length].join(' ')
    end
  end

  def format_output(phrases)
    phrases.map {|s| "'#{s}'"}.join(', ')
  end
end
Categories
rspec

A couple of rspec mocking gotchas

Just a couple of things that have caused a bit of head-scratching lately when writing RSpec specs using the built-in mocking framework.

Catching StandardError

Watch out if the code you’re testing catches StandardError (of course you’re not catching Exception, right?). Try this:

require 'rubygems'
require 'spec'
 
class Foo
  def self.foo
    Bar.bar
  rescue StandardError
    # do something here and don't re-raise
  end
end
 
class Bar
  def self.bar
  end
end
 
describe 'Calling a method that catches StandardError' do
  it 'calls Bar.bar' do
    Bar.should_receive :bar
    Foo.foo
  end
end

Nothing particularly exciting there. Let’s run it and check that it passes:

$ spec foo.rb 
.

Finished in 0.001862 seconds

1 example, 0 failures

However, what if we change the example to test the opposite behaviour?

describe 'Calling a method that catches StandardError' do
  it 'does NOT call Bar.bar' do
    Bar.should_not_receive :bar
    Foo.foo
  end
end
$ spec foo.rb 
.

Finished in 0.001865 seconds

1 example, 0 failures

Wait, surely they can’t both pass? Let’s take out the rescue and see what’s going on:

class Foo
  def self.foo
    Bar.bar
  end
end
$ spec foo.rb 
F

1)
Spec::Mocks::MockExpectationError in 'Calling a method that catches StandardError does NOT call Bar.bar'
 expected :bar with (no args) 0 times, but received it once
./foo.rb:6:in `foo'
./foo.rb:18:

Finished in 0.002276 seconds

1 example, 1 failure

That’s more like it.

Of course, what’s really happening here is that Spec::Mocks::MockExpectationError is a subclass of StandardError, so is being caught and silently discarded by our method under test.

If you’re doing TDD properly, this won’t result in a useless test (at least not immediately), but it might cause you to spend a while trying to figure out how to get a failing test before you add the call to Foo.foo (assuming the method with the rescue already existed). Generally you can solve the problem by making the code a bit more selective about which exception class(es) it catches, but I wonder whether RSpec exceptions are special cases which ought to directly extend Exception.

Checking receive counts on previously-stubbed methods

It’s quite common to stub a method on a collaborator in a before block, then check the details of the call to the method in a specific example. This doesn’t work quite as you would expect if for some reason you want to check that the method is only called a specific number of times:

require 'rubygems'
require 'spec'

class Foo
  def self.foo
    Bar.bar
    Bar.bar
  end
end

class Bar
  def self.bar
  end
end

describe 'Checking call counts for a stubbed method' do
  before do
    Bar.stub! :bar
  end

  it 'only calls a method once' do
    Bar.should_receive(:bar).once
    Foo.foo
  end
end
$ spec foo.rb 
.

Finished in 0.001867 seconds

1 example, 0 failures

I think what’s happening here is that the mock object would normally receive an unexpected call, causing the expected :bar with (any args) once, but received it twice error that you’d expect. Unfortunately the second call to the method is handled by the stub, so never triggers the error.

You can fix it, but it’s messy:

it 'only calls a method once' do
  Bar.send(:__mock_proxy).reset
  Bar.should_receive(:bar).once
  Foo.foo
end
$ spec foo.rb 
F

1)
Spec::Mocks::MockExpectationError in 'Checking call counts for a stubbed method only calls a method once'
 expected :bar with (any args) once, but received it twice
./foo.rb:23:

Finished in 0.002542 seconds

1 example, 1 failure

Does anyone know a better way?

The full example code is in this gist.

Categories
rspec

Helpful message from rspec

Just came across an interesting error message from rspec. I had a spec that looked like this:

it "should not mass-assign 'confirmed'" do
  Blog.new(:confirmed => true).confirmed.should_not be_true
end

Obviously it failed, as I hadn’t written the code yet, but there was more in the error message than I expected:

..........F

1)
RuntimeError in 'Blog should not mass-assign 'confirmed''
'should_not be  true' not only FAILED,
it is a bit confusing.
It might be more clearly expressed in the positive?
.../spec/models/blog_spec.rb:20:

Finished in 0.06192 seconds

11 examples, 1 failure

In fact, rewriting this as should be_false wouldn’t work, as the expected value is nil. I took the hint though, and rewrote it as should be_nil.

Categories
Agile Software

Design by Example or Test-Driven Development?

Brad Wilson argues the case for renaming TDD to ‘Design by Example’ (or DbE), to emphasise the fact that TDD (and BDD, which is really just TDD done well) are about design rather than testing.

I agree with the intention, but I’m not sure I agree. I’m happy to lose the ‘test’ &ndash it’s losing the ‘development’ that bothers me. I’ve encountered far too many ‘designers’ who haven’t written a line of code in years, and I worry that they’d read ‘design by example’ as simply starting their design document with a few test cases. I’ve already seen TDD redefined as test-driven design and interpreted that way.

Of course for people who already grok TDD this isn’t an issue, but it’s not them that a name change is aimed at. Test-driven development might not be the perfect name, but at least there are already plenty of people who do understand what it really means to correct any misapprehension. The term BDD (which I use myself from time to time) has already started sowing confusion, without adding yet another.

On the other hand, anything that gets people discussing how to get the most from the practice has to be a good thing.