I am BARRY HESS > Blog

Ignoring English Articles With Regular Expressions

During some freelance work, we had a special requirement that I imagine a lot of people run into. A list of items returned from the database needed to be sorted by name while ignoring English articles: “a,” “an,” and “the.” Though I hate them with a passion, I knew regular expressions were the solution.

Imagine the following method in the model that needs sorting:

def normalized_name
  # Remove a leading "the", "a", or "an" (any case) and the whitespace after it
  self.name.gsub(/^(the|a|an)\s/i, '')
end

The above regular expression results in a return value of name, sans any leading articles. Since I hate regular expressions, and perhaps you do, too, here’s a rundown of each piece of the pattern besides the ‘/’ characters.

^          - Matches the beginning of a line
(the|a|an) - Matches the strings "the", "a", or "an"
\s         - Matches one whitespace character, so a word like "Theory" is not mistaken for "the"
i          - Case insensitive.  Ignores case in the string and regular expression.

And there you are. Regardless of capitalization, the normalized_name method will return the original name with English articles stripped off the front.
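To see the pattern at work outside of a model, here is a minimal sketch. The helper name strip_leading_article and the constant ARTICLE_PREFIX are made up for illustration; the helper takes a plain string instead of reading self.name, and the sample titles are invented.

```ruby
# Hypothetical stand-in for normalized_name that works on a plain string.
# The regular expression is the same one used in the model method.
ARTICLE_PREFIX = /^(the|a|an)\s/i

def strip_leading_article(name)
  name.gsub(ARTICLE_PREFIX, '')
end

# Sample titles (invented for this example)
puts strip_leading_article("The Matrix")           # article and space removed
puts strip_leading_article("an Education")         # lowercase article also removed
puts strip_leading_article("A Beautiful Mind")     # single-letter article removed
puts strip_leading_article("Theory of Everything") # left alone: "The" is not followed by whitespace
```

One thing to keep in mind: in Ruby, ^ anchors at the start of every line, not only the start of the string, so gsub would also strip an article that opens a second line of a multi-line name. For single-line names this makes no difference. If multi-line names are possible, anchoring with \A and calling sub instead of gsub is one way to restrict the match to the very start of the string.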

To do the actual sorting in Ruby on Rails, try the following:

Item.find(:all).sort_by{|item| item.normalized_name}
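
If you want to try the sort without a database, here is a rough sketch that swaps the Rails model for a plain Struct. The Struct-based Item and the sample names are assumptions for illustration only; in the real application Item is an ActiveRecord model.

```ruby
# Hypothetical stand-in for the Rails model: a Struct with only a name.
Item = Struct.new(:name) do
  def normalized_name
    name.gsub(/^(the|a|an)\s/i, '')
  end
end

# Sample data (invented for this example)
items = ["The Shining", "A Clockwork Orange", "Barry Lyndon", "An Odyssey"].map do |title|
  Item.new(title)
end

# Same sort_by call as above, applied to an in-memory array
sorted_names = items.sort_by { |item| item.normalized_name }.map(&:name)
puts sorted_names
```

Note that Ruby compares strings case-sensitively, so a name that starts with a lowercase letter after its article is removed will sort after all the capitalized ones. Sorting on item.normalized_name.downcase is one way around that, if it matters for your data.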