Ignoring English Articles With Regular Expressions
During some freelance work, we had a special requirement that I imagine a lot of people run into. A list of items returned from the database needed to be sorted by name while ignoring English articles: “a,” “an,” and “the.” Though I hate them with a passion, I knew regular expressions were the solution.
Imagine the following method in the model that needs sorting:
def normalized_name
  self.name.gsub(/^(the|a|an)\s/i, '')
end
The above regular expression returns name without its leading article. Since I hate regular expressions, and perhaps you do, too, here’s a rundown of each piece of the pattern, apart from the ‘/’ delimiters.
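^ - Matches the beginning of a line
(the|a|an) - Matches the strings "the", "a", or "an"
\s - Matches a single whitespace character, so only whole words such as "the " are stripped, not the start of "theory"
i - Case insensitive. Ignores case in the string and regular expression.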
And there you are. Regardless of capitalization, the normalized_name method will return the original name with English articles stripped off the front.
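To see the pattern at work outside of a model, here is a minimal sketch that applies the same regular expression to plain strings. The constant ARTICLE_PATTERN and the helper strip_leading_article are names made up for this illustration; they are not part of the original model.

```ruby
# Hypothetical standalone helper: same pattern as normalized_name,
# applied to a plain string instead of a model attribute.
ARTICLE_PATTERN = /^(the|a|an)\s/i

def strip_leading_article(name)
  name.gsub(ARTICLE_PATTERN, '')
end

puts strip_leading_article("The Beatles")           # Beatles
puts strip_leading_article("an Apple")              # Apple
puts strip_leading_article("A Tale of Two Cities")  # Tale of Two Cities

# No whitespace follows the would-be article, so these are left alone.
puts strip_leading_article("Theory of Everything")  # Theory of Everything
puts strip_leading_article("Anthem")                # Anthem
```

One thing worth knowing: in Ruby, ^ matches at the start of every line, not only at the start of the string. For single-line names that makes no difference, but if a name could contain a newline, \A anchors to the start of the string instead.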
To do the actual sorting in Ruby on Rails, try the following:
Item.find(:all).sort_by { |item| item.normalized_name }
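If you want to try the sort without a database, the sketch below uses a Struct as a stand-in for the Rails model. The Struct and the sample names are assumptions for illustration only; in a real app Item would be an ActiveRecord model, and in newer Rails versions Item.all takes the place of Item.find(:all).

```ruby
# Hypothetical stand-in for the Rails model; only a name attribute is needed.
Item = Struct.new(:name) do
  def normalized_name
    name.gsub(/^(the|a|an)\s/i, '')
  end
end

items = [
  Item.new("The Zombies"),
  Item.new("an Albatross"),
  Item.new("Madonna"),
  Item.new("A Flock of Seagulls")
]

# Sort on the article-free name, but keep the original names for display.
sorted_names = items.sort_by { |item| item.normalized_name }.map(&:name)
puts sorted_names
# an Albatross
# A Flock of Seagulls
# Madonna
# The Zombies
```

Ruby compares strings case-sensitively, so if your names mix upper and lower case, sorting on item.normalized_name.downcase may give a more natural order.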