Nimble Industries

The Cost of Ignoring the Rails Way

T

One of my consulting clients recently asked me to investigate a bug causing an unhandled exception in their production Rails app. The Bugsnag issue looked like countless others any Rubyist has no doubt reviewed before: undefined method `invitation_accepted_at' for nil:NilClass.

This type of error indicates the code was expecting an object to have been returned when it wasn’t. A method was being called on a nil object, in this case a method called invitation_accepted_at (which comes from DeviseInvitable, a gem used to invite users).

I immediately pulled up the offending code and found something similar to the following inside an ActiveRecord model. The model names have been slightly changed for illustration purposes, but the structure is unchanged:

class Company < ApplicationRecord
  def invited_users
    users = []
    invitations.each do |invitation|
      if invitation.person.user.invitation_accepted_at == nil
        users.append(invitation.person.user)
      end
    end
    return users
  end
end

Knowing that we were iterating over associated ActiveRecord objects, I wondered why this was not being done with an association and a SQL join rather than all in memory in Ruby.  Astute readers will see several issues with this method which I will detail below.

I immediately annotated the code using :Gblame in Vim (provided by Fugitive). Then I pressed o on the relevant lines to open up the whole commit. There I found a reference to a Jira ticket that I could mine for more details. I was able to see the original code that this method replaced. It turns out, it originally was an ActiveRecord association and looked something like this:

class Company < ApplicationRecord
  has_many :invited_users, -> { where(invitation_accepted_at: nil) }, through: :user_invitations, source: :user
end

That’s better! That’s what I would expect to see. But why was this changed to in-memory solution? We lost a single SQL query and replaced it with a mess of ugly Ruby. In the process, two issues were introduced:

Bug 1: Assumed Data Integrity

The method assumes that every Person will have a User and when they don’t, the nil exception above is raised. One way to tackle this would be to add yet another conditional inside that each loop. Or to use Ruby’s Safe Navigation Operator, the &. syntax. There are also plenty of reasons to try avoiding using that syntax, as it can be a code smell. The mistake here, though, was the original assumption.

Bug 2: N+1 (Death by a Thousand Queries )

There was another issue introduced by the removal of the ActiveRecord association: an N+1 problem. This issue was causing noticeable performance problems with the page. In fact, it’s worse than an N+1, it’s an (2 * N) + 1 because two joined models are called, the Person and User model.

An N+1 query is when a single query is issued to get a list of things out of the database (the +1) and then another query is issued for every single one of those things (the N). This is a pretty common problem that can quickly affect performance of large (and not so large) lists of data and is something every Rails developer needs to recognize and avoid.

The Rails Way to the Rescue

Fortunately, the fix for this issue is pretty simple. During my git blame and Jira reconnaissance, I discovered that the original code was introduced to change the users that this invited_users method returned. In this data model, an Invitation has two users: a user directly associated with the invitation, and a user associated with the Person object that is, in turn, associated with the Invitation. Here’s a little data model diagram showing, in yellow, the relationships we have to add:

All we need to do here is add another User relationship to the Invitation model. But this one is through the Person model.

class Invitation < ApplicationRecord
  has_one :invited_user, through: :person, source: :user
end

Then, we can add back our invited_users relationship to the Company model but change the source is our new :invited_user relationship on the Invitation model.

class Company < ApplicationRecord
  has_many :invited_users, -> { where(invitation_accepted_at: nil) }, through: :invitations, source: :invited_user
end

Just like that, we’ve fixed the bug. That’s because these relationships will use left outer join which will ignore the broken relationship that causes our bug. A the same time, it will fix our N+1 issue and significantly improve the speed of the page. This particular company had 900 invitations meaning almost 2,000 queries to load a single page. Our fix produces one.

What Did We Learn?

Yes, there are many ways to do things in Rails, and in Ruby in general. Not all ways are equal. Ruby idioms are idioms for a reason, and often many varied reasons. It might be for readability, less chance of introducing a bug, or performance concerns. Sometimes Ruby can be written in a more testable way or improved to make code more approachable.

A gem like Bullet could have caught this automatically if run in the test or development environment. Bullet raises exceptions or logs messages when it detects N+1 queries introduced.

Better code review is needed. On this team, it might be prudent for one the more experienced Ruby developers to review every PR. Having experienced eyes on every diff can help reduce the change of code like this making it intro production.

Nimble Industries