Automatic preloading in Rails: the dream that came true.
Recently, I published an article titled “Stop using manual preloading in your Rails application; use this instead.” Many people were interested, but I failed to explain the ultimate solution. Considering that I still see posts about ActiveRecord includes, I want to elaborate on the idea in depth.
If you haven’t read the original article, please do so. However, this is not required to understand the subject.
Without further ado, let’s get into the topic.
Skip the next section if you are well familiar with N+1 issues.
I’m sure you’ve all heard of N+1 issues. In case you haven’t, putting it simply, I would say:
“The code executes many similar inefficient database queries/HTTP requests/complex calculations.”
Most often, in the Ruby on Rails world, it’s all about database queries, so we will focus on that part. Ruby on Rails offers a built-in solution for dealing with N+1 issues when fetching associations.
Now, let’s look at the example.
class User < ActiveRecord::Base
  has_many :accounts
end
class Account < ActiveRecord::Base
  has_many :contacts
end
class Contact < ActiveRecord::Base
end
Somewhere in the code, you want to show your users’ accounts’ contacts.
users = User.all
users.each do |user|
  user.accounts.each do |account|
    p account.contacts
  end
end
If you didn’t spot the issue yet, please stop here for a moment, look carefully at the code above, and try to find it.
The problem is that for every user, there will be a database query to fetch accounts; moreover, for every account, there will be another query to fetch its contacts.
As you can see, this chain of calls can snowball, leading to hundreds or even thousands of queries.
Rails’ solution is to use includes at the very beginning of the chain. The fixed code would look like this:
users = User.includes(accounts: :contacts).all
# ...
This code will preload all needed data in only three queries: one for all users, one for their accounts, and one for the contacts.
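To make the difference tangible, here is a minimal plain-Ruby sketch that counts queries for both traversals. It doesn’t touch ActiveRecord at all: FakeDb, its data, and the two helper methods are hypothetical names I made up for illustration, and the "queries" are just method calls on an in-memory store.

```ruby
# Toy in-memory "database" that counts queries. All names are hypothetical.
class FakeDb
  attr_reader :query_count

  USERS = [{ id: 1 }, { id: 2 }, { id: 3 }].freeze
  ACCOUNTS = [
    { id: 1, user_id: 1 }, { id: 2, user_id: 1 },
    { id: 3, user_id: 2 }, { id: 4, user_id: 2 },
    { id: 5, user_id: 3 }, { id: 6, user_id: 3 }
  ].freeze
  CONTACTS = ACCOUNTS.map { |a| { id: a[:id] * 10, account_id: a[:id] } }.freeze

  def initialize
    @query_count = 0
  end

  def users
    @query_count += 1
    USERS
  end

  # Both lookups accept one id or many, like WHERE ... IN (...).
  def accounts_for(user_ids)
    @query_count += 1
    ACCOUNTS.select { |a| Array(user_ids).include?(a[:user_id]) }
  end

  def contacts_for(account_ids)
    @query_count += 1
    CONTACTS.select { |c| Array(account_ids).include?(c[:account_id]) }
  end
end

# Naive traversal: one query per user and one per account.
def naive_query_count(db)
  db.users.each do |user|
    db.accounts_for(user[:id]).each do |account|
      db.contacts_for(account[:id])
    end
  end
  db.query_count
end

# Batched traversal: one query per level, which is the idea behind includes.
def batched_query_count(db)
  users = db.users
  accounts = db.accounts_for(users.map { |u| u[:id] })
  db.contacts_for(accounts.map { |a| a[:id] })
  db.query_count
end

puts naive_query_count(FakeDb.new)   # 1 + 3 + 6 = 10
puts batched_query_count(FakeDb.new) # 3
```

With only three users and two accounts each, the naive version already needs ten lookups, while the batched one stays at three no matter how many rows there are.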
If you have ever worked with includes before, please pause here and remember what you didn’t like about it.
To keep the post concise and focused on the elegant solution we will look at soon, I will briefly share mine:
- You have to manually keep the starting point consistent with the rest of the execution path. If you no longer need contacts down the road, you had better update includes, too; otherwise, you load extra data for no reason. If you need more data, let’s say referrals, you must update includes, or you get another N+1 issue. The effort required for consistency depends on how widely the execution trace is spread across the project, but it isn’t trivial.
- includes fetches all the data immediately. Sometimes, we need to show information only under certain conditions. Following our case, what if contacts should only be displayed for primary accounts? It’s possible to do partial includes by directly calling ActiveRecord::Associations::Preloader, but this way is neither convenient nor recommended by the guidelines.
Now that we have recalled the N+1 problem and its most popular Rails-way solution, let’s look at the proposed standard.
The Ruby on Rails framework is all about convention and fast delivery. The goal is to focus closely on the business rather than technical aspects. With this in mind, let’s look at the fix provided by includes.
When I look at it, I wonder: if the only thing needed to avoid the N+1 problem is to type includes with specified associations, why can’t Rails do that for me? Is it possible?
Happily, it is! It’s already production-proven and awaits you at zero integration cost.
- Add gem "ar_lazy_preload" to your Gemfile.
- Enable auto-preloading globally: ArLazyPreload.config.auto_preload = true
- Remove redundant includes.
If you don’t want to enable auto-preloading globally, chain your loading with .preload_associations_lazily instead. For example, User.preload_associations_lazily.all. Any subsequent association loading on every user (and on the records loaded from them) won’t create an N+1 problem.
Let’s investigate what it does to avoid the N+1 problem. As a bonus, I will also show you when it doesn’t work.
There are two main parts that make auto-preloading work:
- ArLazyPreload Context (it has two implementations: one for globally enabled preloading and another for preloading through .preload_associations_lazily)
- ArLazyPreload Relation patch
Context is an object that stores metadata alongside every ActiveRecord instance. It’s available through the lazy_preload_context instance method. The most important metadata is the list of sibling records.
Sibling records are the records of the same class fetched in the same query and conceptually treated as similar records.
users = User.preload_associations_lazily.first(5)
# Records are siblings/Share single Context
users.map(&:lazy_preload_context).uniq.count == 1
other_users = User.preload_associations_lazily.first(5)
# Records are siblings/Share single Context
other_users.map(&:lazy_preload_context).uniq.count == 1
# But "users" aren't siblings with "other_users"
users.first.lazy_preload_context != other_users.first.lazy_preload_context
In the code above, instances in users and other_users are sibling records within their own groups, but the two groups are not siblings of each other.
Context also keeps track of a tree of already preloaded associations, but this is unnecessary to understand the main point.
Relation patches the ActiveRecord::Relation class to look into the Context when deciding how to load an association. If a context exists, it preloads the association in one query for all records in that context. The new records are distributed to the instances that reference them and cached there, as if they had been manually preloaded with includes. After that, it assigns a context to the loaded records, keeping them as siblings so they won’t produce the N+1 problem either.
Does that mean we don’t need to think about the N+1 at all?
Well, yes and no.
The answer depends on how well you know how includes works. For example, please look at the code below and think about whether it has the N+1 issue.
users = User.includes(:accounts).all
users.each do |user|
  user.accounts.where(primary: true).each do |account|
    p account
  end
end
The answer is yes.
I leave it to you to work out why, because this is very important and will help you avoid many pitfalls.
There are two main quick solutions:
The first way is to replace .where with in-memory Ruby filtering via select { |account| account.primary == true }. However, I don’t recommend this approach, as it is inefficient.
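As a rough illustration of the first way, here is a plain-Ruby sketch. LoadedAccount and primary_accounts are hypothetical stand-ins for accounts that have already been preloaded into memory; no ActiveRecord is involved.

```ruby
# Hypothetical stand-in for an Account row that is already in memory.
LoadedAccount = Struct.new(:id, :primary, keyword_init: true)

def primary_accounts(loaded_accounts)
  # Filters the already-loaded array; no further query is issued,
  # but every account had to be fetched first.
  loaded_accounts.select(&:primary)
end

accounts = [
  LoadedAccount.new(id: 1, primary: true),
  LoadedAccount.new(id: 2, primary: false),
  LoadedAccount.new(id: 3, primary: false)
]

p primary_accounts(accounts).map(&:id) # [1]
```

The filtering itself is cheap, but two of the three accounts were loaded only to be thrown away, which is where the inefficiency comes from.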
The second way is to create a new scoped association in the User model and use that instead.
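class User < ActiveRecord::Base
  # class_name is needed because the association name differs from the model name
  has_many :primary_accounts, -> { where(primary: true) }, class_name: "Account"
end
users = User.includes(:primary_accounts)
users.each do |user|
  user.primary_accounts.each do |account|
    p account
  end
end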
This approach works well. However, it is debatable from a software design standpoint; therefore, it may or may not be accepted in your project.
The same pattern applies when using auto-preloading. Chaining query methods such as where onto an association bypasses the preloaded records, so you need to declare a new scoped association to avoid the N+1 issue. It’s simple to do, though.
That’s about it for auto-preloading in ActiveRecord (Rails’ default ORM). I hope this time I did a better job of explaining how it works and why you should start using it.
And don’t forget to share what you think about the topic. Do you consider N+1 issues to be important in your projects? I would be happy to hear from you.