Stop Loading Large Ecto Fields by Default

Hey folks! Let's warm up with a joke:

It is forbidden to laugh out loud in Hawaii.

You have to keep it to a low ha.

A low ha. Aloha. Got it? Good! Let's begin.

The Problem

Let's say you want to store a map in an Ecto.Schema. You add a :map field which Ecto converts into an optimized jsonb column in Postgres. You trust the awesomeness of Postgres and start dumping data into it. Everything works fine.

Until your PM pings you.

Hey. Why is our website so slow now?

Weird.

Everything works fine locally, but your production website now takes seconds to load when it worked just fine yesterday. You scratch your head. Is it maybe the new :map field you added?

Spoiler: It is! 🎉

You realize that only pages that list many records are affected and pages that show a single record are not. Then it hits you: jsonb fields can hold up to ~268MB of data and json fields even up to 1GB! Whenever you list your records, Ecto fetches all this data from the database. Most times, you don't even need it!

So, how can you tell Ecto to not load it by default?

The Solution

Enter, the load_in_query option.

By default, Ecto loads all schema fields whenever it returns a record from a query. So, if you list e.g. the 20 latest records, Ecto will load all their fields - including very large jsonb fields - when you run Repo.all/1.

With load_in_query, you can tell Ecto to not load the field by default, but only if you select it explicitly. Here's how:

defmodule MyApp.MySchema do
  use Ecto.Schema

  schema "my_schemas" do
    field :very_large_map, :map, load_in_query: false

    # load_in_query can be set on any `field`. For example:
    field :very_large_list, {:array, :string}, load_in_query: false
    field :huge_text, :string, load_in_query: false
  end
end

Now, when you run Repo.one/1 or Repo.all/1 on this schema, the very_large_map field will be nil. Since Ecto doesn't have to load potentially gigabytes of data, your website should list records much faster again.

But what if you need the data in very_large_map? Easy, just select the field explicitly, like so:

def list_records(with_very_large_map \\ false) do
  MySchema
  |> maybe_load_map(with_very_large_map)
  |> Repo.all()
end

defp maybe_load_map(query, false), do: query

defp maybe_load_map(query, true) do
  select(query, [schema], %{schema | very_large_map: schema.very_large_map})
end

Now, if you set with_very_large_map: true, Ecto will load the field again.

Conclusion

And that's it! I hope you enjoyed this article! If you want to support me, you can buy my firewall for Phoenix Phx2Ban or my book or video courses (one and two). Follow me on Bluesky or subscribe to my newsletter below if you want to get notified when I publish the next blog post. Until next time! Cheerio 👋