A Bundler problem

Last week, we had a tricky Bundler loading issue at work, so I took a deep dive into the Bundler and RubyGems source code and learned a few things.

The problem

We have a parent process that spawns multiple Sidekiq processes using bundle exec sidekiq. We now want to set up the Bundler environment in the parent process so that it can load some dependencies. It seems straightforward, but we ran into a very cryptic error message when running the child processes:

Gem::LoadError: ed25519 is not part of the bundle. Add it to your Gemfile.

This was very confusing because ed25519 was definitely part of our Gemfile, although it was in a non-default group. And this was working fine before we loaded Bundler in the parent process. How could loading Bundler in one Ruby process affect another Ruby process?

The answer turned out to be simple: Bundler alters environment variables when it is loaded, and child processes inherit the parent's environment. This is documented in https://bundler.io/man/bundle-exec.1.html#Shelling-out. Wrapping the system call in Bundler.with_original_env fixed the problem.
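As a rough illustration of that fix, here is a minimal sketch of a parent process that launches its children inside Bundler.with_original_env. The helper name spawn_workers and its parameters are hypothetical, and the use of Process.spawn is an assumption, since the post does not show how our parent process actually starts its children.

```ruby
require "bundler"

# Hypothetical helper: spawns `count` child processes and returns their pids.
# The default command mirrors the one mentioned in this post.
def spawn_workers(count, command: %w[bundle exec sidekiq])
  # Inside the block, ENV is restored to what it was before Bundler was
  # loaded in this process, so the children do not inherit the variables
  # that Bundler added or changed.
  Bundler.with_original_env do
    Array.new(count) { Process.spawn(*command) }
  end
end

# Example usage (not executed here):
#   pids = spawn_workers(4)
#   pids.each { |pid| Process.wait(pid) }
```

The environment is only swapped for the duration of the block, so the parent process keeps its own Bundler setup afterwards.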

It took many hours to figure that out, but in the process I also learned more about how Bundler works and how we ended up with that error message.

Activating gems

When we looked at the stack trace for the error, it pointed us to a line in the net-ssh gem that calls gem 'ed25519', '~> 1.2'. That seemed odd to me because I had never seen this gem method called outside of a Gemfile.

It turns out that RubyGems actually defines Kernel#gem. As the docs say, this activates a specific version of the gem so that it can be required. In other words, it finds an installed gem that matches the version requirement and adds it to the load path.
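To make the version matching and the failure mode concrete, here is a small sketch using plain RubyGems. The helper names satisfies? and try_activate are hypothetical; Gem::Requirement, Gem::Version, Kernel#gem and Gem::LoadError are part of RubyGems.

```ruby
require "rubygems"

# Hypothetical helper: does `version` satisfy a requirement such as "~> 1.2"?
def satisfies?(requirement, version)
  Gem::Requirement.new(requirement).satisfied_by?(Gem::Version.new(version))
end

# Hypothetical helper: tries to activate a gem with Kernel#gem and reports
# whether that worked. Note that a successful call really does activate the
# gem as a side effect.
def try_activate(name, *requirements)
  gem(name, *requirements)
  true
rescue Gem::LoadError
  # Raised when no installed gem matches or, under Bundler, when the gem
  # is not part of the bundle.
  false
end

# "~> 1.2" means ">= 1.2 and < 2.0":
#   satisfies?("~> 1.2", "1.3.0")
#   satisfies?("~> 1.2", "2.0.0")
```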

With Bundler, this is overridden so that it also checks if the gem is part of the bundle. This explains why we got the error message. But why wasn’t it part of the bundle when it’s there in the Gemfile?

Bundler.setup and Bundler.require

With a Gemfile defined, the declared set of gems can be activated using Bundler.setup. This is usually done by requiring bundler/setup or by running bundle exec. This allows the defined gems to be required by your application. This is meant to be called only once and subsequent calls are no-ops.

Bundler.require is a shortcut that allows you to require gems from the specified group names in one call. If Bundler.setup hasn’t been called yet, it calls setup, passing in the list of group names. In that case, only the specified groups are activated, and activating or requiring other gems in the Gemfile will fail.

Our error originated from our Bundler.require call, which suggested that Bundler.setup had not been called and that we were only activating the gems in the specified groups. But we required bundler/setup before calling Bundler.require, and we even ran the process with bundle exec.

After some more debugging, I found out that Bundler.reset_paths! was being called because ENV['BUNDLE_GEMFILE'] was set. This cleared the setup so that by the time we called Bundler.require, it was setting it up again with only the specified groups.
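The sequence of events is easier to follow with a toy model. The sketch below is not Bundler's implementation; ToyBundler, its methods, the simulate helper, and the :deploy group name are all hypothetical and only mimic the behavior described above: a full setup happens once, a reset clears it, and a require with group names sets up only those groups.

```ruby
# Toy model only: this is NOT how Bundler is implemented.
class ToyBundler
  class NotInBundle < StandardError; end

  # gemfile maps gem name => group, e.g. { "ed25519" => :deploy }
  def initialize(gemfile)
    @gemfile = gemfile
    reset!
  end

  # With no groups, every gem becomes activatable and later calls are no-ops.
  # With groups, only gems in those groups become activatable.
  def setup(*groups)
    return self if @fully_set_up

    if groups.empty?
      @activatable = @gemfile.keys
      @fully_set_up = true
    else
      @activatable = @gemfile.select { |_name, group| groups.include?(group) }.keys
    end
    self
  end

  # Stand-in for Bundler.require: falls back to a group-limited setup.
  def require_groups(*groups)
    setup(*groups)
  end

  # Stand-in for the reset described above.
  def reset!
    @fully_set_up = false
    @activatable = []
  end

  # Stand-in for the overridden Kernel#gem.
  def gem(name)
    return true if @activatable.include?(name)

    raise NotInBundle, "#{name} is not part of the bundle. Add it to your Gemfile."
  end
end

# Replays the steps: full setup, optional reset, group-limited require,
# then the activation that net-ssh attempts.
def simulate(reset_before_require:)
  bundler = ToyBundler.new({ "net-ssh" => :default, "ed25519" => :deploy })
  bundler.setup
  bundler.reset! if reset_before_require
  bundler.require_groups(:default)
  bundler.gem("ed25519")
  :activated
rescue ToyBundler::NotInBundle => e
  e.message
end
```

In this model, simulate(reset_before_require: false) activates ed25519 because the full setup is still in place, while simulate(reset_before_require: true) ends with the same "not part of the bundle" message we saw.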

The reset was happening within the bundle exec wrapper. The require 'bundler/setup' in our application should then have set everything up again before we called Bundler.require. That didn’t happen because bundle exec had already required the file, and in Ruby, requiring the same file a second time is a no-op. Switching the require to an explicit call to Bundler.setup also fixed the problem.
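The "second require is a no-op" behavior is easy to check in isolation. This sketch writes a throwaway file, requires it twice, and counts how often its body ran. The helper name require_twice, the file name fake_setup.rb, and the global $fake_setup_runs are hypothetical and exist only for this demonstration.

```ruby
require "tmpdir"

# Hypothetical helper: requires the same temporary file twice and returns
# [result of first require, result of second require, times the file ran].
def require_twice
  Dir.mktmpdir do |dir|
    path = File.join(dir, "fake_setup.rb")
    # The file bumps a counter each time its body is actually executed.
    File.write(path, "$fake_setup_runs = $fake_setup_runs.to_i + 1\n")

    first = require(path)   # loads and runs the file
    second = require(path)  # already loaded, so the body is skipped
    [first, second, $fake_setup_runs]
  end
end
```

An explicit method call such as Bundler.setup does not have this limitation, since a method runs every time it is called.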