Binary Pattern Matching in Elixir
Welcome back! First, a joke:
Don’t trust atoms
They make up everything
Now, onto today’s topic: Binary Pattern Matching in Elixir
The Gist
Pattern matching in Elixir is a superpower you probably know about, but chances are that you haven’t encountered its little brother, binary pattern matching, yet. Let’s fix that.
In essence, Elixir makes it incredibly easy to deconstruct binaries into their more meaningful parts.
To give you an example: Imagine you’re using Nerves to read temperature and humidity from a sensor. The sensor spec shows how to interpret the bits:
| bit7 | bit6 | bit5 | bit4 | bit3 | bit2 | bit1 | bit0 |
| ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- |
| temp | temp | temp | temp | 0 | humidity | humidity | humidity |
Now, before the Nerves-folks crucify me for creating such a nonsensical memory map, let me clarify that I completely made up this table to exactly fit my story-telling needs. So, hush Frank, Gus, and Lars!
With the spec above in hand, you set out to decode the 1s and 0s you receive from the sensor. The pattern match would look like this:
iex> <<temperature::4, _::1, humidity::3>> = <<0b11110111>>
<<247>>
iex> temperature
15
iex> humidity
7
And that’s how easy it is to decode binaries using pattern matching!
Look at how similar the spec and the pattern match look! That’s one of the big advantages of using Elixir for these problems. It’s super readable.
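In a real project you would probably put the match into a function head instead of typing it into IEx. Here is a minimal sketch, assuming the made-up bit layout from the table above. The module name SensorDecoder and the :invalid_reading atom are hypothetical and not part of Nerves or any library:

```elixir
defmodule SensorDecoder do
  # Hypothetical module; the bit layout follows the made-up spec above:
  # 4 bits temperature, 1 unused bit, 3 bits humidity.
  def decode(<<temperature::4, _unused::1, humidity::3>>) do
    {:ok, %{temperature: temperature, humidity: humidity}}
  end

  # Anything that is not exactly one byte does not fit the spec.
  def decode(_other), do: {:error, :invalid_reading}
end

SensorDecoder.decode(<<0b11110111>>)
# => {:ok, %{humidity: 7, temperature: 15}}
SensorDecoder.decode(<<1, 2>>)
# => {:error, :invalid_reading}
```

The second clause returns an error tuple, so a malformed reading doesn't crash the caller with a MatchError.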
Let’s break down the pattern match into its individual components.
Bitstrings
First, the <<>> syntax defines a bitstring.
A bitstring is a contiguous sequence of bits in memory.
However, when you write <<1>>, Elixir doesn’t create a bitstring with only a single bit but it stores the number as one byte (8 bits) by default.
iex> <<0b00000001>> == <<1>>
true
iex> <<0b10000000>> == <<128>>
true
That means you have to watch out which values you store in the bitstring since values that are larger than a single byte will get truncated.
iex> <<1>> == <<257>>
true
The integer 257 in binary is 0b100000001, but those are 9 bits. Elixir bitstrings reserve only 8 bits per value by default. Thus, the left-most bit is ignored and only 0b00000001 is stored, which is equal to 1.
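Put differently, a default integer segment keeps only the lowest 8 bits of the value, which is the same as taking the remainder of a division by 256. A small sketch to check this yourself (the module and function names are made up for illustration):

```elixir
defmodule Truncation do
  # Hypothetical helper: returns what a default 8-bit integer segment
  # keeps of `value`.
  def stored_byte(value) when is_integer(value) do
    <<byte>> = <<value>>
    byte
  end
end

Truncation.stored_byte(255)  # => 255, still fits into one byte
Truncation.stored_byte(256)  # => 0
Truncation.stored_byte(257)  # => 1, same as rem(257, 256)
```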
To store larger values, you have to specify the bit count with value::size(n) or value::n for short:
iex> <<257::9>>
<<128, 1::size(1)>>
iex> <<value::9>> = <<257::9>>
<<128, 1::size(1)>>
iex> value
257
The ::n notation instructs Elixir to reserve n bits to store the value. This isn’t how Elixir displays the bitstring though, and that might be confusing.
Storing vs Displaying Bitstrings
Elixir fills the reserved bits from the right to store a value, but splits them into bytes from the left when displaying them.
iex> <<257::10>>
<<64, 1::size(2)>>
iex> <<257::11>>
<<32, 1::size(3)>>
iex> <<257::12>>
<<16, 1::size(4)>>
The first part (64, 32, 16) gets smaller as we reserve more bits because of how Elixir stores versus displays bits. When we reserve 10 bits for 257, Elixir first creates a bitstring of 10 zeros:
# Space added for clarification
00000000 00
Then, Elixir stores the value 257 by filling the bitstring from the right:
Before: 00000000 00
After: 01000000 01
But when it displays the bitstring, Elixir splits the bits into bytes from the left:
# With size: 10
01000000 01
| |
<< 64, 1::2 >>
# With size: 11
00100000 001
| |
<< 32, 1::3 >>
# With size: 12
00010000 0001
| |
<< 16, 1::4 >>
The difference between displaying and storing a bitstring might be confusing, but in practice the bitstring remains one continuous sequence of bits:
iex> <<0b0100000001::10>> == <<257::10>>
true
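If you want to see the real size of a bitstring regardless of how IEx prints it, the Kernel functions bit_size/1, byte_size/1, and is_binary/1 help. A short sketch (the BitstringInfo module is a hypothetical helper, not something Elixir ships):

```elixir
defmodule BitstringInfo do
  # Hypothetical helper that summarizes how a bitstring is laid out.
  def describe(bits) when is_bitstring(bits) do
    %{
      bit_size: bit_size(bits),
      # byte_size/1 rounds up to the next full byte.
      byte_size: byte_size(bits),
      binary?: is_binary(bits)
    }
  end
end

BitstringInfo.describe(<<257::10>>)
# => %{binary?: false, bit_size: 10, byte_size: 2}
BitstringInfo.describe(<<257::16>>)
# => %{binary?: true, bit_size: 16, byte_size: 2}
```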
Until now, we only looked at how Elixir stores integers in a bitstring, but it supports many more types, so let’s talk about them next.
Types
When constructing or decoding a bitstring, Elixir works with “segments”. For example, to deconstruct a bitstring into one integer, one float, and two bytes, Elixir matches the first bits against an integer, then a float, then two bytes. If any segment fails to match, it raises a MatchError.
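The integer, float, and two bytes example could look roughly like this. It is only a sketch: the packet layout and the SegmentDemo module are invented for illustration. Using case instead of = lets you handle a failed match without a MatchError:

```elixir
defmodule SegmentDemo do
  # Hypothetical layout: one integer (8 bits), one float (64 bits), two bytes.
  def parse(packet) do
    case packet do
      <<id::integer, reading::float, tag::binary-size(2)>> ->
        {:ok, {id, reading, tag}}

      _other ->
        # Reached when any segment fails to match, e.g. the packet is too short.
        :error
    end
  end
end

SegmentDemo.parse(<<7, 12.5::float, "ok">>)
# => {:ok, {7, 12.5, "ok"}}
SegmentDemo.parse(<<7, 1>>)
# => :error
```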
Segments can have one of 9 types:
- integer: default type if no type is specified
- float: must be 16, 32, or 64 (default) bits
- bitstring: can be nested within another bitstring!
- bits: alias for bitstring
- binary: size is counted in bytes (units of 8 bits)
- bytes: alias for binary
- utf8: default encoding of strings
- utf16
- utf32
Let’s see how to use these types in a binary pattern match.
Integer
If no type is specified, Elixir expects an integer segment. Integers are 8 bits by default but the size can be adjusted.
iex> <<1::integer, 2>> == <<1, 2>>
true
iex> <<value::integer-10>> = <<130::10>>
<<32, 2::size(2)>>
iex> value
130
iex> <<0b10000000::8>> == <<128>>
true
iex> <<0b0010000000::10>> == <<128::10>>
true
Floats
Floats are 64 bits by default, but can be smaller if specified:
iex> <<value::float>> = <<12.5>>
# 8 Integers * 8 bits = 64 bits
<<64, 41, 0, 0, 0, 0, 0, 0>>
# 0 10000000010 1001000...000
# | |           |
# | |           Significand (52 bits)
# | Exponent (11 bits with 1023 bias for float64)
# Sign (1 bit. 0 for positive, 1 for negative)
#
# significand = 1*2^-1 + 0*2^-2 + 0*2^-3 + 1*2^-4
#
# = (-1)^sign x (1 + significand) x 2^(exponent - bias)
# = (-1)^0 x (1 + 0.5625) x 2^(1026-1023) = 12.5
iex> value
12.5
You can specify the size of the float with the dash modifier and either the short-hand notation -n or the full notation size(n):
iex> <<value::float-32>> = <<12.5::float-size(32)>>
<<65, 72, 0, 0>>
iex> value
12.5
When working with custom-sized floats, you have to watch out for precision loss when the number you want to store can’t be represented exactly in the chosen float size:
# 65_504 is the largest number a float16 can store
iex> <<65_504::float-16>> == <<65_504::float-16>>
true
iex> <<65_504::float-16>> == <<65_505::float-16>>
true
In the second example, we try to store 65,505, which a float16 can’t represent exactly. Elixir doesn’t complain but rounds to the nearest representable value, which is the float16 maximum of 65,504. So, unless you work with billions of floats in your application, better stick to the default float64, which is far less likely to lose precision.
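You can observe the same kind of rounding with 32-bit floats by encoding a number and decoding it again. This is only a sketch with a made-up helper module; it uses float-32 because that size is available on every OTP version:

```elixir
defmodule FloatRoundTrip do
  # Hypothetical helper: encode `number` as a 32-bit float and decode it again.
  def float32(number) do
    <<decoded::float-32>> = <<number::float-32>>
    decoded
  end
end

FloatRoundTrip.float32(12.5) == 12.5  # => true, 12.5 is exactly representable
FloatRoundTrip.float32(0.1) == 0.1    # => false, 0.1 was rounded on the way
```

If the round trip doesn't give you back the number you put in, the chosen float size is too small for your data.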
Bitstrings
You can match nested bitstrings like this:
iex> <<1, value::bitstring>> = <<1, <<2, 3>> >>
<<1, 2, 3>>
iex> value
<<2, 3>>
Or you can use the alias bits:
iex> <<1, value::bits>> = <<1, <<2, 3>> >>
<<1, 2, 3>>
iex> value
<<2, 3>>
Binaries
Binaries are a special type of bitstring:
A binary is a bitstring where the number of bits is divisible by 8.
So, every binary is a bitstring, but not every bitstring is a binary. When matching a binary, make sure that the bit count of the underlying bitstring is divisible by 8 because you will receive a MatchError otherwise.
If you match a binary at the end of the bitstring, you don’t need to specify its size:
iex> <<1, 2, text::binary>> = <<1, 2, "foo">>
<<1, 2, 102, 111, 111>>
iex> text
"foo"
But to match a binary anywhere else, you must specify its expected size:
iex> <<text::binary-size(3), 1, 2>> = <<"foo", 1, 2>>
<<102, 111, 111, 1, 2>>
iex> text
"foo"
String literals inside <<>> are stored as their UTF8 bytes, so text matching is exact and case-sensitive:
iex> <<"foo">> == <<"Foo">>
false
Since binaries are just bytes, you can also use the alias bytes in the match:
iex> <<text::bytes-size(3), 1, 2>> = <<"foo", 1, 2>>
<<102, 111, 111, 1, 2>>
iex> text
"foo"
iex> <<1, 2, text::bytes>> = <<1, 2, "foo">>
<<1, 2, 102, 111, 111>>
Strings
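Strings are binaries (sequences of bytes) and UTF8 encoded by default. UTF8 is a variable-width encoding: ASCII characters like the ones below take one byte each, while other characters take up to four bytes.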
iex> <<c1::bytes-1, c2::bytes-1, c3::bytes-1>> = <<"bar">>
"bar"
iex> [c1, c2, c3]
["b", "a", "r"]
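You can change the encoding to UTF16 or UTF32. UTF16 uses 2 bytes for most characters (and 4 bytes for characters outside the Basic Multilingual Plane), while UTF32 always uses 4 bytes per character: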
# With UTF16 encoding
iex> <<char::2-bytes, rest::4-bytes>> = <<"foo"::utf16>>
<<0, 102, 0, 111, 0, 111>>
iex> <<0, "f">> == char
true
iex> <<0, "o", 0, "o">> == rest
true
# And with UTF32 encoding:
iex> <<char::4-bytes, rest::8-bytes>> = <<"bar"::utf32>>
<<0, 0, 0, 98, 0, 0, 0, 97, 0, 0, 0, 114>>
iex> <<0, 0, 0, "b">> == char
true
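Keep in mind that characters outside of ASCII take more than one byte in UTF8, so matching a fixed number of bytes can cut a character in half. The utf8 type matches one whole code point instead. A small sketch (Utf8Demo is a hypothetical module name):

```elixir
defmodule Utf8Demo do
  # Hypothetical helper: split off the first UTF-8 code point of a string.
  def first_codepoint(<<codepoint::utf8, rest::binary>>), do: {codepoint, rest}
end

# "é" written as an escape sequence to avoid editor encoding surprises.
e_acute = "\u00E9"

byte_size(e_acute)      # => 2, although it is a single character
String.length(e_acute)  # => 1

Utf8Demo.first_codepoint(e_acute <> "a")
# => {233, "a"}
```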
Bitstrings vs Binaries vs Strings
The hierarchy of strings, binaries, and bitstrings is as follows:
- All strings are binaries and all binaries are bitstrings, but not all bitstrings are binaries and not all binaries are strings.
- Binaries are bitstrings whose bit count is divisible by 8.
- Strings are binaries that contain valid UTF8, where one character takes between 1 and 4 bytes.
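The hierarchy can be checked at runtime with is_bitstring/1, is_binary/1, and String.valid?/1. The following is a rough sketch; the Classify module and its return atoms are made up for this post:

```elixir
defmodule Classify do
  # Hypothetical helper that names the most specific category of a value.
  def kind(value) when is_binary(value) do
    if String.valid?(value), do: :string, else: :binary
  end

  def kind(value) when is_bitstring(value), do: :bitstring
end

Classify.kind("foo")     # => :string
Classify.kind(<<255>>)   # => :binary (a full byte, but not valid UTF-8)
Classify.kind(<<1::1>>)  # => :bitstring (a single bit)
```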
Modifiers
Now, let’s talk about something fun. Elixir gives you superpowers for decoding numbers exactly how you need them through match modifiers.
Modifiers tell Elixir which extra steps to take when it converts the raw bits into a number. You can tell it to decode signed or unsigned numbers and specify the little or big endianness of multiple bytes.
For example, here’s how to decode a bitstring into a signed integer without having to do the conversion from unsigned to signed integer yourself:
iex> <<int::signed-integer>> = <<-100>>
<<156>>
iex> int
-100
iex> <<int::unsigned-integer>> = <<-100>>
<<156>>
iex> int
156
Elixir always displays the bitstring with its unsigned value (156), but in the signed match the int variable contains the correct signed value (-100).
⚠️ Small detour ⚠️
In case you’re interested, this is how you can calculate the signed integer value of a bitstring by hand:
- Take the bitstring: +100 => 0b01100100 and -100 => 0b10011100
- Check the left-most bit for the sign (positive/negative). 0 means positive, 1 means negative.
- If positive, simply convert to decimal: +100 => 0b01100100 => 100
- If negative, invert all bits, convert to decimal, add 1, and apply the negative sign: -100 => 0b10011100 => 0b01100011 => 99 + 1 => 100 => -100
- This is called the two’s complement method.
⚠️ Detour end ⚠️
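If you like to see the manual steps from the detour as code, here is a sketch for a single byte. TwosComplement is a hypothetical module; in practice the signed modifier does all of this for you:

```elixir
defmodule TwosComplement do
  # Hypothetical helper: interpret an unsigned byte (0..255) as a signed
  # integer by following the manual steps from the detour.
  def to_signed(byte) when byte in 0..255 do
    if Bitwise.band(byte, 0b10000000) == 0 do
      # Left-most bit is 0: the value is positive.
      byte
    else
      # Left-most bit is 1: invert all 8 bits, add 1, apply the negative sign.
      inverted = Bitwise.band(Bitwise.bnot(byte), 0xFF)
      -(inverted + 1)
    end
  end
end

TwosComplement.to_signed(0b01100100)  # => 100
TwosComplement.to_signed(0b10011100)  # => -100
```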
Endianness
In case you’re not familiar with Endianness, it describes the position of the most significant byte in a sequence of bytes.
But what does this actually mean? Consider this very realistic example: Imagine your partner is at the supermarket and wants to tell you the price of the grapefruits. For completely normal and not at all made-up reasons, you two can only communicate via Morse code. You receive the numbers: 4, 9, 9.
Now, you would assume intuitively that the price per grapefruit is 4.99, right?
But how can you be sure that your partner didn’t send the digits in reverse order, in which case the price would be 9.94?
Naturally, and because you are completely normal people, you discussed beforehand which endianness your Morse code communication would have. You agreed to use big endianness, which means that you send the most significant digit first. For the price of grapefruit, this means sending the 4, the most significant digit, first. That’s why you know for sure that your partner meant 4.99 and not 9.94.
You could also have used little endianness. Then your partner would send the least significant digit first. So, the sequence would have been 9, 9, 4. To get the price, you would have to reverse the order since our decimal system puts the most significant digit first, resulting in the correct price of 4.99.
Endianness is important when you convert multiple bytes into a single number. In that case, the bytes play the role of the digits. By default, Elixir assumes big endianness.
iex> <<big::big-integer-size(16)>> = <<1, 2>>
<<1, 2>>
iex> big
258
iex> <<little::little-integer-size(16)>> = <<1, 2>>
<<1, 2>>
iex> little
513
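To make the arithmetic behind both results visible, here is a sketch that combines the two bytes by hand. The Endian module is hypothetical and only handles exactly two bytes:

```elixir
defmodule Endian do
  # Hypothetical helpers: combine two bytes into one number by hand.

  # Big endian: the first byte is the most significant one.
  def big(<<first, second>>), do: first * 256 + second

  # Little endian: the first byte is the least significant one.
  def little(<<first, second>>), do: second * 256 + first
end

Endian.big(<<1, 2>>)     # => 258, same as the big-integer-size(16) match
Endian.little(<<1, 2>>)  # => 513, same as the little-integer-size(16) match
```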
You won’t see a difference if you convert only a single byte, since endianness applies only to the ordering of multiple bytes, not to the ordering of bits.
iex> <<big::big-integer>> = <<1>>
<<1>>
iex> big
1
iex> <<little::little-integer>> = <<1>>
<<1>>
iex> little
1
Elixir also supports the native endianness modifier, which resolves to big or little depending on the machine your VM runs on. I suggest you always use big or little explicitly unless native is absolutely necessary.
Modifier Order
One feature of modifiers is that their order doesn’t matter. All these notations are the same:
<<x::little-signed-integer-size(16)>>
<<x::integer-signed-little-size(16)>>
<<x::size(16)-signed-integer-little>>
<<x::16-signed-integer-little>>
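A quick way to convince yourself is to decode the same bytes with all four notations. A minimal sketch (ModifierOrder is a made-up module name):

```elixir
defmodule ModifierOrder do
  # Hypothetical helper: decode the same two bytes with all four notations.
  def decode_all(<<_::16>> = bytes) do
    <<a::little-signed-integer-size(16)>> = bytes
    <<b::integer-signed-little-size(16)>> = bytes
    <<c::size(16)-signed-integer-little>> = bytes
    <<d::16-signed-integer-little>> = bytes
    [a, b, c, d]
  end
end

ModifierOrder.decode_all(<<0xFE, 0xFF>>)
# => [-2, -2, -2, -2]
```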
If you want to enforce a certain modifier order in your codebase, you can add this nifty Credo check which will check that the modifiers are always ordered like this:
<<x::[endian]-[sign]-[type]-[size]>>
Thank you Noah for recommending me this credo check 💜
Dynamic Match Size
The last feature of binary pattern matching I want to share with you is that you can have dynamically sized matches like this:
iex> <<count::integer, text::binary-size(count)>> = <<26, "Hello there General Kenobi">>
<<26, 72, 101, ... >>
iex> text
"Hello there General Kenobi"
Once you start decoding bitstrings in the real world, you will often encounter the case in which a preceding value defines the length of the value to follow. In our example, the count defines the length of the string to follow.
This works with all types, including raw bits:
iex> <<count::4, value::size(count)>> = <<0b01001001>>
"I"
iex> count
4
iex> value
9 # or 0b1001
You can also use regular variables to dynamically define match size:
iex> count = 4
4
iex> <<text::binary-size(count)>> = <<"helo">>
"helo"
iex> text
"helo"
This feature is extremely useful if you don’t know the size of a segment up front but receive a “meta” or “header” segment which indicates the dynamic size of the segment to follow.
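A typical use case is a stream of length-prefixed messages. The sketch below assumes a made-up format in which every message starts with one length byte; the LengthPrefixed module and the :truncated atom are hypothetical:

```elixir
defmodule LengthPrefixed do
  # Hypothetical format: each message is one length byte followed by that
  # many bytes of payload, repeated until the input is consumed.
  def parse(binary, acc \\ [])

  def parse(<<>>, acc), do: {:ok, Enum.reverse(acc)}

  def parse(<<len::8, payload::binary-size(len), rest::binary>>, acc) do
    parse(rest, [payload | acc])
  end

  # Reached when the header promises more bytes than are left.
  def parse(_truncated, _acc), do: {:error, :truncated}
end

LengthPrefixed.parse(<<2, "hi", 5, "there">>)
# => {:ok, ["hi", "there"]}
LengthPrefixed.parse(<<3, "hi">>)
# => {:error, :truncated}
```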
Configure IEx
Elixir displays bitstrings as UTF8 characters whenever possible, which is useful if you actually work with text, but can get in your way when you work with numbers.
Luckily, you can configure IEx to always print out the “raw” values like this:
iex> <<54, 77>>
"6M"
iex> IEx.configure(inspect: [base: :decimal, charlists: :as_lists, binaries: :as_binaries])
:ok
iex> <<54, 77>>
<<54, 77>>
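Outside of IEx, for example in log messages, you can pass the same option to inspect/2 directly. A small sketch (RawInspect is a hypothetical wrapper, the binaries option itself is part of Inspect.Opts):

```elixir
defmodule RawInspect do
  # Hypothetical helper: render a binary as raw bytes, whatever its content.
  def raw(binary) when is_binary(binary) do
    inspect(binary, binaries: :as_binaries)
  end
end

inspect(<<54, 77>>)         # => "\"6M\""
RawInspect.raw(<<54, 77>>)  # => "<<54, 77>>"
```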
I learned this trick from Geoffrey Lessel’s great talk about working with bits at ElixirConf US 2025. Thank you Geoffrey 💜
Conclusion
And that’s it! I hope you enjoyed this article! If you want to support me, you can buy my firewall for Phoenix Phx2Ban or my book or video courses (one and two). Follow me on Bluesky or subscribe to my newsletter below if you want to get notified when I publish the next blog post. Until next time! Cheerio 👋