The 3 Biggest Wins When Using Alpine as a Base Docker Image
If you want to shrink your Docker images, get your services started faster and tighten up security, then give Alpine a try.
It’s no secret by now that Docker is heavily using Alpine as a base image for official Docker images. This movement started near the beginning of 2016.
Fast forward to today and nearly every official Docker image has a tag for Alpine.
You don’t just wake up one morning and decide to make a sweeping change like that. Especially not when the previous official base image of choice was Debian – which is well known for being super solid. What would cause such a move?
By the way, if you want, you can still use the Debian version of official Docker images today.
# Why Alpine?
Alpine describes itself as:
Small. Simple. Secure. Alpine Linux is a security-oriented, lightweight Linux distribution based on musl libc and busybox.
That sounds like it could be interesting, but what does that really mean for you and me, or anyone who uses Docker on a regular basis?
# The Main Benefit Is Shrinkage
In most other contexts (such as doing laundry), “shrinkage” is a pretty bad thing, but in the world of Docker, you should look forward to it because it means your Docker images will be smaller.
The Dockerized version of Alpine 3.6 weighs in at 3.98MB.
For comparison, here’s how Alpine compares to other popular distributions of Linux:
| DISTRIBUTION | VERSION | SIZE |
| --- | --- | --- |
| Debian | Jessie | 123MB |
| CentOS | 7 | 193MB |
| Fedora | 25 | 231MB |
| Ubuntu | 16.04 | 118MB |
| Alpine | 3.6 | 3.98MB |
Wow, check out the difference in size. Alpine is about 30x smaller than Debian.
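If you'd like to sanity check that ratio yourself, here's a small shell sketch. The `ratio` function is just a helper name I made up for this example, and the sizes are copied from the table above:

```shell
# ratio BIG SMALL: print how many times larger BIG is than SMALL.
# (Hypothetical helper; both sizes are in MB.)
ratio() {
  awk -v big="$1" -v small="$2" 'BEGIN { printf "%.1f\n", big / small }'
}

ratio 123 3.98   # Debian Jessie vs Alpine 3.6: prints 30.9
ratio 118 3.98   # Ubuntu 16.04 vs Alpine 3.6: prints 29.6
```

Image sizes change from release to release, so plug in whatever `docker images` reports for the tags you actually use.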
The Docker Hub has handled a ton of pulls. By investigating its public API we can see that Debian has gotten 35,555,107 pulls and Alpine has gotten 135,136,475 pulls at the time of this article.
Are all of these pulls resulting in every byte transferred? Probably not, but your guess is as good as mine. At the very least, it puts things into perspective.
Estimated costs for transferring Debian and Alpine ~35 million times over S3:
| DISTRIBUTION | GB TRANSFERRED | S3 TRANSFER COST |
| --- | --- | --- |
| Debian | 4,373,278 | $416,310.72 |
| Alpine | 141,509 | $17,832.06 |
So just to transfer Debian vs Alpine ~35 million times, going by the rates in S3’s pricing calculator, there’s a difference of nearly $400,000 USD.
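The gigabyte figures in that table are simply the number of pulls multiplied by the image size. Here's a rough sketch of that arithmetic, assuming decimal gigabytes (1 GB = 1000 MB) and that every pull transfers the whole image. `gb_transferred` is a made-up helper name, and I haven't tried to reproduce the dollar amounts since those depend on the S3 pricing tiers in effect at the time:

```shell
# gb_transferred PULLS SIZE_MB: total data moved, in decimal gigabytes,
# assuming every single pull downloads the full image.
gb_transferred() {
  awk -v pulls="$1" -v mb="$2" 'BEGIN { printf "%.0f\n", pulls * mb / 1000 }'
}

gb_transferred 35555107 123    # Debian at 123MB: prints 4373278
gb_transferred 35555107 3.98   # Alpine at 3.98MB: prints 141509
```

Both results line up with the table, using Debian's pull count for both distributions so the comparison is apples to apples.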
I know you’re probably not operating at that scale (neither am I), but there is a real savings when it comes to transfer costs in the cloud at all levels of scale.
Being able to cut your image size down by over 100MB is a big deal.
In real world web applications which have lots of packages installed, I still see about a 2x or 3x savings in final image size with Alpine, so it’s not only useful in micro-benchmarks. The ~100MB saved on the base image is a fixed amount, no matter what else gets built into your image.
# Alpine Is Fast
Cost isn’t the only win when dealing with smaller Docker images.
Let’s say that you wanted to pull down a Docker image and install curl. How long would this take with Debian vs Alpine?
I’m going to perform 2 tests here:
The first test will be on a fresh box where I need to download the base Docker image onto the system using a 30MB home grade cable connection and install curl.
The second test will be the same as the above, except the system will already have the base Docker image on the machine before installing curl.
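If you want to repeat these two tests on your own machine, here's a minimal sketch of how the runs could be set up. The `elapsed` helper is hypothetical (plain `time` works just as well), and it assumes your `date` supports `+%s`, which is true on most Linux and macOS systems:

```shell
# elapsed CMD [ARGS...]: run a command quietly and print how many
# whole seconds it took. (Hypothetical helper for this example.)
elapsed() {
  start=$(date +%s)
  "$@" >/dev/null 2>&1
  end=$(date +%s)
  echo $((end - start))
}

# Test 1 (cold): remove any cached copy first so the pull is included.
#   docker rmi debian
#   elapsed docker run --rm debian sh -c "apt-get update && apt-get install -y curl"
#
# Test 2 (warm): run the same command again now that the image is cached.
#   elapsed docker run --rm debian sh -c "apt-get update && apt-get install -y curl"
```

The `-y` flag is there so `apt-get` doesn't stop and wait for a confirmation that nobody is around to give. Swap in `alpine` and `apk update && apk add curl` for the Alpine runs.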
## Test 1: Downloading the Base Image on a Fresh Machine
Debian
time docker run --rm debian sh -c "apt-get update && apt-get install -y curl"
real 0m27.928s
user 0m0.019s
sys 0m0.077s
So we’re looking at about 28 real life seconds for it to pull down Debian, run an apt-get update and then install curl.
Alpine
time docker run --rm alpine sh -c "apk update && apk add curl"
real 0m5.698s
user 0m0.008s
sys 0m0.037s
On the other hand, with Alpine, it finished about 5x faster. Waiting 28 vs 5 seconds is no joke. That’s a substantial amount of time.
Think about how much nicer it is to wait 5 seconds for your test suite to finish instead of 30.
## Test 2: The Machine Already Has the Docker Image Downloaded
Debian
time docker run --rm debian sh -c "apt-get update && apt-get install -y curl"
real 0m9.170s
user 0m0.000s
sys 0m0.031s
Clearly my 30MB cable connection is the bottleneck, but it still took over 9 seconds just to run apt-get update and install curl.
Alpine
time docker run --rm alpine sh -c "apk update && apk add curl"
real 0m3.040s
user 0m0.017s
sys 0m0.008s
On the other hand, Alpine zipped through it in 3 seconds flat. That’s about a 3x improvement.
## What Does That Mean for You?
When pulling down new Docker images onto a fresh server, you can expect the initial pull to be quite a bit faster on Alpine. The slower your network is, the bigger the difference it will be.
If you’re in a position where you have auto-scaling in place and are spinning up A LOT of servers then this is a pretty big deal. It means your servers will be ready to accept traffic at a faster rate.
If you’re not spinning up a lot of servers then the speed benefit goes way down, but hey, you’re still saving over 100MB in data transfer and storage costs.
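To get a feel for how much the network matters, here's a back-of-the-envelope sketch. It estimates the extra download time for the bigger base image at a given link speed in megabits per second. `pull_gap` is a made-up helper, the link speeds are only examples, and it uses the uncompressed sizes from earlier. Docker transfers compressed layers, so treat the results as an upper bound rather than a prediction:

```shell
# pull_gap BIG_MB SMALL_MB MBPS: extra seconds needed to download the
# bigger image at MBPS megabits per second (1 MB = 8 megabits).
pull_gap() {
  awk -v big="$1" -v small="$2" -v mbps="$3" \
    'BEGIN { printf "%.1f\n", (big - small) * 8 / mbps }'
}

pull_gap 123 3.98 30     # slower link: prints 31.7
pull_gap 123 3.98 100    # faster link: prints 9.5
```

The gap shrinks as the link gets faster, which is why the win is most noticeable on slow connections or when lots of servers pull at once.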
# Alpine Is Secure
Another perk of being much smaller in size is that the attack surface is much smaller too.
When there aren’t a lot of packages and libraries on your system, there’s much less that can go wrong.
A few years ago there was a nasty Bash exploit named “ShellShock” that let an attacker gain control over your machine if you were affected by it.
Alpine was immune to that attack because Bash isn’t installed by default.
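You can check for yourself whether a given image ships with Bash. Here's a minimal sketch, where `has_cmd` is a hypothetical helper built on the standard `command -v` shell builtin:

```shell
# has_cmd NAME: succeed if NAME can be found on the PATH.
# (Hypothetical helper built on the POSIX `command -v` builtin.)
has_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Run the check inside the container you want to inspect, for example:
#   docker run --rm alpine sh -c 'command -v bash || echo "no bash here"'
if has_cmd bash; then
  echo "bash is installed"
else
  echo "bash is not installed"
fi
```

Keep in mind that a missing Bash only rules out Bash-specific bugs. If you install it yourself with `apk add bash`, you're responsible for keeping it patched.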
## Batteries Available but Not Included
Also, most distributions run a ton of services by default. The ps aux output on a fresh Debian or Ubuntu system is a mile long. This might be reasonable for a non-Docker set up, but chances are your Dockerized application doesn’t need most of what’s started by default.
Alpine takes a much different approach. It starts very little by default and expects you to start only the things you need. This is perfect for a Dockerized application.
## Use the Best Tool for the Job
Alpine has also taken a strong stance on security in general. The dev team isn’t afraid to swap out certain packages for more secure alternatives. For example, they replaced OpenSSL with LibreSSL.
I really like this quote from the above link:
While OpenSSL is trying to fix the broken code, libressl has simply removed it.
I think that really sums up Alpine. It truly lives by what it promises, which is to be a small and secure Linux distribution. It’s the perfect combo to use with Docker when used as a base Docker image.
What do you think about Alpine?