Beware of Dangling Backslashes in RUN Commands in Your Dockerfile
This can cause pull errors that lead you down the wrong debugging path, or fun and exciting errors with whatever you run afterwards.
I saw an interesting question posted in Docker’s IRC channel the other day and it caught me off-guard a little. Someone posted their Dockerfile and said it wouldn’t build.
I’m going to post their Dockerfile here for reference. The only thing I’ve done is comment out the lines that run npm commands, since we don’t have their source code. This lets us build the image locally without missing file errors.
For the full effect I’m going to leave their formatting, comments and overall choices fully intact. Again, this is their Dockerfile which was posted by them (not me):
# ---- Base Node ----
FROM node:lts-alpine AS base
# install node
RUN apk add --no-cache tini
# set working directory
WORKDIR /app
# Set tini as entrypoint
ENTRYPOINT ["/sbin/tini", "--"]
# copy project file
COPY . .
# ---- Dependencies ----
FROM base AS dependencies
# install node packages
RUN npm set progress=false && npm config set depth 0
# copy production node_modules aside
# install ALL node_modules, including 'devDependencies'
WORKDIR /app
RUN apk add python3 g++ make
RUN apk update && apk upgrade && \
apk add --no-cache bash git openssh \
# ---- Test ----
# run linters, setup and tests
FROM dependencies AS test
COPY . ./app
#RUN npm install
#RUN npm run copy
# --- Build ---
FROM test AS build
#ENTRYPOINT ["npm","start"]
ENTRYPOINT ["echo", "Hello"]
If you try to build this with docker image build -t dangling . you’ll be greeted with:
[+] Building 0.5s (4/4) FINISHED docker:default
=> [internal] load build definition from Dockerfile 0.1s
=> => transferring dockerfile: 813B 0.0s
=> [internal] load .dockerignore 0.1s
=> => transferring context: 2B 0.0s
=> ERROR [internal] load metadata for docker.io/library/test:latest 0.3s
=> [auth] library/test:pull token for registry-1.docker.io 0.0s
------
> [internal] load metadata for docker.io/library/test:latest:
------
ERROR: failed to solve: rpc error: code = Unknown desc = failed to solve with frontend dockerfile.v0: failed to create LLB definition: pull access denied, repository does not exist or may require authorization: server message: insufficient_scope: authorization failed
At first glance you might think that’s strange. Why is it trying to pull test from the Docker Hub when near the bottom we have a build stage named test with FROM dependencies AS test?
With multi-stage builds, you can reference FROM mystage where mystage is whatever you named a previous stage with AS. Docker will use that stage instead of trying to pull the image from a Docker registry.
The above error is telling us that the test stage was never created, but that’s not super obvious if you haven’t worked a lot with multi-stage builds.
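As a quick refresher, here’s a minimal multi-stage example (the stage name, image tag and file paths are just placeholders for illustration):
# First stage, named "mystage", produces an artifact.
FROM alpine:3.18 AS mystage
RUN echo "hello" > /artifact.txt
# "mystage" here resolves to the stage above, not an image on Docker Hub.
FROM alpine:3.18
COPY --from=mystage /artifact.txt /artifact.txt
CMD ["cat", "/artifact.txt"]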
It’s Not Related to Reserved Keywords
You may look at the above error and decide to rename FROM dependencies AS test to FROM dependencies AS test2, and then also update the later stage to FROM test2 AS build.
The reasoning here is that programming languages often have reserved keywords. If you’ve never named a build stage literally test before, you may take a guess that it’s a reserved keyword and change it just to see what happens.
If you make this change you’ll get the same error, except it’ll reference test2. So nope, test isn’t reserved. You can use it as a stage name.
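For example, this tiny Dockerfile (unrelated to the one above, purely an illustration) builds without complaint even though a stage is literally named test:
# A stage named "test" is perfectly valid.
FROM alpine:3.18 AS test
RUN echo "hello from a stage named test"
FROM test AS build
CMD ["echo", "test works fine as a stage name"]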
It’s Not Related to the Build Stage
When debugging, generally the goal is to find the root cause of the issue. You can often do that by eliminating pieces from the equation.
My next thought process was: if the error is related to referencing the test stage, let’s just remove it. So I commented out FROM test AS build and everything below it.
That let the image mostly build but it produced this error:
[+] Building 11.1s (12/13) docker:default
=> [internal] load build definition from Dockerfile 0.1s
=> => transferring dockerfile: 816B 0.0s
=> [internal] load .dockerignore 0.1s
=> => transferring context: 2B 0.0s
=> [internal] load metadata for docker.io/library/node:lts-alpine 0.2s
=> [base 1/4] FROM docker.io/library/node:lts-alpine@sha256:3482a20c97e401b56ac50ba8920cc7b5b2022bfc6aa7d4e4c231755770cf892f 0.0s
=> [internal] load build context 0.1s
=> => transferring context: 816B 0.0s
=> CACHED [base 2/4] RUN apk add --no-cache tini 0.0s
=> CACHED [base 3/4] WORKDIR /app 0.0s
=> [base 4/4] COPY . . 0.1s
=> [dependencies 1/5] RUN npm set progress=false && npm config set depth 0 1.7s
=> [dependencies 2/5] WORKDIR /app 0.2s
=> [dependencies 3/5] RUN apk add python3 g++ make 6.1s
=> ERROR [dependencies 4/5] RUN apk update && apk upgrade && apk add --no-cache bash git openssh FROM dependencies AS test 2.4s
------
> [dependencies 4/5] RUN apk update && apk upgrade && apk add --no-cache bash git openssh FROM dependencies AS test:
0.659 fetch https://dl-cdn.alpinelinux.org/alpine/v3.18/main/x86_64/APKINDEX.tar.gz
0.861 fetch https://dl-cdn.alpinelinux.org/alpine/v3.18/community/x86_64/APKINDEX.tar.gz
1.169 v3.18.3-126-g844291c27ba [https://dl-cdn.alpinelinux.org/alpine/v3.18/main]
1.169 v3.18.3-131-g4de6b900df1 [https://dl-cdn.alpinelinux.org/alpine/v3.18/community]
1.169 OK: 20063 distinct packages available
1.526 OK: 275 MiB in 47 packages
1.596 fetch https://dl-cdn.alpinelinux.org/alpine/v3.18/main/x86_64/APKINDEX.tar.gz
1.846 fetch https://dl-cdn.alpinelinux.org/alpine/v3.18/community/x86_64/APKINDEX.tar.gz
2.206 ERROR: unable to select packages:
2.206 AS (no such package):
2.206 required by: world[AS]
2.206 dependencies (no such package):
2.206 required by: world[dependencies]
2.206 FROM (no such package):
2.206 required by: world[FROM]
2.206 test (no such package):
2.206 required by: world[test]
------
ERROR: failed to solve: executor failed running [/bin/sh -c apk update && apk upgrade && apk add --no-cache bash git openssh FROM dependencies AS test]: exit code: 4
Oh neat, now we’re getting somewhere.
At this point the problem was solved; someone had already posted the answer in IRC. Were you able to spot it?
The above error is because apk (Alpine’s package manager) is literally trying to install packages named FROM, dependencies, AS and test, thanks to the trailing \ in apk add --no-cache bash git openssh \. Clearly those packages don’t exist.
Since the RUN command is executed in the context of a shell, it follows shell behavior, which is to treat lines broken up with \ as a single line. Using \ to break up long commands into multiple lines is a common thing to do.
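You can see the same joining behavior in a plain shell, outside of Docker entirely:
# These two lines...
echo one \
  two
# ...run as a single command, exactly as if you had typed:
echo one two
Both print one two.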
It Was the Dangling Backslash
Specifically, it’s this line: apk add --no-cache bash git openssh \. If you remove that \ and build the image it will work.
This is another great example where, once you know the problem, the solution is easy.
Can Hadolint Lint This?
The first thing I tried after that was to run the Dockerfile through Hadolint. That’s a really nice linting tool for Docker and I’ve written about it in the past.
It didn’t catch that. Parsing this would be tricky since there are so many ways a backslash can be at the end of a line and still be valid. Indentation of the next line is also valid with any number of spaces.
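If you want to run it against your own Dockerfiles, Hadolint’s official image makes it a one-liner (assuming you’d rather not install the binary locally):
# Lint the Dockerfile in the current directory with Hadolint's Docker image.
docker run --rm -i hadolint/hadolint < Dockerfile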
One possibility is: if you’re within a RUN command, the line ends with \, and the next non-comment word in the file is a valid Docker instruction, then produce a lint warning that you have a dangling backslash.
For example in the above case, we had \ followed by new lines and a few comments, but then the next “real” word was FROM, which is a valid Docker instruction. It feels parseable.
I haven’t given a huge amount of thought to this but it feels like an option to explore. I’ve opened an issue on GitHub for this to get more feedback.
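To make the idea a bit more concrete, here’s a rough stand-alone heuristic I put together in awk. It’s purely a sketch, not Hadolint’s rule and not necessarily what the issue will turn into, and it doesn’t even bother checking that we’re inside a RUN command. It flags a backslash-terminated line whenever the next non-blank, non-comment line starts with a Dockerfile instruction:
awk '
  # Remember lines that end with a backslash.
  /\\[[:space:]]*$/ { pending = NR; next }
  # Skip blank lines and comments while a continuation is pending.
  pending && /^[[:space:]]*(#|$)/ { next }
  # If the next real line starts with an instruction, the backslash is dangling.
  pending && /^[[:space:]]*(FROM|RUN|COPY|ADD|CMD|ENTRYPOINT|WORKDIR|ENV|ARG|LABEL|EXPOSE|USER|VOLUME) / {
    printf "line %d: possible dangling backslash\n", pending
  }
  { pending = 0 }
' Dockerfile
It should point at the apk add line in the Dockerfile above, but it would need more care around edge cases (here-docs, the escape parser directive and so on) before it could become a real lint rule.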
Here’s an isolated Dockerfile that demonstrates the problem:
FROM debian:bookworm-slim AS base
RUN echo "Base..."
FROM base AS dependencies
RUN apt-get update -y \
    && apt-get install -y \
    curl \
    git \
FROM dependencies AS test
RUN echo "Test..."
FROM test AS build
CMD ["echo", "Done!"]
The video below builds the image and shows the errors along with the debug process.
Demo Video
Timestamps
- 0:11 – Someone had this issue on IRC
- 0:46 – Demoing the first error and going over the Dockerfile
- 2:09 – Was test a reserved keyword? Nope
- 2:38 – Reducing variables to help find the root cause
- 4:01 – Identifying the root cause
- 5:03 – This was an interesting issue
- 5:40 – Maybe Hadolint could detect this in the future
- 6:34 – Can it be a lint rule? I opened an issue on GitHub
Have you ever had this happen to you in a Dockerfile? Let us know below!