Beware of Dangling Backslashes in RUN Commands in Your Dockerfile
This can cause pull errors that may lead you down the wrong path or cause fun and exciting errors with anything you're running afterwards.
I saw an interesting question posted in Docker’s IRC channel the other day and it caught me off-guard a little. Someone posted their Dockerfile and said it wouldn’t build.
I’m going to post their Dockerfile here for reference. The only thing I’ve done is comment out the lines that run npm commands since we don’t have their source code. This will let us build the image locally without missing file errors.
For the full effect I’m going to leave their formatting, comments and overall choices fully intact. Again, this is their Dockerfile which was posted by them (not me):
    # ---- Base Node ----
    FROM node:lts-alpine AS base
    # install node
    RUN apk add --no-cache tini
    # set working directory
    WORKDIR /app
    # Set tini as entrypoint
    ENTRYPOINT ["/sbin/tini", "--"]
    # copy project file
    COPY . .

    # ---- Dependencies ----
    FROM base AS dependencies
    # install node packages
    RUN npm set progress=false && npm config set depth 0
    # copy production node_modules aside
    # install ALL node_modules, including 'devDependencies'
    WORKDIR /app
    RUN apk add python3 g++ make
    RUN apk update && apk upgrade && \
        apk add --no-cache bash git openssh \

    # ---- Test ----
    # run linters, setup and tests
    FROM dependencies AS test
    COPY . ./app
    #RUN npm install
    #RUN npm run copy

    # --- Build ---
    FROM test AS build
    #ENTRYPOINT ["npm","start"]
    ENTRYPOINT ["echo", "Hello"]
If you try to build this with
docker image build -t dangling . you’ll be greeted with:
    [+] Building 0.5s (4/4) FINISHED docker:default
     => [internal] load build definition from Dockerfile 0.1s
     => => transferring dockerfile: 813B 0.0s
     => [internal] load .dockerignore 0.1s
     => => transferring context: 2B 0.0s
     => ERROR [internal] load metadata for docker.io/library/test:latest 0.3s
     => [auth] library/test:pull token for registry-1.docker.io 0.0s
    ------
     > [internal] load metadata for docker.io/library/test:latest:
    ------
    ERROR: failed to solve: rpc error: code = Unknown desc = failed to solve with frontend dockerfile.v0: failed to create LLB definition: pull access denied, repository does not exist or may require authorization: server message: insufficient_scope: authorization failed
At first glance you might think that’s strange. Why is it trying to pull an image named test from the Docker Hub when near the bottom we define a build stage with FROM dependencies AS test?
With multi-stage builds, you can write FROM mystage where mystage is whatever name you gave a previous stage with AS. Docker will use that stage instead of trying to pull an image from a Docker registry.
The above is indicating that the
test stage was not created but that’s not super obvious if you haven’t worked a lot with multi-stage builds.
It’s Not Related to Reserved Keywords
You may look at the above error and decide to rename
FROM dependencies AS test to
FROM dependencies AS test2 and then also make the same change to
FROM test2 AS build.
The reasoning here is that programming languages often have reserved keywords. If you’ve never named a build stage literally
test before, you may take a guess that it’s a reserved keyword and try to change that just to see what happens.
If you make this change you’ll get the same error except it’ll reference
test2. So nope,
test isn’t reserved. You can use that name as a stage.
It’s Not Related to the Build Stage
When debugging, generally the goal is to find the root cause of the issue. You can often do that by eliminating pieces from the equation.
My next thought was that if the error is related to referencing the test stage, let’s just remove that reference, so I commented out FROM test AS build and everything below it.
That let the image mostly build but it produced this error:
    [+] Building 11.1s (12/13) docker:default
     => [internal] load build definition from Dockerfile 0.1s
     => => transferring dockerfile: 816B 0.0s
     => [internal] load .dockerignore 0.1s
     => => transferring context: 2B 0.0s
     => [internal] load metadata for docker.io/library/node:lts-alpine 0.2s
     => [base 1/4] FROM docker.io/library/node:lts-alpine@sha256:3482a20c97e401b56ac50ba8920cc7b5b2022bfc6aa7d4e4c231755770cf892f 0.0s
     => [internal] load build context 0.1s
     => => transferring context: 816B 0.0s
     => CACHED [base 2/4] RUN apk add --no-cache tini 0.0s
     => CACHED [base 3/4] WORKDIR /app 0.0s
     => [base 4/4] COPY . . 0.1s
     => [dependencies 1/5] RUN npm set progress=false && npm config set depth 0 1.7s
     => [dependencies 2/5] WORKDIR /app 0.2s
     => [dependencies 3/5] RUN apk add python3 g++ make 6.1s
     => ERROR [dependencies 4/5] RUN apk update && apk upgrade && apk add --no-cache bash git openssh FROM dependencies AS test 2.4s
    ------
     > [dependencies 4/5] RUN apk update && apk upgrade && apk add --no-cache bash git openssh FROM dependencies AS test:
    0.659 fetch https://dl-cdn.alpinelinux.org/alpine/v3.18/main/x86_64/APKINDEX.tar.gz
    0.861 fetch https://dl-cdn.alpinelinux.org/alpine/v3.18/community/x86_64/APKINDEX.tar.gz
    1.169 v3.18.3-126-g844291c27ba [https://dl-cdn.alpinelinux.org/alpine/v3.18/main]
    1.169 v3.18.3-131-g4de6b900df1 [https://dl-cdn.alpinelinux.org/alpine/v3.18/community]
    1.169 OK: 20063 distinct packages available
    1.526 OK: 275 MiB in 47 packages
    1.596 fetch https://dl-cdn.alpinelinux.org/alpine/v3.18/main/x86_64/APKINDEX.tar.gz
    1.846 fetch https://dl-cdn.alpinelinux.org/alpine/v3.18/community/x86_64/APKINDEX.tar.gz
    2.206 ERROR: unable to select packages:
    2.206 AS (no such package):
    2.206 required by: world[AS]
    2.206 dependencies (no such package):
    2.206 required by: world[dependencies]
    2.206 FROM (no such package):
    2.206 required by: world[FROM]
    2.206 test (no such package):
    2.206 required by: world[test]
    ------
    ERROR: failed to solve: executor failed running [/bin/sh -c apk update && apk upgrade && apk add --no-cache bash git openssh FROM dependencies AS test]: exit code: 4
Oh neat, now we’re getting somewhere.
At this point the problem was solved since someone had already posted the answer in IRC. Were you able to spot it?
The above error is because apk (Alpine’s package manager) is literally trying to install packages named FROM, dependencies, AS and test because of the trailing \ on apk add --no-cache bash git openssh \. Clearly those packages don’t exist.
Docker treats lines in a RUN command that are broken up with \ as a single line, just like a shell would, and it skips the blank line and comments along the way. That’s how FROM dependencies AS test became part of the apk add command, which you can see in the build output above. Using \ to break up long commands into multiple lines is a common thing to do.
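To make the joining behavior concrete, here is a rough Python model of how continued lines collapse into a single command. This is only a sketch and not Docker’s actual parser: the join_continuations name is made up for illustration, and the rule that blank lines and comment lines are skipped inside a continuation is an assumption inferred from the build output above.

```python
def join_continuations(dockerfile_text: str) -> list[str]:
    """Roughly mimic how backslash-continued Dockerfile lines collapse together."""
    instructions = []
    pending = []  # pieces of an instruction that is still being continued
    for raw in dockerfile_text.splitlines():
        line = raw.strip()
        # Assumption: blank and comment lines are skipped, even mid-continuation.
        if not line or line.startswith("#"):
            continue
        if line.endswith("\\"):
            # Drop the trailing backslash and keep collecting pieces.
            pending.append(line[:-1].strip())
            continue
        pending.append(line)
        instructions.append(" ".join(pending))
        pending = []
    if pending:  # the file ended while still inside a continuation
        instructions.append(" ".join(pending))
    return instructions


sample = """\
RUN apk update && apk upgrade && \\
    apk add --no-cache bash git openssh \\

# ---- Test ----
# run linters, setup and tests
FROM dependencies AS test
COPY . ./app
"""

for instruction in join_continuations(sample):
    print(instruction)
```

With this sample, the FROM line ends up glued onto the end of the RUN command, which matches the command shown in the failed build step.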
It Was the Dangling Backslash
Specifically, it’s this line
apk add --no-cache bash git openssh \. If you remove that
\ and build the image it will work.
This is another great example where once you know the problem the solution is easy.
Can Hadolint Lint This?
It didn’t catch that. Detecting it would be tricky since there are so many ways a backslash can be at the end of a line and still be valid. The next line can also be indented with any number of spaces and still be valid.
One possibility is if you’re within a
RUN command and the line ends with
\ and the next non-comment word in the file is a valid Docker instruction then produce a lint warning that you have a dangling backslash.
For example in the above case, we had \ followed by a blank line and a few comments, but then the next “real” word was FROM which is a valid Docker instruction. It feels parseable.
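The heuristic described above can be sketched in a few lines of Python. This is not how Hadolint works and it is not a complete Dockerfile parser: the find_dangling_backslashes name is hypothetical, the keyword set is an assumed list of common instructions, and things like heredocs or a custom escape character are ignored.

```python
# Assumed set of Dockerfile instruction keywords to check against.
INSTRUCTIONS = {
    "ADD", "ARG", "CMD", "COPY", "ENTRYPOINT", "ENV", "EXPOSE", "FROM",
    "HEALTHCHECK", "LABEL", "ONBUILD", "RUN", "SHELL", "STOPSIGNAL",
    "USER", "VOLUME", "WORKDIR",
}


def find_dangling_backslashes(dockerfile_text: str) -> list[int]:
    """Return 1-based line numbers that look like dangling backslashes in a RUN."""
    warnings = []
    continued_from = None  # line number of the last "\" line inside a RUN
    for number, raw in enumerate(dockerfile_text.splitlines(), start=1):
        line = raw.strip()
        # Blank lines and comments are not "real" words, so skip them.
        if not line or line.startswith("#"):
            continue
        first_word = line.split()[0]
        if continued_from is not None and first_word in INSTRUCTIONS:
            # A continued RUN ran straight into an instruction keyword.
            warnings.append(continued_from)
            continued_from = None  # treat this line as a fresh instruction
        starts_run = continued_from is None and first_word.upper() == "RUN"
        in_run = starts_run or continued_from is not None
        continued_from = number if in_run and line.endswith("\\") else None
    return warnings


sample = """\
FROM base AS dependencies
RUN apk update && apk upgrade && \\
    apk add --no-cache bash git openssh \\

# ---- Test ----
FROM dependencies AS test
COPY . ./app
"""

print(find_dangling_backslashes(sample))  # prints [3]
```

Only uppercase keywords are matched inside a continuation to keep false positives down, since a shell command rarely starts a line with a word like FROM or COPY. A real lint rule would likely need to be more careful than this.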
I haven’t given a huge amount of thought to this but it feels like an option to explore. I’ve opened an issue on GitHub for this to get more feedback.
Here’s an isolated Dockerfile that demonstrates the problem:
    FROM debian:bookworm-slim AS base
    RUN echo "Base..."

    FROM base AS dependencies
    RUN apt-get update -y \
      && apt-get install -y \
      curl \
      git \

    FROM dependencies AS test
    RUN echo "Test..."

    FROM test AS build
    CMD ["echo", "Done!"]
The video below builds the image and shows the errors along with the debug process.
- 0:11 – Someone had this issue on IRC
- 0:46 – Demoing the first error and going over the Dockerfile
- 2:09 – Was test a reserved keyword? Nope
- 2:38 – Reducing variables to help find the root cause
- 4:01 – Identifying the root cause
- 5:03 – This was an interesting issue
- 5:40 – Maybe Hadolint could detect this in the future
- 6:34 – Can it be a lint rule? I opened an issue on GitHub
Have you ever had this happen to you in a Dockerfile? Let us know below!