Deleting Specific Lines in a File with sed or yq
We'll go over deleting 1 or more lines that match a regex as well as deleting specific lines by number reference.
Prefer video? Here’s a demo video going over what’s listed below in a bit more detail.
Recently I found myself wanting to delete a number of Docker volume related lines from a docker-compose.yml file.
Ultimately I ended up using yq because it was the most maintainable and least brittle solution for YAML. However, sed is a general purpose solution that isn’t limited to YAML, and it was a fun exercise in using it. We’ll cover both solutions.
# Going Over the Use Case
Here’s a portion of the Docker Compose YAML file that I was dealing with. The whole file can be found in the example app on GitHub:
x-app: &default-app
  build:
    context: "."
    target: "app"
    args:
      - "UID=${UID:-1000}"
      - "GID=${GID:-1000}"
      - "FLASK_DEBUG=${FLASK_DEBUG:-false}"
      - "NODE_ENV=${NODE_ENV:-production}"
  # ...
  volumes:
    - "${DOCKER_WEB_VOLUME:-./public:/app/public}"

x-assets: &default-assets
  # ...
  volumes:
    - ".:/app"

services:
  postgres:
    # ...
    volumes:
      - "postgres:/var/lib/postgresql/data"

  redis:
    # ...
    volumes:
      - "redis:/data"

volumes:
  postgres: {}
  redis: {}
The goal here is to remove all of the volumes properties as well as each volume line within those properties.
# Using sed
sed can be used for a lot of things such as finding and replacing text in a file. It also supports deleting lines.
We can choose to delete lines by line number or by regex pattern.
Deleting by number is much shorter but it’s also more brittle because if the file changes then the line numbers will get shifted and it will start deleting the wrong lines.
Deleting by regex requires more upfront work, but it’s more resilient. If you refactor the file, everything will continue to work as long as you don’t modify your volumes. If you do, you’ll need to remember to update your regex patterns.
Phase 1: Deleting by line
Using the partial YAML file above, we can delete the first instance of volumes with:
sed "11d" docker-compose.yml
That works because it’s on line 11.
You can also delete a range of lines:
sed "11,12d" docker-compose.yml
That deletes both the volumes line and the - "${DOCKER_WEB_VOLUME:-./public:/app/public}" line under it, which is exactly what we want.
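If you want to try this without touching a real file, here’s a minimal sketch that runs the same kind of commands against a throwaway file. The sample.txt file and its contents are made up for illustration and aren’t part of the example app.

```shell
# Hypothetical scratch file, created in a temp directory.
tmp="$(mktemp -d)"
printf '%s\n' "one" "two" "three" "four" > "$tmp/sample.txt"

# Delete only line 2. Without -i, sed prints the result and leaves the file alone.
single="$(sed '2d' "$tmp/sample.txt")"
echo "$single"

echo "---"

# Delete the inclusive range of lines 2 through 3.
range="$(sed '2,3d' "$tmp/sample.txt")"
echo "$range"
```

Since nothing is written back to the file, you can keep adjusting the numbers until the output looks right.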
sed also supports multiple expressions if you separate them with a semicolon (;). That allows us to delete everything we want with:
sed "11,12d;16,17d;22,23d;27,28d;30,32d" docker-compose.yml
This isn’t the easiest thing in the world to maintain. There are a lot of hard-coded line numbers that could easily change if you made a minor adjustment to your source file.
But it does leave us with what we want:
x-app: &default-app
  build:
    context: "."
    target: "app"
    args:
      - "UID=${UID:-1000}"
      - "GID=${GID:-1000}"
      - "FLASK_DEBUG=${FLASK_DEBUG:-false}"
      - "NODE_ENV=${NODE_ENV:-production}"
  # ...

x-assets: &default-assets
  # ...

services:
  postgres:
    # ...

  redis:
    # ...
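One way to lower the risk of deleting the wrong lines is to look up and preview the line numbers before deleting anything. Here’s a small sketch against a made-up sample.yml file (not the real Compose file). grep -n prefixes each match with its line number, and sed -n combined with the p command prints only the addressed lines.

```shell
# Hypothetical sample file for illustration.
tmp="$(mktemp -d)"
printf '%s\n' 'web:' '  image: "nginx"' '  volumes:' '    - ".:/app"' > "$tmp/sample.yml"

# Find the line number of every line containing "volumes:".
numbers="$(grep -n 'volumes:' "$tmp/sample.yml")"
echo "$numbers"

# Preview lines 3 through 4 instead of deleting them.
# -n suppresses the default output and p prints only the addressed lines.
preview="$(sed -n '3,4p' "$tmp/sample.yml")"
echo "$preview"

# Once the preview looks right, swap p for d and drop -n to delete the same range.
remaining="$(sed '3,4d' "$tmp/sample.yml")"
echo "$remaining"
```

This doesn’t make line numbers any less brittle over time, but it does make it easier to double check them right before you run the delete.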
Phase 2: Deleting by regex
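sed "/^  volumes:$/d" docker-compose.yml
deletes every line that is exactly volumes: indented by 2 spaces, which covers the volumes property line under both x-app and x-assets. That isn’t too bad if you have 1 or 2 patterns to write.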
But it starts to get a little unwieldy when you have a bunch of lines with regex patterns that are a little involved.
Here’s what deleting all of the lines we want looks like:
sed '/^  volumes:$/d;/^    - "${DOCKER_WEB_VOLUME:.*"$/d;/^  volumes:/d;/^    - ".:\/app"$/d;/^    volumes:$/d;/ - "postgres:\/var\/lib\/postgresql\/data"$/d;/^    volumes:$/d;/^      - "redis:\/data"$/d;/^volumes:$/d;/^  postgres: {}$/d;/^  redis: {}$/d' docker-compose.yml
If you carefully look at all of that, it’s capturing specific lines of text that match a regex. The regex patterns are including a lot of each line’s text to avoid capturing false positives. We’re trading verbosity for correctness.
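To make the false positive point concrete, here’s a minimal sketch comparing a loose pattern with an anchored one. The sample.yml file and its comment line are hypothetical and only exist to trigger the false positive.

```shell
# Hypothetical sample file with a comment that mentions "volumes:".
tmp="$(mktemp -d)"
printf '%s\n' \
  'web:' \
  '  # volumes: are defined below' \
  '  volumes:' \
  '    - ".:/app"' > "$tmp/sample.yml"

# A loose pattern also removes the comment that merely mentions "volumes:".
loose="$(sed '/volumes:/d' "$tmp/sample.yml")"

# Anchoring with ^ and $ plus the exact indentation only matches the property line.
strict="$(sed '/^  volumes:$/d' "$tmp/sample.yml")"

echo "$loose"
echo "---"
echo "$strict"
```

The anchored version keeps the comment and only drops the property line, which is the behavior the longer patterns above are going for.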
You can technically clean this up a little bit by using a custom sed delimiter to avoid escaping the forward slashes in certain paths such as:
sed '/^  volumes:$/d;/^    - "${DOCKER_WEB_VOLUME:.*"$/d;/^  volumes:/d;\|^    - ".:/app"$|d;/^    volumes:$/d;\| - "postgres:/var/lib/postgresql/data"$|d;/^    volumes:$/d;\|^      - "redis:/data"$|d;/^volumes:$/d;/^  postgres: {}$/d;/^  redis: {}$/d' docker-compose.yml
A few of them use | as the delimiter since they have paths with a /. To do that, you need to escape the first delimiter with \. That’s why we see \|^    - ".:/app"$|d in the command.
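Here’s a small sketch showing that both forms match the same line. The sample.yml file is made up for illustration.

```shell
# Hypothetical sample file with two volume entries.
tmp="$(mktemp -d)"
printf '%s\n' 'volumes:' '  - "redis:/data"' '  - "cache:/tmp"' > "$tmp/sample.yml"

# Default delimiter: every / inside the pattern has to be escaped.
escaped="$(sed '/^  - "redis:\/data"$/d' "$tmp/sample.yml")"

# Custom delimiter: \| starts the address and the next | ends it.
custom="$(sed '\|^  - "redis:/data"$|d' "$tmp/sample.yml")"

echo "$escaped"
echo "---"
echo "$custom"
```

Both commands print the same output, so choosing one over the other is purely about readability.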
That still leaves a lot to be desired in readability and being able to hop in to modify one of those expressions. Fortunately we can break things up into multiple lines as seen below:
sed \
  -e '/^  volumes:$/d' \
  -e '/^    - "${DOCKER_WEB_VOLUME:.*"$/d' \
  -e '/^  volumes:/d' \
  -e '\|^    - ".:/app"$|d' \
  -e '/^    volumes:$/d' \
  -e '\| - "postgres:/var/lib/postgresql/data"$|d' \
  -e '/^    volumes:$/d' \
  -e '\|^      - "redis:/data"$|d' \
  -e '/^volumes:$/d' \
  -e '/^  postgres: {}$/d' \
  -e '/^  redis: {}$/d' \
  docker-compose.yml
-e specifies an expression to run, and sed lets us pass in multiple expressions.
If you were putting this into a script, that’s not the end of the world. In my opinion it’s a million times better than having all of that on 1 line.
Depending on your use case, this might still be a little brittle though. If you ever add a volume you need to go back and modify this sed command. That may or may not be ok. If it’s not ok, maybe you should reach for yq. We’ll see what that looks like soon.
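If the list of expressions keeps growing, another option is to move them into a sed script file and load it with -f. That gives you one expression per line with room for comments. This is only a sketch: the delete-volumes.sed file name and the sample input are hypothetical.

```shell
# Hypothetical sample file for illustration.
tmp="$(mktemp -d)"
printf '%s\n' 'web:' '  volumes:' '    - ".:/app"' 'volumes:' '  redis: {}' > "$tmp/sample.yml"

# Each line of the script file is one sed command, lines starting with # are comments.
cat > "$tmp/delete-volumes.sed" <<'EOF'
# The volumes property under web.
/^  volumes:$/d
# The bind mount, using | as the delimiter to avoid escaping the /.
\|^    - ".:/app"$|d
# The top level volumes property and its entry.
/^volumes:$/d
/^  redis: {}$/d
EOF

result="$(sed -f "$tmp/delete-volumes.sed" "$tmp/sample.yml")"
echo "$result"
```

You still need to update the script file when a volume changes, but the change ends up as a small diff in a file you can commit.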
Phase 3: Inline Editing the File
If we run sed -i instead of sed, it will overwrite the file being passed in with the new contents that have the lines deleted instead of printing them to STDOUT.
That’s what I wanted since ultimately I was running the modified file.
-i on its own works with the GNU version of sed, which is what you get on native Linux. It doesn’t work the same way out of the box on macOS because the BSD version of sed that ships there requires an argument after -i for a backup file suffix, such as sed -i '' to skip creating a backup. You can also manually install the GNU version of sed on macOS.
I was ok with that since this only ran on native Linux. If having both macOS and Linux support was a requirement and you can’t install the GNU version of sed on macOS, you can always write a little wrapper script to alter the sed command depending on what uname -o returns. There are lots of examples on Stack Overflow if you Google around.
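Here’s a minimal sketch of what such a wrapper could look like. Instead of branching on uname, it checks whether sed accepts --version, which GNU sed does and the BSD sed on macOS does not. The sed_in_place function name and the sample file are made up for illustration.

```shell
# Hypothetical helper: pick the right -i form for the installed sed.
sed_in_place() {
  if sed --version >/dev/null 2>&1; then
    sed -i "$@"      # GNU sed: -i takes no separate argument
  else
    sed -i '' "$@"   # BSD sed: -i requires a backup suffix, '' means no backup
  fi
}

# Hypothetical sample file for illustration.
tmp="$(mktemp -d)"
printf '%s\n' 'keep' 'volumes:' 'keep too' > "$tmp/sample.txt"

# The file itself is modified, nothing is printed by sed.
sed_in_place '/^volumes:$/d' "$tmp/sample.txt"
cat "$tmp/sample.txt"
```

Detecting the feature rather than the operating system also covers the case where someone installed GNU sed on macOS and put it first on their PATH.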
There we go, we have a general purpose solution using sed that works for all text files, not just YAML. That’s nice for a zero external dependency solution.
# Using yq for This Use Case
If you’re dealing with a lot of YAML and want to delete a bunch of paths then I think it’s worth at least considering yq. It works the same on Linux and macOS.
The end game solution for that one looks like this:
yq -i "del(.x-app.volumes,.x-assets.volumes,.services.postgres.volumes,.services.redis.volumes,.volumes)" docker-compose.yml
It’s pretty much a 1 liner. We only need to provide the paths of the properties we want to delete, which is less brittle than a regex. This is nice because even if we added a few new bind mounted volumes to our app they would be deleted when we delete the parent volumes property, which means we don’t need to update our yq command.
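Before reaching for -i, it can help to run the same del expression without it to preview the result. Here’s a small sketch against a made-up sample.yml file. It assumes the Go version of yq (mikefarah/yq, version 4) and skips the yq call if that isn’t what’s on your PATH.

```shell
# Hypothetical sample file for illustration.
tmp="$(mktemp -d)"
cat > "$tmp/sample.yml" <<'EOF'
services:
  web:
    image: "nginx"
    volumes:
      - ".:/app"
volumes:
  data: {}
EOF

result=""
# Only run when the Go version of yq is installed, it mentions mikefarah in --version.
if command -v yq >/dev/null 2>&1 && yq --version 2>&1 | grep -q "mikefarah"; then
  # Without -i the result is printed and the file is left untouched.
  result="$(yq 'del(.services.web.volumes, .volumes)' "$tmp/sample.yml")"
  echo "$result"
else
  echo "Go version of yq not found, skipping"
fi
```

Once the preview looks right, adding -i applies the same expression to the file itself.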
You can download yq from https://github.com/mikefarah/yq. That’s the self contained Go binary version. There’s also a Python version available at https://github.com/kislyuk/yq.
The Go version will reformat your file, for example in our case it leaves us with:
x-app: &default-app
  build:
    context: "."
    target: "app"
    args:
      - "UID=${UID:-1000}"
      - "GID=${GID:-1000}"
      - "FLASK_DEBUG=${FLASK_DEBUG:-false}"
      - "NODE_ENV=${NODE_ENV:-production}"
x-assets: &default-assets
  # ...
  volumes:
    - ".:/app"
services:
  postgres: {}
  redis: {}
There’s been a GitHub issue open for years to allow it to preserve formatting but it doesn’t look like it’ll happen. The Python version can retain formatting, but you need to pip install it and it pulls in quite a few dependencies.
For my use case the Go version was ok since I only wanted to run the newly outputted file. I didn’t commit it back to version control, but your mileage may vary.
In any case, there’s a bunch of ways to handle this. That should get you going! The demo video below goes over running the commands we covered above.
# Demo Video
Timestamps
- 0:05 – The use case
- 0:47 – Delete lines by line number
- 5:06 – Delete lines by regular expression
- 8:05 – Customize the sed delimiter
- 9:31 – Breaking up our 1 line sed command into multiple lines
- 10:29 – Maybe using yq
- 13:13 – Delete lines with yq
- 14:51 – Does anyone want to try a solution with perl?
What’s your favorite way to delete lines in a file? Let me know below.