Deleting Specific Lines in a File with sed or yq

We'll go over deleting 1 or more lines that match a regex as well as deleting specific lines by number reference.

Prefer video? Here’s a demo video going over what’s listed below in a bit more detail.

Recently I found myself wanting to delete a number of Docker volume related lines from a docker-compose.yml file.

Ultimately I ended up using yq because I was dealing with YAML and it was the most maintainable and least brittle solution. That said, sed is a general purpose tool that isn’t limited to YAML and it was a fun exercise in using it. We’ll cover both solutions.

Going Over the Use Case

Here’s a portion of the Docker Compose YAML file that I was dealing with. The whole file can be found in the example app on GitHub:

x-app: &default-app
  build:
    context: "."
    target: "app"
    args:
      - "UID=${UID:-1000}"
      - "GID=${GID:-1000}"
      - "FLASK_DEBUG=${FLASK_DEBUG:-false}"
      - "NODE_ENV=${NODE_ENV:-production}"
  # ...
  volumes:
    - "${DOCKER_WEB_VOLUME:-./public:/app/public}"

x-assets: &default-assets
  # ...
  volumes:
    - ".:/app"

services:
  postgres:
    # ...
    volumes:
      - "postgres:/var/lib/postgresql/data"

  redis:
    # ...
    volumes:
      - "redis:/data"

volumes:
  postgres: {}
  redis: {}

The goal here is to remove all of the volumes properties as well as each volume line within that property.

Using sed

sed can be used for a lot of things such as finding and replacing text in a file. It also supports being able to delete lines.

We can choose to delete lines by line number or by regex pattern.

Deleting by number is much shorter but it’s also more brittle because if the file changes then the line numbers will get shifted and it will start deleting the wrong lines.

Deleting by regex requires more upfront work but it holds up better. If you refactor the file everything will continue to work as long as you’re not modifying your volumes, in which case you’ll need to remember to update your regex patterns.
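Here’s a tiny sketch of both approaches side by side. It runs against a throwaway file with made up contents (not the Compose file above) so you can see what each form does before pointing it at something real:

```shell
# Build a small throwaway file to experiment on (made up contents).
demo_file="$(mktemp)"
printf '%s\n' "one" "two" "three" "four" > "${demo_file}"

# Delete by line number: drops line 2 only.
by_number="$(sed '2d' "${demo_file}")"

# Delete by regex: drops every line that starts with "t".
by_regex="$(sed '/^t/d' "${demo_file}")"

printf 'By number:\n%s\n\nBy regex:\n%s\n' "${by_number}" "${by_regex}"

rm -f "${demo_file}"
```

Neither command touches the file itself, the result is only printed.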

Phase 1: Deleting by line

If we use the partial YAML file above we can delete the first instance of volumes with sed "11d" docker-compose.yml since that’s line 11.

You can also delete a range of lines such as sed "11,12d" docker-compose.yml which will delete both the volumes line and - "${DOCKER_WEB_VOLUME:-./public:/app/public}" under it which is exactly what we want.
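If you’re not sure which numbers to use, one way to double check is to look them up with grep -n and preview the range with sed -n before deleting anything. This is only a sketch using a shortened, made up version of the file, so the line numbers here (4 and 5) are different from the ones above:

```shell
# A shortened stand-in for the Compose file (made up for this example).
sample="$(mktemp)"
cat > "${sample}" <<'EOF'
x-app: &default-app
  build:
    context: "."
  volumes:
    - "./public:/app/public"

services:
  web: {}
EOF

# Step 1: find the line numbers of the lines we care about.
matches="$(grep -n 'volumes:' "${sample}")"

# Step 2: preview the range. -n silences normal output and p prints the range.
preview="$(sed -n '4,5p' "${sample}")"

# Step 3: delete the same range once the preview looks right.
result="$(sed '4,5d' "${sample}")"

printf 'Matches:\n%s\n\nPreview:\n%s\n\nResult:\n%s\n' \
  "${matches}" "${preview}" "${result}"

rm -f "${sample}"
```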

sed also supports having multiple expressions if you separate them with: ;

That allows us to delete everything we want with sed "11,12d;16,17d;22,23d;27,28d;30,32d" docker-compose.yml. This isn’t the easiest thing in the world to maintain. There are a lot of hard coded line numbers that could easily shift if you made a minor adjustment to your source file.

But it does leave us with what we want:

x-app: &default-app
  build:
    context: "."
    target: "app"
    args:
      - "UID=${UID:-1000}"
      - "GID=${GID:-1000}"
      - "FLASK_DEBUG=${FLASK_DEBUG:-false}"
      - "NODE_ENV=${NODE_ENV:-production}"
  # ...

x-assets: &default-assets
  # ...

services:
  postgres:
    # ...

  redis:
    # ...

Phase 2: Deleting by regex

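sed "/^  volumes:$/d" docker-compose.yml deletes the volumes: line of the first volumes property (along with any other line that is exactly volumes: at a 2 space indent, such as the one under x-assets). That isn’t too bad if you have 1 or 2 lines to delete.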

But it starts to get a little unwieldy when you have a bunch of lines with regex patterns that are a little involved.

Here’s what deleting all of the lines we want looks like:

sed '/^  volumes:$/d;/^    - "${DOCKER_WEB_VOLUME:.*"$/d;/^  volumes:$/d;/^    - ".:\/app"$/d;/^    volumes:$/d;/^      - "postgres:\/var\/lib\/postgresql\/data"$/d;/^    volumes:$/d;/^      - "redis:\/data"$/d;/^volumes:$/d;/^  postgres: {}$/d;/^  redis: {}$/d' docker-compose.yml

If you carefully look at all of that, it’s capturing specific lines of text that match a regex. The regex patterns are including a lot of each line’s text to avoid capturing false positives. We’re trading verbosity for correctness.
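To make the false positive risk a bit more concrete, here’s a small sketch comparing a loose pattern with an anchored one. The sample file and its comment line are made up for the example:

```shell
# Made up sample where the word "volumes:" also shows up in a comment.
sample="$(mktemp)"
cat > "${sample}" <<'EOF'
services:
  web:
    # volumes: are handled elsewhere
    volumes:
      - ".:/app"
EOF

# Loose pattern: matches anywhere in the line, so the comment is deleted too.
loose="$(sed '/volumes:/d' "${sample}")"

# Anchored pattern: the whole line must be 4 spaces followed by "volumes:".
strict="$(sed '/^    volumes:$/d' "${sample}")"

printf 'Loose:\n%s\n\nStrict:\n%s\n' "${loose}" "${strict}"

rm -f "${sample}"
```

The anchored version keeps the comment and only drops the property line, which is the trade-off mentioned above.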

You can technically clean this up a little bit by using a custom sed delimiter so you don’t have to escape the forward slashes in certain paths:

sed '/^  volumes:$/d;/^    - "${DOCKER_WEB_VOLUME:.*"$/d;/^  volumes:$/d;\|^    - ".:/app"$|d;/^    volumes:$/d;\|^      - "postgres:/var/lib/postgresql/data"$|d;/^    volumes:$/d;\|^      - "redis:/data"$|d;/^volumes:$/d;/^  postgres: {}$/d;/^  redis: {}$/d' docker-compose.yml

A few of them are using | as the delimiter since they have paths with a /. To do that you need to prefix the opening delimiter with \. That’s why we see \|^    - ".:/app"$|d.
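A quick way to convince yourself that both forms behave the same is to run them against the same input and compare. This sketch uses a made up 3 line file:

```shell
# Made up sample with a path that contains forward slashes.
sample="$(mktemp)"
printf '%s\n' 'volumes:' '  - "redis:/data"' '  - ".:/app"' > "${sample}"

# Default delimiter: the / inside the path has to be escaped.
escaped="$(sed '/^  - "redis:\/data"$/d' "${sample}")"

# Custom delimiter: \| opens the pattern and | closes it, no escaping needed.
custom="$(sed '\|^  - "redis:/data"$|d' "${sample}")"

printf 'Escaped:\n%s\n\nCustom:\n%s\n' "${escaped}" "${custom}"

rm -f "${sample}"
```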

That still leaves a lot to be desired in readability and being able to hop in to modify one of those expressions. Fortunately we can break things up into multiple lines as seen below:

sed \
  -e '/^  volumes:$/d' \
  -e '/^    - "${DOCKER_WEB_VOLUME:.*"$/d' \
  -e '/^  volumes:$/d' \
  -e '\|^    - ".:/app"$|d' \
  -e '/^    volumes:$/d' \
  -e '\|^      - "postgres:/var/lib/postgresql/data"$|d' \
  -e '/^    volumes:$/d' \
  -e '\|^      - "redis:/data"$|d' \
  -e '/^volumes:$/d' \
  -e '/^  postgres: {}$/d' \
  -e '/^  redis: {}$/d' \
  docker-compose.yml

-e supplies an expression to run and sed lets us pass -e as many times as we want.

If you were putting this into a script, that’s not the end of the world. In my opinion it’s a million times better than having all of that on 1 line.
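Another option for scripts is moving the expressions into their own file and loading it with sed -f, which also lets you leave comments next to each pattern. This is only a sketch. The sample file is made up and it only covers the top level volumes block:

```shell
# Hypothetical sed script file: one expression per line, # starts a comment.
script_file="$(mktemp)"
cat > "${script_file}" <<'EOF'
# Top level named volumes block.
/^volumes:$/d
/^  postgres: {}$/d
/^  redis: {}$/d
EOF

# Made up stand-in for the Compose file.
sample="$(mktemp)"
cat > "${sample}" <<'EOF'
services:
  web:
    image: "example"

volumes:
  postgres: {}
  redis: {}
EOF

# -f reads the expressions from the script file instead of the command line.
result="$(sed -f "${script_file}" "${sample}")"
printf '%s\n' "${result}"

rm -f "${script_file}" "${sample}"
```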

Depending on your use case, this might still be a little brittle tho. If you ever add a volume you need to go back and modify this sed command. That may or may not be ok. If it’s not ok, maybe you should reach for yq. We’ll see what that looks like soon.

Phase 3: Inline Editing the File

If we run sed -i instead of sed it will overwrite the file being passed in with the new contents that have the lines deleted instead of printing it to STDOUT.

That’s what I wanted since ultimately I was running the modified file.
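If overwriting the file makes you a little nervous, one option is to attach a backup suffix to -i so the original is kept next to the edited file. As far as I know the attached form (-i.bak) is accepted by both GNU and BSD sed. This sketch uses a throwaway file and a .bak suffix picked just for the example:

```shell
# Throwaway file with made up contents.
demo_file="$(mktemp)"
printf '%s\n' 'services:' 'volumes:' > "${demo_file}"

# Edit in place and keep the original as <file>.bak.
sed -i.bak '/^volumes:$/d' "${demo_file}"

edited="$(cat "${demo_file}")"
backup="$(cat "${demo_file}.bak")"

printf 'Edited:\n%s\n\nBackup:\n%s\n' "${edited}" "${backup}"

rm -f "${demo_file}" "${demo_file}.bak"
```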

-i on its own works with the GNU version of sed, which is what you get on native Linux. It doesn’t work the same way on macOS out of the box because the BSD version of sed that ships there expects a backup suffix argument after -i (it can be an empty string). You can also manually install the GNU version of sed on macOS.

I was ok with that since this only ran on native Linux. If supporting both macOS and Linux is a requirement and you can’t install the GNU version of sed on macOS, you can always write a little wrapper script to alter the sed command depending on what uname -o returns. There are lots of examples on Stack Overflow if you Google around.
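Here’s a rough sketch of what such a wrapper could look like. Note that it swaps the uname check for probing sed --version, on the assumption that GNU sed accepts that flag and BSD sed rejects it, which also covers the case where GNU sed was installed on macOS. The sed_in_place function name is made up for this example:

```shell
# Hypothetical helper: edit a file in place with whichever sed is installed.
sed_in_place() {
  if sed --version >/dev/null 2>&1; then
    # GNU style sed: -i works without a separate suffix argument.
    sed -i "$@"
  else
    # BSD sed (macOS default): -i needs a suffix argument, '' means no backup.
    sed -i '' "$@"
  fi
}

# Try it on a throwaway file with made up contents.
demo_file="$(mktemp)"
printf '%s\n' 'keep' 'volumes:' 'keep too' > "${demo_file}"

sed_in_place '/^volumes:$/d' "${demo_file}"

in_place_result="$(cat "${demo_file}")"
printf '%s\n' "${in_place_result}"

rm -f "${demo_file}"
```

All arguments are passed straight through, so multiple -e expressions work the same way as before.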

There we go, we have a general purpose solution using sed that works for all text files, not just YAML. That’s nice for a zero external dependency solution.

Using yq for This Use Case

If you’re dealing with a lot of YAML and want to delete a bunch of paths then I think it’s worth at least considering yq. It works the same on Linux and macOS.

The end game solution for that one looks like this:

yq -i "del(.x-app.volumes,.x-assets.volumes,.services.postgres.volumes,.services.redis.volumes,.volumes)" docker-compose.yml

It’s pretty much a 1 liner. We only need to provide the paths of the properties we want to delete which is less brittle than a regex. This is nice because even if we added a few new bind mounted volumes to our app they would be deleted when we delete the parent volumes property which means we don’t need to update our yq command.
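Before reaching for -i you may want to preview what yq would change. The sketch below assumes the Go version of yq is on your PATH and skips the preview if no yq is found. It runs against a small, made up Compose file and pipes the result into diff instead of overwriting anything:

```shell
# Small stand-in for the Compose file (made up contents).
compose_file="$(mktemp)"
cat > "${compose_file}" <<'EOF'
services:
  redis:
    image: "redis"
    volumes:
      - "redis:/data"
volumes:
  redis: {}
EOF

if command -v yq >/dev/null 2>&1; then
  # Without -i the result goes to STDOUT, so the file itself is untouched.
  # diff exits non-zero when it finds differences, hence the "|| true".
  yq 'del(.services.redis.volumes, .volumes)' "${compose_file}" \
    | diff "${compose_file}" - || true
else
  echo "yq is not installed, skipping the preview"
fi

# The original file still has all of its lines.
original_lines="$(wc -l < "${compose_file}" | tr -d ' ')"
echo "Lines in the original file: ${original_lines}"

rm -f "${compose_file}"
```

Once the diff looks right you can run the same expression with -i.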

You can download yq from https://github.com/mikefarah/yq. That’s the self contained Go binary version. There’s also a Python version available at https://github.com/kislyuk/yq.

The Go version will reformat your file, for example in our case it leaves us with:

x-app: &default-app
  build:
    context: "."
    target: "app"
    args:
      - "UID=${UID:-1000}"
      - "GID=${GID:-1000}"
      - "FLASK_DEBUG=${FLASK_DEBUG:-false}"
      - "NODE_ENV=${NODE_ENV:-production}"
x-assets: &default-assets
  # ...
  volumes:
    - ".:/app"
services:
  postgres: {}
  redis: {}

There’s been a GitHub issue open for years to allow it to preserve formatting but it doesn’t look like it’ll happen. The Python version allows you to retain formatting but it also means you need to pip install it since it has quite a few dependencies.

For my use case the Go version was ok since I only wanted to run the newly outputted file, I didn’t commit it back to version control but your mileage may vary.

In any case, there’s a bunch of ways to handle this. That should get you going! The demo video below goes over running the commands we covered above.

Demo Video

Timestamps

  • 0:05 – The use case
  • 0:47 – Delete lines by line number
  • 5:06 – Delete lines by regular expression
  • 8:05 – Customize the sed delimiter
  • 9:31 – Breaking up our 1 line sed command into multiple lines
  • 10:29 – Maybe using yq
  • 13:13 – Delete lines with yq
  • 14:51 – Does anyone want to try a solution with perl?

What’s your favorite way to delete lines in a file? Let me know below.
