Updated on March 3, 2020 in #linux

Using sed Range Patterns, grep and tr to Parse a Changelog File


In this video, we'll pipe together a few Unix tools to pull the changes for a specific release out of a Markdown-based changelog file.


I’m a big fan of using the command line to solve problems that involve parsing text from files. By the end of this video you’ll know how to break down text parsing problems and how to combine a few Unix tools to solve them.

In this specific video, the goal is to parse a CHANGELOG.md file so that you can input a specific release tag and get back a list of the bullet points associated with that release. This information could then be sent to Slack, email or wherever you want as part of a CI / CD pipeline.

Since we’ll cover both the “why” and the “how”, you’ll see how to apply the same strategies to your own text-related problems, not just the one in the video. Although who knows, you might wind up wanting to do the same thing I’m doing here with your own changelog file.

# Building Up the Command Pipeline

The Command

# Original script used in the video, which has a subtle bug when using tr.
sed -n "/^## v1.9.2$/,/^## /p" CHANGELOG.md \
  | grep -E "^(-|\s+)" \
  | tr "-" ">" \
  | sed ":a $!N;s/\n[ \t]\+/ /;ta P;D"

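# A revised version of the script that is stricter about replacing - with >.
# It only replaces - when the line starts with it, which prevents replacing a
# hyphen that happens to exist in the middle of the line or anywhere else in
# the bulleted item. It also uses single quotes so the shell doesn't expand $!
# inside the sed scripts, and escapes the dots in the version so they only
# match literal dots.
sed -n '/^## v1\.9\.2$/,/^## /p' CHANGELOG.md \
  | grep -E '^(-|\s+)' \
  | sed 's/^-/>/' \
  | sed ':a $!N;s/\n[ \t]\+/ /;ta P;D'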
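
If you want to pass the release tag in as an argument (for example from a CI job), here's a minimal sketch of one way to wrap the revised pipeline in a function. The `release_notes` name, the sample changelog and the temp file are all made up for illustration and aren't from the video. It assumes GNU sed and GNU grep, and that release tags don't contain a `/`.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not from the video): print the bullets for one release
# as Markdown quotes. Assumes headings look like "## <tag>".
release_notes() {
  local tag="$1" file="${2:-CHANGELOG.md}"

  # Escape dots so they only match literal dots in the heading.
  local pattern="${tag//./\\.}"

  sed -n "/^## ${pattern}\$/,/^## /p" "$file" \
    | grep -E '^(-|\s+)' \
    | sed 's/^-/>/' \
    | sed ':a $!N;s/\n[ \t]\+/ /;ta P;D'
}

# Made-up sample data, only here so the function has something to parse.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
# Changelog

## v1.9.3

- Fix typo in README

## v1.9.2

- Add dark-mode toggle to the settings page
- Rewrite the importer so that very large files
  no longer time out
- Update dependencies

## v1.9.1

- Initial release
EOF

release_notes v1.9.2 "$sample"
# Prints:
# > Add dark-mode toggle to the settings page
# > Rewrite the importer so that very large files no longer time out
# > Update dependencies
```

The hyphen in "dark-mode" survives because only a leading `-` is replaced, and the hard wrapped bullet ends up on a single line.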

Timestamped Table of Contents

1:10 – First order of business? Break down the problem and find patterns
3:45 – Thinking about edge cases
4:59 – Downloading the CHANGELOG file to follow along if you want
5:57 – Beginning to solve the problem by using a sed range pattern
10:49 – Using grep to only get lines that are bullet points
11:37 – Using tr to transform bullet points into Markdown quotes
12:36 – Dealing with bullets that are hard wrapped into multiple lines
14:28 – Introducing an alien sed command to pull up non-hyphen lines
16:41 – The command line is super helpful for solving various text problems
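
To make the steps in the outline above a bit more concrete, here's a rough sketch that runs each stage on its own and keeps the intermediate output around so you can inspect it. The sample changelog and the variable names are invented for this example (it's not the file from the video), and it assumes GNU sed and GNU grep.

```shell
#!/usr/bin/env bash

# Made-up sample changelog for illustration only.
changelog="$(mktemp)"
cat > "$changelog" <<'EOF'
## v2.0.0

- Drop legacy API

## v1.9.2

- Add dark-mode toggle
- Rewrite the importer so large files
  no longer time out

## v1.9.1

- Initial release
EOF

# Stage 1: the range pattern prints from the v1.9.2 heading through the next
# "## " heading, so both heading lines are still in the output.
stage1="$(sed -n '/^## v1\.9\.2$/,/^## /p' "$changelog")"

# Stage 2: keep only bullets and their indented continuation lines.
stage2="$(printf '%s\n' "$stage1" | grep -E '^(-|\s+)')"

# Stage 3a: tr swaps every hyphen, including the one in "dark-mode".
with_tr="$(printf '%s\n' "$stage2" | tr '-' '>')"

# Stage 3b: an anchored substitution only touches the leading hyphen.
with_sed="$(printf '%s\n' "$stage2" | sed 's/^-/>/')"

# Stage 4: pull indented continuation lines up onto the bullet above them.
final="$(printf '%s\n' "$with_sed" | sed ':a $!N;s/\n[ \t]\+/ /;ta P;D')"

printf '%s\n' "$final"
```

Echoing `$with_tr` next to `$with_sed` shows the subtle `tr` bug: the first bullet comes out as `> Add dark>mode toggle` with `tr`, while the anchored `sed` version leaves `dark-mode` alone.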


What type of problems have you solved with sed and grep? Let me know below.
