Using Diff, Process Substitution and Pipes to Solve a Real Problem
In this video, we'll use diff, find, sed, grep, cut and sort together to write a custom shell script that checks Jekyll tags.
In this specific case, I wanted to make sure that the 150+ tags referenced on my Running in Production podcast site all map back to valid tag files on disk. The script helped me find a few tags that would have returned a 404 due to typos, as well as a few tags with missing information.
After watching this video you’ll be able to apply the same strategies to your own use cases and issues. That’s the beauty of using Unix pipes. You can combine a bunch of independent tools to solve unique problems as you come across them.
# Going Over Everything
Timestamps
- 1:06 – It’s so easy to make a typo or forget a tag name
- 2:51 – Creating a Shell script to automate looking for tag mismatches
- 3:21 – The TL;DR on how the diff command works
- 3:41 – Using Bash Process Substitution to diff the output of 2 commands
- 5:27 – High level overview of the tools used to compare tags in 2 different ways
- 6:03 – Using sed range patterns
- 7:03 – Breaking down each piped tool on the command line (sed, grep, cut, sort)
- 9:03 – Using basename to get the file name without extension for many files
- 10:44 – The end result is having 2 lists of sorted tags to diff
- 10:48 – Evaluating the exit code of a command to determine if the tags match
- 11:52 – More Shell scripting to make sure each tag has every property filled out
- 12:57 – Make it work, make it nice and then make it fast
Code
We go over more than what’s included below, but here’s the snippet for diffing the output of 2 commands. The rest is on GitHub:
diff --color -u <(for file in $(find _posts -type f)
do
  sed -n "/^tags:$/,/^title:/p" "${file}" | grep "^ - " | cut -d '"' -f 2
done | sort -u) <(basename -s .md tags/*.md | sort -u)
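To try the idea without touching a real site, here is a minimal sketch that wraps the same pipeline in functions and acts on diff's exit code. It is not the script from the repo: the function names (`post_tags`, `disk_tags`, `check_tags`), the throwaway directory and the sample post are all made up for illustration. It also drops `--color` for portability and loosens the grep pattern to accept any amount of leading indentation, which is an assumption about how the front matter is formatted.

```shell
#!/usr/bin/env bash
# Hypothetical demo: the directory layout, file names and function names
# below are made up for illustration and are not taken from the real site.

# Tags referenced by posts: keep the front matter lines between "tags:" and
# "title:", keep only the list items, then take the text inside the quotes.
post_tags() {
  local file
  find "${1}" -type f | while IFS= read -r file; do
    sed -n '/^tags:$/,/^title:/p' "${file}" | grep -E '^ +- ' | cut -d '"' -f 2
  done | sort -u
}

# Tags that exist on disk: one .md file per tag, with the extension stripped.
disk_tags() {
  basename -s .md "${1}"/*.md | sort -u
}

# Compare both sorted lists using process substitution and report the result.
check_tags() {
  if diff -u <(post_tags "${1}") <(disk_tags "${2}"); then
    echo "All tags match"
  else
    echo "Tag mismatch found" >&2
    return 1
  fi
}

# Build a throwaway site with one deliberate typo ("dokcer").
demo="$(mktemp -d)"
mkdir -p "${demo}/_posts" "${demo}/tags"
printf -- '---\ntags:\n  - "dokcer"\n  - "flask"\ntitle: "Example"\n---\n' \
  > "${demo}/_posts/example.md"
touch "${demo}/tags/docker.md" "${demo}/tags/flask.md"

# Capture the exit code without aborting if the script runs under "set -e".
status=0
check_tags "${demo}/_posts" "${demo}/tags" || status="${?}"
```

diff exits with 0 when its inputs are identical, 1 when they differ and 2 when something went wrong, so the `if` treats anything other than 0 as a mismatch. With the sample data above, the unified diff should flag `dokcer` on one side and `docker` on the other, and `status` ends up non-zero.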
Reference links
- https://runninginproduction.com
- https://nickjanetakis.com/blog/displaying-database-results-across-multiple-columns-with-1-line-of-css
- https://nickjanetakis.com/blog/diff-selections-files-directories-and-git-history-with-vim
- https://www.gnu.org/software/bash/manual/bash.html#Process-Substitution
- https://nickjanetakis.com/blog/using-sed-range-patterns-grep-and-tr-to-parse-a-changelog-file
- https://github.com/nickjj/runninginproduction.com
What types of problems have you solved with these tools? Let me know below!