Delete Lines That Match a Pattern or the Opposite Pattern Using sed

This is useful for removing unwanted lines from a file or from a command's output, for example when you want to process a file and write out a trimmed-down copy to analyze.
The TL;DR is:
- Matching a pattern:
sed "/PATTERN/d" your_file
- Matching the opposite (inverted) pattern:
sed "/PATTERN/!d" your_file (notice the !)
For example, here’s how to delete all lines that don’t start with a space:
sed "/^ /!d" demo_file
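As a quick sanity check, here's a minimal sketch that runs both forms against a throwaway file. The file contents and variable names are made up for illustration. It uses single quotes because, in an interactive Bash session, a `!` inside double quotes can trigger history expansion.

```shell
#!/usr/bin/env bash
set -o errexit
set -o nounset

# Hypothetical sample data: two indented lines and two non-indented lines.
demo_file="$(mktemp)"
printf '%s\n' 'header one' '  indented a' 'header two' '  indented b' > "${demo_file}"

# Delete the lines that match: only lines NOT starting with a space remain.
without_indented="$(sed '/^ /d' "${demo_file}")"

# Delete the lines that do NOT match: only lines starting with a space remain.
only_indented="$(sed '/^ /!d' "${demo_file}")"

printf 'After /^ /d:\n%s\n\n' "${without_indented}"
printf 'After /^ /!d:\n%s\n' "${only_indented}"

rm "${demo_file}"
```

For a plain filter like this, `grep -v '^ '` and `grep '^ '` would give the same two results. The sed form is handy when the delete is one step among other sed commands.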
# A Real World Example
I did this recently while writing a one-off script for a client to help identify every spot in their code base that referenced CodeIgniter config items. It's a 10+ year old code base with thousands of config item references written in over a dozen different call styles.
So I wrote a script that used grep to scan their code base using a
combination of greedy and more specific regular expression patterns.
Most config items are referenced with this pattern $this->config->item(.*)
but in practice there were over a dozen different call styles.
I started with a very greedy match like config-> to catch as many references
as possible, false positives included. Then I tightened it up. Here are a few
example call style patterns:
$CI->config->item(.*)
$this->config->item(.*)
$this->ci->config->item(.*)
$this->CI->config->load(.*)
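To give a feel for the greedy vs specific split, here's a rough sketch using `grep -E` against a tiny sample file. The sample PHP lines and both patterns are assumptions for illustration, not the exact patterns from the client script.

```shell
#!/usr/bin/env bash
set -o errexit
set -o nounset

# Hypothetical sample source file standing in for the real code base.
sample="$(mktemp)"
cat > "${sample}" <<'EOF'
$a = $this->config->item('hello');
$b = $CI->config->item('world');
$this->CI->config->load('nice');
$c = $other->config->something();
EOF

# Greedy: anything that touches "config->", false positives included.
greedy_pattern='config->'

# Specific: only the known call styles (an assumed, simplified pattern).
specific_pattern='\$(this|CI|this->ci|this->CI)->config->(item|load)\('

# -c prints the number of matching lines instead of the lines themselves.
greedy_count="$(grep -cE -- "${greedy_pattern}" "${sample}")"
specific_count="$(grep -cE -- "${specific_pattern}" "${sample}")"

echo "greedy: ${greedy_count}, specific: ${specific_count}"

rm "${sample}"
```

With this sample the greedy pattern matches all 4 lines and the specific one matches 3, so the gap between the two is the single line worth a manual look.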
Long story short, I wrote both the greedy and specific matches to separate files using a format similar to this, with each match indented under its pattern:
Matching: <INSERT_PATTERN>
  /tmp/some_file.php:100: $this->config->item('hello');
  /tmp/some_file.php:531: $this->config->item('world');
  /tmp/another_file.php:72: $this->CI->config->load('nice');
In reality the output also included the counts for each pattern, but that's not important here.
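Here's one way a report in that format could be generated. This is a sketch, not the original script: the `report_matches` function name, the sample file and the two-space indent are all assumptions. The indent matters because the sed step in the script further down keeps only lines that start with a space.

```shell
#!/usr/bin/env bash
set -o errexit
set -o nounset

# Hypothetical helper: print a header, then each match indented by two spaces.
# Arguments: $1 is the grep -E pattern, $2 is the file to search.
report_matches() {
  printf 'Matching: %s\n' "${1}"
  # -H always prints the file name, -n adds the line number.
  # The pipeline's exit status is sed's, so a grep with no matches (exit 1)
  # does not trip errexit here.
  grep -HnE -- "${1}" "${2}" | sed 's/^/  /'
}

# Hypothetical sample file with a single config reference on line 2.
sample="$(mktemp)"
printf '%s\n' '<?php' "\$this->config->item('hello');" 'echo 1;' > "${sample}"

report="$(report_matches 'config->item\(' "${sample}")"
echo "${report}"

rm "${sample}"
```

Pointing `grep -r` at a directory instead of a single file would produce the same shape of output for a whole code base.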
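Then I used sed, sort and diff to compare both files and show the result.
The sed command strips the Matching: header lines so only the indented match
lines get compared. This let me quickly see the difference between the greedy
and specific matches, so I could separate the false positives from the legit
references.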
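#!/usr/bin/env bash
# Usage example: ./diff-config-calls greedy_file specific_file
set -o errexit
set -o pipefail
set -o nounset
file_a="${1}"
file_b="${2}"
# Clean up on exit. diff exits with 1 when the files differ, so with errexit
# set a trailing rm would never run in the case this script exists for.
trap 'rm -f "${file_a}.processed" "${file_b}.processed"' EXIT
# Keep only the lines that start with a space (the matches), then sort them.
sed "/^ /!d" "${file_a}" | sort > "${file_a}.processed"
sed "/^ /!d" "${file_b}" | sort > "${file_b}.processed"
diff --color --unified "${file_a}.processed" "${file_b}.processed"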
That produced an easy-to-read diff. Without these types of scripts it would have been ridiculously tedious to go through over a million lines of code with thousands of results.
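If you only care about the lines the greedy pass found that the specific pass did not, `comm -23` on the sorted files is an alternative to reading a full diff. This is a swapped-in technique rather than what the script above does, and the two report files below are made-up sample data.

```shell
#!/usr/bin/env bash
set -o errexit
set -o nounset

work_dir="$(mktemp -d)"

# Hypothetical greedy report: includes one false positive.
cat > "${work_dir}/greedy" <<'EOF'
Matching: config->
  /tmp/some_file.php:100: $this->config->item('hello');
  /tmp/some_file.php:200: $other->config->something();
EOF

# Hypothetical specific report.
cat > "${work_dir}/specific" <<'EOF'
Matching: $this->config->item(.*)
  /tmp/some_file.php:100: $this->config->item('hello');
EOF

# Same preprocessing as the script above: drop headers, keep match lines, sort.
sed '/^ /!d' "${work_dir}/greedy" | sort > "${work_dir}/greedy.processed"
sed '/^ /!d' "${work_dir}/specific" | sort > "${work_dir}/specific.processed"

# -23 hides lines unique to the second file and lines common to both,
# leaving only the lines that appear in the greedy file alone.
greedy_only="$(comm -23 "${work_dir}/greedy.processed" "${work_dir}/specific.processed")"
echo "${greedy_only}"

rm -r "${work_dir}"
```

Here the only line left is the `$other->config->something()` reference, which is the one to review by hand as either a false positive or a call style the specific patterns missed.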
The next step was programmatically converting those CodeIgniter config references into Laravel, since this client was switching to Laravel one component at a time. That goes beyond the scope of this post, but it's why I wanted to remove all false positives and ensure I caught all of the legit config references.
The video below goes into running the TL;DR examples.
# Demo Video
Timestamps
- 0:09 – Going over the TL;DR example
- 1:12 – A real world use case
What use cases have you applied this to in the past? Let me know below.