Delete Lines That Match a Pattern or the Opposite Pattern Using sed

This is useful for removing unwanted lines from a file or command output, such as filtering a file and writing the result to a new file to analyze.

The TL;DR is:

  • Matching a pattern: sed "/PATTERN/d" your_file
  • Matching the opposite (invert) pattern: sed "/PATTERN/!d" your_file (notice the !)

For example, here’s how to delete all lines that don’t start with a space:

  • sed "/^ /!d" demo_file
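
To see both forms side by side, here is a minimal sketch that runs them against a throwaway file. The file contents are made up for illustration. The patterns are wrapped in single quotes because an interactive shell with history expansion enabled may try to interpret the !d inside double quotes.

```shell
# Hypothetical sample file; its contents are made up for this demo.
demo_file="$(mktemp)"
printf '%s\n' 'Matching: foo' ' indented one' 'plain line' ' indented two' > "${demo_file}"

# Delete lines that start with a space (keeps the other two lines).
without_indented="$(sed '/^ /d' "${demo_file}")"

# Delete lines that do NOT start with a space (keeps the two indented lines).
only_indented="$(sed '/^ /!d' "${demo_file}")"

printf '%s\n' "--- without indented ---" "${without_indented}"
printf '%s\n' "--- only indented ---" "${only_indented}"

rm "${demo_file}"
```

In both cases sed writes the result to standard output and leaves the original file untouched, so redirect the output to a new file if you want to keep it.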

A Real World Example

I did this recently while writing a one-off script for a client to help identify every spot in their code base that referenced CodeIgniter config items. It’s a 10+ year old code base with thousands of config item references spread across over a dozen different call styles.

So I wrote a script that used grep to scan their code base using a combination of greedy and more specific regular expression patterns.

Most config items are referenced with the pattern $this->config->item(.*), but plenty of references didn’t follow it.

I started with a very greedy match like config-> to catch as many references as possible, false positives included. Then I tightened it up. Here are a few example call style patterns:

$CI->config->item(.*)
$this->config->item(.*)
$this->ci->config->item(.*)
$this->CI->config->load(.*)
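
As a rough sketch of the greedy versus specific idea, the snippet below runs two grep searches over a throwaway directory. This is not the client script: the directory, the PHP file and the exact regular expression are assumptions made up for illustration.

```shell
# Hypothetical code base: a temp directory holding one made-up PHP file.
src_dir="$(mktemp -d)"
cat > "${src_dir}/example.php" <<'EOF'
<?php
$a = $this->config->item('hello');
$b = $CI->config->item('world');
$this->CI->config->load('nice');
// TODO: move config-> calls elsewhere (a false positive)
EOF

# Greedy match: any line containing the literal text "config->".
broad_matches="$(grep -rnF -e 'config->' "${src_dir}")"

# Specific match: only the call styles listed above (extended regex).
specific_matches="$(grep -rnE -e '\$(CI|this|this->ci|this->CI)->config->(item|load)\(' "${src_dir}")"

printf '%s\n' "--- greedy ---" "${broad_matches}"
printf '%s\n' "--- specific ---" "${specific_matches}"

rm -r "${src_dir}"
```

The greedy search also picks up the comment line, which is exactly the kind of false positive the specific patterns are meant to exclude.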

Long story short, I wrote the greedy and specific matches to separate files using a format that looks similar to this:

Matching: <INSERT_PATTERN>
     /tmp/some_file.php:100:   $this->config->item('hello');
     /tmp/some_file.php:531:   $this->config->item('world');
     /tmp/another_file.php:72: $this->CI->config->load('nice');

The real output also included a count for each pattern, but that’s not important here.

Then I used sed, sort and diff to compare both files and show the result. This let me quickly see which lines only the greedy patterns matched, so I could separate the false positives from legit references that the specific patterns had missed.

#!/usr/bin/env bash
# Usage example: ./diff-config-calls greedy_file specific_file

set -o errexit
set -o pipefail
set -o nounset

file_a="${1}"
file_b="${2}"

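# Clean up on exit: diff exits with 1 when the files differ, so with errexit a trailing rm would never run.
trap 'rm -f "${file_a}.processed" "${file_b}.processed"' EXIT

# Keep only the indented result lines (dropping the "Matching:" headers) and sort them.
sed "/^     /!d" "${file_a}" | sort > "${file_a}.processed"
sed "/^     /!d" "${file_b}" | sort > "${file_b}.processed"

diff --color --unified "${file_a}.processed" "${file_b}.processed"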

That produced an easy-to-read diff. Without writing these types of scripts, it would have been ridiculously tedious to go through over a million lines of code with thousands of results.
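
To make the comparison concrete, here is a small self-contained sketch that runs the same filter, sort and diff steps on two made-up result files. The file names and contents are hypothetical and only mimic the format shown earlier.

```shell
work_dir="$(mktemp -d)"

# Made-up "greedy" results, including one false positive (a comment).
cat > "${work_dir}/greedy" <<'EOF'
Matching: config->
     /tmp/some_file.php:100:   $this->config->item('hello');
     /tmp/some_file.php:205:   // old config-> notes
     /tmp/another_file.php:72: $this->CI->config->load('nice');
EOF

# Made-up "specific" results, containing only real call styles.
cat > "${work_dir}/specific" <<'EOF'
Matching: $this->config->item(.*)
     /tmp/some_file.php:100:   $this->config->item('hello');
Matching: $this->CI->config->load(.*)
     /tmp/another_file.php:72: $this->CI->config->load('nice');
EOF

# Keep only the indented result lines and sort them so diff compares like with like.
sed '/^     /!d' "${work_dir}/greedy" | sort > "${work_dir}/greedy.processed"
sed '/^     /!d' "${work_dir}/specific" | sort > "${work_dir}/specific.processed"

# diff exits with 1 when the files differ, so "|| true" keeps going either way.
diff_output="$(diff -u "${work_dir}/greedy.processed" "${work_dir}/specific.processed" || true)"

# Lines starting with a single "-" exist only in the greedy results.
only_in_greedy="$(printf '%s\n' "${diff_output}" | sed '/^-[^-]/!d')"
printf '%s\n' "${only_in_greedy}"

rm -r "${work_dir}"
```

The last sed call reuses the same inverted delete to keep only the lines that diff marked as missing from the specific results, which are the candidates to review by hand.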

The next step was programmatically converting those CodeIgniter config references into Laravel, since this client was switching to Laravel one component at a time. That goes beyond the scope of this post, but it’s why I wanted to remove all false positives and ensure I caught all of the legit config references.

The video below goes into running the TL;DR examples.

Demo Video

Timestamps

  • 0:09 – Going over the TL;DR example
  • 1:12 – A real world use case

What use cases have you applied this to in the past? Let me know below.
