Combine grep and sed to Recursively Replace Text in a Pattern of Files

Finding and replacing strings across files is a common task. Here's how to send a filtered list of files to xargs and sed.

Over the last 10 years I’ve linked to a number of tweets on this blog and I always used https://twitter.com, but as you know the domain changed to https://x.com.

I still call Twitter Twitter, but we can be good citizens of the web and update all backlinks to go to x.com to avoid a needless redirect on Twitter’s side.

There are a bunch of ways to do this text replacement, ranging from your code editor’s find / replace to the command line using (find | grep | rg) + xargs + (sed | perl). In this post we’ll focus on a few combos of command line tools.

I use ripgrep in Vim all the time, but on the command line I tend to use grep to do the “real” work. I still reach for ripgrep to do quick checks because its out of the box experience works nicely with less typing.

It’s funny because it reminds me of making purchases online. I have a computer and a phone. If I purchase something big I always use my computer, but I have no problem browsing around on my phone. I use grep to make purchases but ripgrep to browse.

# Exploring

Before doing any destructive action like overwriting files I like to see the results first:

rg "https://twitter.com"

The above can be done with grep too, but I have an .ignore file so ripgrep automatically ignores a bunch of directories that I would otherwise have to exclude by hand with grep.

Here’s the same thing with grep:

grep "https://twitter.com" --exclude-dir=public/ --exclude-dir=published/ --recursive .

Both commands returned 85 matches. For example:

content/blog/2021-06-08-using-ffmpeg-to-get-an-mp3s-duration-and-4-ways-to-get-the-file-size.md
89:- <https://twitter.com/nickjanetakis/status/1398668116872343552>

It’s just a list of files and where each match was found. It’s good to quickly see them.
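If you only care about totals like the 85 above, here is a minimal sketch that counts matching lines and individual matches with grep. The temp directory and file names are made up for the demo, and -F (treat the pattern as a fixed string instead of a regex) is an addition that isn't part of the commands above.

```shell
# Throwaway sandbox with hypothetical files so nothing real is touched.
demo_dir="$(mktemp -d)"
mkdir -p "${demo_dir}/content" "${demo_dir}/public"
printf '%s\n' "- <https://twitter.com/a/status/1>" "no link here" > "${demo_dir}/content/one.md"
printf '%s\n' "<https://twitter.com/b> and <https://twitter.com/c>" > "${demo_dir}/content/two.md"
printf '%s\n' "<https://twitter.com/generated>" > "${demo_dir}/public/index.html"

cd "${demo_dir}"

# Number of matching lines, skipping the excluded directory.
matching_lines="$(grep -rF --exclude-dir=public "https://twitter.com" . | wc -l)"

# Number of individual matches (-o prints each match on its own line).
total_matches="$(grep -roF --exclude-dir=public "https://twitter.com" . | wc -l)"

echo "lines: ${matching_lines}, matches: ${total_matches}"
```

The two numbers differ whenever a single line contains more than one link, which is worth knowing before comparing counts before and after a replace.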

Previewing Changes

Ripgrep has a neat feature to do replacements but it doesn’t write the changes to disk. It’s a quick way to get an idea of how something will look before you do a real replace:

rg "https://twitter.com" --replace "https://x.com"

Using the above example output, now it shows this:

content/blog/2021-06-08-using-ffmpeg-to-get-an-mp3s-duration-and-4-ways-to-get-the-file-size.md
89:- <https://x.com/nickjanetakis/status/1398668116872343552>

This goes back to using ripgrep to browse around. It’s quick and easy.
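If ripgrep isn't available, a rough equivalent preview can be sketched with sed by leaving off -i. This is only an approximation of ripgrep's output (no file names or line numbers), and the temp file below is made up for the demo.

```shell
# Throwaway file so nothing real is touched.
demo_dir="$(mktemp -d)"
printf '%s\n' "intro text" "- <https://twitter.com/a/status/1>" > "${demo_dir}/post.md"

# -n suppresses normal output and the trailing p prints only lines the substitution changed.
# There is no -i flag, so the file on disk is left alone.
preview="$(sed -n "s|https://twitter.com|https://x.com|gp" "${demo_dir}/post.md")"

echo "${preview}"
```

Since nothing is written, this is safe to run as many times as you want while tweaking the pattern.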

# Recursive Find and Replace

It’s really common to use find, xargs and sed together to do this. You can use find to produce a list of files and then perform the string replacement with sed, while xargs sits in the middle and passes each file to sed as an argument.

We’re going to do the same thing here except swap out find with grep or ripgrep. The benefit is we can do the heavy lifting to filter the list of files with a tool that’s well equipped for that such as grep.

Technically in this case find would have worked well because it’s basically only searching through a bunch of files in a specific directory but if you have more advanced filtering requirements grep can help a lot here.
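For reference, the find based version mentioned above might look something like this sketch. The directory layout and file names are hypothetical. The -print0 / -0 pair and the -i.bak backup suffix are my additions: the former keeps file names with spaces intact and the latter is a form of -i that both GNU and BSD sed accept.

```shell
# Throwaway sandbox with hypothetical files so nothing real is touched.
demo_dir="$(mktemp -d)"
mkdir -p "${demo_dir}/content/blog"
printf '%s\n' "- <https://twitter.com/a/status/1>" > "${demo_dir}/content/blog/one.md"
printf '%s\n' "nothing to replace" > "${demo_dir}/content/blog/two.md"
printf '%s\n' "<https://twitter.com/skip>" > "${demo_dir}/content/blog/notes.txt"

cd "${demo_dir}"

# find selects files by name only, so sed runs on every .md file whether it matches or not.
# -i.bak edits in place and leaves a .bak copy of each original next to it.
find content -type f -name "*.md" -print0 \
  | xargs -0 sed -i.bak "s|https://twitter.com|https://x.com|g"

cat content/blog/one.md
```

Notice that two.md gets rewritten (and backed up) even though it has nothing to replace. Filtering on file contents first avoids that.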

Getting Only Files

The first thing we need to do is only return a list of files that have matches. This is what we’ll pipe into xargs, which hands them to sed as file names. We can’t have the matched strings themselves be a part of that:

rg "https://twitter.com" --files-with-matches

Using our above example, that produces this:

content/blog/2021-06-08-using-ffmpeg-to-get-an-mp3s-duration-and-4-ways-to-get-the-file-size.md

The same can be done with grep:

# I used -R instead of --recursive to fit this on 1 line. With GNU grep, -R also follows symlinks.
grep "https://twitter.com" --exclude-dir=public/ --exclude-dir=published/ -R --files-with-matches .

Performing the Find and Replace

The basic idea is we have a newline-separated list of files coming in from either ripgrep or grep, then we use xargs to convert them into arguments which are sent to sed:

rg "https://twitter.com" --files-with-matches | xargs sed -i "s|https://twitter.com|https://x.com|g"

We used | instead of / to delimit sed’s find and replace to avoid needing to escape // a few times in both https:// references.
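One caveat with a newline-separated list is that xargs splits on any whitespace by default, so a file name with a space in it gets broken into two arguments. Here is a minimal sketch of a NUL-delimited variant. The file names are made up, it uses grep so it runs without ripgrep installed, and it assumes GNU sed for the bare -i flag.

```shell
# Throwaway sandbox with hypothetical files so nothing real is touched.
demo_dir="$(mktemp -d)"
mkdir -p "${demo_dir}/content"
printf '%s\n' "- <https://twitter.com/a/status/1>" > "${demo_dir}/content/my post.md"
printf '%s\n' "nothing to replace" > "${demo_dir}/content/other.md"

cd "${demo_dir}"

# -l lists matching files and --null ends each name with a NUL byte instead of a newline.
# xargs -0 splits on NUL, so "my post.md" reaches sed as a single argument.
# With ripgrep the producer side would be: rg --files-with-matches --null "https://twitter.com"
# Assumption: GNU sed. On macOS swap in: sed -i ""
grep -rlF --null "https://twitter.com" . \
  | xargs -0 sed -i "s|https://twitter.com|https://x.com|g"

cat "content/my post.md"
```

If there's a chance of zero matches, GNU xargs also has -r (--no-run-if-empty) so sed isn't called without any files.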

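GNU vs BSD version of sed:

The above sed -i command won’t work on macOS because it ships with BSD sed, where -i expects a backup suffix as its next argument. You can use sed -i "" instead, which passes an empty suffix so no backup file is created.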

Alternatively you can use perl -i -pe "s|old|new|g" which works in both environments. When writing shell scripts I tend to use the perl approach for that reason.

The video below runs everything to see how it all works.

# Demo Video

Timestamps

  • 0:55 – Use case
  • 1:23 – Finding strings with ripgrep
  • 2:10 – Finding strings with grep
  • 4:08 – Previewing replacements with ripgrep
  • 6:21 – Getting a list of files to replace
  • 7:08 – Getting a list of matched files with ripgrep
  • 7:37 – Getting a list of matched files with grep
  • 8:12 – Piping it into xargs and sed
  • 10:11 – Counting the results before and after
  • 11:18 – Using perl for a cross OS way to do inline replacements

What will you use this on next? Let me know below.
