Combine grep and sed to Recursively Replace Text in a Pattern of Files
Finding and replacing strings across files is a common task. Here's how to send a filtered list of files to xargs and sed.
Over the last 10 years I’ve linked to a number of tweets in this blog and I always used https://twitter.com but as you know their link changed to https://x.com.
I still call Twitter Twitter but we can be good citizens of the web and update all backlinks to go to x.com to avoid a needless redirect on Twitter’s side.
There are a bunch of ways to do this text replacement, ranging from your code editor’s find / replace to the command line using (find | grep | rg) + xargs + (sed | perl). In this post we’ll focus on a few combos of command line tools.
I use ripgrep in Vim all the time. On the command line I tend to use grep to do the “real” work, but I sometimes use ripgrep for quick checks because its out-of-the-box experience works nicely with less typing.
It’s funny because it reminds me of making purchases online. I have a computer and a phone. If I purchase something big I always use my computer, but I have no problem browsing around on my phone. I use grep to make purchases but ripgrep to browse.
# Exploring
Before doing any destructive action like writing new files I like to see the results first:
rg "https://twitter.com"
The above can be done with grep too, but I have an .ignore file to have ripgrep automatically ignore a bunch of directories that I would have had to exclude with grep.
Here’s the same thing with grep:
grep "https://twitter.com" --exclude-dir=public/ --exclude-dir=published/ --recursive .
Both commands returned 85 matches. For example:
content/blog/2021-06-08-using-ffmpeg-to-get-an-mp3s-duration-and-4-ways-to-get-the-file-size.md
89:- <https://twitter.com/nickjanetakis/status/1398668116872343552>
It’s just a list of files and where each match was found. It’s good to quickly see them.
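Counting the matches is a handy sanity check to run before and after a replace. Here’s a minimal sketch of that with grep. The `demo_dir` scratch directory and its sample files are made up for illustration, so it’s safe to run anywhere:

```shell
# Scratch directory with made-up sample files (not the real blog content).
demo_dir="$(mktemp -d)"
mkdir -p "$demo_dir/content" "$demo_dir/public"
printf '%s\n' '- <https://twitter.com/example/status/1>' \
  '- <https://twitter.com/example/status/2>' > "$demo_dir/content/a.md"
printf '%s\n' 'no links here' > "$demo_dir/content/b.md"
printf '%s\n' '<a href="https://twitter.com/example">x</a>' > "$demo_dir/public/a.html"

# --fixed-strings treats the pattern literally, so "." only matches a dot.
# --only-matching prints one line per match, which turns wc -l into a match count.
# tr strips the padding some versions of wc add around the number.
match_count="$(grep --recursive --fixed-strings --only-matching \
  --exclude-dir=public "https://twitter.com" "$demo_dir" | wc -l | tr -d ' ')"

echo "matches: $match_count"
# prints: matches: 2
```

The file under public/ is skipped by `--exclude-dir`, which is why the count is 2 and not 3.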
Previewing Changes
Ripgrep has a neat feature to do replacements but it doesn’t write the changes to disk. It’s a quick way to get an idea of how something will look before you do a real replace:
rg "https://twitter.com" --replace "https://x.com"
Using the above example output, now it shows this:
content/blog/2021-06-08-using-ffmpeg-to-get-an-mp3s-duration-and-4-ways-to-get-the-file-size.md
89:- <https://x.com/nickjanetakis/status/1398668116872343552>
This goes back to using ripgrep to browse around. It’s quick and easy.
# Recursive Find and Replace
It’s really common to use find, xargs and sed together to do this. You can use find to produce a list of files and then perform the string replacement with sed, while xargs sits in the middle and passes each file to sed as an argument.
We’re going to do the same thing here except swap out find with grep or ripgrep. The benefit is we can do the heavy lifting to filter the list of files with a tool that’s well equipped for that such as grep.
Technically in this case find would have worked well because it’s basically only searching through a bunch of files in a specific directory but if you have more advanced filtering requirements grep can help a lot here.
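For comparison, here’s a rough sketch of the conventional find version. It assumes GNU sed (see the note on macOS further down), and the `demo_dir` directory and file names are made up for illustration:

```shell
# Scratch directory with made-up sample files.
demo_dir="$(mktemp -d)"
mkdir -p "$demo_dir/content/blog"
printf '%s\n' '- <https://twitter.com/example/status/1>' > "$demo_dir/content/blog/post.md"
printf '%s\n' 'https://twitter.com/example' > "$demo_dir/content/blog/notes.txt"

# find selects files by name only, it never looks inside them.
# -print0 and -0 pass file names NUL-delimited so spaces don't split them.
find "$demo_dir/content" -type f -name "*.md" -print0 \
  | xargs -0 sed -i "s|https://twitter\.com|https://x.com|g"

cat "$demo_dir/content/blog/post.md"
# prints: - <https://x.com/example/status/1>
```

Only post.md is rewritten because notes.txt doesn’t match `-name "*.md"`. find has no idea which files contain the string, so sed gets run against every .md file whether it has a match or not.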
Getting Only Files
The first thing we need to do is only return a list of files that have matches. This is what we’ll pipe into sed as file names. We can’t have the actual strings be a part of that:
rg "https://twitter.com" --files-with-matches
Using our above example, that produces this:
content/blog/2021-06-08-using-ffmpeg-to-get-an-mp3s-duration-and-4-ways-to-get-the-file-size.md
The same can be done with grep:
# I used -R instead of --recursive to fit this on 1 line, they do the same thing
grep "https://twitter.com" --exclude-dir=public/ --exclude-dir=published/ -R --files-with-matches .
Performing the Find and Replace
The basic idea is we have a newline-separated list of files coming in from either ripgrep or grep, then we use xargs to convert them into arguments which are sent to sed:
rg "https://twitter.com" --files-with-matches | xargs sed -i "s|https://twitter.com|https://x.com|g"
We used | instead of / to delimit sed’s find and replace to avoid needing to escape // a few times in both https:// references.
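One thing to watch out for: a newline-separated list breaks down if a file name contains spaces, because xargs splits on whitespace by default. A possible workaround, sketched here with grep and assuming GNU sed, is to use NUL-delimited output (ripgrep has a `--null` flag too). The `demo_dir` directory and its files are made up for illustration:

```shell
# Scratch directory with a made-up file name that contains a space.
demo_dir="$(mktemp -d)"
printf '%s\n' '- <https://twitter.com/example/status/1>' > "$demo_dir/my post.md"
printf '%s\n' 'nothing to replace' > "$demo_dir/other.md"

# --null ends each file name with a NUL byte instead of a newline and
# xargs -0 splits on that byte, so "my post.md" stays a single argument.
grep --recursive --fixed-strings --files-with-matches --null \
  "https://twitter.com" "$demo_dir" \
  | xargs -0 sed -i "s|https://twitter\.com|https://x.com|g"

cat "$demo_dir/my post.md"
# prints: - <https://x.com/example/status/1>
```

The `\.` in the sed pattern matches a literal dot. An unescaped `.` matches any character, which is harmless for this URL but worth knowing for other replacements.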
Linux (GNU) vs macOS (BSD) version of sed:
The above sed -i command won’t work on macOS, which ships the BSD version of sed. There, -i requires a backup suffix as its argument, so you can use sed -i "" instead.
Alternatively you can use perl -i -pe "s|old|new|g" which works in both environments. When writing shell scripts I tend to use the perl approach because of that.
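If you’d rather stick with sed in a script, one possible approach is to detect which sed you have at runtime. This is only a sketch: `sed_in_place` is a hypothetical helper name, and it relies on GNU sed accepting `--version` while BSD sed rejects it. The `demo_dir` directory and file are made up for illustration:

```shell
# Hypothetical helper: run an in-place sed edit on either GNU or BSD sed.
# GNU sed accepts --version, BSD sed (macOS) exits with an error instead.
sed_in_place() {
  if sed --version >/dev/null 2>&1; then
    sed -i "$@"
  else
    sed -i "" "$@"
  fi
}

# Scratch directory with a made-up sample file.
demo_dir="$(mktemp -d)"
printf '%s\n' '- <https://twitter.com/example/status/1>' > "$demo_dir/post.md"

# xargs can't call a shell function, so loop over the file list instead.
grep --recursive --files-with-matches "https://twitter.com" "$demo_dir" \
  | while IFS= read -r file; do
      sed_in_place "s|https://twitter\.com|https://x.com|g" "$file"
    done

cat "$demo_dir/post.md"
# prints: - <https://x.com/example/status/1>
```

The trade-off compared to the xargs version is that sed runs once per file, which is slower on a big list but fine for a few dozen files.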
The video below runs through everything so you can see how it all works.
# Demo Video
Timestamps
- 0:55 – Use case
- 1:23 – Finding strings with ripgrep
- 2:10 – Finding strings with grep
- 4:08 – Previewing replacements with ripgrep
- 6:21 – Getting a list of files to replace
- 7:08 – Getting a list of matched files with ripgrep
- 7:37 – Getting a list of matched files with grep
- 8:12 – Piping it into xargs and sed
- 10:11 – Counting the results before and after
- 11:18 – Using perl for a cross OS way to do inline replacements
What will you use this on next? Let me know below.