Updated on September 24, 2024 in #deployment

Using NGINX Regex Capture Groups to Redirect URL Paths


This can be handy if you change your URL structure and want to make sure your old URLs still work and search engine indexes get updated.


Prefer video? Here it is on YouTube.

Over the last ~10 years or so I’ve made a number of changes to this blog’s URLs. I try to be a good citizen of the web and avoid breaking URLs because you never know who might be linking back to you.

I use NGINX as my web server and set up a few 301 redirects. These are permanent redirects that let search engines and others know that your content has moved to a new location and HTTP clients such as browsers will auto-redirect to it.

Capture groups help a lot here because they let you “capture” part of a regular expression match and put it into a variable that you can use outside of the match.

In this case, that would be taking part of the incoming URL path and performing a redirect to another URL path but it’s not limited to redirects.

This idea of a capture group belongs to regular expressions and is well supported in many programming languages and tools. NGINX happens to be one of them.

Changing a Part of Your URL Path

Early on I tried to prematurely optimize my URLs thinking I would have courses, books and other digital goods which could be described as “products” but that wasn’t worth it. I’ve kept it simple and just use courses because that’s what I have.

That means redirecting any /products/ URL to /courses/, including /products/ as well as /products/hello-world.

location ~ ^/products/(.*)$ {
  return 301 /courses/$1$is_args$args;
}

location ~ ^/products/(.*)$ matches a regular expression
The capture group is defined with (.*) in the location line. The parentheses start and end the capture group, and whatever the regex inside them matches gets stored in a variable we’ll use on the next line
The $1 variable on the 2nd line contains what was captured
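$is_args$args is unrelated to capture groups but makes sure any query string params are included. $is_args is a ? when the request has a query string (empty otherwise) and $args is the query string itself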
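
Captures aren’t tied to return either. As a minimal sketch of a non-redirect use, assuming you wanted to serve the new content at the old path instead of redirecting, an internal rewrite could reuse the same capture:

```nginx
# Sketch only: this goes in the server block, not inside a location.
# It internally maps /products/<anything> to /courses/<anything> without
# sending a 301, so the client keeps seeing the old URL.
rewrite ^/products/(.*)$ /courses/$1 last;
```

rewrite keeps the original query string on its own, so $is_args$args isn’t needed here. For content that has really moved, the 301 is still the better fit because an internal rewrite doesn’t tell search engines anything changed.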

Changing Paginated URLs

I converted my blog from Jekyll to Hugo. With Jekyll I had my paginated blog pages at /blog/page2, /blog/page3, etc. but with Hugo the URL structure is different. It uses /blog/page/2, /blog/page/3, etc.

This is similar to the first example except it shows we can match on any type of regular expression we want, in this case only digits rather than everything.

location ~ ^/blog/page(\d+)/?$ {
  return 301 /blog/page/$1$is_args$args;
}

(\d+) uses a capture group that only matches 1 or more digits
Everything else is the same as the previous example

You can test the above against this site with:

$ curl -v https://nickjanetakis.com/blog/page3 2>&1 | grep "< Location:"
< Location: https://nickjanetakis.com/blog/page/3

By default curl won’t follow redirects, but if you include -v it prints the response headers, including the Location header that shows where the redirect points. You can use -L or --location to have it follow the redirect like a browser.

Multiple / Named Capture Groups

For my personal sites I don’t have any examples of using multiple capture groups at once but you could have something like ^/issues/(.*)/comments/(.*)$ and then you can access your captures with $1 and $2 in the order they are specified.
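
Here’s a sketch of what that could look like. The /tickets/ and /replies/ destination paths are made up for illustration, they aren’t from a real config:

```nginx
# Hypothetical: /issues/42/comments/7 redirects to /tickets/42/replies/7
location ~ ^/issues/(.*)/comments/(.*)$ {
  # $1 holds the first group (the issue), $2 holds the second (the comment)
  return 301 /tickets/$1/replies/$2$is_args$args;
}
```

Keep in mind (.*) also matches slashes, so if each segment should be a single path component then something stricter like ([^/]+) is probably a safer choice.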

If you find yourself getting confused by using $1 or numbered references you can name them instead. Here’s the first example but using a named capture group:

location ~ ^/products/(?<product>.*)$ {
  return 301 /courses/$product$is_args$args;
}

?<product> lets you define a name of your choosing
$product is the same as $1 except it has a custom name
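
Naming tends to help the most once there’s more than one group. Below is a sketch of the multiple capture pattern mentioned earlier with names. The issue and comment names and the destination paths are placeholders, and [^/]+ is swapped in for .* so each group only matches a single path segment:

```nginx
# Hypothetical: /issues/42/comments/7 redirects to /tickets/42/replies/7
location ~ ^/issues/(?<issue>[^/]+)/comments/(?<comment>[^/]+)$ {
  # Named groups read better than $1 and $2 when reordering the path
  return 301 /tickets/$issue/replies/$comment$is_args$args;
}
```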

The video below shows how some of this works.

# Demo Video

Timestamps

1:01 – Redirecting parts of a URL path
3:35 – Adjusting paginated blog post pages
4:58 – Named captures and multiple captures

What type of problems have you solved with capture groups? Let me know below.
