Converting My 500+ Page Blog from Jekyll to Hugo
This was a fun adventure that took less time than I thought it would have. Here's a bunch of things I learned along the way.
Prefer video? Here it is on YouTube. It covers a bit more detail than this post since I open both sites in a code editor and jump around.
I originally wrote this right after migrating to Hugo but I waited 6+ months to post this to make sure everything was in good working order and stable. It’s been over 6 months and everything is good to go. I’m really happy with how it turned out.
I’ve mostly left this post untouched in its original form but I added little callouts to validate some of my assumptions back then and also add a few new details.
Hugo and Jekyll are both popular static site generators.
One does not simply wake up and decide they want to change tech stacks on their 9 year old site with 500+ posts and 300+ drafts. I think before we get into the nitty gritty technical details, answering the “why?” is important here. If you don’t care, that’s ok, skip around.
Was it because Jekyll was too slow? Not entirely
Full site builds with Jekyll took 40 seconds and --incremental builds took about 4 seconds. That’s on an i5 3.2 GHz / 16 GB of memory / SSD workstation I put together back in 2014. I mean, that’s not blazing fast but it wasn’t terrible.
With that said, Hugo does full builds in ~3 seconds and incremental builds in 80ms.
That gives near instant live reload when writing posts which makes a huge difference from waiting 4 seconds. It’s something I noticed right away in a good way and if I knew how big of an impact that was, I would have considered switching earlier.
The main reason is because I traveled to Portugal recently for a Docker Captains summit and getting Jekyll set up on my Chromebook running Linux so I can publish a few blog posts while traveling was a pain.
It was a pain because I was stuck using an old version of Jekyll-Assets which in turn locked me into an older version of Jekyll which in turn locked me into using an old version of Ruby (2.7). Jekyll along with live reload was never something I was able to get to work inside of Docker with this combination of versions either.
On top of that, I was using an unsupported Linux distro (GalliumOS), which didn’t make it easy to get the specific versions of the files needed to compile Ruby gems that require C dependencies.
I felt like I dodged 20 bullets just getting Jekyll to build successfully on that Chromebook and I have experience working with Rails so Ruby is not foreign to me. That’s a pain I don’t want to continue dealing with, especially when I travel more.
So then the decision was, do I try to refactor my way out of Jekyll-Assets so I can update Jekyll and Ruby? I use that plugin because it digests your assets by adding a SHA-256 hash to their file names. This lets me cache them with nginx. I can’t not have that feature.
I didn’t find an alternative solution so it was time to consider approaching this from ground zero. Let me take every pain point I encountered and see if I can solve everything. Yep, the “grand rewrite”. In this case it made sense.
I always knew Hugo was fast and quite popular. I even toyed with the idea of switching to it 2-3 years ago but I didn’t have enough of a compelling reason.
I do know Docker’s documentation is built with it which gave me a pretty big confidence boost that it can be successfully used on a complex project. I didn’t even bother looking at other solutions because I know having a super fast statically compiled binary solution is pretty much as good as it gets.
I will say this. Having the idea of switching lingering in the back of my mind for all that time definitely created a mental tax. Every time it popped up into my thoughts it made me think I’m slacking off or avoiding something when I didn’t do it.
I decided it’s time and went into hardcore learning mode. About 30 hours later my whole site was converted with a combination of Python scripts, shell scripts and manual checks. I broke that up over the course of a few days.
That mental tax vanished and it felt great.
By the way, I had no idea it was going to take ~30 hours. I thought this would have taken way longer which was also why I was reluctant to start. That and there were too many unknowns that I first had to figure out and as I’m sure you know having unknowns is a great way to talk yourself out of doing something.
# Solving the Bigger Problems Before I Start
Step one is getting a lay of the land. What do I even need to change or solve from a technical sense before I invest serious time into this? That seems like a reasonable first step.
It’s not just a matter of converting a few Jekyll template tags to Hugo. I had a few custom Jekyll plugins that I wrote or was using.
For example, one of them scans all links and adds target="_blank" to all external links. An external link is any absolute URL that doesn’t match my configured domain.
Another one pulls tweets in from Twitter so I can embed tweets.
There were also other features I built using frontmatter and built-in template logic such as generating the “Quick Jump” that you see in most of my posts. Also, I don’t link this anywhere on my site but I do have a photo gallery too, which is built up from YAML data files and template logic.
Another technical challenge was ensuring my URLs don’t change for both my pages and assets. That means I have to digest assets in the same exact format as Jekyll-Assets or this migration cannot happen since I’m not going to 301 redirect ~1,000+ images.
Also, I encountered a lot of resistance with how Hugo deals with trailing slashes with URLs. You can either use “ugly” URLs with .html in them and then redirect them with nginx or another web server or you can use “pretty” URLs but every URL will end with a trailing slash, such as /about/ instead of /about.
That was a deal breaker for me. From a technical perspective I did not want:
- All requests to get 301’d
- All of my pages having a trailing slash
- To post-process the static HTML files to remove the slash as an alternative
I actually broke this rule and did the entire migration without solving the URL problem first because I mentally agreed that no matter what I’m going to figure out a solution.
Sheer determination is a powerful thing. Fortunately I did figure out how to work around this with Hugo once I got into the thick of it. We will cover that in the details later.
As for the other bigger challenges, I did confirm they will be possible to pull off.
There’s a billion other smaller technical hurdles but the above were the big ones. Ok, we’re really doing this. Time to get started with the migration!
Just kidding.
Go for a Fresh Look or Keep the Same Theme?
My theme is something I built with vanilla Bootstrap many years ago. I don’t NOT like it but it would be nice to change eventually.
Another interesting bigger technical challenge was Jekyll-Assets allows me to separate my SCSS and JS into separate files through Sprockets and then it bundles everything.
I’ve already been using esbuild and Tailwind for every other project for years and decided I wanted to use that with Hugo too but I didn’t want to convert all of my SCSS and JS.
I ultimately decided trying to change my theme during this move is too much at once. I can do this incrementally. Get everything going on Hugo with my existing theme and then consider switching after it’s stable.
I ended up taking the non-minified but concatenated single CSS and JS files from Jekyll and plopped them straight into my esbuild set up without Tailwind for now. That was pretty painless, everything “just worked”. Then esbuild minifies them for production builds.
Ok cool, now let’s begin the migration of everything else.
# High Level Feature Scan
I knew I had to convert 500+ blog posts from Jekyll’s Liquid template language to Hugo’s Go template language but in my mind that’s the easy part. I knew I could script out a 90% automated solution with Python in a few hours. I didn’t think about this part until I identified the features I needed to convert over.
For this feature dump, I didn’t care about ordering or prioritizing them, I just wanted them out of my head so I dumped them into a plain text list.
This wasn’t meant to be a perfect or complete list. It was to give me a rough idea of the scope of the project. I can always add to it later as I go. I time boxed myself to 30 minutes:
- Pretty URLs without trailing slashes
- Show N latest posts
- Pagination
- Generate a table of contents (“Quick Jump”)
- Filter posts by tag
- Ability to organize draft posts in a separate directory
- Ability to see drafts and future posts in development
- Embed YouTube videos
- Embed tweets
- Digest CSS, JS and most images
- Ability to not digest certain images (such as the gallery images)
- Minify final HTML for production builds
- Live reload for local development
- External links open in a new browser window
- Split out content types (blog, courses, etc.)
- Content types can have their own layout
- Ability to toggle features on a per page basis
- Comments, table of contents, different newsletter form, affiliate link callout, etc.
- Use template helpers for cross links to ensure I don’t have dead links
- Ability to segment template logic into partials, etc.
- Load data from YAML files
- Sitemap
- RSS feed
- robots.txt
- Twitter and OpenGraph tags
- Favicon at various sizes
What this exercise let me confirm was that not a whole lot of this is uncharted territory. Most of these things are straightforward where I could check the docs or internet for the specific syntax for what I needed to do.
It was one of my first “wins” of the project. This is doable, it’s going to happen. I like wins like this because it puts me into a different type of mindset. It’s like the hard part is done and now it’s just implementation details.
Spoiler alert, some of these implementation details were really tricky to figure out but I learned a lot along the way!
# Hello World with Hugo
The first thing I did was get the plumbing of the project up and running.
That means getting Hugo and esbuild running in Docker with Docker Compose. Fortunately this was a snap. I used my other Dockerized starter apps as inspiration.
Hugo Is Fast
One fun problem I had here was esbuild produces files in public/ and I configured Hugo to read assets in from that directory.
When using depends_on with Docker Compose, I made sure esbuild starts before Hugo but if Hugo finishes starting before esbuild outputs its files then Hugo will error out due to certain assets not existing while they’re referenced.
To resolve this, I added this to my Docker entrypoint script:
assets_dir="public/assets"

until [ -f "${assets_dir}/css/app.css" ] && [ -f "${assets_dir}/js/app.js" ]; do
  sleep 0.25
done
The above ensures the assets produced by esbuild are available on disk before Hugo’s server even begins to start.
I know there’s a chance this could be an infinite loop but I mean, it’s my personal blog here. I know those files will exist eventually and if they don’t then Hugo won’t start and I know something is up.
In the 6+ months of using this set up, that code has executed hundreds if not 1,000+ times and it was never a problem. Works for me!
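If the infinite loop ever did become a concern, the same wait can be given a deadline. Here is a minimal Python sketch of that idea. It is not what the entrypoint script uses, and the `wait_for_files` name and the 30 second default are assumptions made up for illustration:

```python
import time
from pathlib import Path


def wait_for_files(paths, timeout=30.0, interval=0.25):
    """Poll until every path exists; return False if the timeout passes first."""
    deadline = time.monotonic() + timeout
    while not all(Path(p).is_file() for p in paths):
        if time.monotonic() >= deadline:
            return False  # Give up instead of looping forever.
        time.sleep(interval)
    return True


# Example call (hypothetical paths matching the shell loop above):
# wait_for_files(["public/assets/css/app.css", "public/assets/js/app.js"])
```

The trade-off is one more thing to tune. For a personal blog, the plain shell loop is arguably the simpler choice.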
Once I got a basic set up running then I started to learn Hugo. I’ll spare you the details but it’s how I research and learn everything else. Basically Googling for specific problems and tinkering while optimizing for short feedback loops of having a question and trying out a solution.
One nice perk of this migration is I already have the answer book in front of me with Jekyll. I kept my Jekyll server running locally and constantly compared Hugo’s output to it to ensure it at least looks the same. This was wildly useful.
I started with the home page. That uncovered some challenges of working with Hugo.
I don’t think it’s worth walking through literally everything I learned while converting each page, so let’s initially focus more on comparing Jekyll’s terms to Hugo’s. We can cover some of the more interesting features and challenges afterwards.
# Comparing Terms for Hugo vs Jekyll
Each bullet is labeled with what it’s called in Hugo.
- Layouts
  - These exist in both, even being able to use blocks too
- Partials
  - Jekyll uses {% include myfile.html %}
  - Hugo uses {{ partial "myfile.html" }}
- Shortcodes
  - You can think of these as creating custom logic you can use in your content
    - Embedding YouTube, creating links, removing code duplication, etc.
  - With Jekyll you’d typically use include for this too
- Functions
  - Hugo has a bunch of functions you can call in your templates
    - Trim strings, math, hashing, etc.
  - Liquid mostly calls these “filters” (it also has “tags” for logic) and there’s a bunch too
- Frontmatter
  - This exists in both, Hugo supports YAML, TOML or JSON frontmatter
  - I plan to stick with YAML, it works and looks great for this sort of thing
- Content
  - You have “content” pages, this is where your Markdown files go
  - A “blog post” is just a type of content, there is no special _posts directory
- Config
  - Both support config files and environment specific configs (dev, prod, etc.)
There’s a number of other things like assets, static files and data files which are named about the same in both. There’s also the concept of themes in both, but I am not using a third party theme in either case.
By the way, there’s all sorts of other Hugo specific terms related to content management I’m not listing here since they aren’t the focus of this post. You can find them in their docs.
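To give a feel for how the include to partial mapping can be scripted, here is a small Python sketch. It is not the actual migration script. The `convert_includes` name is hypothetical, it only handles includes without parameters, and it passes the current context (`.`) to the partial, which is a common choice but an assumption here:

```python
import re

# Matches parameter-less Liquid includes such as {% include myfile.html %}
INCLUDE_RE = re.compile(r"\{%-?\s*include\s+([\w./-]+)\s*-?%\}")


def convert_includes(text):
    """Rewrite Jekyll include tags as Hugo partial calls."""
    return INCLUDE_RE.sub(r'{{ partial "\1" . }}', text)


print(convert_includes("{% include myfile.html %}"))
```

Includes that pass parameters are left untouched by this pattern, so they would still need a manual pass or a more specific rule.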
# Note Worthy Comparisons
I don’t think it’s worth covering everything like how to define a variable in Liquid vs Go or using a for in loop vs a range to loop over something. You can easily Google these things!
Let’s focus on a few basic but useful features and a number of more interesting things.
File Based Dates with YYYY-MM-DD Prefixes
With Jekyll this works out of the box because it’s built into Jekyll when you create “posts”. You can also configure Jekyll to hide the date from the URL by setting permalink: "/blog/:slug". Of course that’s a personal preference but that’s the combo I use.
Hugo doesn’t enforce this because it has no special notion of what a “post” is but you can get the same effect pretty easily by setting this in your config:
frontmatter:
  date: [":filename", ":default"]
That will let your files sit on disk like YYYY-MM-DD-hello-world.md and with no further configuration your URLs will be /hello-world/ (we’ll talk about the trailing slash later).
When you have 500+ posts, having them sorted naturally with YYYY-MM-DD is very helpful. Otherwise you have no idea when something was created if you’re glancing at your files.
Also having the date in there lets you fuzzy match on it. There’s been many times where I knew I wrote something in let’s say the summer of 2022 but I forgot the title so I began searching for 2022-07 to see what pops up.
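The same naming convention is easy to work with in scripts too. Here is a minimal Python sketch that splits a file name into its date and slug. The `parse_post_filename` helper is hypothetical and only illustrates the convention, it is not how Hugo parses file names internally:

```python
import re
from datetime import date

# YYYY-MM-DD-some-slug.md
FILENAME_RE = re.compile(r"^(\d{4})-(\d{2})-(\d{2})-(.+)\.md$")


def parse_post_filename(filename):
    """Return (date, slug) for date-prefixed post names, or None if it doesn't match."""
    match = FILENAME_RE.match(filename)
    if match is None:
        return None
    year, month, day, slug = match.groups()
    return date(int(year), int(month), int(day)), slug


print(parse_post_filename("2022-07-15-hello-world.md"))
```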
Separate Directory for Your Drafts
With Jekyll this works out of the box because “posts” are a special thing and a “draft” is a special type of post.
Hugo by default lets you set draft: true in your frontmatter but that expects both your live posts and drafts to exist in the same directory on disk.
I didn’t want them to be combined because I like quickly seeing which of my posts are live or drafts. I also very often fuzzy find files with “drafts” in their name.
This is surprisingly easy to do with Hugo but it was tricky to figure out what features of Hugo needed to be used to achieve this.
All you have to do is create a content/drafts/_index.md file and then within that file add:
---
build:
  # Ensure Hugo doesn't produce a /drafts/ URL and associated static HTML file.
  render: "never"
cascade:
  # We want drafts to be classified as content type "blog".
  type: "blog"
  # We want all of these content files to be set as a draft.
  draft: true
---
Pretty cool, build lets you configure various build options and cascade will apply the settings under it to every page in this section (everything under content/drafts/).
When you’re writing a “draft” or a “blog” post nothing changes there. You don’t need to mark anything as a draft in each post or worry about it. You can promote a post from draft to live by moving it from content/drafts/ to content/blog/ on disk.
When you’re looping over your blog posts, it will mix in both together in the same way that Jekyll does. When building your production site you would omit including drafts and future posts but include them in development. Done!
They are omitted by default. In your development config you can have:
buildDrafts: true
buildFuture: true
Removing Trailing Slashes from URLs
Jekyll was great in this regard. You can choose if you wanted a trailing slash or not in your URL based on how you organized your files.
If you had hello.html in the root of your project it wouldn’t get a trailing slash but if you had hello/index.html then it would. Straightforward and intuitive.
With Hugo it’s not like that. You can configure uglyURLs to be true or false. If true then you’ll have hello.html else it will be /hello/.
However, it’s possible to work around this:
In all of your templates, anytime you output a URL (Permalink, RelPermalink, URL, etc.) you can pipe it to strings.TrimSuffix "/" such as {{ .RelPermalink | strings.TrimSuffix "/" }}.
Then you can override the ref and relref shortcodes to do the same. These are used for creating verified cross links. For example create layouts/shortcodes/ref.html and then add {{ (ref . .Params) | strings.TrimSuffix "/" -}} to the file.
Now you can use pretty URLs without a trailing slash.
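Since every template has to remember to trim the slash, a quick check over the built HTML can catch any that were missed. Here is a rough Python sketch of such a check. The `find_trailing_slash_links` helper is hypothetical and not part of the original setup, and the regex only looks at double-quoted href attributes that start with a single `/`:

```python
import re

# Internal hrefs ("/...") that end with "/" but aren't just the root "/".
TRAILING_SLASH_HREF = re.compile(r'href="(/[^"]*[^"/]/)"')


def find_trailing_slash_links(html):
    """Return internal link targets that still end with a trailing slash."""
    return TRAILING_SLASH_HREF.findall(html)


sample = '<a href="/about/">About</a> <a href="/blog">Blog</a> <a href="/">Home</a>'
print(find_trailing_slash_links(sample))
```

Running something like this against the generated output directory would flag pages where a URL was printed without the trim.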
Hugo’s server in development will redirect to the version of the page with a trailing slash but nginx or another web server won’t do that. That works for me, it ensures there’s no 301s happening in production.
In my case I used to have /blog/ with a trailing slash but I decided to consistently not use trailing slashes in the end. In that case I did configure nginx to 301 redirect /blog/ to /blog so old back links continue to work. All of the links within my site use /blog so the redirect is avoided. I consider this a victory.
Custom Digested File Names
All digesting was handled by Jekyll-Assets and it has a specific format of filename-digest.extension, so hello.jpg will become hello-abc123.jpg where abc123 is a SHA-256 checksum.
Hugo calls this process a “fingerprint” and it’s built into the tool. You can do resources.Get "/assets/css/app.css" | fingerprint to fingerprint a resource.
By default it will use filename.digest.extension, so hello.jpg will become hello.abc123.jpg where abc123 is a SHA-256 checksum. You can also adjust the hashing algorithm to MD5 or others if needed.
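To make the two naming schemes concrete, here is a small Python sketch that builds a digested file name from a file’s contents. It only illustrates the naming format described above, it is not how either tool is implemented, and `digested_name` is a made-up helper:

```python
import hashlib
from pathlib import PurePosixPath


def digested_name(path, content, delimiter="-"):
    """Build name<delimiter><sha256><ext>, e.g. hello-<64 hex chars>.jpg."""
    p = PurePosixPath(path)
    digest = hashlib.sha256(content).hexdigest()
    return str(p.with_name(f"{p.stem}{delimiter}{digest}{p.suffix}"))


image_bytes = b"pretend these are image bytes"
print(digested_name("blog/hello.jpg", image_bytes))       # Jekyll-Assets style
print(digested_name("blog/hello.jpg", image_bytes, "."))  # Hugo's default style
```

The digest is identical in both cases. Only the delimiter differs, which is enough to change every asset URL.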
The issue is at the moment there’s no way to customize the delimiter. I want to use - not . so I opened an issue as a proposal and it’s scheduled for a future version of Hugo.
In the meantime I wanted this feature today because not being able to customize this is a deal breaker due to 1,000+ images being digested with a different file name.
With some help from the community, we figured out a solution. I rolled it up into an image.html partial as seen below:
{{ $fingerprintDelimiter := "-" }}
{{ $src := print "/assets/" .src }}
{{ $height := .height }}
{{ $width := .width }}
{{ $class := .class }}

{{ with resources.Get $src }}
  {{ $sha256 := sha256 .Content }}
  {{ $ext := path.Ext .Name }}
  {{ $pathWithoutExt := replaceRE `\.[^.]+$` "" .Name }}
  {{ $pathWithoutExt = strings.TrimLeft "/" $pathWithoutExt }}
  {{ $fullPathWithFingerprint := printf "%s%s%s%s" $pathWithoutExt $fingerprintDelimiter $sha256 $ext }}

  {{ with resources.Copy $fullPathWithFingerprint . }}
    <img src="{{ .RelPermalink }}" {{ with $class }} class="{{ $class }}"{{ end }} height="{{ $height | default .Height }}" width="{{ $width | default .Width }}" alt="{{ index ((split .Name "/") | last 1) 0 }}" >
  {{ end }}
{{ end }}
I also introduced a shortcode for it so I can use it in both layouts and content files.
For example, here’s how to use it: {{< image src="blog/some-image.jpg" >}}
That will produce this output:
<img src="/assets/blog/some-image-de45a50b6ecdd1e11fc0dbd5dd2619145ad3c47efa4b527e3a9813a28ca24f01.jpg" height="980" width="1607" alt="some-image.jpg">
Once a solution is in Hugo I’d still have the partial to make it easy to reference images and nothing will change in any layout or content files. All of the logic around copying, calculating a hash and assembling a file name will go away. Yay for deleting code.
For now this isn’t too bad because this complexity is contained in 1 file and its implementation can change without affecting other files.
If I were starting a new blog today I wouldn’t worry about using the custom delimiter because then I’d accept Hugo’s defaults. I would still make the partial though. It’s worth having that abstraction.
Syntax Highlighting
I have Jekyll configured to use the Rouge highlighter. All I did was find a specific theme’s CSS file and drop it into my project. It’s compatible with any Pygments theme too.
Hugo uses Chroma. Hugo makes it easy to pick an existing style. For example, I use:
markup:
  highlight:
    style: "monokailight"
The problem is by default the $ and # characters have a dark background with dark red text which clashes hard with the style. I had to customize it.
To do that I did:
- Run hugo gen chromastyles --style=monokailight
- Copy the output to your clipboard
- Paste the contents of that into my existing app.css file
- Check the page source to see which class needs to be overwritten
  - In my case it was only the .err class
- Update the colors to whatever you want for that class
- Add noClasses: false as a new config option under the style option
  - This instructs Hugo to use the custom CSS file instead of the style
There’s lots of styles to choose from but I haven’t found one I really like yet. For now Monokai Light works well enough.
Assorted Template Functions
I don’t want to include too many basic examples but here’s a few interesting ones.
Latest 12 blog posts
On my home page I list out a dozen recent posts but I insert a clearfix div every 3 entries to make sure things are broken up into 3 columns. Remember, I didn’t want to re-do much with my CSS and I know this problem can be solved [with CSS](/blog/displaying-database-results-across-multiple-columns-with-1-line-of-css).
In Jekyll, I had this:
{% for post in site.posts limit: 12 %}
  <div class="col-md-4">
    <!-- Omitting for brevity -->
  </div>

  {% assign divisible_result = forloop.index | modulo: 3 %}
  {% if divisible_result == 0 %}
    <div class="clearfix"></div>
  {% endif %}
{% endfor %}
In Hugo, I went with:
{{ range $index, $post := sort (where .Site.RegularPages "Type" "blog") "File.Path" "desc" | first 12 }}
  <div class="col-md-4">
    <!-- Omitting for brevity -->
  </div>

  {{ if modBool (add $index 1) 3 }}
    <div class="clearfix"></div>
  {{ end }}
{{ end }}
One of the interesting differences is Liquid’s loop index starts at 1 whereas Go’s starts at 0. Initially I wasn’t getting anywhere near the results I wanted to see.
A general takeaway here is seeing what a value is pretty much enables cheat codes. As soon as I printed the loop’s index I immediately saw what was happening and I had to add 1 to Go’s $index variable.
As a bonus, the above code snippet also shows how to sort files and get the latest N results. I used File.Path because I’m using YYYY-MM-DD prefixed files.
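The off-by-one is easy to see outside of either template language. This Python sketch (purely illustrative, the function names are made up) compares where the clearfix lands with and without adding 1 to a 0-based index:

```python
def clearfix_positions(count, per_row=3):
    """0-based indexes after which a clearfix is emitted when 1 is added first."""
    return [index for index in range(count) if (index + 1) % per_row == 0]


def clearfix_positions_without_fix(count, per_row=3):
    """What happens if the 0-based index is used as-is."""
    return [index for index in range(count) if index % per_row == 0]


print(clearfix_positions(12))              # [2, 5, 8, 11]
print(clearfix_positions_without_fix(12))  # [0, 3, 6, 9]
```

Without the fix the first clearfix comes right after the very first post, which throws off every row after it.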
Dealing with apostrophes and plus symbols
While testing the output of both tools I noticed that Jekyll output a regular apostrophe, double quotes and various other symbols in the source code for both the <title> and <meta name="description" content=""> tags.
However, Hugo used the ASCII code such as &#39; for '. Keep in mind this was only in the page’s source code. The browser tab’s title had the normal character.
Given how important these 2 tags are for SEO purposes, I didn’t want to risk having them change. I don’t think it would have an effect but I noticed most other big sites didn’t use the ASCII code. They output the real character.
This turned out to be pretty tricky to solve.
The docs mention you can do .Title | safeHTML but that does not work. You will get the ASCII code in the output.
Instead you have to do:
{{ printf "<title>%s</title>" $title | safeHTML }}
I still don’t understand why fully but it’s something related to Go’s template parsing.
For the description, I ended up with:
{{ $description := trim (replace .Description "\n" " ") " " | chomp | printf "content=%q" | plainify | safeHTMLAttr }}
<meta name="description" {{ $description }}>
Not all of that is related to ASCII codes but it normalizes whitespace, removes trailing new lines, removes HTML tags and declares this a safe HTML attribute.
I set all of my descriptions with frontmatter and I wanted a robust way to handle a bunch of different ways to define that. For example:
description: |
  Hello world.
That leaves a trailing new line because you didn’t use |- so chomp gets rid of that.
I also output the description on most pages too and once in a while I have HTML tags such as a link. I don’t want those links included in the meta description.
In any case, both of those problems are now solved.
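For anyone who wants to sanity check descriptions outside of Hugo, here is a rough Python approximation of the clean-up steps. It is not equivalent to the template above. It strips tags with a simple regex and collapses every run of whitespace, and `meta_description` is a hypothetical helper name:

```python
import re

# Naive tag matcher, fine for simple inline tags such as links.
TAG_RE = re.compile(r"<[^>]+>")


def meta_description(raw):
    """Strip HTML tags and normalize whitespace for use in a meta description."""
    without_tags = TAG_RE.sub("", raw)
    return " ".join(without_tags.split())


print(meta_description('Hello <a href="/blog">world</a>.\nSecond line.\n'))
```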
Assorted Content Differences
Here’s a few unexpected changes I had to make after switching to Hugo. Thankfully I caught all of these before I pushed anything live.
Leading white space within HTML tags
If you had both of these in a Markdown file, Jekyll treated them the same:
<div>
  <p>Hmm</p>
</div>

<div>
    <p>Hmm</p>
</div>
In both cases your page’s output would contain “Hmm” where it’s wrapped in a <p></p> tag within the source code.
Hugo will render the 2nd example differently. It will literally output <p>Hmm</p> and it ends up being wrapped in a <pre></pre> tag.
Technically that adheres to the CommonMark spec because it has 4 leading spaces, still that was kind of unexpected to me because I don’t have the spec memorized and Jekyll auto-magically handled this difference.
In attempts to find these I ran grep -ER "^\s{4}<" content. This found a number of false positives but it was still good enough to manually go through them in a few minutes to fix a couple of posts that had this issue.
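If grep isn’t handy, roughly the same search can be done in Python. This sketch is an assumption-laden equivalent, not something from the original migration: it only looks at .md files (grep -R searches every file) and `find_indented_html` is a made-up name:

```python
import re
from pathlib import Path

# Same idea as grep -E "^\s{4}<": four whitespace characters, then a tag.
INDENTED_TAG = re.compile(r"^\s{4}<")


def find_indented_html(content_dir):
    """Yield (path, line_number, line) for lines that start with 4 spaces and a '<'."""
    for path in sorted(Path(content_dir).rglob("*.md")):
        lines = path.read_text(encoding="utf-8").splitlines()
        for number, line in enumerate(lines, start=1):
            if INDENTED_TAG.match(line):
                yield path, number, line


# Example call: for hit in find_indented_html("content"): print(*hit)
```

Like the grep version, this will report false positives such as HTML inside real code blocks, so the results still need a manual look.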
Nested numbered lists
With Jekyll I had lists set up like this where the nested list had 2 spaces of indentation:
1. A
  - X
2. B
  - Y
That would output things as I wanted, like this:
- A
  - X
- B
  - Y
However in Hugo, the output looked like this:
- A
- X
- B
- Y
Yikes. The fix there is to use 4 spaces instead of 2.
It gets more fun. In my 120+ SRE skills blog post I had over 100 items in a numbered list.
On the 100th item if you use 4 spaces for the indented list you end up with this:
- A - X
It pulls it up to the same line. The fix there is for the 100th item and higher you need to use 6+ spaces instead of 4. Good times.
This one was annoying to find with grep. I ended up just looking for all posts with any form of list with a pattern of ^\s{2}- and adjusted the few posts that needed it.
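The indentation behavior lines up with how CommonMark treats list items, at least as I understand the spec: nested content has to be indented at least as far as the parent item’s text, which is the width of the marker plus the space after it. Here is a tiny Python sketch of that rule, assuming a single space after the marker (`min_nested_indent` is a hypothetical helper):

```python
def min_nested_indent(marker):
    """Spaces needed before a nested item, assuming one space follows the marker."""
    return len(marker) + 1


for marker in ["1.", "10.", "100."]:
    print(marker, min_nested_indent(marker))
```

That would explain why 2 spaces fails, why 4 spaces is enough up to item 99 and why item 100 and beyond needs more.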
Blog Tags
There’s a bunch of ways to implement tags in Jekyll and I suppose Hugo too. In both cases I wanted a way to define tags: ["docker", "flask"] as frontmatter in my posts. That’s what I did previously in Jekyll and then I wrote various logic to handle outputting that.
With Hugo I created a “taxonomy” which really means having a content/tags/_index.md file and then a bunch of specific tag related files in that directory such as content/tags/docker/_index.md.
That file only has frontmatter, such as:
---
slug: "docker-tips-tricks-and-tutorials"
title: "Docker"
description: |
  I've been using Docker since 2014. Along the way I've picked up a bunch of
  Docker experience and best practices. Here's what I learned.
params:
  heroTitle: "Docker Tips, Tricks and Tutorials"
---
Finally in Hugo’s config I put:
taxonomies:
  tag: "tags"
And now you can visit /blog/tag/docker-tips-tricks-and-tutorials on my site. I wanted to keep the same URL as I did with Jekyll.
For displaying tags such as what I do on my blog’s index page you can loop over your tags with {{ range $name, $taxonomy := .Site.Taxonomies.tags }}.
At the top of every individual blog post I list out the tags such as #docker, #flask, for that you can loop over a post’s tags with {{ range $index, $tag := .GetTerms "tags" }}.
Looping over Data Files
Both tools let you have data files, such as YAML files and then you can loop over them to do whatever you want in your templates.
For example you might have something like:
- title: "My Course"
  url: "..."
- title: "Another Course"
  url: "..."
Then you can loop over this data structure and generate a page. That’s exactly what I do for my courses page.
For Jekyll, in the course’s page that reads this file you can do {% for course in site.data.courses %}. With Hugo, it’s about the same {{ range $index, $course := .Site.Data.courses }}.
Where it gets tricky with Hugo is when you want to dynamically pull out a specific entry from the data file based on a variable (such as the page’s slug). This didn’t come up for me with courses but it did with my image gallery’s photos.
With Jekyll, you can do {% for gallery in site.data.galleries[page.slug] %}, that’s similar to how you can access a dictionary’s value with a variable key in a lot of languages.
With Hugo, I did {{ range $index, $gallery := (index .Site.Data.galleries .Slug) }}. I don’t even remember where I found this, maybe it was tucked away deep in a GitHub issue. All I know is it wasn’t straightforward (at least not to me).
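In plain programming terms, Hugo’s index function plays the role of a dictionary lookup with a runtime key. Here is a minimal Python analogy. The data and the `entries_for` helper are invented for illustration and don’t reflect the real gallery files:

```python
def entries_for(data, key):
    """Look up a list by a runtime key, returning an empty list when it's missing."""
    return data.get(key, [])


# Hypothetical stand-in for .Site.Data.galleries
galleries = {"portugal-2023": [{"file": "lisbon.jpg"}, {"file": "porto.jpg"}]}

# Roughly what the range over (index .Site.Data.galleries .Slug) does.
for index, photo in enumerate(entries_for(galleries, "portugal-2023")):
    print(index, photo["file"])
```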
Pagination
This one probably isn’t worth mentioning only because it’s pretty easy in both set ups. Jekyll has a plugin and it’s built into Hugo. The docs have good examples for both.
Using Hugo’s Render Hooks
Hooks can be used to adjust the output of rendering Markdown to HTML. I used them to help solve 2 problems. Opening external links in a new window and also to add named anchors for headings.
A big caveat here is this only applies to Markdown content. If you have links in your layouts, those will not get this hook applied. With my custom Jekyll plugin, it affected all links from anywhere since it ran before it output the final HTML. It didn’t matter where it came from.
It’s not the end of the world since most of my links are in content files but it’s for sure worth bringing this up. It does bug me though because it means I need to manually remember to modify links in layouts depending on where they go.
External links
Hugo has a link hook. All you have to do is create this file layouts/_default/_markup/render-link.html and then put in the code you want to use for creating links.
{{- $u := urls.Parse .Destination -}}
<a href="{{ .Destination | safeURL }}"
{{- with .Title }} title="{{ . }}"{{ end -}}
{{- if and $u.IsAbs (not (in .Destination .Page.Site.BaseURL)) }} target="_blank"{{ end -}}
>
{{- with .Text | safeHTML }}{{ . }}{{ end -}}
</a>
{{- /* chomp trailing newline */ -}}
I mostly grabbed that from the docs.
I won’t bother posting the Jekyll solution since it’s a custom Jekyll plugin. It parses the HTML using Nokogiri and inserts the attribute when applicable.
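The rule itself is simple enough to sketch though. This Python version is not the plugin. It only shows the “absolute URL that doesn’t match my configured domain” check, with `is_external` and the example host being assumptions. It also compares hosts exactly, which is a bit stricter than the substring check in the Hugo hook above:

```python
from urllib.parse import urlparse


def is_external(href, site_host):
    """True for absolute URLs whose host differs from the site's host."""
    parsed = urlparse(href)
    is_absolute = bool(parsed.scheme and parsed.netloc)
    return is_absolute and parsed.hostname != site_host.lower()


for href in ["/blog", "https://example.com/about", "https://github.com/gohugoio/hugo"]:
    print(href, is_external(href, "example.com"))
```

Links that come back True are the ones that would get target="_blank" added.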
Named anchors
I really like it when content oriented sites make it easy to link somewhere to a specific part of a page, such as a heading.
You can find this feature on my site too. I add a # link next to each <h3> heading.
Hugo has a hook for headings too, you can create this file layouts/_default/_markup/render-heading.html and then put in the code you want to use to modify your headings.
{{ if eq .Level 3 }}
  <p><a name="#{{ .Anchor | safeURL }}"></a></p>
  <h{{ .Level }} id="{{ .Anchor | safeURL }}">
    <a href="#{{ .Anchor | safeURL }}" style="position: absolute; left: -12px;">#</a>
    {{ .Text | safeHTML }}
  </h{{ .Level }}>
{{ else }}
  <h{{ .Level }}>{{ .Text | safeHTML }}</h{{ .Level }}>
{{ end }}
In my case I only apply this to <h3>
headings. You can modify that if you
want.
The end result is nice. When I’m writing a post all I have to do is add ### Hello
and the hook will add the named anchor and the #
link.
In Jekyll I did this in a much worse way. It sort of combos with the idea of creating the “Quick Jump” table of contents which is coming up next.
Table of Contents
In Jekyll I combined the idea of named anchor links and the “Quick Jump” feature together. It was something I implemented a year after I started my blog and didn’t think it through in the best way possible. However, it has worked all these years so I never changed it.
Each page that has it has frontmatter defined like this:
toc:
  - "My Title"
  - "Another Title"
Near the top of the page, it loops through this to display them as the quick jump.
Then within the page itself I reference a specific title like this:
<a name="{{ page.toc[0] | slugify }}"></a>
### {{ page.toc[0] }}
The [0]
is the index of the toc
list. Ultimately it was a lot of copy /
pasting between headings because I rarely typed it out. This wasn’t ideal
because if I wanted to re-order headers I had to shift things around.
Looking back it could have been easier with a custom Jekyll plugin that did something similar to the heading hook from Hugo.
With Hugo there is a built-in .TableOfContents
method which outputs all
of your headers in a <ul id="TableOfContents">
. All I do is render that as
the quick jump and use CSS to add the little pipe separator between <li>
items.
Then I define my headings like normal with ### My Title
and there’s no
frontmatter involved except to set toc: true
or toc: false
which defaults
to true
so I don’t have to set it in most pages.
This also allows separating the quick jump and the named anchor links. For
example my about page has no quick jump but it has
headings with named anchor links. In this case I have toc: false
set.
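This post doesn't show how the old anchor + heading pairs were converted into plain headings, so here's one way it might be scripted. This is a sketch that assumes the old pattern always looks exactly like the 2 line Jekyll example above and that you've already parsed the `toc` list out of the frontmatter. The `inline_toc_headings` name is hypothetical:

```python
import re

# Matches the old two-line pattern: a named anchor followed by the heading,
# both referencing the same index of the page's toc list.
OLD_HEADING = re.compile(
    r'<a name="\{\{ page\.toc\[(\d+)\] \| slugify \}\}"></a>\n+'
    r"### \{\{ page\.toc\[\1\] \}\}"
)


def inline_toc_headings(body: str, toc: list[str]) -> str:
    """Replace each old anchor + heading pair with a plain ### heading."""
    return OLD_HEADING.sub(lambda m: f"### {toc[int(m.group(1))]}", body)
```

After a pass like this the heading hook takes care of the anchor, so the `toc` list in the frontmatter is no longer needed.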
It’s worth pointing out by default Hugo configures this TOC to start with
<h2>
and end with <h3>
. My page layout includes an <h1>
and <h2>
but
these don’t get included in the TOC. All of my individual pages have the
“biggest” heading as <h3> so it works out in the end.
You can read how to customize the start and end levels in their docs.
Sitemap and RSS Feed
For Jekyll I added the Jekyll-Sitemap plugin and that was it.
Hugo will create a Sitemap by default which was nice but I did have to
customize it to remove trailing slashes from the URLs. That involved creating
this file layouts/_default/sitemap.xml
, grabbing the default
template
from GitHub and then doing a strings.TrimSuffix "/"
on all of the
.Permalink
references.
One noteworthy difference is Jekyll will include all pagination pages such as
/blog/page/3
in the sitemap whereas Hugo does not. It’s debatable whether you
should even include them since they can be crawled. I did not find an easy way
to include them with Hugo so I left them out.
It’s been 6+ months and I didn’t notice any difference in traffic after making the switch which indicates nothing was negatively impacted from an SEO perspective.
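If you want to compare the old and new sitemaps yourself, here's a small sketch using only Python's standard library. It assumes both files follow the standard sitemap protocol namespace. The function names are made up for this example:

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemap protocol's <urlset> and <url> elements.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_data) -> set[str]:
    """Return every <loc> value found in a sitemap document."""
    root = ET.fromstring(xml_data)
    return {loc.text.strip() for loc in root.findall(".//sm:loc", NS) if loc.text}


def diff_sitemaps(old_xml, new_xml) -> tuple[list[str], list[str]]:
    """Return (URLs only in the old sitemap, URLs only in the new sitemap)."""
    old, new = sitemap_urls(old_xml), sitemap_urls(new_xml)
    return sorted(old - new), sorted(new - old)
```

Reading each sitemap.xml as bytes and passing it in should work. With a set up like mine, pagination URLs such as /blog/page/3 would be expected to show up as only being in the Jekyll sitemap.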
For the RSS feed I created a custom atom.xml
file for both Jekyll and
Hugo. I grabbed the default RSS
template
for Hugo on GitHub.
It loops over all blog posts and grabs the latest 25 entries. The formatting and implementation of the file is the same in both tools minus templating logic.
When it came to only including blog posts I modified the default template to
use {{- $pages = where $pctx.RegularPages "Type" "blog" }}
.
It’s worth pointing out 2 Hugo specific changes.
First, I wanted to make sure Hugo outputs an atom.xml
file since that’s what
I had with Jekyll. I did that by adding this to my Hugo config file:
outputFormats:
  RSS:
    mediatype: "application/rss"
    baseName: "atom"
Second, it’s worth pointing out that by default Hugo will create separate RSS feeds for different areas of your site (section, taxonomies and your content pages). I didn’t want that behavior. I only wanted 1 file.
In that case you can add this to your Hugo config file:
outputs:
  home: ["html", "rss"]
  section: ["html"]
  taxonomy: ["html"]
  term: ["html"]
That only generates a root RSS feed and ignores generating files for the others.
Twitter and YouTube Embeds
With Jekyll for Twitter I used this
plugin. That lets you
embed tweets with {% twitter https://x.com/USER/status/TWEET_ID %}
.
With Hugo there’s a built-in shortcode: {{< x user="USER" id="TWEET_ID" >}}
By the way to even produce the above output I had to put </*
and */>
inside
the squiggly braces to avoid processing them. There is no {% raw %}
tag like
Liquid. The author of Hugo describes this as “special comment out shortcode
syntax” he made
up.
Also, one cool feature with Hugo’s Twitter implementation was it showed a warning in the terminal when it couldn’t retrieve a tweet because the user no longer existed. The Jekyll plugin cached its results, so it likely hadn’t pulled anything from Twitter in years.
With Jekyll for YouTube I just took the embed code from YouTube and turned it
into an include
to reference it. Calling it looks like: {% include youtube.html uid="YOUTUBE_ID" %}
With Hugo there’s a built-in shortcode {{< youtube "YOUTUBE_ID" >}}
. You
can customize it with a number of
settings. Not too
shabby.
I have hundreds of YouTube embeds and I didn’t want to do those by hand. I ended up writing a tiny bit of Python code to do the replacement:
# I shortened the variable name of "content" to "s" so it fits in 1 line here:
s = re.sub(r"{% include embed.html type=\"youtube\" uid=\"(.*)\" %}", r'{{< youtube "\1" >}}', s)
s = re.sub(r"{% include embed.html type='youtube' uid='(.*)' %}", r'{{< youtube "\1" >}}', s)
I did it in 2 passes to catch both double and single quote variants. I know it could have been optimized with a few “or” conditions in the regex but this was a 1 off script and it finishes in less than 1 second. Regex capture groups are useful from time to time.
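For reference, the 2 passes could be folded into 1 with a backreference on the quote character. This is only a sketch of that "or" idea, not what I ran, and the `replace_youtube_includes` name is made up:

```python
import re

# One pattern for both quote styles: group 1 captures the quote character and
# \1 requires the same one around each value. [^"']* keeps the ID from
# running past its closing quote if a line has more than one embed.
YOUTUBE_INCLUDE = re.compile(
    r"""\{% include embed\.html type=(["'])youtube\1 uid=\1([^"']*)\1 %\}"""
)


def replace_youtube_includes(content: str) -> str:
    """Swap every Jekyll YouTube include for Hugo's youtube shortcode."""
    return YOUTUBE_INCLUDE.sub(r'{{< youtube "\2" >}}', content)
```

The video ID ends up in group 2 because group 1 is now the quote character.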
Then I grepped both code bases and confirmed I got the correct counts of each tag to double check my work. There was a lot of that sort of thing done whenever I used Python or sed to do replacements.
# Double Checking Your Work
I kept a running TODO list on what to check or convert.
It’s not meant to be an exhaustive list since I only started it halfway through the migration. I also removed a bunch of items from this post related to the topics we covered above.
I wanted to do everything in my power to ensure nothing was different after the migration unless I made an explicit decision for it.
- Remove unnecessary whitespace in content files
- Remove trailing new lines in content files
- Replace YouTube embeds
- Replace captures (a Liquid feature to make more “complicated” variables)
- Replace image references from Jekyll-Assets to Hugo
- Replace post_url references to relref
- Remove .site.url references in favor of relative URLs (minus exceptions)
- Trim and chomp new lines where it makes sense
- Remove trailing slashes from /blog/, /courses/ and /galleries/
- Remove all {% raw %} tags
- Ensure all static files exist (robots.txt, fonts, favicons, non-blog images, etc.)
- Confirm newsletter still works
- Gallery images should not be digested
- Make sure Disqus still works (I can pull up old comments for older posts)
- Diff the sitemap to make sure it includes everything I want
- Make sure social sharing links still work
- Make sure Twitter cards and OpenGraph tags are correct
- Confirm the number of posts in Hugo matches what’s in Jekyll
- Double check published directory to make sure all Hugo tags are processed
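The last item on that list (making sure all tags are processed) is easy to script. Here's a rough sketch that scans the built HTML for template syntax that shouldn't have survived the build. The patterns and the `find_leftover_tags` name are assumptions for illustration. Posts that intentionally show Liquid or Hugo tags in code blocks will show up as false positives:

```python
import re
from pathlib import Path

# Template syntax that should never survive into the built HTML. These
# patterns are assumptions for illustration, adjust them to your own site.
LEFTOVERS = re.compile(r"\{%.*?%\}|\{\{[<%].*?[>%]\}\}")


def find_leftover_tags(published_dir: str) -> list[tuple[str, int, str]]:
    """Return (file, line number, match) for each unprocessed tag found."""
    hits = []
    for path in sorted(Path(published_dir).rglob("*.html")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            for match in LEFTOVERS.finditer(line):
                hits.append((str(path), number, match.group(0)))
    return hits
```

An empty list means nothing suspicious was found with these patterns.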
Quality Control
Lastly I used this as an opportunity to clean up some inconsistencies in my posts around formatting and overall quality control. I did not manually check everything in every page. I spot checked a few pages after running a bunch of scripts and commands.
I didn’t include most of the Python scripts and grep / sed commands here. A lot of it is pretty specific to my site’s content and personal Jekyll set up. A takeaway for you might be to consider doing the same.
If you have specific questions I’d be happy to answer them in the comments.
Programmatically Generate New YAML Frontmatter
When you have 500+ pages, going back and manually updating the frontmatter to match whatever new format you want is way too tedious.
My goal was to convert the old frontmatter of this:
layout: 'post'
tags: ['dev-business', 'dev-environment']
has_affiliate_link: true
card: 'blog/cards/the-tools-i-use.jpg'
title: "The Tools I Use"
description: "Here's a list of software and hardware that I use on a regular basis as a developer and video creator. I will be keeping it updated."
toc:
  - "OS"
  - "Code Editor and Terminal"
  - "Notable Apps"
  - "Computer, Desk and Phone"
  - "Recording and Music"
Into new frontmatter that looks like this:
tags: ["dev-business", "dev-environment"]

title: "The Tools I Use"

description: |
  Here's a list of software and hardware that I use on a regular basis as a
  developer and video creator. I will be keeping it updated.

params:
  card: "the-tools-i-use.jpg"
  affiliateLink: true
Here are some improvements:
- Single vs double quotes are consistent
- The layout and toc fields are removed
- Top level fields are separated with new lines in an easier to skim way
- Non-Hugo specific fields are namespaced to params
- The card in the new frontmatter no longer includes blog/cards/ in the path
There are other changes too, such as changing a field name from bottom_cta to newsletter. That field wasn’t included in the above example.
Here’s some Python code you can re-purpose for your set up. Its only dependency outside of the standard library is PyYAML (that’s the yaml import). I’ve marked up the code with extra comments to better explain it:
import os
import re

import yaml

# Create a mini-template for how I want the new frontmatter to look.
hugo_frontmatter = """
tags: $tags

title: "$title"

description: |
  $description

params:
  card: "$card"
  newsletter: "$newsletter"
  affiliateLink: $affiliateLink
"""

# Regex to match frontmatter.
regex = re.compile(r"---([\S\s]*)---", re.MULTILINE)

# Loop over all blog posts.
dir = os.fsencode("./content/blog")

for file in os.listdir(dir):
    filename = os.fsdecode(file)

    if filename.endswith(".md"):
        with open(f"./content/blog/{filename}") as f:
            content = f.read()

            frontmatter = re.findall(regex, content)[0]

            # This protects against multiple frontmatter matches in case you use ---
            # later on in a post which is unrelated to its frontmatter, such as <hr>.
            frontmatter = frontmatter.split("---", 1)[0]

            yaml_content = yaml.safe_load(frontmatter)

            replaced_content = content
            # Start each post from a fresh copy of the template.
            new_frontmatter = hugo_frontmatter
            has_bottom_cta = False
            has_affiliate_link = False

            # Start building up the new frontmatter from the old frontmatter.
            if yaml_content.get("tags"):
                tags = []
                for tag in yaml_content["tags"]:
                    tag_str = '"' + tag + '"'
                    tags.append(tag_str)
                new_frontmatter = new_frontmatter.replace("$tags", f'[{", ".join(tags)}]')
            if yaml_content.get("bottom_cta"):
                has_bottom_cta = True
                new_frontmatter = new_frontmatter.replace("$newsletter", yaml_content["bottom_cta"])
            if yaml_content.get("has_affiliate_link"):
                has_affiliate_link = True
                new_frontmatter = new_frontmatter.replace("$affiliateLink", str(yaml_content["has_affiliate_link"]).lower())
            if yaml_content.get("title"):
                new_frontmatter = new_frontmatter.replace("$title", yaml_content["title"])
            if yaml_content.get("card"):
                new_frontmatter = new_frontmatter.replace("$card", yaml_content["card"])
            if yaml_content.get("description"):
                new_frontmatter = new_frontmatter.replace("$description", f'{yaml_content["description"]}')

            # Drop the optional fields when a post doesn't use them.
            if not has_bottom_cta:
                new_frontmatter = new_frontmatter.replace('  newsletter: "$newsletter"\n', "")
            if not has_affiliate_link:
                new_frontmatter = new_frontmatter.replace("  affiliateLink: $affiliateLink\n", "")

            # Your new frontmatter will be included here.
            replaced_content = replaced_content.replace(frontmatter, new_frontmatter)
Out of ~500 posts this updated every single one without issues. This was one of those times where spending an hour or whatever it was to get this automated and ironed out was faster than doing it manually.
I kept printing the new_frontmatter
until I was sure it was fully working.
Once it was working I modified the Python code to save the file with the
replaced_content
.
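That print first, write later workflow can be baked into a helper. Here's a minimal sketch, assuming every post starts with a `---` line and that the frontmatter ends at the next line containing only `---`. The `split_frontmatter` and `rewrite_post` names and the `dry_run` flag are hypothetical, they aren't part of the script above:

```python
from pathlib import Path


def split_frontmatter(content: str) -> tuple[str, str]:
    """Split a post into (frontmatter, body) using only the first two --- lines."""
    if not content.startswith("---\n"):
        raise ValueError("post does not start with a frontmatter block")
    # maxsplit=1 means a later --- in the body (such as an <hr>) is left alone.
    frontmatter, body = content[4:].split("\n---\n", 1)
    return frontmatter, body


def rewrite_post(path: Path, new_frontmatter: str, dry_run: bool = True) -> str:
    """Swap in new frontmatter, only writing to disk when dry_run is False."""
    _, body = split_frontmatter(path.read_text(encoding="utf-8"))
    updated = f"---\n{new_frontmatter.strip()}\n---\n{body}"
    if dry_run:
        # Print what would be written so it can be reviewed first.
        print(f"{path}:\n{new_frontmatter.strip()}\n")
    else:
        path.write_text(updated, encoding="utf-8")
    return updated
```

Defaulting to a dry run means nothing on disk changes until you explicitly ask for it.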
Updating Hugo Versions with Confidence
Given I was stuck using an old version of Jekyll I haven’t changed versions in many years. With Hugo this is going to be a different story. I imagine I’ll update reasonably often.
While Hugo has an extensive test suite, I feel a little uneasy updating to a newer version and not having any insight on if my page’s output changed. It would be like shipping application code to production without any tests.
I’m a big fan of the run script approach for adding project specific aliases or functions.
I have a ./run publish
command which builds a production version of the site.
It doesn’t push it anywhere, it only builds the site locally.
I added this code at the end of that command:
local backup_path="/tmp/published/"

if [ -n "${backup}" ]; then
  rm -rf "${backup_path}"
  cp -a published/ "${backup_path}"
fi
The basic idea is if I ./run publish --backup
it will copy the final output
to a temp directory. Now if I update Hugo to a new version and ./run publish
again I can diff the published/
(new version) and /tmp/published/
(old
version) directories to see if anything changed.
You can think of this as 1 big end-to-end test that’s really fast.
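A plain `diff -r` between the 2 directories does the job. If you'd rather have a structured summary, here's a sketch using Python's `filecmp` module. The `diff_trees` name is made up. It passes `ignore=[]` because `filecmp`'s default ignore list includes names like `tags`, which could silently skip a real directory on a blog:

```python
import filecmp
from pathlib import Path


def diff_trees(old_dir: str, new_dir: str) -> dict[str, list[str]]:
    """Compare two directory trees and list files that differ or exist on one side."""
    result = {"only_old": [], "only_new": [], "changed": []}

    def walk(cmp: filecmp.dircmp, prefix: Path) -> None:
        result["only_old"] += [str(prefix / name) for name in cmp.left_only]
        result["only_new"] += [str(prefix / name) for name in cmp.right_only]
        # dircmp's own diff_files uses a shallow os.stat comparison, so compare
        # the common files byte for byte instead.
        _, mismatch, errors = filecmp.cmpfiles(
            cmp.left, cmp.right, cmp.common_files, shallow=False
        )
        result["changed"] += [str(prefix / name) for name in mismatch + errors]
        for name, sub in cmp.subdirs.items():
            walk(sub, prefix / name)

    walk(filecmp.dircmp(old_dir, new_dir, ignore=[]), Path("."))
    return result
```

Calling it with /tmp/published/ and published/ would report which files only exist in one build and which ones changed.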
That gives me peace of mind that something didn’t malfunction and will also help me understand what is changing between Hugo versions. I’ve already updated from v0.130.0 to v0.131.0 using this approach and saw no changes which is the outcome I was expecting.
Over the last 6+ months I’ve done this a few times. It’s all good! I am currently running v0.144.2 but that will be upgraded over time.
This strategy also helped me uncover that Hugo injects a meta tag on your home
page with its version such as <meta name="generator" content="Hugo 0.144.2">
.
You can disable that by adding disableHugoGeneratorInject: true
in your Hugo
config file. I disabled it since I don’t like the idea of leaking internal
implementation details of how the site was generated, although technically in
this case it’s only static files, it’s a matter of principle. I have the
version defined as a Docker build arg so it’s easy to see in development.
I even ended up opening a pull request on Docker’s documentation to remove that tag. I don’t think it’s well known that Hugo does this.
Detecting default templates I’ve changed
Another thing to consider is I locally copied a few default Hugo pagination, RSS and sitemap templates and customized them. When updating Hugo versions it’ll be important to detect if I need to update those templates due to something changing.
I created 3 run script functions to check them:
# This is a private helper function.
function _diff_templates {
  local local_path="layouts/${1:-}"
  local remote_path="${2:-}"
  local tag="${3:-master}"
  local remote_url="https://raw.githubusercontent.com/gohugoio/hugo/${tag}/tpl/tplimpl/embedded/templates/${remote_path}"

  diff --color -u <(curl --silent "${remote_url}") "${local_path}"
}

function diff:pagination {
  _diff_templates "partials/pagination.html" "_partials/pagination.html" "${1:-master}"
}

function diff:rss {
  _diff_templates "_default/rss.xml" "rss.xml" "${1:-master}"
}

function diff:sitemap {
  _diff_templates "_default/sitemap.xml" "sitemap.xml" "${1:-master}"
}
Now when I run ./run diff:sitemap
it checks my local template vs what’s on
Hugo’s master branch. I can also run ./run diff:sitemap vXXX
if I want to
check that remote tag instead of master when I’m doing a version update.
Here’s an example output:
$ run diff:sitemap
--- /dev/fd/63 2024-08-05 08:52:14.751901664 -0400
+++ layouts/_default/sitemap.xml 2024-08-03 15:47:12.785479481 -0400
@@ -4,19 +4,19 @@
{{ range where .Pages "Sitemap.Disable" "ne" true }}
{{- if .Permalink -}}
<url>
- <loc>{{ .Permalink }}</loc>{{ if not .Lastmod.IsZero }}
+ <loc>{{ .Permalink | strings.TrimSuffix "/" }}</loc>{{ if not .Lastmod.IsZero }}
<lastmod>{{ safeHTML ( .Lastmod.Format "2006-01-02T15:04:05-07:00" ) }}</lastmod>{{ end }}{{ with .Sitemap.ChangeFreq }}
<changefreq>{{ . }}</changefreq>{{ end }}{{ if ge .Sitemap.Priority 0.0 }}
<priority>{{ .Sitemap.Priority }}</priority>{{ end }}{{ if .IsTranslated }}{{ range .Translations }}
<xhtml:link
rel="alternate"
hreflang="{{ .Language.LanguageCode }}"
- href="{{ .Permalink }}"
+ href="{{ .Permalink | strings.TrimSuffix "/" }}"
/>{{ end }}
<xhtml:link
rel="alternate"
hreflang="{{ .Language.LanguageCode }}"
- href="{{ .Permalink }}"
+ href="{{ .Permalink | strings.TrimSuffix "/" }}"
/>{{ end }}
</url>
{{- end -}}
When I see that, I know the only differences are my expected changes to remove trailing slashes. If I saw anything else in that output I would know I need to go back and update my local template.
Additional Quality Control After 6+ Months
Over the last 6+ months I ran into a few interesting things not covered above.
Running out of disk space on my dev box
My dev box is 10+ years old. My WSL 2 instance is on my primary Windows drive which is a 256 GB SSD. Long story short, I ran out of disk space mid-build. I saw the error so I knew not to push that build, but adding more layers of protection is still worth it.
This can be a problem because the site will incrementally be built. If you run out of space mid-way, you could end up with a really broken site.
In my run script to publish
, I added this condition to prevent that. The
amount will need to be adjusted for your site but I know for sure mine doesn’t
require more than 1 GB to build:
# /c in this case is my drive's mount point.
if [ "$(df --output=avail /c | tail -n 1)" -lt 1000000 ]; then
  echo "Aborting, there's not enough disk space to complete this build!"
  exit 1
fi
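If your build tooling is in Python instead of a shell script, the same guard is a couple of lines with `shutil.disk_usage`. This is a sketch and the `has_free_space` name is made up. The default threshold is about 1 GB, roughly matching the check above:

```python
import shutil


def has_free_space(path: str, required_bytes: int = 1_000_000_000) -> bool:
    """Return True when the filesystem holding path has enough free space."""
    return shutil.disk_usage(path).free >= required_bytes
```

You'd call it with the mount point or output directory before starting a build and abort when it returns False.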
Parallel build race conditions
Hugo will do parallel processing to build your site across all of your CPU
cores by default and I ran into what I think are a number of race conditions
with this where I ended up having partial link tags which were broken because
relref
didn’t work correctly. This resulted in broken links and YouTube
embeds.
This is a scary one because there are no warnings or errors thrown by Hugo. You only notice it if you happen to look at the page’s output.
I was able to reproduce it many times in development and production builds but I cannot reproduce it consistently so the author of Hugo didn’t accept the bug report.
I ended up setting the HUGO_NUMWORKERMULTIPLIER=1
env variable when I do
production builds to disable parallel processing. I’ve done hundreds of builds
with this set up and it never happened again as far as I know. Builds are
already really fast.
I added both of these checks to my publish
script. Both grep commands look
for different types of failures that I’ve seen first hand. Each path is a
location where links can be present:
local paths=("published/blog" "published/confirmed" "published/courses" "published/galleries" "published/newsletter" "published/podcast" "published/subscribed" "published/unsubscribed" "published/work-together")
# Note to reader: replace () with [], I had to replace it for this blog post
# otherwise my site wouldn't build because the pattern was being found. :D
if grep -R "\(.*\)/" ${paths[@]}; then
  echo
  echo "Aborting, this build produced broken Markdown links!"
  exit 1
fi
# I know this can be improved but it runs in ~20ms on 10+ year old hardware.
#
# Note to reader: replace 5 with %, I had to replace it for this blog post
# otherwise my site wouldn't build because the pattern was being found again!
if grep -R "5{" ${paths[@]} | grep -v "5{\$fg" | grep -v "5{}" | grep -v ";5{"; then
  echo
  echo "Aborting, this build produced broken Markdown links!"
  exit 1
fi
I don’t know if it’s 100% fool-proof but I haven’t gotten any reports from anyone saying they found something busted on my site and neither regex has returned a match since I disabled parallel builds. For the first month or so of builds I spot checked at least 20 pages per deploy and found nothing out of the ordinary.
I’m not happy about this workaround but I’m reasonably confident that builds are coming out correctly.
# Safely Performing the Switch Live
I sync’d everything to a separate directory on my server so I didn’t overwrite
what’s there. I also created a separate nginx vhost file on a separate
hugo.nickjanetakis.com
sub-domain so I could privately test things without
touching my real site.
I even went as far as installing nginx locally to pre-test a few new redirect rules.
By the way, one difference with Hugo and pretty URLs is it generates an
index.html
file inside of a directory named after your content. For example
blog/hello-world/index.html
. Jekyll would have generated
blog/hello-world.html
.
To account for that I adjusted my nginx try_files
to try_files $uri/index.html $uri.html $uri $uri/ =404;
so it can find them.
Once I was happy everything worked I backed up my original nginx virtual host
file, copied over the new one and renamed the server_name
to point to my root
domain. Then I deleted the A record for hugo
because forgetting to do that in
the past led to very bad things.
At this point I was live and if I wanted to revert, it would have taken a few seconds by switching the nginx config back to the original.
I didn’t have to revert. Everything worked on the first try.
The only issue I’m aware of was I forgot to diff my atom.xml
file like I did
for the sitemap. Thankfully a reader of the blog emailed me and let me know
about a day after it went live.
The Hugo version included all pages whereas the Jekyll version only included the latest 25 blog posts. Needless to say he was wondering what happened when a bunch of non-blog post pages appeared in his feed.
I fixed that very quickly and it’s been smooth sailing as far as I can tell. If you find something that looks off, please let me know!
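A check like this could have caught it before going live. It's a sketch that assumes the feed uses RSS 2.0 style `<item>` and `<link>` elements (which is what Hugo's default RSS template produces) and that blog post URLs share a common prefix. The `check_feed` name is hypothetical:

```python
import xml.etree.ElementTree as ET


def check_feed(xml_data, prefix: str, limit: int = 25) -> list[str]:
    """Return a list of problems found in an RSS feed, empty if it looks right."""
    links = [
        (item.findtext("link") or "").strip()
        for item in ET.fromstring(xml_data).iter("item")
    ]
    problems = []
    if len(links) > limit:
        problems.append(f"{len(links)} items, expected at most {limit}")
    problems += [f"not a blog post: {link}" for link in links if not link.startswith(prefix)]
    return problems
```

Running it against the published atom.xml with a prefix like your site's /blog/ URL would flag any non-blog pages that sneak in.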
In the last 6+ months I haven’t noticed anything else being off. The amount of preparation and treating this as a detail oriented problem really paid off in the end.
# Final Thoughts
In the end I’m happy I made the switch and would do it again but I can’t say that I’m 100% happy with Hugo. Like any tech choice, it’s nearly impossible to find something that’s perfect. I think Hugo has enough pros that it’s worth using if you have a decent sized site.
I didn’t have any real experience with Go prior to this. I appreciate the explicitness of its templating language but at the same time it’s really verbose at times and has a few confusing (to me) rules and foot guns. I generally prefer Liquid over it for now but I’m willing to give it a chance.
I also like how Jekyll is less restrictive on how and where you can use certain features. For example Hugo doesn’t let you use partials in your content files, shortcodes are only available in your content files and render hooks only apply to content files.
What this really boils down to is you have to duplicate a number of things between layouts and content files and also sometimes create both a shortcode and partial just so you can call the shortcode which in turn calls the partial due to this restriction.
Sometimes you can’t use either. For example my nav bar has links and I have
to hard code URLs like /blog
and /about
as the href=""
value because
you can’t use relref
in a layout.
Lastly Jekyll lets you define frontmatter in your layouts which makes it very easy to make derivative layouts that are slightly different. Hugo doesn’t let you do that so you have to take more convoluted approaches to solve the same problem.
This really reminds me of Ruby’s “sharp knives” philosophy. You can write clean code and if you need to reach for something sharp to solve a specific problem the language has ways to do it and trusts that you know what you’re doing.
Go has a different philosophy and it leaks into the tools being created with it. Hugo feels like the author made certain design decisions and you have to stick to them how they are.
In Hugo’s defense it does have a lot of features and I haven’t deeply explored each one. There’s no way I would call myself an expert. It’s also a very well thought out project and I can appreciate all of the hard work, experience and design that went into creating it.
My only concern with Hugo is if I want to do something custom, I’m not sure it’ll be possible with what’s available.
For example on my Running in Production podcast site, when I display the show note topics in a list I include clickable timestamps which seek straight to that second in the audio player. That’s automated by a Jekyll plugin.
That site is also made with Jekyll. Each podcast post has Markdown that looks like this:
## Topics Include
- 7:13 -- The process to build a custom piece of hardware
- 12:33 -- 3D printing a custom case
- 15:16 -- The assembly process and selling about 150-200ish devices a month
I wrote a tiny Jekyll plugin to convert the above into what you see on the site:
# Developed by Nick Janetakis - https://nickjanetakis.com
#
# Usage in layouts: {{ content | audio_seek }}

require "jekyll"
require "nokogiri"

def hh_mm_ss_to_seconds(human_time)
  # This accepts hh:mm:ss, mm:ss or ss.
  human_time.split(":").map { |a| a.to_i }.inject(0) { |a, b| a * 60 + b}
end

module Jekyll
  module AudioSeek
    def audio_seek(content)
      doc = Nokogiri::HTML.fragment(content)

      # Stop if we couldn't parse with HTML.
      return content unless doc

      # Note to reader: I had to replace / with | in (2) spots in the line below
      # to prevent the site building regex from blocking this. This is the first
      # time a false positive came up in 9 months. I will come up with a more
      # long term solution but I will leave this in to demonstrate how that
      # original regex could have false positives.
      doc.xpath('h2[contains(text(), "Topics Include")]|following-sibling::ul[1]|li').each do |li|
        original_inner_html = li.inner_html
        split_delimiter = " – "

        original_inner_html_split = original_inner_html.split(split_delimiter)
        seek_time_human = original_inner_html_split.first
        seek_time_seconds = hh_mm_ss_to_seconds(seek_time_human)
        rest = original_inner_html_split.last

        seek_link = "<a href='#' data-audio-seek='#{seek_time_seconds}' class='audio-seek'>#{seek_time_human}</a>"
        new_inner_html = "#{seek_link}#{split_delimiter}#{rest}"

        li.inner_html = new_inner_html
      end

      doc.to_s
    end
  end
end

Liquid::Template.register_filter(Jekyll::AudioSeek)
At a high level, all that’s doing is:
- Look for that “Topics Include” heading
- Grab the next <ul> so this logic only applies to topics
- Split on --
- Convert the 7:13 into seconds
- Insert a link with a data attribute around the 7:13 text
If anyone is an expert with Hugo, I’d love to see how you can do the same with Hugo.
At the time of this post, there are render hooks for links and headings but not for lists. Even if there were one, you’d only want to scope this transformation to a specific list, not all of them. I know it can be solved with JavaScript but I don’t want to do that logic client side.
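One option that stays outside of Hugo's templating is to post-process the built HTML. Here's a rough Python sketch of the same steps as the Jekyll plugin. It's an assumption on my part that this would fit into a build, not something I've run in production. It expects the rendered list items to contain a literal en dash like the Ruby version does, so adjust the pattern if your output uses an entity such as `&ndash;`. The `add_seek_links` name is made up:

```python
import re


def hh_mm_ss_to_seconds(human_time: str) -> int:
    """Accept hh:mm:ss, mm:ss or ss and return the total number of seconds."""
    seconds = 0
    for part in human_time.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


# The first <ul> that directly follows the "Topics Include" heading.
TOPICS = re.compile(r"(<h2[^>]*>Topics Include</h2>\s*<ul>)(.*?)(</ul>)", re.DOTALL)
# A list item that starts with a timestamp, an en dash and then the topic.
TOPIC_ITEM = re.compile(r"<li>(\d+(?::\d+){0,2}) – (.*?)</li>", re.DOTALL)


def add_seek_links(html: str) -> str:
    """Wrap each topic timestamp in a link carrying a data-audio-seek attribute."""

    def link_item(m: re.Match) -> str:
        seconds = hh_mm_ss_to_seconds(m.group(1))
        link = f"<a href='#' data-audio-seek='{seconds}' class='audio-seek'>{m.group(1)}</a>"
        return f"<li>{link} – {m.group(2)}</li>"

    def link_list(m: re.Match) -> str:
        return m.group(1) + TOPIC_ITEM.sub(link_item, m.group(2)) + m.group(3)

    return TOPICS.sub(link_list, html)
```

Only the list right after the heading is touched, which keeps the transformation scoped to topics.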
The video below mostly goes over this post with a little more color and I do jump into the blog’s code base from time to time.
# Video
Timestamps
- 0:37 – Why did I switch?
- 6:21 – Refactor or grand rewrite?
- 9:49 – Solving the big problems before I start
- 16:04 – Switching themes too?
- 18:46 – High level features
- 27:44 – Hello world, Hugo is fast
- 32:14 – Learning Hugo quickly
- 33:53 – Hugo vs Jekyll terms
- 41:39 – YYYY-MM-DD filenames but not URLs
- 44:45 – Splitting posts and drafts
- 49:28 – Removing trailing slashes
- 53:06 – Custom file digests
- 56:45 – Syntax highlighting
- 59:11 – Latest N posts
- 1:02:06 – Apostrophes and symbols
- 1:05:30 – Beware of leading spaces
- 1:10:04 – Customized blog tags
- 1:13:04 – Looping over data files
- 1:17:01 – Render hook for external links
- 1:20:06 – Render hook for named anchor headings
- 1:24:17 – Table of contents
- 1:29:47 – Sitemap
- 1:32:46 – Twitter embeds
- 1:35:42 – YouTube embeds
- 1:37:23 – A checklist to double check your work
- 1:40:56 – Generate new frontmatter with Python
- 1:44:31 – Updating Hugo with confidence
- 1:47:08 – Detecting default template changes
- 1:50:13 – Running out of disk space
- 1:52:17 – Parallel build race conditions
- 1:56:36 – Safely switching your live site
- 2:01:17 – Final thoughts
- 2:04:58 – Not sure about custom behaviors
Did you migrate to Hugo? How did it go? Let me know below.