3 Ways to Get a File's Extension When Writing a Shell Script
Two of the options are POSIX compliant while one depends on Bash. You can pick one based on your preference.
Let’s start with the Bash specific solution:
Here’s how to get the file extension:
# The space is important: without it, ${file:-3} means "use 3 if file is unset or empty".
# Alternatively you can write this as: ${file:(-3)}
$ file="hello.csv" ; echo "${file: -3}"
csv
If you wanted to get the filename instead you can do:
# We're using 4 here instead of 3 since we want to exclude the dot.
# Negative lengths like this need Bash 4.2 or newer.
$ file="hello.csv" ; echo "${file:0:-4}"
hello
The above is handy if you don’t know what the file extension is but you know it will always be 3 characters long. In my opinion this is pretty readable but it’s not flexible if you want to support extensions of varying lengths.
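To make the fixed-length limitation concrete, here is a minimal sketch. The last_chars function is a hypothetical helper made up for this illustration, and it assumes Bash since substring expansion is not part of POSIX sh.

```shell
#!/usr/bin/env bash
# Requires Bash: substring expansion is not part of POSIX sh.

# last_chars is a hypothetical helper, not something from the post above.
# It prints the last COUNT characters of NAME.
last_chars() {
  local name="$1" count="$2"
  printf '%s\n' "${name: -$count}"
}

last_chars "hello.csv" 3    # csv
last_chars "hello.json" 3   # son (the "j" is cut off)
last_chars "hello.md" 3     # .md (the dot is included)
```

Only the first call returns a clean extension, which is why this approach works best when every file you handle has a 3 character extension.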
POSIX compliant solution with no external process calls:
$ file="hello.csv" ; echo "${file##*.}"
csv
$ file="hello.csv" ; echo "${file%%.*}"
hello
If you have a file extension with multiple dots such as .tar.gz, you have the option of grabbing just the tar.gz or gz:
$ file="hello.tar.gz" ; echo "${file##*.}"
gz
# Notice we're using # instead of ##.
$ file="hello.tar.gz" ; echo "${file#*.}"
tar.gz
$ file="hello.tar.gz" ; echo "${file%%.*}"
hello
# The same pattern as above applies here with %.
$ file="hello.tar.gz" ; echo "${file%.*}"
hello.tar
The above is more flexible than the Bash solution. It supports file extensions
with an unknown length. It’s also more portable since it’s POSIX compliant. The
downside is it’s less readable and the syntax is one of those things you might
find yourself Googling for if you rarely use it. I always forget if it’s .* or *.
when I do this sort of thing a few times a year.
Still, I prefer this solution and would typically default to it for most use cases.
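One thing to watch out for is a filename with no dot at all. In that case ${file##*.} has nothing to strip and returns the whole name. Here is a rough sketch of one way to guard against that. The get_ext function is a hypothetical helper, and it assumes you want an empty result when there is no extension.

```shell
# get_ext is a hypothetical helper, not something from the post above.
# It prints the text after the last dot, or an empty line if there is no dot.
get_ext() {
  case "$1" in
    *.*) printf '%s\n' "${1##*.}" ;;
    *)   printf '\n' ;;
  esac
}

get_ext "hello.csv"      # csv
get_ext "hello.tar.gz"   # gz
get_ext "README"         # (empty line)
```

As a memory aid, # trims from the front of the value and % trims from the back, and doubling either character makes the match as long as possible. Note that this sketch would still report bashrc for a hidden file such as .bashrc, so you may want an extra case for that.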
POSIX compliant solution using basename:
If all you’re interested in is the filename you can do this:
$ basename "hello.csv" .csv
hello
If you want to get the file extension then you already know it because you had
to reference it when using basename for the 2nd argument. You could always
put that into a variable and use either the extension or filename as you see
fit.
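Here is a small sketch of the variable idea. The variable names are just for illustration, and it assumes the extension is known ahead of time.

```shell
file="hello.csv"
ext="csv"   # assumption: the extension is known ahead of time

# basename strips the suffix given as its 2nd argument.
name="$(basename "$file" ".$ext")"

printf '%s\n' "$name"   # hello
printf '%s\n' "$ext"    # csv
```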
If you don’t mind the extra process call this is pretty clean. The benefit of
this approach over the first POSIX compliant solution is that if you have a path
such as /tmp/hello.csv, it will extract hello whereas the first one will
return /tmp/hello. Then again, maybe you want to keep the path, in which case
the other option works. It’s up to you!
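To compare how the two POSIX approaches treat a path, here is a quick sketch. The variable names are made up for the example, and the ${path##*/} step is an extra technique not covered above that strips the directory without calling an external process.

```shell
path="/tmp/hello.csv"

# Parameter expansion alone keeps the directory.
with_dir="${path%.*}"

# Stripping everything up to the last slash first drops the directory,
# still without an external process call.
base="${path##*/}"
name="${base%.*}"

# basename drops both the directory and the suffix in one call.
name_from_basename="$(basename "$path" .csv)"

printf '%s\n' "$with_dir"             # /tmp/hello
printf '%s\n' "$name"                 # hello
printf '%s\n' "$name_from_basename"   # hello
```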
The video below shows the different solutions in action.
# Demo Video
Timestamps
- 0:29 – Bash specific way to get the extension or file name
- 2:11 – POSIX compliant solution with native shell
- 4:47 – Another POSIX compliant solution using basename
Which one do you prefer? Let me know below.