Thursday, August 20, 2009

How to Bulk Rename Files in Linux (Terminal or GUI)

If you have a directory of files that you would like to bulk rename, you can use the rename command from the terminal.

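UPDATE: I believe the Perl-based rename command is installed by default only on Debian-based Linux distros (other distros often ship a different rename, from util-linux, that uses a different syntax), but there are instructions on adding it to other distros below.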

The syntax for the rename command is:

rename [ -v ] [ -n ] [ -f ] perlexpr [ files ]

-v means "verbose": rename prints the name of each file as it renames it. It is a good idea to use this option so you can keep track of what is being renamed. It is also a good idea to do a test run first with -n, which renames nothing but shows you a list of the files that would be renamed. -f means "force": it allows rename to overwrite existing files.

The "perlexpr" part of the command is a Perl expression. Don't panic yet...

The "rename" command in action

Here is an example of the rename command:

rename -n 's/\.htm$/\.html/' *.htm

The -n means that it's a test run and will not actually change any files. It will show you a list of files that would be renamed if you removed the -n. In the case above, it will convert all files in the current directory from a file extension of .htm to .html.

If the output of the above test run looked ok then you could run the final version:

rename -v 's/\.htm$/\.html/' *.htm

The -v is optional, but it's a good idea to include it because it is the only record you will have of changes that were made by the rename command as shown in the sample output below:

$ rename -v 's/\.htm$/\.html/' *.htm
3.htm renamed as 3.html
4.htm renamed as 4.html
5.htm renamed as 5.html
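On a system where the Perl-based rename is not installed, the same test-run-then-rename workflow can be approximated in plain shell. This is only a sketch of a swapped-in technique, not the rename command itself: htm_to_html is a hypothetical helper name, and its "dry" and "run" modes are assumptions that stand in for -n and -v.

```shell
#!/bin/sh
# Plain-shell stand-in for: rename [-n] -v 's/\.htm$/\.html/' *.htm
# htm_to_html is a hypothetical helper, not part of any package.
htm_to_html() {
  mode=$1   # "dry" previews (like -n); anything else renames (like -v)
  dir=$2    # directory that holds the .htm files
  for f in "$dir"/*.htm; do
    [ -e "$f" ] || continue          # the wildcard matched nothing
    new="${f%.htm}.html"             # strip the trailing .htm, append .html
    if [ -e "$new" ]; then           # never overwrite (rename needs -f for that)
      echo "skipping $f: $new already exists" >&2
      continue
    fi
    if [ "$mode" = "dry" ]; then
      echo "would rename $f to $new"
    else
      mv -- "$f" "$new" && echo "renamed $f to $new"
    fi
  done
}
```

Running htm_to_html dry . previews the changes in the current directory, and htm_to_html run . carries them out.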

The tricky part in the middle, the text between the single quotes, is a Perl substitution with regular expressions:

rename -v 's/\.htm$/\.html/' *.htm

Tip: Perl's own documentation includes an introduction to regular expressions (perlretut).

Basically the "s" means substitute. The syntax is s/old/new/ — substitute the old with the new.

A . (period) has a special meaning in a regular expression — it means "match any character". We don't want to match any character in the example above. It should match only a period. The backslash is a way to "escape" the regular expression meaning of "any character" and just read it as a normal period.

The $ means the end of the string. \.htm$ means that it will match .htm but not .html.
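The effect of the backslash and the $ can be checked without renaming anything. The sketch below uses grep -E rather than Perl, which is an assumption made for portability; for these simple patterns the two regex flavors behave the same. matches is a hypothetical helper name.

```shell
#!/bin/sh
# matches REGEX STRING prints "yes" or "no" (hypothetical helper built on grep -E).
matches() {
  if printf '%s\n' "$2" | grep -Eq -- "$1"; then echo yes; else echo no; fi
}

matches '.htm'   'xhtm'       # yes: the unescaped . matches the "x"
matches '\.htm'  'xhtm'       # no:  \. requires a literal period
matches '\.htm$' 'page.htm'   # yes: .htm sits at the end of the name
matches '\.htm$' 'page.html'  # no:  the $ anchor rejects the trailing "l"
```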

It's fairly basic — substitute .htm with .html:

's/\.htm$/\.html/'

The last part of the command, *.htm, means to apply the rename command to every file that ends with .htm (the * is a wildcard).

rename -v 's/\.htm$/\.html/' *.htm
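It may help to know that the shell, not rename, expands the wildcard: rename only ever receives the resulting list of names. A minimal sketch, assuming a scratch directory created with mktemp; count_htm is a hypothetical helper name.

```shell
#!/bin/sh
# count_htm DIR prints how many names *.htm expands to inside DIR.
count_htm() {
  (
    cd "$1" || exit 1
    set -- *.htm              # the shell expands the wildcard here
    [ -e "$1" ] || set --     # an unmatched pattern is left as literal text
    echo "$#"
  )
}

demo=$(mktemp -d)
touch "$demo/3.htm" "$demo/4.htm" "$demo/notes.html"
count_htm "$demo"   # prints 2: notes.html does not end in .htm
rm -r "$demo"
```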

Other Examples

Maybe you have a digital camera that takes photos with filenames something like 00001234.JPG, 00001235.JPG, 00001236.JPG. You could make the .JPG extension lowercase with the following command executed from the same directory as the images:

rename -v 's/\.JPG$/\.jpg/' *.JPG

Here is the output of the above command:

$ rename -v 's/\.JPG$/\.jpg/' *.JPG
00001111.JPG renamed as 00001111.jpg
00001112.JPG renamed as 00001112.jpg
00001113.JPG renamed as 00001113.jpg

That is simple enough, as it is similar to the .html example earlier.

Tip: Before trying more complicated renaming like in the example below, do a test run with the -n option as described at the beginning of this tutorial.

You could also bulk rename them with something descriptive at the beginning like this:

rename -v 's/(\d{8})\.JPG$/BeachPics_$1\.jpg/' *.JPG

That will change filenames that have the pattern ########.JPG (8 digits and a capital .JPG) to something like BeachPics_########.jpg (the same 8 digits, with the extension changed to lowercase .jpg). Here is a test run with the -n option:

$ rename -n 's/(\d{8})\.JPG$/BeachPics_$1\.jpg/' *.JPG
00001111.JPG renamed as BeachPics_00001111.jpg
00001112.JPG renamed as BeachPics_00001112.jpg
00001113.JPG renamed as BeachPics_00001113.jpg

Here's a quick breakdown of the Perl substitution with the regular expression above.

The (\d{8}) section of the expression below matches exactly 8 digits. The parentheses mean to save those 8 digits for later because they are going to be used again in the second half of the substitution:

's/(\d{8})\.JPG$/BeachPics_$1\.jpg/'

The replacement half, BeachPics_$1\.jpg, adds the string BeachPics, an underscore, and then the text captured by the parentheses in the first half of the substitution. $1 inserts the string matched by the first set of parentheses, in this case the 8 digits. If you have more than one set of parentheses you can access the second set with the Perl variable $2 and so on.

's/(\d{8})\.JPG$/BeachPics_$1\.jpg/'
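To experiment with the capture group without touching any files, the substitution can be applied to a plain string. The sketch below swaps in sed -E for Perl, which is an assumption made so it runs without the rename command: sed writes the captured text as \1 where Perl writes $1, and [0-9] stands in for \d. beach_name is a hypothetical helper name.

```shell
#!/bin/sh
# beach_name NAME prints what NAME would become under the substitution
# s/(\d{8})\.JPG$/BeachPics_$1\.jpg/, rewritten here in sed -E syntax.
beach_name() {
  printf '%s\n' "$1" | sed -E 's/([0-9]{8})\.JPG$/BeachPics_\1.jpg/'
}

beach_name 00001111.JPG   # prints BeachPics_00001111.jpg
beach_name notes.txt      # prints notes.txt (no match, so unchanged)
```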

Final Refinement

The following variation would make even cleaner-looking filenames. See if you can figure out how it works:

$ rename -n 's/\d{5}(\d{3})\.JPG$/BeachPics_$1\.jpg/' *.JPG
00000123.JPG renamed as BeachPics_123.jpg
00000124.JPG renamed as BeachPics_124.jpg
00000125.JPG renamed as BeachPics_125.jpg
