Filename Sanitizer (v0.6)

Summary: This script eases the migration of data from a Mac or Linux PC to a FAT32 flash drive that also has to be readable on Windows. It searches for file and directory names that contain questionable characters and replaces each of those characters with a hyphen ‘-’. It also removes leading spaces, trailing spaces, and trailing periods.
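
The clean-up rules described above can be tried on a single name with a small sketch like the one below. The `sanitize_name` function is a hypothetical helper written for illustration; it is not part of the script, although it uses the same `sed` expressions.

```bash
#!/bin/bash
# Hypothetical helper (not part of the script): apply the clean-up rules
# to one name and print the result.
sanitize_name() {
	local name="$1"
	# Strip leading spaces, then trailing spaces and periods.
	name=$(printf '%s' "${name}" | sed -e 's/^ *//' -e 's/[ .]*$//')
	# Replace each questionable character with a hyphen.
	name=$(printf '%s' "${name}" | sed 's/[][%+"*;:?|,=]/-/g')
	printf '%s\n' "${name}"
}

sanitize_name '  What? A "draft": v2.  '   # prints: What- A -draft-- v2
```

A name made up entirely of spaces and periods comes out empty, which is why the script falls back to an "Untitled" name in that case.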

Requirements:

  • find
  • sed
  • tac (coreutils)
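
A quick way to confirm that these tools are installed before running the script is sketched below. The `missing_tools` helper is a hypothetical name, not part of the script; it relies only on the shell built-in `command -v`. Keep in mind that `tac` comes from GNU coreutils and is usually absent from a stock Mac install.

```bash
#!/bin/bash
# Hypothetical helper: print the name of every listed tool that is not on the PATH.
missing_tools() {
	local tool
	for tool in "$@"; do
		command -v "${tool}" > /dev/null 2>&1 || printf '%s\n' "${tool}"
	done
}

missing=$(missing_tools find sed tac)
if [ -n "${missing}" ]; then
	echo "Missing required tool(s): ${missing}" >&2
fi
```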

Code:

#!/bin/bash

LOG="${HOME}/.FilenameSanitizer.log"
TMPLOG="/tmp/FilenameSanitizer-$(whoami)"

##Write Date & Time to Log File
echo "-------------------------------------" > "${LOG}"
date "+%B %d, %Y" >> "${LOG}"
echo "-------------------------------------" >> "${LOG}"

USAGE="Usage: ${0} -s <source>"

echo "FilenameSanitizer"
echo "Scripted by Blake Johnson"
echo "http://www.simplescripts.net/"
echo

while getopts ':s:h' OPTION; do  ##The leading ':' lets the script report option errors itself.
	case ${OPTION} in
		s) SOURCE="${OPTARG}";;
		h) echo "${USAGE}" >&2
			exit 2;;
		\?) echo "Unknown option \"-${OPTARG}\"." >&2
			echo "${USAGE}" >&2
			exit 2;;
		:) echo "Option \"-${OPTARG}\" needs an argument." >&2
			echo "${USAGE}" >&2
			exit 2;;
		*) echo "${USAGE}" >&2
			exit 2;;
	esac
done

if [ -z "$SOURCE" ]
then
	echo "One or more parameters are missing."
	echo "${USAGE}"
	exit 2
fi



##This is what the script is looking for...
##1. Illegal Characters
##2. Leading Spaces
##3. Trailing Spaces and periods.
find "${SOURCE}" \
	-regex ".*[][%+\"*;:?|,=]+[^/]*$" \
	-o -regex ".*/[ ]+[^/]*$" \
	-o -regex ".*[ .]+$" \
	2>> "${LOG}" | tac > "${TMPLOG}"  ##Reversing the results renames deeper paths before their parent directories, so in most cases one run of this script is enough.

UpperBound=$(wc -l < "${TMPLOG}")  ##Count the number of results found.

if [ "${UpperBound}" -eq 0 ]  ##Check to make sure that 'find' found some files that need to be renamed.
then
	echo "There is nothing to rename."
	exit 1
else
	echo "${UpperBound} file(s) found."
	i=0
	e=0
	s=0
	while [ "${i}" -lt "${UpperBound}" ]
	do
		(( i += 1 ))
		TMPLINE=$(sed -n "${i}p" "${TMPLOG}")
		InputLocation="${TMPLINE%/*}"
		InputName="${TMPLINE##*/}"
	
		NewName="${InputName}"
		NewName=$(printf '%s' "${NewName}" | sed -e 's/^ *//' -e 's/[ .]*$//')  ##Strip leading spaces and trailing spaces / periods.
		NewName=$(printf '%s' "${NewName}" | sed "s/[][%+\"*;:?|,=]/-/g") ##Replace illegal characters.
		
		if [ "${NewName}" == "" ]  ##Check to make sure the new file name is not blank.
		then
			NewName="Untitled - ${i}"
		fi
		
		OutputPath="${InputLocation}/${NewName}"
		
		if [ "${OutputPath}" == "${TMPLINE}" ] ##Check to make sure that the name was actually changed.
		then
			echo -e "\"${TMPLINE}\" [ \033[1;31mNot Renamed\033[0m ]"
			(( e += 1 ))
			continue  ##Skip this entry instead of aborting the whole run.
		fi
		
		if [ -e "${OutputPath}" ] ##Check to make sure that the new name won't collide with an existing file or directory.
		then
			OutputPath="${OutputPath}.RENAME_${i}"
		fi
		
		echo -n "Renaming \"${TMPLINE}\" to \"${OutputPath}\"... "
		mv "${TMPLINE}" "${OutputPath}" >> "${LOG}" 2>&1
		
		if [ "$?" == 0 ] ##Check to see if the 'mv' command failed or not.
		then
			echo -e "[ \033[1;32mRenamed\033[0m ]"
			(( s += 1 ))
		else
			echo -e "[ \033[1;31mNot Renamed\033[0m ]"
			(( e += 1 ))
		fi
		echo
		unset NewName OutputPath TMPLINE
	done
	
	echo "${s} / ${i} file(s) renamed successfully."
	echo "${e} file(s) could not be renamed."
fi

rm -f "${TMPLOG}" >> "${LOG}" 2>&1
exit 0
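
The script reverses the output of `find` with `tac` because `find` normally lists a directory before its contents, and renaming the directory first would invalidate the paths of everything inside it. A possible alternative, shown here only as a sketch and not as what the script does, is the standard `-depth` option of `find`, which lists contents before their parent directory. The directory and file names below are made up for the demonstration, and `-name '* '` stands in for the script's regular expressions.

```bash
#!/bin/bash
# Build a throwaway tree in which both a directory and a file need renaming.
demo=$(mktemp -d)
mkdir "${demo}/bad dir "
touch "${demo}/bad dir /bad file "

# -depth lists a directory's contents before the directory itself,
# so the file is printed first and the directory second.
listing=$(find "${demo}" -depth -name '* ')
printf '%s\n' "${listing}"

rm -r "${demo}"
```

With this ordering the file can be renamed while its parent path is still valid, which is the same effect the script gets from `tac`.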

Changes:
v0.6: Simplified the search command. Added some more questionable characters. Added ability to set parameters at the command line. Displays usage line upon error.
v0.5: Name changed to “Filename Sanitizer”. Simplified the replacement of illegal characters. Parses the results in reverse order. Added check for duplicate names. Commented code a bit.
v0.4: Search results are now written to a file instead of a variable, which fixes a bug with trailing spaces. Removes leading and trailing spaces and trailing periods. Counts and shows the success rate.
v0.3: Name changed to Filename Cleanser. Added more questionable characters to search for.
v0.2: Errors are now logged, the exit status of each rename command is checked, and some other revisions were made.
v0.1: Initial release.

Bugs / Unimplemented Features:

  • File names with repeating periods _may_ end up disappearing off the face of the planet. (I don’t know why, or how.)
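
One possible explanation for the disappearing files, offered as a guess rather than a confirmed diagnosis: when stripping trailing periods produces a name that matches an existing directory, a collision check that only looks for regular files (`-f`) does not fire, and `mv` then moves the file into that directory instead of renaming it. The sketch below reproduces that behaviour in a throwaway directory; the names `notes` and `notes..` are made up for the demonstration.

```bash
#!/bin/bash
demo=$(mktemp -d)
mkdir "${demo}/notes"
touch "${demo}/notes.."

target="${demo}/notes"            # "notes.." with its trailing periods stripped
if [ -f "${target}" ]             # False: the target is a directory, not a regular file.
then
	target="${target}.RENAME_1"
fi
mv "${demo}/notes.." "${target}"  # mv moves the file into the directory.

moved_to=$(ls -A "${demo}/notes")
echo "Contents of notes/: ${moved_to}"   # prints: Contents of notes/: notes..

rm -r "${demo}"
```

Testing with `-e` instead of `-f` would catch directories as well as regular files.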