CLI “magic”

Another day, another question. A friend of mine is working on his thesis and wanted to replace all instances of a term throughout a range of files. The problem could be formulated as:

For all files of type X, search through them, replacing every instance of foo with bar.

In this particular case, the term needed to be converted to upper case, so “foo” needed to become “FOO”. Why? Not my place to speculate, and not important to the problem or the solution.

Building upon previous experiences, sed was called back into service.

$ sed -i 's/foo/FOO/gi' FILE

does the job. But on one file only. Time to widen the comfort zone a bit. I normally don’t use loops in the shell, mostly because I haven’t taken the time to learn the syntax, but also out of a good portion of respect for them. Whatever a command does is magnified when it runs inside a loop, so loops should always be handled with a great deal of care.
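In that spirit of respect, the substitution can be previewed before anything is modified in place. The following is a minimal sketch, assuming GNU sed (the case-insensitive `i` flag is a GNU extension); the scratch directory and the file name sample.txt are made up for illustration.

```shell
# Work on a scratch file in a temporary directory, so nothing real is touched.
tmpdir=$(mktemp -d)
printf 'foo Foo FOO food\n' > "$tmpdir/sample.txt"

# Without -i, sed writes the result to stdout and leaves the file alone.
preview=$(sed 's/foo/FOO/gi' "$tmpdir/sample.txt")
echo "$preview"
```

Note that the pattern also matches inside longer words, so “food” comes out as “FOOd”. Whether that is wanted depends on the text being edited.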

Personally I can live with some manual labor (i.e. executing the same command over and over, feeding it a new parameter every time) as long as I know that I can count on the command. It gives me a sense of control. But my friend chose to trust his version control system, and to believe that his disks wouldn’t fail, that his backups wouldn’t be magnetically erased, that the big GNU in the sky (or whatever $DEITY he believes in) would have his back, and that I am competent enough to write a bash script that would work according to specification.

Ballsy, stupid but ballsy ;)

So off I went to the Internet, searching for the dark incantation I would need to have the command executed repeatedly over all his designated files.

The answer came in the form

$ for i in `command`
> do
> command $i
> done

After quick experimentation I concluded that “ls *.txt” would indeed only display the files ending with “.txt” in the given directory. Neat! All the pieces are in place, now to put it all together:

$ for f in `ls *.txt`
> do
> sed -i 's/foo/FOO/gi' $f
> done

which, when collapsed into a single row amounts to:

$ for f in `ls *.txt`; do sed -i 's/foo/FOO/gi' $f; done
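One caveat with the loop as written: the output of `ls` is split on whitespace, so a file name containing a space is handed to sed in pieces. A variant that lets the shell expand the glob itself, and quotes the variable, avoids that. This is only a sketch, assuming GNU sed (for `-i` without a backup suffix and for the `i` flag); the scratch files are invented for the demonstration.

```shell
# Scratch directory with a few sample files, one of them with a space in its name.
tmpdir=$(mktemp -d)
printf 'foo\n'     > "$tmpdir/plain.txt"
printf 'Foo bar\n' > "$tmpdir/with space.txt"
printf 'foo\n'     > "$tmpdir/notes.md"

# The shell expands the glob; each match arrives in $f as one intact name.
for f in "$tmpdir"/*.txt; do
    [ -e "$f" ] || continue          # skip the literal pattern if nothing matched
    sed -i 's/foo/FOO/gi' "$f"       # quoting keeps "with space.txt" in one piece
done
```

Only the two .txt files are rewritten; notes.md is left untouched because the glob never matches it.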

Or, you could just manually open up all the files in a text editor, and for each file hit search and replace… The one thing I feel right now is that sed probably has a built-in option for modifying case, which would make it a bit more flexible when searching for variable terms that share a common root. As an example, what if you wanted to capitalize all occurrences of president, presidents and presidential? There simply must be such a command in sed, so once I find it I will update this post.

UPDATE:

The solution did indeed exist and was, of course, simple.

$ sed -i 's/\(foo\)/\U\1/gi' FILE

In order to do post-processing on the output, the replacement can no longer be a static string (indeed, that would not work, since the whole point was to be able to match words with a common root, i.e. several different but similar words), so it needs to be replaced by a back-reference to whatever was matched. That means we now have to group the term we are searching for. The \U in front of the back-reference, a GNU sed extension, then converts the matched text to upper case.

So the final incantation would look like this:

$ for f in `ls *.txt`; do sed -i 's/\(foo\)/\U\1/gi' $f; done
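To tie this back to the president example from earlier, here is a rough sketch of how the grouped pattern could be widened to catch words sharing a root. It assumes GNU sed (for `\U`, the `i` flag and `-i` without a suffix), and the `[a-z]*` tail is my own guess at what “common root” should mean; the file speech.txt is made up.

```shell
# Scratch file containing three words that share the root "president".
tmpdir=$(mktemp -d)
printf 'The president met two presidents at a presidential dinner.\n' > "$tmpdir/speech.txt"

# Group the root plus any trailing letters, then upper-case whatever was matched.
sed -i 's/\(president[a-z]*\)/\U\1/gi' "$tmpdir/speech.txt"
cat "$tmpdir/speech.txt"
```

The rest of the sentence is left alone, since only the grouped match is passed through \U.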
