This page deals with Linux utilities which are particularly useful during the translation process itself, irrespective of whether they were originally intended for translators.

Text searching, indexing and retrieval utilities fall broadly into two categories: those which search a body of data, such as a file, a number of files, or the files contained in a designated directory or group of directories; and those which first create an index of the data in these files, and then allow searches of the index.

Text search utilities

These utilities search your file system or a part of it for a string (a sequence of characters). Searches which are not limited to just a few files may take a long time to process.

GNOME Search Tool (gnome-search-tool), Find Files (kfind)

The GNOME Search Tool (gnome-search-tool) is a front end for grep, or a Linux equivalent to the Windows "Find" function, depending upon your point of view. It provides you with a convenient dialog box in which to enter the search expression, select the directories to be searched, etc.

A successful search (for a filename, character string or expression, for example) returns a list of files. Double-click on one of these, and it will be opened automatically by the appropriate application.

Find Files (kfind) is the KDE desktop's equivalent utility.

The GNOME Search Tool has built-in support for regular expression searches, whereas KFind offers this function only if KRegExpEditor is installed. The GNOME Search Tool will also search binary files (such as MS PowerPoint files) by default; KFind does so only if the relevant option is checked. Strangely, however, KFind will search MS Word files even if the binary file option is not checked. Both utilities can handle Unicode characters in human-readable (e.g. OpenOffice.org) files, but not in binary files such as MS Word.

find, grep, xargs

find, grep and xargs are classic command-line utilities used in UNIX-type systems. They are likely to be installed as standard with any major Linux distribution. find is used to search for files, grep to search within files for lines matching a certain pattern (for example, lines containing a specified character string). xargs is used, in non-geek language, to glue these two commands together: it takes the list of files produced by find and hands it to grep.

If none of this has daunted you so far, open a terminal window and type in man find or man grep, and read on. Alternatively, if all this command-line stuff has images of penguins and geeks floating before your eyes, open a terminal window (from the KDE desktop, click on the application button and select System > Terminal), then type in the following:

find /home/martinluther/translations -type f -print0 | xargs -0 grep 'that term'

substituting the main directory in which your translations are located for /home/martinluther/translations and, for 'that term', an obscure item of terminology which you know occurs in a file within or below this directory (keep the quotes). The -print0 and -0 options make sure that file names containing spaces are passed from find to grep intact. If a list of files, possibly accompanied by bits of your translations, appears in the terminal window, the above utilities are installed on your system and you've just used them. If this still all seems too geeky for you, use KFind or GNOME Search Tool (see above).
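If you find yourself typing this kind of command often, it can be wrapped in a small shell function. The following is only a sketch: the name find_term is made up for this example, and it assumes the GNU versions of find, xargs and grep that ship with most Linux distributions. It lists just the names of the matching files, ignores upper and lower case, and treats the term as a literal string rather than as a pattern.

```shell
# find_term DIRECTORY 'that term'
# Hypothetical helper: list the files under DIRECTORY that contain the term.
find_term() {
    dir=$1
    term=$2
    # -print0 / -0 : pass file names separated by NUL bytes, so that
    #                names containing spaces survive (GNU find and xargs)
    # -F           : treat the term as a literal string, not a pattern
    # -i           : ignore upper/lower case
    # -l           : print only the names of files that match
    find "$dir" -type f -print0 | xargs -0 grep -F -i -l -- "$term"
}
```

Once the function has been pasted into a terminal window, find_term /home/martinluther/translations 'that term' prints one file name per line.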

Text indexing and search utilities

The concept of text indexing and retrieval is familiar to translators from search engines such as Google. Although the functionality is similar to that offered by GNOME Search Tool and Find Files (see above), the results are obtained considerably more quickly. To make this possible, the contents of the file system or sub-system must first be indexed; this can, however, be done at a time when the processor power is not needed for anything else, for example overnight.
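The two-step idea (build an index first, then search only the index) can be illustrated with a toy sketch. This is not how any of the programs listed below work internally. The function names build_index and search_index are made up for this example, and the sketch assumes plain-text files, words separated by spaces, and an index file stored outside the directory being indexed.

```shell
# build_index DIRECTORY INDEXFILE
# Slow step, run once (e.g. overnight): write one "word<TAB>file" line
# for every distinct word found in every file under DIRECTORY.
build_index() {
    find "$1" -type f -exec awk '
        {
            for (i = 1; i <= NF; i++) {
                word = tolower($i)
                gsub(/[.,;:!?()"]/, "", word)   # strip common punctuation
                if (word != "") print word "\t" FILENAME
            }
        }' {} + | sort -u > "$2"
}

# search_index WORD INDEXFILE
# Fast step: print the files containing WORD by reading only the index,
# without opening the original files again.
search_index() {
    awk -F '\t' -v w="$1" '$1 == w { print $2 }' "$2"
}
```

For example, build_index /home/martinluther/translations /tmp/translations.idx creates the index, after which search_index wrench /tmp/translations.idx lists the files containing that word. Note that a search word must be given in lower case, because the index stores every word in lower case.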

Beagle

Beagle is a relatively new arrival, but already has the reputation of being "Google on the desktop". I don't have a home page URL for Beagle at present.

Namazu

Besides being Japanese for "catfish", Namazu is a command-line indexing and search engine.

Glimpse

A command-line text indexing and retrieval program. It can search only for single words, not multiple-word character strings, but searches can be narrowed using Boolean operators.

Glimpse is a commercial application. At the time of writing, a server licence for 1 to 50 employees costs US$250. The installation instructions are a little geeky, but installation isn't as hard as it sounds.

Global search and replace utilities

KDEFileReplace

KDE utility for searching and replacing strings across multiple files.

SandR

Java utility for searching and replacing strings across multiple files.
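For comparison, the same kind of operation can be sketched on the command line with grep and sed. This is an illustration only, not what either of the above utilities does internally. The name replace_in_files is made up for this example; it assumes GNU grep, xargs and sed, and it assumes that the old and new strings are plain words containing no characters that are special to sed (such as |, &, . or a backslash). A backup copy of every changed file is kept with the extension .bak.

```shell
# replace_in_files DIRECTORY OLD NEW
# Hypothetical helper: replace OLD with NEW in every file under
# DIRECTORY that contains OLD, keeping the original as FILE.bak.
replace_in_files() {
    # grep  -r : search DIRECTORY recursively
    #       -l : list only the names of matching files
    #       -Z : end each name with a NUL byte (safe for spaces)
    #       -F : treat OLD as a literal string
    # xargs -0 : read the NUL-separated names
    #       -r : do nothing at all if no file matched
    # sed -i.bak : edit each file in place, saving a .bak backup
    grep -r -l -Z -F -- "$2" "$1" | xargs -0 -r sed -i.bak "s|$2|$3|g"
}
```

For example, replace_in_files /home/martinluther/translations colour color changes every occurrence in every file below that directory. Try it on a copy of your files first.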