Most commercial translation memory applications now include an aligner as part of a suite of tools. Aligners are used to produce translation memory files (e.g. in the industry-standard TMX format) from legacy translations and their corresponding source files. The resulting file can then be used within a translation memory application, which provides easy access to the source/target segment pairs in the translation memory file.
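
As a rough illustration of what an aligner's output looks like, the following Python sketch writes a list of already-matched source/target segment pairs to a minimal TMX 1.4 document. It is not taken from any of the tools listed here: the function name build_tmx, the language codes and the header values (such as the "example-aligner" tool name) are assumptions chosen for the example.

```python
import xml.etree.ElementTree as ET


def build_tmx(pairs, src_lang="en", tgt_lang="de"):
    """Return a minimal TMX 1.4 document (as a string) for the given pairs.

    `pairs` is an iterable of (source_text, target_text) tuples that are
    assumed to be correctly aligned already.
    """
    tmx = ET.Element("tmx", version="1.4")
    # Header attributes required by TMX 1.4; the values are placeholders.
    ET.SubElement(tmx, "header", {
        "creationtool": "example-aligner",  # hypothetical tool name
        "creationtoolversion": "0.1",
        "segtype": "sentence",
        "o-tmf": "plain-text",
        "adminlang": "en",
        "srclang": src_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for source, target in pairs:
        # One translation unit (<tu>) per segment pair,
        # with one variant (<tuv>) per language.
        tu = ET.SubElement(body, "tu")
        for lang, text in ((src_lang, source), (tgt_lang, target)):
            tuv = ET.SubElement(tu, "tuv", {"xml:lang": lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")


if __name__ == "__main__":
    print(build_tmx([("Good morning.", "Guten Morgen.")]))
```

A real aligner does considerably more (segmentation, format filters, inline tags), but the resulting file has essentially this shape: a header followed by one translation unit per segment pair.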

Although the advantage of aligning relevant legacy texts is obvious, a major drawback is that such tools generally do not align intelligently. Where the source and target texts have different numbers of segments (e.g. because the original translator merged two sentences into one in the translation), the resulting translation memory file is misaligned. Aligners generally provide an interface through which such misalignments can be corrected manually. This can be a time-consuming process, however, and is usually worthwhile only when the legacy translation is known to be useful, such as when the original source text has been modified and a new translation of it is required.
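
The effect of a merged sentence can be shown with a small Python sketch. It assumes the simplest possible strategy, pairing segments strictly by position, which is not necessarily how any particular aligner works. The function names naive_align and merge_source and the sample sentences are invented for the example.

```python
from itertools import zip_longest


def naive_align(source_segments, target_segments):
    """Pair segments strictly by position, padding the shorter side with ''."""
    return list(zip_longest(source_segments, target_segments, fillvalue=""))


def merge_source(source_segments, index):
    """Manual correction: join source segment `index` with the one after it."""
    merged = source_segments[index] + " " + source_segments[index + 1]
    return source_segments[:index] + [merged] + source_segments[index + 2:]


if __name__ == "__main__":
    source = ["It was late.", "We went home.", "The door was locked."]
    # The translator merged the first two sentences into one.
    target = ["Es war spät, und wir gingen nach Hause.",
              "Die Tür war verschlossen."]

    # Every pair after the merged sentence is shifted by one position.
    for pair in naive_align(source, target):
        print(pair)

    # Merging the two source segments by hand restores the alignment.
    for pair in naive_align(merge_source(source, 0), target):
        print(pair)
```

In the first pass, "We went home." is paired with the translation of "The door was locked.", and the final source segment has no partner at all. A single merge at the right place repairs every pair that follows it, which is the kind of correction an aligner's manual editing interface is for.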

Bitext2tmx

Launched in January 2006, bitext2tmx is a Java application which is able to align two plain-text files. It features a clean graphical user interface and self-explanatory functions for operations such as splitting and merging cells. Bitext2tmx is distributed under the GNU General Public License and is the work of Susana Santos, with the support of other members of the team associated with Mikel L. Forcada.

LF Aligner

LF Aligner was developed by Hungarian translator András Farkas. It is written mainly in Perl and also makes use of other open-source utilities such as hunalign and pdftotext.

LF Aligner is an interactive command-line tool. It promises "intelligent" sentence-level segmenting.

Heartsome TMX Editor

Like bitext2tmx (see above), Heartsome's modestly and perhaps confusingly named "TMX Editor" is a Java-based utility. It is a commercial application (currently priced at €88/US$136 for the personal edition). TMX Editor is capable of aligning files in the following formats: RTF, HTML, XML, plain text, JavaScript, PO, and all OpenOffice.org formats. It offers a range of functions which help to streamline the alignment process and reduce the amount of manual intervention.

Stingray Document Aligner

An alignment tool from Maxprograms, the company responsible for the Swordfish translation memory application. It is the work of Rodolfo Raya, formerly a programmer for Heartsome. Stingray can align files in a wide range of file formats. It currently costs €70, and a free, fully functional 30-day trial version is available.

GMA

A tool for "geometric mapping and alignment". Runs on Java and is licensed under the GPL. (With thanks to Patrick Hall for the tip.)

aligner.py

A simple Python script for creating a TMX file from two texts. Written by Dmitri Gabinski.

bligner.py

A simple Python script for creating a TMX file from two texts. Written by Didier Briel.

Other resources

deli.cio.us/patfm/alignment

Links to more resources on alignment. (Thanks again to Patrick Hall.)