How to create a translation memory from translated documents
A CAT tool is only useful as long as you have a translation memory (TM) on hand. There are basically two ways of creating one. You can either build it up as you translate new documents, hoping to reuse it in later projects, or you can create it with special software from previously translated files. Note that the source and target documents have to be of more or less the same structure so that the alignment software can digest them without problems.
Here we are going to talk about the second option, i.e. how to build a translation memory from a pair of previously translated documents generously provided by your customer. As a matter of fact, practically every major CAT tool offers this capability. There are also third-party solutions, including ABBYY Aligner, which comes at an astonishing $125, and tools from some smaller providers.
In this article we will explain how to set up and run an alignment project quickly, without going into too much detail. As we mostly use SDL Trados in our work, we’ll focus on the product that SDL offers for the purpose. The tool is called WinAlign and is part of the SDL Studio 2011 package. For Trados 7 it comes as a separate tool, though essentially there is no difference between the two. (We tried other alignment solutions, but this one proved to be the best in terms of compatibility with Studio.)
The tool can process many file types, including the most popular MS Office formats and even binary files such as *.exe, *.dll and *.ocx.
So, go ahead and start the program. The first thing you see is a plain gray window. Don’t be put off by it; just create a new project.
What you see next is the Alignment Project Settings window.
Here you can give your project a meaningful name. It is worth doing, because in our experience it can take some time to get a proper TM, so you may well end up working on the project for several days.
Then select your source and target languages. It is also important to select the file type. Even though WinAlign works with practically every file type, RTF has proved to be the most stable one for us, causing practically no problems during alignment.
As for the source and target segmentation tabs in this window, we prefer to keep the settings as they are. This way the files are segmented as closely as possible to the way SDL Trados segments files when preparing them for translation. This is quite an important point if you want Trados to pick up translated segments from the aligned TM properly.
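To illustrate why matching segmentation matters, here is a minimal sketch of a sentence splitter in Python. It is not the rule set that WinAlign or Trados actually uses; the function name `split_sentences` and the abbreviation list are assumptions made for this example. It only shows that the same text produces different segments under different rules, and segments that differ will not match each other in a TM.

```python
import re

def split_sentences(text, abbreviations=()):
    """Naive splitter (illustration only): break after . ! or ? followed by
    whitespace, unless the segment so far ends in a known abbreviation."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    segments = []
    for part in parts:
        # Re-join a piece that was split off after an abbreviation.
        if segments and segments[-1].split()[-1] in abbreviations:
            segments[-1] += " " + part
        else:
            segments.append(part)
    return segments

text = "See fig. 3 for details. Press OK."

# Rules that do not know the abbreviation "fig." split the first sentence in two.
print(split_sentences(text))

# Rules that know the abbreviation keep the first sentence whole.
print(split_sentences(text, abbreviations=("fig.",)))
```

If the aligner used the first rule set and the CAT tool the second, the sentence "See fig. 3 for details." would never be found in the TM as a whole segment.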
Once you are done with the General tab, go on to the Files tab.
Here all you need to do is add your source and target files and align the file names, so that each source file is paired with its translation.
In order to start the alignment as quickly as possible, you can skip the Alignment, Structure Recognition and Interface tabs and go straight to the Export tab. There you can specify the TM export format: either Translator’s Workbench export format or Translation Memory eXchange (TMX). TMX is the most universal format and is accepted by practically all translation memory tools, so we recommend using it for your export. The next thing to do is enter the name of whoever did the alignment. This can be quite useful if the TM you are going to create in the CAT tool allows multiple translations, i.e. one source segment with several translations from which the right one has to be selected (not an option we favor, but some may prefer it). This way you will be able to see who produced each translation.
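For readers who have never looked inside a TMX file, the following Python sketch builds a minimal one from a list of segment pairs. It is only an illustration of the format, not a reproduction of WinAlign’s export, which contains more attributes. The function name `build_tmx`, the tool name in the header and the creator name "aligner" are made up for this example; the `creationid` attribute is the TMX field that records who created a translation unit.

```python
import xml.etree.ElementTree as ET

# ElementTree writes this namespaced key as the standard xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def build_tmx(pairs, src_lang, tgt_lang, creator):
    """Build a minimal TMX 1.4 tree from (source, target) text pairs."""
    tmx = ET.Element("tmx", version="1.4")
    ET.SubElement(tmx, "header", {
        "creationtool": "alignment-sketch",   # hypothetical tool name
        "creationtoolversion": "0.1",
        "segtype": "sentence",
        "o-tmf": "unknown",
        "adminlang": "en-US",
        "srclang": src_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for source, target in pairs:
        # One translation unit per aligned pair, tagged with its creator.
        tu = ET.SubElement(body, "tu", creationid=creator)
        for lang, text in ((src_lang, source), (tgt_lang, target)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.ElementTree(tmx)

tree = build_tmx([("Press OK.", "Drücken Sie OK.")], "en-US", "de-DE", "aligner")
print(ET.tostring(tree.getroot(), encoding="unicode"))
```

To save the result to disk you could call `tree.write("aligned.tmx", encoding="utf-8", xml_declaration=True)`; the file name is again just an example.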
So, once you are done with the process settings you’ll see something like this.
Go ahead and click Alignment. Unless you have several files in the project, you only need to align a single file pair.
Once the alignment is complete, open the result by double-clicking the aligned file pair. You will see the source file on the left and the target file on the right, with matching segments connected by lines. Some segments are paired correctly, and these will give you a 100% match when the CAT tool comes across the same segment in the TM. However, some source segments turn out to be connected to two segments in the translation.
This happens because the source and target texts are segmented differently. However, we would not rush to change the segmentation settings, as this may have unpleasant consequences in the end unless you are a real professional and know what you’re doing. In this case the best solution seems to be to disconnect the segments by right-clicking them and then connect them manually. Sometimes segments need to be joined or split to ensure proper alignment. We recommend looking through the entire document for such segments. This is certainly not the best approach for an urgent project, but for a long-term one with lots of repetitions it may well be worth the effort.
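The joining step and the review pass can be pictured with a small Python sketch. It has nothing to do with WinAlign’s internals: `join_segments` and `flag_suspicious` are hypothetical helpers, and the length-ratio thresholds are arbitrary assumptions. The idea is simply that a pair in which one side is far longer than the other deserves a second look.

```python
def join_segments(segments, start, count=2):
    """Merge `count` consecutive segments starting at index `start` into one."""
    merged = " ".join(segments[start:start + count])
    return segments[:start] + [merged] + segments[start + count:]

def flag_suspicious(pairs, low=0.5, high=2.0):
    """Return indexes of pairs whose target/source length ratio falls
    outside the range [low, high] (thresholds are arbitrary assumptions)."""
    flagged = []
    for i, (source, target) in enumerate(pairs):
        ratio = len(target) / max(len(source), 1)
        if not low <= ratio <= high:
            flagged.append(i)
    return flagged

source = ["Open the file and save it.", "Close the program."]
target = ["Öffnen Sie die Datei.", "Speichern Sie sie.", "Schließen Sie das Programm."]

# One source sentence was translated as two: join the two target segments.
target = join_segments(target, 0)
pairs = list(zip(source, target))
print(pairs)

# A pair with very unequal lengths is reported for manual review.
print(flag_suspicious([("Open the file and save it.", "OK")]))
```

A length check like this catches only gross misalignments; it is no substitute for reading through the pairs.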
After this step all that is left is to export your project into a TMX file via File → Export Project or Export File Pair. The TMX file can later be imported into your TM in SDL Trados.
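Before importing, it can be worth taking a quick look at the exported file. The Python sketch below reads translation units from TMX content and lists source segments that have more than one translation. It assumes that languages are marked with the `xml:lang` attribute and that the language codes match exactly, including case, so check what your export actually contains. The helper names and the trimmed sample data are invented for this example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def read_units(tmx_root, src_lang, tgt_lang):
    """Yield (source, target) text pairs from a parsed TMX root element."""
    for tu in tmx_root.iter("tu"):
        texts = {}
        for tuv in tu.findall("tuv"):
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() also collects text nested inside inline tags.
                texts[tuv.get(XML_LANG)] = "".join(seg.itertext())
        if src_lang in texts and tgt_lang in texts:
            yield texts[src_lang], texts[tgt_lang]

def conflicting_sources(units):
    """Map each source segment that has several distinct translations to them."""
    by_source = defaultdict(set)
    for source, target in units:
        by_source[source].add(target)
    return {s: sorted(t) for s, t in by_source.items() if len(t) > 1}

# Trimmed sample for illustration; a real export has a full header.
SAMPLE = """<tmx version="1.4"><header/><body>
<tu><tuv xml:lang="en-US"><seg>Press OK.</seg></tuv>
    <tuv xml:lang="de-DE"><seg>Drücken Sie OK.</seg></tuv></tu>
<tu><tuv xml:lang="en-US"><seg>Press OK.</seg></tuv>
    <tuv xml:lang="de-DE"><seg>Klicken Sie auf OK.</seg></tuv></tu>
</body></tmx>"""

root = ET.fromstring(SAMPLE)
units = list(read_units(root, "en-US", "de-DE"))
print(len(units), "translation units")
print(conflicting_sources(units))
```

For a real file you would replace `ET.fromstring(SAMPLE)` with `ET.parse("aligned.tmx").getroot()`, where the file name stands for whatever you exported.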
We hope you find this information useful. We would be grateful for any comments.