I’m still very, very happy with my recent switch from Heartsome to OmegaT, but one thing I haven’t yet mastered in OmegaT is the use of tags in formatted documents.
Tags in OmegaT aren’t a big issue if your translation work consists of documents that don’t have much formatting, or in which the formatting isn’t very important. However, most of my clients want their translations to look exactly the same in English as they do in French, so the formatting tags are critically important. If you’re used to working in or at least looking at a markup language like HTML, the tags that OmegaT inserts don’t look that odd. When you’re translating in OmegaT, you have to reproduce these tags in the target segment in order for it to be formatted like the source segment, and OmegaT has a handy “tag validation” feature that lets you know if you’ve missed any tags.
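To make the idea of tag validation concrete, here is a minimal Python sketch of that kind of check. It is not OmegaT's actual implementation: the regular expression for tags like `<f0>` and `</f0>`, and the helper names `find_tags` and `missing_tags`, are my own assumptions for illustration.

```python
import re
from collections import Counter

# Assumed pattern for OmegaT-style placeholder tags such as <f0>, </f0> or <br1/>.
# This is a guess made for the sketch, not taken from OmegaT's source.
TAG_PATTERN = re.compile(r"</?[a-zA-Z]+\d+/?>")


def find_tags(segment):
    """Return every tag found in the segment, in order of appearance."""
    return TAG_PATTERN.findall(segment)


def missing_tags(source, target):
    """Return the tags that occur in the source but not (or not often enough) in the target."""
    remaining = Counter(find_tags(source)) - Counter(find_tags(target))
    return sorted(remaining.elements())


source = "<f0>1.3.1) For ongoing</f0><f1>needs</f1>"
target = "<f0>1.3.1) For ongoing</f0> needs"
print(missing_tags(source, target))
```

A check like this only tells you which tags are missing; it says nothing about whether the tags you did type are in the right place or in the right order.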
As OmegaT’s user manual says, “Tags are usually not taken into account when considering string similarity for matching purposes.” As a result, I get a lot of fuzzy matches where the text in two segments is identical or very similar, but the tags are completely different. For example, the segment “1.3.1) For ongoing needs” would come up as a fuzzy match for “<f0>1.3.1) For ongoing</f0><f1>needs</f1>”. So far, I haven’t found a way to deal with this other than either a) going character by character through the suggested match and inserting the tags into the text or b) inserting the tagged source segment into the target segment field and then copying and pasting in the matching text from the fuzzy match box. If anyone has suggestions on how to deal with this, I would love to hear them!
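To illustrate why those two segments end up as a close match, here is a rough Python sketch that compares them with and without the tags. It uses `difflib.SequenceMatcher` from the standard library as a stand-in similarity measure; OmegaT's real matching algorithm is different, and the tag pattern and the `strip_tags` and `similarity` helpers are assumptions of mine.

```python
import re
from difflib import SequenceMatcher

# Assumed pattern for OmegaT-style placeholder tags such as <f0>, </f0> or <br1/>.
TAG_PATTERN = re.compile(r"</?[a-zA-Z]+\d+/?>")


def strip_tags(segment):
    """Remove every tag from the segment, leaving only the translatable text."""
    return TAG_PATTERN.sub("", segment)


def similarity(a, b, ignore_tags=True):
    """Return a 0..1 similarity ratio, optionally ignoring tags first.

    SequenceMatcher is only a stand-in here, not OmegaT's scoring method.
    """
    if ignore_tags:
        a, b = strip_tags(a), strip_tags(b)
    return SequenceMatcher(None, a, b).ratio()


plain = "1.3.1) For ongoing needs"
tagged = "<f0>1.3.1) For ongoing</f0><f1>needs</f1>"

print(similarity(plain, tagged))                     # tags ignored
print(similarity(plain, tagged, ignore_tags=False))  # tags counted as text
```

With the tags ignored, the two segments differ by a single space, so the score is very high. With the tags counted as ordinary text, the score drops sharply, which is roughly the gap between what the match percentage suggests and how much retyping the match really needs.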
Another option that OmegaT’s manual suggests, and one that I think might actually save time on some documents, is to remove all or most of the formatting from the source document in order to minimize the number of tags that appear in the source segments. The translator can then go back and reformat the document after the translation is finished. I also find it helpful to keep the source text open in OpenOffice.org so that I can refer to it when I can’t easily tell what a tag is for.