OmegaT: a free and very useful TenT
Translation environment tools (TenTs), also referred to as CAT tools or translation memory tools, are the subject of numerous passionate discussions among translators. Some people prefer a standalone tool; others prefer a tool that works from within a program like Microsoft Office. Some users are adamant that their tool has the best matching algorithm out there, while others criticize the competition’s pricing or support. You get the picture.
Over the past few years I’ve primarily used two TenTs with which I’ve been very happy: Wordfast and Heartsome. Wordfast has the advantage of being quite similar in look and feel to, and quite compatible with, the market leader SDL Trados, and Wordfast’s developer, Yves Champollion, is a very smart and helpful guy who provides excellent support for his product. Heartsome is a powerful tool at an attractive price, and it’s, in my experience, the best of the commercial TenTs if you don’t run Mac or Windows, or if you like to use OpenOffice.org file formats.
Lately I’ve been feeling like my quiver of TenTs needed an addition; I don’t have any complaints with Wordfast, but in order to run on my Linux computer system, it requires not only Microsoft Word, which I only use for Wordfast-related work, but also a program like CrossOver Linux that allows Windows software to run on Linux without a Windows license. So, this adds a few layers of upgrades for software that I really only use when I’m running Wordfast. Next, (and I’m not blaming this on Heartsome, it may be something to do with my computer) I have had major frustrations with Heartsome slowdowns, when it has taken 15-20 minutes to convert a 5,000 word document to XLIFF format (required in order to translate) and four to six seconds to add a segment to the TM and then progress to the next segment, which adds up over the course of a 500 segment job.
Enter OmegaT, which I’ve tested out before but never really delved into using for my day to day work. Portuguese translator Thelma Sabim gave an impressive presentation (including a live demo) on OmegaT at the 2006 ATA conference, and ever since then I’ve been thinking that I should give OmegaT more than a passing trial. This week I’ve been using it on some non-rush work, and I’m really, really enjoying it. OmegaT runs on any computer with a Java 2 Platform (i.e. Linux, Mac, Windows 98 or higher), and the OmegaT team (all volunteers) has helpfully provided a download of the software that includes a Java 2 Runtime Environment so that you don’t have to figure out if you have the correct Java version or not.
Once you’ve downloaded the software, you need your source document to be in one of the formats that OmegaT supports, which include OpenOffice.org files (note that OO.o now includes a wizard that can batch process the conversion of MS Office files into OO.o files; go to File>Wizards>Document Converter), Open XML files, text files, HTML files, Open Document Format files, INI files, Java resource bundles, XLIFF files, .po files and DocBook files. Then, OmegaT’s quick start guide (which promises that you can “Start using OmegaT in 5 minutes!” and I would agree!) shepherds you through the easily mastered basics of translating in the software’s interface. I am finding many of OmegaT’s features to be just what I need in a TenT. For a long time, OmegaT may have been best known for its “love it or hate it” feature of segmenting at the paragraph level. Now, you can select paragraph or sentence-level segmentation. Also, under Options>Editing Behaviour, you can select what you want to see in an untranslated segment: the source text, nothing, or the best fuzzy match above a percentage that you define. An additional feature that I like is OmegaT’s use of folders that it creates within your project. For example, it creates a “target” folder into which it places your translated documents once they are created. So, you can give the target document the same name as the source without having to worry about overwriting anything or figuring out which document is which.
I also have to mention that this software is *fast,* and not just because I’ve recently had some speed issues with other tools. OmegaT’s processes (segmenting a source file, compiling target files, pulling up fuzzy matches) are so fast that you almost don’t realize they’re happening; in one case I actually clicked “Create Translated Documents” again because I didn’t realize that OmegaT had taken only a few seconds to compile the translations in the 15-file project I was working on. Other translators with different needs may have other opinions about OmegaT, since my use of TM is largely for my own productivity rather than because my clients request it, so issues such as exchanging TMs with other translators (which OmegaT can do) are not very important to me. If you’re looking for a new or additional TenT for your office, I highly recommend OmegaT; even if you just test it out, it’s free, it’s easy, and it’s fast!
I’ve been using OmegaT for about a year and a half, and I totally dig it.
I agree that it is easy to use, fast, and, of course, you can’t beat the price!
Of course, it is the best option for use on Linux.
It is an included package in Linguas OS, Linux for Translators, http://www.linguasos.org
Thanks for this article – I think I am finally ready to take the plunge and invest the time into a TenT, sounds like OmegaT could be the one for me…I will let you know how it handles Arabic script 🙂
Could you recommend a template for WordPress 2.6.2 that looks like the one on your thoughtsontranslation.com?
Thanks in advance!
Thank you!!! 🙂
I have always loved Wordfast, except when it comes to how error-prone it is with PowerPoint slides. It has never done it well for me. I dreaded being sent slides and just got a set of several of them today with lots of text boxes etc. so I thought it was time to find a solution! And who do I come across in my Googling but my favourite blog 😀 Pity I only subscribed shortly after this article was written – I should look through your previous entries when I get the chance!!
Anyway I’m also on Linux, and a big fan of OpenOffice although I don’t (didn’t) use it for translation work – I use VirtualBox to load Windows simply because of Wordfast! After seeing that the first of several slides has translated without losing any formatting and how bloody easy it was I am very impressed with OmegaT and am seriously considering making the switch over for my word processing documents too!
Thanks for the clear explanation – I am now an OmegaT fan!! 😀 And your reference to the Wizard in OO.o was a big time saver!! (This is a set of 11 slideshows that I would have converted one by one otherwise). You are officially my heroine of the day! 😛
Wow, heroine of the day… I’m flattered! And for that, you can even spell favorite with an “ou.” I really, really like OmegaT. The only thing that I have chronic trouble with is tags. Most of the time I just strip them out and then put the formatting back in. Otherwise I love it, and as far as presentations and spreadsheets, it does the best of any tool I’ve used- for free!
You can go to properties and click on “remove tags”. It does not delete them, it just hides them. I have no idea how OmegaT puts tags in place in the target document, but it has never made a mess, at least not in Microsoft Word documents, which I use the most. I have used OmegaT since 2013 and I have never had problems with tags.
Hi, I’m new to using OmegaT. So far, I really like it. There is one issue I have though. If a client sends me a Trados-based TTX file, how can I use this with OmegaT? OmegaT handles TMX files. Is there a good way to convert this TTX file with either OmegaT, OpenOffice, or some other Open Source tool without losing anything? I searched the internet for any guide or tutorial on this subject, to no avail. I would like to stick with only FOSS. Can you please help me? Thanks.
Ela, you can use Rainbow from the Okapi Framework (http://okapi.sourceforge.net/) to convert TTX to TMX. Launch Rainbow, then go to Utilities > Trados Utilities > TTX To TMX Conversion…
It may well be worth investigating Anaphraseus, an add-in to Open Office. By all accounts it seems to port the wordfast TMs and Glossaries to Linux: I’m still experimenting, and manipulating the files, though.
I recently discovered OmegaT and tried it on a large project of 5,000+ segments, with PERFECT results. Despite having to go through the step of converting MS Office files to OpenOffice format, the process was agile and bug-free. Very pleased, and concur with Corinne’s observations (this CAT tool is fast, stable, and easy…it’s basically the epitome of the K.I.S.S. philosophy). My “concerns” are along the lines of Ela’s–being able to receive a client-provided Trados TM and use it with OmegaT, and also being able to provide a Trados-compatible TM from OmegaT (if requested). I’ve been investigating these issues, but am not clear on the solutions. Any suggestions? Thanks.
I’m evaluating OmegaT and maybe I’m missing something, but I don’t get the way it handles TM files.
Instead of allowing you to build one central TM for a given topic or customer, it makes a new file for each project. According to the user manual, if you want to use that file again, you have to manually copy it to your next project (etc. etc. etc.).
Is it really not possible to have one TM for a given topic or customer and just build that without the endless copying and renaming of previous TM files?
I’m also wondering the same thing as Bowjest – if a client has provided a TM that they want you to use – and add to – is there no simple way of using the existing TM without having to start a new one?
Becci,
I’ve found in the interim that there’s a merge tool on the OmegaT site that you can use to merge two TM files. I’ve tried it and it seems to work, so hopefully that is the easiest solution.
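For anyone curious what merging two TM files involves, here is a rough Python sketch. It is not the merge tool from the OmegaT site, just an illustration using the standard library. The function names are made up, and it assumes plain TMX files whose translation units are `<tu>` elements containing `<seg>` text. Note that the output is rewritten without any DOCTYPE line the input may have had.

```python
import xml.etree.ElementTree as ET


def tu_key(tu):
    """Identify a translation unit by the text of all its <seg> elements."""
    return tuple("".join(seg.itertext()).strip() for seg in tu.iter("seg"))


def merge_tmx(primary_path, secondary_path, out_path):
    """Append to the primary TMX every unit from the secondary TMX that it lacks.

    Hypothetical helper: keeps the primary file's header and returns the
    number of units that were added.
    """
    primary = ET.parse(primary_path)
    body = primary.getroot().find("body")
    seen = {tu_key(tu) for tu in body.iter("tu")}

    added = 0
    for tu in ET.parse(secondary_path).getroot().iter("tu"):
        key = tu_key(tu)
        if key not in seen:  # skip exact duplicates (same source and target)
            body.append(tu)
            seen.add(key)
            added += 1

    primary.write(out_path, encoding="utf-8", xml_declaration=True)
    return added
```

Units count as duplicates only when source and target text both match exactly, so two different translations of the same sentence are both kept.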
Hi Bowjest,
Thanks for your reply!
So just to clarify, even if a client sends a TM they want you to add to, you still have to start a new project with a new TM, and then merge it afterwards?
I’ve seen the additional merger but, like yourself, thought there would be the option to use an existing TM as the main project TM.
Hi, Becci,
Good question! I’m not entirely sure.
I read the following link yesterday:
http://www.proz.com/forum/omegat_support/158125-how_to_update_previous_translation_memory.html
and the bit near the top from Samuel Murray seems to indicate that if you take an existing TM and rename it project_save.tmx and put it in the omegat directory for your given project, OmegaT will write to it. To quote from Samuel Murray:
“4. The TM that OmegaT reads from *and* writes to, is called project_save.tmx, and it is in your project’s /omegat/ folder.”
I’ve not tried this, but it seems logical. I think I’ll experiment a bit with it later today.
Good luck and let us know what results you get.
Hi Bowjest
I did what you mentioned above and it did indeed work!
As long as you rename it, OmegaT will read from this filename – the translator can simply then rename the TMX file afterwards.
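The rename-and-drop-in step can also be scripted. The Python sketch below is only an illustration of what is described above: the function name is invented, and the only things taken from the thread are the file name project_save.tmx and the project's omegat folder. As a precaution (my assumption, not something OmegaT requires), it copies rather than moves the client file and backs up any project_save.tmx already in place.

```python
import shutil
from pathlib import Path


def seed_project_tm(client_tm, project_dir):
    """Copy a client TM into <project>/omegat/project_save.tmx.

    Hypothetical helper. An existing project_save.tmx is first copied to
    project_save.tmx.bak so nothing is lost. Best run while the project
    is not open in OmegaT.
    """
    omegat_dir = Path(project_dir) / "omegat"
    omegat_dir.mkdir(parents=True, exist_ok=True)
    target = omegat_dir / "project_save.tmx"
    if target.exists():
        shutil.copy2(target, target.with_name(target.name + ".bak"))
    shutil.copy2(client_tm, target)  # the client's original file stays untouched
    return target
```

For example, `seed_project_tm("client.tmx", "my_project")` would leave the client's TM at my_project/omegat/project_save.tmx.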
Hi, Becci,
I’m glad that worked for you. In the interim I’ve had some feedback from a long-established OmegaT user and he advises against doing the above (sorry, it’s what I’ve found works, but he makes good points below):
OmegaT never deletes segments from project_save.tmx. If you change the translation of a segment (e.g. in a second draft), the segment is modified. But if you remove the segment from a project altogether after that segment has been translated, i.e. by deleting the source text file containing it, or by making changes to the source text file (as a result of which the segment concerned ceases to exist in exactly that form), OmegaT retains these segments in project_save.tmx. They enter a form of limbo, because they can no longer be modified within OmegaT (because they are no longer presented to the translator for translation/modification). However, OmegaT does not delete them because they can be useful. They turn up in Find results, for instance, and are marked as “Orphan segments”.
His recommendation is to create a new project for every doc you translate. Then, after you export it and are happy with it, take (or copy) the level1-tmx file and put it into a master folder of TMs that you can use in future. The added advantage to this is that the level1 file doesn’t capture all the inline formatting nonsense that you sometimes have to deal with from converted Word documents (all that stuff).
I’ve started doing what he suggested and it works great!
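If you do this after every job, the copying step is easy to automate. Here is a small Python sketch. The function name and the master folder are hypothetical, and it assumes the exported file sits in the project's top folder with "level1" in its name and a .tmx extension, so check what your OmegaT version actually writes.

```python
import shutil
from pathlib import Path


def archive_level1_tm(project_dir, master_dir):
    """Copy every *level1*.tmx in the project's top folder to master_dir.

    Hypothetical helper. Returns the list of copied paths. A file with the
    same name in master_dir is overwritten, so give each project its own name.
    """
    master = Path(master_dir)
    master.mkdir(parents=True, exist_ok=True)
    copied = []
    for tm in sorted(Path(project_dir).glob("*level1*.tmx")):
        dest = master / tm.name
        shutil.copy2(tm, dest)  # copy, so the project keeps its own file
        copied.append(dest)
    return copied
```

Because the file is copied rather than moved, the project folder stays complete if you ever need to reopen it.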
I tried it. What a mess! It’s even unable to expand and shrink segments…
All software of that type (Trados, DejaVu, Wordfast, etc.) have a chaotic, completely ineffective and buggy UI.
OmegaT has just one advantage over the others – it’s free whereas the others are awfully expensive.
Translators should refuse to work with such badly designed “tools” based on a wrong assumption, i.e. that a text is a concatenation of segments.
We’re living in the world of CHT – Computer Hindered Translation.
Chris,
I’m sorry to hear you’ve not found OmegaT useful.
I’m not sure what you mean by “expand and shrink segments”, but you have the option per project to display the given text as sentence-level segmentation or paragraph-level. Perhaps that’s not what you mean.
I find it a very fluid, useful tool. It is option rich (IMHO), allows me to deal with long texts in a number of formats, and allows me to build a repository of hundreds of thousands of TMs and glossary files.
Give us a bit of detail on what you’ve found lacking and it can be fed back to the development team.
Thanks.
Chris,
I couldn’t disagree with you more. Fair enough, you have to spend time reading the manual and looking through forums when you get a bit stuck now and again (it must be said that the OmegaT Yahoo group is extremely active and helpful), but that’s the case for any new software. As Bowjest mentioned, you can change the segment size. And if you don’t want to see the text as a bunch of segments hindering your flow then print it out or have the document open in the background!
OmegaT’s main advantage is not that it is free in terms of cash, but that it is free in terms of being open source. This means that, unlike many of the larger CAT tools, if there is an aspect of OmegaT which you do not find works well, or if there is a feature you find is lacking, you can feed this back to the development team, as Bowjest mentioned above, and more often than not changes are made (I say this from first-hand experience). This kind of customer service is something that you will rarely experience elsewhere, at least not at such speed, and definitely makes the whole OmegaT package much more appealing.
Beyond OmegaT, there are some extremely useful CAT tools out there. The whole point of the text being split into segments is for the translation memory to be able to work at its best, saving you time and increasing your consistency, and therefore translation quality, in the long run. Any good translator will always start by reading through the text, so shouldn’t be put off by segmentation, whatever the tool.