Sunday, 3 April 2011

Open-source software for translators

by Alexandra Kleijn - originally published in Heise C't Open source

The world of translators is a Windows world: MS Word and the translation tool SDL Trados are the measure of all things. Anyone who prefers a different operating system, or does not want to bow to the dictates of SDL, has had a hard time so far. But alternative tools are now available, even for Mac and Linux users, and some of them are open source.

Hardly any professional group is as loyal to Microsoft as translators. That is not surprising when you consider that, in many cases, their clients still supply source texts in the well-known doc format. The migration to Office 2007 and the still relatively new Office 2010, and with it to XML-based, open file formats, is taking place only slowly.


With the open-source office suite OpenOffice, however, a free alternative to MS Office is available that is hardly inferior to its competitor in terms of functionality. A major advantage of the suite is that it runs equally well on Windows, Mac OS X and Linux. As stand-alone office software, OpenOffice is also a good alternative for many users.

OpenOffice is open source: the application is distributed together with its source code, and anyone who feels called to do so can change it to suit their own ideas. Open-source software may also generally be redistributed without restrictions. Although the developer of an open-source program is free to charge a license fee, open-source software is usually free of licensing costs. It is generally based on open standards, which eases data exchange between different programs.


Document formats: MS Word reigns
OpenOffice Writer, the word processor in OpenOffice, can handle the well-known doc format used in Microsoft Office 2003 and earlier versions. Problems arise, however, with MS Office documents that contain macros, many embedded images or forms. Complex formatting, such as text laid out in columns, often has to be fixed manually as well.

Things look less good for document exchange using Microsoft's new XML-based Office Open XML format (OOXML), the default file format of MS Office 2007 and 2010 (file extension .docx). The original formatting is often lost when docx files are opened in OpenOffice. The way back is (still) blocked: OpenOffice Writer cannot save documents as docx, at least not on Windows and the Mac; the Linux version of OpenOffice does offer this option. This rather limits the usefulness of OpenOffice as an MS Office replacement for translators: a docx file received from the client for translation must be returned in the old doc format or in ODF, the format available in OpenOffice.
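
To make "XML-based" concrete: a docx file is a ZIP package whose main text lives in an XML part named word/document.xml. The following Python sketch builds a deliberately stripped-down package in memory and reads the paragraph text back out. The helper names and the sample sentence are made up for this illustration, and a package this small is not a complete docx that Word would open; it only shows the general layout.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by WordprocessingML, the text part of OOXML.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Hypothetical one-paragraph document body, written for this illustration.
DOCUMENT_XML = (
    f'<w:document xmlns:w="{W_NS}"><w:body>'
    '<w:p><w:r><w:t>Hello, translator.</w:t></w:r></w:p>'
    '</w:body></w:document>'
)


def make_package():
    """Build a minimal ZIP package holding only word/document.xml.

    Assumption: a real docx carries further parts (content types,
    relationships, styles) that are left out here.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", DOCUMENT_XML)
    return buffer.getvalue()


def paragraph_texts(package_bytes):
    """Return the plain text of every paragraph in the package."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for paragraph in root.iter(f"{{{W_NS}}}p"):
        # A paragraph is split into runs; the text sits in w:t elements.
        pieces = [t.text or "" for t in paragraph.iter(f"{{{W_NS}}}t")]
        paragraphs.append("".join(pieces))
    return paragraphs


print(paragraph_texts(make_package()))
```

Formatting is stored in neighbouring elements and parts rather than in the text itself, which is one reason a program with incomplete OOXML support can read the words correctly and still lose the layout.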

 
ODF, an ISO-approved standard format for text documents, can be opened directly in Word 2007 with Service Pack 2 and in Word 2010. Older MS Office versions can be upgraded with ODF support using the Sun ODF Plug-in. Its new owner Oracle charges for it, but the plug-in can still be downloaded from Softpedia free of charge. The plug-in handles the features of the new ODF specification 1.2, whereas Microsoft Office 2007 and 2010 support only ODF 1.1. Mac users are left with nothing: Microsoft Office for Mac 2008 and 2011 do not recognize the ODF format.
  
LibreOffice saves in docx format as well
Since autumn 2010, the OpenOffice offshoot LibreOffice has been courting the attention of users. This so-called fork was created after Oracle's takeover of Sun, the lead developer of the open-source office suite. The current LibreOffice 3.3, based on the same source code as OpenOffice 3.3, can open as well as save docx files. It is plagued, however, by the same formatting problems as OpenOffice.

For Mac users, Microsoft Office for Mac 2011 has been available since last October. The new version has moved closer to the Windows version. It also finally supports Visual Basic for Applications (VBA), so that Office macros should work across platforms.

The translation tool: translation memory
For many translators, a translation memory (TM) system is indispensable. The market for TM environments has many providers. In the last few years the top dog SDL, maker of the TM system Trados, has had a good deal of competition breathing down its neck; in Germany, for example, from Across and MemoQ. All of these tools are proprietary software, however. Still, cross-platform TM tools such as Wordfast Pro and Swordfish have broken Microsoft's hegemony somewhat: they are written in Java and also run on Mac OS X and Linux.
 

A few years ago every vendor was cooking its own little dish, with all the resulting compatibility issues. The trend is now increasingly toward open standards: XLIFF (XML Localization Interchange File Format) for document exchange and TMX (Translation Memory eXchange) for translation memories. Both are based on XML. The eXtensible Markup Language separates content from other information such as formatting and metadata, and has established itself as a standard for cross-platform and cross-program data exchange of all kinds. The situation is by no means ideal, though: many manufacturers push their own interpretations of these standards, so for many the "dirty" bilingual Word document remains the measure of things.
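
Because TMX is plain XML, a translation memory can be read with any standard XML parser. The short Python sketch below parses a tiny TMX document and returns each translation unit as a mapping from language code to segment text. The sample unit, the header values and the function name read_units are invented for this illustration; real TM exports usually carry more attributes and inline tags, which this sketch ignores.

```python
import xml.etree.ElementTree as ET

# Hypothetical German/English translation unit, written for this illustration.
TMX_SAMPLE = """<tmx version="1.4">
  <header creationtool="example" creationtoolversion="1.0" segtype="sentence"
          o-tmf="none" adminlang="en" srclang="de" datatype="plaintext"/>
  <body>
    <tu>
      <tuv xml:lang="de"><seg>Guten Morgen</seg></tuv>
      <tuv xml:lang="en"><seg>Good morning</seg></tuv>
    </tu>
  </body>
</tmx>
"""

# ElementTree expands the reserved xml: prefix to this namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def read_units(tmx_text):
    """Return one {language: segment text} dict per translation unit."""
    root = ET.fromstring(tmx_text)
    units = []
    for tu in root.iter("tu"):
        # Each <tuv> holds one language variant of the same segment.
        variants = {tuv.get(XML_LANG): tuv.findtext("seg") for tuv in tu.findall("tuv")}
        units.append(variants)
    return units


for unit in read_units(TMX_SAMPLE):
    print(unit["de"], "->", unit["en"])
```

The segment text and the language labels sit in separate places (element content versus attributes), which is the separation of content and other information that makes exchange between tools possible in principle.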


OmegaT 

OmegaT: open-source, TM-based translation tool for Windows, Mac OS X and Linux
At the moment, OmegaT is the only open-source TM tool that is ready for production use. It is written in Java and can work directly on Microsoft's OOXML files. Documents created in earlier versions of Office must first be converted, either to OOXML with MS Office 2007/2010/2011 (Mac) or to the OpenOffice format, before they can be translated. When converting to OpenOffice, the previously mentioned risk remains that elaborately formatted documents do not survive the transition without formatting changes.

The openTM2 project has yet to grow beyond the beta stage. Its focus is an open-source implementation of a TM veteran, the IBM Translation Manager. The lofty goal of openTM2 is to become the reference platform for the translation memory exchange standard TMX. The trial version currently available runs only on Windows.

...to be continued ... 1/2                  

Translation: smo 
