About this blog

Translator's Shack is a collection of links, news, reviews and opinions about translation technologies. It's edited and updated by Roberto Savelli, an English to Italian translator, project manager and company owner of Albatros Soluzioni Linguistiche, a team of English-Italian translators, which hosts and supports this blog.

The Life as a PM category, managed by Gabriella Ascari, contains topics that are less technical in nature, but which we're sure will be appreciated by owners of small translation businesses and freelancers.


memoQfest 2010 – “ask the Geeks” Q&A session

New feature for memoQ 4.2 or later: project archival. Backup and restore functionality for the project including TM and TB. Useful for moving projects to a new PC and keeping two PCs in sync.

Speed degradation on network drives: it’s a problem created by the underlying input/output system.

More on archiving and paths. Suggestion to make relative paths the default. This suggestion is being considered.

In 4.2 a brand-new aligner is available. Interface has been reworked, improvements with multiple documents. Segments are editable.

Problems in handling Chinese/Japanese content: the focus from the input method editor shifts away from the main window. Developers are working to solve the problem. It appears to be a rather complex bug. There is also an alleged problem with fonts since version 4.0. If you copy Japanese text from the source to the target and the target uses a Latin-based font, you may get useless squares. However, most users translating from/to Japanese do not have a problem with this.

Dragon Naturally Speaking support: support in the 4.0 version has problems. These should be solved in version 4.2.

Feature request: hide the mouse pointer when the user starts typing. This is probably already supported by your mouse driver.

Plans to have a clone project feature for local projects.

Feature request: search for tags and/or filter for tags. It’s being considered.

Request: make the metadata from the term base visible in the translation environment. In all probability, qTerm will be addressing this issue.

Hunspell problem with the Romanian dictionary. Perhaps the best solution is to replace the default dictionary with a new one.

The Ctrl-Shift-B and Ctrl-Shift-N keyboard shortcuts move the selection to the left/right.

Request: if the source term is in small caps, insert it as small caps even if it’s capitalized in the target term base.

Fragment-assembling: often short segments from the TM take precedence over the same term from the term base. Sometimes the result is that a term is inserted with wrong capitalization. Perhaps this issue will be solved in qTerm. qTerm will also offer filtering capabilities.

Inverting the “direction” of translation memories. Version 4.5 will introduce features that will make this “problem” obsolete.

Inline tags: if you have translated a project containing bilingual files that use the “old” type of tags (inserted by F8), and then receive a new version that uses the “new” type of tags (F9), your match rates will decrease because the tags are not substituted on-the-fly. This is something the developers are working on.

Request: LAN access to term bases and translation memories without using memoQ server: this is recognized as an important feature, but it’s not going to happen. Tiny translator groups still have to purchase the server version if they want to share resources.

Request: add a keyboard shortcut to add a term as untranslatable. This is being considered as an addition to future versions.

memoQfest – XLIFF as a bilingual interchange format

Presentation by Thomas Imhof from localix.biz. Just some quick notes here.

Some interesting concepts:

  • allows translators to concentrate on the text rather than on the formatting.
  • standardized exchange of localization data
  • can serve as a common format for localization tool vendors
  • supports review comments, translation status of each string
  • XLIFF allows the target document to be created at any stage
  • Custom namespaces and attribute values make it possible to extend the information included in XLIFF files
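
To make the points above a bit more concrete, here is a minimal sketch of how the source text, target text, translation status and review notes can be read from an XLIFF 1.2 file with plain Python. The sample document and the function name read_units are my own illustration, not something shown in the presentation.

```python
import xml.etree.ElementTree as ET

# Namespace of XLIFF 1.2 documents
XLIFF_NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}

# Made-up sample file, only for illustration
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="hello.html" source-language="en" target-language="it" datatype="html">
    <body>
      <trans-unit id="1">
        <source>Save the file</source>
        <target state="translated">Salva il file</target>
        <note>Check with the client glossary</note>
      </trans-unit>
      <trans-unit id="2">
        <source>Cancel</source>
        <target state="needs-translation"/>
      </trans-unit>
    </body>
  </file>
</xliff>
"""


def read_units(xliff_text):
    """Return one dict per trans-unit: id, source, target, state, notes."""
    root = ET.fromstring(xliff_text.encode("utf-8"))
    units = []
    for tu in root.iterfind(".//x:trans-unit", XLIFF_NS):
        source = tu.find("x:source", XLIFF_NS)
        target = tu.find("x:target", XLIFF_NS)
        units.append({
            "id": tu.get("id"),
            # itertext() flattens inline elements into plain text
            "source": "".join(source.itertext()) if source is not None else "",
            "target": "".join(target.itertext()) if target is not None else "",
            "state": target.get("state") if target is not None else None,
            "notes": [n.text or "" for n in tu.findall("x:note", XLIFF_NS)],
        })
    return units


for unit in read_units(SAMPLE):
    print(unit["id"], unit["state"], unit["source"], "->", unit["target"])
```

The status lives in the state attribute of each target element and the comments in the note elements, so no tool-specific extension is needed to read them.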

Some limitations of XLIFF:

  • XLIFF knows nothing about segmentation. [see comments section. This appears not to be the case]
  • Extensibility is limited to the specific tool that added the specific extra features.
  • Inline elements: XLIFF does not control the filtering process, so the notation of inline elements is entirely in the hands of the translation tool vendor.

XLIFF support in the current translation tools:

Thomas divides XLIFF support in today’s tools into three groups:

level 1: source is copied to target. Considered as “messy”, offered by many translation tools today

level 2: offered by memoQ and Trados Studio, which open the files correctly and handle elements more or less correctly. They use custom namespaces for tool-specific functionality.

level 3: offered by Swordfish and Heartsome, which offer full support for all XLIFF functionalities and features and do not add custom namespaces. They use the “note” element offered by XLIFF.

memoQ works well when opening third-party XLIFF files. Roundtrip of SDLXLIFF files produced by Trados Studio works well, but some Trados-specific attributes (e.g. segment status) are not updated.

Best practices:

  • make sure XLIFF file is bilingual and not multilingual
  • alt-trans elements are not supported in memoQ
  • etc.
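
The first two checks could be automated before importing a file. The following is only a sketch under my own assumptions: it treats a file as bilingual when all its file elements share one source/target language pair, and it simply counts alt-trans elements. The function name precheck_xliff and the sample are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}

# Made-up sample with two target languages and one alt-trans element
SAMPLE = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="a.txt" source-language="en" target-language="it" datatype="plaintext">
    <body>
      <trans-unit id="1">
        <source>Open</source>
        <target>Apri</target>
        <alt-trans><target>Aprire</target></alt-trans>
      </trans-unit>
    </body>
  </file>
  <file original="a.txt" source-language="en" target-language="de" datatype="plaintext">
    <body>
      <trans-unit id="1"><source>File</source><target>Datei</target></trans-unit>
    </body>
  </file>
</xliff>
"""


def precheck_xliff(xliff_text):
    """Report the language pairs and the number of alt-trans elements."""
    root = ET.fromstring(xliff_text.encode("utf-8"))
    pairs = {
        (f.get("source-language"), f.get("target-language"))
        for f in root.findall("x:file", NS)
    }
    return {
        "language_pairs": pairs,
        # Bilingual here means: exactly one pair, with both languages declared
        "bilingual": len(pairs) == 1 and all(next(iter(pairs))),
        "alt_trans": len(root.findall(".//x:alt-trans", NS)),
    }


print(precheck_xliff(SAMPLE))
```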

memoQfest 2010 – AGITO Translate

Next item on the program is AGITO Translate, a web-based translation environment based on memoQ server. The foundation for this system is the memoQ APIs.

The main feature of this system is its simplicity. Thor Angelo from LanguageWire (the translation company that develops AGITO) admits that although AGITO might be even too simple for some translators, it’s the ideal solution for some clients who require super-fast turnaround and who send frequent, but small chunks of material to be translated. For instance, advertising agencies, web service companies, search engine optimization firms.

AGITO offers a modular approach (term base, translation, editor, integration, authoring, etc.). Clients, as well as translators, can access it through the web interface.

Interesting concept: a brief history outlining the transition of tools from everything offline, to TB and TM online, to documents online, to application online, which is supposed to be the final stage we are getting to now.

AGITO allows translators and proofreaders to access the same document simultaneously. No software installation is required, and project managers can see the real-time status of each job.

Some examples of problems on the user’s end were presented, for instance trouble with installing translation tools, problems with timely delivery, with completeness, etc. AGITO aims to solve these problems by simplifying the whole process on the translator’s end.

During the Q&A session, some concerns were raised by the audience, e.g. spelling control (it’s handled by the browser), quality assessment (there are some basic checks like double spaces etc. but according to LanguageWire a separate proofreader is the way to go). Also, the translator is not allowed to use his/her own translation memories or term bases. Moreover, at the current stage the translator has no local copy of the translation material.

Some concern was expressed about confidentiality. The system is protected by secured passwords. “Just like your home banking system”.

The system is, in theory, ideal for crowdsourced translation projects.

In conclusion, AGITO is certainly not a product that our team of translators would like to use any time soon, but it’s an innovative concept that could be interesting for some agencies that work with very tight deadlines and require multi-user collaboration without the overhead of supporting local installations.

memoQfest 2010 – Kilgray technology update

Gábor L. Ugray talked about some new aspects introduced by version 4.0, like the use of “resources” (e.g. the expected translation memories and term bases, but also segmentation rules, filter configurations, QA settings etc.), all of which can be shared on a server and deployed to the various translators from one central point.

Gábor briefly explained the concept of offline project handoffs, in which the project manager sets the project preferences (like the TMs and TBs to use, as well as more specific settings) and sends handoff packages to the translators, who do not have to worry about the setup, because it’s already contained in the handoff.

New editor created for version 4.0, now Unicode-enabled. A vast improvement over the previous versions of the editor, where tasks like selecting text using the keyboard were a bit awkward.

The memoQ server API (for connecting memoQ server to project management systems and customer management systems, for instance) is not complete for 4.0, but it’s going to be fully available for 4.2, to be released this week.

Machine translation integration: Kilgray is still evaluating this feature. There are still some concerns about privacy, licenses, copyrights, etc. to be addressed.

Some other concepts in the pipeline: online review interface built around memoQ server (allowing reviewers to work in a browser even if they do not have memoQ), qTerm terminology management system (with TBX support and memoQ integration), plus “two major surprises”, probably two new versions to be released before the end of the year.

Kilgray is also working on terminology extraction features, to be released sometime in the future.

During the Q&A session, the availability of a Java property filter functionality was revealed.

memoQfest 2010 – Kilgray’s progress report

Kilgray’s progress report – some notes

I’m posting a few notes about the technical aspects that were mentioned during this presentation by Balázs Kis and Peter Reynolds.

Version 4.2 will be released during the memoQfest, probably today or tomorrow. Some significant new features are included, like the two-column export format (allowing reviewers to work on bilingual files without using a translation tool).

TM Repository is approaching its release date.

qTerm online terminology management system will be released later this year.

Interestingly, the concept of “empathy” was used to describe the approach to product support.

memoQ masterclass by Angelika Zerfass, part 3

memoQ term bases

After the lunch break, the structure of memoQ term base entries was discussed. Angelika explained the import mappings for CSV files.

One trick for importing term bases in the fastest way possible: if you work frequently with one language pair and always use the same term base structure, export a sample term base from memoQ and delete all the content except the rows containing the headings. Then use this CSV template every time by pasting your contents under the column headings. When you have to import the resulting term base into memoQ, you will not need to do any mapping, because the column headings will be correctly accepted and configured by memoQ.
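
If you would rather script the template trick than paste into a spreadsheet, here is a minimal sketch. The column headings (English, Italian, Note) and the comma delimiter are placeholders of mine; the real ones are whatever your own memoQ export contains.

```python
import csv
import io


def fill_template(template_text, terms, delimiter=","):
    """Reuse the heading row of an exported template and append new terms.

    template_text: content of the CSV template (only its first row is used)
    terms: list of dicts keyed by the column headings found in the template
    """
    header = next(csv.reader(io.StringIO(template_text), delimiter=delimiter))
    out = io.StringIO()
    # restval fills columns you did not supply; unknown keys raise ValueError
    writer = csv.DictWriter(out, fieldnames=header, delimiter=delimiter,
                            restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(terms)
    return out.getvalue()


# Hypothetical headings: use the ones from your own exported term base
TEMPLATE = "English,Italian,Note\n"
TERMS = [{"English": "notification area", "Italian": "area di notifica"}]
print(fill_template(TEMPLATE, TERMS))
```

Because the heading row is copied from the export rather than retyped, a mistyped column name fails loudly instead of producing a file that needs manual mapping.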

In the current version of memoQ, only 5-6 hard-coded fields are available. While this is probably enough for most translators, organizations that have terminology management systems feel the limitation of this setup. That’s why Kilgray will introduce a brand-new terminology system that will contain custom fields and complex structures.

Terminology plug-ins

memoQ 4.2 offers terminology plug-ins. One example is the EuroTermBank: if you start a term lookup (Ctrl-P), you can type a term and choose to search for it not only in the normal memoQ term bases, but also in the online EuroTermBank database. Kilgray is part of the EuroTermBank consortium and can offer this feature to all its users for free. Needless to say, you need to be online in order to use this plug-in.

Two-column RTF export

Balázs Kis then proceeded to show the brand-new functionality called two-column RTF export. In the presentation, Balázs added a couple of comments to some segments, created a view that only included commented segments, and proceeded to export the view as a two-column RTF file. He then opened the resulting file in Word. The resulting file is a multi-column, editable file that a reviewer can use even if she does not have memoQ. The third column contains a color-coded value of the segment status. The general comment in the room was “this is better implemented than in Déjà Vu”, “great!”. There was even some applause! You could really tell that this was a long-awaited feature.

memoQ masterclass by Angelika Zerfass, part 2

CSV import of a TM

Use Olifant to open a TMX file, copy all rows (this is actually a great tip for converting a TMX file to a columnar format with just a couple of clicks) and paste them into Excel. You can change column headings in Excel. From Excel you have to export as a Unicode .txt file if you want the procedure to work.
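
For those who prefer a script to the copy-and-paste route, the same TMX-to-columns conversion can be sketched in a few lines of Python. This is not what was shown in the masterclass: the function names and the sample TMX are mine, and inline tags are simply flattened into text.

```python
import csv
import io
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under this expanded name
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Made-up sample TMX, only for illustration
SAMPLE_TMX = """<tmx version="1.4">
  <header creationtool="example" creationtoolversion="1" segtype="sentence"
          o-tmf="none" adminlang="en-US" srclang="en-US" datatype="plaintext"/>
  <body>
    <tu>
      <tuv xml:lang="en-US"><seg>Open the file</seg></tuv>
      <tuv xml:lang="it-IT"><seg>Aprire il file</seg></tuv>
    </tu>
    <tu>
      <tuv xml:lang="en-US"><seg>Close the file</seg></tuv>
      <tuv xml:lang="it-IT"><seg>Chiudere il file</seg></tuv>
    </tu>
  </body>
</tmx>
"""


def tmx_to_rows(tmx_text, src_lang, tgt_lang):
    """Return (source, target) tuples for one language pair."""
    root = ET.fromstring(tmx_text.encode("utf-8"))
    src, tgt = src_lang.lower(), tgt_lang.lower()
    rows = []
    for tu in root.iter("tu"):
        segs = {}
        for tuv in tu.findall("tuv"):
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            seg = tuv.find("seg")
            if seg is not None:
                segs[lang] = "".join(seg.itertext())
        if src in segs and tgt in segs:
            rows.append((segs[src], segs[tgt]))
    return rows


def rows_to_tsv(rows, header=("Source", "Target")):
    """Serialize the rows as tab-delimited text with a heading row."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return out.getvalue()


print(rows_to_tsv(tmx_to_rows(SAMPLE_TMX, "en-US", "it-IT")))
```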

Advanced HTML import procedures

Balázs Kis from Kilgray explains how the standard HTML filter in memoQ will not make some HTML attributes editable (e.g. IMG titles). As a workaround, you can use Import file as… and use the XML import file to specify how every single tag should be treated, so any tag attribute can be made editable. Interestingly, this XML file is always used, behind the scenes, when you import an HTML file using the standard settings.

Bilingual formats managed by memoQ

.MBD

.DOC

.TTX (you need to pre-segment TTX files before importing them into memoQ. Unless the file is pre-processed this way, there’s no guarantee that it will work when opening the file in TagEditor). Here’s where the option is located in Trados:

[image: location of the option in Trados]

.XLIFF

.SDLXLIFF, containing lots of Trados-specific metadata. You can process this type of file in memoQ, but some metadata (like segment status) will not be preserved.

.RTF multi-columnar export, which allows other tools or a word processor to be used for reviewing the translation.

.Transit

Handoff packages

This feature allows the project manager to create an offline project and send handoff packages containing all the resources needed for carrying out the project (term bases, translation memories, non-translatables, etc.). If you assign different files to different users, memoQ will create as many packages as the number of translators, and include each translator’s name in the package’s file name. The packages can contain TMX files if the translators have to work offline, or a .TMI file, which is a reference to the server TM, for server-based projects.

memoQ masterclass by Angelika Zerfass, part 1

TMX

A short comparison between the contents of TMX data coming from different translation tools (Trados 2007, Studio, memoQ).

TMX can contain tool-specific information (additional fields, segment status, segment context, alignment penalty, etc.) that’s not easily imported into other tools.

If a TMX import does not work, look at the language identifiers first.
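
A quick way to look at the language identifiers is to list every code used in the file and how often it appears. The sketch below is my own illustration (function name and sample data are made up); it reads the srclang attribute of the header and the xml:lang attribute of each tuv element.

```python
from collections import Counter
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Made-up sample with inconsistent language codes
SAMPLE_TMX = """<tmx version="1.4">
  <header creationtool="example" creationtoolversion="1" segtype="sentence"
          o-tmf="none" adminlang="en-US" srclang="en-US" datatype="plaintext"/>
  <body>
    <tu>
      <tuv xml:lang="en-US"><seg>Open</seg></tuv>
      <tuv xml:lang="it-IT"><seg>Apri</seg></tuv>
    </tu>
    <tu>
      <tuv xml:lang="EN-US"><seg>Close</seg></tuv>
      <tuv xml:lang="it"><seg>Chiudi</seg></tuv>
    </tu>
  </body>
</tmx>
"""


def language_codes(tmx_text):
    """Return the header srclang and a count of each tuv language code."""
    root = ET.fromstring(tmx_text.encode("utf-8"))
    header = root.find("header")
    srclang = header.get("srclang") if header is not None else None
    counts = Counter(
        tuv.get(XML_LANG) or tuv.get("lang") for tuv in root.iter("tuv")
    )
    return srclang, counts


srclang, counts = language_codes(SAMPLE_TMX)
print("header srclang:", srclang)
for code, n in counts.items():
    print(code, n)
```

In the sample, the same two languages appear under four different codes (en-US, EN-US, it-IT, it), which is exactly the kind of mismatch worth checking before blaming the import filter.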

A memoQ TMX file contains TM-level (project, client, etc.) and segment-level (project etc., but also changeID, client, corrected, aligned, context) metadata.

The information entered into the User and meta-information fields is case-sensitive, so this can lead to data duplication.
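
To see how quickly this happens, here is a tiny sketch that groups field values differing only in upper/lower case or stray spaces. The client names are invented and the function name is mine.

```python
from collections import defaultdict


def case_variants(values):
    """Group values that are identical except for case or outer spaces."""
    groups = defaultdict(set)
    for value in values:
        cleaned = value.strip()
        groups[cleaned.casefold()].add(cleaned)
    # Keep only the groups that really contain more than one spelling
    return {key: sorted(found) for key, found in groups.items() if len(found) > 1}


clients = ["Albatros", "albatros", "ACME", "Acme ", "Kilgray"]
print(case_variants(clients))
```

Each group in the result is one client that a case-sensitive field would store as several different values.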

For the moment, the meta-information fields for each new project are limited to username, project ID, domain, client, subject.

Since the user ID is overwritten when a new user changes a field, a workaround is to use the “subject” (or another) field to specify the name of the original translator.

Trados has both creation ID and change ID fields. memoQ only has the change ID field, so when you import from Trados, the change ID from Trados will be imported and the creation ID will be lost.

Trados Studio includes context information in its TMX exports, but this is in the form of hashes (a long string of digits) and does not contain the actual context strings like memoQ’s exports do. As a consequence, context information cannot be imported into memoQ from Trados.

The memoQ TM import settings

The field Process TRADOS TMX for best results in memoQ should be used if both the TM and the translatable files are in Trados format.

Import <ut> as memoQ tag: this allows support for legacy TMs. “UT” means “Unknown tag” here.

For Trados versions up to 2007, it’s very important to apply the penalties. Otherwise the statistics will treat segments that are almost identical except for punctuation, tags, and formatting, as identical.

If Use context is selected, you should not use Allow multiple translations. Two identical segments with different context will both be saved to the TM.

Olifant

Angelika showed us some quick methods for doing maintenance on TMX files using Olifant. Delete duplicate and inconsistent segments.
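
The two maintenance operations are easy to describe in code. The sketch below is not Olifant’s implementation, just my own illustration on a list of (source, target) pairs: exact duplicates are dropped, and sources with more than one distinct translation are reported as inconsistent.

```python
from collections import defaultdict


def clean_pairs(pairs):
    """Drop exact duplicates and report sources with conflicting targets."""
    # dict.fromkeys keeps the first occurrence and preserves the order
    unique = list(dict.fromkeys(pairs))
    by_source = defaultdict(list)
    for source, target in unique:
        by_source[source].append(target)
    inconsistent = {s: t for s, t in by_source.items() if len(t) > 1}
    return unique, inconsistent


pairs = [
    ("Save", "Salva"),
    ("Save", "Salva"),      # exact duplicate
    ("Cancel", "Annulla"),
    ("Save", "Salvare"),    # same source, different target
]
unique, inconsistent = clean_pairs(pairs)
print(unique)
print(inconsistent)
```

Inconsistent entries are only reported, not deleted, because deciding which translation to keep is a job for a human.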

It’s probably best to re-create a TM from scratch after the edits, rather than import back into the original memoQ TM.

Import of a Trados TM into memoQ

In order to map the Trados fields to the corresponding memoQ fields, a search & replace is done in the TMX file, using a text editor. The values contained between quotes in the <prop type> fields are replaced by the corresponding hardcoded field names for memoQ.
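
The same search & replace can be scripted when there are many fields. In this sketch the field names on both sides of the mapping are placeholders that I made up: check your own TMX export for the names Trados actually wrote, and the memoQ documentation for the names memoQ expects.

```python
import re

# Hypothetical mapping: left = name found in the Trados TMX, right = memoQ name
FIELD_MAP = {
    "Txt::Client": "client",
    "Txt::Subject": "subject",
}


def remap_prop_types(tmx_text, field_map):
    """Replace the value of type="..." in <prop> elements using field_map."""
    def swap(match):
        old_name = match.group(2)
        # Names missing from the mapping are left untouched
        return match.group(1) + field_map.get(old_name, old_name) + match.group(3)

    return re.sub(r'(<prop\s+type=")([^"]*)(")', swap, tmx_text)


sample = '<prop type="Txt::Client">ACME</prop><prop type="Other">x</prop>'
print(remap_prop_types(sample, FIELD_MAP))
```

Only the type attribute of prop elements is touched, so segment text that happens to contain the same words is not altered, which is the main risk of a blind search & replace in a text editor.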

If the search & replace is too complex, it’s probably better to export to CSV from Olifant and import the CSV into memoQ, where field mappings can be set up during the pre-import procedure.

Based on the participants’ inputs, here are some of the issues that users face when migrating their resources to memoQ from other tools.

  • Loss of metadata
  • Loss of tags
  • TM files too big for import
  • Moving TMs from one server to another
  • Moving term bases

TMbuilder (translation memory export creator)

TMbuilder is a small tool that makes building up TM export/import files as straightforward as possible. You can use it to batch-import several files in Excel (2003 or 2007) or tab-delimited format and build a Trados-compatible or TMX 1.4b file with a couple of mouse clicks. Here are some more details about the features:

– Accepts two input formats: tab-delimited text files and MS Excel spreadsheets
– Creates output files in two file formats: Translator’s Workbench 7.x/8.x (TXT) and the Translation Memory eXchange (TMX)
– Works on multiple input files and offers a merging feature, so the result can be just one import file
– Allows the user to specify standard TM fields, like: source and target ISO flags, segment descriptions and author name
– Removes additional quotes often created by MS Excel when saving the file as text
– Works with standard encodings: Unicode and UTF-8
– Rapid file creation: milliseconds for .txt and seconds for .xls input files
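
To give an idea of what such a conversion involves, here is a minimal Python sketch that turns tab-delimited text into a TMX document. It is not TMbuilder’s code: the function name, the header values and the sample rows are my own assumptions. The csv module takes care of the extra quotes that Excel adds around fields.

```python
import csv
import io
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def tsv_to_tmx(tsv_text, src_lang, tgt_lang):
    """Build a TMX document from two-column, tab-delimited text."""
    tmx = ET.Element("tmx", {"version": "1.4"})
    ET.SubElement(tmx, "header", {
        "creationtool": "example-script",   # placeholder values
        "creationtoolversion": "0.1",
        "segtype": "sentence",
        "o-tmf": "tab-delimited",
        "adminlang": "en-US",
        "srclang": src_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    # csv.reader strips the surrounding quotes and un-doubles inner quotes
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if len(row) < 2 or not row[0] or not row[1]:
            continue  # skip empty or incomplete rows
        tu = ET.SubElement(body, "tu")
        for lang, text in ((src_lang, row[0]), (tgt_lang, row[1])):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")


SAMPLE_TSV = 'Open\tApri\n"He said ""hi"""\t"Ha detto ""ciao"""\n'
print(tsv_to_tmx(SAMPLE_TSV, "en-US", "it-IT"))
```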

The application is free for non-commercial use and can be distributed as a standalone executable program. It requires Microsoft .NET Framework 3.5.

TMbuilder – the easiest Translation Memory export creator

Italian Localization problem n. 1 – System tray

There are some recurring terms in software localization which do not seem to have a well-established Italian translation, even if their meaning is very clear and they should be treated as 1-to-1 correspondences.

One of them is “system tray”, a commonly-used term that refers to a “portion of the taskbar that displays icons for system and program features (…)”, according to Wikipedia.

On the same Wikipedia page, we learn that

The notification area is commonly referred to as the system tray, which Microsoft states is wrong, although the term is sometimes used in Microsoft documentation, articles, and software descriptions.

What this means is that the term “system tray” should be avoided in English documentation that refers to Microsoft operating systems. If you find it while translating, you may want to suggest that the author change it to “notification area”.

If we take a look at Microsoft’s own glossaries, here are the results for system tray. The term is not displayed in the blue “Microsoft Terminology Database” area, indicating that it may not be an official Microsoft term. The in-context results displayed in the orange area contain several inconsistencies. The Italian translation used in the newest products (Windows 7, Vista, Server 2008) seems to be “area di notifica”.

A quick search for notification area reveals that this term is also translated as “area di notifica” and that this is an official Microsoft term (contained in the Microsoft Terminology Database).

Image: Danilo Rizzuti / FreeDigitalPhotos.net