Creating Firefox smart keywords for quick access to frequently-used translation glossaries, dictionaries, resources, etc.

The Search bar

The Firefox Search bar is a convenient way to query search sites without first visiting the site’s home page and locating its search field.

So, instead of heading to Answers.com, finding the search field, typing the search term and pressing enter, you can stay in any page you are on, click on the Search bar down arrow to select which engine to use (if it’s not already selected), type your search and press enter. The relevant search results will be displayed immediately. It’s worth remembering that the shortcut key for placing the cursor into the Search bar is Ctrl-E or Ctrl-K.

Adding common search engines

You can also add a search engine directly from the page you are visiting if the site’s publisher has made this feature available.

In that case, the down arrow will “glow” to show that a search engine can be added. See top-left corner in the screenshot to the right (this is way too subtle for me and I always overlook this information).

In the example, two search engines are “discovered” while visiting the CNN.com website.

Adding specific search engines

The Firefox Search bar comes pre-loaded with Google, Yahoo, Amazon, eBay, Answers.com, and Creative Commons search, but it’s easy to add more by visiting popular resources such as the Firefox Add-ons page or the Mycroft Project. These pages contain very specific search engines such as the ProZ term and glossary search, IATE, etc.

So, what’s the problem with the search engine list?

After adding a dozen or so search engines for useful and fun websites, I noticed that my list started to grow much too long and that it was very impractical to use it by clicking on it and scrolling to the right engine. If you stick to about 5-10 engines, you’ll probably be fine with the standard configuration, but if you use several resources for terminology research while you are translating, you’ll soon realize how frustrating it is to get the mouse, click on the list, remember and then find the right engine for the job, go back to the keyboard, etc.

There must be a better way to accomplish this. Of course there are several extensions, add-ons and utilities that can help you get quick access to your favorite search engines. If you, like me, prefer a minimalist approach to computing and want to avoid having all those tiny utilities sitting in the system tray, eating processor cycles and creating conflicts, you may want to read on.

Adding search engines the “geeky way”

A first helpful feature provided by Firefox is the Keyword option that appears in the right-hand column of the Manage Search Engine List window, accessible by clicking on the down arrow in the Search bar and choosing Manage Search Engines…

Select any search engine in this window, then click the Edit Keyword… button. Then type the keyword you want to use for this specific engine.

In the screenshot to the right I have specified “dm” for my online Italian dictionary of choice.

Once you have confirmed and closed the window, you can use the keywords for quick access to these engines, like this:

Click on the URL bar, or, better, enter it by using the appropriate shortcut, Ctrl-L
Type the keyword for the search engine, followed by your query, for instance “dm motore”, no matter what page you’re on. Hit Enter.
Bang! You’re taken to the results page of your query on the search engine corresponding to the keyword, no questions asked, no clicks involved.

Firefox offers yet another, perhaps not as widely known way of consulting specific search engines. These engines will not appear in the Search bar, but are still quickly accessible by using custom keywords chosen by the user.

Supposing we want to add a specific IATE search for engineering terms from English to Italian:

First, build a sample search by going to the IATE page and by setting your specific options. Click on the screenshot to the right to see how I set the options for my specific purpose. You can change the language combinations and sectors to your own preference.

Once all the desired options are in place, right-click on the “Search term” field and choose the option Add a Keyword for this Search…
In the window that appears, type a descriptive name for this search in the “Name” field, and an easy-to-remember keyword in the “Keyword” field. Also, choose where you want to keep the relevant bookmark in the “Create in” field. It’s probably a good idea to keep all the keyword searches in a separate folder in the bookmark structure. Press “Add” to confirm your choice.
Time to test the search. Go to the URL bar (Ctrl-L), type the keyword (in this case “iatemec”), followed by the term. Then press Enter.
Bang! You get the list of results, filtered by the options (language combination, domain) that you specified during the one-time creation of the search keyword.
This works particularly well for websites that insist you choose a plethora of options to narrow down your results every time you get to the initial search mask.
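For the curious, here is roughly what happens behind the scenes. A keyword search is stored as an ordinary bookmark whose address contains a %s placeholder, and Firefox replaces that placeholder with whatever you type after the keyword. The Python sketch below only imitates that substitution: the URL template, the `KEYWORDS` table and the `expand_keyword` helper are made up for illustration, and the exact escaping rules Firefox applies may differ.

```python
from typing import Dict, Optional
from urllib.parse import quote_plus

# Hypothetical keyword bookmarks: keyword -> URL template with a %s placeholder.
# The address below is invented; the real IATE bookmark will look different.
KEYWORDS: Dict[str, str] = {
    "iatemec": "https://example.org/iate/search?source=en&target=it&domain=engineering&query=%s",
}


def expand_keyword(typed: str, keywords: Dict[str, str] = KEYWORDS) -> Optional[str]:
    """Turn 'keyword search terms' into a URL, or return None for an unknown keyword."""
    # The first word is the keyword, everything after it is the query.
    keyword, _, query = typed.strip().partition(" ")
    template = keywords.get(keyword)
    if template is None:
        return None
    # Escape the query so that spaces and accented letters survive inside a URL.
    return template.replace("%s", quote_plus(query.strip()))


print(expand_keyword("iatemec gear box"))
```

Because the options (languages, domain) are frozen in the template, only the search term changes from one lookup to the next, which is exactly why this trick saves so many clicks.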

LTC Worx (web-based translation project management tool) version 1.3 released


Here’s an excerpt from the release announcement:

The Language Technology Centre is proud to announce the release of version 1.3 of LTC Worx, its cutting-edge web-based business system for multilingual information management.

LTC Worx adapts to existing processes, enabling users to optimise and manage them, all the way from the initial request to the final invoice generation.

It includes features for Project, Document and Finance Management, alongside Supplier and Client databases.

LTC Worx 1.3 has a range of major new features and many other smaller enhancements. New features include a new look and feel for the user interface and several enhancements to the timesheet module, as well as extended reporting facilities.

LTC – LTC Worx version 1.3 released

Another online PDF to Word conversion service — this time with OCR included

After the not-so-great results I obtained with free online OCR services for PDF files (the main problem being that most services do not perform OCR at all, but just convert editable PDF text to Word and ignore text embedded as graphics), I may have found a service that actually delivers on this promise: OnlineOCR.net. From the site’s own description:

OnlineOCR.net is a web-based Optical Character Recognition (OCR) service that allows you to convert scanned images and documents into editable Word, Text, Excel, PDF, Html output formats.

A couple of minor caveats
  • You need to get a (free) account if you want to convert PDF to DOC
  • The activation email I received ended up in my Gmail spam. So you may want to check your Spam folder if you think you have not received the activation message.

Testing the system

I did a test with a two-page PDF file containing editable text in fancy formatting on page 1 and text pasted in as lo-res graphics on page 2.

The first thing that you’ll notice when uploading your first document is the language choice: this is very positive, as it means that the service will compare the scanned text to a language-specific wordlist to correct any errors.
Options let you specify that you are uploading a multi-page document and which pages you want to convert.
After the document has been processed (which took about 30 seconds in my test), you are taken to the “Workspace”, where a list of all processed documents is available. From there you just need to click on the link of your converted document to download it.

Results

The system worked fairly well with my test document. Page 1 was rendered without any spelling errors and this confirms my impression that the editable text contained in the PDF is preserved without running it through OCR, which is great. The system has added frames, section breaks and tables in order to render the “fancy” multi-column formatting of the source PDF file.

Page 2 of the DOC file, which contained the graphic text, was rendered with some errors. This was low-resolution text, and you might obtain better results with better-quality embedded graphic text. In this case, too, the formatting was rendered by inserting tables and section breaks.

One advantage that was immediately noticeable was the fact that OnlineOCR does a rather good job at preserving the original’s formatting and does this without adding superfluous carriage returns, which are such a nuisance for translators since they disrupt the sentence-by-sentence sequence used by most CAT tools.
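To show why those extra carriage returns matter, here is a small Python sketch of the cleanup you would otherwise have to do before feeding a converted file to a CAT tool. It assumes that a blank line marks a real paragraph break and that any other line break is superfluous; this is a simple heuristic of mine, not something OnlineOCR or any CAT tool does, and `unwrap_hard_breaks` is a hypothetical helper name.

```python
import re


def unwrap_hard_breaks(text: str) -> str:
    """Join lines broken mid-paragraph; blank lines still separate paragraphs."""
    # Normalise Windows and old Mac line endings first.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Assumption: one or more blank lines mark a genuine paragraph break.
    paragraphs = re.split(r"\n\s*\n", text)
    joined = [
        " ".join(line.strip() for line in paragraph.split("\n") if line.strip())
        for paragraph in paragraphs
    ]
    return "\n\n".join(paragraph for paragraph in joined if paragraph)


sample = "This sentence was broken\r\nin the middle by the converter.\r\n\r\nSecond paragraph."
print(unwrap_hard_breaks(sample))
```

Without this step, a tool that segments on paragraph marks would treat “This sentence was broken” and “in the middle by the converter.” as two separate segments.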

Verdict

I could not find any information on the website that would indicate a payment plan for this service, so I would assume it’s offered for free. Considering the price, I think that this system is well worth a try if you need to convert a PDF file into an editable format. If the PDF document only (or mainly) contains editable text, you will be pleased by the results. If the file also contains text that has been pasted as graphic pages, the output will likely require some post-editing, but I think that will be comparable to what you may obtain with the majority of commercial OCR packages.

OnlineOCR.net

Anaphraseus (free, open-source, multi-platform translation memory tool based on OpenOffice) version 1.23 beta released

Here is a previous mini-review I wrote about this program.

These are the improvements added in this beta version:

  • Clean Up in text tables
  • OmegaT TMX format loading.
  • Slight changes in TM loading code.
  • Simple statistic.
  • Big icons for Ubuntu and MacOS
  • Fixed bug in creation *.ini file on Linux
  • Fixed bug in Vista open/save dialogs
  • Added Wordfast TM’s character codes
  • Code reviewed under Wordfast’s specifications
  • All TMX operation runs by "TMX Import" button now
  • Fixed bug with delimiter

Via: SourceForge.net: Anaphraseus: Files

Tool to translate Trados TagEditor (TTX) files using OmegaT

Kevin Lossner of the Translation Tribulations blog reports the release of “Toxic”, a tool by the OmegaT developers that should allow translators to use OmegaT for translating files saved in TagEditor.

The script, which includes a “readme” instruction file, is available here:

http://www.omegat.org/resources/toxic.zip

Via Translation Tribulations: Toxic for OmegaT!

SpeechTechMag.com: AppTek Launches Hybrid Machine Translation Software

Speech Technology Magazine has published an article about AppTek’s hybrid machine translation software. Here’s a brief excerpt:

According to Hassan Sawaf, chief scientist at AppTek, the company’s hybrid model is unlike any other system on the market today—a fact that has lead some universities to attempt to copy the hybrid model.

“Even if companies attempt to hybridize they only do hybridization insofar as that they basically combine translation memory with machine translation,” he says. “Hybridization like we do and a tight integration of rule-based features and statistical-based features are unique.”

AppTek’s HMT solution provides a full integration of both methodologies instead of simply adding rules to the statistical system or a minor statistical module to the rule-based engine.

SpeechTechMag.com: AppTek Launches Hybrid Machine Translation Software

Translator handbook for open-source projects

The YACS (yet another community system) blog contains a Translator handbook section that can be useful to translators interested in contributing to the translation effort of open-source projects.

Although the posts date back to 2007 and some instructions are specific to the YACS system, they can be useful to translators who are starting their first projects in the open-source world. Here are the links to the individual articles:

Translator handbook – www.yacs.fr

Alchemy Catalyst 8.0 localization environment pre-announced

In a message sent out today to current users, Alchemy announced the imminent release of its localization environment Catalyst 8. Alchemy’s website does not seem to contain any information about version 8 yet.

The main problem with previous versions of Catalyst, in my opinion, is the overdone attempt to make its interface user-friendly through floating windows, bars, widgets, context menus, and a whole series of interface gadgets that are only accessible through mouse clicks. This only makes the program confusing. Sometimes I have to read through the manual several times before I can find out how to perform simple tasks, such as applying multiple filters to sentences. Let’s take a look at the main changes introduced with version 8.

After skimming through the usual marketing blurb, here’s what I have found to be the most interesting new features:

Extensive Support for XML Content Management Systems: Alchemy ezParse technology has been extensively redesigned to support multi-lingual and conditional-based XML documents. The best bit!, as with all ezParse solutions, these parsers are developed in a highly visual development environment so you avoid writing the code yourself, Alchemy CATALYST does all the hard work for you!

My recollection is that the ezParse feature was not totally accessible in the Translator-Pro Edition. We’ll see if this has changed in any way.

The New Terminology Standard: TBX is rapidly becoming the standard for term-base sharing and lookups. Alchemy CATALYST 8.0 embraces this new standard and displays suggested terms within the Translator Toolbar. You can also export candidate terms to TBX files using the new and improved Export Project functionality.

This is a welcome introduction, although the previous version already allowed for simple terminology exchange through CSV files.

Machine Translation: Combine the accuracy of TM with the flexibility and speed of Machine Translation. Source segments that cannot be matched in a TM will automatically be sent to a web based Machine Translation service (MT) so that a translator always gets a translation suggestion while working.

As predicted by the main observers in this market, like Global Watchtower, almost all providers are scrambling to add some sort of machine translation functionality to their products. I’m skeptical that this overhyped technology will introduce any concrete benefits for professional translators, unless it’s considered as part of a whole process that includes strict terminology creation and control as a prerequisite.
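The workflow described in the announcement (look in the TM first, fall back to MT when nothing matches) can be sketched in a few lines of Python. This is only my illustration of the general idea, not how Catalyst implements it: the `suggest` function, the 0.75 threshold, the similarity measure from Python’s `difflib` and the fake MT service are all assumptions.

```python
from difflib import SequenceMatcher
from typing import Callable, Dict, Tuple


def suggest(
    source: str,
    tm: Dict[str, str],
    machine_translate: Callable[[str], str],
    threshold: float = 0.75,  # hypothetical minimum similarity for a TM match
) -> Tuple[str, str]:
    """Return (suggestion, origin), where origin is 'TM' or 'MT'."""
    best_score, best_target = 0.0, ""
    for tm_source, tm_target in tm.items():
        # Character-based similarity between 0.0 and 1.0.
        score = SequenceMatcher(None, source.lower(), tm_source.lower()).ratio()
        if score > best_score:
            best_score, best_target = score, tm_target
    if best_score >= threshold:
        return best_target, "TM"
    # No usable TM match: hand the segment over to the MT service.
    return machine_translate(source), "MT"


tm = {"Open the file": "Apri il file"}


def fake_mt(text: str) -> str:
    """Stand-in for a web-based MT service; it only tags the text."""
    return f"[MT] {text}"


print(suggest("Open the file", tm, fake_mt))
print(suggest("Restart the server", tm, fake_mt))
```

Note that nothing in this flow checks terminology, which is precisely my concern: the MT suggestion is only as good as the terminology control that surrounds it.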

Support for virtually every TM format: Alchemy CATALYST 8.0 is compatible with virtually every industry TM standard. It supports both desktop and server based TM technology, all integrated seamlessly into the Translator Toolbar. Technologies currently supported include Alchemy TTK (all versions), Alchemy PPF (PUBLISHER TM), Alchemy LANGUAGE EXCHANGE, WordFast, GlobalLink, SDL TM Server, Trados Teamworks, Trados Workbench, XLIFF, TBX, XML and TMX.

This could be interesting, especially if it allows exporting a non-visual Catalyst project (like error strings that do not have any graphical context attached) as an XLIFF file that can be translated with a “text-only” CAT tool more flexible than Catalyst and then imported back into Catalyst without problems. We’ll see if that is possible. The import-export functionalities offered by the current version 7 are sparse, to say the least.

Enhanced Total SDL/TRADOS Compatibility: Work seamlessly with past, present and future versions of SDL TRADOS desktop and enterprise technologies such as SDL TM Server, TRADOS Translator’s Workbench (3.x, 5.x, 6.x, 8.0, 2007) and MultiTerm iX Server. No other tool gives you such wide range industry support for 3rd party TM tools.

Let’s hope that the integration will be as seamless as described above. In its current incarnation, the integration between Catalyst and Trados is far from that.

Conclusion:

I think that most localization professionals do not care about fine-tuning the placement of their toolbars and windows. Instead, they require a solid product with an easy to understand workflow. In my personal experience, Catalyst 7, while offering a huge list of features as far as supported environments are concerned, left something to be desired in terms of usability.

The list of improvements included in the announcement e-mail is very long and impressive. Hopefully Catalyst 8 will be able to deliver on those promises.

Olifant Candidate Release 22 available

Olifant is a utility that can be used to maintain translation memory files. It can import (even by drag-and-drop) translation memories in the TMX, tab-delimited and WordFast formats, and it can export to TMX or WordFast.

Among other things, Olifant lets you perform the following tasks on translation memories:

  • Flag and remove duplicate entries
  • Create a single tri-lingual TM from two separate bi-lingual TMs
  • Reverse the source and target languages of a TM
  • Open a TM that contains invalid XML characters
  • Remove formatting codes (e.g. <bpt>, <ept>, etc.) from the TM segments
  • Search and replace text using regular expressions
  • Filter the entries based on various criteria
  • Partially export the TM
  • Find exact and fuzzy matches or concordances for the current TM entry
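To give an idea of what the first task in the list involves, here is a minimal Python sketch that removes duplicate translation units from a TMX file. It is not Olifant’s code: `remove_duplicate_units` is a hypothetical helper, it assumes the usual TMX layout (`tu` elements containing `tuv` elements with a `seg` each), and it treats two units as duplicates only when all their language/text pairs are identical.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under this qualified name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def segment_text(tuv: ET.Element) -> str:
    """Return the plain text of a <tuv>, ignoring inline formatting codes."""
    seg = tuv.find("seg")
    return "".join(seg.itertext()).strip() if seg is not None else ""


def remove_duplicate_units(tmx_text: str) -> str:
    """Drop every <tu> whose language/text pairs were already seen."""
    root = ET.fromstring(tmx_text)
    body = root.find("body")
    if body is None:
        return tmx_text
    seen = set()
    for tu in list(body.findall("tu")):
        key = tuple(sorted((tuv.get(XML_LANG, ""), segment_text(tuv)) for tuv in tu.findall("tuv")))
        if key in seen:
            body.remove(tu)
        else:
            seen.add(key)
    return ET.tostring(root, encoding="unicode")


SAMPLE = """<tmx version="1.4"><header/><body>
<tu><tuv xml:lang="en"><seg>Engine</seg></tuv><tuv xml:lang="it"><seg>Motore</seg></tuv></tu>
<tu><tuv xml:lang="en"><seg>Engine</seg></tuv><tuv xml:lang="it"><seg>Motore</seg></tuv></tu>
<tu><tuv xml:lang="en"><seg>Gear</seg></tuv><tuv xml:lang="it"><seg>Ingranaggio</seg></tuv></tu>
</body></tmx>"""

cleaned = remove_duplicate_units(SAMPLE)
print(len(ET.fromstring(cleaned).find("body").findall("tu")))
```

A dedicated tool like Olifant is of course more convenient, since it flags the duplicates and lets you review them before anything is deleted.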

Olifant is free software distributed under the GNU Lesser General Public License.

Olifant (Candidate)

University of Ottawa – Survey on use of terminology management systems integrated to translation environment tools

The aim of this study is to learn about the community’s perception of terminology management systems integrated with translation environment tools as well as to find out more about the approaches taken regarding their use.

via Use of Terminology Management Systems Integrated to Translation Environment Tools.