ATA SLD

Slavic Languages Division (American Translators Association)

American Translators Association: The Voice of Interpreters and Translators

Upcoming November ATA webinars

November 7, 2021

The ATA is offering another round of webinars over the next few weeks that should prove to be fun and informative!

Introduction to Mobile App Localization

There’s an app for everything these days, and there’s now an ATA webinar on mobile app localization, too! 🙂 
Presenter Dorota Pawlak will introduce mobile app localization and the role of translators working in this field. Drawing on years of experience, the speaker will explain what skills, tools, and qualities are needed to localize mobile apps, what the most common issues in mobile app localization projects are, and how to solve them.

Join us on November 9 or sign up to get the recording and 1 ATA CEP
https://www.atanet.org/event/introduction-to-mobile-app-localization/

Registration closes: November 9, 10:00 am EST

memoQ for Intermediate and Advanced Users

Join this webinar to take your knowledge of memoQ to the next level: in this session, you will learn useful tips and tricks that can make your work as a translator a lot easier.
You will also learn how to search your preferred websites directly from the translation grid and how to connect to a machine translation provider to be able to use MT in your work. The trainer will also show you how to automate your processes using templates in memoQ and how to fine-tune the import of your documents with the help of powerful import filters.
Register at https://www.atanet.org/event/memoq-for-intermediate-and-advanced-users/
Remember that ATA members can save 35% on new licenses for memoQ translator pro. 

Join us on November 12 at 12 pm EST (recording will be available) / 1 ATA CEP
Registration Closes: November 12, 10:00 am EST

Intermediate Tips and Tricks for Trados Studio


This hands-on webinar will explore useful features that will take you a step closer to becoming a power user of the most powerful and popular CAT tool on the market.
This webinar was organized in collaboration with RWS.

You will learn how to:

  1. Identify and modify file type options
  2. Work with a translation memory’s language resources
  3. Use apps to extend Trados Studio’s functionality
  4. Use machine translation for pre-translation and interactive translation
  5. Set up verification options

Register at https://www.atanet.org/event/intermediate-tips-and-tricks-for-trados-studio/

Remember that ATA members can save 35% on Trados Studio 2021 Freelance and Trados Studio 2021 Freelance Plus.

Join us on November 17 at 12 pm EST (recording will be available) / 1 ATA CEP

Registration Closes: November 17, 10:00 am EST

Filed Under: Professional Development, Webinars Tagged With: CAT tools, localization, professional development, webinar

A [Better] CAT Breed for the Slavic Soul

July 12, 2017

A review by Jennifer Guernsey

Aha! I said to myself upon spying this presentation among the 2013 ATA Conference’s offerings. At last, I will find out which elusive CAT tool actually does a good job with Slavic languages! I had tried several tools, but hadn’t yet run across one that was able to accommodate the peculiarities of my language, Russian, particularly when it came to all of the inflected forms.

Alas, it took no more than two slides for me to be sorely disappointed – not in Konstantin Lakshin’s presentation, but in the sad news that there is, in fact, no such thing as a good CAT tool for Slavic languages. Or, at least, there isn’t yet.

Despite my initial dismay at the news, I fortunately stayed to hear the entire presentation. It can be briefly summarized as follows: A combination of technical, linguistic, and particularly market forces has conspired to make CAT tools what they are today: decidedly Slavic-unfriendly. The good news is that many of the pieces needed to improve them already exist, and it’s up to us to put pressure on developers and companies to make use of those pieces.

The reason it took the better part of an hour to provide this information is that the presentation included a lot of very interesting history, examples, and details. It really was quite educational, at least for me.

Kostya started by outlining the history of computer use in translation, and the development of CATs in particular. He began with a discussion of a 1966 government-funded report by the Automatic Language Processing Advisory Committee on the use of computer technology in translation. The gist of this report as it applies to our CAT tool discussion is that machine translation doesn’t work well, but that something vaguely resembling what we now consider a CAT tool, with a similar workflow, might be useful. This pseudo-CAT workflow used the punch card operator – i.e., a human being – as a morphology analyzer. This is interesting, because one of our principal complaints about today’s CAT tools is that they do not have morphology analysis capability. The report also compared use of this early form of CAT with a standard translation process, and found that while it might save some time, its primary advantage was that it “relieve[d] the translator of the unproductive and tiresome search for the correct technical terms.” The report emphasized that compiling the proper termbase was really the key to an effective translation tool.

In the decade or so following the report, the emphasis in computer-assisted translation was thus on building termbanks. In other words, the focus was on words and phrases – small subsegments, if you will – and these termbanks were generally compiled for specific large organizations operating in specific contexts and were not readily transferable to other entities.

The philosophy that drives current CAT tools – the “recycling” of previously translated texts – emerged fully only in 1979, though large corporations had begun exploring this starting in the late 1960s. This philosophy was in great part a result of the requirements and technologies in place at the time. In the 1960s, for instance, the world was a less integrated place, and there was limited control over the input side – the source text content, editing, and so on. The example Kostya provided was scientific texts coming out of the USSR that were being translated. Fast-forward to the 1980s and 1990s: large corporations have end-to-end control of processes and utilize translation (and translation technology) for their own documents. In this latter context, being able to retrieve and reuse entire sentences made a lot of sense. Note also that in the prevailing markets in which the early CAT tools developed, the primary languages were not highly inflected.

In the late 1980s and early 1990s, the first commercially available CAT tools appeared: IBM Translation Manager II, XL8, Eurolang, and two still-familiar tools, Trados and Star Transit. Trados, in particular, started life as a language services provider trying to get an IBM contract.

The mid- to late 1990s saw the emergence of tools being created ostensibly for translators: Déjà Vu, memoQ, and Wordfast. However, rather than being fundamentally different from their larger predecessors, these often turned out to be essentially smaller, less functional versions of Trados. This era also witnessed the development of smaller commercial players, such as WordFisher (a set of Word macros) and in-house tools such as Lionbridge, Foreign Desk, and Rainbow (specifically for software localization), as well as OmegaT, the first open-source CAT tool.

That brings us to the present day, the 2000s, when there are too many CAT tools to list, and there have been many mergers and acquisitions among them. However, NONE of the existing tools can be considered very useful for Slavic or other highly inflected languages. In addition to the reasons noted above, there were other issues that contributed to this situation as the software was being developed. First, there were no obvious ways to incorporate Cyrillic into early software. Second, there were additional market forces, such as software piracy, the cross-border digital divide, and the lack of major clients, that provided little incentive to software developers to make CAT tools that would be particularly useful in Slavic-language markets.

Today, we have a much wider playing field in terms of the market for translation. Translation work is “messier” now, and involves things like corporate rebranding and renaming, a variety of dialects and non-native speech, outsourcing, rewrites for search engine optimization, and bidirectional editing in which both source and target documents are being modified. In this environment, the old “termbase plus recycled text” CAT model is not sufficient.

From this historical background, Kostya next proceeded to illustrate just what the difficulties are that Slavic languages present for today’s CAT tools. These can be boiled down to their relatively free word order, their rich morphology, and their highly inflected nature. The CAT tool’s “fuzzy match” capabilities are insufficient for Slavic languages.

Kostya then provided a number of illustrative examples. Consider the following pairs of segments:

To open the font menu, press CTRL+1.

Press CTRL+1 to open the font menu.

Analyzing and characterizing behaviors

Analysing and characterising behaviours

He ran these and other examples through about a half-dozen CAT tools using a 50% match cutoff, and found that the first example was considered only a 60-80% match, and the second was 0% (in other words, below the 50% threshold). The CAT tools on the market generally do not recognize partial segments in a different order, nor can they tell that “analyzing” and “analysing” are essentially the same word. In other words, they lack language-specific subsegment handling, and morphology-aware matching, searching, and term management. They are also missing form agreement awareness (e.g., noun/adjective case agreement). This diminishes their utility for those translating out of Slavic languages, to be sure, but it also complicates matters for those translating into Slavic languages, as word endings in retrieved fuzzy matches must constantly be checked and corrected.
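The word-order problem can be illustrated with a small Python sketch. It is not how any particular CAT tool computes fuzzy matches; the function names, the tokenizer, and the choice of a token-set (Jaccard) overlap as the order-insensitive measure are all assumptions made for illustration. It compares an order-sensitive score with an order-insensitive one on the two segment pairs above.

```python
import re
from difflib import SequenceMatcher


def tokenize(segment):
    """Lowercase a segment and split it into tokens, keeping '+' so CTRL+1 stays whole."""
    return re.findall(r"[\w+]+", segment.lower())


def order_sensitive_score(a, b):
    """Similarity of the two token sequences; chunks that moved lower the score."""
    return SequenceMatcher(None, tokenize(a), tokenize(b)).ratio()


def order_free_score(a, b):
    """Jaccard overlap of the two token sets; word order is ignored entirely."""
    set_a, set_b = set(tokenize(a)), set(tokenize(b))
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


reordered = ("To open the font menu, press CTRL+1.",
             "Press CTRL+1 to open the font menu.")
respelled = ("Analyzing and characterizing behaviors",
             "Analysing and characterising behaviours")

# Print both scores for each pair: order-sensitive first, order-free second.
for a, b in (reordered, respelled):
    print(f"{order_sensitive_score(a, b):.2f}  {order_free_score(a, b):.2f}  {a!r} / {b!r}")
```

In this sketch the reordered pair gets a perfect order-free score, because both segments contain exactly the same tokens, while the sequence-based score is penalized for the moved clause. The respelled pair scores poorly on both measures, since neither one looks inside the words. That is the gap a morphology-aware matcher would need to close.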

The obvious question that Kostya next asked is, can this situation be fixed? In theory, yes. Kostya believes that many software tools already in use by search engines, machine translation, and the like could be integrated into CAT tools. These include Levenshtein distance analyzers that can handle differences within words; computational linguistics tools such as taggers, parsers, chunkers, tokenizers, stemmers, and lemmatizers, which analyze such things as syntax and word construction; morphology modules; and even Hunspell, the engine already in use by numerous CAT tools for spellchecking but not for analyzing matches.
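As a rough illustration of what a Levenshtein-based component could add, the Python sketch below treats two words as the same token when their edit distance is small relative to their length. The function names, the 0.8 threshold, and the scoring formula are assumptions chosen for this example, not something taken from the presentation or from any existing CAT tool.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, 1):
        current = [i]
        for j, char_b in enumerate(b, 1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def word_similarity(a, b):
    """Similarity in [0, 1]: 1.0 for identical words, lower as more edits are needed."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - levenshtein(a, b) / longest


def tolerant_score(seg_a, seg_b, threshold=0.8):
    """Share of tokens in seg_a with a close-enough counterpart anywhere in seg_b."""
    tokens_a, tokens_b = seg_a.lower().split(), seg_b.lower().split()
    if not tokens_a or not tokens_b:
        return 0.0
    matched = sum(
        1 for ta in tokens_a
        if max(word_similarity(ta, tb) for tb in tokens_b) >= threshold
    )
    return matched / max(len(tokens_a), len(tokens_b))


american = "Analyzing and characterizing behaviors"
british = "Analysing and characterising behaviours"

print(tolerant_score(american, british, threshold=1.0))  # exact token matches only
print(tolerant_score(american, british))                 # near-identical tokens allowed
```

With exact matching only, "and" is the single shared token. With the tolerant threshold, every word finds a counterpart one edit away. Edit distance on its own is a blunt instrument for inflected languages, since unrelated short words can also be one edit apart, which is presumably why the stemmers, lemmatizers, and morphology modules are on the list as well.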

Developers continue to cite obstacles to integrating these tools: it’s complicated, they are too language-specific, we don’t know how to set up the interface, there are licensing issues, we have limited resources. While all of these are legitimate factors, Kostya believes that they do not present insurmountable obstacles. He is hopeful that developers will start seeing these tools as data abstraction tools that enable the software to break down the data into something that is no longer language-specific.

So what can we do about this lack of suitable CAT tools? Kostya’s recommendation is principally that we talk to software developers and vendors and explain what we want. We need to create our own market pressure to move things along. In addition, we need to educate developers and vendors about the existing tools that are available; for instance, we might point them to non-English search engines that utilize morphology analyzers.

Alas, there is neither a good CAT tool for the Slavic soul nor a quick fix to this situation. But after listening to Kostya’s presentation, I have a much better understanding of how this situation developed and how we might take action to prompt vendors and developers to move in a new direction.

Filed Under: Annual Conferences, Tools, Translation Tagged With: CAT tools

Copyright © 2023 · ATA Slavic Languages Division