Creating glossaries for MT and AI: why less is sometimes more
Anyone who uses machine translation (MT) or AI systems for translation knows the problem: the system translates "Bolzen" as "bolt" in one place and as "stud" in another, seemingly depending on the day of the week. The result is a lot of post-editing, which quickly eats up the cost savings of AI. Glossaries are a tried-and-tested tool for minimising these errors. In our projects, integrating a glossary has reduced terminology corrections by more than half. It is tempting to conclude that the more terminology you specify, the greater the benefit, but that is not true. An overcrowded glossary can even lead to more errors: the sheer volume overwhelms the system and, in the end, enforcing individual terms comes at the expense of the translation as a whole. As is so often the case, the familiar principle applies to glossaries too: less is sometimes more.