The gText global project provides internal and contractual translators at the four duty stations of the Department for General Assembly and Conference Management, regional commissions and other United Nations entities with a complete and uniform suite of Internet-based language tools, as well as seamless access to background information necessary for high-quality translation.
The eLUNa suite of language tools, which was designed to meet the unique needs of United Nations language professionals, is entirely web based and is continuously improved based on users’ feedback and requests for new functions. The suite of tools comprises a translation and revision interface, an editorial interface and a search application, as well as a series of additional supporting applications for document and terminology management and a machine-translation tool developed by the World Intellectual Property Organization in collaboration with the Department for General Assembly and Conference Management.
eLUNa: the United Nations computer-assisted translation interface
At the core of the tools developed by gText is eLUNa (electronic languages of the United Nations), a user-friendly web-based translation tool developed in-house for United Nations translators. It combines automatic identification of all previously translated sentences and terminology with access to machine translation for all new sentences.
eLUNa reuses previously translated text, automatically recognizes specialized terminology stored in the UNTERM portal, provides references to all United Nations documents in bilingual format through hyperlinks and preserves formatting. As a web-based tool, it can be used by translators working remotely, making it possible for contractual staff to also benefit from its time-saving and consistency-enhancing features.
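The reuse of previously translated text described above can be illustrated with a minimal sketch. It is not the eLUNa implementation: the in-memory dictionary, the function name `lookup`, the similarity threshold and the sample sentences are all assumptions made for illustration. A sentence found verbatim in the memory is returned as an exact match, a near-identical sentence as a fuzzy match, and anything else is flagged as new, which is the point at which a tool of this kind would turn to machine translation.

```python
from difflib import SequenceMatcher

# Hypothetical translation memory: source sentence -> stored translation.
TRANSLATION_MEMORY = {
    "The General Assembly adopted the resolution.":
        "L'Assemblée générale a adopté la résolution.",
    "The meeting was called to order.":
        "La séance est ouverte.",
}


def lookup(sentence, memory, threshold=0.85):
    """Return (kind, suggestion) for one source sentence.

    kind is "exact", "fuzzy" or "new"; suggestion is None when kind is "new".
    The threshold is an illustrative value, not a documented eLUNa setting.
    """
    # Exact reuse: the sentence has been translated before, word for word.
    if sentence in memory:
        return "exact", memory[sentence]

    # Fuzzy reuse: find the most similar stored sentence.
    best_score, best_target = 0.0, None
    for source, target in memory.items():
        score = SequenceMatcher(None, sentence, source).ratio()
        if score > best_score:
            best_score, best_target = score, target
    if best_score >= threshold:
        return "fuzzy", best_target

    # Nothing reusable: a new sentence, a candidate for machine translation.
    return "new", None
```

Calling `lookup("The meeting was called to order.", TRANSLATION_MEMORY)` returns the stored translation as an exact match, while a sentence with no close counterpart in the memory comes back as `("new", None)`.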
eLUNa Editorial: interface for editors
eLUNa Editorial is an editing programme that has been custom-built for editors at the United Nations. Unedited documents are uploaded to the programme and editors are then able to access a range of tools for editing tasks.
Edits made to documents in the editorial interface are automatically transferred as tracked changes to the translation interface, making it easier for the editing and translation of United Nations documents to proceed in parallel.
eLUNa VRS: interface for verbatim reporting
eLUNa VRS is a prototype of a drafting interface developed for United Nations verbatim reporters. It incorporates the main features of the eLUNa translation and editorial interfaces and adapts them to the specific needs of drafting verbatim records. Like the rest of the eLUNa suite of language tools, the eLUNa VRS interface provides detection of terminology, symbols and reprise, and full-text search.
eLUNa Search: multilingual search engine
eLUNa Search is a web-based search engine designed to retrieve text from the eLUNa collection of documents translated at the main duty stations and regional commissions of the United Nations system. Search results are presented in monolingual, bilingual or trilingual format as a list of segments in the language of the search and their corresponding translations in the target language or languages selected.
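The way results are presented, as segments in the search language paired with their translations, can be sketched as follows. This is an illustration only: the `Segment` class, the `search` function, the document identifiers and the sample sentences are hypothetical and do not reflect how eLUNa Search stores or indexes its collection. Passing no target language gives a monolingual result, one gives a bilingual result and two give a trilingual result.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One aligned segment: the same sentence in several languages."""
    document: str   # hypothetical document identifier
    texts: dict     # language code -> text of the segment


# Made-up sample data standing in for a document collection.
SEGMENTS = [
    Segment("DOC/1", {
        "en": "Climate change affects all regions.",
        "fr": "Les changements climatiques touchent toutes les régions.",
        "es": "El cambio climático afecta a todas las regiones.",
    }),
    Segment("DOC/2", {
        "en": "The Committee adopted its agenda.",
        "fr": "Le Comité a adopté son ordre du jour.",
    }),
]


def search(query, search_lang, target_langs, segments):
    """Return matching segments with the translations requested.

    Each result is a dict keyed by language code; a target language
    missing from a segment is reported as None.
    """
    needle = query.casefold()
    results = []
    for segment in segments:
        text = segment.texts.get(search_lang, "")
        if needle in text.casefold():
            row = {search_lang: text}
            for lang in target_langs:
                row[lang] = segment.texts.get(lang)
            results.append(row)
    return results
```

For instance, `search("agenda", "en", ["fr", "es"], SEGMENTS)` returns one trilingual row in which the Spanish entry is `None`, because the sample segment has no Spanish text.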
eLUNa Converter: Word-to-AKN4UN tool to create machine-readable documents
eLUNa Converter automatically converts General Assembly resolutions from Microsoft Word format into the AKN4UN format in one click. The converter identifies the main elements of the resolution (such as operative and preambular paragraphs, session number, agenda items and adoption date) and labels these elements to produce a structurally marked-up, machine-readable document in AKN4UN format. It also retrieves additional information that is not present in the document itself (such as sponsorship information, voting records and related documents). The converted resolutions and all of the metadata can then be used to create the official books of resolutions.
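The idea of identifying elements and labelling them can be sketched in a few lines. This is a simplified illustration, not the converter's logic, and the element names used (`resolution`, `preamble`, `operative`, `paragraph`) are placeholders rather than the actual AKN4UN schema. The sketch assumes that operative paragraphs begin with a number followed by a full stop and that every other paragraph is preambular; the sample sentences are invented.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: an operative paragraph starts with "<number>. ".
OPERATIVE = re.compile(r"^(\d+)\.\s+(.*)$", re.DOTALL)


def mark_up(paragraphs):
    """Label plain-text paragraphs and return an XML element tree.

    Element and attribute names are illustrative placeholders,
    not taken from the AKN4UN specification.
    """
    root = ET.Element("resolution")
    preamble = ET.SubElement(root, "preamble")
    operative = ET.SubElement(root, "operative")
    for paragraph in paragraphs:
        text = paragraph.strip()
        match = OPERATIVE.match(text)
        if match:
            # Numbered paragraph: keep the number as an attribute.
            element = ET.SubElement(operative, "paragraph", num=match.group(1))
            element.text = match.group(2)
        else:
            # Unnumbered paragraph: treated as preambular.
            element = ET.SubElement(preamble, "paragraph")
            element.text = text
    return root


sample = [
    "Recalling its previous resolutions on the subject,",
    "1. Takes note of the report;",
    "2. Decides to remain seized of the matter.",
]
tree = mark_up(sample)
xml_text = ET.tostring(tree, encoding="unicode")
```

Once the paragraphs carry explicit labels, a program can select all operative paragraphs, for example with `tree.find("operative")`, without having to interpret the visual layout of the original Word document.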
Making our United Nations documents machine-readable will open a wide range of possibilities. Readers will not only be able to search and access documents based on specific data (such as agenda item, mandate or related documents) but also be able to see the relationships between documents and have those connections presented visually, in graphs and charts that make them clear and understandable.
To learn more about this project, visit the machine-readability web page, which provides access to the resolutions adopted by the General Assembly at the main part of its seventy-fourth session in machine-readable format in the six official languages and to a proof-of-concept interactive report that displays the data contained in the resolutions through a series of graphics and visualizations.
UNTERM: United Nations terminology system
UNTERM provides terminology and nomenclature in subjects relevant to the work of the United Nations in the six official languages of the United Nations, as well as in German and Portuguese.
The UNTERM portal features hundreds of thousands of terms from the four main duty stations, regional commissions and the United Nations Educational, Scientific and Cultural Organization, including official country names, phraseology data sets, and a collection of geographical and proper names. The portal also functions as a terminology management system that enables collaborative terminology creation by users through a feedback and queue mechanism.
The UNTERM portal for United Nations terminology can be accessed round the clock by translators, other United Nations staff members, delegates and individual users from around the globe.
Tapta4UN: United Nations statistical machine translation system
Tapta4UN is a statistical machine translation tool developed in collaboration with the World Intellectual Property Organization, specifically trained with United Nations documents to provide an output that is consistent with United Nations style and terminology. First available as a stand-alone tool, it has now been embedded into the eLUNa environment, both as a default option and on a segment-by-segment basis, and is available for use in the six official languages.
United Nations Parallel Corpus
The United Nations Parallel Corpus is a collection of official records and other parliamentary documents of the United Nations that are in the public domain and available, for the most part, in the six official languages.
The Corpus was created as part of the United Nations commitment to multilingualism and in response to the growing importance of statistical machine translation, and of the United Nations statistical machine translation system, Tapta4UN, within the translation services of the Department for General Assembly and Conference Management.
The purpose of the Corpus is to allow access to multilingual language resources and facilitate research and progress in various natural language processing tasks, including machine translation. The Corpus is also available pre-packaged as language-specific bi-texts and as a six-language parallel subset.
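A bi-text pairs each sentence in one language with its translation in another. As a rough sketch of how such data might be consumed, the example below assumes a bi-text stored as two line-aligned plain-text files, where line n of one file translates line n of the other. That layout, the function name `read_bitext` and the sample sentences are assumptions for illustration; the actual packaging of the Corpus may differ.

```python
import io


def read_bitext(source_file, target_file):
    """Pair up the lines of two line-aligned files.

    Returns a list of (source, target) tuples and raises ValueError
    if the two files do not contain the same number of lines.
    """
    source_lines = [line.rstrip("\n") for line in source_file]
    target_lines = [line.rstrip("\n") for line in target_file]
    if len(source_lines) != len(target_lines):
        raise ValueError("files are not line-aligned")
    return list(zip(source_lines, target_lines))


# In-memory stand-ins for two files on disk (made-up sample sentences).
english = io.StringIO("The session is open.\nThe report was adopted.\n")
french = io.StringIO("La session est ouverte.\nLe rapport est adopté.\n")

pairs = read_bitext(english, french)
```

Sentence pairs of this kind are the usual training input for machine translation systems, which is one of the research uses mentioned above.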
The Corpus has been tested with neural machine translation, a more recent approach to machine translation based on neural networks, and is expected to be valuable for further research in this promising field.