Apertium

	Apertium-tolk, a simple desktop user interface for Apertium that translates as the user types
Stable release	3.8.3 / 1 November 2022
Repository	github.com/apertium
Written in	C++
Operating system	POSIX compatible and Windows NT (limited support)
Available in	35 languages, see below
Type	Rule-based machine translation
License	GNU General Public License
Website	www.apertium.org


Open-source rule-based machine translation platform

Apertium is a free/open-source rule-based machine translation platform, released under the terms of the GNU General Public License.


Translation methodology

Pipeline of Apertium machine translation system

This is an overall, step-by-step view of how Apertium works.

The diagram displays the steps that Apertium takes to translate a source-language text (the text we want to translate) into a target-language text (the translated text).

Source language text is passed into Apertium for translation.
The deformatter separates out formatting markup (HTML, RTF, etc.), which must be kept in place but not translated.
The morphological analyser segments the text (expanding elisions, marking set phrases, etc.) and looks up the segments in the language dictionaries, returning dictionary forms and tags for all matches. In pairs that involve agglutinative morphology, including a number of Turkic languages, a Helsinki Finite State Transducer (HFST) is used. Otherwise, an Apertium-specific finite-state transducer system called lttoolbox^[19] is used.
The morphological disambiguator resolves ambiguous segments (i.e., those with more than one match) by choosing one match. Together, the morphological analyser and the morphological disambiguator form the part-of-speech tagger. Apertium uses Constraint Grammar rules (with the vislcg3 parser^[20]) for most of its language pairs.
Retokenisation uses a finite-state transducer to match sequences of lexical units and may reorder or translate tags. It is often used to turn idiomatic expressions into something closer to the target-language grammar.
Lexical transfer looks up disambiguated source-language basewords to find their target-language equivalents (i.e., mapping source language to target language). For lexical transfer, Apertium uses an XML-based dictionary format called bidix.^[21]
Lexical selection chooses between alternative translations when the source text word has alternative meanings. Apertium uses a specific XML-based technology, apertium-lex-tools,^[22] to perform lexical selection.
Structural transfer, whose rules are written in an XML format that can express complex structural changes, can consist of one-step chunking transfer, three-step chunking transfer or a CFG-based transfer module. The chunking modules flag grammatical differences between the source language and the target language (e.g. gender or number agreement) by creating a sequence of chunks containing markers for them. They then reorder or modify the chunks to produce a grammatical translation in the target language. The newer CFG-based module parses input sequences into possible parse trees, selects the best-ranking one and applies transformation rules to that tree.
The morphological generator uses the tags to deliver the correct target language surface form. The morphological generator is a morphological transducer,^[23] just like the morphological analyser. A morphological transducer both analyses and generates forms.
The post-generator makes any necessary orthographic changes due to the contact of words (e.g. elisions).
The reformatter replaces formatting markup (HTML, RTF, etc.) that was removed by the deformatter in the first step.
Apertium delivers the target-language translation.
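
The steps above can be illustrated with a toy pipeline in Python. This is a minimal sketch and not Apertium's implementation: the dictionaries, the tag names, the "prefer a verb after a noun" disambiguation rule and every function name are invented for illustration, and retokenisation, lexical selection, structural transfer and post-generation are left out. It translates a marked-up English fragment into Esperanto while passing the markup through untouched.

```python
import re

# Toy data invented for this sketch; real Apertium dictionaries are XML files.
ANALYSES = {  # surface form -> candidate (lemma, tags) analyses
    "the": [("the", ("det",))],
    "cats": [("cat", ("n", "pl"))],
    "sleep": [("sleep", ("n", "sg")), ("sleep", ("vblex", "pres"))],
}
BIDIX = {  # (source lemma, part of speech) -> target lemma
    ("the", "det"): "la",
    ("cat", "n"): "kato",
    ("sleep", "n"): "dormo",
    ("sleep", "vblex"): "dormi",
}
GENERATOR = {  # (target lemma, tags) -> target surface form
    ("la", ("det",)): "la",
    ("kato", ("n", "pl")): "katoj",
    ("dormo", ("n", "sg")): "dormo",
    ("dormi", ("vblex", "pres")): "dormas",
}

# Markup is tried first, then words, then everything else (blanks, punctuation).
TOKEN = re.compile(r"(?P<markup><[^>]+>)|(?P<word>\w+)|(?P<other>[^\w<]+|<)")


def deformat(text):
    """Split text into (kind, value) tokens so markup can be set aside."""
    return [(m.lastgroup, m.group()) for m in TOKEN.finditer(text)]


def disambiguate(candidates, previous):
    """Choose one analysis; toy rule: after a noun, prefer a verb reading."""
    if previous is not None and previous[1][0] == "n":
        for analysis in candidates:
            if analysis[1][0] == "vblex":
                return analysis
    return candidates[0]


def translate(text):
    output, previous = [], None
    for kind, value in deformat(text):
        if kind != "word":
            output.append(value)  # markup and blanks are restored unchanged
            continue
        candidates = ANALYSES.get(value.lower())  # morphological analysis
        if candidates is None:
            output.append("*" + value)  # this sketch stars unknown words
            previous = None
            continue
        previous = disambiguate(candidates, previous)
        lemma, tags = previous
        target_lemma = BIDIX[(lemma, tags[0])]  # lexical transfer
        output.append(GENERATOR[(target_lemma, tags)])  # generation
    return "".join(output)


print(translate("<b>the cats</b> sleep"))
```

Each stage here is a plain dictionary lookup, which keeps the data flow between stages visible; the actual system uses finite-state transducers and rule files for the same roles.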

Language pairs

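The table below lists the currently stable language pairs. Rows and columns follow the same language order, with the columns labelled by language code. An arrow gives the direction of translation relative to the row language: → from the row language to the column language, ← from the column language to the row language, and ⇄ in both directions.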


	`af`	`ar`	`an`	`ast`	`eu`	`br`	`bg`	`ca`	`da`	`nl`	`en`	`eo`	`fi`	`fr`	`gl`	`de`	`hin`	`is`	`id`	`it`	`kaz`	`mk`	`ms`	`mt`	`sme`	`nb`	`nn`	`oc`	`pt`	`ro`	`sc`	`hbs`	`slv`	`es`	`sv`	`tat`	`urd`	`cy`
Afrikaans	—	No	No	No	No	No	No	No	No	Yes (⇄)	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No
Arabic	No	—	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	Yes (←)	No	No	No	No	No	No	No	No	No	No	No	No	No	No
Aragonese	No	No	—	No	No	No	No	Yes (⇄)	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	Yes (⇄)	No	No	No	No
Asturian	No	No	No	—	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	Yes (⇄)	No	No	No	No
Basque	No	No	No	No	—	No	No	No	No	No	Yes (→)	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	Yes (→)	No	No	No	No
Breton	No	No	No	No	No	—	No	No	No	No	No	No	No	Yes (→)	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No
Bulgarian	No	No	No	No	No	No	—	No	No	No	No	No	No	No	No	No	No	No	No	No	No	Yes (⇄)	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No
Catalan	No	No	Yes (⇄)	No	No	No	No	—	No	No	Yes (⇄)	Yes (→)	No	Yes (⇄)	No	No	No	No	No	Yes (←)	No	No	No	No	No	No	No	Yes (⇄)	Yes (⇄)	No	Yes (→)	No	No	Yes (⇄)	No	No	No	No
Danish	No	No	No	No	No	No	No	No	—	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	Yes (⇄)	Yes (⇄)	No	No	No	No	No	No	No	Yes (←)	No	No	No
Dutch	Yes (⇄)	No	No	No	No	No	No	No	No	—	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No
English	No	No	No	No	Yes (←)	No	No	Yes (⇄)	No	No	—	Yes (⇄)	No	No	Yes (⇄)	No	No	Yes (←)	No	No	No	Yes (←)	No	No	No	No	No	No	No	No	No	Yes (←)	No	Yes (⇄)	No	No	No	Yes (←)
Esperanto	No	No	No	No	No	No	No	Yes (←)	No	No	Yes (⇄)	—	No	Yes (←)	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	Yes (←)	No	No	No	No
Finnish	No	No	No	No	No	No	No	No	No	No	No	No	—	No	No	Yes (⇄)	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No
French	No	No	No	No	No	Yes (←)	No	Yes (⇄)	No	No	No	Yes (→)	No	—	No	No	No	No	No	No	No	No	No	No	No	No	No	Yes (→)	No	No	No	No	No	Yes (⇄)	No	No	No	No
Galician	No	No	No	No	No	No	No	No	No	No	Yes (⇄)	No	No	No	—	No	No	No	No	No	No	No	No	No	No	No	No	No	Yes (⇄)	No	No	No	No	Yes (⇄)	No	No	No	No
German	No	No	No	No	No	No	No	No	No	No	No	No	Yes (⇄)	No	No	—	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No
Hindi	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	—	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	Yes (⇄)	No
Icelandic	No	No	No	No	No	No	No	No	No	No	Yes (→)	No	No	No	No	No	No	—	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	Yes (⇄)	No	No	No
Indonesian	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	—	No	No	No	Yes (⇄)	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No
Italian	No	No	No	No	No	No	No	Yes (→)	No	No	No	No	No	No	No	No	No	No	No	—	No	No	No	No	No	No	No	No	No	No	Yes (⇄)	No	No	No	No	No	No	No
Kazakh	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	—	No	No	No	No	No	No	No	No	No	No	No	No	No	No	Yes (⇄)	No	No
Macedonian	No	No	No	No	No	No	Yes (⇄)	No	No	No	Yes (→)	No	No	No	No	No	No	No	No	No	No	—	No	No	No	No	No	No	No	No	No	Yes (←)	No	No	No	No	No	No
Malaysian	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	Yes (⇄)	No	No	No	—	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No
Maltese	No	Yes (→)	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	—	No	No	No	No	No	No	No	No	No	No	No	No	No	No
Northern Sami	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	—	Yes (→)	No	No	No	No	No	No	No	No	No	No	No	No
Norwegian (Bokmål)	No	No	No	No	No	No	No	No	Yes (⇄)	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	Yes (←)	—	Yes (⇄)	No	No	No	No	No	No	No	No	No	No	No
Norwegian (Nynorsk)	No	No	No	No	No	No	No	No	Yes (⇄)	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	Yes (⇄)	—	No	No	No	No	No	No	No	No	No	No	No
Occitan	No	No	No	No	No	No	No	Yes (⇄)	No	No	No	No	No	Yes (←)	No	No	No	No	No	No	No	No	No	No	No	No	No	—	No	No	No	No	No	Yes (⇄)	No	No	No	No
Portuguese	No	No	No	No	No	No	No	Yes (⇄)	No	No	No	No	No	No	Yes (⇄)	No	No	No	No	No	No	No	No	No	No	No	No	No	—	No	No	No	No	Yes (⇄)	No	No	No	No
Romanian	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	—	No	No	No	Yes (→)	No	No	No	No
Sardinian	No	No	No	No	No	No	No	Yes (←)	No	No	No	No	No	No	No	No	No	No	No	Yes (⇄)	No	No	No	No	No	No	No	No	No	No	—	No	No	No	No	No	No	No
Serbo-Croatian	No	No	No	No	No	No	No	No	No	No	Yes (→)	No	No	No	No	No	No	No	No	No	No	Yes (→)	No	No	No	No	No	No	No	No	No	—	Yes (⇄)	No	No	No	No	No
Slovenian	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	Yes (⇄)	—	No	No	No	No	No
Spanish	No	No	Yes (⇄)	Yes (⇄)	Yes (←)	No	No	Yes (⇄)	No	No	Yes (⇄)	Yes (→)	No	Yes (⇄)	Yes (⇄)	No	No	No	No	No	No	No	No	No	No	No	No	Yes (⇄)	Yes (⇄)	Yes (←)	No	No	No	—	No	No	No	No
Swedish	No	No	No	No	No	No	No	No	Yes (→)	No	No	No	No	No	No	No	No	Yes (⇄)	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	—	No	No	No
Tatar	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	Yes (⇄)	No	No	No	No	No	No	No	No	No	No	No	No	No	No	—	No	No
Urdu	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	Yes (⇄)	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	—	No
Welsh	No	No	No	No	No	No	No	No	No	No	Yes (→)	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	—
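
As a reading aid, the arrow notation in the table can be turned into explicit translation directions. The sketch below is hypothetical helper code, not part of Apertium; it reads → as "row language to column language", ← as the reverse and ⇄ as both, which is the interpretation consistent with the reciprocal entries in the table (for example, Basque → English in the Basque row and ← Basque in the English row).

```python
def directed_pairs(row_code, cells):
    """Return the (source, target) pairs encoded by one table row.

    row_code: language code of the row, e.g. "eu"
    cells: mapping from column language code to cell text, e.g. "Yes (→)"
    """
    pairs = set()
    for col_code, cell in cells.items():
        if "⇄" in cell or "→" in cell:
            pairs.add((row_code, col_code))  # row language is the source
        if "⇄" in cell or "←" in cell:
            pairs.add((col_code, row_code))  # column language is the source
    return pairs


# Excerpts from the Basque and Danish rows of the table above.
print(sorted(directed_pairs("eu", {"en": "Yes (→)", "es": "Yes (→)", "fr": "No"})))
print(sorted(directed_pairs("da", {"nb": "Yes (⇄)", "sv": "Yes (←)"})))
```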


This article uses material from the Wikipedia article Apertium, and is written by contributors. Text is available under a CC BY-SA 4.0 International License; additional terms may apply. Images, videos and audio are available under their respective licenses.

References

[1] "Release 3.8.3 Latest". 1 November 2022. Retrieved 2 March 2023.
[2] Tyers, Francis M. (2010). "Rule-based Breton to French machine translation". Proceedings of the 14th Annual Conference of the European Association of Machine Translation (EAMT10), pp. 174–181. Archived 2016-11-17 at the Wayback Machine.
[3] Khanna, Tanmai; Washington, Jonathan N.; Tyers, Francis M.; Bayatlı, Sevilay; Swanson, Daniel G.; Pirinen, Tommi A.; Tang, Irene; Alòs i Font, Hèctor (1 December 2021). "Recent advances in Apertium, a free/open-source rule-based machine translation platform for low-resource languages". Machine Translation. 35 (4): 475–502. doi:10.1007/s10590-021-09260-6. hdl:10037/22990.
[4] "Apertium".
[5] "Accepted organizations for Google Summer of Code 2009".
[6] "Accepted organizations for Google Summer of Code 2010".
[7] "Accepted organizations for Google Summer of Code 2011".
[8] "Accepted organizations for Google Summer of Code 2012".
[9] "Accepted organizations for Google Summer of Code 2013".
[10] "Accepted organizations for Google Summer of Code 2014".
[11] "Accepted organizations for Google Code-in 2010".
[12] "Accepted organizations for Google Code-in 2011".
[13] "Accepted organizations for Google Code In 2012".
[14] "Accepted organizations for Google Code-in 2013".
[15] "Accepted organizations for Google Code-in 2014".
[16] "Accepted organizations for Google Code-in 2015".
[17] "Accepted organizations for Google Code-in 2016".
[18] "Accepted organizations for Google Code-in 2017".
[19] "Lttoolbox - Apertium". wiki.apertium.org. Retrieved 2016-01-19.
[20] "VISL". beta.visl.sdu.dk. Retrieved 2016-01-19.
[21] "Bilingual dictionary - Apertium". wiki.apertium.org. Retrieved 2016-01-19.
[22] "Constraint-based lexical selection module - Apertium". wiki.apertium.org. Retrieved 2016-01-19.
[23] "Morphological dictionary - Apertium". wiki.apertium.org. Retrieved 2016-01-19.
