Apertium

  1. Overview

  2. History

  3. How Apertium works

  4. Language pairs

  5. See also

  6. Notes

      References  

  7. External links

  8. End-user services and software

      Online translation websites
      Offline applications
{{Infobox software
| name = Apertium
| logo = Apertium logo.svg
| logo_size = 250px
| screenshot = Apertium-tolk.png
| screenshot_size = 150px
| caption = Apertium-tolk, a simple desktop user interface for Apertium that translates as the user types
| collapsible =
| author =
| developer =
| released =
| discontinued =
| ver layout = simple
| latest release version = 3.5.1[1]
| latest release date = {{start date and age|2018|03|31}}
| latest preview version =
| latest preview date =
| programming language = C++
| operating system = POSIX compatible
| platform =
| language = 35 languages, see below
| language count =
| language footnote =
| genre = Rule-based machine translation
| license = GNU General Public License
| website = {{url|https://www.apertium.org}}
| repo = {{url|https://github.com/apertium}}
| AsOf =
}}

Apertium is a free and open-source rule-based machine translation platform, released under the terms of the GNU General Public License.

Overview

Apertium is a shallow-transfer machine translation system, which uses finite state transducers for all of its lexical transformations, and hidden Markov models for part-of-speech tagging or word category disambiguation. Constraint Grammar taggers are also used for some language pairs (e.g. Breton–French).[2]
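As a rough illustration of how a hidden Markov model can choose between competing part-of-speech readings, the sketch below runs the Viterbi algorithm over a toy lexicon. It is not Apertium's tagger: the function name `viterbi`, the tag labels and every probability are assumptions invented for this example.

```python
import math

def viterbi(words, analyses, start_p, trans_p, emit_p):
    """Return the most probable tag sequence for `words`.

    analyses: word -> list of candidate tags
    start_p:  tag -> P(tag at sentence start)
    trans_p:  (prev_tag, tag) -> P(tag | prev_tag)
    emit_p:   (tag, word) -> P(word | tag)
    """
    floor = 1e-12  # small floor so unseen events never give log(0)
    # best[i][tag] = (log-probability of the best path ending in tag, previous tag)
    best = [{}]
    for tag in analyses[words[0]]:
        score = math.log(start_p.get(tag, floor))
        score += math.log(emit_p.get((tag, words[0]), floor))
        best[0][tag] = (score, None)
    for i in range(1, len(words)):
        best.append({})
        for tag in analyses[words[i]]:
            emit = math.log(emit_p.get((tag, words[i]), floor))
            score, prev = max(
                (best[i - 1][p][0] + math.log(trans_p.get((p, tag), floor)) + emit, p)
                for p in best[i - 1]
            )
            best[i][tag] = (score, prev)
    # Backtrack from the best final tag to recover the whole path.
    tag = max(best[-1], key=lambda t: best[-1][t][0])
    path = [tag]
    for i in range(len(words) - 1, 0, -1):
        tag = best[i][tag][1]
        path.append(tag)
    return path[::-1]

# Toy model (all values are made up): "book" is ambiguous between noun and verb.
ANALYSES = {"the": ["det"], "they": ["prn"], "book": ["n", "vblex"], "flights": ["n"]}
START_P = {"det": 0.5, "prn": 0.5}
TRANS_P = {
    ("det", "n"): 0.9, ("det", "vblex"): 0.1,
    ("prn", "vblex"): 0.8, ("prn", "n"): 0.2,
    ("vblex", "n"): 0.7, ("n", "n"): 0.3,
}
EMIT_P = {
    ("det", "the"): 1.0, ("prn", "they"): 1.0,
    ("n", "book"): 0.5, ("vblex", "book"): 0.5,
    ("n", "flights"): 1.0,
}

print(viterbi(["the", "book"], ANALYSES, START_P, TRANS_P, EMIT_P))
print(viterbi(["they", "book", "flights"], ANALYSES, START_P, TRANS_P, EMIT_P))
```

With these invented numbers, "book" is tagged as a noun after a determiner and as a verb after a pronoun, which is the kind of context-dependent choice a statistical tagger makes.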

Most existing machine translation systems are commercial or use proprietary technologies, which makes them very hard to adapt to new uses. Furthermore, they use different technologies across language pairs, which makes it very difficult, for instance, to integrate them into a single multilingual content management system.

Apertium uses a language-independent specification, which makes it easier to contribute to Apertium, makes development more efficient, and supports the project's overall growth.

At present, Apertium has released 40 stable language pairs, delivering fast translation with reasonably intelligible results (errors are easily corrected). Being an open-source project, Apertium provides tools for potential developers to build their own language pair and contribute to the project.

History

Apertium originated as one of the machine translation engines in the project OpenTrad, which was funded by the Spanish government, and developed by the Transducens research group at the Universitat d'Alacant. It was originally designed to translate between closely related languages, although it has recently been expanded to treat more divergent language pairs. To create a new machine translation system, one just has to develop linguistic data (dictionaries, rules) in well-specified XML formats.

Language data developed for it (in collaboration with the Universidade de Vigo, the Universitat Politècnica de Catalunya and the Universitat Pompeu Fabra) currently support (in stable version) Arabic, Aragonese, Asturian, Basque, Breton, Bulgarian, Catalan, Danish, English, Esperanto, French, Galician, Hindi, Icelandic, Indonesian, Italian, Kazakh, Macedonian, Malaysian, Maltese, Northern Sami, Norwegian (Bokmål and Nynorsk), Occitan, Portuguese, Romanian, Sardinian, Serbo-Croatian, Slovene, Spanish, Swedish, Tatar, Urdu, and Welsh languages. A full list is available below. Several companies are also involved in the development of Apertium, including Prompsit Language Engineering, Imaxin Software and Eleka Ingeniaritza Linguistikoa.

The project has taken part in the 2009,[3] 2010,[4] 2011,[5] 2012,[6] 2013[7] and 2014[8] editions of Google Summer of Code and the 2010,[9] 2011,[10] 2012,[11] 2013,[12] 2014,[13] 2015,[14] 2016[15] and 2017[16] editions of Google Code-in.

How Apertium works

This is an overall, step-by-step view of how Apertium works.

The diagram displays the steps that Apertium takes to translate a source-language text (the text we want to translate) into a target-language text (the translated text).

  1. Source language text is passed into Apertium for translation.
  2. The deformatter sets aside formatting markup (HTML, RTF, etc.) that should be kept in place but not translated.
  3. The morphological analyser segments the text (expanding elisions, marking set phrases, etc.) and looks up the segments in the language dictionaries, returning the base form and tags for every match. In pairs that involve agglutinative morphology, including a number of Turkic languages, a Helsinki Finite-State Transducer (HFST) is used. Otherwise, an Apertium-specific technology called lttoolbox[17] is used.
  4. The morphological disambiguator resolves ambiguous segments (i.e., those with more than one match) by choosing one match; together, the morphological analyser and the morphological disambiguator form the part-of-speech tagger. Apertium is working on installing more Constraint Grammar frameworks for its language pairs, allowing the imposition of more fine-grained constraints than would otherwise be possible. Apertium uses the Visual Interactive Syntax Learning Constraint Grammar Parser.[18]
  5. Lexical transfer looks up the disambiguated source-language base words to find their target-language equivalents (i.e., it maps source language to target language). For lexical transfer, Apertium uses an XML-based dictionary format called bidix.[19]
  6. Lexical selection chooses between alternative translations when the source-text word has alternative meanings. Apertium uses a specific XML-based technology, apertium-lex-tools,[20] to perform lexical selection.
  7. Structural transfer, whose rules are written in an XML format that allows complex structural transfer rules, can consist of a one-step transfer or a three-step transfer module. It flags grammatical differences between the source language and the target language (e.g. gender or number agreement) by creating a sequence of chunks containing markers for them. It then reorders or modifies the chunks to produce a grammatical translation in the target language. This is also done using lttoolbox.
  8. The morphological generator uses the tags to deliver the correct target-language surface form. The morphological generator is a morphological transducer,[21] just like the morphological analyser; a morphological transducer can both analyse and generate forms.
  9. The post-generator makes any necessary orthographic changes due to the contact of words (e.g. elisions).
  10. The reformatter restores the formatting markup (HTML, RTF, etc.) that was set aside by the deformatter in step 2.
  11. Apertium delivers the target-language translation.
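The core of the steps above can be sketched as a chain of small functions. This is only a toy model under stated assumptions: the dictionaries, tag names and function names are all invented for illustration, plain Python dictionaries stand in for Apertium's XML data and finite-state transducers, and the deformatter, lexical selection, post-generator and reformatter are left out.

```python
# Toy linguistic data; every entry is invented for this illustration.
ANALYSES = {  # surface form -> list of (lemma, tags) readings
    "the": [("the", ("det", "def"))],
    "cats": [("cat", ("n", "pl"))],
    "house": [("house", ("n", "sg")), ("house", ("vblex", "inf"))],
}
BIDIX = {  # (source lemma, part of speech) -> (target lemma, gender or None)
    ("the", "det"): ("el", None),
    ("cat", "n"): ("gato", "m"),
    ("house", "n"): ("casa", "f"),
}
GENERATOR = {  # (target lemma, tags) -> surface form
    ("el", ("det", "def", "m", "sg")): "el",
    ("el", ("det", "def", "f", "sg")): "la",
    ("el", ("det", "def", "m", "pl")): "los",
    ("el", ("det", "def", "f", "pl")): "las",
    ("gato", ("n", "m", "sg")): "gato",
    ("gato", ("n", "m", "pl")): "gatos",
    ("casa", ("n", "f", "sg")): "casa",
    ("casa", ("n", "f", "pl")): "casas",
}

def analyse(text):
    """Morphological analysis: list every reading of each word."""
    return [ANALYSES[word] for word in text.lower().split()]

def disambiguate(ambiguous):
    """Pick one reading per word; prefer a noun right after a determiner."""
    chosen = []
    for readings in ambiguous:
        pick = readings[0]
        if chosen and chosen[-1][1][0] == "det":
            nouns = [r for r in readings if r[1][0] == "n"]
            if nouns:
                pick = nouns[0]
        chosen.append(pick)
    return chosen

def lexical_transfer(units):
    """Replace each source lemma with its target lemma, adding gender if known."""
    out = []
    for lemma, tags in units:
        target, gender = BIDIX[(lemma, tags[0])]
        if gender is not None:
            tags = (tags[0], gender) + tags[1:]
        out.append((target, tags))
    return out

def structural_transfer(units):
    """Make each determiner agree in gender and number with the noun after it."""
    out = list(units)
    for i in range(len(out) - 1):
        lemma, tags = out[i]
        next_tags = out[i + 1][1]
        if tags[0] == "det" and next_tags[0] == "n":
            out[i] = (lemma, tags + next_tags[1:])
    return out

def generate(units):
    """Morphological generation: turn lemma plus tags into surface forms."""
    return " ".join(GENERATOR[(lemma, tags)] for lemma, tags in units)

def translate(text):
    units = disambiguate(analyse(text))
    return generate(structural_transfer(lexical_transfer(units)))

print(translate("the cats"))
print(translate("the house"))
```

The point of the sketch is the division of labour: each stage only consumes the output of the previous one, so the linguistic data can change without touching the engine.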

Language pairs

List of currently stable language pairs.

See also

  • Babel Fish (discontinued; redirects to main Yahoo! site)
  • Comparison of machine translation applications
  • Jollo (discontinued)
  • List of natural language processing toolkits
  • Matxin
  • Microsoft Translator
  • Moses
  • OpenLogos
  • SYSTRAN
  • Yandex.Translate

Notes

1. ^ https://github.com/apertium/apertium/releases
2. ^ Francis M. Tyers (2010) "Rule-based Breton to French machine translation". Proceedings of the 14th Annual Conference of the European Association of Machine Translation, EAMT10, pp. 174–181.
3. ^ "Accepted organizations for Google Summer of Code 2009". https://www.google-melange.com/gsoc/org/list/public/google/gsoc2009
4. ^ "Accepted organizations for Google Summer of Code 2010". https://www.google-melange.com/gsoc/org/list/public/google/gsoc2010
5. ^ "Accepted organizations for Google Summer of Code 2011". https://www.google-melange.com/gsoc/org/list/public/google/gsoc2011
6. ^ "Accepted organizations for Google Summer of Code 2012". https://www.google-melange.com/gsoc/org/list/public/google/gsoc2012
7. ^ "Accepted organizations for Google Summer of Code 2013". https://www.google-melange.com/gsoc/org/list/public/google/gsoc2013
8. ^ "Accepted organizations for Google Summer of Code 2014". https://www.google-melange.com/gsoc/org/list/public/google/gsoc2014
9. ^ "Accepted organizations for Google Code-in 2010". https://www.google-melange.com/archive/gci/2010
10. ^ "Accepted organizations for Google Code-in 2011". https://www.google-melange.com/archive/gci/2011
11. ^ "Accepted organizations for Google Code-in 2012". https://www.google-melange.com/archive/gci/2012
12. ^ "Accepted organizations for Google Code-in 2013". https://www.google-melange.com/archive/gci/2013
13. ^ "Accepted organizations for Google Code-in 2014". https://www.google-melange.com/archive/gci/2014
14. ^ "Accepted organizations for Google Code-in 2015". https://codein.withgoogle.com/archive/2015/organization/
15. ^ "Accepted organizations for Google Code-in 2016". https://codein.withgoogle.com/organizations/
16. ^ "Accepted organizations for Google Code-in 2017". https://codein.withgoogle.com/organizations/
17. ^ "Lttoolbox - Apertium". wiki.apertium.org. http://wiki.apertium.org/wiki/Lttoolbox. Retrieved 2016-01-19.
18. ^ "VISL". beta.visl.sdu.dk. http://beta.visl.sdu.dk/visl/vislcg-doc.html. Retrieved 2016-01-19.
19. ^ "Bilingual dictionary - Apertium". wiki.apertium.org. http://wiki.apertium.org/wiki/Bidix. Retrieved 2016-01-19.
20. ^ "Constraint-based lexical selection module - Apertium". wiki.apertium.org. http://wiki.apertium.org/wiki/Constraint-based_lexical_selection_module. Retrieved 2016-01-19.
21. ^ "Morphological dictionary - Apertium". wiki.apertium.org. http://wiki.apertium.org/wiki/Morphological_dictionary. Retrieved 2016-01-19.

References

  • Corbí-Bellot, M. et al. (2005) "An open-source shallow-transfer machine translation engine for the romance languages of Spain" in Proceedings of the European Association for Machine Translation, 10th Annual Conference, Budapest 2005, pp. 79–86
  • Armentano-Oller, C. et al. (2006) "Open-source Portuguese-Spanish machine translation" in Lecture Notes in Computer Science 3960 [Computational Processing of the Portuguese Language, Proceedings of the 7th International Workshop on Computational Processing of Written and Spoken Portuguese, PROPOR 2006], pp. 50–59.
  • Forcada, M. L. et al. (2010) "Documentation of the Open-Source Shallow-Transfer Machine Translation Platform Apertium" in Departament de Llenguatges i Sistemes Informatics, University of Alacant.
  • Forcada, M. L. et al. (2011) "Apertium: a free/open-source platform for rule-based machine translation". doi:10.1007/s10590-011-9090-0

External links

  • Apertium home
  • Apertium Wiki
  • OpenTrad
  • Apertium on SourceForge

End-user services and software

(All services are based on the Apertium engine)

Online translation websites

  • Apertium Translation home (https://www.apertium.org)
  • Prompsit Translator
  • PoliTraductor Translator (https://politraductor.upv.es/)
  • Universitat d'Alacant Translator
  • Universitat Oberta de Catalunya Translator

Offline applications

  • Apertium Caffeine
  • Apertium Android
  • Apertium OmegaT
