Noisy channel model

  1. Definition

  2. Example

  3. Error-correction

  4. See also

  5. References

The noisy channel model is a framework used in spell checkers, question answering, speech recognition, and machine translation. In this model, the goal is to find the intended word given a word where the letters have been scrambled in some manner.

Definition

Given an alphabet $\Sigma$, let $\Sigma^*$ be the set of all finite strings over $\Sigma$. Let the dictionary $D$ of valid words be some subset of $\Sigma^*$, i.e., $D \subseteq \Sigma^*$.

The noisy channel is the matrix

$\Gamma_{ws} = \Pr(s \mid w)$,

where $w \in D$ is the intended word and $s \in \Sigma^*$ is the scrambled word that was actually received.
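As a minimal sketch of the definition, the channel can be stored as a nested mapping from an intended word to a distribution over received strings. The words and probabilities below are invented for illustration, and the helper name `pr_received_given_intended` is hypothetical.

```python
# Toy noisy channel: channel[w][s] stands for Pr(s | w).
# All entries are made-up illustration values, not measured data.
channel = {
    "letter": {"letter": 0.90, "leter": 0.06, "lettr": 0.04},
    "later":  {"later": 0.92, "leter": 0.05, "latr": 0.03},
}

def pr_received_given_intended(s, w):
    """Look up Pr(s | w); pairs that are not stored get probability 0."""
    return channel.get(w, {}).get(s, 0.0)

def row_sums():
    """Each row is a distribution over received strings, so it should sum to 1."""
    return {w: sum(row.values()) for w, row in channel.items()}
```

A real channel has one row per dictionary word and, in principle, one column per string over the alphabet, so practical systems store only the entries with non-negligible probability.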

Example

Consider the English alphabet $\Sigma = \{a, b, c, \ldots, y, z, A, B, \ldots, Z, \ldots\}$. Some subset $D \subseteq \Sigma^*$ makes up the dictionary of valid English words.

There are several mistakes that may occur while typing, including:

  1. Missing letters, e.g., leter instead of letter
  2. Accidental letter additions, e.g., misstake instead of mistake
  3. Swapping letters, e.g., recieved instead of received
  4. Replacing letters, e.g., fimite instead of finite
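The four kinds of mistake listed above can be enumerated mechanically. The following is a rough sketch, assuming a lowercase alphabet and exactly one mistake per word; the function name `single_mistakes` and the category labels are hypothetical.

```python
import string

def single_mistakes(word, alphabet=string.ascii_lowercase):
    """Group every string one typing mistake away from `word` by mistake type."""
    # All ways to cut the word into a left part and a right part.
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    return {
        # Drop the first character of the right part.
        "missing": {a + b[1:] for a, b in splits if b},
        # Insert one extra character at the cut.
        "added": {a + c + b for a, b in splits for c in alphabet},
        # Swap the two characters after the cut (identical neighbours give no change).
        "swapped": {a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1} - {word},
        # Replace the first character of the right part with a different one.
        "replaced": {a + c + b[1:] for a, b in splits if b
                     for c in alphabet if c != b[0]},
    }
```

For instance, `single_mistakes("letter")["missing"]` contains `leter`, and `single_mistakes("received")["swapped"]` contains `recieved`.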

To construct the noisy channel matrix $\Gamma$, we must consider the probability of each mistake given the intended word, that is, $\Pr(s \mid w)$ for every intended word $w \in D$ and every received string $s \in \Sigma^*$. These probabilities may be gathered, for example, by considering the Levenshtein distance between $s$ and $w$ or by comparing the draft of an essay with one that has been manually edited for spelling.
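One possible way to turn edit distances into channel probabilities is sketched below. The Levenshtein distance is the standard dynamic-programming version; the exponential weighting, the `decay` parameter, and the restriction to a finite candidate list are assumptions made for illustration, not something the text prescribes.

```python
import math

def levenshtein(a, b):
    """Minimum number of insertions, deletions, and substitutions turning a into b."""
    prev = list(range(len(b) + 1))          # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]                          # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]

def channel_row(w, candidates, decay=2.0):
    """Hypothetical estimate of Pr(s | w): weight exp(-decay * distance), normalised."""
    weights = {s: math.exp(-decay * levenshtein(w, s)) for s in candidates}
    total = sum(weights.values())
    return {s: weight / total for s, weight in weights.items()}
```

Note that plain Levenshtein distance has no swap operation, so `recieved` is two substitutions away from `received`; a distance that treats transpositions as a single edit would score swapped letters as one mistake.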

Error-correction

The goal of the noisy channel model is to find the intended word given the scrambled word that was received. The decision function $\sigma : \Sigma^* \to D$ is a function that, given a scrambled word, returns the intended word.

Methods of constructing a decision function include the maximum likelihood rule, the maximum a posteriori rule, and the minimum distance rule.
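The three rules can be compared on a toy channel. This is a sketch under stated assumptions: the channel entries and the word prior `prior` are invented, Levenshtein distance is assumed as the distance for the minimum distance rule, and the function names are hypothetical. Maximum likelihood picks the word $w$ maximising $\Pr(s \mid w)$, maximum a posteriori maximises $\Pr(s \mid w)\Pr(w)$, and minimum distance minimises the distance between $w$ and $s$.

```python
# Toy channel[w][s] = Pr(s | w) and word prior Pr(w); all values are invented.
channel = {
    "letter": {"letter": 0.90, "leter": 0.06, "lettr": 0.04},
    "later":  {"later": 0.90, "leter": 0.08, "latr": 0.02},
}
prior = {"letter": 0.8, "later": 0.2}

def levenshtein(a, b):
    """Standard edit distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1,
                            prev[j - 1] + (ca != cb)))
        prev = curr
    return prev[-1]

def maximum_likelihood(s):
    """Pick the word w that maximises Pr(s | w)."""
    return max(channel, key=lambda w: channel[w].get(s, 0.0))

def maximum_a_posteriori(s):
    """Pick the word w that maximises Pr(s | w) * Pr(w)."""
    return max(channel, key=lambda w: channel[w].get(s, 0.0) * prior[w])

def minimum_distance(s):
    """Pick the word w closest to s in edit distance (ties go to the first word)."""
    return min(channel, key=lambda w: levenshtein(w, s))

print(maximum_likelihood("leter"))    # later
print(maximum_a_posteriori("leter"))  # letter
```

With these numbers the two probabilistic rules disagree on `leter`: the channel alone slightly favours `later`, but the prior makes `letter` the more probable intended word.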

In some cases, it may be better to accept the scrambled word as the intended word rather than attempt to find an intended word in the dictionary. For example, the word schönfinkeling may not be in the dictionary, but might in fact be the intended word.
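One simple way to allow for this is to correct a word only when some dictionary entry is sufficiently similar, and otherwise return the input unchanged. The sketch below uses the similarity ratio from Python's standard `difflib` module rather than a channel probability; the function name `correct_or_keep`, the `cutoff` value, and the tiny dictionary are assumptions for illustration.

```python
import difflib

def correct_or_keep(s, dictionary, cutoff=0.8):
    """Return the closest dictionary word, or s itself if nothing is close enough."""
    if s in dictionary:
        return s
    # get_close_matches returns at most n candidates whose similarity >= cutoff,
    # best match first; an empty list means no entry is close enough.
    matches = difflib.get_close_matches(s, dictionary, n=1, cutoff=cutoff)
    return matches[0] if matches else s

dictionary = ["letter", "mistake", "finite"]
print(correct_or_keep("leter", dictionary))           # letter
print(correct_or_keep("schönfinkeling", dictionary))  # schönfinkeling
```

The cutoff trades off missed corrections against unwanted changes to valid words that the dictionary does not contain.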

See also

  • Coding theory

References

  • Brill, Eric; Moore, Robert C. (Jan 2000). "An Improved Error Model for Noisy Channel Spelling Correction". Proceedings of ACL 2000. http://www.aclweb.org/anthology/P00-1037

