Entry: Automatic content extraction
Definition

  1. Goals and efforts

  2. Topics and exercises

  3. References

  4. External links

Automatic content extraction (ACE) is a research program for developing advanced information extraction technologies, convened by the National Institute of Standards and Technology (NIST) from 1999 to 2008. It succeeded the Message Understanding Conference (MUC) and preceded the Text Analysis Conference (https://www.nist.gov/tac/).

Goals and efforts

In its general objective, the ACE program is motivated by and addresses the same issues as the MUC program that preceded it. The ACE program, however, defines its research objectives in terms of the target objects (i.e., the entities, the relations, and the events) rather than in terms of the words in the text. For example, the "named entity" task, as defined in MUC, is to identify those words on the page that are names of entities. In ACE, by contrast, the corresponding task is to identify the entity so named. This is a different, more abstract task that relies more explicitly on inference to produce an answer: the system must link names, descriptions, and pronouns to an underlying entity that the text never presents as a single item. In a real sense, the task is to detect things that "aren't there".
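The difference between the two task definitions can be sketched in a few lines of Python. This is only an illustration: the sentence, the `Mention` and `Entity` classes, and the `find_mention` helper are hypothetical and do not reproduce any official MUC or ACE annotation format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Mention:
    """A span of text that refers to something (end offset is exclusive)."""
    start: int
    end: int
    text: str


@dataclass
class Entity:
    """The thing referred to; it groups every mention of itself."""
    entity_id: str
    mentions: List[Mention] = field(default_factory=list)


def find_mention(text: str, phrase: str) -> Mention:
    """Locate the first occurrence of phrase and wrap it as a Mention."""
    start = text.index(phrase)
    return Mention(start, start + len(phrase), phrase)


# Hypothetical example sentence, chosen only for illustration.
text = "Alice Smith runs Acme. She founded the company in 1999."

# Word-oriented output (MUC-style): only the words that are names.
name_mentions = [find_mention(text, "Alice Smith"), find_mention(text, "Acme")]

# Object-oriented output (ACE-style): one record per entity, covering names,
# pronouns, and descriptions that refer to it.
person = Entity("E1", [find_mention(text, "Alice Smith"), find_mention(text, "She")])
company = Entity("E2", [find_mention(text, "Acme"), find_mention(text, "the company")])
```

In this sketch the word-oriented output stops at two name spans, while the entity-oriented output must also decide that "She" and "the company" refer to the same two entities, which is the inference step described above.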

While the ACE program is directed toward extraction of information from audio and image sources in addition to pure text, the research effort is restricted to information extraction from text. The actual transduction of audio and image data into text is not part of the ACE research effort, although the processing of automatic speech recognition (ASR) and optical character recognition (OCR) output from such transducers is.

The effort involves:

  • defining the research tasks in detail,
  • collecting and annotating data needed for training, development, and evaluation,
  • supporting the research with evaluation tools and research workshops.

Topics and exercises

Given a text in natural language, the ACE challenge is to detect:

  1. entities mentioned in the text, such as: persons, organizations, locations, facilities, weapons, vehicles, and geo-political entities.
  2. relations between entities, such as: person A is the manager of company B. Relation types include: role, part, located, near, and social.
  3. events mentioned in the text, such as: interaction, movement, transfer, creation and destruction.
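The three kinds of target objects can be pictured as simple records. The following Python sketch is an assumption-laden illustration, not the program's actual annotation schema: the class names, field names, and the `validate` helper are hypothetical, the type inventories are copied from the list above, and mapping the "manager of" example to the `role` type is an assumption.

```python
from dataclasses import dataclass
from typing import List

# Type inventories taken from the list above (labels are illustrative strings).
ENTITY_TYPES = {"person", "organization", "location", "facility",
                "weapon", "vehicle", "geo-political entity"}
RELATION_TYPES = {"role", "part", "located", "near", "social"}
EVENT_TYPES = {"interaction", "movement", "transfer", "creation", "destruction"}


@dataclass
class Entity:
    entity_id: str
    entity_type: str
    name: str


@dataclass
class Relation:
    relation_type: str
    arg1: str  # entity_id of the first argument
    arg2: str  # entity_id of the second argument


@dataclass
class Event:
    event_type: str
    participants: List[str]  # entity_ids taking part in the event


def validate(entities: List[Entity], relations: List[Relation],
             events: List[Event]) -> List[str]:
    """Return a list of problems: unknown types or references to missing entities."""
    problems = []
    known_ids = {e.entity_id for e in entities}
    for e in entities:
        if e.entity_type not in ENTITY_TYPES:
            problems.append(f"unknown entity type: {e.entity_type}")
    for r in relations:
        if r.relation_type not in RELATION_TYPES:
            problems.append(f"unknown relation type: {r.relation_type}")
        for arg in (r.arg1, r.arg2):
            if arg not in known_ids:
                problems.append(f"relation refers to unknown entity: {arg}")
    for ev in events:
        if ev.event_type not in EVENT_TYPES:
            problems.append(f"unknown event type: {ev.event_type}")
        for p in ev.participants:
            if p not in known_ids:
                problems.append(f"event refers to unknown entity: {p}")
    return problems


# "Person A is the manager of company B", expressed as two entities and one relation.
entities = [Entity("A", "person", "person A"),
            Entity("B", "organization", "company B")]
relations = [Relation("role", "A", "B")]
events = [Event("transfer", ["A", "B"])]

problems = validate(entities, relations, events)
```

Relations and events refer to entities by identifier rather than by text span, which mirrors the program's focus on target objects instead of words.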

The program covers English, Arabic, and Chinese texts.

The ACE corpus is one of the standard benchmarks for testing new information extraction algorithms.

References

  • George Doddington (NIST), Alexis Mitchell (LDC), Mark Przybocki (NIST), Lance Ramshaw (BBN), Stephanie Strassel (LDC), Ralph Weischedel (BBN). The automatic content extraction (ACE) program: tasks, data, and evaluation. 2004.

External links

  • MUC - ACE's predecessor.
  • ACE (LDC)
  • ACE (NIST): https://web.archive.org/web/20060308054306/http://www.itl.nist.gov/iad/894.01/tests/ace/
