Entry | Information Retrieval Facility |
Definition |
The Information Retrieval Facility (IRF), founded in 2006 and located in Vienna, Austria, was a research platform for networking and collaboration among professionals in the field of information retrieval. It ceased operations in 2012. The IRF had members in the following categories:
The Scientific Board
Maristella Agosti, Professor, Department of Information Engineering, University of Padova
Gerhard Budin, Director of the Center of Translation Studies at the University of Vienna, Director of the Department of Corpuslinguistics and Text Technology, Austrian Academy of Sciences
Jamie Callan, Professor, Language Technologies Institute, CMU, Carnegie Mellon University
Yves Chiaramella, Professor Emeritus, Department of Computer Science and Applied Mathematics, Joseph Fourier University
Kilnam Chon, Professor, Computer Science Department, Korea Advanced Institute of Science and Technology (KAIST)
W. Bruce Croft, Distinguished Professor, Department of Computer Science, and Director, Center for Intelligent IR, University of Massachusetts Amherst
Hamish Cunningham, Research Professor, Computer Science Department, University of Sheffield
Norbert Fuhr, Chairman of the Scientific Board, Professor, Institute of Informatics and Interactive Systems, University of Duisburg-Essen
David Hawking, Science Leader, Project Leader, CSIRO ICT Centre
Noriko Kando, Professor, Software Engineering Research, Software Research Division, National Institute of Informatics (NII)
Arcot Desai Narasimhalu, Associate Dean, School of Information Systems, Singapore Management University
John Tait, Chief Scientific Officer of the IRF, until July 2007 Professor of Intelligent Information Systems and Associate Dean of the School of Computing and Technology
Benjamin T'sou, Director, Language Information Sciences Research Centre, City University of Hong Kong
C.J. van Rijsbergen, Dept. of Computer Science at the University of Glasgow

Scientific goals
Semantic supercomputing
Technologies for extracting concepts from unstructured documents are extremely computationally intensive. To allow interactive experimentation with large, rich text corpora, the IRF built a high-performance computing (HPC) environment, into which the latest technological advances were implemented:
The combination of these HPC features to accelerate text mining represented the IRF implementation of semantic supercomputing.

The World Patent Corpus
The IRF aimed to bring state-of-the-art information retrieval technology to the community of patent information professionals. It expected information retrieval (IR) technology to become the focus of information technology very soon. All industry sectors can profit from applying modern and future text mining processes to the special requirements of patent research. Although the ideas and concepts are applicable to all sorts of intellectual property information, patents require the most sophistication and pose challenging technical and organisational problems. The entire body of patent-related documents possibly constitutes the largest corpus of compound documents, making it a rewarding target for text mining scientists and end-users alike. Moreover, patents have become a crucial issue, in particular for large global corporations and universities. The industrial users of patent data are among the most demanding and important information professionals. As a consequence, they could benefit the most from technology that relieves the burden of researching the large body of patent information.

Research collections
The IRF provided a number of test data collections developed by the IRF itself, by one of its members, or by third parties. These data collections could be used freely for scientific experimentation. The MAtrixware REsearch Collection (MAREC) is the first standardised patent data corpus for research purposes. It consists of 19 million patent documents in different languages, normalised to a highly specific XML format. The collection was developed by Matrixware for the IRF. The ClueWeb09 collection is a 25-terabyte dataset of about 1 billion web pages crawled in January and February 2009. It was created by the Language Technologies Institute at Carnegie Mellon University to support research on information retrieval and related human language technologies.