Fiocruz Genome Comparison Project

The Fiocruz Genome Comparison Project is a collaborative effort between Brazil's Oswaldo Cruz Institute and IBM's World Community Grid, designed to produce a database comparing the genes from many genomes with each other using SSEARCH.^[1] The SSEARCH program performs a rigorous Smith–Waterman alignment between a protein sequence and another protein sequence, a protein database, a DNA sequence, or a DNA library.

The nature of the computation allows the project to take advantage of distributed computing easily. This, along with the likely humanitarian benefits of the research, has led the World Community Grid (a distributed computing grid that uses idle computer time) to run the Fiocruz project. All products are in the public domain, by contract with WCG.
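As an illustration of why this kind of workload distributes well, the following minimal Python sketch splits an all-against-all comparison into independent batches. The function name `pairwise_work_units`, the batch size, and the sequence IDs are hypothetical, and the choice to compare each unordered pair once is an assumption; the source does not describe how World Community Grid actually packages its work units.

```python
from itertools import combinations, islice


def pairwise_work_units(sequence_ids, unit_size):
    """Yield lists of (a, b) ID pairs; each list is one independent work unit.

    Hypothetical helper: assumes every unordered pair is compared exactly once.
    """
    pairs = combinations(sequence_ids, 2)  # lazily enumerates each unordered pair once
    while True:
        unit = list(islice(pairs, unit_size))  # take the next batch of pairs
        if not unit:
            return
        yield unit


# Made-up sequence IDs, for illustration only.
ids = ["P1", "P2", "P3", "P4"]
units = list(pairwise_work_units(ids, unit_size=4))
for number, unit in enumerate(units):
    print(number, unit)
```

No batch depends on the result of another, so each one could in principle be handed to a different volunteer machine and the results merged afterwards.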

Description

A very large body of information (structural, functional, cross-references, etc.) is attached to protein database entries, but once entered it is rarely updated or corrected. The annotation of predicted protein function is often incomplete, uses non-standard nomenclature, or is incorrect because it was cross-referenced from earlier sequences that were themselves sometimes incorrectly annotated. Additionally, many proteins composed of several structural and/or functional domains are overlooked by automated systems. The amount of comparative information available today is huge compared with the early days of genomics, so a single error can propagate and become harder to untangle as it is carried into further annotations.

The Genome Comparison Project performs a complete pairwise comparison between all predicted protein sequences, producing indices that, together with the standardized Gene Ontology,^[2] serve as a reference repository for the annotator community. The project thus provides a valuable data source for biologists. The sequence similarity comparison program used in the Genome Comparison Project is called SSEARCH. This program mathematically finds the best local alignment between sequence pairs,^[3] and is a freely available implementation of the Smith–Waterman algorithm.^[4]
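To make the idea of a best local alignment concrete, here is a minimal Python sketch of the Smith–Waterman scoring recurrence. It is not SSEARCH's implementation: the function name `smith_waterman_score` and the match, mismatch, and linear gap values are assumptions chosen for illustration, whereas real protein comparisons typically use a substitution matrix and affine gap penalties. The sketch returns only the best score, not the alignment itself.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between strings a and b.

    Illustrative scoring only: fixed match/mismatch values and a linear gap cost.
    """
    prev = [0] * (len(b) + 1)  # previous row of the dynamic-programming matrix
    best = 0
    for ch_a in a:
        curr = [0]  # first column is always 0
        for j, ch_b in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ch_a == ch_b else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            cell = max(0, diag, up, left)  # 0 lets a new local alignment start here
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


# The shared substring "ACG" gives three matches, so the score is 3 * 2 = 6.
print(smith_waterman_score("TTACGTT", "GGACGGG"))
```

Clamping each cell at zero is what makes the alignment local: a poorly matching region cannot drag down the score of a well-matching region elsewhere in the two sequences.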

Using SSEARCH makes it possible to annotate precisely, to correct inconsistencies, and to assign possible functions to hypothetical proteins of unknown function. Moreover, proteins with multiple domains and functional elements are correctly identified, and even distant relationships are detected.

Notes

1. ^SSEARCH webpage
2. ^The Gene Ontology website
3. ^W. R. Pearson (1991) Genomics 11:635–650
4. ^T. F. Smith and M. S. Waterman (1981) J. Mol. Biol. 147:195–197

External links

Genome Comparison Project
World Community Grid Project
