Entry: Sulston score
Definition

  1. The overlap problem in mapping

      Mathematical scores in overlap assessment  

  2. Sulston score exposition

  3. Mathematical refinement

  4. References

  5. See also

The Sulston score is an equation used in DNA mapping to numerically assess the likelihood that a given "fingerprint" similarity between two DNA clones is merely a result of chance. Used as such, it is a test of statistical significance. That is, low values imply that similarity is significant, suggesting that two DNA clones overlap one another and that the given similarity is not just a chance event. The name is an eponym that refers to John Sulston by virtue of his being the lead author of the paper that first proposed the equation's use.[1]

The overlap problem in mapping

Each clone in a DNA mapping project has a "fingerprint", i.e. a set of DNA fragment lengths inferred from (1) enzymatically digesting the clone, (2) separating these fragments on a gel, and (3) estimating their lengths based on gel location. For each pairwise clone comparison, one can establish how many lengths from each set match up. Cases having at least one match indicate that the clones might overlap, because matches may represent the same DNA. However, the underlying sequences for each match are not known. Consequently, two fragments whose lengths match may still represent different sequences. In other words, matches do not conclusively indicate overlaps. The problem is instead one of using matches to probabilistically classify overlap status.
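The pairwise comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name count_matches, the tolerance parameter (a measurement error window within which two lengths are declared equal), and the one-to-one greedy pairing rule are all hypothetical choices made here, not the convention of any particular mapping program.

```python
def count_matches(clone_a, clone_b, tolerance):
    """Count one-to-one pairings of fragment lengths that agree within +/- tolerance.

    clone_a, clone_b: iterables of measured fragment lengths (same units).
    tolerance: assumed measurement tolerance; lengths closer than this "match".
    """
    a = sorted(clone_a)
    b = sorted(clone_b)
    i = j = matches = 0
    # Walk both sorted lists in step; each fragment is used in at most one match.
    while i < len(a) and j < len(b):
        diff = a[i] - b[j]
        if abs(diff) <= tolerance:
            matches += 1
            i += 1
            j += 1
        elif diff < 0:
            i += 1  # a[i] is too short to match b[j] or anything after it
        else:
            j += 1  # b[j] is too short to match a[i] or anything after it
    return matches


# Illustrative (made-up) fingerprints
clone_a = [1200, 1850, 2400, 3100, 4750]
clone_b = [1195, 2410, 3500, 4745]
shared = count_matches(clone_a, clone_b, tolerance=10)
```

The resulting count says nothing by itself about whether the clones overlap; it is the input to a probabilistic assessment of the kind discussed in the following sections.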

Mathematical scores in overlap assessment

Biologists have used a variety of means (often in combination) to discern clone overlaps in DNA mapping projects. While many are biological, i.e. looking for shared markers, others are basically mathematical, usually adopting probabilistic and/or statistical approaches.

Sulston score exposition

The Sulston score is rooted in the concepts of Bernoulli and binomial processes, as follows. Consider two clones, α and β, having m and n measured fragment lengths, respectively, where m ≥ n. That is, clone α has at least as many fragments as clone β, but usually more. The Sulston score is the probability that at least h fragment lengths on clone β will be matched by any combination of lengths on α. Intuitively, we see that, at most, there can be n matches. Thus, for a given comparison between two clones, one can measure the statistical significance of a match of h fragments, i.e. how likely it is that this match occurred simply as a result of random chance. Very low values would indicate a significant match that is highly unlikely to have arisen by pure chance, while higher values would suggest that the given match could be just a coincidence.
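The text above does not write the equation out. In its commonly cited binomial form, each length on the smaller clone is treated as an independent Bernoulli trial with match probability p = 1 − (1 − 2t/G)^m, and the score is the upper binomial tail S = Σ C(n, j) p^j (1 − p)^(n − j), summed over j from h to n. Here m and n are the fragment counts of the larger and smaller clone and h is the number of observed matches. The symbols t (measurement tolerance) and G (gel length), the function name, and the sample numbers below are assumptions introduced for illustration. A minimal sketch:

```python
from math import comb


def sulston_score(m, n, h, tolerance, gel_length):
    """Upper binomial tail: probability of at least h chance matches.

    m, n: fragment counts of the larger and smaller clone (m >= n).
    h: number of observed matches (0 <= h <= n).
    tolerance, gel_length: assumed symbols t and G, in the same units.
    """
    if not (0 <= h <= n <= m):
        raise ValueError("expected 0 <= h <= n <= m")
    if not (0 < 2 * tolerance < gel_length):
        raise ValueError("expected 0 < 2 * tolerance < gel_length")
    # Chance that a given length on the smaller clone is matched by
    # at least one of the m lengths on the larger clone.
    p = 1.0 - (1.0 - 2.0 * tolerance / gel_length) ** m
    # Sum the binomial probabilities of exactly j matches for j = h..n.
    return sum(comb(n, j) * p**j * (1.0 - p) ** (n - j) for j in range(h, n + 1))


# Illustrative (made-up) comparison: 30 and 25 fragments, 12 matches
score = sulston_score(m=30, n=25, h=12, tolerance=7, gel_length=3300)
```

A smaller score means the observed number of matches is less likely to be coincidental. The independent-trials assumption built into this form is the one questioned in the refinement discussed next.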

Mathematical refinement

In a 2005 paper,[2] Michael Wendl gave an example showing that the assumption of independent trials is not valid. So, although the traditional Sulston score does indeed represent a probability distribution, it is not actually the distribution characteristic of the fingerprint problem. Wendl went on to give the general solution for this problem in terms of the Bell polynomials, showing that the traditional score overpredicts P-values by orders of magnitude. (P-values are very small in this problem: for example, probabilities on the order of 10×10⁻¹⁴ versus 10×10⁻¹², the latter Sulston value being 2 orders of magnitude too high.) This solution provides a basis for determining when a problem has sufficient information content to be treated by the probabilistic approach and is also a general solution to the birthday problem of 2 types.

A disadvantage of the exact solution is that its evaluation is computationally intensive and, in fact, is not feasible for comparing large clones.[2] Some fast approximations for this problem have been proposed.[3]

References

1. Sulston J, Mallett F, Staden R, Durbin R, Horsnell T, Coulson A (Mar 1988). "Software for genome mapping by fingerprinting techniques". Comput Appl Biosci. 4 (1): 125–32. doi:10.1093/bioinformatics/4.1.125. PMID 2838135.
2. Wendl MC (Apr 2005). "Probabilistic assessment of clone overlaps in DNA fingerprint mapping via a priori models". J. Comput. Biol. 12 (3): 283–97. doi:10.1089/cmb.2005.12.283. PMID 15857243.
3. Wendl MC (2007). "Algebraic correction methods for computational assessment of clone overlaps in DNA fingerprint mapping". BMC Bioinformatics. 8: 127. doi:10.1186/1471-2105-8-127. PMC 1868038. PMID 17442113.

See also

  • FPC: a widely used fingerprint mapping program that utilizes the Sulston score

Categories: Bioinformatics | Mathematical and theoretical biology
