2 - Recomposing DNA  pp. 13-41

Recomposing DNA

By Jerome K. Percus


Fingerprint Assembly

We now leave the world of estimates and enter that of exact results, but for model situations. The chain to be analyzed is imagined to be present as a set of cloned subchains with substantial but unknown overlap. In this section, we characterize a member of a clone by an ordered set of restriction fragments, or just a set of restriction-fragment lengths, or a set of lengths of special restriction fragments (e.g., those with certain characteristic repeats); whichever is used is called the fingerprint of the clone. In many cases we have a library of randomly chosen pieces of DNA, or of a section of DNA, each with a known fingerprint. Can we order these to produce a physical map of the full sequence?

To examine the degree to which this is feasible (Lander and Waterman, 1988), let us first expand our notation. G will denote the genome length (in base pairs, bp), L the clone length, N the number of distinct clones available, p = N/G the probability of a clone's starting at a given site, and c = LN/G = Lp the redundancy, that is, the number of times the genome is covered by the aggregate length of all clones. In addition, and crucially, we let T be the number of base pairs two clones must have in common to declare reliably that they overlap, θ = T/L the overlap threshold ratio, and σ = 1 − θ = (L − T)/L. Multiple overlap, in which a clone overlaps more than one other clone, is allowed.
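The notation can be made concrete with a short Python sketch. It only evaluates the definitions given above; the class name `MappingParameters`, the helper functions, and the numerical values are illustrative assumptions and do not come from the text. It also assumes that every clone has the same length L and ignores the ends of the genome. Under that assumption two clones starting at sites a and b share L − |a − b| base pairs (when this is positive), so they are declared to overlap exactly when |a − b| ≤ L − T = σL.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MappingParameters:
    """Basic quantities of the clone-overlap model (names are illustrative)."""
    G: int  # genome length, in base pairs
    L: int  # clone length, in base pairs (assumed equal for all clones)
    N: int  # number of distinct clones
    T: int  # base pairs two clones must share to be declared overlapping

    @property
    def p(self) -> float:
        """Probability that a clone starts at a given site, p = N/G."""
        return self.N / self.G

    @property
    def c(self) -> float:
        """Redundancy (coverage), c = LN/G = Lp."""
        return self.L * self.N / self.G

    @property
    def theta(self) -> float:
        """Overlap threshold ratio, theta = T/L."""
        return self.T / self.L

    @property
    def sigma(self) -> float:
        """sigma = 1 - theta = (L - T)/L."""
        return (self.L - self.T) / self.L


def shared_length(start_a: int, start_b: int, L: int) -> int:
    """Base pairs shared by two clones of length L starting at the given sites."""
    return max(0, L - abs(start_a - start_b))


def detectably_overlap(start_a: int, start_b: int, params: MappingParameters) -> bool:
    """True if the two clones share at least T base pairs."""
    return shared_length(start_a, start_b, params.L) >= params.T


# Hypothetical values, chosen only to make the arithmetic easy to follow.
params = MappingParameters(G=1_000_000, L=10_000, N=500, T=2_000)
print(params.p, params.c, params.theta, params.sigma)
print(detectably_overlap(0, 8_000, params))  # starts differ by exactly sigma * L
print(detectably_overlap(0, 8_001, params))  # one base pair too far apart
```

With these assumed values the clones cover the genome five times over (c = 5), and two clones are recognized as overlapping only if their starting sites lie within σL = 8,000 bp of each other.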