
DNA, Words and Models
Statistics of Exceptional Words
Cambridge University Press
Publication date: approx. 13 October 2005
Book
Hardback
158 pages
978-0-521-84729-2 (ISBN)
Description
An important problem in computational biology is identifying short DNA sequences (mathematically, 'words') associated with a biological function. One approach consists of determining whether the occurrences of a particular word are consistent with chance or are statistically significant, for example, because of the word's frequency or location. This book introduces the mathematical and statistical ideas used in solving this so-called exceptional word problem. It begins with a detailed description of the principal models used in sequence analysis: Markovian models are central here and capture compositional information on the sequence being analysed. There follows an introduction to several statistical methods that are used for finding exceptional words with respect to the model used. The second half of the book is illustrated with numerous examples drawn from the analysis of bacterial genomes, making this a practical guide for users facing a real situation and needing to choose an appropriate procedure.
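As a rough illustration of the idea described above (not taken from the book), the following Python sketch compares the observed count of a word with a plug-in estimate of its expected count under a first-order Markov model fitted to the same sequence. The function names `count_overlapping` and `expected_count_m1` are hypothetical, the toy sequence is made up, and the sketch stops short of any significance assessment (variance, p-values), which is the harder part of the problem.

```python
from collections import Counter


def count_overlapping(seq, word):
    """Count occurrences of `word` in `seq`, allowing overlaps."""
    k = len(word)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == word)


def expected_count_m1(seq, word):
    """Plug-in estimate of the expected count of `word` under a
    first-order Markov model estimated from `seq` itself.

    Assumption (illustrative, not the book's stated method):
    E[N(w)] ~ prod N(w_i w_{i+1}) / prod N(w_i over interior letters),
    using observed dinucleotide and letter counts.
    """
    if len(word) < 2:
        raise ValueError("word must have at least two letters")
    letters = Counter(seq)
    pairs = Counter(seq[i:i + 2] for i in range(len(seq) - 1))

    numerator = 1.0
    for i in range(len(word) - 1):
        numerator *= pairs[word[i:i + 2]]

    denominator = 1.0
    for letter in word[1:-1]:  # interior letters only
        denominator *= letters[letter]

    return numerator / denominator if denominator else 0.0


if __name__ == "__main__":
    toy_sequence = "ACGTACGTTTACGACGGT"  # made-up data for illustration
    word = "ACG"
    observed = count_overlapping(toy_sequence, word)
    expected = expected_count_m1(toy_sequence, word)
    print(f"{word}: observed={observed}, expected under M1={expected:.2f}")
```

A ratio of observed to expected far from 1 only hints that a word may be exceptional; deciding whether the deviation is significant requires the distributional results that the book's later chapters address.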
Reviews / Votes
'For statisticians with a little background in biology, this book delivers a very readable presentation on the analysis of DNA sequences to determine whether a motif is of statistical significance due to its overabundance (or under-abundance) in terms of frequencies or location. This book is concise but sufficiently detailed. Biologists without a background in mathematical statistics may find the learning curve a little steep but tractable. The authors' continuous use of practical examples will be greatly appreciated by biologists and statisticians alike. This book is one of a kind, and I recommend it to any statistician interested in learning about DNA sequences and motifs.' Journal of the American Statistical Association
Language
English
Place of publication
Cambridge
United Kingdom
Target group
Professional and scholarly
Product notice
sewn/stitched
Cloth over boards
Illustrations
Worked examples or Exercises; 14 Tables, unspecified; 7 Halftones, unspecified; 21 Line drawings, unspecified
Dimensions
Height: 238 mm
Width: 159 mm
Thickness: 14 mm
Weight
389 g
ISBN-13
978-0-521-84729-2 (9780521847292)
Copyright in bibliographic data and cover images is held by Nielsen Book Services Limited or by the publishers or by their respective licensors: all rights reserved.
Persons
Author
Institut National de la Recherche Agronomique (INRA), Paris
Institut National de la Recherche Agronomique (INRA), Paris
Institut National de la Recherche Agronomique (INRA), Paris
Content
Introduction; 1. Simple models for biological sequences; 2. Introduction to Markov chain models; 3. Taking heterogeneities into account; 4. Statistical properties of word occurrences; 5. Words with unexpected frequencies; 6. Words with unexpected locations.