Statistical analysis of large DNA sequences using distribution of DNA words

Chaudhuri, Probal; Das, Sandip (2001) Statistical analysis of large DNA sequences using distribution of DNA words. Current Science, 80 (9). pp. 1161-1166. ISSN 0011-3891

Official URL: http://cs-test.ias.ac.in/cs/Downloads/article_3405...

Abstract

Conventional sequence alignment techniques, designed for comparing and analysing relatively small DNA sequences of nearly equal size, are not applicable to data consisting of large sequences of widely varying sizes. In this article, DNA sequences are analysed based on the distributions of DNA words. DNA word frequencies are simple yet effective statistical tools for capturing information about structural patterns, and they can reveal biologically significant features in DNA sequences. Our analysis demonstrates how such simple statistical summaries of large DNA data make it possible to detect the structural signature of a genome and to identify phylogenetic relationships among different species, as reflected in the variation of word distributions in their DNA sequences.
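
To make the idea of a DNA word distribution concrete, the following is a minimal sketch and not the authors' procedure. The abstract does not state the word length, the counting scheme, or the measure used to compare distributions, so everything here is an assumption made for illustration: words are taken to be overlapping substrings of a fixed length k over the alphabet A, C, G, T, windows containing any other character are skipped, and two distributions are compared by plain Euclidean distance. The names word_distribution and distribution_distance are hypothetical.

```python
from collections import Counter
from itertools import product
import math

ALPHABET = "ACGT"


def word_distribution(sequence, k=3):
    """Relative frequencies of overlapping length-k words (assumed scheme)."""
    seq = sequence.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        # Skip windows with ambiguous or non-nucleotide characters (e.g. N).
        if all(base in ALPHABET for base in word):
            counts[word] += 1
    total = sum(counts.values())
    # Report every possible word so that distributions from different
    # sequences share the same support, whatever the sequence lengths.
    words = ["".join(p) for p in product(ALPHABET, repeat=k)]
    if total == 0:
        return {w: 0.0 for w in words}
    return {w: counts[w] / total for w in words}


def distribution_distance(p, q):
    """Euclidean distance between two distributions (an assumed measure)."""
    return math.sqrt(sum((p[w] - q[w]) ** 2 for w in p))


if __name__ == "__main__":
    a = word_distribution("ACGTACGTACGT", k=2)
    b = word_distribution("AAAAAAAATTTTGGGGCCCCAAAATTTT", k=2)
    print(distribution_distance(a, a))  # 0.0
    print(distribution_distance(a, b))
```

Because each sequence is reduced to a vector of 4^k relative frequencies, sequences of very different lengths can be compared directly without alignment, which is the property the abstract emphasises.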

Item Type: Article
Source: Copyright of this article belongs to Current Science Association.
ID Code: 74639
Deposited On: 17 Dec 2011 10:29
Last Modified: 18 May 2016 18:58
