Prediction of probable genes by Fourier analysis of genomic sequences

Tiwari, Shrish; Ramachandran, S.; Bhattacharya, Alok; Bhattacharya, Sudha; Ramaswamy, Ramakrishna (1997) Prediction of probable genes by Fourier analysis of genomic sequences. Bioinformatics, 13 (3). pp. 263-270. ISSN 1367-4803

Official URL: http://bioinformatics.oxfordjournals.org/cgi/conte...

Related URL: http://dx.doi.org/10.1093/bioinformatics/13.3.263

Abstract

Motivation: The major signal in coding regions of genomic sequences is a three-base periodicity. Our aim is to use Fourier techniques to analyse this periodicity, and thereby to develop a tool to recognize coding regions in genomic DNA.

Results: The three-base periodicity in the nucleotide arrangement appears as a sharp peak at frequency f = 1/3 in the Fourier (or power) spectrum. From extensive spectral analysis of DNA sequences of total length over 5.5 million base pairs from a wide variety of organisms (including the human genome), and by separately examining coding and non-coding sequences, we find that the relative height of the peak at f = 1/3 in the Fourier spectrum is a good discriminator of coding potential. We use this feature to detect probable coding regions in DNA sequences by examining the local signal-to-noise ratio of the peak within a sliding window. While the overall accuracy is comparable to that of other techniques currently in use, the proposed measure is independent of training sets or existing database information, and can thus find general application.

Availability: A computer program, GeneScan, which locates coding open reading frames and exonic regions in genomic sequences, has been developed and is available on request.
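The idea in the abstract can be illustrated with a minimal sketch. It is not the GeneScan program, and the abstract does not give the exact formulas, so the following choices are assumptions made for illustration: each base (A, C, G, T) is mapped to a 0/1 indicator sequence, the total power is the sum of the four squared DFT magnitudes, and the "signal-to-noise ratio" is taken as the power at f = 1/3 divided by the mean power over all non-zero frequencies. The function names and the default window and step sizes are hypothetical.

```python
import cmath

BASES = "ACGT"

def power_at(seq, k):
    """Total power at DFT index k, summed over the four base-indicator sequences."""
    n = len(seq)
    total = 0.0
    for base in BASES:
        # DFT coefficient of the 0/1 indicator sequence for this base.
        coeff = sum(cmath.exp(-2j * cmath.pi * k * j / n)
                    for j, ch in enumerate(seq) if ch == base)
        total += abs(coeff) ** 2
    return total

def peak_to_mean_ratio(seq):
    """Power at f = 1/3 divided by the mean power over DFT indices 1..n-1.

    This ratio is an assumed stand-in for the signal-to-noise measure
    mentioned in the abstract.
    """
    seq = seq.upper()
    n = len(seq) - len(seq) % 3   # trim so that f = 1/3 is an exact DFT index
    if n < 3:
        return 0.0
    seq = seq[:n]
    peak = power_at(seq, n // 3)
    # Parseval: for an indicator sequence with c ones, the power summed over
    # all n indices is n*c, and the k = 0 term is c*c.
    counts = [seq.count(base) for base in BASES]
    mean = sum(n * c - c * c for c in counts) / (n - 1)
    return peak / mean if mean > 0 else 0.0

def scan(seq, window=351, step=3):
    """Yield (start, ratio) for each sliding window; window and step are illustrative."""
    for start in range(0, len(seq) - window + 1, step):
        yield start, peak_to_mean_ratio(seq[start:start + window])
```

Under these assumptions, a strictly three-periodic sequence such as a repeated codon gives a ratio far above 1, while a sequence with no periodicity gives a ratio near 1. Where to place a threshold between coding and non-coding windows is not stated in the abstract and would have to be chosen empirically.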

Item Type: Article
Source: Copyright of this article belongs to Oxford University Press.
ID Code: 2717
Deposited On: 08 Oct 2010 09:51
Last Modified: 16 May 2016 13:39