EGPred: prediction of eukaryotic genes using Ab initio methods after combining with sequence similarity approaches

Issac, Biju; Raghava, Gajendra Pal Singh (2004) EGPred: prediction of eukaryotic genes using Ab initio methods after combining with sequence similarity approaches. Genome Research, 14. pp. 1756-1766. ISSN 1088-9051

Official URL: http://genome.cshlp.org/content/14/9/1756.short

Related URL: http://dx.doi.org/10.1101/gr.2524704

Abstract

EGPred is a Web-based server that combines ab initio methods and similarity searches to predict genes, particularly exon regions, with high accuracy. The EGPred program proceeds in the following steps: (1) an initial BLASTX search of the genomic sequence against the RefSeq database identifies protein hits with an E-value <1; (2) a second BLASTX search of the genomic sequence against the hits from the previous run, with relaxed parameters (E-value <10), helps to retrieve all probable coding exon regions; (3) a BLASTN search of the genomic sequence against the intron database detects probable intron regions; (4) the probable intron and exon regions are compared to filter out wrong exons; (5) the NNSPLICE program is used to reassign splicing signal site positions in the remaining probable coding exons; and (6) ab initio predictions are combined with the exons derived from the fifth step, based on the relative strength of start/stop and splice signal sites as obtained from the ab initio and similarity searches. The combination method increases the exon-level performance of five different ab initio programs by 4%-10% when evaluated on the HMR195 data set. A similar improvement is observed when the ab initio programs are evaluated on the Burset/Guigo data set. Finally, EGPred is demonstrated on an ~95-Mbp fragment of human chromosome 13. The list of predicted genes from this analysis is available in the supplementary material. The EGPred program is computationally intensive because of the multiple BLAST runs during each analysis. The EGPred server is available at http://www.imtech.res.in/raghava/egpred/.
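Step (4) of the pipeline, comparing probable exon and intron regions, can be illustrated with a minimal interval-overlap sketch. The abstract does not state the actual comparison rule, so everything here is an assumption made for illustration: the function names `overlap_length` and `filter_exons`, the `max_intron_fraction` cutoff, and the use of 1-based inclusive coordinates are hypothetical and are not taken from the EGPred program.

```python
from typing import List, Tuple

# (start, end) in 1-based inclusive genomic coordinates (an assumed convention).
Interval = Tuple[int, int]


def overlap_length(a: Interval, b: Interval) -> int:
    """Return the number of bases shared by two inclusive intervals (0 if disjoint)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


def filter_exons(
    exons: List[Interval],
    introns: List[Interval],
    max_intron_fraction: float = 0.5,  # hypothetical cutoff, not from the paper
) -> List[Interval]:
    """Keep exons whose intron-covered fraction does not exceed the cutoff.

    Assumes the intron intervals do not overlap one another; otherwise the
    covered length of an exon would be counted more than once.
    """
    kept = []
    for exon in exons:
        exon_len = exon[1] - exon[0] + 1
        covered = sum(overlap_length(exon, intron) for intron in introns)
        if covered / exon_len <= max_intron_fraction:
            kept.append(exon)
    return kept


if __name__ == "__main__":
    # Made-up coordinates, purely illustrative.
    probable_exons = [(100, 200), (300, 400), (500, 560)]
    probable_introns = [(290, 410), (550, 700)]
    print(filter_exons(probable_exons, probable_introns))
```

In this toy input the second exon lies entirely inside a probable intron and is discarded, while the third overlaps an intron only at its edge and is retained. A fractional cutoff is one simple design choice among many; the published method may weigh the evidence differently.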

Item Type:Article
Source:Copyright of this article belongs to Cold Spring Harbor Laboratory Press.
ID Code:43086
Deposited On:09 Jun 2011 12:43
Last Modified:18 May 2016 00:11
