TVSBS: a fast exact pattern matching algorithm for biological sequences

Thathoo, Rahul ; Virmani, Ashish ; Sai Lakshmi, S. ; Balakrishnan, N. ; Sekar, K. (2006) TVSBS: a fast exact pattern matching algorithm for biological sequences Current Science, 91 (1). pp. 47-53. ISSN 0011-3891

[img]
Preview
PDF - Publisher Version
52kB

Official URL: http://www.ias.ac.in/currsci/jul102006/47.pdf

Abstract

The post-genomic era is witnessing a remarkable increase in the number of nucleotide and amino acid sequences. The content of biological sequence databases almost doubles frequently. Pattern matching emerges as a powerful tool in locating nucleotide or amino acid sequence patterns in the biological sequence databases. Presently, several pattern-matching algorithms are available in the literature right from the basic Brute Force algorithm to the recent SSABS. The efficiency of the various algorithms depends on faster and exact identification of the pattern in the text. In this article, we propose an exact pattern-matching algorithm for biological sequences. The proposed algorithm, TVSBS, is a combination of Berry-Ravindran and SSABS algorithms. The performance of the new algorithm has been improved using the shift of Berry-Ravindran bad character table, which leads to lesser number of character comparisons. It works consistently well for both nucleotide and amino acid sequences. The proposed algorithm has been compared with the recent algorithm, SSABS. The results show the robustness of the proposed algorithm and thus it can be incorporated in any exact pattern-matching applications involving biological sequences. The best- and worst-case time complexities of the new algorithm are also outlined.

Item Type:Article
Source:Copyright of this article belongs to Current Science Association.
Keywords:Amino Acids; Character Comparisons; Exact Pattern Matching; Nucleotides
ID Code:64440
Deposited On:10 Oct 2011 07:31
Last Modified:18 May 2016 12:52

Repository Staff Only: item control page