Quantifying the utility of the past in mining large databases

Pudi, Vikram; Haritsa, Jayant R. (2000). Quantifying the utility of the past in mining large databases. Information Systems, 25 (5), pp. 323-343. ISSN 0306-4379

Full text not available from this repository.

Official URL: http://www.sciencedirect.com/science/article/pii/S...

Related URL: http://dx.doi.org/10.1016/S0306-4379(00)00021-1


Incremental mining algorithms that can efficiently derive the current mining output by utilizing previous mining results are attractive to business organizations, since data mining is typically a resource-intensive, recurring activity. In this paper, we present the Delta algorithm for the robust and efficient incremental mining of association rules on large market basket databases. Delta guarantees efficiency by ensuring that, for any dataset, at most three passes over the increment and one pass over the previous database are required to generate the desired rules. Further, it handles "multi-support" environments, where the support requirements for the current mining differ from those used in the previous mining, a feature in tune with the exploratory nature of the mining process. We present a performance evaluation of Delta on large databases over a range of increment sizes, data distributions, and changes in support requirements. The experimental results show that Delta can provide significant improvements in execution times over previously proposed incremental algorithms in all these environments. In fact, for many workloads, its performance is close to that achieved by an optimal, but practically infeasible, algorithm.
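
The general idea of reusing previous mining results can be illustrated with a minimal Python sketch. This is not the Delta algorithm, whose details are not given in this record; it is a simple illustration under stated assumptions: the support threshold is unchanged between runs, itemsets are counted by brute force up to a small size, and all function and parameter names (`count_itemsets`, `incremental_frequent`, `max_size`) are hypothetical. It relies on the fact that an itemset frequent in the combined database must be frequent in the previous database or in the increment, so only itemsets that are newly frequent in the increment require a pass over the previous database.

```python
from collections import Counter
from itertools import combinations


def count_itemsets(transactions, max_size=2):
    """Count every itemset of up to max_size items (brute force, illustration only)."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        for k in range(1, max_size + 1):
            for itemset in combinations(items, k):
                counts[itemset] += 1
    return counts


def incremental_frequent(old_db, old_frequent_counts, increment,
                         min_support, max_size=2):
    """Return {itemset: count} for itemsets frequent in old_db + increment.

    old_frequent_counts holds the counts of itemsets that were frequent in
    old_db at the same min_support (assumption: the threshold has not changed).
    """
    total = len(old_db) + len(increment)
    inc_counts = count_itemsets(increment, max_size)  # one pass over the increment

    result = {}
    # Previously frequent itemsets: add the increment counts, no access to old_db.
    for itemset, old_count in old_frequent_counts.items():
        count = old_count + inc_counts.get(itemset, 0)
        if count >= min_support * total:
            result[itemset] = count

    # Itemsets frequent in the increment but not previously frequent are the
    # only other candidates; their old counts are unknown.
    new_candidates = {
        itemset for itemset, count in inc_counts.items()
        if itemset not in old_frequent_counts
        and count >= min_support * len(increment)
    }
    if new_candidates:
        old_counts = Counter()
        for transaction in old_db:  # single pass over the previous database
            items = set(transaction)
            for itemset in new_candidates:
                if items.issuperset(itemset):
                    old_counts[itemset] += 1
        for itemset in new_candidates:
            count = old_counts[itemset] + inc_counts[itemset]
            if count >= min_support * total:
                result[itemset] = count
    return result
```

When the increment introduces no newly frequent itemsets, this sketch never reads the previous database at all, which is the kind of saving that motivates incremental approaches.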

Item Type: Article
Source: Copyright of this article belongs to Elsevier Science.
Keywords: Data Mining; Association Rule; Hierarchical Association Rule; Incremental Mining
ID Code: 62451
Deposited On: 22 Sep 2011 03:20
Last Modified: 22 Sep 2011 03:20
