Sarawagi, Sunita ; Thomas, Shiby ; Agrawal, Rakesh (2000) Integrating association rule mining with databases: alternatives and implications Data Mining and Knowledge Discovery journal, 4 (2-3). pp. 89-125.
Abstract
Data mining on large data warehouses is becoming increasingly important. In support of this trend, we consider a spectrum of architectural alternatives for coupling mining with database systems. These alternatives include: loose-coupling through a SQL cursor interface; encapsulation of a mining algorithm in a stored procedure; caching the data to a file system on-the-fly and mining; tight-coupling using primarily user-defined functions; and SQL implementations for processing in the DBMS. We comprehensively study the option of expressing the mining algorithm in the form of SQL queries using association rule mining as a case in point. We consider four options in SQL-92 and six options in SQL enhanced with object-relational extensions (SQL-OR). Our evaluation of the different architectural alternatives shows that from a performance perspective, the Cache-Mine option is superior, although the performance of the SQL-OR option is within a factor of two. Both the Cache-Mine and the SQL-OR approaches incur a higher storage penalty than the loose-coupling approach, which performance-wise is a factor of 3 to 4 worse than Cache-Mine. The SQL-92 implementations were too slow to qualify as a competitive option. We also compare these alternatives on the basis of qualitative factors like automatic parallelization, development ease, portability and inter-operability.
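To make the idea of expressing a mining step as a SQL query concrete, the following is a minimal sketch of support counting for item pairs using a plain self-join with GROUP BY and HAVING. It is not one of the formulations evaluated in the article; the table name `t`, its columns `tid` and `item`, and the function `frequent_pairs` are assumptions made for illustration, and SQLite (through Python's standard `sqlite3` module) stands in for the DBMS.

```python
import sqlite3


def frequent_pairs(transactions, min_support):
    """Count item pairs that co-occur in at least `min_support` transactions.

    `transactions` is a list of item lists. The (tid, item) layout of the
    hypothetical table `t` is an assumption of this sketch.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (tid INTEGER, item TEXT)")

    # One row per (transaction id, distinct item) pair.
    rows = [
        (tid, item)
        for tid, items in enumerate(transactions)
        for item in set(items)
    ]
    conn.executemany("INSERT INTO t VALUES (?, ?)", rows)

    # Self-join on the transaction id; a.item < b.item keeps each
    # unordered pair exactly once. HAVING applies the support threshold.
    query = """
        SELECT a.item, b.item, COUNT(*) AS support
        FROM t AS a JOIN t AS b
          ON a.tid = b.tid AND a.item < b.item
        GROUP BY a.item, b.item
        HAVING COUNT(*) >= ?
        ORDER BY a.item, b.item
    """
    result = {(i1, i2): n for i1, i2, n in conn.execute(query, (min_support,))}
    conn.close()
    return result


if __name__ == "__main__":
    baskets = [
        ["bread", "milk"],
        ["bread", "milk", "eggs"],
        ["milk", "eggs"],
        ["bread", "eggs", "milk"],
    ]
    print(frequent_pairs(baskets, min_support=3))
```

Here the whole counting pass runs inside the database engine and only the qualifying pairs are returned to the application, which is the general pattern behind a SQL-based implementation; larger itemsets would need additional joins or candidate tables.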
Item Type: | Article |
---|---|
Source: | Copyright of this article belongs to ResearchGate GmbH |
ID Code: | 128429 |
Deposited On: | 20 Oct 2022 09:56 |
Last Modified: | 14 Nov 2022 11:54 |