andi611 / Apriori-and-Eclat-Frequent-Itemset-Mining

Licence: MIT license

Data Mining: Apriori and Eclat Frequent Itemset Mining

Python implementations of Apriori and Eclat, two of the best-known basic algorithms for mining frequent itemsets in a set of transactions.

Implementations

  • Apriori algorithm
  • Eclat algorithm (recursive method w/ GPU acceleration support)
  • Eclat algorithm (iterative method)
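As a rough illustration of how the two approaches differ, here is a minimal sketch in plain Python. It is not the code in this repository: the function names `apriori` and `eclat`, the `min_count` parameter (an absolute count rather than a fraction), and the toy transactions are all assumptions made for the example. Apriori searches level by level over candidate itemsets and rescans the transactions to count them, while Eclat stores, for each item, the set of transaction ids containing it and intersects those sets depth-first.

```python
from itertools import combinations


def apriori(transactions, min_count):
    """Level-wise search: count size-k candidates, then build size k+1."""
    transactions = [frozenset(t) for t in transactions]
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Count how many transactions contain each candidate itemset.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(level)
        # Join frequent k-itemsets into (k+1)-candidates, then drop any
        # candidate with an infrequent k-subset (the Apriori property).
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def eclat(transactions, min_count):
    """Depth-first search over tidsets (item -> ids of its transactions)."""
    tidsets = {}
    for tid, t in enumerate(transactions):
        for item in t:
            tidsets.setdefault(item, set()).add(tid)
    frequent = {}

    def extend(prefix, candidates):
        # candidates: (item, tidset) pairs that are frequent together with prefix.
        for i, (item, tids) in enumerate(candidates):
            itemset = prefix | {item}
            frequent[itemset] = len(tids)
            # Intersect with the tidsets of the remaining items to go deeper.
            suffix = []
            for other, other_tids in candidates[i + 1:]:
                shared = tids & other_tids
                if len(shared) >= min_count:
                    suffix.append((other, shared))
            extend(itemset, suffix)

    roots = sorted(
        ((item, tids) for item, tids in tidsets.items() if len(tids) >= min_count),
        key=lambda pair: pair[0],
    )
    extend(frozenset(), roots)
    return frequent


# Hypothetical toy transactions, for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "diaper", "beer", "eggs"},
    {"milk", "diaper", "beer", "cola"},
    {"bread", "milk", "diaper", "beer"},
    {"bread", "milk", "diaper", "cola"},
]
apriori_result = apriori(transactions, min_count=3)
eclat_result = eclat(transactions, min_count=3)
```

Both functions return a dict that maps each frequent itemset (a `frozenset`) to its support count, so the two results can be compared directly.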

Requirements

  • < Python 3.6+ >
  • < NVIDIA CUDA 9.0 > (Optional)
  • < Pycuda 2018.1.1 > (Optional)
  • < g++ [gcc version 6.4.0 (GCC)] > (Optional)

Environment Setup

sudo pip3 install pycuda
  • For the "CUDA unsupported GNU version" problem, refer here or follow these steps:
1. sudo apt-get install gcc-6
2. sudo apt-get install g++-6
3. sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-6 10
4. sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-6 10

Datasets:

  • ./data/data.txt: suggested min support range: [0.6, 0.02]
  • ./data/data2.txt: a harder dataset; only Eclat finds results in reasonable time. Suggested min support range: [0.1, 0.0002]
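The suggested values read as relative supports, that is, the fraction of transactions that must contain an itemset for it to count as frequent. That reading is an assumption based on the ranges above, and the helper names `support` and `is_frequent` and the toy data below are hypothetical, not taken from this repository. A minimal sketch:

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def is_frequent(itemset, transactions, min_support):
    """True when the relative support reaches the threshold."""
    return support(itemset, transactions) >= min_support


# Hypothetical toy transactions, for illustration only.
toy = [
    ["a", "b", "c"],
    ["a", "b"],
    ["a", "c"],
    ["b", "c"],
    ["a", "b", "c"],
]
ab_support = support(["a", "b"], toy)    # contained in 3 of 5 transactions
abc_support = support(["a", "b", "c"], toy)  # contained in 2 of 5 transactions
```

Lowering the threshold admits more itemsets: here `{a, b, c}` fails at 0.6 but passes at 0.02, which is why the low end of each range is the expensive one.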

Usage

  • To run the Apriori / Eclat algorithm with default settings:
python3 runner.py apriori
python3 runner.py eclat
  • Other arguments can be given by:
python3 runner.py [mode] --min_support 0.6 --input_path ./data/data.txt --output_path ./data/output.txt
  • To run Eclat with GPU acceleration (Suggested dataset: data2.txt):
python3 runner.py eclat --min_support 0.02 --input_path ./data/data2.txt --use_CUDA
  • To plot run time vs. different experiment values:
python3 runner.py [mode] --plot_support
python3 runner.py [mode] --plot_support_gpu --input_path ./data/data2.txt --use_CUDA
python3 runner.py [mode] --compare_gpu --input_path ./data/data2.txt --use_CUDA
python3 runner.py [mode] --plot_thread --input_path ./data/data2.txt --use_CUDA
python3 runner.py [mode] --plot_block --input_path ./data/data2.txt --use_CUDA
  • To test with toy data:
python3 runner.py [mode] --toy_data
  • To run the Eclat algorithm with the iterative method:
python3 runner.py [mode] --iterative
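
For readers curious what an iterative Eclat might look like, the sketch below replaces recursion with an explicit stack of (prefix, candidates) pairs. It is an assumption about the general technique, not the code behind the --iterative flag; the function name `eclat_iterative`, the `min_count` parameter (an absolute count), and the toy data are made up for the example.

```python
def eclat_iterative(transactions, min_count):
    """Eclat with an explicit stack instead of recursive calls."""
    # Vertical layout: item -> set of ids of the transactions containing it.
    tidsets = {}
    for tid, t in enumerate(transactions):
        for item in t:
            tidsets.setdefault(item, set()).add(tid)

    roots = sorted(
        ((item, tids) for item, tids in tidsets.items() if len(tids) >= min_count),
        key=lambda pair: pair[0],
    )
    frequent = {}
    # Each stack entry holds a prefix and the (item, tidset) pairs that are
    # frequent when added to that prefix.
    stack = [(frozenset(), roots)]
    while stack:
        prefix, candidates = stack.pop()
        for i, (item, tids) in enumerate(candidates):
            itemset = prefix | {item}
            frequent[itemset] = len(tids)
            # Intersect with later items; keep only frequent extensions.
            suffix = [
                (other, tids & other_tids)
                for other, other_tids in candidates[i + 1:]
                if len(tids & other_tids) >= min_count
            ]
            if suffix:
                stack.append((itemset, suffix))
    return frequent


# Hypothetical toy transactions, for illustration only.
toy_transactions = [[1, 2, 3], [1, 2], [2, 3], [1, 2, 3]]
toy_frequent = eclat_iterative(toy_transactions, min_count=2)
```

The explicit stack avoids Python's recursion limit on deep itemset lattices; the set of itemsets found is the same as with a recursive formulation, only the visiting order differs.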

Apriori minimum support vs. run time plot

Eclat minimum support vs. run time plot

Eclat minimum support vs. run time plot (data2.txt w/ GPU version)

Eclat w/ GPU and w/o GPU comparison plot (data2.txt w/ GPU version)
