andi611 / Apriori-and-Eclat-Frequent-Itemset-Mining

Licence: MIT license

Python implementations of Apriori and Eclat, two of the best-known basic algorithms for mining frequent itemsets in a set of transactions.


Data Mining: Apriori and Eclat Frequent Itemset Mining


Implementations

Apriori algorithm
Eclat algorithm (recursive method w/ GPU acceleration support)
Eclat algorithm (iterative method)
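
For readers new to the two algorithms, here is a minimal pure-Python sketch of the general techniques. It is not the code from this repository: the function names, the toy transactions, and the use of an absolute count threshold (`min_count`) are assumptions made for illustration. Apriori searches level by level over a horizontal layout, while Eclat searches depth-first over a vertical layout (item to transaction-id set) and derives supports by intersecting those sets.

```python
from itertools import combinations


def apriori(transactions, min_count):
    """Level-wise search: build size-k candidates from frequent (k-1)-itemsets."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_count}
    frequent = dict(current)
    k = 2
    while current:
        # Join step: unions of two frequent (k-1)-itemsets that have size k.
        candidates = {a | b for a, b in combinations(list(current), 2)
                      if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in current
                             for s in combinations(c, k - 1))}
        # Count step: one pass over the transactions per level.
        current = {}
        for c in candidates:
            n = sum(1 for t in transactions if c <= t)
            if n >= min_count:
                current[c] = n
        frequent.update(current)
        k += 1
    return frequent


def eclat(transactions, min_count):
    """Depth-first search over a vertical layout (item -> transaction ids)."""
    tidsets = {}
    for tid, t in enumerate(transactions):
        for item in t:
            tidsets.setdefault(item, set()).add(tid)
    frequent = {}

    def extend(prefix, items):
        # items: (item, tidset) pairs already frequent together with prefix.
        for i, (item, tids) in enumerate(items):
            itemset = prefix | {item}
            frequent[frozenset(itemset)] = len(tids)
            # Intersect with the remaining items to grow the itemset.
            suffix = []
            for other, other_tids in items[i + 1:]:
                common = tids & other_tids
                if len(common) >= min_count:
                    suffix.append((other, common))
            extend(itemset, suffix)

    start = sorted(((item, tids) for item, tids in tidsets.items()
                    if len(tids) >= min_count), key=lambda pair: pair[0])
    extend(frozenset(), start)
    return frequent


# Hypothetical toy transactions, not one of the repository's datasets.
transactions = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c"]]
print(apriori(transactions, 3) == eclat(transactions, 3))
```

Both functions return a dict that maps each frequent itemset (a frozenset) to its support count, so the two results can be compared directly.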

Requirements

Python 3.6+
NVIDIA CUDA 9.0 (optional)
PyCUDA 2018.1.1 (optional)
g++, gcc version 6.4.0 (optional)

Environment Setup

Install CUDA: CUDA 9.0 installation guide
Install PyCUDA:

sudo pip3 install pycuda

If CUDA reports an "unsupported GNU version" error, install GCC 6 and register it with update-alternatives using the following steps:

1. sudo apt-get install gcc-6
2. sudo apt-get install g++-6
3. sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-6 10
4. sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-6 10

Datasets:

./data/data.txt: suggested min support range: [0.6, 0.02]
./data/data2.txt: a harder dataset on which only Eclat finds results in reasonable time. Suggested min support range: [0.1, 0.0002]
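
To make the support values above concrete, here is a small sketch of how a relative minimum support relates to transaction counts. It assumes that support means the fraction of transactions containing an itemset and that the threshold is rounded up to a whole number of transactions. The runner may round differently, and the helper names and toy data are hypothetical.

```python
import math


def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def min_support_count(min_support, n_transactions):
    """Smallest transaction count that satisfies a relative min support.

    round() guards against floating-point noise before rounding up
    (an assumption about the rounding rule, not the repository's code).
    """
    return math.ceil(round(min_support * n_transactions, 9))


# Hypothetical toy transactions.
toy = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c"]]
print(support(["a", "b"], toy))
print(min_support_count(0.02, 1000))
```

Lowering the minimum support lowers the count threshold, so more itemsets qualify and the search space grows. That is why the low end of each suggested range is the expensive one.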

Usage

To run the Apriori / Eclat algorithm with default settings:

python3 runner.py apriori
python3 runner.py eclat

Other arguments can be given by:

python3 runner.py [mode] --min_support 0.6 --input_path ./data/data.txt --output_path ./data/output.txt
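
As an illustration of how a command line of this shape can be parsed, here is a hypothetical argparse sketch. Only the flag names come from the usage lines in this README. The defaults, the `choices` restriction, and the `build_parser` helper are assumptions and may not match runner.py.

```python
import argparse


def build_parser():
    """Hypothetical parser that mirrors the flags shown in this README."""
    parser = argparse.ArgumentParser(description="Frequent itemset mining (sketch)")
    parser.add_argument("mode", choices=["apriori", "eclat"])
    # The defaults below are assumptions taken from the example command.
    parser.add_argument("--min_support", type=float, default=0.6)
    parser.add_argument("--input_path", default="./data/data.txt")
    parser.add_argument("--output_path", default="./data/output.txt")
    parser.add_argument("--use_CUDA", action="store_true")
    parser.add_argument("--iterative", action="store_true")
    parser.add_argument("--toy_data", action="store_true")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(
    ["eclat", "--min_support", "0.02", "--input_path", "./data/data2.txt", "--use_CUDA"]
)
print(args.mode, args.min_support, args.input_path, args.use_CUDA)
```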

To run Eclat with GPU acceleration (Suggested dataset: data2.txt):

python3 runner.py eclat --min_support 0.02 --input_path ./data/data2.txt --use_CUDA

To plot run time vs. different experiment values:

python3 runner.py [mode] --plot_support
python3 runner.py [mode] --plot_support_gpu --input_path ./data/data2.txt --use_CUDA
python3 runner.py [mode] --compare_gpu --input_path ./data/data2.txt --use_CUDA
python3 runner.py [mode] --plot_thread --input_path ./data/data2.txt --use_CUDA
python3 runner.py [mode] --plot_block --input_path ./data/data2.txt --use_CUDA

To test with toy data:

python3 runner.py [mode] --toy_data

To run the Eclat algorithm with the iterative method:

python3 runner.py [mode] --iterative
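
One way to turn a recursive Eclat search into an iterative one is to replace the call stack with an explicit stack of (prefix, candidate items) pairs. The sketch below shows that general technique in pure Python. It is not the repository's iterative implementation, and `eclat_iterative`, `min_count`, and the toy data are hypothetical names.

```python
def eclat_iterative(transactions, min_count):
    """Eclat with an explicit stack instead of recursion."""
    # Vertical layout: item -> set of ids of the transactions containing it.
    tidsets = {}
    for tid, t in enumerate(transactions):
        for item in t:
            tidsets.setdefault(item, set()).add(tid)
    start = sorted(((item, tids) for item, tids in tidsets.items()
                    if len(tids) >= min_count), key=lambda pair: pair[0])

    frequent = {}
    # Each stack entry is one pending "call": a prefix plus its extensions.
    stack = [(frozenset(), start)]
    while stack:
        prefix, items = stack.pop()
        for i, (item, tids) in enumerate(items):
            itemset = prefix | {item}
            frequent[itemset] = len(tids)
            # Keep only extensions that stay frequent after intersection.
            suffix = []
            for other, other_tids in items[i + 1:]:
                common = tids & other_tids
                if len(common) >= min_count:
                    suffix.append((other, common))
            if suffix:
                stack.append((itemset, suffix))
    return frequent


# Hypothetical toy transactions.
toy_transactions = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c"]]
result = eclat_iterative(toy_transactions, 3)
print(len(result))
```

The explicit stack avoids Python's recursion limit on deep searches. It visits itemsets in a different order than a recursive version would, but the set of frequent itemsets found is the same.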

Apriori minimum support vs. run time plot

Eclat minimum support vs. run time plot

Eclat minimum support vs. run time plot (data2.txt w/ GPU version)

Eclat w/ GPU and w/o GPU comparison plot (data2.txt w/ GPU version)

Reference

PyCUDA tutorial documentation
PyCUDA array documentation
PyCUDA tutorial
CUDA parallel thread hierarchy
CUDA executes kernels using a grid of blocks of threads. The common indexing pattern in CUDA programs uses the built-in variables gridDim.x (the number of thread blocks), blockDim.x (the number of threads in each block), blockIdx.x (the index of the current block within the grid), and threadIdx.x (the index of the current thread within the block).
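
The indexing pattern described above can be simulated on the CPU in plain Python, which may help when reasoning about block and thread counts. The sketch runs the "threads" one after another and uses an element-wise AND of two bitmaps as a stand-in workload. The `launch` and `and_kernel` helpers are hypothetical, and nothing here is claimed to match this repository's actual kernel.

```python
def launch(kernel, grid_dim, block_dim, *args):
    """Call `kernel` once per (block, thread) pair, like a 1-D CUDA launch.

    A real GPU runs these calls in parallel; here they run in sequence.
    """
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            kernel(block_idx, block_dim, thread_idx, *args)


def and_kernel(block_idx, block_dim, thread_idx, a, b, out, n):
    # Global index: blockIdx.x * blockDim.x + threadIdx.x
    i = block_idx * block_dim + thread_idx
    # Guard: the last block may contain threads past the end of the data.
    if i < n:
        out[i] = a[i] & b[i]


# Two hypothetical bitmaps, one machine word per list element.
a = [0b1101, 0b0110, 0b1111, 0b0001, 0b1000]
b = [0b1011, 0b0011, 0b0101, 0b0001, 0b0111]
n = len(a)
block_dim = 2                                # threads per block
grid_dim = (n + block_dim - 1) // block_dim  # enough blocks to cover n elements
out = [0] * n
launch(and_kernel, grid_dim, block_dim, a, b, out, n)
print(out)
```

With 5 elements and 2 threads per block, the rounded-up division gives 3 blocks, so 6 "threads" run and the bounds check discards the extra one.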
