
LoLei / spmf-py

Licence: GPL-3.0 License
Python SPMF Wrapper 🐍 🎁

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to spmf-py

GraMi
GraMi is a novel framework for frequent subgraph mining in a single large graph, GraMi outperforms existing techniques by 2 orders of magnitudes. GraMi supports finding frequent subgraphs as well as frequent patterns, Compared to subgraphs, patterns offer a more powerful version of matching that captures transitive interactions between graph nod…
Stars: ✭ 76 (+117.14%)
Mutual labels:  frequent-patterns
uniswap-python
🦄 The unofficial Python client for the Uniswap exchange.
Stars: ✭ 533 (+1422.86%)
Mutual labels:  wrapper
FPGrowth-and-Apriori-algorithm-Association-Rule-Data-Mining
Implementation of FPTree-Growth and Apriori-Algorithm for finding frequent patterns in Transactional Database.
Stars: ✭ 19 (-45.71%)
Mutual labels:  data-mining
with-wrapper
React HOC for wrapper components.
Stars: ✭ 35 (+0%)
Mutual labels:  wrapper
SHAP FOLD
(Explainable AI) - Learning Non-Monotonic Logic Programs From Statistical Models Using High-Utility Itemset Mining
Stars: ✭ 35 (+0%)
Mutual labels:  data-mining
Mega-index-heroku
Mega nz heroku index, Serves mega.nz to http via heroku web. It Alters downloading speed and stability
Stars: ✭ 165 (+371.43%)
Mutual labels:  wrapper
raylib-nelua
Raylib wrapper to nelua language
Stars: ✭ 27 (-22.86%)
Mutual labels:  wrapper
Taviloglu.Wrike.ApiClient
.NET Client for Wrike API
Stars: ✭ 24 (-31.43%)
Mutual labels:  wrapper
act
Computational synthetic biology: Predicting DNA edits for bioengineering
Stars: ✭ 67 (+91.43%)
Mutual labels:  data-mining
WireGuard-Wrapper
Simple wrapper that makes WireGuard easier to use with VPN providers.
Stars: ✭ 29 (-17.14%)
Mutual labels:  wrapper
imgur-scraper
Retrieve years of imgur.com's data without any authentication.
Stars: ✭ 26 (-25.71%)
Mutual labels:  data-mining
cocoon-demo
Cocoon – a flow-based workflow automation, data mining and visual analytics tool.
Stars: ✭ 19 (-45.71%)
Mutual labels:  data-mining
node-api
A JavaScript API Wrapper for NovelCOVID/API
Stars: ✭ 63 (+80%)
Mutual labels:  wrapper
SharpPhysFS
Managed wrapper for the PhysFS library
Stars: ✭ 14 (-60%)
Mutual labels:  wrapper
MangaDex.py
An easy to use wrapper for the MangaDexAPIv5 written in Python using Requests.
Stars: ✭ 13 (-62.86%)
Mutual labels:  wrapper
data-mining-course
An undergraduate course on data mining.
Stars: ✭ 24 (-31.43%)
Mutual labels:  data-mining
fireREST
Python library for interacting with Cisco Firepower Management Center REST API
Stars: ✭ 47 (+34.29%)
Mutual labels:  wrapper
Ptero4J
A java wrapper for the pterodactyl panel API
Stars: ✭ 26 (-25.71%)
Mutual labels:  wrapper
readability-extractor
Javascript/Node wrapper around Mozilla's Readability library so that ArchiveBox can call it as a oneshot CLI command to extract each page's article text.
Stars: ✭ 18 (-48.57%)
Mutual labels:  wrapper
datamining algorithms
Python implementations of the ten classic data mining algorithms, including SVM, AdaBoost, C4.5, CART, and Naïve Bayes
Stars: ✭ 64 (+82.86%)
Mutual labels:  data-mining

spmf-py

Python Wrapper for SPMF 🐍 🎁

Information

Makes the SPMF [1] Java data mining library usable from Python.

Essentially, this module calls the Java command line tool of SPMF, passes the user arguments to it, and parses the output.
In addition, transformation of the data to Pandas DataFrame and CSV is possible.
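
As a rough illustration of that flow, the sketch below assembles the kind of command line SPMF's command-line tool accepts (`java -jar spmf.jar run <algorithm> <input> <output> <arguments...>`). The helper name `build_spmf_command` is hypothetical and not part of spmf-py; the module's actual internals may differ.

```python
from pathlib import Path


def build_spmf_command(algorithm, input_file, output_file, arguments=(),
                       jar_dir="."):
    """Assemble an SPMF command line as a list suitable for subprocess.run.

    Hypothetical helper for illustration; not spmf-py's actual code.
    """
    jar = str(Path(jar_dir) / "spmf.jar")
    command = ["java", "-jar", jar, "run", algorithm, input_file, output_file]
    # SPMF takes algorithm parameters as plain positional strings.
    command.extend(str(arg) for arg in arguments)
    return command


cmd = build_spmf_command("PrefixSpan", "contextPrefixSpan.txt", "output.txt",
                         arguments=[0.7, 5])
print(" ".join(cmd))
```

The list form can be handed to `subprocess.run` without shell quoting concerns; the sketch only builds the command and does not start Java.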

In theory, all algorithms featured in SPMF are callable. Nothing is hardcoded; look up the desired algorithm and its parameters in the SPMF documentation.

Installation

pip install spmf

Usage

Example:

from spmf import Spmf

spmf = Spmf("PrefixSpan", input_filename="contextPrefixSpan.txt",
            output_filename="output.txt", arguments=[0.7, 5])
spmf.run()
print(spmf.to_pandas_dataframe(pickle=True))
spmf.to_csv("output.csv")

Output:

=============  PREFIXSPAN 0.99-2016 - STATISTICS =============
 Total time ~ 2 ms
 Frequent sequences count : 14
 Max memory (mb) : 6.487663269042969
 minsup = 3 sequences.
 Pattern count : 14
===================================================

      pattern sup
0         [1]   4
1      [1, 2]   4
2      [1, 3]   4
3   [1, 3, 2]   3
4   [1, 3, 3]   3
5         [2]   4
6      [2, 3]   3
7         [3]   4
8      [3, 2]   3
9      [3, 3]   3
10        [4]   3
11     [4, 3]   3
12        [5]   3
13        [6]   3

The usage is similar to the one described in the SPMF documentation.
For all Python parameters, see the Spmf class.

SPMF Arguments

The arguments parameter holds the arguments that are passed to SPMF; they depend on the chosen algorithm. SPMF handles optional parameters as an ordered list. Because the algorithms have no named parameters, any parameter that is skipped must be filled with a blank string "". For example, to set only the first and the last parameter of an algorithm, pass "" for every parameter in between.
For advanced usage examples, see examples.
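
The padding rule can be sketched with a small helper. `pad_arguments` is a hypothetical name used only for illustration (it is not part of spmf-py), and the parameter positions below do not correspond to any particular SPMF algorithm.

```python
def pad_arguments(values_by_position, total):
    """Return a positional argument list with "" for unused slots.

    values_by_position maps a 0-based parameter position to its value;
    total is the number of slots up to and including the last one set.
    Hypothetical helper, not part of spmf-py.
    """
    return [values_by_position.get(i, "") for i in range(total)]


# Set only the first and the fourth parameter of some algorithm.
arguments = pad_arguments({0: 0.5, 3: 10}, total=4)
print(arguments)  # [0.5, '', '', 10]
```

The resulting list could then be passed as the arguments parameter of the Spmf constructor.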

SPMF Executable

Download it from the SPMF Website.
It is assumed that the SPMF binary spmf.jar is located in the same directory as spmf-py. If it is not, either symlink it, or use the spmf_bin_location_dir parameter.
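
If the location of spmf.jar varies between machines, one option is to search a few candidate directories first and pass the result as spmf_bin_location_dir. The sketch below is only an assumption about how such a lookup might be written; `find_spmf_jar_dir` and the candidate paths are hypothetical.

```python
from pathlib import Path


def find_spmf_jar_dir(candidates):
    """Return the first directory in candidates containing spmf.jar, else None."""
    for directory in candidates:
        if (Path(directory) / "spmf.jar").is_file():
            return str(directory)
    return None


# Candidate locations are placeholders; adjust them to your setup.
jar_dir = find_spmf_jar_dir([".", str(Path.home() / "tools" / "spmf")])
if jar_dir is None:
    print("spmf.jar not found; download it from the SPMF website")
else:
    print(f"pass spmf_bin_location_dir={jar_dir!r} to Spmf(...)")
```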

Input Formats

Either use an input file as specified by SPMF, or use one of the in-line formats as seen in examples.
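
For sequential pattern mining algorithms such as PrefixSpan, SPMF's sequence database format puts one sequence per line, with items separated by spaces, -1 closing each itemset, and -2 closing the sequence. The sketch below writes such a file from Python lists; the function names are hypothetical, and other algorithm families expect different input formats, so check the SPMF documentation for the algorithm in use.

```python
def format_spmf_sequence(sequence):
    """Format one sequence (a list of itemsets) as an SPMF sequence line."""
    tokens = []
    for itemset in sequence:
        tokens.extend(str(item) for item in itemset)
        tokens.append("-1")  # end of itemset
    tokens.append("-2")      # end of sequence
    return " ".join(tokens)


def write_spmf_sequences(sequences, path):
    """Write sequences to path, one SPMF-formatted sequence per line."""
    with open(path, "w", encoding="utf-8") as handle:
        for sequence in sequences:
            handle.write(format_spmf_sequence(sequence) + "\n")


line = format_spmf_sequence([[1], [1, 2, 3], [1, 3]])
print(line)  # 1 -1 1 2 3 -1 1 3 -1 -2
```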

Memory

The maximum memory can be increased in the constructor via Spmf(memory=n), where n is the amount in megabytes; see SPMF's FAQ.
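
Presumably this setting corresponds to the JVM's standard maximum heap option, -Xmx. Under that assumption, the sketch below shows how a megabyte value would translate into a Java command line; `jvm_memory_flag` is a hypothetical helper, not part of spmf-py.

```python
def jvm_memory_flag(megabytes):
    """Return the JVM maximum-heap flag for the given size in megabytes."""
    if megabytes <= 0:
        raise ValueError("memory must be a positive number of megabytes")
    return f"-Xmx{int(megabytes)}m"


# JVM options must come before -jar on the java command line.
command = ["java", jvm_memory_flag(2048), "-jar", "spmf.jar", "run",
           "PrefixSpan", "contextPrefixSpan.txt", "output.txt", "0.7", "5"]
print(" ".join(command))
```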

Background

Why? If you're in a Python pipeline, like a Jupyter Notebook, it might be cumbersome to use Java as an intermediate step. Using spmf-py you can stay in your pipeline as though Java is never used at all.

Bibliography

[1] Fournier-Viger, P., Lin, C.W., Gomariz, A., Gueniche, T., Soltani, A., Deng, Z., Lam, H. T. (2016).
The SPMF Open-Source Data Mining Library Version 2.
Proc. 19th European Conference on Principles of Data Mining and Knowledge Discovery (PKDD 2016) Part III, Springer LNCS 9853, pp. 36-40.

Disclaimer

This module has not been tested for all 184 algorithms offered in SPMF. Calling them and writing to the output file should be possible for all. Output parsing, however, should work for those whose output resembles that of the sequential pattern mining algorithms. It has not been tested with other output types, so some adaptation of the output parsing might be necessary. If something is not working, submit an issue or create a PR yourself!
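
For orientation, sequential pattern mining algorithms in SPMF write one pattern per line, with -1 closing each itemset and the support after a #SUP: marker (for example `1 -1 3 -1 2 -1 #SUP: 3`). The sketch below parses a line of that shape. It is not the module's own parser, `parse_pattern_line` is a hypothetical name, and output with extra fields after the support would need further handling.

```python
def parse_pattern_line(line):
    """Parse one SPMF sequential-pattern line into (itemsets, support)."""
    pattern_part, _, support_part = line.partition("#SUP:")
    itemsets, current = [], []
    for token in pattern_part.split():
        if token == "-1":       # -1 closes the current itemset
            itemsets.append(current)
            current = []
        else:
            current.append(token)
    if current:                 # tolerate a missing trailing -1
        itemsets.append(current)
    return itemsets, int(support_part.strip())


pattern, support = parse_pattern_line("1 -1 3 -1 2 -1 #SUP: 3")
print(pattern, support)  # [['1'], ['3'], ['2']] 3
```

Items are kept as strings here because SPMF output is plain text; convert them to integers if the data calls for it.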
