
Top 285 data-mining open source projects

Raven
RAVEN is a flexible and multi-purpose probabilistic risk analysis, uncertainty quantification, parameter optimization and data knowledge-discovering framework.
Openhistorian
The Open Source Time-Series Data Historian
Kddcup 2020
6th-place solution for the 2020-KDDCUP Debiasing Challenge
Ayakashi
⚡️ Ayakashi.io - The next generation web scraping framework
Lab Workshops
Materials for workshops on text mining, machine learning, and data visualization
Bella
Bella is a pure-Python post-exploitation data mining tool and remote administration tool for macOS. 🍎💻
Gspan
Python implementation of frequent subgraph mining algorithm gSpan. Directed graphs are supported.
Gitlogg
💾 🧮 🤯 Parse the 'git log' of multiple repos to 'JSON'
Vizuka
Explore high-dimensional datasets and how your algo handles specific regions.
Graph sampling
Graph Sampling is a Python package containing various approaches that sample the original graph according to different sample sizes.
Msnoise
A Python Package for Monitoring Seismic Velocity Changes using Ambient Seismic Noise | http://www.msnoise.org
Daggy
Daggy - Data Aggregation Utility. Open source, free, cross-platform, server-less, useful utility for remote or local data aggregation and streaming
Vvedenie Mashinnoe Obuchenie
📝 A collection of machine learning resources
Csmath 2020
This mathematics course is taught to first-year Ph.D. students in computer science and related areas at ZJU
Dc Hi guides
[Data Castle algorithm competition] Premium travel service order prediction, final rank 11
Dex
Dex : The Data Explorer -- A data visualization tool written in Java/Groovy/JavaFX capable of powerful ETL and publishing web visualizations.
2017 Ccf Bdci Enterprise
2017-CCF-BDCI enterprise business exit risk prediction: 9th/569 (Top 1.58%)
Tsv Utils
eBay's TSV Utilities: Command line tools for large, tabular data files. Filtering, statistics, sampling, joins and more.
Tsrepr
TSrepr: R package for time series representations
Rental Prediction
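First-place code for the monthly housing rent prediction task of the 2018 National College Student Computer Application Ability Competition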
Bee University
Project to collect university admission cutoff scores for 2014-2018 and analyze the data
Bolt
Fast approximate vector operations
Ffbe
Datamining for FFBE GL
Evalne
Source code for EvalNE, a Python library for evaluating Network Embedding methods.
Linkedingiveaway
👨🏽‍🏫 You can learn about anything here: what giveaways I do and why they are important in today's world. Are you interested in giveaways? 🔋
Wordtokenizers.jl
High performance tokenizers for natural language processing and other related tasks
Gendis
Contains an implementation (sklearn API) of the algorithm proposed in "GENDIS: GEnetic DIscovery of Shapelets" and code to reproduce all experiments.
Ail Framework
AIL framework - Analysis Information Leak framework
Etherscan Ml
Python Data Science and Machine Learning Library for the Ethereum and ERC-20 Blockchain
Tadw
An implementation of "Network Representation Learning with Rich Text Information" (IJCAI '15).
Ai For Security Learning
A curated collection of learning materials on security scenarios, AI-based security algorithms, and security data analysis
Helioml
A book about machine learning, statistics, and data mining for heliophysics
Mldm
Lecture course "Machine Learning and Data Mining" at the Faculty of Computational Mathematics and Cybernetics (CMC) of Lomonosov Moscow State University
Drugs Recommendation Using Reviews
Analyzes drug descriptions, conditions, and reviews, then recommends drugs for each health condition of a patient using deep learning models.
Metasra Pipeline
MetaSRA: normalized sample-specific metadata for the Sequence Read Archive
Invoice2data
Extract structured data from PDF invoices
Subdue
The Subdue graph miner discovers highly-compressing patterns in an input graph.
Clevercsv
CleverCSV is a Python package for handling messy CSV files. It provides a drop-in replacement for the builtin CSV module with improved dialect detection, and comes with a handy command line application for working with CSV files.
En Data mining
Data Mining Historical Newspaper Metadata (METS/ALTO formats)
Data mining
The Ruby DataMining Gem is a small collection of several data mining algorithms
Vectorbt
Ultimate Python library for time series analysis and backtesting at scale
Dataflowjavasdk
Google Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines.
Twitter Get Old Tweets Scraper
A data scraper for retrieving old tweets in Twitter using Python3.
Biolitmap
Code for the paper "BIOLITMAP: a web-based geolocated and temporal visualization of the evolution of bioinformatics publications" in Oxford Bioinformatics.
Stocktalk
Data collection tool for social media analytics