hh research
Automated search and analysis of job vacancies from hh.ru (HeadHunter) using Python. Data classification and estimation of statistical parameters.
Stars: ✭ 36 (-56.63%)
Combo
(AAAI '20) A Python Toolbox for Machine Learning Model Combination
Stars: ✭ 481 (+479.52%)
kaggle-code
A repository for some of the code I used in Kaggle data science & machine learning tasks.
Stars: ✭ 100 (+20.48%)
Awesome Datascience
📝 An awesome Data Science repository to learn from and apply to real-world problems.
Stars: ✭ 17,520 (+21008.43%)
Orange3
🍊 📊 💡 Orange: Interactive data analysis
Stars: ✭ 3,152 (+3697.59%)
Tsv Utils
eBay's TSV Utilities: Command line tools for large, tabular data files. Filtering, statistics, sampling, joins and more.
Stars: ✭ 1,215 (+1363.86%)
Reaper
Social media scraping / data collection tool for the Facebook, Twitter, Reddit, YouTube, Pinterest, and Tumblr APIs
Stars: ✭ 240 (+189.16%)
NIDS-Intrusion-Detection
Simple implementation of a network intrusion detection system. The KDD Cup '99 data set is used for this project: kdd_cup_10_percent is used for training, and the correct set is used for testing. PCA is used for dimensionality reduction. SVM and KNN are the supervised classification algorithms used. Accuracy: 83.5% for SVM, 80% for KNN.
Stars: ✭ 45 (-45.78%)
Datascience
Curated list of Python resources for data science.
Stars: ✭ 3,051 (+3575.9%)
Pyod
A Python Toolbox for Scalable Outlier Detection (Anomaly Detection)
Stars: ✭ 5,083 (+6024.1%)
Deepgraph
Analyze Data with Pandas-based Networks.
Stars: ✭ 232 (+179.52%)
evine
Interactive CLI Web Crawler
Stars: ✭ 140 (+68.67%)
nightlight
Nightlight: Astronomic Image Processing
Stars: ✭ 25 (-69.88%)
Subdue
The Subdue graph miner discovers highly-compressing patterns in an input graph.
Stars: ✭ 20 (-75.9%)
Amazing Feature Engineering
Feature engineering is the process of using domain knowledge to extract features from raw data via data mining techniques. These features can be used to improve the performance of machine learning algorithms. Feature engineering can be considered as applied machine learning itself.
Stars: ✭ 218 (+162.65%)
kasthack.osp
Generator of raw VK user dumps.
Stars: ✭ 15 (-81.93%)
Gwu data mining
Materials for GWU DNSC 6279 and DNSC 6290.
Stars: ✭ 217 (+161.45%)
Game Datasets
🎮 A curated list of awesome game datasets and tools for artificial intelligence in games
Stars: ✭ 261 (+214.46%)
data-science-popular-algorithms
Data Science algorithms and topics that you must know. (Newly Designed) Recommender Systems, Decision Trees, K-Means, LDA, RFM-Segmentation, XGBoost in Python, R, and Scala.
Stars: ✭ 65 (-21.69%)
Qminer
Analytic platform for real-time large-scale streams containing structured and unstructured data.
Stars: ✭ 206 (+148.19%)
FSCNMF
An implementation of "Fusing Structure and Content via Non-negative Matrix Factorization for Embedding Information Networks".
Stars: ✭ 16 (-80.72%)
Estadistica Con R
Personal notes on statistics, machine learning, and the R programming language
Stars: ✭ 201 (+142.17%)
Wordtokenizers.jl
High performance tokenizers for natural language processing and other related tasks
Stars: ✭ 63 (-24.1%)
Instascrape
Powerful and flexible Instagram scraping library for Python, providing easy-to-use and expressive tools for accessing data programmatically
Stars: ✭ 202 (+143.37%)
Pyss3
A Python package implementing a new machine learning model for text classification with visualization tools for Explainable AI
Stars: ✭ 191 (+130.12%)
Krangl
krangl is a {K}otlin DSL for data w{rangl}ing
Stars: ✭ 430 (+418.07%)
Chefboost
A lightweight decision tree framework for Python supporting regular algorithms (ID3, C4.5, CART, CHAID, and regression trees) and some advanced techniques (gradient boosting (GBDT, GBRT, GBM), random forest, and AdaBoost), with support for categorical features
Stars: ✭ 176 (+112.05%)
Dmtk
Microsoft Distributed Machine Learning Toolkit
Stars: ✭ 2,766 (+3232.53%)
Data Science Toolkit
Collection of stats, modeling, and data science tools in Python and R.
Stars: ✭ 169 (+103.61%)
Instagram-Comments-Scraper
Instagram comment scraper using Python and Selenium. Saves the comments to an Excel file.
Stars: ✭ 73 (-12.05%)
Allstate capstone
Allstate Kaggle Competition ML Capstone Project
Stars: ✭ 72 (-13.25%)
Pdftabextract
A set of tools for extracting tables from PDF files, helping with data mining on (OCR-processed) scanned documents.
Stars: ✭ 1,969 (+2272.29%)
taller SparkR
SparkR workshop for the Jornadas de Usuarios de R (R users' conference)
Stars: ✭ 12 (-85.54%)
Gensim
Topic Modelling for Humans
Stars: ✭ 12,763 (+15277.11%)
Dataproofer
A proofreader for your data
Stars: ✭ 628 (+656.63%)
Sourced Ce
source{d} Community Edition (CE)
Stars: ✭ 153 (+84.34%)
sl3
💪 🤔 Modern Super Learning with Machine Learning Pipelines
Stars: ✭ 93 (+12.05%)
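Alimusic
🎼 Tianchi Ali Music popularity trend prediction competition. The project covers all of the core code from the preliminary round through the second round. The aggregated second-round data can be downloaded from Baidu Netdisk; for a more detailed explanation of the approach, see my blog.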
Stars: ✭ 147 (+77.11%)
Vectorbt
Ultimate Python library for time series analysis and backtesting at scale
Stars: ✭ 855 (+930.12%)
non-api-fb-scraper
Scrape public Facebook posts from any group or user into a .csv file without needing to register for any API access
Stars: ✭ 40 (-51.81%)
Lagoujob
Job data mining repo for lagou.com
Stars: ✭ 256 (+208.43%)
interpretable-ml
Techniques & resources for training interpretable ML models, explaining ML models, and debugging ML models.
Stars: ✭ 17 (-79.52%)
Nfstream
NFStream: a Flexible Network Data Analysis Framework.
Stars: ✭ 622 (+649.4%)
PySPOD
A Python package for spectral proper orthogonal decomposition (SPOD).
Stars: ✭ 50 (-39.76%)