dflibIn-memory Java DataFrame library
Stars: ✭ 50 (+108.33%)
polarsFast multi-threaded DataFrame library in Rust | Python | Node.js
Stars: ✭ 6,368 (+26433.33%)
taller SparkRSparkR workshop for the Jornadas de Usuarios de R (R users' conference)
Stars: ✭ 12 (-50%)
Cookbook 2nd CodeCode of the IPython Cookbook, Second Edition, by Cyrille Rossant, Packt Publishing 2018 [read-only repository]
Stars: ✭ 541 (+2154.17%)
ElkiELKI Data Mining Toolkit
Stars: ✭ 613 (+2454.17%)
Static FrameImmutable and grow-only Pandas-like DataFrames with a more explicit and consistent interface.
Stars: ✭ 217 (+804.17%)
python-notebooksA collection of Jupyter Notebooks used in conferences or just to have some snippets.
Stars: ✭ 14 (-41.67%)
PracticalMachineLearningA collection of ML-related material, including notebooks, code, and a curated list of useful resources such as books and software. Almost everything mentioned here is free (as in speech, not as in free food) or open source.
Stars: ✭ 60 (+150%)
Knowage ServerKnowage is the professional open source suite for modern business analytics over traditional sources and big data systems.
Stars: ✭ 276 (+1050%)
TipdmTipDM modeling platform, an open-source data mining tool.
Stars: ✭ 130 (+441.67%)
PandastableTable analysis in Tkinter using pandas DataFrames.
Stars: ✭ 376 (+1466.67%)
DataframeC++ DataFrame for statistical, financial, and ML analysis, written in modern C++ using native types and continuous memory storage, with no pointers involved
Stars: ✭ 828 (+3350%)
tutorialsShort programming tutorials pertaining to data analysis.
Stars: ✭ 14 (-41.67%)
TablesawJava dataframe and visualization library
Stars: ✭ 2,785 (+11504.17%)
LagoujobJob data mining repo for lagou.com
Stars: ✭ 256 (+966.67%)
PyodA Python Toolbox for Scalable Outlier Detection (Anomaly Detection)
Stars: ✭ 5,083 (+21079.17%)
genieGenie: A Fast and Robust Hierarchical Clustering Algorithm (this R package has now been superseded by genieclust)
Stars: ✭ 21 (-12.5%)
DataflowjavasdkGoogle Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines.
Stars: ✭ 854 (+3458.33%)
VectorbtUltimate Python library for time series analysis and backtesting at scale
Stars: ✭ 855 (+3462.5%)
Rightmove webscraper.pyPython class to scrape data from rightmove.co.uk and return listings in a pandas DataFrame object
Stars: ✭ 125 (+420.83%)
PycmMulti-class confusion matrix library in Python
Stars: ✭ 1,076 (+4383.33%)
Amazing Feature EngineeringFeature engineering is the process of using domain knowledge to extract features from raw data via data mining techniques. These features can be used to improve the performance of machine learning algorithms. Feature engineering can be considered as applied machine learning itself.
Stars: ✭ 218 (+808.33%)
DatascienceCurated list of Python resources for data science.
Stars: ✭ 3,051 (+12612.5%)
Dominando-PandasThis repository is intended for learning the Pandas library.
Stars: ✭ 22 (-8.33%)
datatileA library for managing, validating, summarizing, and visualizing data.
Stars: ✭ 419 (+1645.83%)
Morpheus CoreThe foundational library of the Morpheus data science framework
Stars: ✭ 203 (+745.83%)
genericsNo description or website provided.
Stars: ✭ 25 (+4.17%)
ElandPython Client and Toolkit for DataFrames, Big Data, Machine Learning and ETL in Elasticsearch
Stars: ✭ 235 (+879.17%)
pandas-workshopAn introductory workshop on pandas with notebooks and exercises for following along.
Stars: ✭ 161 (+570.83%)
twitter-analytics-wrapperA simple Python wrapper to download tweets data from the Twitter Analytics platform. Particularly interesting for the impressions metrics that are unavailable on current Twitter API. Also works for the videos data.
Stars: ✭ 44 (+83.33%)
genieclustGenie++ Fast and Robust Hierarchical Clustering with Noise Point Detection - for Python and R
Stars: ✭ 34 (+41.67%)
UrsUniversal Reddit Scraper - A comprehensive Reddit scraping command-line tool written in Python.
Stars: ✭ 275 (+1045.83%)
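Ai LearnAn AI learning roadmap that compiles nearly 200 hands-on cases and projects, with free accompanying course materials, for beginners starting from zero and for job-oriented practice. Covers popular areas including Python, mathematics, machine learning, data analysis, deep learning, computer vision, natural language processing, PyTorch tensorflow machine-learning, deep-learning data-analysis data-mining mathematics data-science artificial-intelligence python tensorflow tensorflow2 caffe keras pytorch algorithm numpy pandas matplotlib seaborn nlp cv, and more.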
Stars: ✭ 4,387 (+18179.17%)
Pydataroadopen source for wechat-official-account (ID: PyDataLab)
Stars: ✭ 302 (+1158.33%)
DataFrameDataFrame Library for Java
Stars: ✭ 51 (+112.5%)
Model Describermodel-describer: Making machine learning interpretable to humans
Stars: ✭ 22 (-8.33%)
Drugs Recommendation Using ReviewsAnalyzing drug descriptions, conditions, and reviews, then recommending drugs for each patient health condition using deep learning models.
Stars: ✭ 35 (+45.83%)
Cookbook 2ndIPython Cookbook, Second Edition, by Cyrille Rossant, Packt Publishing 2018
Stars: ✭ 704 (+2833.33%)
DexDex: The Data Explorer, a data visualization tool written in Java/Groovy/JavaFX capable of powerful ETL and publishing web visualizations.
Stars: ✭ 1,238 (+5058.33%)
Tsrepr TSrepr: R package for time series representations
Stars: ✭ 75 (+212.5%)
DataprooferA proofreader for your data
Stars: ✭ 628 (+2516.67%)
Data Science ResourcesLearn what data science is and why it is important in today's modern world. Are you interested in data science?
Stars: ✭ 171 (+612.5%)
Pipelinethe `pipeline` shell command
Stars: ✭ 168 (+600%)
DeepgraphAnalyze Data with Pandas-based Networks. Documentation:
Stars: ✭ 232 (+866.67%)
woodworkWoodwork is a Python library that provides robust methods for managing and communicating data typing information.
Stars: ✭ 97 (+304.17%)
go-ringbufLock-free MPMC Ring Buffer (Generic) for SMP, in Go. Some posts in Chinese:
Stars: ✭ 43 (+79.17%)
NfstreamNFStream: a Flexible Network Data Analysis Framework.
Stars: ✭ 622 (+2491.67%)
Sourced Cesource{d} Community Edition (CE)
Stars: ✭ 153 (+537.5%)
isarn-sketches-sparkRoutines and data structures for using isarn-sketches idiomatically in Apache Spark
Stars: ✭ 28 (+16.67%)
dh-coreFunctional data science
Stars: ✭ 123 (+412.5%)