Akshare
AKShare is an elegant and simple financial data interface library for Python, built for human beings! An open-source financial data interface library.
Stars: ✭ 4,334 (+252.64%)
Flyte
Accelerate your ML and data workflows to production. Flyte is a production-grade orchestration system for your data and ML workloads. It has been battle-tested at Lyft, Spotify, Freenome, and others, and is truly open source.
Stars: ✭ 1,242 (+1.06%)
Skdata
Python tools for data analysis
Stars: ✭ 16 (-98.7%)
Airbyte
Airbyte is an open-source EL(T) platform that helps you replicate your data in your warehouses, lakes, and databases.
Stars: ✭ 4,919 (+300.24%)
Retriever
Quickly download, clean up, and install public datasets into a database management system
Stars: ✭ 241 (-80.39%)
Knowledge Repo
A next-generation curated knowledge sharing platform for data scientists and other technical professions.
Stars: ✭ 4,956 (+303.25%)
Data Science Hacks
Data Science Hacks is a collection of tips and tricks to help you become a better data scientist. The hacks are for everyone, from beginner to advanced, and cover Python, Jupyter Notebook, pandas, and so on.
Stars: ✭ 273 (-77.79%)
Codesearchnet
Datasets, tools, and benchmarks for representation learning of code.
Stars: ✭ 1,378 (+12.12%)
Openrefine
OpenRefine is a free, open source power tool for working with messy data and improving it
Stars: ✭ 8,531 (+594.14%)
Data Science Resources
You can learn about what data science is and why it's important in today's world. Are you interested in data science? 🔋
Stars: ✭ 171 (-86.09%)
Datacomparer
dataCompareR is an R package that allows users to compare two datasets and view a report on the similarities and differences.
Stars: ✭ 58 (-95.28%)
Pycm
Multi-class confusion matrix library in Python
Stars: ✭ 1,076 (-12.45%)
Machine Learning Resources
A curated list of awesome machine learning frameworks, libraries, courses, books, and much more.
Stars: ✭ 226 (-81.61%)
Datacleaner
The premier open source Data Quality solution
Stars: ✭ 391 (-68.19%)
Graphia
A visualisation tool for the creation and analysis of graphs
Stars: ✭ 67 (-94.55%)
Tsrepr
TSrepr: R package for time series representations
Stars: ✭ 75 (-93.9%)
Dream3d
Data analysis program and framework for materials science data analytics, based on the SIMPL managing framework.
Stars: ✭ 73 (-94.06%)
Datasheets
Read data from, write data to, and modify the formatting of Google Sheets
Stars: ✭ 593 (-51.75%)
Elki
ELKI Data Mining Toolkit
Stars: ✭ 613 (-50.12%)
Rows
A common, beautiful interface to tabular data, no matter the format
Stars: ✭ 739 (-39.87%)
Awesome Streamlit
The purpose of this project is to share knowledge on how awesome Streamlit is and can be
Stars: ✭ 769 (-37.43%)
Dataframe
C++ DataFrame for statistical, financial, and ML analysis in modern C++, using native types and continuous memory storage, with no pointers involved
Stars: ✭ 828 (-32.63%)
Cookbook 2nd Code
Code of the IPython Cookbook, Second Edition, by Cyrille Rossant, Packt Publishing 2018 [read-only repository]
Stars: ✭ 541 (-55.98%)
Pachyderm
Reproducible Data Science at Scale!
Stars: ✭ 5,305 (+331.65%)
Rumale
Rumale is a machine learning library in Ruby
Stars: ✭ 526 (-57.2%)
Pdpipe
Easy pipelines for pandas DataFrames.
Stars: ✭ 590 (-51.99%)
Imbalanced Learn
A Python Package to Tackle the Curse of Imbalanced Datasets in Machine Learning
Stars: ✭ 5,617 (+357.04%)
Magicbox
A platform that uses real-time data to inform life-saving humanitarian responses to emergency situations
Stars: ✭ 73 (-94.06%)
Dapy
Easy-to-use data analysis / manipulation framework for humans
Stars: ✭ 523 (-57.45%)
Cookbook 2nd
IPython Cookbook, Second Edition, by Cyrille Rossant, Packt Publishing 2018
Stars: ✭ 704 (-42.72%)
Metabase
The simplest, fastest way to get business intelligence and analytics to everyone in your company 😋
Stars: ✭ 26,803 (+2080.88%)
Dataproofer
A proofreader for your data
Stars: ✭ 628 (-48.9%)
Riceteacatpanda
Repo with challenge material for riceteacatpanda (2020)
Stars: ✭ 18 (-98.54%)
Disk.frame
Fast Disk-Based Parallelized Data Manipulation Framework for Larger-than-RAM Data
Stars: ✭ 517 (-57.93%)
Nfstream
NFStream: a Flexible Network Data Analysis Framework.
Stars: ✭ 622 (-49.39%)
Resources
PyMC3 educational resources
Stars: ✭ 930 (-24.33%)
Model Describer
model-describer: Making machine learning interpretable to humans
Stars: ✭ 22 (-98.21%)
Hyperlearn
Machine learning that is 50% faster and uses 50% less RAM. Sklearn rewritten with Numba. SVD, NNMF, PCA, LinearReg, RidgeReg, Randomized, Truncated SVD/PCA, and CSR Matrices are all 50+% faster
Stars: ✭ 1,204 (-2.03%)
Pydataset
Instant access to many datasets in Python.
Stars: ✭ 880 (-28.4%)
Pandas Profiling
Create HTML profiling reports from pandas DataFrame objects
Stars: ✭ 8,329 (+577.71%)
Dataframes.jl
In-memory tabular data in Julia
Stars: ✭ 951 (-22.62%)
Data Forge Ts
The JavaScript data transformation and analysis toolkit inspired by Pandas and LINQ.
Stars: ✭ 967 (-21.32%)
Data Science On Gcp
Source code accompanying the book Data Science on the Google Cloud Platform, by Valliappa Lakshmanan, O'Reilly 2017
Stars: ✭ 864 (-29.7%)
Mlcourse.ai
Open Machine Learning Course
Stars: ✭ 7,963 (+547.93%)
Janitor
Simple tools for data cleaning in R
Stars: ✭ 981 (-20.18%)
Optimus
🚚 Agile Data Preparation Workflows made easy with dask, cudf, dask_cudf and pyspark
Stars: ✭ 986 (-19.77%)
Mathematicavsr
Example projects, code, and documents for comparing Mathematica with R.
Stars: ✭ 41 (-96.66%)
Dataflowjavasdk
Google Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines.
Stars: ✭ 854 (-30.51%)
Apogee
Tools for dealing with APOGEE data
Stars: ✭ 34 (-97.23%)
Data Polygamy
Data Polygamy is a topology-based framework that allows users to query for statistically significant relationships between spatio-temporal data sets.
Stars: ✭ 39 (-96.83%)