Dvc
Data Version Control | Git for Data & Models | ML Experiments Management
Stars: 9,004 (+7103.2%)
Orange3
Orange: Interactive data analysis
Stars: 3,152 (+2421.6%)
H1st
The AI Application Platform We All Need. Human AND Machine Intelligence. Based on experience building AI solutions at Panasonic: robotics predictive maintenance, cold-chain energy optimization, Gigafactory battery mfg, avionics, automotive cybersecurity, and more.
Stars: 697 (+457.6%)
Datacompy
Pandas and Spark DataFrame comparison for humans
Stars: 147 (+17.6%)
Koalas
Koalas: pandas API on Apache Spark
Stars: 3,044 (+2335.2%)
Data Science Ipython Notebooks
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
Stars: 22,048 (+17538.4%)
Lux
Python API for Intelligent Visual Data Discovery
Stars: 787 (+529.6%)
Pandas Profiling
Create HTML profiling reports from pandas DataFrame objects
Stars: 8,329 (+6563.2%)
Crime Analysis
Association Rule Mining from Spatial Data for Crime Analysis
Stars: 20 (-84%)
Optimus
Agile Data Preparation Workflows made easy with dask, cudf, dask_cudf and pyspark
Stars: 986 (+688.8%)
Pixiedust
Python Helper library for Jupyter Notebooks
Stars: 998 (+698.4%)
Ds and ml projects
Data Science & Machine Learning projects and tutorials in Python, from beginner to advanced level.
Stars: 56 (-55.2%)
Rows
A common, beautiful interface to tabular data, no matter the format
Stars: 739 (+491.2%)
Dataframe
C++ DataFrame for statistical, financial, and ML analysis, in modern C++ using native types and continuous memory storage, with no pointers involved
Stars: 828 (+562.4%)
Tiledb Vcf
Efficient variant-call data storage and retrieval library using the TileDB storage library.
Stars: 26 (-79.2%)
Foxcross
AsyncIO serving for data science models
Stars: 18 (-85.6%)
Mlcourse.ai
Open Machine Learning Course
Stars: 7,963 (+6270.4%)
Machinelearningcourse
A collection of notebooks from my Machine Learning class, written in Python 3
Stars: 35 (-72%)
Pulsar Spark
When Apache Pulsar meets Apache Spark
Stars: 55 (-56%)
Rocket.chat
The communications platform that puts data protection first.
Stars: 31,251 (+24900.8%)
Setl
A simple Spark-powered ETL framework that just works
Stars: 79 (-36.8%)
Pymc Example Project
Example PyMC3 project for performing Bayesian data analysis using a probabilistic programming approach to machine learning.
Stars: 90 (-28%)
Danfojs
danfo.js is an open-source JavaScript library providing high-performance, intuitive, and easy-to-use data structures for manipulating and processing structured data.
Stars: 1,304 (+943.2%)
Nothing Private
Do you think you are safe using private browsing or incognito mode? This will prove that you're wrong.
Stars: 1,375 (+1000%)
Pyspark Cheatsheet
Quick reference guide to common patterns & functions in PySpark.
Stars: 108 (-13.6%)
Oak
Meaningful control of data in distributed systems.
Stars: 698 (+458.4%)
Rightmove webscraper.py
Python class to scrape data from rightmove.co.uk and return listings in a pandas DataFrame object
Stars: 125 (+0%)
Querido Diario
Brazilian government gazettes, accessible to everyone.
Stars: 681 (+444.8%)
Boltzmannclean
Fill missing values in Pandas DataFrames using Restricted Boltzmann Machines
Stars: 23 (-81.6%)
Pysyft
A library for answering questions using data you cannot see
Stars: 7,811 (+6148.8%)
Plots2
A collaborative knowledge-exchange platform in Rails; we welcome first-time contributors!
Stars: 666 (+432.8%)
Place2live
Analysis of the characteristics of different countries
Stars: 30 (-76%)
Nsfw Filter
A Google Chrome / Firefox extension that blocks NSFW images from the web pages that you load using TensorFlow JS.
Stars: 984 (+687.2%)
Python for ml
Brief introduction to Python for machine learning
Stars: 29 (-76.8%)
Skoot
A package for data science practitioners. This library implements a number of helpful, common data transformations with a scikit-learn friendly interface in an effort to expedite the modeling process.
Stars: 50 (-60%)
Rumble
Rumble 1.11.0 "Banyan Tree" for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more
Stars: 58 (-53.6%)
Seaborn
Statistical data visualization in Python
Stars: 9,007 (+7105.6%)
Rsparkling
RSparkling: Use H2O Sparkling Water from R (Spark + R + Machine Learning)
Stars: 65 (-48%)
Container Tabs Sidebar
Firefox add-on that aims to use screen real estate more efficiently by showing tabs in a sidebar, grouped by privacy containers. Inspired by TreeStyleTab.
Stars: 87 (-30.4%)
W2v
Word2Vec models with Twitter data using Spark. Blog:
Stars: 64 (-48.8%)
Sspipe
Simple Smart Pipe: Python productivity tool for rapid data manipulation
Stars: 96 (-23.2%)
Spark Py Notebooks
Apache Spark & Python (PySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks
Stars: 1,338 (+970.4%)
Sigmoidal ai
Python, Data Science, Machine Learning, and Deep Learning tutorials - Sigmoidal
Stars: 103 (-17.6%)
Dev Practice
Practice your skills with these ideas.
Stars: 1,127 (+801.6%)
Seaborn Tutorial
This repository is my attempt to help Data Science aspirants gain the Data Visualization skills required to progress in their careers. It includes all the types of plot offered by Seaborn, applied to random datasets.
Stars: 114 (-8.8%)
Goodwork
Self-hosted project management and collaboration tool powered by TALL stack
Stars: 1,730 (+1284%)
Dat8
General Assembly's 2015 Data Science course in Washington, DC
Stars: 1,516 (+1112.8%)
Syft.js
The official Syft worker for Web and Node, built in JavaScript
Stars: 118 (-5.6%)
Sweetviz
Visualize and compare datasets, target values and associations, with one line of code.
Stars: 1,851 (+1380.8%)
D6t Python
Accelerate data science
Stars: 118 (-5.6%)
Aws Data Wrangler
Pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
Stars: 2,385 (+1808%)
Pandas Videos
Jupyter notebook and datasets from the pandas Q&A video series
Stars: 1,716 (+1272.8%)
Pyspark Example Project
Example project implementing best practices for PySpark ETL jobs and applications.
Stars: 633 (+406.4%)
Pyjanitor
Clean APIs for data cleaning. Python implementation of the R package Janitor
Stars: 647 (+417.6%)