1898 open source projects that are alternatives to or similar to Datacompy

Agile data code 2
Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition
Stars: ✭ 413 (+180.95%)
Mutual labels:  data-science, spark, data
Cape Python
Collaborate on privacy-preserving policy for data science projects in Pandas and Apache Spark
Stars: ✭ 125 (-14.97%)
Mutual labels:  data-science, spark, pandas
Koalas
Koalas: pandas API on Apache Spark
Stars: ✭ 3,044 (+1970.75%)
Mutual labels:  data-science, spark, pandas
Pyspark Cheatsheet
🐍 Quick reference guide to common patterns & functions in PySpark.
Stars: ✭ 108 (-26.53%)
Mutual labels:  data-science, spark, data
Gspread Pandas
A package to easily open an instance of a Google spreadsheet and interact with worksheets through Pandas DataFrames.
Stars: ✭ 226 (+53.74%)
Mutual labels:  data-science, pandas, data
Pdpipe
Easy pipelines for pandas DataFrames.
Stars: ✭ 590 (+301.36%)
Mutual labels:  data-science, pandas, data
Data Science Hacks
Data Science Hacks is a collection of tips and tricks to help you become a better data scientist. The hacks are for everyone, from beginner to advanced, and cover Python, Jupyter Notebook, pandas, and more.
Stars: ✭ 273 (+85.71%)
Mutual labels:  data-science, pandas, data
Data Science Ipython Notebooks
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
Stars: ✭ 22,048 (+14898.64%)
Mutual labels:  data-science, spark, pandas
Datasheets
Read data from, write data to, and modify the formatting of Google Sheets
Stars: ✭ 593 (+303.4%)
Mutual labels:  data-science, pandas, data
Coffee Quality Database
Building the Coffee Quality Institute Database
Stars: ✭ 141 (-4.08%)
Mutual labels:  data-science, data
Pycm
Multi-class confusion matrix library in Python
Stars: ✭ 1,076 (+631.97%)
Mutual labels:  data-science, data
Rumble
⛈️ Rumble 1.11.0 "Banyan Tree"🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more
Stars: ✭ 58 (-60.54%)
Mutual labels:  data-science, spark
Machinelearningcourse
A collection of notebooks from my Machine Learning class, written in Python 3
Stars: ✭ 35 (-76.19%)
Mutual labels:  data-science, pandas
Optimus
🚚 Agile Data Preparation Workflows made easy with dask, cudf, dask_cudf and pyspark
Stars: ✭ 986 (+570.75%)
Mutual labels:  data-science, spark
Pulsar Spark
When Apache Pulsar meets Apache Spark
Stars: ✭ 55 (-62.59%)
Mutual labels:  data-science, spark
Pixiedust
Python Helper library for Jupyter Notebooks
Stars: ✭ 998 (+578.91%)
Mutual labels:  data-science, spark
W2v
Word2Vec models with Twitter data using Spark. Blog:
Stars: ✭ 64 (-56.46%)
Mutual labels:  data-science, spark
Rsparkling
RSparkling: Use H2O Sparkling Water from R (Spark + R + Machine Learning)
Stars: ✭ 65 (-55.78%)
Mutual labels:  data-science, spark
Python Cheat Sheet
Python Cheat Sheet NumPy, Matplotlib
Stars: ✭ 1,739 (+1082.99%)
Mutual labels:  data-science, pandas
Setl
A simple Spark-powered ETL framework that just works 🍺
Stars: ✭ 79 (-46.26%)
Mutual labels:  data-science, spark
Openrefine
OpenRefine is a free, open source power tool for working with messy data and improving it
Stars: ✭ 8,531 (+5703.4%)
Mutual labels:  data-science, data
Graphia
A visualisation tool for the creation and analysis of graphs
Stars: ✭ 67 (-54.42%)
Mutual labels:  data-science, data
Gopup
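Data interfaces: Baidu, Google, Toutiao, and Weibo indices; macroeconomic data; interest rate data; currency exchange rates; "Qianlima" and unicorn companies; Xinwen Lianbo news transcripts; film and TV box office data; university lists; epidemic data…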
Stars: ✭ 1,229 (+736.05%)
Mutual labels:  data-science, data
Pymc Example Project
Example PyMC3 project for performing Bayesian data analysis using a probabilistic programming approach to machine learning.
Stars: ✭ 90 (-38.78%)
Mutual labels:  data-science, pandas
Sigmoidal ai
Python, Data Science, Machine Learning, and Deep Learning tutorials - Sigmoidal
Stars: ✭ 103 (-29.93%)
Mutual labels:  data-science, pandas
Awesome Bigdata
A curated list of awesome big data frameworks, resources and other awesomeness.
Stars: ✭ 10,478 (+7027.89%)
Mutual labels:  data-science, data
Data Forge Js
JavaScript data transformation and analysis toolkit inspired by Pandas and LINQ.
Stars: ✭ 139 (-5.44%)
Mutual labels:  pandas, data
Mlcourse.ai
Open Machine Learning Course
Stars: ✭ 7,963 (+5317.01%)
Mutual labels:  data-science, pandas
Data Forge Ts
The JavaScript data transformation and analysis toolkit inspired by Pandas and LINQ.
Stars: ✭ 967 (+557.82%)
Mutual labels:  pandas, data
Data Polygamy
Data Polygamy is a topology-based framework that allows users to query for statistically significant relationships between spatio-temporal data sets.
Stars: ✭ 39 (-73.47%)
Mutual labels:  data-science, data
Python for ml
Brief introduction to Python for machine learning
Stars: ✭ 29 (-80.27%)
Mutual labels:  data-science, pandas
Skoot
A package for data science practitioners. This library implements a number of helpful, common data transformations with a scikit-learn friendly interface in an effort to expedite the modeling process.
Stars: ✭ 50 (-65.99%)
Mutual labels:  data-science, pandas
10 Simple Hacks To Speed Up Your Data Analysis In Python
Some useful Tips and Tricks to speed up the data analysis process in Python.
Stars: ✭ 45 (-69.39%)
Mutual labels:  data-science, pandas
Ds and ml projects
Data Science & Machine Learning projects and tutorials in python from beginner to advanced level.
Stars: ✭ 56 (-61.9%)
Mutual labels:  data-science, pandas
Crime Analysis
Association Rule Mining from Spatial Data for Crime Analysis
Stars: ✭ 20 (-86.39%)
Mutual labels:  data-science, pandas
Data science blogs
A repository to keep track of all the code that I end up writing for my blog posts.
Stars: ✭ 139 (-5.44%)
Mutual labels:  spark, data
Data Science Cookbook
🎓 Jupyter notebooks from UFC data science course
Stars: ✭ 60 (-59.18%)
Mutual labels:  data-science, spark
Seaborn
Statistical data visualization in Python
Stars: ✭ 9,007 (+6027.21%)
Mutual labels:  data-science, pandas
Datacomparer
dataCompareR is an R package that allows users to compare two datasets and view a report on the similarities and differences.
Stars: ✭ 58 (-60.54%)
Mutual labels:  data-science, data
Magicbox
A platform that uses real-time data to inform life-saving humanitarian responses to emergency situations
Stars: ✭ 73 (-50.34%)
Mutual labels:  data-science, data
Locopy
locopy: Loading/Unloading to Redshift and Snowflake using Python.
Stars: ✭ 73 (-50.34%)
Mutual labels:  pandas, data
Flyte
Accelerate your ML and Data workflows to production. Flyte is a production-grade orchestration system for your Data and ML workloads. It has been battle-tested at Lyft, Spotify, Freenome and others and is truly open source.
Stars: ✭ 1,242 (+744.9%)
Mutual labels:  data-science, data
Pandas Profiling
Create HTML profiling reports from pandas DataFrame objects
Stars: ✭ 8,329 (+5565.99%)
Mutual labels:  data-science, pandas
Codesearchnet
Datasets, tools, and benchmarks for representation learning of code.
Stars: ✭ 1,378 (+837.41%)
Mutual labels:  data-science, data
Sspipe
Simple Smart Pipe: python productivity-tool for rapid data manipulation
Stars: ✭ 96 (-34.69%)
Mutual labels:  data-science, pandas
Spark Py Notebooks
Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks
Stars: ✭ 1,338 (+810.2%)
Mutual labels:  data-science, spark
Python Bigdata
Data science and Big Data with Python
Stars: ✭ 112 (-23.81%)
Mutual labels:  data-science, spark
Sweetviz
Visualize and compare datasets, target values and associations, with one line of code.
Stars: ✭ 1,851 (+1159.18%)
Mutual labels:  data-science, pandas
Seaborn Tutorial
This repository is my attempt to help Data Science aspirants gain necessary Data Visualization skills required to progress in their career. It includes all the types of plot offered by Seaborn, applied on random datasets.
Stars: ✭ 114 (-22.45%)
Mutual labels:  data-science, pandas
Dat8
General Assembly's 2015 Data Science course in Washington, DC
Stars: ✭ 1,516 (+931.29%)
Mutual labels:  data-science, pandas
Ml Pyxis
Tool for reading and writing datasets of tensors in a Lightning Memory-Mapped Database (LMDB). Designed to manage machine learning datasets with fast reading speeds.
Stars: ✭ 93 (-36.73%)
Mutual labels:  data-science, data
Hass Data Detective
Explore and analyse your Home Assistant data
Stars: ✭ 109 (-25.85%)
Mutual labels:  data-science, data
Just Dashboard
📊 📋 Dashboards using YAML or JSON files
Stars: ✭ 1,511 (+927.89%)
Mutual labels:  data-science, data
D6t Python
Accelerate data science
Stars: ✭ 118 (-19.73%)
Mutual labels:  data-science, pandas
Dbg Pds
Deutsche Boerse's Financial Trading Public Data Set
Stars: ✭ 124 (-15.65%)
Mutual labels:  data-science, data
Spark Alchemy
Collection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive
Stars: ✭ 122 (-17.01%)
Mutual labels:  data-science, spark
Rightmove webscraper.py
Python class to scrape data from rightmove.co.uk and return listings in a pandas DataFrame object
Stars: ✭ 125 (-14.97%)
Mutual labels:  data-science, pandas
Pandas Videos
Jupyter notebook and datasets from the pandas Q&A video series
Stars: ✭ 1,716 (+1067.35%)
Mutual labels:  data-science, pandas
Aws Data Wrangler
Pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
Stars: ✭ 2,385 (+1522.45%)
Mutual labels:  data-science, pandas
Data Science For Marketing Analytics
Achieve your marketing goals with the data analytics power of Python
Stars: ✭ 127 (-13.61%)
Mutual labels:  data-science, pandas