Pandas Profiling
Create HTML profiling reports from pandas DataFrame objects
Stars: ✭ 8,329 (+5171.52%)
Code
Compilation of R and Python programming codes on the Data Professor YouTube channel.
Stars: ✭ 287 (+81.65%)
Spark Py Notebooks
Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks
Stars: ✭ 1,338 (+746.84%)
W2v
Word2Vec models with Twitter data using Spark. Blog:
Stars: ✭ 64 (-59.49%)
Zat
Zeek Analysis Tools (ZAT): Processing and analysis of Zeek network data with Pandas, scikit-learn, Kafka and Spark
Stars: ✭ 303 (+91.77%)
Spark Practice
Apache Spark (PySpark) Practice on Real Data
Stars: ✭ 200 (+26.58%)
Pysparkgeoanalysis
🌐 Interactive Workshop on GeoAnalysis using PySpark
Stars: ✭ 63 (-60.13%)
Sparkmagic
Jupyter magics and kernels for working with remote Spark clusters
Stars: ✭ 954 (+503.8%)
Optimus
🚚 Agile Data Preparation Workflows made easy with dask, cudf, dask_cudf and pyspark
Stars: ✭ 986 (+524.05%)
Spark With Python
Fundamentals of Spark with Python (using PySpark), code examples
Stars: ✭ 150 (-5.06%)
Covid 19 jhu data web scrap and cleaning
This repository contains data and code used to get and clean data from https://github.com/CSSEGISandData/COVID-19 and https://www.worldometers.info/coronavirus/
Stars: ✭ 80 (-49.37%)
Hops Examples
Examples for Deep Learning/Feature Store/Spark/Flink/Hive/Kafka jobs and Jupyter notebooks on Hops
Stars: ✭ 84 (-46.84%)
Py Quantmod
Powerful financial charting library based on R's Quantmod | http://py-quantmod.readthedocs.io/en/latest/
Stars: ✭ 155 (-1.9%)
Kaggle Competitions
There are plenty of courses and tutorials that can help you learn machine learning from scratch, but here on GitHub I want to solve some Kaggle competitions as a comprehensive workflow with Python packages. After reading, you can use this workflow as a template to solve other real problems.
Stars: ✭ 86 (-45.57%)
Pymc Example Project
Example PyMC3 project for performing Bayesian data analysis using a probabilistic programming approach to machine learning.
Stars: ✭ 90 (-43.04%)
Almond
A Scala kernel for Jupyter
Stars: ✭ 1,354 (+756.96%)
Bitcoin Value Predictor
[NOT MAINTAINED] Predicting Bitcoin price using time series analysis and sentiment analysis of tweets about Bitcoin
Stars: ✭ 91 (-42.41%)
Maps Location History
Get, concatenate and process your location history from Google Maps Timeline
Stars: ✭ 99 (-37.34%)
Hnswlib
Java library for approximate nearest neighbors search using Hierarchical Navigable Small World graphs
Stars: ✭ 108 (-31.65%)
Spark R Notebooks
R on Apache Spark (SparkR) tutorials for Big Data analysis and Machine Learning as IPython / Jupyter notebooks
Stars: ✭ 109 (-31.01%)
Pydata Pandas Workshop
Material for my PyData Jupyter & Pandas workshops. I'm also available for personal in-house training on request.
Stars: ✭ 65 (-58.86%)
Spark Nlp Models
Models and Pipelines for the Spark NLP library
Stars: ✭ 88 (-44.3%)
Python
Jupyter notebooks and datasets for the interesting pandas/python/data science video series.
Stars: ✭ 65 (-58.86%)
Kglab
Graph-Based Data Science: an abstraction layer in Python for building knowledge graphs, integrated with popular graph libraries – atop Pandas, RDFlib, pySHACL, RAPIDS, NetworkX, iGraph, PyVis, pslpython, pyarrow, etc.
Stars: ✭ 98 (-37.97%)
100 Pandas Puzzles
100 data puzzles for pandas, ranging from short and simple to super tricky (60% complete)
Stars: ✭ 1,382 (+774.68%)
Sigmoidal ai
Python, Data Science, Machine Learning, and Deep Learning tutorials - Sigmoidal
Stars: ✭ 103 (-34.81%)
Pyspark Cheatsheet
🐍 Quick reference guide to common patterns & functions in PySpark.
Stars: ✭ 108 (-31.65%)
Dat8
General Assembly's 2015 Data Science course in Washington, DC
Stars: ✭ 1,516 (+859.49%)
Seaborn Tutorial
This repository is my attempt to help Data Science aspirants gain the Data Visualization skills required to progress in their careers. It includes all the types of plots offered by Seaborn, applied to random datasets.
Stars: ✭ 114 (-27.85%)
Ibis
A pandas-like deferred expression system, with first-class SQL support
Stars: ✭ 1,630 (+931.65%)
Pandas Videos
Jupyter notebook and datasets from the pandas Q&A video series
Stars: ✭ 1,716 (+986.08%)
Sweetviz
Visualize and compare datasets, target values and associations, with one line of code.
Stars: ✭ 1,851 (+1071.52%)
Pbpython
Code, Notebooks and Examples from Practical Business Python
Stars: ✭ 1,724 (+991.14%)
Jupyter Datatables
Jupyter Notebook extension leveraging pandas DataFrames by integrating DataTables and ChartJS.
Stars: ✭ 127 (-19.62%)
Cape Python
Collaborate on privacy-preserving policy for data science projects in Pandas and Apache Spark
Stars: ✭ 125 (-20.89%)
Repo 2019
BERT, AWS RDS, AWS Forecast, EMR Spark Cluster, Hive, Serverless, Google Assistant + Raspberry Pi, Infrared, Google Cloud Platform Natural Language, Anomaly detection, Tensorflow, Mathematics
Stars: ✭ 133 (-15.82%)
Python Bigdata
Data science and Big Data with Python
Stars: ✭ 112 (-29.11%)
Gdeltpyr
Python-based framework to retrieve Global Database of Events, Language, and Tone (GDELT) version 1.0 and version 2.0 data.
Stars: ✭ 124 (-21.52%)
Data Analysis
Mainly a summary of web scraping and data analysis projects, plus modeling, machine learning, and model evaluation.
Stars: ✭ 142 (-10.13%)
Datacompy
Pandas and Spark DataFrame comparison for humans
Stars: ✭ 147 (-6.96%)