Geopython: Notebooks and libraries for spatial/geo Python explorations
Stars: ✭ 268 (+69.62%)
Helk: The Hunting ELK
Stars: ✭ 3,097 (+1860.13%)
Spark Syntax: This is a repo documenting the best practices in PySpark.
Stars: ✭ 412 (+160.76%)
Devops Python Tools: 80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Function, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc.
Stars: ✭ 406 (+156.96%)
Pytablewriter: pytablewriter is a Python library to write a table in various formats: CSV / Elasticsearch / HTML / JavaScript / JSON / LaTeX / LDJSON / LTSV / Markdown / MediaWiki / NumPy / Excel / Pandas / Python / reStructuredText / SQLite / TOML / TSV.
Stars: ✭ 422 (+167.09%)
basin: Basin is a visual programming editor for building Spark and PySpark pipelines. Easily build, debug, and deploy complex ETL pipelines from your browser.
Stars: ✭ 25 (-84.18%)
Justenoughscalaforspark: A tutorial on the most important features and idioms of Scala that you need to use Spark's Scala APIs.
Stars: ✭ 538 (+240.51%)
Data Science Your Way: Ways of doing Data Science Engineering and Machine Learning in R and Python
Stars: ✭ 530 (+235.44%)
Bamboolib: bamboolib - a GUI for pandas DataFrames
Stars: ✭ 622 (+293.67%)
Or Pandas: [运筹OR帷幄 | Data Science] pandas tutorial e-book series
Stars: ✭ 492 (+211.39%)
Jdata: Starter code for the JD.com JData Algorithm Competition task of predicting the purchase intent of high-potential users
Stars: ✭ 662 (+318.99%)
prosto: Prosto is a data processing toolkit that radically changes how data is processed by relying heavily on functions and operations with functions, as an alternative to map-reduce and join-groupby
Stars: ✭ 54 (-65.82%)
Quickviz: Visualize a pandas dataframe in a few clicks
Stars: ✭ 18 (-88.61%)
Lux: Python API for Intelligent Visual Data Discovery
Stars: ✭ 787 (+398.1%)
Pbpython: Code, Notebooks and Examples from Practical Business Python
Stars: ✭ 1,724 (+991.14%)
Sparkling Titanic: Training models with Apache Spark and PySpark for the Titanic Kaggle competition
Stars: ✭ 12 (-92.41%)
Crime Analysis: Association Rule Mining from Spatial Data for Crime Analysis
Stars: ✭ 20 (-87.34%)
Spark Movie Lens: An on-line movie recommender using Spark, Python Flask, and the MovieLens dataset
Stars: ✭ 745 (+371.52%)
Gdeltpyr: Python-based framework to retrieve Global Database of Events, Language, and Tone (GDELT) version 1.0 and version 2.0 data.
Stars: ✭ 124 (-21.52%)
Jupyter Datatables: Jupyter Notebook extension leveraging pandas DataFrames by integrating DataTables and ChartJS.
Stars: ✭ 127 (-19.62%)
Ds and ml projects: Data Science & Machine Learning projects and tutorials in Python from beginner to advanced level.
Stars: ✭ 56 (-64.56%)
Hops Examples: Examples for Deep Learning/Feature Store/Spark/Flink/Hive/Kafka jobs and Jupyter notebooks on Hops
Stars: ✭ 84 (-46.84%)
Spark Nlp Models: Models and Pipelines for the Spark NLP library
Stars: ✭ 88 (-44.3%)
Alphalens: Performance analysis of predictive (alpha) stock factors
Stars: ✭ 2,130 (+1248.1%)
Pydata Pandas Workshop: Material for my PyData Jupyter & Pandas Workshops. I'm also available for personal in-house trainings on request.
Stars: ✭ 65 (-58.86%)
Practical Machine Learning With Python: Master the essential skills needed to recognize and solve complex real-world problems with Machine Learning and Deep Learning by leveraging the highly popular Python Machine Learning ecosystem.
Stars: ✭ 1,868 (+1082.28%)
Data Analysis: Mainly a summary of web-crawling and data analysis projects, plus modeling, machine learning, and model evaluation.
Stars: ✭ 142 (-10.13%)
Maps Location History: Get, concatenate and process your location history from Google Maps Timeline
Stars: ✭ 99 (-37.34%)
Pyspark Cheatsheet: 🐍 Quick reference guide to common patterns & functions in PySpark.
Stars: ✭ 108 (-31.65%)
Hnswlib: Java library for approximate nearest neighbors search using Hierarchical Navigable Small World graphs
Stars: ✭ 108 (-31.65%)
Python Bigdata: Data science and Big Data with Python
Stars: ✭ 112 (-29.11%)
Python: Jupyter notebooks and datasets for the interesting pandas/python/data science video series.
Stars: ✭ 65 (-58.86%)
Pandas Videos: Jupyter notebook and datasets from the pandas Q&A video series
Stars: ✭ 1,716 (+986.08%)
Cape Python: Collaborate on privacy-preserving policy for data science projects in Pandas and Apache Spark
Stars: ✭ 125 (-20.89%)
Ibis: A pandas-like deferred expression system, with first-class SQL support
Stars: ✭ 1,630 (+931.65%)
Repo 2019: BERT, AWS RDS, AWS Forecast, EMR Spark Cluster, Hive, Serverless, Google Assistant + Raspberry Pi, Infrared, Google Cloud Platform Natural Language, Anomaly detection, Tensorflow, Mathematics
Stars: ✭ 133 (-15.82%)
Data science blogs: A repository to keep track of all the code that I end up writing for my blog posts.
Stars: ✭ 139 (-12.03%)
Dat8: General Assembly's 2015 Data Science course in Washington, DC
Stars: ✭ 1,516 (+859.49%)
Stock Price Predictor: This project uses a deep learning model, the Long Short-Term Memory (LSTM) neural network, to predict stock prices.
Stars: ✭ 146 (-7.59%)
visions: Type System for Data Analysis in Python
Stars: ✭ 136 (-13.92%)
spark-extension: A library that provides useful extensions to Apache Spark and PySpark.
Stars: ✭ 25 (-84.18%)
Seaborn Tutorial: This repository is my attempt to help Data Science aspirants gain the Data Visualization skills they need to progress in their careers. It includes all the plot types offered by Seaborn, applied to random datasets.
Stars: ✭ 114 (-27.85%)