WirbelsturmWirbelsturm is a Vagrant and Puppet based tool to perform 1-click local and remote deployments, with a focus on big data tech like Kafka.
Stars: ✭ 332 (-19.61%)
Kafka Storm StarterCode examples that show to integrate Apache Kafka 0.8+ with Apache Storm 0.9+ and Apache Spark Streaming 1.1+, while using Apache Avro as the data serialization format.
Stars: ✭ 728 (+76.27%)
Spark With PythonFundamentals of Spark with Python (using PySpark), code examples
Stars: ✭ 150 (-63.68%)
PixiedustPython Helper library for Jupyter Notebooks
Stars: ✭ 998 (+141.65%)
25daysinmachinelearningI will update this repository to learn Machine learning with python with statistics content and materials
Stars: ✭ 53 (-87.17%)
MachinelearningA repo with tutorials for algorithms from scratch
Stars: ✭ 96 (-76.76%)
Spark Py NotebooksApache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks
Stars: ✭ 1,338 (+223.97%)
Python BigdataData science and Big Data with Python
Stars: ✭ 112 (-72.88%)
Spark NotebookInteractive and Reactive Data Science using Scala and Spark.
Stars: ✭ 3,081 (+646%)
Machine learning refinedNotes, examples, and Python demos for the textbook "Machine Learning Refined" (published by Cambridge University Press).
Stars: ✭ 750 (+81.6%)
ZatZeek Analysis Tools (ZAT): Processing and analysis of Zeek network data with Pandas, scikit-learn, Kafka and Spark
Stars: ✭ 303 (-26.63%)
Covid19 DashboardA site that displays up to date COVID-19 stats, powered by fastpages.
Stars: ✭ 1,212 (+193.46%)
Openml RR package to interface with OpenML
Stars: ✭ 81 (-80.39%)
Ds and ml projectsData Science & Machine Learning projects and tutorials in python from beginner to advanced level.
Stars: ✭ 56 (-86.44%)
ArticlesA repository for the source code, notebooks, data, files, and other assets used in the data science and machine learning articles on LearnDataSci
Stars: ✭ 350 (-15.25%)
MydatascienceportfolioApplying Data Science and Machine Learning to Solve Real World Business Problems
Stars: ✭ 227 (-45.04%)
100 Days Of Ml CodeA day to day plan for this challenge. Covers both theoritical and practical aspects
Stars: ✭ 172 (-58.35%)
Awesome PulsarA curated list of Pulsar tools, integrations and resources.
Stars: ✭ 57 (-86.2%)
OryxOryx 2: Lambda architecture on Apache Spark, Apache Kafka for real-time large scale machine learning
Stars: ✭ 1,785 (+332.2%)
H2o 3H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Stars: ✭ 5,656 (+1269.49%)
Free Ai Resources🚀 FREE AI Resources - 🎓 Courses, 👷 Jobs, 📝 Blogs, 🔬 AI Research, and many more - for everyone!
Stars: ✭ 192 (-53.51%)
SkdataPython tools for data analysis
Stars: ✭ 16 (-96.13%)
DatacompyPandas and Spark DataFrame comparison for humans
Stars: ✭ 147 (-64.41%)
CartolaExtração de dados da API do CartolaFC, análise exploratória dos dados e modelos preditivos em R e Python - 2014-20. [EN] Data munging, analysis and modeling of CartolaFC - the most popular fantasy football game in Brazil and maybe in the world. Data cover years 2014-19.
Stars: ✭ 304 (-26.39%)
Optimus🚚 Agile Data Preparation Workflows made easy with dask, cudf, dask_cudf and pyspark
Stars: ✭ 986 (+138.74%)
Machine Learning From ScratchSuccinct Machine Learning algorithm implementations from scratch in Python, solving real-world problems (Notebooks and Book). Examples of Logistic Regression, Linear Regression, Decision Trees, K-means clustering, Sentiment Analysis, Recommender Systems, Neural Networks and Reinforcement Learning.
Stars: ✭ 42 (-89.83%)
Pyspark Cheatsheet🐍 Quick reference guide to common patterns & functions in PySpark.
Stars: ✭ 108 (-73.85%)
SuspeitandoProjeto de análise de contratos com suspeita de superfaturamento e má qualidade na prestação de serviços.
Stars: ✭ 76 (-81.6%)
W2vWord2Vec models with Twitter data using Spark. Blog:
Stars: ✭ 64 (-84.5%)
Ds With PysimpleguiData science and Machine Learning GUI programs/ desktop apps with PySimpleGUI package
Stars: ✭ 93 (-77.48%)
CodesearchnetDatasets, tools, and benchmarks for representation learning of code.
Stars: ✭ 1,378 (+233.66%)
Data Science Stack Cookiecutter🐳📊🤓Cookiecutter template to launch an awesome dockerized Data Science toolstack (incl. Jupyster, Superset, Postgres, Minio, AirFlow & API Star)
Stars: ✭ 153 (-62.95%)
Pulsar SparkWhen Apache Pulsar meets Apache Spark
Stars: ✭ 55 (-86.68%)
Spark Jupyter AwsA guide on how to set up Jupyter with Pyspark painlessly on AWS EC2 clusters, with S3 I/O support
Stars: ✭ 259 (-37.29%)
Data Science HacksData Science Hacks consists of tips, tricks to help you become a better data scientist. Data science hacks are for all - beginner to advanced. Data science hacks consist of python, jupyter notebook, pandas hacks and so on.
Stars: ✭ 273 (-33.9%)
Data Science Resources👨🏽🏫You can learn about what data science is and why it's important in today's modern world. Are you interested in data science?🔋
Stars: ✭ 171 (-58.6%)
Data science blogsA repository to keep track of all the code that I end up writing for my blog posts.
Stars: ✭ 139 (-66.34%)
Stats Maths With PythonGeneral statistics, mathematical programming, and numerical/scientific computing scripts and notebooks in Python
Stars: ✭ 381 (-7.75%)
Beyond Jupyter🐍💻📊 All material from the PyCon.DE 2018 Talk "Beyond Jupyter Notebooks - Building your own data science platform with Python & Docker" (incl. Slides, Video, Udemy MOOC & other References)
Stars: ✭ 135 (-67.31%)
Goodreads etl pipelineAn end-to-end GoodReads Data Pipeline for Building Data Lake, Data Warehouse and Analytics Platform.
Stars: ✭ 793 (+92.01%)
Awesome StreamlitThe purpose of this project is to share knowledge on how awesome Streamlit is and can be
Stars: ✭ 769 (+86.2%)
Model Describermodel-describer : Making machine learning interpretable to humans
Stars: ✭ 22 (-94.67%)
Griffon VmGriffon Data Science Virtual Machine
Stars: ✭ 128 (-69.01%)
Azkarra Streams🚀 Azkarra is a lightweight java framework to make it easy to develop, deploy and manage cloud-native streaming microservices based on Apache Kafka Streams.
Stars: ✭ 146 (-64.65%)