Pulsar Spark: When Apache Pulsar meets Apache Spark
Stars: ✭ 55 (-76.19%)
Agile data code 2: Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition
Stars: ✭ 413 (+78.79%)
Dist Keras: Distributed Deep Learning, with a focus on distributed training, using Keras and Apache Spark.
Stars: ✭ 613 (+165.37%)
Dataflowjavasdk: Google Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines.
Stars: ✭ 854 (+269.7%)
Data Science On Gcp: Source code accompanying the book Data Science on the Google Cloud Platform, Valliappa Lakshmanan, O'Reilly 2017
Stars: ✭ 864 (+274.03%)
Spark Notebook: Interactive and Reactive Data Science using Scala and Spark.
Stars: ✭ 3,081 (+1233.77%)
Scalable Data Science: Course sets in big data using Apache Spark on Databricks, with their mathematical, statistical, and computational foundations using SageMath.
Stars: ✭ 142 (-38.53%)
Hub: Dataset format for AI. Build, manage, & visualize datasets for deep learning. Stream data in real time to PyTorch/TensorFlow & version-control it. https://activeloop.ai
Stars: ✭ 4,003 (+1632.9%)
Griffon Vm: Griffon Data Science Virtual Machine
Stars: ✭ 128 (-44.59%)
Collapse: Advanced and Fast Data Transformation in R
Stars: ✭ 184 (-20.35%)
Covid19za: Coronavirus COVID-19 (2019-nCoV) Data Repository and Dashboard for South Africa
Stars: ✭ 208 (-9.96%)
Eli5: A library for debugging/inspecting machine learning classifiers and explaining their predictions
Stars: ✭ 2,477 (+972.29%)
Scihub: Source code and data analyses for the Sci-Hub Coverage Study
Stars: ✭ 205 (-11.26%)
Streamlit: The fastest way to build data apps in Python
Stars: ✭ 16,906 (+7218.61%)
Python For Data Science: A collection of Jupyter Notebooks for learning Python for Data Science.
Stars: ✭ 205 (-11.26%)
Estadistica Con R: Personal notes on statistics, machine learning, and the R programming language
Stars: ✭ 201 (-12.99%)
Ml Workspace: Machine Learning (Beginners Hub). Information (courses, books, cheat sheets, live sessions) related to machine learning, data science, and Python.
Stars: ✭ 221 (-4.33%)
Lightautoml: LAMA - automatic model creation framework
Stars: ✭ 196 (-15.15%)
Cml: ♾️ CML - Continuous Machine Learning | CI/CD for ML
Stars: ✭ 2,843 (+1130.74%)
Gspread Pandas: A package to easily open an instance of a Google spreadsheet and interact with worksheets through Pandas DataFrames.
Stars: ✭ 226 (-2.16%)
Gwu data mining: Materials for GWU DNSC 6279 and DNSC 6290.
Stars: ✭ 217 (-6.06%)
Trump Lies: Tutorial: Web scraping in Python with Beautiful Soup
Stars: ✭ 201 (-12.99%)
Cartoframes: CARTO Python package for data scientists
Stars: ✭ 208 (-9.96%)
Flaml: A fast and lightweight AutoML library.
Stars: ✭ 205 (-11.26%)
Dash: Analytical Web Apps for Python, R, Julia, and Jupyter. No JavaScript Required.
Stars: ✭ 15,592 (+6649.78%)
Compose: A machine learning tool for automated prediction engineering. It allows you to easily structure prediction problems and generate labels for supervised learning.
Stars: ✭ 203 (-12.12%)
Statistical Learning: Lecture Slides and R Sessions for Trevor Hastie and Rob Tibshirani's "Statistical Learning" Stanford course
Stars: ✭ 223 (-3.46%)
Tsfel: An intuitive library to extract features from time series
Stars: ✭ 202 (-12.55%)
Mydatascienceportfolio: Applying Data Science and Machine Learning to Solve Real World Business Problems
Stars: ✭ 227 (-1.73%)
Laurae: Advanced High Performance Data Science Toolbox for R by Laurae
Stars: ✭ 203 (-12.12%)
Mmlspark: Simple and Distributed Machine Learning
Stars: ✭ 2,899 (+1154.98%)
Elastic: R client for the Elasticsearch HTTP API
Stars: ✭ 227 (-1.73%)
Instascrape: Powerful and flexible Instagram scraping library for Python, providing easy-to-use and expressive tools for accessing data programmatically
Stars: ✭ 202 (-12.55%)
Amazing Feature Engineering: Feature engineering is the process of using domain knowledge to extract features from raw data via data mining techniques. These features can be used to improve the performance of machine learning algorithms. Feature engineering can itself be considered applied machine learning.
Stars: ✭ 218 (-5.63%)
Fastpages: An easy to use blogging platform, with enhanced support for Jupyter Notebooks.
Stars: ✭ 2,888 (+1150.22%)
Webstruct: NER toolkit for HTML data
Stars: ✭ 230 (-0.43%)
Quinn: PySpark methods to enhance developer productivity 📣 👯 🎉
Stars: ✭ 217 (-6.06%)
Achoo: Achoo uses a Raspberry Pi to predict whether my son will need his inhaler on any given day, using weather, pollen, and air quality data. If the prediction for a given day is above a specified threshold, the Pi will email his school nurse and me, notifying us that he may need preemptive treatment. Community-sourced health monitoring!
Stars: ✭ 200 (-13.42%)
Lale: Library for Semi-Automated Data Science
Stars: ✭ 198 (-14.29%)
Cardio: CardIO is a library for data science research of heart signals
Stars: ✭ 218 (-5.63%)
Radio: RadIO is a library for data science research of computed tomography imaging
Stars: ✭ 198 (-14.29%)
Chord: Python package for creating beautiful interactive Chord Diagrams. Pro version available at https://m8.fyi/chord
Stars: ✭ 217 (-6.06%)
Analytics Zoo: Distributed Tensorflow, Keras and PyTorch on Apache Spark/Flink & Ray
Stars: ✭ 2,448 (+959.74%)
Machine Learning Resources: A curated list of awesome machine learning frameworks, libraries, courses, books and many more.
Stars: ✭ 226 (-2.16%)
Tutorials: AI-related tutorials. Access any of them for free → https://towardsai.net/editorial
Stars: ✭ 204 (-11.69%)
Cql: Categorical Query Language IDE
Stars: ✭ 196 (-15.15%)
Climate Change Data: 🌍 A curated list of APIs, open data and ML/AI projects on climate change
Stars: ✭ 195 (-15.58%)
Sparkrdma: RDMA accelerated, high-performance, scalable and efficient ShuffleManager plugin for Apache Spark
Stars: ✭ 215 (-6.93%)
Tad: A desktop application for viewing and analyzing tabular data
Stars: ✭ 2,275 (+884.85%)