Spark Jupyter Aws
A guide on how to set up Jupyter with PySpark painlessly on AWS EC2 clusters, with S3 I/O support
Stars: ✭ 259 (+1750%)
Covid Chestxray Dataset
We are building an open database of COVID-19 cases with chest X-ray or CT images.
Stars: ✭ 2,759 (+19607.14%)
Data science blogs
A repository to keep track of all the code that I end up writing for my blog posts.
Stars: ✭ 139 (+892.86%)
Cdap
An open source framework for building data analytic applications.
Stars: ✭ 509 (+3535.71%)
Covid19za
Coronavirus COVID-19 (2019-nCoV) Data Repository and Dashboard for South Africa
Stars: ✭ 208 (+1385.71%)
Weatherbench
A benchmark dataset for data-driven weather forecasting
Stars: ✭ 227 (+1521.43%)
Whylogs
Profile and monitor your ML data pipeline end-to-end
Stars: ✭ 328 (+2242.86%)
Enterprise gateway
A lightweight, multi-tenant, scalable and secure gateway that enables Jupyter Notebooks to share resources across distributed clusters such as Apache Spark, Kubernetes and others.
Stars: ✭ 412 (+2842.86%)
Coronawatchnl
Numbers concerning COVID-19 disease cases in The Netherlands by RIVM, LCPS, NICE, ECML, and Rijksoverheid.
Stars: ✭ 135 (+864.29%)
Datasets
🎁 3,000,000+ Unsplash images made available for research and machine learning
Stars: ✭ 1,805 (+12792.86%)
Handyspark
HandySpark - bringing pandas-like capabilities to Spark dataframes
Stars: ✭ 158 (+1028.57%)
Lacmus
Lacmus is a cross-platform application that helps to find people who are lost in the forest using computer vision and neural networks.
Stars: ✭ 142 (+914.29%)
Spark Practice
Apache Spark (PySpark) Practice on Real Data
Stars: ✭ 200 (+1328.57%)
Datasets
source{d} datasets ("big code") for source code analysis and machine learning on source code
Stars: ✭ 231 (+1550%)
Cifar 10.1
Release of CIFAR-10.1, a new test set for CIFAR-10.
Stars: ✭ 166 (+1085.71%)
Zat
Zeek Analysis Tools (ZAT): Processing and analysis of Zeek network data with Pandas, scikit-learn, Kafka and Spark
Stars: ✭ 303 (+2064.29%)
Covid19 twitter
Covid-19 Twitter dataset for non-commercial research use and pre-processing scripts - under active development
Stars: ✭ 304 (+2071.43%)
Comma2k19
A driving dataset for the development and validation of fused pose estimators and mapping algorithms
Stars: ✭ 391 (+2692.86%)
Dsprites Dataset
Dataset to assess the disentanglement properties of unsupervised learning methods
Stars: ✭ 340 (+2328.57%)
Spark Movie Lens
An on-line movie recommender using Spark, Python Flask, and the MovieLens dataset
Stars: ✭ 745 (+5221.43%)
Contactpose
Large dataset of hand-object contact, hand- and object-pose, and 2.9 M RGB-D grasp images.
Stars: ✭ 129 (+821.43%)
Know Your Intent
State of the Art results in Intent Classification using Semantic Hashing for three datasets: AskUbuntu, Chatbot and WebApplication.
Stars: ✭ 116 (+728.57%)
Protest Detection Violence Estimation
Implementation of the model used in the paper Protest Activity Detection and Perceived Violence Estimation from Social Media Images (ACM Multimedia 2017)
Stars: ✭ 114 (+714.29%)
Spark With Python
Fundamentals of Spark with Python (using PySpark), code examples
Stars: ✭ 150 (+971.43%)
Motion Sense
MotionSense Dataset for Human Activity and Attribute Recognition (time-series data generated by smartphone sensors: accelerometer and gyroscope)
Stars: ✭ 159 (+1035.71%)
Python Bigdata
Data science and Big Data with Python
Stars: ✭ 112 (+700%)
Data Science Resources
👨🏽‍🏫 You can learn about what data science is and why it's important in today's modern world. Are you interested in data science? 🔋
Stars: ✭ 171 (+1121.43%)
Shape Detection
🟣 Object detection of abstract shapes with neural networks
Stars: ✭ 170 (+1114.29%)
Trump Lies
Tutorial: Web scraping in Python with Beautiful Soup
Stars: ✭ 201 (+1335.71%)
Mydatascienceportfolio
Applying Data Science and Machine Learning to Solve Real World Business Problems
Stars: ✭ 227 (+1521.43%)
Taco
🌮 Trash Annotations in Context Dataset Toolkit
Stars: ✭ 243 (+1635.71%)
Tehran Stocks
A Python package to access tsetmc data
Stars: ✭ 282 (+1914.29%)
Data Science Hacks
Data Science Hacks consists of tips and tricks to help you become a better data scientist. The hacks are for everyone, from beginner to advanced, and cover Python, Jupyter Notebook, pandas, and more.
Stars: ✭ 273 (+1850%)
Vpgnet
VPGNet: Vanishing Point Guided Network for Lane and Road Marking Detection and Recognition (ICCV 2017)
Stars: ✭ 382 (+2628.57%)
Medmnist
[ISBI'21] MedMNIST Classification Decathlon: A Lightweight AutoML Benchmark for Medical Image Analysis
Stars: ✭ 338 (+2314.29%)
Agile data code 2
Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition
Stars: ✭ 413 (+2850%)
Helk
The Hunting ELK
Stars: ✭ 3,097 (+22021.43%)
Caffenet Benchmark
Evaluation of the CNN design choices performance on ImageNet-2012.
Stars: ✭ 700 (+4900%)
H2o 3
H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Stars: ✭ 5,656 (+40300%)
Covid Ct
COVID-CT-Dataset: A CT Scan Dataset about COVID-19
Stars: ✭ 820 (+5757.14%)
Imagenetv2
A new test set for ImageNet
Stars: ✭ 109 (+678.57%)
Dataset Api
The ApolloScape Open Dataset for Autonomous Driving and its Application.
Stars: ✭ 260 (+1757.14%)
Justenoughscalaforspark
A tutorial on the most important features and idioms of Scala that you need to use Spark's Scala APIs.
Stars: ✭ 538 (+3742.86%)
Mobius
C# and F# language binding and extensions to Apache Spark
Stars: ✭ 929 (+6535.71%)