aws-samples / cloud-experiments

Licence: Apache-2.0 License
Open innovation with 60 minute cloud experiments on AWS

Programming language: Jupyter Notebook

Projects that are alternatives of or similar to cloud-experiments

document-processing-pipeline-for-regulated-industries
A boilerplate solution for processing image and PDF documents for regulated industries, with lineage and pipeline operations metadata services.
Stars: ✭ 36 (-50%)
Mutual labels:  amazon-s3, amazon-comprehend
Aws Data Wrangler
Pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
Stars: ✭ 2,385 (+3212.5%)
Mutual labels:  amazon-athena, aws-glue
analyzing-reddit-sentiment-with-aws
Learn how to use Kinesis Firehose, AWS Glue, S3, and Amazon Athena by streaming and analyzing reddit comments in realtime. 100-200 level tutorial.
Stars: ✭ 40 (-44.44%)
Mutual labels:  amazon-athena, aws-glue
automating-livestream-video-monitoring
This repo presents a demo application for realtime livestream video quality monitoring using AWS serverless and AI/ML services.
Stars: ✭ 20 (-72.22%)
Mutual labels:  amazon-rekognition
lightning-tutorials
Collection of PyTorch Lightning tutorials, written as rich scripts and automatically transformed into IPython notebooks.
Stars: ✭ 145 (+101.39%)
Mutual labels:  notebooks
python-for-excel
This is the companion repo of the O'Reilly book "Python for Excel".
Stars: ✭ 253 (+251.39%)
Mutual labels:  notebooks
examples
Example nteract notebooks with links to execution on mybinder.org
Stars: ✭ 24 (-66.67%)
Mutual labels:  notebooks
WebDAVServerSamplesJava
WebDAV server examples in Java based on IT Hit WebDAV Server Library for Java
Stars: ✭ 38 (-47.22%)
Mutual labels:  amazon-s3
demo-code
Bits of code I use during live demos
Stars: ✭ 18 (-75%)
Mutual labels:  amazon-athena
xlines
X lines of Python
Stars: ✭ 100 (+38.89%)
Mutual labels:  notebooks
fluent-bit-go-s3
[Deprecated] The predecessor of the fluent-bit output plugin for Amazon S3. https://aws.amazon.com/s3/
Stars: ✭ 34 (-52.78%)
Mutual labels:  amazon-s3
amazon-sagemaker-mlops-workshop
MLOps workshop with Amazon SageMaker
Stars: ✭ 39 (-45.83%)
Mutual labels:  amazon-sagemaker
serverless data pipeline example
Build and Deploy A Serverless Data Pipeline on AWS
Stars: ✭ 24 (-66.67%)
Mutual labels:  aws-glue
amazon-eventbridge-cdk-audit-service-sample
Sample of a decoupled audit service using Amazon EventBridge and AWS Step Functions. Provisioned with AWS CDK.
Stars: ✭ 25 (-65.28%)
Mutual labels:  amazon-s3
DominicanWhoCodes
DominicanWho.Codes App
Stars: ✭ 58 (-19.44%)
Mutual labels:  amazon-s3
artefactory-connectors-kit
ACK is an E(T)L tool specialized in API data ingestion. It is accessible through a Command-Line Interface. The application allows you to easily extract, stream and load data (with minimum transformations), from the API source to the destination of your choice.
Stars: ✭ 34 (-52.78%)
Mutual labels:  amazon-s3
scalikejdbc-athena
Library for using Amazon Athena JDBC Driver with ScalikeJDBC
Stars: ✭ 19 (-73.61%)
Mutual labels:  amazon-athena
Introduction to ML with TF2
A repo that gives a hands-on introduction to machine learning using TensorFlow 2.0
Stars: ✭ 16 (-77.78%)
Mutual labels:  notebooks
mash 2016 sklearn intro
Material for the MASH course on introduction to scikit-learn
Stars: ✭ 16 (-77.78%)
Mutual labels:  notebooks
amazon-rekognition-engagement-meter
The Engagement Meter calculates and shows engagement levels of an audience participating in a meeting
Stars: ✭ 49 (-31.94%)
Mutual labels:  amazon-rekognition

Cloud Experiments

Sample notebooks, starter apps, and low/no code guides for building and running open innovation experiments on AWS Cloud within 60 minutes.

Cloud experiments follow a step-by-step workflow for performing analytics, machine learning, AI, and data science on AWS Cloud. We present guidance on using AWS Cloud programmatically or visually through the console, introduce relevant AWS services, explain the code, review the code outputs, evaluate alternative steps in the workflow, and ultimately design an abstracted, reusable API for rapidly deploying these experiments on AWS Cloud.

Documentation: Why Cloud Experiments | What Are Cloud Experiments

Cloud Experiments: Guides | Exploratory Data Apps | Notebooks

Guides

All you need to run these experiments is access to an AWS Console from your web browser.

Flying Cars with Glue DataBrew

Smarter cities will have smarter transportation: multi-modal, eco-friendly, and balancing commuter convenience with safety and social distancing. This 60-minute experiment uses open data for good and low/no code services provided by AWS to enable insights for business model innovation in a smart transport use case. The experiment is intended as a step-by-step guided co-innovation and design workshop run together with an AWS specialist. If you are familiar with the prerequisites specified in the Cloud Experiment Guide (last section of this experiment), feel free to make this experiment your own.

Exploratory Data Apps

We use Streamlit for many experiments in this repository. Streamlit is an open-source Python library that turns data scripts into shareable web apps in minutes, with no front-end experience required.

These three steps will set you up to run the experiments on your laptop.

Step 1: Set up an AWS IAM user with programmatic access.

Step 2: Install AWS Shell and configure IAM credentials.

pip install aws-shell
aws-shell
aws> configure

Step 3: Install Streamlit. Clone repo. Add path to API. Run app.

pip install streamlit
git clone https://github.com/aws-samples/cloud-experiments
export PYTHONPATH="$HOME/WhereYouClonedRepo/cloud-experiments"
streamlit run cloud-experiments/experiments/data-apps/open_data_explorer/s3_app.py

Open Data Explorer

Apps and API for exploring open data sources, including the AWS Registry of Open Data, which lists datasets for genomics, satellite, transport, COVID, medical imaging, and other use cases in data for social good.

COVID EDA and Models

Experiments for running exploratory data analysis (EDA) and models on COVID-related open datasets. This includes a Case Fatality Rate model on country data from Johns Hopkins. EDA techniques include growth factor analysis, case growth rate, doubling rate, recovery and mortality rate, and country-wise analysis.
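
As a rough illustration of the EDA measures named above, the sketch below computes a growth factor, a doubling time, and a case fatality rate from a cumulative case series. The function names, the formulas chosen (ratio of consecutive daily new cases, exponential-growth doubling time), and the sample numbers are assumptions made for this example; they are not taken from the repository's notebooks or API.

```python
import math


def growth_factor(cumulative):
    """Ratio of each day's new cases to the previous day's new cases."""
    new_cases = [b - a for a, b in zip(cumulative, cumulative[1:])]
    return [
        today / yesterday if yesterday > 0 else float("nan")
        for yesterday, today in zip(new_cases, new_cases[1:])
    ]


def doubling_time(cumulative, window=7):
    """Days needed for cases to double, assuming exponential growth
    over the last `window` days of the series."""
    start, end = cumulative[-window - 1], cumulative[-1]
    if start <= 0 or end <= start:
        return float("inf")  # no growth, so cases never double
    daily_rate = math.log(end / start) / window
    return math.log(2) / daily_rate


def case_fatality_rate(deaths, confirmed):
    """Deaths as a fraction of confirmed cases (0.0 if there are no cases)."""
    return deaths / confirmed if confirmed else 0.0


# Made-up cumulative counts, purely for illustration.
cases = [100, 200, 400, 800]
factors = growth_factor(cases)           # [2.0, 2.0]
days = doubling_time(cases, window=3)    # about one day for this series
cfr = case_fatality_rate(deaths=5, confirmed=200)
```

A growth factor that stays above 1 means daily new cases are still rising, which is why it is often tracked alongside the doubling time.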

Notebooks

You may want to run these notebooks using Amazon SageMaker. Amazon SageMaker is a fully-managed service that covers the entire machine learning workflow to label and prepare your data, choose an algorithm, train the model, tune and optimize it for deployment, make predictions, and take action.

COVID Insights

This experiment provides a catalog of open datasets for deriving insights related to COVID-19 and helping the open source and open data communities collaborate in fighting this global threat. The notebook provides (a) a reusable API to speed up open data analytics related to COVID-19, customized for India but adaptable to other countries, (b) sample usage of the API, (c) documentation of insights, and (d) a catalog of the open datasets referenced.

Comprehend Medical for Electronic Health Records

Amazon Comprehend Medical is an API-level service which is HIPAA eligible and uses machine learning to extract medical information with high accuracy. The service eliminates the barriers to entry to access the biomedical knowledge stored in natural language text - from the research literature that entails biological processes and therapeutic mechanisms of action to the Electronic Medical Records that have the patients’ journeys through our healthcare systems documented. The service also helps us comb through that information and study relationships like symptoms, diagnosis, medication, dosage while redacting the Protected Health Information (PHI). This is an illustrative notebook that includes a step-by-step workflow for analyzing health data on the cloud.
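
The PHI redaction mentioned above can be sketched as follows. This is a minimal example, not the notebook's code: `redact_phi` and `detect_and_redact` are hypothetical helper names, the region default is an assumption, and the sample text and offsets in the usage note are invented. The only AWS call used is the boto3 `comprehendmedical` client's `detect_phi`, whose response lists entities with `BeginOffset`, `EndOffset`, and `Type` fields.

```python
def redact_phi(text, entities):
    """Replace each PHI span in `text` with a placeholder such as [NAME].

    `entities` is a list of dicts with BeginOffset, EndOffset and Type keys,
    the shape returned under "Entities" by Comprehend Medical's detect_phi.
    """
    redacted = text
    # Replace from right to left so earlier offsets stay valid.
    for ent in sorted(entities, key=lambda e: e["BeginOffset"], reverse=True):
        placeholder = "[" + ent["Type"] + "]"
        redacted = (
            redacted[: ent["BeginOffset"]] + placeholder + redacted[ent["EndOffset"]:]
        )
    return redacted


def detect_and_redact(text, region_name="us-east-1"):
    """Call Comprehend Medical and redact whatever PHI it reports.

    Requires boto3 and AWS credentials; the region is an assumption.
    """
    import boto3  # imported lazily so redact_phi works without AWS access

    client = boto3.client("comprehendmedical", region_name=region_name)
    response = client.detect_phi(Text=text)
    return redact_phi(text, response["Entities"])
```

For example, with hand-written entities for the text "Patient John Doe, age 45." (a NAME span at offsets 8 to 16 and an AGE span at 22 to 24), `redact_phi` returns "Patient [NAME], age [AGE].".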

Video Analytics

Analyzing video-based content requires transforming one media format (video or audio) into another (text or numeric) while identifying relevant structure in the resulting format. This multimedia transformation requires machine-learning-based recognition. Analytics libraries can then work on the transformed data to produce the required outcomes, including visualizations and charts. The structured data in text or numeric format can also be reused as input for training new machine learning models.

Using AI Services for Analyzing Public Data

So far we have been working with structured data in flat files as our data source. What if the source is images and unstructured text? AWS AI services provide vision, transcription, translation, personalization, and forecasting capabilities without the need to train and deploy machine learning models. AWS manages the machine learning complexity; you focus on the problem at hand, send the required inputs for analysis, and receive the output from these services within your applications.

Exploring data with Python and Amazon S3 Select

For this notebook, let us start with a big open dataset, big enough that we would struggle to open it in Excel on a laptop. Excel has a limit of about one million rows (1,048,576) per worksheet. We will set up AWS services to read from a 270GB data source, filter and store more than 8 million rows (about 100 million data points) in a flat file, extract the schema from this file, transform the data, load it into analytics tools, and run Structured Query Language (SQL) queries on it.
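
To give a feel for the filtering step, here is a minimal sketch of an S3 Select call through boto3. It is an illustration under assumptions rather than the notebook's code: `build_select_request` and `run_select` are hypothetical helpers, and the bucket, key, and query in the usage note are placeholders. The request fields are those of the boto3 S3 client's `select_object_content` method.

```python
def build_select_request(bucket, key, sql, compression="NONE"):
    """Build the keyword arguments for S3 select_object_content.

    Assumes the object is a CSV file whose first row holds column names.
    """
    return {
        "Bucket": bucket,
        "Key": key,
        "ExpressionType": "SQL",
        "Expression": sql,
        "InputSerialization": {
            "CSV": {"FileHeaderInfo": "USE"},  # lets the SQL use header names
            "CompressionType": compression,    # "NONE", "GZIP" or "BZIP2"
        },
        "OutputSerialization": {"CSV": {}},
    }


def run_select(request):
    """Run the query and return the matching rows as CSV text.

    Requires boto3 and AWS credentials with read access to the object.
    """
    import boto3  # imported lazily so the request builder works offline

    s3 = boto3.client("s3")
    response = s3.select_object_content(**request)
    # The payload is an event stream; only "Records" events carry data.
    chunks = [
        event["Records"]["Payload"]
        for event in response["Payload"]
        if "Records" in event
    ]
    return b"".join(chunks).decode("utf-8")
```

A call such as `run_select(build_select_request("my-bucket", "data/trips.csv", "SELECT * FROM s3object s LIMIT 5"))` would return the first five rows, so only the filtered result travels to the laptop rather than the whole file.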

Optimizing data for analysis with Amazon Athena and AWS Glue

We will continue our open data analytics workflow starting with the AWS Console then moving to using the notebook. Using AWS Glue we can automate creating a metadata catalog based on flat files stored on Amazon S3. Glue is a fully managed extract, transform, and load (ETL) service that makes it easy for customers to prepare and load their data for analytics. You can create and run an ETL job with a few clicks in the AWS Management Console. You simply point AWS Glue to your data stored on AWS, and AWS Glue discovers your data and stores the associated metadata (e.g. table definition and schema) in the AWS Glue Data Catalog. Once cataloged, your data is immediately searchable, queryable, and available for ETL.
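
Once a table exists in the Glue Data Catalog, it can be queried from Python through Amazon Athena. The sketch below is one possible way to do this with boto3 and is not taken from the notebook: `build_athena_query` and `run_athena_query` are hypothetical helpers, and the database name, table name, and S3 output location are placeholders you would supply.

```python
import time


def build_athena_query(database, table, limit=10):
    """Return a simple preview query for a cataloged table."""
    return f'SELECT * FROM "{database}"."{table}" LIMIT {int(limit)}'


def run_athena_query(sql, database, output_location, poll_seconds=1.0):
    """Start an Athena query, wait for it to finish, and return result rows.

    `output_location` is an S3 URI where Athena writes its result files.
    Requires boto3 and AWS credentials with Athena, Glue and S3 access.
    """
    import boto3  # imported lazily so the query builder works offline

    athena = boto3.client("athena")
    execution_id = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )["QueryExecutionId"]

    # Athena runs queries asynchronously, so poll until a terminal state.
    while True:
        execution = athena.get_query_execution(QueryExecutionId=execution_id)
        state = execution["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(poll_seconds)

    if state != "SUCCEEDED":
        raise RuntimeError(f"Athena query ended in state {state}")
    return athena.get_query_results(QueryExecutionId=execution_id)["ResultSet"]["Rows"]
```

Polling keeps the example short; for long-running queries a larger `poll_seconds` value avoids unnecessary API calls.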

Cloudstory API

Cloudstory API Python module and demo of using the API. The cloudstory API is documented in the other notebooks listed here.

License

This library is licensed under the Apache 2.0 License.
