Awesome AWS Research
A curated list of awesome AWS workshops, open source repos, guides, blogs, and other resources aimed at research on AWS.
To find out more about how AWS works with academic researchers, in collaboration with the National Science Foundation (NSF) and the National Institutes of Health (NIH), in the fields of computer, biomedical, engineering, and information science, see AWS Research Initiatives.
In addition to the collaborations above, AWS provides Cloud Credits for Research.
Contents
Getting Started
Learn about the core services available for compute, storage, and networking, and how to control cost. This section covers Amazon Elastic Compute Cloud (EC2), Amazon Simple Storage Service (S3), networking fundamentals with AWS Transit Gateway, and cost control with AWS Budgets and Amazon CloudWatch alarms.
Guides
- Intro to AWS
- Open Guide to AWS
- Learn about AWS Global Infrastructure
- Open Data Registry on AWS
- Amazon Science
- AWS Training & Certification Free Cloud Essentials Digital Training
- AWS Innovation Sandbox
Tutorials
- Launch a Linux Virtual Machine
- Store and Retrieve a file
- Host a Static Website
- Controlling Costs with Free tier and Budgets
- Set up HIPAA Reference Architecture on AWS
- FHIRWorks on AWS
- Service Workbench on AWS
- HITRUST on AWS
Services
Whitepapers
- Overview of AWS Services
- Right sizing for cost optimization
- Cost Management
- Big Data & Analytics overview
- Storage services overview
- Deep Learning
- EC2 Spot Best Practices
Workshops
- AWS Account & Root User Basics
- Identity and Access Management
- Launch EC2 Spot Instances
- Amazon S3 & FSx for Lustre: Dive on high-performance file storage
Videos
- Amazon EC2 foundations
- AWS Outposts: Extend the AWS experience to on-premises environments
- Cost Optimization with Containers and Spot
- AWS networking fundamentals
- Monitoring the Earth without costing the world
- Get your data to AWS: How to choose and use data migration services
- Connectivity to AWS and hybrid AWS network architectures
- NIH uses the cloud to accelerate cures
- Accelerating Time to Science Using Cloud
- Enabling Research Using Cloud
HPC
In this section you will learn about the HPC services offered on AWS, such as cloud native scheduling with AWS Batch, traditional job schedulers like Slurm with AWS ParallelCluster, and features like Elastic Fabric Adapter (EFA) for scaling Message Passing Interface (MPI) and Machine Learning (ML) jobs in your cloud HPC clusters.
General Information
- HPC on AWS
- AWS Batch Landing Page
- AWS Batch Documentation
- AWS Batch Getting Started Guide
- ParallelCluster Landing Page
- ParallelCluster Documentation
- ParallelCluster Getting Started Guide
Workshops
- Introduction to ParallelCluster
- ParallelCluster with FSx for Lustre
- Introduction to AWS Batch with CARLA driving simulator
- Monte Carlo Simulations on AWS Batch with Spot
- AWS Batch Genomics Workflows with Cromwell and Nextflow
Blogs
- Building an interactive and scalable ML research environment using AWS ParallelCluster
- A Scientist's Guide to Cloud-HPC: Example with AWS ParallelCluster, Slurm, Spack, and WRF
Videos
- AWS infrastructure for large-scale training at Facebook AI
- Enabling Research using Hybrid HPC Cloud Computing
Tutorials
HPC Bursting
HTCondor
Slurm
Machine Learning
Learn about end-to-end machine learning resources on AWS like Amazon SageMaker, or leverage AI services like Amazon Comprehend for sentiment analysis, Amazon Transcribe for speech to text, and Amazon Translate for language translation, without needing to know how to build ML models.
General Information
- Landing Page
- SageMaker Documentation
- AWS Training & Certification Free Digital Training
- AWS Machine Learning Research Awards
- Coursera Getting Started with AWS Machine Learning
- AWSLabs Machine Learning Samples
- Augmented AI Example UIs
- Ground Truth data labeling UIs
Workshops
- Distributed Training with SageMaker and Horovod
- Distributed Training with EKS and Horovod
- FHIR Integration with Amazon Comprehend Medical
- Everything DeepRacer
- SpaceNet on SageMaker
- ETL Pipelines for SageMaker
- Amazon SageMaker Heart Disease Prediction
- MLOps with Amazon SageMaker
- Object Detection from scratch with Amazon SageMaker
- Elastic Inference Object Detection with Amazon SageMaker
- Document Understanding Solution
- AI-Powered Health Data Masking
Blogs
SageMaker
- AWS DataExchange and Amazon SageMaker for sharing data for ML workloads
- Semantic segmentation labeling with Amazon SageMaker Ground Truth
- Amazon SageMaker multi-model inference endpoints
- Annotate DICOM images and build an ML model
- Batch Inference with Amazon SageMaker and TensorFlow
- Optimizing TensorFlow model serving with Kubernetes and Amazon Elastic Inference
- Power contextual bandits using continual learning with Amazon SageMaker RL
- Speed up training on Amazon SageMaker using Amazon FSx for Lustre and Amazon EFS file systems
- Git integration with Amazon SageMaker
- Build end-to-end machine learning workflows with Amazon SageMaker and Apache Airflow
- Amazon SageMaker automatic model tuning now supports random search and hyperparameter scaling
- Architecting ML with Amazon SageMaker 3 day course
- Classification of chest x-rays with Amazon SageMaker
- Building predictive disease models using Amazon SageMaker with Amazon HealthLake normalized data
AI Services
- Custom Classifier with Amazon Comprehend
- Amazon Rekognition custom labels
- Applying voice classification in an Amazon Connect telemedicine contact flow
- Building a medical image search platform on AWS
- Build a custom entity recognizer using Amazon Comprehend
- De-identify medical images with the help of Amazon Comprehend Medical and Amazon Rekognition
- Map clinical notes to the OMOP Common Data Model and healthcare ontologies using Amazon Comprehend Medical
- Population health applications with Amazon HealthLake
- Introduction to Amazon Transcribe
- Introduction to Amazon Comprehend Medical
- Enabling efficient patient care using Amazon AI services
Videos
- Artificial Intelligence and Machine Learning in Research
- Deep learning for disaster management and response
- End-to-end machine learning using Spark and Amazon SageMaker
- Amazon SageMaker deep dive: A modular solution for machine learning
- Insights into patient health with Amazon Comprehend Medical
- Build accurate training datasets with Amazon SageMaker Ground Truth
- How Amazon SageMaker can help
Tutorials
Containers
AWS has a number of container offerings, such as Amazon Elastic Container Service (ECS) and Amazon Elastic Kubernetes Service (EKS). Learn how to use managed Kubernetes for Machine Learning, Analytics, and HPC workloads.
General Information
- AWS Training & Certification Free Digital Training
- SageMaker Operators for Kubernetes
- EKS Worker Node Drainer
Workshops
- Everything EKS Workshop
- EKS and Kops Kubernetes Networking
- EKS with Kubeflow
- Apache Spark on EKS
- EKS with Terraform and Kubeflow
Blogs
- GitOps with Weave Flux on EKS
- Kubernetes workloads on EC2 Spot with EKS
- Using the FSx for Lustre CSI Driver with Amazon EKS
- Deploy Apache Spark jobs on EKS
- Optimizing Distributed Deep Learning Performance on Amazon EKS
Videos
- Running Kubernetes Applications on AWS Fargate
- EKS under the hood
- Running Kubernetes at Amazon scale using Amazon EKS
- Building machine-learning infrastructure on Amazon EKS with Kubeflow
- Top 5 container and Kubernetes best practices
- Running Containers in a Hybrid Environment
Tutorials
Security
Security at AWS is job zero. Here you can learn about the available security services and how to implement workloads that minimize blast radius, integrate identity, and protect your storage workloads at rest and in transit with services like AWS Key Management Service (KMS), AWS Identity and Access Management (IAM), and Amazon GuardDuty for threat detection.
General Information
Workshops
- Introduction to Security
- Security Best Practices
- Protect Data at Rest
- Protect Data in Transit
- SAML with ADFS or Shibboleth
- Threat Detection with SageMaker and GuardDuty
Blogs
Videos
Tutorials
Robotics
With AWS RoboMaker, it is easy to enable a robot to stream data, navigate, communicate, comprehend, and learn. Tasks that once either could not be done or took months can now be done in hours or days. RoboMaker provides an IDE, a simulation service, fleet management capabilities, and seamless integration with various Amazon and AWS services to empower customers to innovate and provide best-of-class robotic solutions.
Workshops
- Build and operate a smart robot
- AWS RoboMaker Turtlebot
- Rise of the Machines: Bring Artificial Intelligence to Your Robot
- Finding Martians with AWS RoboMaker and the JPL Open Source Rover
- Voice controlled robots
Blogs
- An introduction to reinforcement learning with AWS RoboMaker
- Deploy Robotic Applications Using AWS RoboMaker
- Fresno State builds an autonomous Bulldog Bot with AWS RoboMaker
Videos
Management & Governance
Enable, provision, and operate at scale by using AWS Control Tower as your account management solution. Use AWS CloudFormation or the AWS Cloud Development Kit (CDK) to provision resources in your accounts, so that you can reproduce not only your research but also the environment it ran in.
General Information
Workshops
Blogs
Videos
- Architecting security & governance across your landing zone
- Deep dive into AWS Cloud Development Kit
Genomics
General Information
Workshops
Blogs
- A generalized approach to benchmarking genomics workloads in the cloud
- Using Amazon FSx for Lustre for Genomics Workflows on AWS
Videos
Tutorials
- Genomics Secondary Analysis Using AWS Step Functions and AWS Batch
- Hail on AWS
- Genomics Tertiary Analysis and Data Lakes Using AWS Glue and Amazon Athena
- Genomics Tertiary Analysis and Machine Learning Using Amazon SageMaker
Contribute
Contributions welcome! Read the contribution guidelines first.
License
To the extent possible under law, Randy Ridgley has waived all copyright and related or neighboring rights to this work.