MarwanDebbiche / Post Tuto Deployment

Licence: mit
Build and deploy a machine learning app from scratch 🚀

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to Post Tuto Deployment

Json Serverless
Transform a JSON file into a serverless REST API in AWS cloud
Stars: ✭ 108 (-70.65%)
Mutual labels:  api, aws, deployment
Seleniumcrawler
An example using Selenium webdrivers for python and Scrapy framework to create a web scraper to crawl an ASP site
Stars: ✭ 117 (-68.21%)
Mutual labels:  scrapy, scraping, selenium
RARBG-scraper
With Selenium headless browsing and CAPTCHA solving
Stars: ✭ 38 (-89.67%)
Mutual labels:  scraping, selenium, scrapy
Up
Up focuses on deploying "vanilla" HTTP servers so there's nothing new to learn, just develop with your favorite existing frameworks such as Express, Koa, Django, Golang net/http or others.
Stars: ✭ 8,439 (+2193.21%)
Mutual labels:  api, aws, deployment
InstaBot
Simple and friendly Bot for Instagram, using Selenium and Scrapy with Python.
Stars: ✭ 32 (-91.3%)
Mutual labels:  scraping, selenium, scrapy
scrapy facebooker
Collection of scrapy spiders which can scrape posts, images, and so on from public Facebook Pages.
Stars: ✭ 22 (-94.02%)
Mutual labels:  scraping, scrapy
XMQ-BackUp
ε°ε―†εœˆε€‡δ»½οΌŒεœˆε­/话钘/图片/文仢。
Stars: ✭ 22 (-94.02%)
Mutual labels:  selenium, scrapy
memes-api
API for scraping common meme sites
Stars: ✭ 17 (-95.38%)
Mutual labels:  scraping, scrapy
ARGUS
ARGUS is an easy-to-use web scraping tool. The program is based on the Scrapy Python framework and is able to crawl a broad range of different websites. On the websites, ARGUS is able to perform tasks like scraping texts or collecting hyperlinks between websites. See: https://link.springer.com/article/10.1007/s11192-020-03726-9
Stars: ✭ 68 (-81.52%)
Mutual labels:  scraping, scrapy
extensiveautomation-server
Extensive Automation server
Stars: ✭ 19 (-94.84%)
Mutual labels:  deployment, selenium
policy-data-analyzer
Building a model to recognize incentives for landscape restoration in environmental policies from Latin America, the US and India. Bringing NLP to the world of policy analysis through an extensible framework that includes scraping, preprocessing, active learning and text analysis pipelines.
Stars: ✭ 22 (-94.02%)
Mutual labels:  scraping, scrapy
schedule-tweet
Schedules tweets using TweetDeck
Stars: ✭ 14 (-96.2%)
Mutual labels:  scraping, selenium
proxi
Proxy pool. Finds and checks proxies with rest api for querying results. Can find over 25k proxies in under 5 minutes.
Stars: ✭ 32 (-91.3%)
Mutual labels:  scraping, scrapy
scrapy-distributed
A series of distributed components for Scrapy. Including RabbitMQ-based components, Kafka-based components, and RedisBloom-based components for Scrapy.
Stars: ✭ 38 (-89.67%)
Mutual labels:  scraping, scrapy
scrapy-zyte-smartproxy
Zyte Smart Proxy Manager (formerly Crawlera) middleware for Scrapy
Stars: ✭ 317 (-13.86%)
Mutual labels:  scraping, scrapy
chesf
CHeSF is the Chrome Headless Scraping Framework, a very very alpha code to scrape javascript intensive web pages
Stars: ✭ 18 (-95.11%)
Mutual labels:  scraping, selenium
bots-zoo
No description or website provided.
Stars: ✭ 59 (-83.97%)
Mutual labels:  scraping, selenium
Vue Cli Plugin S3 Deploy
A vue-cli plugin that uploads your built Vue.js project to an S3 bucket
Stars: ✭ 304 (-17.39%)
Mutual labels:  aws, deployment
Edu Mail Generator
Generate Free Edu Mail(s) within minutes
Stars: ✭ 301 (-18.21%)
Mutual labels:  scraping, selenium
Linkedin
Linkedin Scraper using Selenium Web Driver, Chromium headless, Docker and Scrapy
Stars: ✭ 309 (-16.03%)
Mutual labels:  scrapy, scraping

End 2 End Machine Learning: From Data Collection to Deployment 🚀

On this project, I collaborated with Ahmed BESBES.

Medium post here.

You may also read about it here and here.

In this post, we'll go through the steps needed to build and deploy a machine learning application, from data collection to deployment. The journey, as you'll see, is exciting and fun. 😀

Before we begin, let's have a look at the app we'll build:

As you can see, this web app lets a user evaluate random brands by writing reviews. While writing, the user sees the sentiment score of their input update in real time, alongside a proposed rating from 1 to 5.

The user can then change the rating if the suggested one does not reflect their views, and submit.

You can think of this as a crowdsourcing app for brand reviews, with a sentiment analysis model that suggests ratings the user can then adjust.
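
To make the "score to suggested rating" idea concrete, here is a minimal sketch of one possible mapping. It is not taken from the project: the function name `suggest_rating`, the assumption that the score lies in [0, 1], and the equal-width binning are all illustrative choices.

```python
def suggest_rating(score: float, n_stars: int = 5) -> int:
    """Map a sentiment score in [0, 1] to a star rating in 1..n_stars.

    Hypothetical helper: the score range and the binning rule are assumptions.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be in [0, 1]")
    # Split [0, 1] into n_stars equal-width bins; a score of exactly 1.0
    # would land one bin too high, so clamp it to the top rating.
    return min(int(score * n_stars) + 1, n_stars)


if __name__ == "__main__":
    for s in (0.0, 0.1, 0.5, 0.95, 1.0):
        print(s, suggest_rating(s))
```

The user can still override the suggestion, so a simple, predictable mapping like this may be enough for a first version.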

To build this application, we'll follow these steps:

  • Collecting and scraping customer reviews data using Selenium and Scrapy
  • Training a deep learning sentiment classifier on this data using PyTorch
  • Building an interactive web app using Dash
  • Setting up a REST API and a Postgres database
  • Dockerizing the app using Docker Compose
  • Deploying to AWS

Project architecture

Run the app locally

To run this project locally using Docker Compose, run:

docker-compose build
docker-compose up

You can then access the Dash app at http://localhost:8050
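
If you prefer to check from a script that the app is reachable, a small sketch using only the Python standard library could look like this. The helper name `is_up` and the timeout value are assumptions for illustration; only the URL comes from the instructions above.

```python
import urllib.request


def is_up(url: str = "http://localhost:8050", timeout: float = 2.0) -> bool:
    """Return True if an HTTP GET to `url` succeeds with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # URLError and HTTPError are OSError subclasses, so this covers
        # refused connections, timeouts, and 4xx/5xx responses.
        return False


if __name__ == "__main__":
    print("Dash app reachable:", is_up())
```

The containers can take a little while to start, so a negative result right after `docker-compose up` does not necessarily mean something is broken.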

Development

If you want to contribute to this project and run each service independently:

Launch API

To launch the API, you first need to run a local Postgres database using Docker:

docker run --name postgres -e POSTGRES_USER=postgres -e POSTGRES_PASSWORD=password -e POSTGRES_DB=postgres -p 5432:5432 -d postgres

Then run the following commands:

cd src/api/
python app.py
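
For reference, the Postgres container started above uses the user `postgres`, the password `password`, the database `postgres`, and port 5432. The sketch below assembles those values into a standard connection URL. How the API actually reads its database settings is not shown here, so the helper `postgres_url` is hypothetical.

```python
from urllib.parse import quote


def postgres_url(
    user: str = "postgres",
    password: str = "password",
    host: str = "localhost",
    port: int = 5432,
    db: str = "postgres",
) -> str:
    """Build a postgresql:// URL; defaults mirror the `docker run` flags above."""
    # Percent-encode the credentials and database name so characters such as
    # "@" or "/" cannot break the URL structure.
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(db, safe='')}"
    )


if __name__ == "__main__":
    print(postgres_url())
```

A URL in this form is accepted by most Postgres clients, which makes it a convenient single setting to pass around during local development.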

Launch Dash app

To run the Dash server and view the app:

cd src/dash/
python app.py

How to contribute 😁

Feel free to contribute! Report any bugs in the issue section.

Here are a few things we noticed and would like to add:

  • [ ] Add server-side pagination for Admin Page and GET /api/reviews route.
  • [ ] Protect admin page with authentication.
  • [ ] Either use Kubernetes or Amazon ECS to deploy the app on a cluster of containers, instead of on one single EC2 instance.
  • [ ] Use continuous deployment with Travis CI.
  • [ ] Use a managed service such as Amazon RDS for the database.

Licence

MIT
