tal95shah / OLX_Scraper

License: Apache-2.0
📻 An OLX scraper built with Scrapy and MongoDB. It scrapes recent ads posted for a requested product and dumps them into a MongoDB (NoSQL) database.

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives to or similar to OLX Scraper

Spidr
A versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use.
Stars: ✭ 656 (+4273.33%)
Mutual labels:  scraper, web-crawler, web-scraper, web-scraping
Scrapple
A framework for creating semi-automatic web content extractors
Stars: ✭ 464 (+2993.33%)
Mutual labels:  web-scraper, web-scraping, scrapy
Linkedin-Client
Web scraper for grabbing data from LinkedIn profiles or company pages (personal project)
Stars: ✭ 42 (+180%)
Mutual labels:  scraper, web-scraper, web-scraping
Faster Than Requests
Faster requests on Python 3
Stars: ✭ 639 (+4160%)
Mutual labels:  web-scraper, web-scraping, scrapy
Awesome Crawler
A collection of awesome web crawlers and spiders in different languages
Stars: ✭ 4,793 (+31853.33%)
Mutual labels:  scraper, web-crawler, web-scraper
Awesome Web Scraper
A collection of awesome web scrapers and crawlers.
Stars: ✭ 147 (+880%)
Mutual labels:  web-crawler, web-scraper, scrapy
Scrapy Craigslist
Web Scraping Craigslist's Engineering Jobs in NY with Scrapy
Stars: ✭ 54 (+260%)
Mutual labels:  web-scraper, web-scraping, scrapy
Phpscraper
PHP Scraper - a highly opinionated web-interface for PHP
Stars: ✭ 148 (+886.67%)
Mutual labels:  scraper, web-scraper, web-scraping
Scrape Linkedin Selenium
`scrape_linkedin` is a Python package that allows you to scrape personal LinkedIn profiles and company pages, turning the data into structured JSON.
Stars: ✭ 239 (+1493.33%)
Mutual labels:  scraper, web-scraper, web-scraping
doc crawler.py
Explore a website recursively and download all the wanted documents (PDF, ODT…)
Stars: ✭ 22 (+46.67%)
Mutual labels:  web-crawler, web-crawler-python
ant
A web crawler for Go
Stars: ✭ 264 (+1660%)
Mutual labels:  scraper, web-crawler
saveddit
Bulk Downloader for Reddit
Stars: ✭ 130 (+766.67%)
Mutual labels:  scraper, web-scraping
scrapy-wayback-machine
A Scrapy middleware for scraping time series data from Archive.org's Wayback Machine.
Stars: ✭ 92 (+513.33%)
Mutual labels:  web-scraping, scrapy
BookingScraper
🌎 🏨 Scrape Booking.com 🏨 🌎
Stars: ✭ 68 (+353.33%)
Mutual labels:  scraper, web-scraping
Raspagem-de-dados-para-iniciantes
Data scraping for beginners using Scrapy and other basic libraries
Stars: ✭ 113 (+653.33%)
Mutual labels:  web-crawler, scrapy
Amazon-Flipkart-Price-Comparison-Engine
Compares the price of a product entered by the user across the e-commerce sites Amazon and Flipkart 💰 📊
Stars: ✭ 41 (+173.33%)
Mutual labels:  web-crawling, web-crawler-python
TikTokDownloader PyWebIO
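🚀 "Douyin_TikTok_Download_API" is an out-of-the-box, high-performance asynchronous Douyin | TikTok data scraping tool that supports API calls as well as online batch parsing and downloading.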
Stars: ✭ 919 (+6026.67%)
Mutual labels:  scraper, web-scraping
ScrapeM
A monadic web scraping library
Stars: ✭ 17 (+13.33%)
Mutual labels:  scraper, scrapping
scrapy-LBC
LeBonCoin spider built with Scrapy and ElasticSearch
Stars: ✭ 14 (-6.67%)
Mutual labels:  scraper, scrapy
ioweb
Web Scraping Framework
Stars: ✭ 31 (+106.67%)
Mutual labels:  web-scraping, web-crawling

OLX_Scraper

NOTE: This repository is not maintained anymore.

Screenshot

About

A Scrapy program that scrapes recent ads about a product and stores them in a MongoDB database. All the information about the product to be searched is in args.py.

To search for a different product, change the values after the return statement in args.py.
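The README does not show the contents of args.py, so the following is only a guess at its general shape: a function whose return statement carries the search values to edit. The function name get_args and the keys product and max_pages are hypothetical and chosen for illustration.

```python
# Hypothetical sketch of args.py. The real file in the repository may use
# different names and fields; only the "edit the values after return" idea
# comes from the README.


def get_args():
    """Return the search parameters used by the spider."""
    # Change the values after `return` to search for a different product.
    return {
        "product": "laptop",  # hypothetical key: the product to search for
        "max_pages": 2,       # hypothetical key: how many result pages to read
    }


if __name__ == "__main__":
    print(get_args())
```

Keeping the values in one place like this means the spider code itself does not need to be touched when the searched product changes.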

Usage

For proper usage, first install selenium and parsel. Then open a command line and type the commands given below:

pip install pymongo
Configure these settings in settings.py:
ITEM_PIPELINES = { 'olx_scraper.pipelines.MongoDBPipeline': 300, }
MONGODB_SERVER = "localhost"  # can be changed
MONGODB_PORT = 27017  # set to whatever port MongoDB is running on in your system
MONGODB_DB = ""  # set this
MONGODB_COLLECTION = ""  # set this
Once all of the above settings are configured, open a command line and type:
scrapy crawl scrape_olx
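The ITEM_PIPELINES entry above names olx_scraper.pipelines.MongoDBPipeline, but the README does not show that class. The following is a minimal sketch of how a Scrapy item pipeline could read the four MONGODB_* settings and write each item to MongoDB. It is not the repository's actual code. The client_factory parameter is an assumption added here so the class can be exercised without a running database; by default it falls back to pymongo.MongoClient.

```python
class MongoDBPipeline:
    """Sketch of a Scrapy item pipeline that stores every item in MongoDB."""

    def __init__(self, server, port, db_name, collection_name,
                 client_factory=None):
        self.server = server
        self.port = port
        self.db_name = db_name
        self.collection_name = collection_name
        # client_factory is a hypothetical hook for testing; None means
        # "use pymongo.MongoClient".
        self._client_factory = client_factory
        self.client = None
        self.collection = None

    @classmethod
    def from_crawler(cls, crawler):
        # Scrapy calls this hook and exposes settings.py via crawler.settings.
        settings = crawler.settings
        return cls(
            server=settings.get("MONGODB_SERVER", "localhost"),
            port=settings.getint("MONGODB_PORT", 27017),
            db_name=settings.get("MONGODB_DB"),
            collection_name=settings.get("MONGODB_COLLECTION"),
        )

    def open_spider(self, spider):
        factory = self._client_factory
        if factory is None:
            # Imported lazily so the module loads even without pymongo.
            import pymongo
            factory = pymongo.MongoClient
        self.client = factory(self.server, self.port)
        self.collection = self.client[self.db_name][self.collection_name]

    def close_spider(self, spider):
        self.client.close()

    def process_item(self, item, spider):
        # Store a plain-dict copy and pass the item on to later pipelines.
        self.collection.insert_one(dict(item))
        return item
```

The connection is opened once per crawl in open_spider and closed in close_spider, so the spider does not reconnect for every scraped ad.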

Result

Open a MongoDB GUI and check the database. Your result should look like the screenshot shown above.

Gotchas

1. You must have Python 3.6 installed to use this software.
2. Make sure MongoDB is running before you run the spider.
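Both gotchas can be checked before starting a crawl. The sketch below uses only the standard library. The function names are made up for this example, and treating Python versions newer than 3.6 as acceptable is an assumption, since the README only mentions 3.6. The port check confirms only that something accepts TCP connections on the MongoDB port, not that it is actually MongoDB.

```python
import socket
import sys


def python_is_supported(minimum=(3, 6)):
    """Return True if the running interpreter is at least `minimum`."""
    return sys.version_info >= minimum


def port_is_open(host="localhost", port=27017, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution errors.
        return False


if __name__ == "__main__":
    if not python_is_supported():
        sys.exit("Python 3.6 or newer is required.")
    if not port_is_open():
        sys.exit("Nothing is listening on localhost:27017. Is MongoDB running?")
    print("Environment looks ready. Run: scrapy crawl scrape_olx")
```

If MONGODB_SERVER or MONGODB_PORT were changed in settings.py, pass the same values to port_is_open.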

If you run into any problem, please report it in the repository's Issues section. Thanks!
