harryandriyan / Warta Scrap
Indonesia Index News Crawler, including 10 online media
Stars: ✭ 57
Projects that are alternatives of or similar to Warta Scrap
Ruiji.net
crawler framework, distributed crawler extractor
Stars: ✭ 220 (+285.96%)
Mutual labels: scraper, scrapy
scrapy facebooker
Collection of scrapy spiders which can scrape posts, images, and so on from public Facebook Pages.
Stars: ✭ 22 (-61.4%)
Mutual labels: scraper, scrapy
scrapy-LBC
LeBonCoin spider built with Scrapy and ElasticSearch
Stars: ✭ 14 (-75.44%)
Mutual labels: scraper, scrapy
Scrapoxy
Scrapoxy hides your scraper behind a cloud. It starts a pool of proxies to send your requests. Now, you can crawl without thinking about blacklisting!
Stars: ✭ 1,322 (+2219.3%)
Mutual labels: scraper, scrapy
Goribot
[Crawler/Scraper for Golang] 🕷 A lightweight, distributed-friendly Golang crawler framework.
Stars: ✭ 190 (+233.33%)
Mutual labels: scraper, scrapy
OpenScraper
An open source webapp for scraping: towards a public service for webscraping
Stars: ✭ 80 (+40.35%)
Mutual labels: scraper, scrapy
Seleniumcrawler
An example using Selenium webdrivers for python and Scrapy framework to create a web scraper to crawl an ASP site
Stars: ✭ 117 (+105.26%)
Mutual labels: scraper, scrapy
Advanced Web Scraping Tutorial
The Zipru scraper developed in the Advanced Web Scraping Tutorial.
Stars: ✭ 384 (+573.68%)
Mutual labels: scraper, scrapy
Email Extractor
The main functionality is to extract all the email addresses from one or several URLs.
Stars: ✭ 81 (+42.11%)
Mutual labels: scraper, scrapy
Django Dynamic Scraper
Creating Scrapy scrapers via the Django admin interface
Stars: ✭ 1,024 (+1696.49%)
Mutual labels: scraper, scrapy
OLX Scraper
📻 An OLX scraper using Scrapy + MongoDB. It scrapes recent ads posted for a requested product and dumps them to MongoDB.
Stars: ✭ 15 (-73.68%)
Mutual labels: scraper, scrapy
Linkedin
Linkedin Scraper using Selenium Web Driver, Chromium headless, Docker and Scrapy
Stars: ✭ 309 (+442.11%)
Mutual labels: scraper, scrapy
Mailinglistscraper
A python web scraper for public email lists.
Stars: ✭ 19 (-66.67%)
Mutual labels: scraper, scrapy
Voyages Sncf Api
A Scrapy spider that scrapes times and prices from Voyages Sncf. It uses scrapyrt to provide an API interface.
Stars: ✭ 7 (-87.72%)
Mutual labels: scraper, scrapy
Avbook
AV movie management system; crawler for avmoo, javbus, and javlibrary; online AV video library; AV magnet link database; Japanese Adult Video Library; Adult Video Magnet Links; Japanese Adult Video Database
Stars: ✭ 8,133 (+14168.42%)
Mutual labels: scraper
warta-scrap
Indonesian index news crawler, covering 10 online media outlets
Online Media List:
- Detik.com http://news.detik.com/indeks
- Republika.co.id http://www.republika.co.id/indeks
- Viva.co.id http://www.viva.co.id/indeks
- Kompas.com http://indeks.kompas.com/
- Antaranews.com http://www.antaranews.com/terkini
- Tempo.co https://www.tempo.co/indeks
- Okezone.com http://index.okezone.com/
- Liputan6.com http://www.liputan6.com/indeks
- Merdeka.com https://www.merdeka.com/berita-hari-ini/
- Tirto.id https://tirto.id/indeks
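The index pages above can be kept in one lookup table, for example to derive the host name each spider should stay on. This is a minimal sketch using only the standard library; the names `INDEX_PAGES` and `allowed_domain` are illustrative and are not taken from the repository.

```python
from urllib.parse import urlparse

# Index pages copied from the media list above.
# The dict name and the helper below are hypothetical, not part of warta-scrap.
INDEX_PAGES = {
    "Detik.com": "http://news.detik.com/indeks",
    "Republika.co.id": "http://www.republika.co.id/indeks",
    "Viva.co.id": "http://www.viva.co.id/indeks",
    "Kompas.com": "http://indeks.kompas.com/",
    "Antaranews.com": "http://www.antaranews.com/terkini",
    "Tempo.co": "https://www.tempo.co/indeks",
    "Okezone.com": "http://index.okezone.com/",
    "Liputan6.com": "http://www.liputan6.com/indeks",
    "Merdeka.com": "https://www.merdeka.com/berita-hari-ini/",
    "Tirto.id": "https://tirto.id/indeks",
}


def allowed_domain(index_url: str) -> str:
    """Return the host part of an index URL."""
    return urlparse(index_url).netloc


# Print each outlet next to the host of its index page.
for name, url in INDEX_PAGES.items():
    print(f"{name:<16} {allowed_domain(url)}")
```

A host obtained this way could serve as a starting value for a Scrapy spider's `allowed_domains`, though the spiders in this repo may be configured differently.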
Installation:
Open a terminal and clone this repo:
Go to project folder
cd warta-scrap
Setup virtualenv
virtualenv venv
Activate virtualenv
. venv/bin/activate
Install requirements
pip install -r requirements.txt
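After the install step, a quick sanity check can confirm that the virtualenv is active and that Scrapy is importable. The sketch below assumes `requirements.txt` pulls in the `scrapy` package, which the crawl command further down suggests but this README does not state explicitly; the helper names are illustrative.

```python
import importlib.util
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a virtualenv or venv."""
    # Older virtualenv releases set sys.real_prefix; newer ones and venv
    # make sys.base_prefix differ from sys.prefix.
    base = getattr(sys, "real_prefix", None) or getattr(sys, "base_prefix", sys.prefix)
    return base != sys.prefix


def is_installed(module_name: str) -> bool:
    """Return True when a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


print("virtualenv active:", in_virtualenv())
print("scrapy importable:", is_installed("scrapy"))  # assumes scrapy is a requirement
```

If the second line prints `False`, re-running the activate and install steps in the same shell is a reasonable first thing to try.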
How to use
Open the folder of a specific project, for example:
cd republika
Run the crawl command, for example:
scrapy crawl republika -o sampleResult.json -t json
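Once the crawl finishes, the exported file can be inspected with a few lines of Python. Scrapy's JSON feed export writes a single JSON array of item objects; the field names depend on the spider and are not documented here, so this sketch only counts items and the fields that occur. The function names `load_feed` and `summarize_items` are illustrative.

```python
import json
from collections import Counter
from pathlib import Path


def load_feed(path):
    """Read a Scrapy JSON feed file and return its list of item dicts."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def summarize_items(items):
    """Return (item count, {field name: number of items that contain it})."""
    field_counts = Counter(key for item in items for key in item)
    return len(items), dict(field_counts)


# Example with made-up items; real field names depend on the spider.
sample = [{"title": "a", "url": "http://example.com/1"}, {"title": "b"}]
print(summarize_items(sample))
```

For a real run, `summarize_items(load_feed("sampleResult.json"))` gives the same summary for the crawl output. If `sampleResult.json` already existed before the crawl, the `-o` option appends to it, and the resulting file may no longer parse as valid JSON.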