harryandriyan / Warta Scrap

Indonesian news index crawler covering 10 online media outlets

Projects that are alternatives to or similar to Warta Scrap

crawler framework, distributed crawler extractor

Stars: ✭ 220 (+285.96%)

Mutual labels: scraper, scrapy

scrapy facebooker

Collection of scrapy spiders which can scrape posts, images, and so on from public Facebook Pages.

Stars: ✭ 22 (-61.4%)

Mutual labels: scraper, scrapy

LeBonCoin spider with Scrapy and ElasticSearch

Stars: ✭ 14 (-75.44%)

Mutual labels: scraper, scrapy

Scrapoxy hides your scraper behind a cloud. It starts a pool of proxies to send your requests. Now, you can crawl without thinking about blacklisting!

Stars: ✭ 1,322 (+2219.3%)

Mutual labels: scraper, scrapy

HTTP API for Scrapy spiders

Stars: ✭ 637 (+1017.54%)

Mutual labels: scraper, scrapy

[Crawler/Scraper for Golang] 🕷 A lightweight, distributed-friendly Golang crawler framework.

Stars: ✭ 190 (+233.33%)

Mutual labels: scraper, scrapy

An open source webapp for scraping: towards a public service for webscraping

Stars: ✭ 80 (+40.35%)

Mutual labels: scraper, scrapy

Seleniumcrawler

An example using Selenium webdrivers for python and Scrapy framework to create a web scraper to crawl an ASP site

Stars: ✭ 117 (+105.26%)

Mutual labels: scraper, scrapy

A Facebook crawler

Stars: ✭ 536 (+840.35%)

Mutual labels: scraper, scrapy

Advanced Web Scraping Tutorial

The Zipru scraper developed in the Advanced Web Scraping Tutorial.

Stars: ✭ 384 (+573.68%)

Mutual labels: scraper, scrapy

Email Extractor

The main functionality is to extract all the emails from one or several URLs.

Stars: ✭ 81 (+42.11%)

Mutual labels: scraper, scrapy

Django Dynamic Scraper

Creating Scrapy scrapers via the Django admin interface

Stars: ✭ 1,024 (+1696.49%)

Mutual labels: scraper, scrapy

📻 An OLX scraper using Scrapy + MongoDB. It scrapes recent ads posted for a requested product and dumps them to MongoDB.

Stars: ✭ 15 (-73.68%)

Mutual labels: scraper, scrapy

Linkedin Scraper using Selenium Web Driver, Chromium headless, Docker and Scrapy

Stars: ✭ 309 (+442.11%)

Mutual labels: scraper, scrapy

Mailinglistscraper

A python web scraper for public email lists.

Stars: ✭ 19 (-66.67%)

Mutual labels: scraper, scrapy

Voyages Sncf Api

A Scrapy spider that scrapes times and prices from Voyages Sncf. It uses scrapyrt to provide an API interface.

Stars: ✭ 7 (-87.72%)

Mutual labels: scraper, scrapy

Shopify Scraper (not monitor)

Stars: ✭ 41 (-28.07%)

Mutual labels: scraper

Scrapy Pyppeteer

Pyppeteer integration for Scrapy

Stars: ✭ 48 (-15.79%)

Mutual labels: scrapy

📙 Chinese Xinhua dictionary database, including xiehouyu (two-part allegorical sayings), idioms, words, and characters.

Stars: ✭ 8,705 (+15171.93%)

Mutual labels: scraper

Adult video management system with crawlers for avmoo, javbus, and javlibrary; an online library and magnet-link database of Japanese adult videos.

Stars: ✭ 8,133 (+14168.42%)

Mutual labels: scraper

warta-scrap

Indonesian news index crawler covering 10 online media outlets.

Online Media List:

Detik.com http://news.detik.com/indeks
Republika.co.id http://www.republika.co.id/indeks
Viva.co.id http://www.viva.co.id/indeks
Kompas.com http://indeks.kompas.com/
Antaranews.com http://www.antaranews.com/terkini
Tempo.co https://www.tempo.co/indeks
Okezone.com http://index.okezone.com/
Liputan6.com http://www.liputan6.com/indeks
Merdeka.com https://www.merdeka.com/berita-hari-ini/
Tirto.id https://tirto.id/indeks

Installation:

Open a terminal and clone this repo:

git clone https://github.com/harryandriyan/warta-scrap

Go to the project folder:

cd warta-scrap

Set up a virtualenv:

virtualenv venv

Activate the virtualenv:

. venv/bin/activate

Install the requirements:

pip install -r requirements.txt

How to use:

Change into the folder of the outlet you want to crawl, for example:

cd republika

Run the crawl command, for example:

scrapy crawl republika -o sampleResult.json -t json
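Once the crawl finishes, it can help to check what actually landed in the output file. The following is a minimal sketch, not part of the repo: it assumes sampleResult.json is a single JSON array of item objects, which is what Scrapy's JSON feed export normally writes. The function name summarize_feed is hypothetical, and no item field names are assumed because the README does not list them; the script simply reports whichever keys it finds.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_feed(path):
    """Return (item_count, field_counts) for a Scrapy JSON feed file.

    Hypothetical helper: assumes the file holds one JSON array of objects.
    """
    items = json.loads(Path(path).read_text(encoding="utf-8"))
    field_counts = Counter()
    for item in items:
        # Count how many items carry each field, whatever the fields are.
        field_counts.update(item.keys())
    return len(items), dict(field_counts)


if __name__ == "__main__":
    feed = Path("sampleResult.json")  # file name taken from the crawl command
    if feed.exists():
        count, fields = summarize_feed(feed)
        print(f"{count} items")
        for name, seen in sorted(fields.items()):
            print(f"  {name}: present in {seen} items")
    else:
        print(f"{feed} not found; run the crawl first")
```

If json.loads raises an error, one likely cause is that the crawl was run more than once against the same file: the -o option appends to an existing file, which leaves a JSON feed invalid. Deleting the file before re-running avoids this.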

Stars: ✭ 57
