TaiwanStat / Taiwan News Crawlers

License: MIT

Scrapy-based Crawlers for news of Taiwan

Programming Languages

python

Labels

crawler scrapy news taiwan

Projects that are alternatives to or similar to Taiwan News Crawlers

Scrapy Redis

Redis-based components for Scrapy.

Stars: ✭ 4,998 (+5921.69%)

Mutual labels: crawler, scrapy

Icrawler

A multi-thread crawler framework with many builtin image crawlers provided.

Stars: ✭ 629 (+657.83%)

Mutual labels: crawler, scrapy

Fbcrawl

A Facebook crawler

Stars: ✭ 536 (+545.78%)

Mutual labels: crawler, scrapy

Tsrtc

Taiwan Stock Exchange Real Time Crawler

Stars: ✭ 359 (+332.53%)

Mutual labels: taiwan, crawler

News Please

news-please - an integrated web crawler and information extractor for news that just works.

Stars: ✭ 969 (+1067.47%)

Mutual labels: news, crawler

Scrapple

A framework for creating semi-automatic web content extractors

Stars: ✭ 464 (+459.04%)

Mutual labels: crawler, scrapy

Easy Scraping Tutorial

Simple but useful Python web scraping tutorial code.

Stars: ✭ 583 (+602.41%)

Mutual labels: crawler, scrapy

Scrapy Crawlera

Crawlera middleware for Scrapy

Stars: ✭ 281 (+238.55%)

Mutual labels: crawler, scrapy

Scrapy Azuresearch Crawler Samples

Scrapy as a Web Crawler for Azure Search Samples

Stars: ✭ 20 (-75.9%)

Mutual labels: crawler, scrapy

Py3 scripts

Life is short, *****.

Stars: ✭ 5 (-93.98%)

Mutual labels: crawler, scrapy

Vault

swiss army knife for hackers

Stars: ✭ 346 (+316.87%)

Mutual labels: crawler, scrapy

Terpene Profile Parser For Cannabis Strains

Parser and database to index the terpene profile of different strains of Cannabis from online databases

Stars: ✭ 63 (-24.1%)

Mutual labels: crawler, scrapy

Ttbot

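Toutiao (今日头条) bot that supports user login, follow, unfollow, fetching followers, posting articles, posting Wukong Q&A, likes, comments, and collecting various types of news; implemented with the Toutiao web API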

Stars: ✭ 338 (+307.23%)

Mutual labels: news, crawler

Haipproxy

💖 Highly available distributed IP proxy pool, powered by Scrapy and Redis

Stars: ✭ 4,993 (+5915.66%)

Mutual labels: crawler, scrapy

Tsec

Taiwan Stock Exchange Crawler (exchange-listed and OTC stocks)

Stars: ✭ 327 (+293.98%)

Mutual labels: taiwan, crawler

Wechatsogou

WeChat official account crawler interface based on Sogou WeChat search

Stars: ✭ 5,220 (+6189.16%)

Mutual labels: crawler, scrapy

Woid

Simple news aggregator displaying top stories in real time

Stars: ✭ 204 (+145.78%)

Mutual labels: news, crawler

ptt-web-crawler

Crawler for the web version of PTT

Stars: ✭ 20 (-75.9%)

Mutual labels: crawler, scrapy

Scrapyrt

HTTP API for Scrapy spiders

Stars: ✭ 637 (+667.47%)

Mutual labels: crawler, scrapy

Crawlab

Distributed web crawler admin platform for managing spiders regardless of language or framework.

Stars: ✭ 8,392 (+10010.84%)

Mutual labels: crawler, scrapy

Taiwan-news-crawlers

🐞 Scrapy-based Crawlers for news of Taiwan including 10 media companies:

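Apple Daily (蘋果日報)
China Times (中國時報)
Central News Agency, CNA (中央社)
CTS (華視)
ETtoday (東森新聞雲)
Liberty Times (自由時報)
PTS (公視)
SET News (三立)
TVBS
UDN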

Getting Started

$ git clone https://github.com/TaiwanStat/Taiwan-news-crawlers.git
$ cd Taiwan-news-crawlers
$ pip install -r requirements.txt
$ scrapy crawl apple -o apple_news.json

Prerequisites

Python 3
Scrapy 1.3.0

Usage

scrapy crawl <spider> -o <output_name>

Available spiders

apple
appleRealtime
china
cna
cts
ettoday
liberty
libertyRealtime
pts
setn
tvbs
udn
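
The spiders listed above can also be run one after another from a script. The following is a minimal sketch, not part of the project: the helper names `build_command` and `crawl_all` and the `<spider>_news.json` output naming (modeled on the `apple_news.json` example) are assumptions made for illustration. It assumes `scrapy` is on the PATH and that the script is started from the repository root.

```python
import subprocess

# Spider names copied from the "Available spiders" list.
SPIDERS = [
    "apple", "appleRealtime", "china", "cna", "cts", "ettoday",
    "liberty", "libertyRealtime", "pts", "setn", "tvbs", "udn",
]


def build_command(spider, output_dir="."):
    """Return the argument list for `scrapy crawl <spider> -o <output_name>`."""
    # Hypothetical naming scheme: one JSON file per spider.
    return ["scrapy", "crawl", spider, "-o", f"{output_dir}/{spider}_news.json"]


def crawl_all(output_dir="."):
    """Run every spider in turn; raises CalledProcessError on the first failure."""
    for spider in SPIDERS:
        subprocess.run(build_command(spider, output_dir), check=True)
```

Calling `crawl_all()` would start twelve sequential crawls, so it may take a while; calling `build_command("cna")` alone only builds the command without running anything.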

Output

Key	Value
website	name of the publisher
url	URL of the original article
title	headline of the news article
content	body text of the news article
category	news category
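
As an illustration of working with these fields, the sketch below loads a file written with `-o <output_name>.json` and counts articles per category. It assumes the file is a single JSON array of objects with the keys in the table above and that `category` is a plain string; the function name `count_by_category` is hypothetical and not part of the project.

```python
import json
from collections import Counter


def count_by_category(path):
    """Count crawled items per `category` in a Scrapy JSON output file."""
    with open(path, encoding="utf-8") as f:
        items = json.load(f)  # assumed: a list of item dictionaries
    # Items without a category are counted under None.
    return Counter(item.get("category") for item in items)
```

For example, `count_by_category("apple_news.json").most_common(5)` would list the five most frequent categories in the file produced by the Getting Started commands.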

License

The MIT License


Stars: ✭ 83
