crawler

Node.js crawler for cnbeta.com. The source code is on GitHub.

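Crawls and saves cnbeta news articles, including their text and images
Starts from a given article, asynchronously fetches the ID of the previous article, and repeats
Supports a limit on the total number of articles crawled (default: 50)
Follows 301 redirects
Intended only for learning Node.js; no offense is meant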
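
The crawl loop described above (start article, previous-article ID, total limit, 301 tracking) could be sketched roughly as follows. This is not the project's actual code: `crawl`, `fetchArticle`, and the `{ status, location, prevId }` result shape are hypothetical names chosen for illustration, and the cap of 5 redirect hops is an assumption. The fetcher is passed in so that the loop can be tried without touching the network.

```javascript
// Sketch only. fetchArticle is a hypothetical injected function that is
// assumed to resolve to { status, location, prevId } for a given article key.
async function crawl(startId, limit = 50, fetchArticle) {
  const saved = [];
  let id = startId;
  while (id && saved.length < limit) {
    let res = await fetchArticle(id);
    // Follow 301 redirects, capped (assumed cap: 5) to avoid redirect loops.
    let hops = 0;
    while (res.status === 301 && res.location && hops < 5) {
      res = await fetchArticle(res.location);
      hops += 1;
    }
    if (res.status !== 200) break; // stop on anything that is not a page
    saved.push(id);                // a real crawler would save text and images here
    id = res.prevId;               // continue with the previous article
  }
  return saved;
}

// In-memory stand-in for the site, so the loop can run offline.
const demoPages = {
  '3': { status: 200, prevId: '2' },
  '2': { status: 301, location: '2b' },
  '2b': { status: 200, prevId: '1' },
  '1': { status: 200, prevId: null },
};
const demoFetch = async (key) => demoPages[key] || { status: 404 };

crawl('3', 2, demoFetch).then((ids) => console.log(ids));
```

Injecting the fetcher keeps the limit and redirect logic separate from the HTTP and HTML parsing details, which the description above does not specify.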

Usage

Install dependencies: npm install
Set the startId variable in app.js to the ID of the starting article
Run the crawler: node app [limitNumber=50]
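
One way the optional `limitNumber` argument might be read is shown below. This is a minimal sketch, not the code in app.js: `parseLimit` is a hypothetical helper, and falling back to 50 for missing or non-numeric input is an assumption based on the default stated above.

```javascript
// Sketch only: parseLimit is a hypothetical helper, not taken from app.js.
// argv is expected to look like process.argv: [node, script, limitNumber?].
function parseLimit(argv, fallback = 50) {
  const n = Number.parseInt(argv[2], 10);
  // Use the fallback when the argument is missing, non-numeric, or not positive.
  return Number.isInteger(n) && n > 0 ? n : fallback;
}

const limit = parseLimit(process.argv);
console.log(`Crawling up to ${limit} articles`);
```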

Example

For example, to start crawling from http://www.cnbeta.com/articles/tech/620719.htm, set startId="620719";
To crawl 10 articles: node app 10
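
The startId value is the numeric part at the end of the article URL. As a convenience, it could be pulled out with a small helper like the one below. `extractArticleId` is a hypothetical function, not part of the project, and it assumes article URLs end in `<digits>.htm` as in the example above.

```javascript
// Sketch only: extractArticleId is a hypothetical helper, not part of the project.
// Assumes the article URL path ends in "<digits>.htm", as in the example URL.
function extractArticleId(url) {
  const match = /\/(\d+)\.htm$/.exec(new URL(url).pathname);
  return match ? match[1] : null; // null when the URL does not fit the pattern
}

const startId = extractArticleId('http://www.cnbeta.com/articles/tech/620719.htm');
console.log(startId);
```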

前端路上

tower1229 / crawler
