bouxinLou / Company Crawler
License: MIT
A Tianyancha & Qichacha crawler that scrapes company information for specified keywords
Stars: ✭ 285
Projects that are alternatives of or similar to Company Crawler
Mtail
extract internal monitoring data from application logs for collection in a timeseries database
Stars: ✭ 3,028 (+962.46%)
Mutual labels: proxy
Proxy Manager Bridge
Provides integration for ProxyManager with various Symfony components.
Stars: ✭ 274 (-3.86%)
Mutual labels: proxy
Bus
Bus is a foundational framework and service suite written in Java 8. It draws on the design of many existing frameworks and components and can serve as base middleware for back-end services. The code is concise, the architecture is clear, and it is well suited for learning.
Stars: ✭ 253 (-11.23%)
Mutual labels: proxy
Spring Cloud Gateway
A Gateway built on Spring Framework 5.x and Spring Boot 2.x providing routing and more.
Stars: ✭ 3,305 (+1059.65%)
Mutual labels: proxy
Dorknet
Selenium powered Python script to automate searching for vulnerable web apps.
Stars: ✭ 256 (-10.18%)
Mutual labels: proxy
Websocketd
Turn any program that uses STDIN/STDOUT into a WebSocket server. Like inetd, but for WebSockets.
Stars: ✭ 15,828 (+5453.68%)
Mutual labels: proxy
Httptunnel
Bidirectional data stream tunnelled in HTTP requests.
Stars: ✭ 279 (-2.11%)
Mutual labels: proxy
Websockify
Websockify is a WebSocket to TCP proxy/bridge. This allows a browser to connect to any application/server/service.
Stars: ✭ 2,942 (+932.28%)
Mutual labels: proxy
Infini Gateway
INFINI-GATEWAY (极限网关), a high-performance and lightweight gateway written in Golang, for Elasticsearch and its friends.
Stars: ✭ 272 (-4.56%)
Mutual labels: proxy
Socks5
A full-fledged high-performance socks5 proxy server written in C#. Plugin support included.
Stars: ✭ 286 (+0.35%)
Mutual labels: proxy
Cloudbunny
CloudBunny is a tool to capture the real IP of the server that uses a WAF as a proxy or protection. In this tool we used three search engines to search domain information: Shodan, Censys and Zoomeye.
Stars: ✭ 273 (-4.21%)
Mutual labels: proxy
Tianyancha & Qichacha
Company information crawler
Usage
- Set up the data source:

```python
MysqlConfig = {
    'develop': {
        'host': '192.168.1.103',
        'port': 3306,
        'db': 'enterprise',
        'username': 'root',
        'password': '[email protected]'
    }
}
```
- Execute db/data.sql to create the database schema
- Configure the IP proxy in config/settings:

```python
# Global proxy switch
GLOBAL_PROXY = True
PROXY_POOL_URL = "http://localhost:5010"
```
- Set the crawl keywords in qichacha & tianyancha:

```python
keys = ['Google']        # list of keywords to crawl
crawler.load_keys(keys)
crawler.start()
```
PS: Using an IP proxy together with a random UA is strongly recommended; otherwise you will certainly be banned.
- For random UAs, fake_useragent is recommended
- For the proxy pool, proxy_pool is recommended
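The two recommendations above can be combined in a small fetch helper. A minimal sketch, assuming a locally running proxy_pool instance whose /get/ endpoint returns a JSON object with a "proxy" field (the endpoint and field name may differ per deployment); the hard-coded USER_AGENTS list is a stand-in for fake_useragent's UserAgent().random:

```python
import random
import requests

PROXY_POOL_URL = "http://localhost:5010"  # same endpoint as in config/settings

# Tiny stand-in UA list; in the real crawler fake_useragent's
# UserAgent().random would supply these strings.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36",
]

def make_proxies(proxy):
    """Turn an 'ip:port' string from the pool into a requests proxies dict."""
    if not proxy:
        return None
    return {"http": f"http://{proxy}", "https": f"http://{proxy}"}

def get_proxy():
    # /get/ returning {"proxy": "ip:port", ...} follows the common
    # proxy_pool convention; adjust this call to your deployment.
    resp = requests.get(f"{PROXY_POOL_URL}/get/", timeout=5)
    return resp.json().get("proxy")

def fetch(url):
    # Random UA per request, routed through a pooled proxy when available.
    headers = {"User-Agent": random.choice(USER_AGENTS)}
    return requests.get(url, headers=headers,
                        proxies=make_proxies(get_proxy()), timeout=10)
```

Rotating both the proxy and the User-Agent on every request is what keeps a single fingerprint from accumulating enough traffic to trigger a ban.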