Building a model to recognize incentives for landscape restoration in environmental policies from Latin America, the US and India. Bringing NLP to the world of policy analysis through an extensible framework that includes scraping, preprocessing, active learning and text analysis pipelines.

Stars: ✭ 22 (-96.05%)

Mutual labels: scrapy

Scrapy demo

All kinds of Scrapy demos

Stars: ✭ 128 (-77.02%)

Mutual labels: scrapy

cappy

☕🗄CAching Proxy in Python – Simple file-based Python HTTP proxy

Stars: ✭ 15 (-97.31%)

Mutual labels: requests

Dialogue.moe

Stars: ✭ 127 (-77.2%)

Mutual labels: scrapy

Xidel

Command line tool to download and extract data from HTML/XML pages or JSON-APIs, using CSS, XPath 3.0, XQuery 3.0, JSONiq or pattern matching. It can also create new or transformed XML/HTML/JSON documents.

Stars: ✭ 335 (-39.86%)

Mutual labels: xpath

Crawlab Lite

Lite version of Crawlab, a lightweight crawler management platform.

Stars: ✭ 122 (-78.1%)

Mutual labels: scrapy

Scrapy IPProxyPool

Free IP proxy pool. A plugin for the Scrapy crawler framework.

Stars: ✭ 100 (-82.05%)

Mutual labels: scrapy

Qqmusicspider

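A Scrapy-based QQ Music spider that crawls song information, lyrics, featured comments, and more. It also shares a music corpus of 490,000+ entries from the top 6,400 mainland China, Hong Kong, and Taiwan artists on QQ Music.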

Stars: ✭ 120 (-78.46%)

Mutual labels: scrapy

douban-spider

A Douban movie crawler based on the Scrapy framework.

Stars: ✭ 25 (-95.51%)

Mutual labels: scrapy

HeWeather

HomeAssistant HeWeather Plugin

Stars: ✭ 66 (-88.15%)

Mutual labels: requests

Copybook

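Uses a crawler to scrape all the novels from novel websites, stores them in a database, and builds its own novel website from the crawled data.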

Stars: ✭ 117 (-78.99%)

Mutual labels: scrapy

Requests Respectful

Minimalist Requests wrapper to work within the rate limits of any number of services simultaneously. Parallel processing friendly.

Stars: ✭ 417 (-25.13%)

Mutual labels: requests

Maria Quiteria

Backend for collecting and publishing the data 📜

Stars: ✭ 115 (-79.35%)

Mutual labels: scrapy

163Music

A 163music spider built with Scrapy.

Stars: ✭ 60 (-89.23%)

Mutual labels: scrapy

Weibo hot search

Weibo crawler: crawls the Weibo hot search list on a daily schedule, preserving the memories of internet users.

Stars: ✭ 113 (-79.71%)

Mutual labels: scrapy

ptt-web-crawler

Crawler for the web version of PTT.

Stars: ✭ 20 (-96.41%)

Mutual labels: scrapy

Programer log

The latest updates are here: "My Programmer Log".

Stars: ✭ 112 (-79.89%)

Mutual labels: scrapy

pyinrail

A python wrapper for Indian Railways Enquiry API!

Stars: ✭ 40 (-92.82%)

Mutual labels: requests

Hive

Lots of spiders

Stars: ✭ 110 (-80.25%)

Mutual labels: scrapy

Htmlquery

htmlquery is a Go XPath package for querying HTML.

Stars: ✭ 338 (-39.32%)

Mutual labels: xpath

Scrapyd Cluster On Heroku

Set up a free and scalable Scrapyd cluster for distributed web crawling with just a few clicks.

Stars: ✭ 106 (-80.97%)

Mutual labels: scrapy

htmx-talk-2021

Code examples and slides from my 2021 talk Server-Side is Dead! Long Live Server-Side (+ HTMX), presented at DjangoCon and Code Code Code

Stars: ✭ 18 (-96.77%)

Mutual labels: requests

Dotnetcrawler

DotnetCrawler is a straightforward, lightweight web crawling/scraping library with Entity Framework Core output, based on .NET Core. The library is designed like other strong crawler libraries such as WebMagic and Scrapy, but is extensible to fit your custom requirements. Medium link: https://medium.com/@mehmetozkaya/creating-custom-web-crawler-with-dotnet-core-using-entity-framework-core-ec8d23f0ca7c

Stars: ✭ 100 (-82.05%)

Mutual labels: scrapy

dannyAVgleDownloader

Downloader for the well-known website avgle.

Stars: ✭ 27 (-95.15%)

Mutual labels: scrapy

Proxy server crawler

An awesome public proxy server crawler based on the Scrapy framework

Stars: ✭ 94 (-83.12%)

Mutual labels: scrapy

torchestrator

Spin up Tor containers and then proxy HTTP requests via these Tor instances

Stars: ✭ 32 (-94.25%)

Mutual labels: scrapy

Distributed Multi User Scrapy System With A Web Ui

Django based application that allows creating, deploying and running Scrapy spiders in a distributed manner

Stars: ✭ 88 (-84.2%)

Mutual labels: scrapy

Pycookiecheat

Borrow cookies from your browser's authenticated session for use in Python scripts.

Stars: ✭ 465 (-16.52%)

Mutual labels: requests

Email Extractor

The main functionality is to extract all the emails from one or several URLs.

Stars: ✭ 81 (-85.46%)

Mutual labels: scrapy

wc18-cli

An easy command line interface for the 2018 World Cup

Stars: ✭ 15 (-97.31%)

Mutual labels: requests

Capturer

Capture pictures from websites such as Sina, Lofter, Huaban, and so on

Stars: ✭ 76 (-86.36%)

Mutual labels: scrapy

rigor

HTTP-based DSL for validating RESTful APIs

Stars: ✭ 65 (-88.33%)

Mutual labels: requests

Image Downloader

Download images from Google, Bing, and Baidu.

Stars: ✭ 1,173 (+110.59%)

Mutual labels: scrapy

animecenter

The source code for animecenter

Stars: ✭ 16 (-97.13%)

Mutual labels: scrapy

Taobao duoshou

Collects Taobao data with Scrapy and displays it with Flask.

Stars: ✭ 63 (-88.69%)

Mutual labels: scrapy

Node Request Retry

💂 Wraps the Node.js request module to retry HTTP requests in case of errors

Stars: ✭ 330 (-40.75%)

Mutual labels: requests

ArticleSpider

Crawls Zhihu, Jobbole, and Lagou with Scrapy, and uses Elasticsearch + Django to build a search engine website. See README_zh.md (includes the implementation roadmap, distributed crawling, and coping with anti-crawling strategies).

Stars: ✭ 34 (-93.9%)

Mutual labels: scrapy

image-crawler

An image scraper that scrapes images from unsplash.com

Stars: ✭ 12 (-97.85%)

Mutual labels: requests

Jawbreaker

A Python obfuscator using HTTP Requests and Hastebin.

Stars: ✭ 50 (-91.02%)

Mutual labels: requests

uiautomatorview

Adds XPath waits to uiautomatorview