Building a model to recognize incentives for landscape restoration in environmental policies from Latin America, the US and India. Bringing NLP to the world of policy analysis through an extensible framework that includes scraping, preprocessing, active learning and text analysis pipelines.

Stars: ✭ 22 (-48.84%)

Mutual labels: scraping

Headless Chrome Crawler

Distributed crawler powered by Headless Chrome

Stars: ✭ 5,129 (+11827.91%)

Mutual labels: scraping

Katana

A Python tool for Google hacking

Stars: ✭ 355 (+725.58%)

Mutual labels: scraping

Webhere

HTML scraping for Objective-C.

Stars: ✭ 16 (-62.79%)

Mutual labels: scraping

Tinking

🧶 Extract data from any website without code, just clicks.

Stars: ✭ 331 (+669.77%)

Mutual labels: scraping

Facebook Scraper

Scrape Facebook public pages without an API key

Stars: ✭ 499 (+1060.47%)

Mutual labels: scraping

Elixir Scrape

Scrape any website, article or RSS/Atom Feed with ease!

Stars: ✭ 306 (+611.63%)

Mutual labels: scraping

Pypatent

Search for and retrieve US Patent and Trademark Office Patent Data

Stars: ✭ 31 (-27.91%)

Mutual labels: scraping

Scrapy Crawlera

Crawlera middleware for Scrapy

Stars: ✭ 281 (+553.49%)

Mutual labels: scraping

Scrapple

A framework for creating semi-automatic web content extractors

Stars: ✭ 464 (+979.07%)

Mutual labels: scraping

schedule-tweet

Schedules tweets using TweetDeck

Stars: ✭ 14 (-67.44%)

Mutual labels: scraping

Parsel

Parsel lets you extract data from XML/HTML documents using XPath or CSS selectors

Stars: ✭ 628 (+1360.47%)

Mutual labels: scraping

ARGUS

ARGUS is an easy-to-use web scraping tool. The program is based on the Scrapy Python framework and is able to crawl a broad range of different websites. On the websites, ARGUS is able to perform tasks like scraping texts or collecting hyperlinks between websites. See: https://link.springer.com/article/10.1007/s11192-020-03726-9

Stars: ✭ 68 (+58.14%)

Mutual labels: scraping

Pandapower

Convenient Power System Modelling and Analysis based on PYPOWER and pandas

Stars: ✭ 387 (+800%)

Mutual labels: power

Undetected Chromedriver

Custom Selenium Chromedriver | Zero-Config | Passes ALL bot mitigation systems (like Distil / Imperva / DataDome / CloudFlare IUAM)

Stars: ✭ 365 (+748.84%)

Mutual labels: scraping

scraper

Node.js web scraper. Includes a command-line interface, Docker container, Terraform module, and Ansible roles for distributed cloud scraping. Supported databases: SQLite, MySQL, PostgreSQL. Supported headless clients: Puppeteer, Playwright, Cheerio, jsdom.

Stars: ✭ 37 (-13.95%)

Mutual labels: scraping

Tabula

Tabula is a tool for liberating data tables trapped inside PDF files

Stars: ✭ 5,420 (+12504.65%)

Mutual labels: scraping

Coronadatascraper

COVID-19 Coronavirus data scraped from government and curated data sources.

Stars: ✭ 372 (+765.12%)

Mutual labels: scraping

Instagram Scraper

Scrape the Instagram frontend. Inspired by twitter-scraper by @kennethreitz.

Stars: ✭ 903 (+2000%)

Mutual labels: scraping

Comic Dl

Comic-dl is a command-line tool to download manga and comics from various comic and manga sites. Supported sites: readcomiconline.to, mangafox.me, Comic Naver, and many more.

Stars: ✭ 365 (+748.84%)

Mutual labels: scraping

Gazpacho

🥫 The simple, fast, and modern web scraping library

Stars: ✭ 525 (+1120.93%)

Mutual labels: scraping

Laptop Mode Tools

Power Savings tool for Linux

Stars: ✭ 346 (+704.65%)

Mutual labels: power

Configs

Public, free-to-use repository of digger configs for scraping and extracting data from various e-commerce websites and online stores

Stars: ✭ 37 (-13.95%)

Mutual labels: scraping

Autoscraper

A Smart, Automatic, Fast and Lightweight Web Scraper for Python

Stars: ✭ 4,077 (+9381.4%)

Mutual labels: scraping

Facebook data analyzer

Analyze your Facebook data export with Ruby. Download the zip file from Facebook and get information about friend rankings by message, vocabulary, contacts, friends-added statistics, and more

Stars: ✭ 515 (+1097.67%)

Mutual labels: scraping

Social Media Profiles Regexs

📇 Extract social media profiles and more with regular expressions

Stars: ✭ 324 (+653.49%)

Mutual labels: scraping

Lulu

[Unmaintained] A simple and clean video/music/image downloader 👾

Stars: ✭ 789 (+1734.88%)

Mutual labels: scraping

LinkedIn scraper using Selenium WebDriver, headless Chromium, Docker, and Scrapy

Stars: ✭ 309 (+618.6%)

Mutual labels: scraping

Nickjs

Web scraping library made by the Phantombuster team. Modern, simple & works on all websites. (Deprecated)

Stars: ✭ 494 (+1048.84%)

Mutual labels: scraping

Edu Mail Generator

Generate Free Edu Mail(s) within minutes

Stars: ✭ 301 (+600%)

Mutual labels: scraping

Iobroker.sourceanalytix

Detailed analysis of your energy, gas, and liquid consumption

Stars: ✭ 40 (-6.98%)

Mutual labels: power

Clean Text

🧹 Python package for text cleaning

Stars: ✭ 284 (+560.47%)

Mutual labels: scraping

Ferret

Declarative web scraping

Stars: ✭ 4,837 (+11148.84%)

Mutual labels: scraping

Lambdasoup

Functional HTML scraping and rewriting with CSS in OCaml

Stars: ✭ 280 (+551.16%)

Mutual labels: scraping

Imagescraper

✂️ High performance, multi-threaded image scraper

Stars: ✭ 630 (+1365.12%)

Mutual labels: scraping

Apify Js

Apify SDK — The scalable web scraping and crawling library for JavaScript/Node.js. Enables development of data extraction and web automation jobs (not only) with headless Chrome and Puppeteer.

Stars: ✭ 3,154 (+7234.88%)

Mutual labels: scraping

Dataflowkit

Extract structured data from websites. Website scraping.

Stars: ✭ 456 (+960.47%)

Mutual labels: scraping

instagram explorer

📷 An app to scrape Instagram posts and analyze data.

Stars: ✭ 17 (-60.47%)

Mutual labels: scraping

Auto Cpufreq

Automatic CPU speed & power optimizer for Linux