Browser automation API for repetitive web-based tasks, with a friendly user interface. You can use it to scrape content, capture a screenshot, generate a PDF, extract content, or execute custom Puppeteer or Playwright functions.

Stars: ✭ 24 (-59.32%)

Mutual labels: scraping, puppeteer, playwright

Instagram Bot

An Instagram bot developed using the Selenium Framework

Stars: ✭ 138 (+133.9%)

Mutual labels: crawler, crawling, selenium

Apify Js

Apify SDK — The scalable web scraping and crawling library for JavaScript/Node.js. Enables development of data extraction and web automation jobs (not only) with headless Chrome and Puppeteer.

Stars: ✭ 3,154 (+5245.76%)

Mutual labels: scraping, crawling, puppeteer

diffbot-php-client

[Deprecated - Maintenance mode - use APIs directly please!] The official Diffbot client library

Stars: ✭ 53 (-10.17%)

Mutual labels: scraper, scraping, crawling

wget-lua

Wget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC compression and URL-agnostic deduplication.

Stars: ✭ 52 (-11.86%)

Mutual labels: scraper, scraping, crawling

Scrape Linkedin Selenium

`scrape_linkedin` is a Python package that allows you to scrape personal LinkedIn profiles and company pages, turning the data into structured JSON.

Stars: ✭ 239 (+305.08%)

Mutual labels: scraper, scraping, selenium

Dataflowkit

Extract structured data from websites. Website scraping.

Stars: ✭ 456 (+672.88%)

Mutual labels: scraper, scraping, crawling

Scrapyrt

HTTP API for Scrapy spiders

Stars: ✭ 637 (+979.66%)

Mutual labels: crawler, scraper, crawling

browser-pool

A Node.js library to easily manage and rotate a pool of web browsers, using any of the popular browser automation libraries like Puppeteer, Playwright, or SecretAgent.

Stars: ✭ 71 (+20.34%)

Mutual labels: scraping, puppeteer, playwright

Scrapy

Scrapy, a fast high-level web crawling & scraping framework for Python.

Stars: ✭ 42,343 (+71667.8%)

Mutual labels: crawler, scraping, crawling

Seleniumcrawler

An example using Selenium webdrivers for python and Scrapy framework to create a web scraper to crawl an ASP site

Stars: ✭ 117 (+98.31%)

Mutual labels: scraper, scraping, selenium

Sasila

A flexible, friendly crawler framework

Stars: ✭ 286 (+384.75%)

Mutual labels: crawler, scraping, crawling

Webster

a reliable high-level web crawling & scraping framework for Node.js.

Stars: ✭ 364 (+516.95%)

Mutual labels: crawler, crawling, puppeteer

Easy Scraping Tutorial

Simple but useful Python web scraping tutorial code.

Stars: ✭ 583 (+888.14%)

Mutual labels: crawler, scraping, crawling

Geziyor

Geziyor, a fast web crawling & scraping framework for Go. Supports JS rendering.

Stars: ✭ 1,246 (+2011.86%)

Mutual labels: crawler, scraper, scraping

Dotnetcrawler

DotnetCrawler is a straightforward, lightweight web crawling/scraping library for Entity Framework Core output, based on .NET Core. The library is designed like other strong crawler libraries such as WebMagic and Scrapy, but is extensible to your custom requirements. Medium link: https://medium.com/@mehmetozkaya/creating-custom-web-crawler-with-dotnet-core-using-entity-framework-core-ec8d23f0ca7c

Stars: ✭ 100 (+69.49%)

Mutual labels: crawler, scraping, crawling

Squidwarc

Squidwarc is a high fidelity, user scriptable, archival crawler that uses Chrome or Chromium with or without a head

Stars: ✭ 125 (+111.86%)

Mutual labels: crawler, crawling, puppeteer

whatsapp-tracking

Scraping the status of WhatsApp contacts

Stars: ✭ 49 (-16.95%)

Mutual labels: scraper, scraping, puppeteer

Antch

Antch, a fast, powerful and extensible web crawling & scraping framework for Go

Stars: ✭ 198 (+235.59%)

Mutual labels: crawler, scraping, crawling

papercut

Papercut is a scraping/crawling library for Node.js built on top of JSDOM. It provides basic selector features together with features like Page Caching and Geosearch.

Stars: ✭ 15 (-74.58%)

Mutual labels: crawler, scraper, scraping

Awesome Puppeteer

A curated list of awesome puppeteer resources.

Stars: ✭ 1,728 (+2828.81%)

Mutual labels: scraping, crawling, puppeteer

Jvppeteer

Headless Chrome for Java (Java crawler)

Stars: ✭ 193 (+227.12%)

Mutual labels: crawler, scraper, puppeteer

Gopa

[WIP] GOPA, a spider written in Golang, for Elasticsearch. DEMO: http://index.elasticsearch.cn

Stars: ✭ 277 (+369.49%)

Mutual labels: crawler, scraping, crawling

Goose Parser

Universal scraping tool, which allows you to extract data using multiple environments

Stars: ✭ 211 (+257.63%)

Mutual labels: crawler, scraper, scraping

Udemycoursegrabber

Your will to enroll in a Udemy course is here, but the money isn't? Search no more! This Python program searches for your desired course on more than [insert big number here] websites, compares the last-updated dates, and gives you back the download link of the latest one, though you can also choose to see the others.

Stars: ✭ 137 (+132.2%)

Mutual labels: scraper, scraping, selenium

Newspaper

News, full-text, and article metadata extraction in Python 3. Advanced docs:

Stars: ✭ 11,545 (+19467.8%)

Mutual labels: crawler, scraper, crawling

Tianyancha

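A pip-installable Tianyancha scraper API that saves the business registration information of one or more specified companies to Excel/JSON in one step. A battery-included scraper API for Tianyancha, the best Chinese business data and investigation platform.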

Stars: ✭ 206 (+249.15%)

Mutual labels: crawler, scraper, selenium

double-agent

A test suite of common scraper detection techniques. See how detectable your scraper stack is.

Stars: ✭ 123 (+108.47%)

Mutual labels: scraping, crawling, puppeteer

docker-selenium-lambda

The simplest demo of chrome automation by python and selenium in AWS Lambda

Stars: ✭ 172 (+191.53%)

Mutual labels: scraping, selenium

TinderBotz

Automated Tinder bot and scraper using selenium in python.

Stars: ✭ 265 (+349.15%)

Mutual labels: scraper, selenium

crawling-framework

Easily crawl news portals or blog sites using Storm Crawler.

Stars: ✭ 22 (-62.71%)

Mutual labels: scraping, crawling

puppeteer-botcheck

🕵‍♂ Bot detection tests for Puppeteer. Hide and seek!

Stars: ✭ 42 (-28.81%)

Mutual labels: scraping, puppeteer

browserslist-generator

A library that makes generating and validating Browserslists a breeze!

Stars: ✭ 77 (+30.51%)

Mutual labels: user-agent, useragent

LInkedIn-Reverese-Lookup

🔎Search LinkedIn profile by email address📧

Stars: ✭ 20 (-66.1%)

Mutual labels: scraping, puppeteer

scrapman

Retrieve the real HTML (with JavaScript executed) from a URL, ultra fast, with support for loading multiple pages in parallel

Stars: ✭ 21 (-64.41%)

Mutual labels: scraper, scraping

zcrawl

An open source web crawling platform

Stars: ✭ 21 (-64.41%)

Mutual labels: scraping, crawling

TikTok

Download public videos on TikTok using Python with Selenium

Stars: ✭ 37 (-37.29%)

Mutual labels: scraper, selenium

throughout

🎪 End-to-end testing made simple (using Jest and Puppeteer)

Stars: ✭ 16 (-72.88%)

Mutual labels: selenium, puppeteer

scrapy-fieldstats

A Scrapy extension to log items coverage when the spider shuts down

Stars: ✭ 17 (-71.19%)

Mutual labels: scraping, crawling

instagram-get-images

Instagram get images 🌄 (hashtags, account, locations) with puppeteer

Stars: ✭ 69 (+16.95%)

Mutual labels: scraper, puppeteer

ha-multiscrape

Home Assistant custom component for scraping (html, xml or json) multiple values (from a single HTTP request) with a separate sensor/attribute for each value. Support for (login) form-submit functionality.

Stars: ✭ 103 (+74.58%)

Mutual labels: scraper, scraping

Instagram-to-discord

Monitor an Instagram user account and automatically post new images to a Discord channel via a webhook. Working 2022!

Stars: ✭ 113 (+91.53%)

Mutual labels: scraper, scraping

InstaBot

Simple and friendly Bot for Instagram, using Selenium and Scrapy with Python.

Stars: ✭ 32 (-45.76%)

Mutual labels: scraping, selenium

copycat

A PHP Scraping Class

Stars: ✭ 70 (+18.64%)

Mutual labels: scraper, scraping

angel.co-companies-list-scraping

No description or website provided.

Stars: ✭ 54 (-8.47%)

Mutual labels: scraper, scraping

leumi-leumicard-bank-data-scraper

Open bank data for Leumi bank and Leumi card credit card

Stars: ✭ 28 (-52.54%)

Mutual labels: scraper, puppeteer

pumba

Fetch, store and access user agent strings for different browsers

Stars: ✭ 12 (-79.66%)

Mutual labels: user-agent, crawling

Recorder

A browser extension that generates Cypress, Playwright and Puppeteer test scripts from your interactions 🖱 ⌨

Stars: ✭ 277 (+369.49%)

Mutual labels: puppeteer, playwright

InstagramLocationScraper

No description or website provided.

Stars: ✭ 13 (-77.97%)

Mutual labels: scraper, selenium

site-audit-seo

Web service and CLI tool for SEO site audit: crawl site, lighthouse all pages, view public reports in browser. Also output to console, json, csv, xlsx, Google Drive.

Stars: ✭ 91 (+54.24%)

Mutual labels: scraper, puppeteer

go-scrapy

Web crawling and scraping framework for Golang

Stars: ✭ 17 (-71.19%)

Mutual labels: scraping, crawling

dijnet-bot

All your bills in yet another place :)

Stars: ✭ 17 (-71.19%)

Mutual labels: crawler, scraper

1-60 of 1610 similar projects
