DotnetCrawler is a straightforward, lightweight web crawling/scraping library for .NET Core with Entity Framework Core output. It is designed along the lines of other strong crawler libraries such as WebMagic and Scrapy, but is extensible to fit your custom requirements. Medium link: https://medium.com/@mehmetozkaya/creating-custom-web-crawler-with-dotnet-core-using-entity-framework-core-ec8d23f0ca7c

Stars: ✭ 100 (-63.5%)

Mutual labels: crawler, webscraping

Scrapoxy

Scrapoxy hides your scraper behind a cloud. It starts a pool of proxies to send your requests. Now, you can crawl without thinking about blacklisting!

Stars: ✭ 1,322 (+382.48%)

Mutual labels: crawler, scraper

Php Crawler

A PHP crawler that finds emails on the internet

Stars: ✭ 119 (-56.57%)

Mutual labels: crawler, webscraping

Onegram

This repository is no longer maintained.

Stars: ✭ 137 (-50%)

Mutual labels: crawler, scraper

Querylist

🕷️ The progressive PHP crawler framework! An elegant, progressive PHP scraping framework.

Stars: ✭ 2,392 (+772.99%)

Mutual labels: crawler, scraper

MyCrawler

My collection of crawlers

Stars: ✭ 55 (-79.93%)

Mutual labels: crawler, scraper

Headless Chrome Crawler

Distributed crawler powered by Headless Chrome

Stars: ✭ 5,129 (+1771.9%)

Mutual labels: crawler, scraper

dijnet-bot

All your bills in yet another place :)

Stars: ✭ 17 (-93.8%)

Mutual labels: crawler, scraper

Scrapyrt

HTTP API for Scrapy spiders

Stars: ✭ 637 (+132.48%)

Mutual labels: crawler, scraper

Goscraper

Golang pkg to quickly return a preview of a webpage (title/description/images)

Stars: ✭ 72 (-73.72%)

Mutual labels: crawler, scraper

Wombat

Lightweight Ruby web crawler/scraper with an elegant DSL which extracts structured data from pages.

Stars: ✭ 1,220 (+345.26%)

Mutual labels: crawler, scraper

Instagram-Scraper-2021

Scrape Instagram content and stories anonymously, using a new technique based on the har file (No Token + No public API).

Stars: ✭ 57 (-79.2%)

Mutual labels: scraper, webscraping

Pypergrabber

Fetches PubMed article IDs (PMIDs) from email inbox, then crawls PubMed, Google Scholar and Sci-Hub for respective PDF files.

Stars: ✭ 14 (-94.89%)

Mutual labels: crawler, scraper

Instagram Crawler

Crawl instagram photos, posts and videos for download.

Stars: ✭ 178 (-35.04%)

Mutual labels: crawler, scraper

Goribot

[Crawler/Scraper for Golang] 🕷 A lightweight, distributed-friendly Golang crawler framework.

Stars: ✭ 190 (-30.66%)

Mutual labels: crawler, scraper

Colly

Elegant Scraper and Crawler Framework for Golang

Stars: ✭ 15,535 (+5569.71%)

Mutual labels: crawler, scraper

Goose Parser

Universal scraping tool, which allows you to extract data using multiple environments

Stars: ✭ 211 (-22.99%)

Mutual labels: crawler, scraper

Ruiji.net

Crawler framework, distributed crawler extractor

Stars: ✭ 220 (-19.71%)

Mutual labels: crawler, scraper

Skrape.it

A Kotlin-based testing/scraping/parsing library providing the ability to analyze and extract data from HTML (server & client-side rendered). It places particular emphasis on ease of use and a high level of readability by providing an intuitive DSL. It aims to be a testing lib, but can also be used to scrape websites in a convenient fashion.

Stars: ✭ 231 (-15.69%)

Mutual labels: crawler, scraper

newspaperjs

News extraction and scraping. Article Parsing

Stars: ✭ 59 (-78.47%)

Mutual labels: scraper, webscraping

Fbcrawl

A Facebook crawler

Stars: ✭ 536 (+95.62%)

Mutual labels: crawler, scraper

Awesome Crawler

A collection of awesome web crawlers and spiders in different languages

Stars: ✭ 4,793 (+1649.27%)

Mutual labels: crawler, scraper

weibo-scraper

Simple Weibo Scraper

Stars: ✭ 50 (-81.75%)

Mutual labels: crawler, scraper

Ferret

Declarative web scraping

Stars: ✭ 4,837 (+1665.33%)

Mutual labels: crawler, scraper

Crawler

A high-performance web crawler in Elixir.

Stars: ✭ 781 (+185.04%)

Mutual labels: crawler, scraper

Spidr

A versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use.

Stars: ✭ 656 (+139.42%)

Mutual labels: crawler, scraper

Scrapit

Scraping scripts for various websites.

Stars: ✭ 25 (-90.88%)

Mutual labels: crawler, scraper

Nintendo Switch Eshop

Crawler for Nintendo Switch eShop

Stars: ✭ 463 (+68.98%)

Mutual labels: crawler, scraper

Jd Autobuy

Python crawler for JD.com: automatic login and online flash purchasing of products

Stars: ✭ 1,174 (+328.47%)

Mutual labels: crawler, scraper

Social Scraper

A collection of scripts for crawling data from Vietnamese social networks & websites

Stars: ✭ 47 (-82.85%)

Mutual labels: crawler, scraper

Geziyor

Geziyor, a fast web crawling & scraping framework for Go. Supports JS rendering.

Stars: ✭ 1,246 (+354.74%)

Mutual labels: crawler, scraper

Avbook

AV movie management system; crawlers for avmoo, javbus, and javlibrary; online AV video library; AV magnet link database. Japanese Adult Video Library, Adult Video Magnet Links - Japanese Adult Video Database

Stars: ✭ 8,133 (+2868.25%)

Mutual labels: crawler, scraper

Google Play Scraper

Node.js scraper to get data from Google Play

Stars: ✭ 1,606 (+486.13%)

Mutual labels: crawler, scraper

Not Your Average Web Crawler

A web crawler (for bug hunting) that gathers more than you can imagine.

Stars: ✭ 107 (-60.95%)

Mutual labels: crawler, scraper

Newspaper

News, full-text, and article metadata extraction in Python 3. Advanced docs:

Stars: ✭ 11,545 (+4113.5%)

Mutual labels: crawler, scraper

Scrapedin

LinkedIn Scraper (currently working 2020)

Stars: ✭ 453 (+65.33%)

Mutual labels: crawler, scraper

Linkedin Profile Scraper

🕵️‍♂️ LinkedIn profile scraper returning structured profile data in JSON. Works in 2020.

Stars: ✭ 171 (-37.59%)

Mutual labels: crawler, scraper

Datmusic Api

Alternative for VK Audio API

Stars: ✭ 160 (-41.61%)

Mutual labels: crawler, scraper

Jvppeteer

Headless Chrome for Java (Java crawler)

Stars: ✭ 193 (-29.56%)

Mutual labels: crawler, scraper

Instagram Scraper

Scrapes media, likes, followers, tags, and all metadata. Inspired by instagram-php-scraper, bot

Stars: ✭ 2,209 (+706.2%)

Mutual labels: crawler, scraper

Media Scraper

Scrapes all photos and videos in a web page / Instagram / Twitter / Tumblr / Reddit / pixiv / TikTok

Stars: ✭ 206 (-24.82%)

Mutual labels: crawler, scraper

Tianyancha

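A pip-installable scraper API for Tianyancha that saves the business registration information of one or more specified companies to Excel/JSON in a single step. A batteries-included scraper API of Tianyancha, the best Chinese business data and investigation platform.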

Stars: ✭ 206 (-24.82%)

Mutual labels: crawler, scraper

Annie

👾 Fast and simple video download library and CLI tool written in Go

Stars: ✭ 16,369 (+5874.09%)

Mutual labels: crawler, scraper

newsemble

API for fetching data from news websites.

Stars: ✭ 42 (-84.67%)

Mutual labels: scraper, webscraping

bing-ip2hosts

bingip2hosts is a Bing.com web scraper that discovers websites by IP address

Stars: ✭ 99 (-63.87%)

Mutual labels: scraper, webscraping

Mimo-Crawler

A web crawler that uses Firefox and JS injection to interact with webpages and crawl their content, written in Node.js.

Stars: ✭ 22 (-91.97%)

Mutual labels: scraper, webscraping

robotstxt

robots.txt file parsing and checking for R

Stars: ✭ 65 (-76.28%)

Mutual labels: scraper, webscraping

metacritic api

PHP Metacritic API - Mirrored by my GitLab

Stars: ✭ 31 (-88.69%)

Mutual labels: scraper, webscraping

BookingScraper

🌎 🏨 Scrape Booking.com 🏨 🌎

Stars: ✭ 68 (-75.18%)

Mutual labels: scraper, webscraping

Crawly

Crawly, a high-level web crawling & scraping framework for Elixir.

Stars: ✭ 440 (+60.58%)

Mutual labels: crawler, scraper

Bookcorpus

Crawl BookCorpus

Stars: ✭ 443 (+61.68%)

Mutual labels: crawler, scraper

Google Play Scraper

Google Play scraper for Python inspired by <facundoolano/google-play-scraper>

Stars: ✭ 143 (-47.81%)

Mutual labels: crawler, scraper

arachnod

High-performance crawler for Node.js

Stars: ✭ 17 (-93.8%)

Mutual labels: crawler, scraper

papercut

Papercut is a scraping/crawling library for Node.js built on top of JSDOM. It provides basic selector features together with features like Page Caching and Geosearch.

Stars: ✭ 15 (-94.53%)

Mutual labels: crawler, scraper

1-60 of 827 similar projects
