DotnetCrawler is a straightforward, lightweight web crawling/scraping library for .NET Core that outputs to Entity Framework Core. It is designed like other strong crawler libraries such as WebMagic and Scrapy, but is extensible to fit your custom requirements. Medium link: https://medium.com/@mehmetozkaya/creating-custom-web-crawler-with-dotnet-core-using-entity-framework-core-ec8d23f0ca7c

Stars: ✭ 100 (+566.67%)

Mutual labels: crawler, crawling

Scrapy

Scrapy, a fast high-level web crawling & scraping framework for Python.

Stars: ✭ 42,343 (+282186.67%)

Mutual labels: crawler, crawling

Python Dcdownloader

A fully asynchronous batch downloader (crawler) for comics from 动漫之家 (dmzj), written in Python

Stars: ✭ 146 (+873.33%)

Mutual labels: crawler, downloader

Newspaper

News, full-text, and article metadata extraction in Python 3. Advanced docs:

Stars: ✭ 11,545 (+76866.67%)

Mutual labels: crawler, crawling

Annie

👾 Fast and simple video download library and CLI tool written in Go

Stars: ✭ 16,369 (+109026.67%)

Mutual labels: crawler, downloader

Tumblthree

A Tumblr Backup Application

Stars: ✭ 211 (+1306.67%)

Mutual labels: crawler, downloader

DownloadRedditImages

Easily download all the images from any subreddit (also select sort_type if you want hot/top/new/controversial, and also sort_time day/week/month/year/all). Randomly select downloaded images and set as wallpaper, updating every 30 mins (or whenever you want duh)!

Stars: ✭ 66 (+340%)

Mutual labels: downloader, image-downloader

Linkedin Profile Scraper

🕵️‍♂️ LinkedIn profile scraper returning structured profile data in JSON. Works in 2020.

Stars: ✭ 171 (+1040%)

Mutual labels: crawler, crawling

4chan Downloader

Python3 script to continuously download all images/webms of multiple 4chan threads simultaneously - without installation

Stars: ✭ 136 (+806.67%)

Mutual labels: crawler, downloader

Colly

Elegant Scraper and Crawler Framework for Golang

Stars: ✭ 15,535 (+103466.67%)

Mutual labels: crawler, crawling

Instagram Bot

An Instagram bot developed using the Selenium Framework

Stars: ✭ 138 (+820%)

Mutual labels: crawler, crawling

Tbplayer

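Play video while it downloads: caches the data stream the player has already played to local storage, with seeking supported. Built on avplayer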

Stars: ✭ 1,334 (+8793.33%)

Mutual labels: downloader, buffer

Screenshot Stream

Capture screenshot of a website and return it as a stream

Stars: ✭ 228 (+1420%)

Mutual labels: webpage, phantomjs

TumblTwo

TumblTwo, an Improved Fork of TumblOne, a Tumblr Downloader.

Stars: ✭ 57 (+280%)

Mutual labels: crawler, downloader

Arachnid

Powerful web scraping framework for Crystal

Stars: ✭ 68 (+353.33%)

Mutual labels: crawler, crawling

Sasila

A flexible, friendly crawler framework

Stars: ✭ 286 (+1806.67%)

Mutual labels: crawler, crawling

Gopa

[WIP] GOPA, a spider written in Golang, for Elasticsearch. DEMO: http://index.elasticsearch.cn

Stars: ✭ 277 (+1746.67%)

Mutual labels: crawler, crawling

Bilili

🍻 bilibili video (including bangumi) and danmaku downloader

Stars: ✭ 379 (+2426.67%)

Mutual labels: crawler, downloader

Spidy

The simple, easy to use command line web crawler.

Stars: ✭ 257 (+1613.33%)

Mutual labels: crawler, crawling

Easy Scraping Tutorial

Simple but useful Python web scraping tutorial code.

Stars: ✭ 583 (+3786.67%)

Mutual labels: crawler, crawling

Headless Chrome Crawler

Distributed crawler powered by Headless Chrome

Stars: ✭ 5,129 (+34093.33%)

Mutual labels: crawler, crawling

Moodle Downloader 2

A Moodle downloader that downloads course content fast from Moodle (eg. lecture pdfs)

Stars: ✭ 118 (+686.67%)

Mutual labels: crawler, downloader

crawlkit

A crawler based on Phantom. Allows discovery of dynamic content and supports custom scrapers.

Stars: ✭ 23 (+53.33%)

Mutual labels: phantomjs, crawling

Work crawler

Download tool for comics and novels, with output to epub. Comics: 腾讯漫画, 大角虫漫画, 有妖气, 知音漫客, 咪咕, SF漫画, 哦漫画, 看漫画, 漫画柜, 汗汗酷漫, 動漫伊甸園, 快看漫画, 微博动漫, 733动漫网, 大古漫画网, 漫画DB, 無限動漫, 動漫狂, 卡推漫画, 动漫之家, 动漫屋, 古风漫画网, 36漫画网, 亲亲漫画网, 乙女漫画, comico, webtoons, 咚漫, ニコニコ静画, ComicWalker, ヤングエースUP, モアイ, pixivコミック, サイコミ. Novels: アルファポリス, カクヨム, ハーメルン, 小説家になろう, 起点中文网, 八一中文网, 顶点小说, 落霞小说网, 努努书坊, 笔趣阁.

Stars: ✭ 1,224 (+8060%)

Mutual labels: crawler, downloader

Goscraper

Golang pkg to quickly return a preview of a webpage (title/description/images)

Stars: ✭ 72 (+380%)

Mutual labels: crawler, webpage

Appcrawler

Web crawler for Android app markets

Stars: ✭ 25 (+66.67%)

Mutual labels: crawler, phantomjs

Antch

Antch, a fast, powerful and extensible web crawling & scraping framework for Go

Stars: ✭ 198 (+1220%)

Mutual labels: crawler, crawling

wget-lua

Wget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC compression and URL-agnostic deduplication.

Stars: ✭ 52 (+246.67%)

Mutual labels: downloader, crawling

Mimo-Crawler

A web crawler that uses Firefox and js injection to interact with webpages and crawl their content, written in nodejs.

Stars: ✭ 22 (+46.67%)

Mutual labels: webpage, crawling

openload dl

A Python library and CLI tool that makes it easy to download files from openload.co

Stars: ✭ 36 (+140%)

Mutual labels: downloader

Dom

Modern DOM API.

Stars: ✭ 88 (+486.67%)

Mutual labels: webpage

YoutubeSpotifyDL

Youtube and Spotify music downloader with metadata.

Stars: ✭ 66 (+340%)

Mutual labels: downloader

insta-dl

📷 Download Instagram images from a public user.

Stars: ✭ 88 (+486.67%)

Mutual labels: downloader

ulboracms

Ulbora CMS is a self-contained CMS (no database needed) written in Golang. It uses a JSON datastore with content saved in both json files and in memory. You can download and upload a single binary backup file containing content, images, and templates as needed. It also has a built-in mail sender.

Stars: ✭ 42 (+180%)

Mutual labels: webpage

k8s-log

A container log collection suite.

Stars: ✭ 15 (+0%)

Mutual labels: buffer

pyCreeper

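An information-gathering (crawler) framework for quickly extracting web page content, with support for dynamically loading and controlling pages.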

Stars: ✭ 25 (+66.67%)

Mutual labels: phantomjs

kasthack.osp

Generator of raw dumps of VK users.