A versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use.

Stars: ✭ 656 (+210.9%)

Mutual labels: crawler, scraper

Surgeon

Declarative DOM extraction expression evaluator. 👨‍⚕️

Stars: ✭ 653 (+209.48%)

Mutual labels: parser, scraper

Cheerio

Fast, flexible, and lean implementation of core jQuery designed specifically for the server.

Stars: ✭ 24,616 (+11566.35%)

Mutual labels: parser, scraper

Abot

Cross Platform C# web crawler framework built for speed and flexibility. Please star this project! +1.

Stars: ✭ 1,961 (+829.38%)

Mutual labels: crawler, parsing

Scrapit

Scraping scripts for various websites.

Stars: ✭ 25 (-88.15%)

Mutual labels: crawler, scraper

Jkt

Simple helper to parse JSON based on independent schema

Stars: ✭ 22 (-89.57%)

Mutual labels: parser, parsing

Onion Crawler

Tor website crawler (specific for Alphabay at the time)

Stars: ✭ 15 (-92.89%)

Mutual labels: parser, crawler

Fuzi

A fast & lightweight XML & HTML parser in Swift with XPath & CSS support

Stars: ✭ 894 (+323.7%)

Mutual labels: parser, parsing

Parse Xml

A fast, safe, compliant XML parser for Node.js and browsers.

Stars: ✭ 184 (-12.8%)

Mutual labels: parser, parsing

Logos

Create ridiculously fast Lexers

Stars: ✭ 1,001 (+374.41%)

Mutual labels: parser, parsing

Social Scraper

A collection of scripts for crawling data from social networks and Vietnamese-language websites

Stars: ✭ 47 (-77.73%)

Mutual labels: crawler, scraper

Scrapy Crawlera

Crawlera middleware for Scrapy

Stars: ✭ 281 (+33.18%)

Mutual labels: crawler, scraping

Api Store

Contains all the public APIs listed in Phantombuster's API store. Pull requests welcome!

Stars: ✭ 69 (-67.3%)

Mutual labels: scraping, phantomjs

Php Svg Lib

SVG file parsing / rendering library

Stars: ✭ 1,146 (+443.13%)

Mutual labels: parser, parsing

Goscraper

Golang pkg to quickly return a preview of a webpage (title/description/images)

Stars: ✭ 72 (-65.88%)

Mutual labels: crawler, scraper

Serpscrap

SEO Python scraper to extract data from major search engine result pages. Extract data like URL, title, snippet, rich snippet and the type from search results for given keywords. Detect ads or make automated screenshots. You can also fetch the text content of URLs provided in search results or supplied by you. It's useful for SEO and business-related research tasks.

Stars: ✭ 153 (-27.49%)

Mutual labels: scraper, scraping

Rats

Movie Ratings Synchronization with Python

Stars: ✭ 156 (-26.07%)

Mutual labels: parser, parsing

Arpeggio

Parser interpreter based on PEG grammars written in Python http://textx.github.io/Arpeggio/

Stars: ✭ 204 (-3.32%)

Mutual labels: parser, parsing

Lodestone Nodejs

Character tracking and parser library for nodejs

Stars: ✭ 81 (-61.61%)

Mutual labels: parser, parsing

Mini Yaml

Single header YAML 1.0 C++11 serializer/deserializer.

Stars: ✭ 79 (-62.56%)

Mutual labels: parser, parsing

Formula Parser

Parsing and evaluating mathematical formulas given as strings.

Stars: ✭ 62 (-70.62%)

Mutual labels: parser, parsing

Dotnetcrawler

DotnetCrawler is a straightforward, lightweight web crawling/scraping library with Entity Framework Core output, based on .NET Core. The library is designed like other strong crawler libraries such as WebMagic and Scrapy, but is extensible for your custom requirements. Medium link: https://medium.com/@mehmetozkaya/creating-custom-web-crawler-with-dotnet-core-using-entity-framework-core-ec8d23f0ca7c

Stars: ✭ 100 (-52.61%)

Mutual labels: crawler, scraping

Graphql Go Tools

Tools to write high performance GraphQL applications using Go/Golang.

Stars: ✭ 96 (-54.5%)

Mutual labels: parser, parsing

Webmagic

A scalable web crawler framework for Java.

Stars: ✭ 10,186 (+4727.49%)

Mutual labels: crawler, scraping

Scrapoxy

Scrapoxy hides your scraper behind a cloud. It starts a pool of proxies to send your requests. Now, you can crawl without thinking about blacklisting!

Stars: ✭ 1,322 (+526.54%)

Mutual labels: crawler, scraper

Google Play Scraper

Node.js scraper to get data from Google Play

Stars: ✭ 1,606 (+661.14%)

Mutual labels: crawler, scraper

Instagram Profilecrawl

💻 Quickly crawl the information (e.g. followers, tags, etc...) of an instagram profile. No login required!

Stars: ✭ 110 (-47.87%)

Mutual labels: crawler, browser

Whois Parser

Go (Golang) module for parsing domain whois information.

Stars: ✭ 123 (-41.71%)

Mutual labels: parser, parsing

Boj Autocommit

When you solve a problem on Baekjoon Online Judge, it automatically commits and pushes to the remote repository.

Stars: ✭ 60 (-71.56%)

Mutual labels: crawler, phantomjs

Onegram

This repository is no longer maintained.

Stars: ✭ 137 (-35.07%)

Mutual labels: crawler, scraper

Udemycoursegrabber

You have the will to enroll in a Udemy course, but not the money? Search no more! This Python program searches for your desired course on more than [insert big number here] websites, compares the last-updated dates, and gives you the download link of the latest one, with the option to see the others as well.

Stars: ✭ 137 (-35.07%)

Mutual labels: scraper, scraping

Parjs

JavaScript parser-combinator library

Stars: ✭ 145 (-31.28%)

Mutual labels: parser, parsing

Newspaper

News, full-text, and article metadata extraction in Python 3.

Stars: ✭ 11,545 (+5371.56%)

Mutual labels: crawler, scraper

Instagram Scraper

Scrapes media, likes, followers, tags and all metadata. Inspired by instagram-php-scraper, bot