gusty / ScrapeM

Licence: Apache-2.0 (LICENSE), Unlicense (LICENSE.txt)
A monadic web scraping library

Programming Languages

F#
shell
Batchfile

Projects that are alternatives to, or similar to, ScrapeM

yellowpages-scraper
Yellowpages.com Web Scraper written in Python and LXML to extract business details available based on a particular category and location.
Stars: ✭ 56 (+229.41%)
Mutual labels:  scraper, extract
OLX Scraper
📻 An OLX Scraper using Scrapy + MongoDB. It Scrapes recent ads posted regarding requested product and dumps to NOSQL MONGODB.
Stars: ✭ 15 (-11.76%)
Mutual labels:  scraper, scrapping
freeDictionaryAPI
There was no free Dictionary API on the web when I wanted one for my friend, so I created one.
Stars: ✭ 1,352 (+7852.94%)
Mutual labels:  scraper
esaj
Scrapers for many e-SAJ systems
Stars: ✭ 35 (+105.88%)
Mutual labels:  scraper
crohme-data-extractor
A modified extractor for the CROHME handwritten math symbols dataset.
Stars: ✭ 18 (+5.88%)
Mutual labels:  extract
wikipedia for humans
No description or website provided.
Stars: ✭ 44 (+158.82%)
Mutual labels:  scraper
money-parser
Price and currency parsing utility
Stars: ✭ 26 (+52.94%)
Mutual labels:  scrapping
maybe-baby
Minimize defensive coding. A JavaScript implementation of the Maybe monad.
Stars: ✭ 42 (+147.06%)
Mutual labels:  monad
monas
🦋 Scala monads for javascript
Stars: ✭ 21 (+23.53%)
Mutual labels:  monad
copycat
A PHP Scraping Class
Stars: ✭ 70 (+311.76%)
Mutual labels:  scraper
angel.co-companies-list-scraping
No description or website provided.
Stars: ✭ 54 (+217.65%)
Mutual labels:  scraper
acefile
read/test/extract ACE 1.0 and 2.0 archives in pure python
Stars: ✭ 67 (+294.12%)
Mutual labels:  extract
youtube
Create a ZIM file from a Youtube channel/username/playlist
Stars: ✭ 25 (+47.06%)
Mutual labels:  scraper
Chrome-Extractor
Python script that will extract all saved passwords from your google chrome database on windows only
Stars: ✭ 51 (+200%)
Mutual labels:  extract
premeStock
Monitors for restocks
Stars: ✭ 53 (+211.76%)
Mutual labels:  scraper
diosts
A Go scraper that validates security.txt files and outputs them in the disclose.io JSON format.
Stars: ✭ 18 (+5.88%)
Mutual labels:  scraper
alea
Coq library for reasoning on randomized algorithms [maintainers=@anton-trunov,@volodeyka]
Stars: ✭ 20 (+17.65%)
Mutual labels:  monad
go-jd
JD.com (京东) app automatic login and automatic ordering of online products
Stars: ✭ 158 (+829.41%)
Mutual labels:  scraper
sypht-golang-client
A Golang client for the Sypht API
Stars: ✭ 33 (+94.12%)
Mutual labels:  extract
VK-Scraper
Scrapes VK user's photos
Stars: ✭ 42 (+147.06%)
Mutual labels:  scraper

ScrapeM

A monadic web scraping library

This library makes web scraping easier by automatically maintaining state across requests, handling cookies, form submission and HTTP headers.
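To illustrate what carrying state across requests means, here is a minimal sketch of a State monad that threads a cookie jar through two simulated requests. It is not ScrapeM's API: the `State` type, the `state` builder, `fakeRequest` and the cookie values are all hypothetical names made up for this example, and no network call is made.

```fsharp
// A tiny State monad: a function from a state to a result plus a new state.
// Everything in this block is a hypothetical illustration, not ScrapeM code.
type State<'s, 'a> = State of ('s -> 'a * 's)

let runState (State f) s = f s
let getState<'s> : State<'s, 's> = State (fun s -> (s, s))
let putState (s: 's) : State<'s, unit> = State (fun _ -> ((), s))

type StateBuilder() =
    member _.Return(x: 'a) : State<'s, 'a> = State (fun s -> (x, s))
    member _.Bind(m: State<'s, 'a>, f: 'a -> State<'s, 'b>) : State<'s, 'b> =
        State (fun s ->
            let (a, s1) = runState m s
            runState (f a) s1)

let state = StateBuilder()

// The state being threaded: a cookie jar.
type Cookies = Map<string, string>

// Stand-in for an HTTP request: reads the current cookies and,
// for a URL ending in /login, pretends the server set a session cookie.
let fakeRequest (url: string) : State<Cookies, string> =
    state {
        let! cookies = getState
        let body = sprintf "GET %s with %d cookie(s)" url (Map.count cookies)
        let updated =
            if url.EndsWith("/login") then Map.add "session" "abc123" cookies
            else cookies
        do! putState updated
        return body
    }

// Two requests in sequence; the cookie jar is passed along implicitly.
let session =
    state {
        let! first = fakeRequest "https://example.com/login"
        let! second = fakeRequest "https://example.com/account"
        return [ first; second ]
    }

let (pages, finalCookies) = runState session Map.empty
```

The second request sees the cookie set by the first without the caller passing it explicitly, which is the kind of bookkeeping a state-maintaining scraping library takes over.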

One function to scrap'em all

This is essentially a single-function library which integrates many existing libraries and presents several ways to approach web scraping by using different monads.

All other common functions used here come from other libraries such as FSharp.Data, Http.fs and F#+.

Scrapes the web with category

It's possible to create stateful LINQ-style queries which simulate basic user interaction, including form submission, by using different flavours of State monads. Sequence expressions are also available to integrate the data extracted from multiple web pages in the same query.
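As a rough illustration of combining data from several pages in one query, the sketch below uses a plain F# sequence expression over in-memory HTML. It does not use ScrapeM: the `pages` list, the `anchor` pattern and the sample markup are assumptions made for this example, and a real scraper would use a proper HTML parser rather than a regular expression.

```fsharp
open System.Text.RegularExpressions

// Hypothetical pages: (url, html) pairs standing in for downloaded documents.
let pages =
    [ ("https://example.com/a", "<a href=\"/x\">X</a><a href=\"/y\">Y</a>")
      ("https://example.com/b", "<a href=\"/z\">Z</a>") ]

// Naive anchor pattern, adequate only for the fixed sample markup above.
let anchor = Regex("<a href=\"([^\"]+)\">([^<]*)</a>")

// One query over all pages: each result records the page it came from,
// the link target and the link text.
let links =
    seq {
        for (url, html) in pages do
            for m in anchor.Matches(html) |> Seq.cast<Match> do
                yield (url, m.Groups.[1].Value, m.Groups.[2].Value)
    }

let result = List.ofSeq links
```

Because the sequence is lazy, nothing is extracted until `result` forces it, so the same query shape also works when each page is fetched on demand.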

Getting started

Important: this library is currently at the 'Prototype' stage.

Recommended: use Visual Studio 2017 to avoid slow compile times for generic code.

To try the examples, run the build script:

> build.cmd    // on Windows
$ ./build.sh   // on Unix

Now you can try the sample files: