=========
scrapelib
=========

.. image:: https://github.com/jamesturk/scrapelib/workflows/Test/badge.svg
    :target: https://github.com/jamesturk/scrapelib/actions

.. image:: https://coveralls.io/repos/jamesturk/scrapelib/badge.png?branch=master
    :target: https://coveralls.io/r/jamesturk/scrapelib

.. image:: https://img.shields.io/pypi/v/scrapelib.svg
    :target: https://pypi.python.org/pypi/scrapelib

.. image:: https://readthedocs.org/projects/scrapelib/badge/?version=latest
    :target: https://readthedocs.org/projects/scrapelib/?badge=latest
    :alt: Documentation Status

scrapelib is a library for making requests to less-than-reliable websites. It is implemented (as of 0.7) as a wrapper around `requests <http://python-requests.org>`_.

scrapelib originated as part of the `Open States <http://openstates.org/>`_ project, which scrapes the websites of all 50 state legislatures. As a result, it was designed with features that are desirable when dealing with sites that have intermittent errors or require rate-limiting.

Advantages of using scrapelib over alternatives like httplib2 or simply using requests as-is:

* All of the power of the superb `requests <http://python-requests.org>`_ library.
* HTTP, HTTPS, and FTP requests via an identical API
* support for simple caching with pluggable cache backends
* request throttling
* configurable retries for non-permanent site failures
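To make the idea of retries for non-permanent failures concrete, here is a minimal standard-library sketch. It is not scrapelib's implementation: the helper ``fetch_with_retries`` and its parameter names are hypothetical, and treating ``OSError`` as the "temporary" error class, with a wait that doubles after each failure, is an assumption made for illustration.

```python
import time


def fetch_with_retries(fetch, retry_attempts=3, retry_wait_seconds=1.0,
                       sleep=time.sleep):
    """Call fetch() until it succeeds or the retry budget is spent.

    Hypothetical helper for illustration only. The wait doubles after
    each failed attempt (exponential backoff).
    """
    attempt = 0
    while True:
        try:
            return fetch()
        except OSError:
            # Assumption: I/O-style errors (including ConnectionError) are
            # temporary; any other exception propagates immediately.
            if attempt >= retry_attempts:
                raise
            sleep(retry_wait_seconds * (2 ** attempt))
            attempt += 1
```

Passing ``sleep`` as a parameter keeps the sketch easy to exercise without real delays; in normal use the default ``time.sleep`` applies.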

Written by James Turk, thanks to Michael Stephens for the initial urllib2/httplib2 version.

See https://github.com/jamesturk/scrapelib/graphs/contributors for contributors.

Requirements
============

* python 2.7, >=3.3
* requests >= 2.0 (earlier versions may work but aren't tested)

Example Usage
=============

Documentation: http://scrapelib.readthedocs.org/en/latest/

::

  import scrapelib
  s = scrapelib.Scraper(requests_per_minute=10)

  # Grab Google front page
  s.get('http://google.com')

  # Will be throttled to 10 HTTP requests per minute
  while True:
      s.get('http://example.com')
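As a rough picture of what throttling to 10 requests per minute means, the sketch below spaces calls at least 60 / 10 = 6 seconds apart. It uses only the standard library and is not how scrapelib is implemented; the ``Throttle`` class, its ``wait`` method, and the injectable ``clock`` and ``sleep`` arguments are hypothetical names chosen for this illustration.

```python
import time


class Throttle:
    """Space calls evenly so at most requests_per_minute occur per minute.

    Hypothetical class for illustration; not part of scrapelib.
    """

    def __init__(self, requests_per_minute, clock=time.monotonic,
                 sleep=time.sleep):
        # Minimum number of seconds between two consecutive requests.
        self.interval = 60.0 / requests_per_minute
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self):
        """Block until the next request is allowed, then record its time."""
        now = self._clock()
        if self._last is not None:
            remaining = self.interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling ``wait()`` before each request would let the first request through immediately and delay later ones only when they arrive less than one interval after the previous request.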
