Alternatives and detailed information of beautifulscraper

adregner / beautifulscraper

Licence: other

Python web-scraping library that wraps urllib2 and BeautifulSoup

Programming Languages

python

139335 projects - #7 most used programming language

Projects that are alternatives of or similar to beautifulscraper

single-sign-on-out-jwt-cookie-redis-java-springboot-freemarker

Single Sign Out, Scalable Authentication Example with JSON Web Token (JWT), Spring Boot and Redis

Stars: ✭ 15 (-62.5%)

Mutual labels: cookie

titanium-cookies

Me want cookies. OM NOM NOM.

Stars: ✭ 20 (-50%)

Mutual labels: cookie

http

Basic HTTP primitives which can be shared by servers and clients.

Stars: ✭ 75 (+87.5%)

Mutual labels: cookie

html-table-extractor

Extract data from HTML tables

Stars: ✭ 74 (+85%)

Mutual labels: beautifulsoup

react-cookie-law

React Cookie Law is a cookie-info banner compliant with the GDPR and the EU cookie law. It lets the user give consent in a granular way.

Stars: ✭ 103 (+157.5%)

Mutual labels: cookie

burp-cookie-porter

A Burp Suite plugin for quickly "porting" cookies

Stars: ✭ 22 (-45%)

Mutual labels: cookie

non-api-fb-scraper

Scrape public Facebook posts from any group or user into a .csv file without needing to register for any API access

Stars: ✭ 40 (+0%)

Mutual labels: beautifulsoup

macaroons-rs

Macaroons: bearer credentials with caveats for distributed authorization

Stars: ✭ 62 (+55%)

Mutual labels: cookie

PacPaw

Pawn package manager for SA-MP

Stars: ✭ 14 (-65%)

Mutual labels: beautifulsoup

Convenient way to use cookies with PSR-7

Stars: ✭ 17 (-57.5%)

Mutual labels: cookie

next-cookie

Cookie serializer and deserializer library for next.js

Stars: ✭ 190 (+375%)

Mutual labels: cookie

grailer

web scraping tool for grailed.com

Stars: ✭ 30 (-25%)

Mutual labels: beautifulsoup

backcookie

Small backdoor using cookie.

Stars: ✭ 49 (+22.5%)

Mutual labels: cookie

linkedinBot

Automate the process of sending referral requests and cold mailing on LinkedIn

Stars: ✭ 25 (-37.5%)

Mutual labels: beautifulsoup

elm-storage

Unified interface for accessing and modifying LocalStorage, SessionStorage and Cookies

Stars: ✭ 13 (-67.5%)

Mutual labels: cookie

Tieba-Birthday-Spider

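Baidu Tieba birthday spider: scrapes the birthdays of members of a Tieba forum and automatically sends greetings on the corresponding dates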

Stars: ✭ 28 (-30%)

Mutual labels: beautifulsoup

beautifulsoup.dart

A dart port of the famous python library beautifulsoup

Stars: ✭ 19 (-52.5%)

Mutual labels: beautifulsoup

Manage your cookies on client and server side (Angular Universal)

Stars: ✭ 40 (+0%)

Mutual labels: cookie

F5-BIGIP-Decoder

Detecting and decoding BIGIP cookies in bash

Stars: ✭ 28 (-30%)

Mutual labels: cookie

BenGorCookies

Cookie warning banner that requests user consent, compliant with European law. Zero dependencies, fully customizable JavaScript library for IE9+

Stars: ✭ 12 (-70%)

Mutual labels: cookie

BeautifulScraper

Simple wrapper around BeautifulSoup for HTML parsing and urllib2 for HTTP(S) request/response handling. BeautifulScraper also overrides some of the default handlers in urllib2 in order to:

Handle cookies properly
Offer full control of included cookies
Return the actual response from the server, un-mangled and not reprocessed
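
The cookie handling listed above can be approximated with the standard library alone. What follows is a minimal sketch in Python 3, where urllib2 and cookielib were renamed to urllib.request and http.cookiejar. It is not BeautifulScraper's actual implementation, and build_cookie_opener is a hypothetical helper name chosen for illustration.

```python
import http.cookiejar
import urllib.request


def build_cookie_opener(jar=None):
    """Return (opener, jar); the opener stores and resends cookies via jar.

    Hypothetical helper for illustration, not part of BeautifulScraper.
    """
    # An empty CookieJar is falsy, so test against None explicitly.
    if jar is None:
        jar = http.cookiejar.CookieJar()
    # HTTPCookieProcessor copies Set-Cookie headers from responses into the
    # jar and adds matching Cookie headers to later requests.
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


opener, jar = build_cookie_opener()
# No request has been made yet, so the jar starts out empty.
print(len(jar))
```

Passing your own jar to the helper gives the caller full control over which cookies are sent, since cookies can be added to or cleared from the jar between requests.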

Installation

# pip install beautifulscraper

# git clone git://github.com/adregner/beautifulscraper.git
# cd beautifulscraper/
# python setup.py install

Examples

Getting started is brain-dead simple.

>>> from beautifulscraper import BeautifulScraper
>>> scraper = BeautifulScraper()

Start by requesting something.

>>> body = scraper.go("https://github.com/adregner/beautifulscraper")

The response will be a plain BeautifulSoup object. See the BeautifulSoup documentation for how to use it.

>>> body.select(".repository-meta-content")[0].text
'\n\n            Python web-scraping library that wraps urllib2 and BeautifulSoup\n          \n'

The headers from the server's response are accessible.

>>> for header, value in scraper.response_headers.items():
...     print "%s: %s" % (header, value)
...
status: 200 OK
content-length: 36179
set-cookie: _gh_sess=BAh7BzoQX2NzcmZfdG9rZW4iMUNCOWxnbFpVd3EzOENqVk9GTUFXbDlMVUJIbGxsNEVZUFZJNiswRjhwejQ9Og9zZXNzaW9uX2lkIiUyNmQ2ODE5ZDdiZjM3MTA2N2VlZDk3Y2VlMDViYzI2OA%3D%3D--5d31df13d5c0eeb8f3cccb140392124968abc374; path=/; expires=Sat, 01-Jan-2022 00:00:00 GMT; secure; HttpOnly
strict-transport-security: max-age=2592000
connection: close
server: nginx
x-runtime: 98
etag: "1c595b5b6a25eb7f021e68c3476d61da"
cache-control: private, max-age=0, must-revalidate
date: Wed, 31 Oct 2012 02:14:08 GMT
x-frame-options: deny
content-type: text/html; charset=utf-8

So is the response code as an integer.

>>> type(scraper.response_code), scraper.response_code
(<type 'int'>, 200)

The scraper will keep track of all cookies it sees via the cookielib.CookieJar class. You can read the cookies if you'd like. The Cookie objects are just a collection of properties.

>>> scraper.cookies[0].name
'_gh_sess'
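
To show what "a collection of properties" looks like in practice, here is a small standalone sketch that builds a Cookie by hand and reads its attributes. It uses Python 3's http.cookiejar (the renamed cookielib) rather than BeautifulScraper itself. The make_cookie helper and the cookie value are made up for the example.

```python
from http.cookiejar import Cookie, CookieJar


def make_cookie(name, value, domain, path="/", secure=False):
    """Build a session Cookie; hypothetical helper with illustrative values."""
    return Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=True,
        domain_initial_dot=domain.startswith("."),
        path=path, path_specified=True,
        secure=secure, expires=None, discard=True,
        comment=None, comment_url=None, rest={},
    )


jar = CookieJar()
jar.set_cookie(make_cookie("_gh_sess", "example-value", "github.com", secure=True))

# Iterating a CookieJar yields Cookie objects with plain attributes.
for cookie in jar:
    print(cookie.name, cookie.domain, cookie.path, cookie.secure)
```

A CookieJar is iterable but does not support indexing, so converting it with list(jar) is one way to get the positional access used in the example above.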

See the pydoc for more information.
