
Deer-Spangle / faexport

Licence: BSD-3-Clause
The API for Furaffinity you wish existed

Programming Languages

Ruby
36898 projects - #4 most used programming language
Haml
164 projects
Makefile
30231 projects
JavaScript
184084 projects - #8 most used programming language
Dockerfile
14818 projects
CSS
56736 projects

Projects that are alternatives of or similar to faexport

lopez
Crawling and scraping the Web for fun and profit
Stars: ✭ 20 (-67.21%)
Mutual labels:  web-scraping
BookingScraper
🌎 🏨 Scrape Booking.com 🏨 🌎
Stars: ✭ 68 (+11.48%)
Mutual labels:  web-scraping
ioweb
Web Scraping Framework
Stars: ✭ 31 (-49.18%)
Mutual labels:  web-scraping
crawlzone
Crawlzone is a fast asynchronous internet crawling framework for PHP.
Stars: ✭ 70 (+14.75%)
Mutual labels:  web-scraping
Neural-Scam-Artist
Web Scraping, Document Deduplication & GPT-2 Fine-tuning with a newly created scam dataset.
Stars: ✭ 18 (-70.49%)
Mutual labels:  web-scraping
actor-content-checker
You can use this act to monitor any page's content and get a notification when content changes.
Stars: ✭ 16 (-73.77%)
Mutual labels:  web-scraping
Hi
A Programming language for Web Scraping
Stars: ✭ 14 (-77.05%)
Mutual labels:  web-scraping
TikTokDownloader PyWebIO
🚀 "Douyin_TikTok_Download_API" is an out-of-the-box, high-performance, asynchronous Douyin/TikTok data scraping tool, supporting API calls, online batch parsing, and downloading.
Stars: ✭ 919 (+1406.56%)
Mutual labels:  web-scraping
trafilatura
Python & command-line tool to gather text on the Web: web crawling/scraping, extraction of text, metadata, comments
Stars: ✭ 711 (+1065.57%)
Mutual labels:  web-scraping
saveddit
Bulk Downloader for Reddit
Stars: ✭ 130 (+113.11%)
Mutual labels:  web-scraping
core
The complete web scraping toolkit for PHP.
Stars: ✭ 1,110 (+1719.67%)
Mutual labels:  web-scraping
codepen-puppeteer
Use Puppeteer to download pens from Codepen.io as single html pages
Stars: ✭ 22 (-63.93%)
Mutual labels:  web-scraping
tanukai
Furry imageboard / Manga/anime image search engine
Stars: ✭ 25 (-59.02%)
Mutual labels:  furry
2017-summer-workshop
Exercises, data, and more for our 2017 summer workshop (funded by the Estes Fund and in partnership with Project Jupyter and Berkeley's D-Lab)
Stars: ✭ 33 (-45.9%)
Mutual labels:  web-scraping
FurryBot
Furry Bot for Discord
Stars: ✭ 17 (-72.13%)
Mutual labels:  furry
PythonScrapyBasicSetup
Basic setup with random user agents and IP addresses for Python Scrapy Framework.
Stars: ✭ 57 (-6.56%)
Mutual labels:  web-scraping
scrapy-wayback-machine
A Scrapy middleware for scraping time series data from Archive.org's Wayback Machine.
Stars: ✭ 92 (+50.82%)
Mutual labels:  web-scraping
Python
Covers Python topics from basic to advanced, practice questions, logical problems in Python, and web development using HTML, CSS, Bootstrap, jQuery, DOM, and Django 🚀🚀 💥 🌈
Stars: ✭ 29 (-52.46%)
Mutual labels:  web-scraping
reapr
🕸→ℹ️ Reap Information from Websites
Stars: ✭ 14 (-77.05%)
Mutual labels:  web-scraping
Stock-Market-Predictor
Stock market predictor with an LSTM network, plus web scraping and analysis tools (OHLC, mean)
Stars: ✭ 28 (-54.1%)
Mutual labels:  web-scraping

FAExport

[Badges: regression-tests CI status, Docker Image Version (latest semver), Uptime Robot status]

Simple data export and feeds from FA. Check out the documentation for a full list of functionality. The file lib/faexport/scraper.rb contains all the code required to access data from FA.
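
For example, once you have an instance of the API running (locally or hosted), data can be fetched as JSON over plain HTTP. The host, port, and endpoint path below are illustrative only; see the documentation for the actual routes:

curl http://localhost:9292/user/some_username.json  # path and port are illustrative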

This API was originally developed by boothale, but after he had been missing and unresponsive to emails for many months, deer-spangle forked it and now maintains it.

Authentication

To use endpoints which require a login cookie, or to run your own copy of the API, you will need to generate a valid FA cookie string.
A valid FA cookie string looks like this:

"b=3a485360-d203-4a38-97e8-4ff7cdfa244c; a=b1b985c4-d73e-492a-a830-ad238a3693ef"

The a and b cookie values can be obtained from your browser's storage inspector while on any FA page.
In Firefox, the storage inspector can be opened by pressing Shift+F9; in Chrome, open the developer tools with F12, select the "Application" tab, and then "Cookies".
You may want to do this in a private browsing session, as logging out of your account will invalidate the cookie and break the scraper. The cookie must belong to an account that is set to view the site in classic mode; the modern style cannot be parsed by this API.

To authenticate with the API, provide that string in the FA_COOKIE header. (Note: this is a header, not a cookie.)
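
For example, using curl against a locally running instance, a request to a login-protected endpoint might look like the following; the endpoint path here is only an illustration, but the FA_COOKIE header is passed the same way for any protected route:

curl -H "FA_COOKIE: b=3a485360-d203-4a38-97e8-4ff7cdfa244c; a=b1b985c4-d73e-492a-a830-ad238a3693ef" http://localhost:9292/notifications/submissions.json  # endpoint path is illustrative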

Development Setup

If you simply run:

make install
make run

it should install the required packages and then run the server, though it may warn about a missing FA_COOKIE environment variable.

You can customise the FA_COOKIE value and PORT by passing them like so:

make FA_COOKIE="b\=...\;a\=..." PORT=9292 run

For ease of development, you can remove the need to specify an environment variable for the Fur Affinity cookie by creating a file named settings.yml in the root directory containing a valid FA cookie:

cookie: "b=3a485360-d203-4a38-97e8-4ff7cdfa244c; a=b1b985c4-d73e-492a-a830-ad238a3693ef"

Deploying - Docker

This application is available as a Docker image, so you don't need to install Ruby, Bundler, or other dependencies yourself. The image is available on Docker Hub here: https://hub.docker.com/r/deerspangle/furaffinity-api
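
If you prefer not to use the Makefile or docker-compose, a minimal sketch of running the image directly might look like the following; the internal port and the exact environment variables the image expects are assumptions here, and note that the compose setup also starts the Redis container, which this standalone command does not:

docker run -d -e FA_COOKIE="b=...; a=..." -p 9292:9292 deerspangle/furaffinity-api  # internal port is an assumption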

To deploy a Redis container and the Fur Affinity API container, linked together, you can run:

FA_COOKIE="b\=...\;a\=..." docker-compose up

or simply:

make FA_COOKIE="b\=...\;a\=..." deploy

The API will default to being exposed on port 80, but you can customise this by passing in the PORT environment variable:

make FA_COOKIE="b\=...\;a\=..." PORT=9292 deploy

If Cloudflare protection is active, you can launch a pair of Cloudflare bypass containers alongside the API:

make FA_COOKIE="b\=...\;a\=..." deploy_bypass

Deploying - Heroku

This application can be run on Heroku; just add an instance of Heroku Data for Redis® for caching. Rather than uploading settings.yml, set the FA_COOKIE environment variable to the cookie string you gathered from FA.
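
For example, with the Heroku CLI the cookie can be set as a config var like so (the app name is a placeholder):

heroku config:set FA_COOKIE="b=...; a=..." --app your-faexport-app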

Prometheus metrics and security

A number of metrics are exposed at /metrics, which can be used for observability. Metrics cover the deployed version, error rates, request/response times, and usage patterns across endpoints and format types. They are grouped into API metrics and scraper metrics: scraper metrics are prefixed with "faexport_scraper", API endpoint metrics are prefixed with "faexport_endpoint", and all others are prefixed with just "faexport_".

The Prometheus metrics endpoint can be secured with basic auth by passing a PROMETHEUS_PASS environment variable. This sets the password for the /metrics endpoint, with a blank username. The environment variable can be passed to locally running instances, or to Docker or Docker Compose.
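
For example, assuming an instance started with PROMETHEUS_PASS=hunter2 (a made-up password), the metrics endpoint could then be scraped with basic auth and a blank username:

curl -u ":hunter2" http://localhost:9292/metrics  # password value is hypothetical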
