openzim / sotoki

License: GPL-3.0
StackExchange websites to ZIM scraper

Programming Languages: Python, HTML, CSS, JavaScript, Dockerfile

Sotoki

Sotoki (Stack Overflow to Kiwix) is an openZIM scraper to create offline versions of Stack Exchange websites such as Stack Overflow.

It is based on Stack Exchange's Data Dumps hosted by The Internet Archive.

Usage

Sotoki works off a domain that you must provide: the domain name of the Stack Exchange website you want to scrape. Run sotoki --list-all to get a list of the available domains.
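
As a rough sketch of that lookup step, the helper below filters the domain list by a search term. It assumes that sotoki --list-all prints the domains as plain text that grep can filter (the exact output format is not described here), and list_domains is a hypothetical name, not part of sotoki.

```shell
# Sketch: find a Stack Exchange domain to pass to sotoki.
# Assumption: `sotoki --list-all` writes plain-text domain names to stdout.
# list_domains is a hypothetical helper name, not a sotoki command.
list_domains() {
    # $1: case-insensitive search term, e.g. "sports"
    sotoki --list-all | grep -i -- "$1"
}
```

For example, list_domains sports would print only the lines that mention "sports", if there are any.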

Docker

docker run -v my_dir:/output openzim/sotoki sotoki --help

Installation

sotoki is a Python 3 program. If you are not using the Docker image, you are advised to install it in a virtual environment so that its dependencies are not installed system-wide.

python3 -m venv ./env  # creates a virtual python environment in ./env folder
./env/bin/pip install -U pip  # upgrade pip (package manager). recommended
./env/bin/pip install -U sotoki  # install/upgrade sotoki inside virtualenv

# direct access to in-virtualenv sotoki binary, without shell-attachment
./env/bin/sotoki --help
# alias or link it for convenience
sudo ln -s "$(pwd)/env/bin/sotoki" /usr/local/bin/

# alternatively, attach virtualenv to shell
source env/bin/activate
sotoki --help
deactivate  # unloads virtualenv from shell
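
Since sotoki can end up either inside the virtualenv or on the PATH, a wrapper script may want to work out which one to call. The following is a minimal sketch of that check; find_sotoki is a hypothetical helper name, and the default ./env folder is simply the one used in the commands above.

```shell
# Sketch: print the path of a usable sotoki executable, or return 1.
# find_sotoki is a hypothetical helper, not something shipped with sotoki.
find_sotoki() {
    # $1: virtualenv folder (defaults to ./env, as in the commands above)
    venv_dir="${1:-./env}"
    if [ -x "$venv_dir/bin/sotoki" ]; then
        # Prefer the copy installed inside the virtualenv
        printf '%s\n' "$venv_dir/bin/sotoki"
    elif command -v sotoki >/dev/null 2>&1; then
        # Otherwise fall back to whatever is reachable through PATH
        command -v sotoki
    else
        echo "sotoki not found; install it in a virtualenv first" >&2
        return 1
    fi
}
```

A script could then run "$(find_sotoki)" --help without caring how sotoki was installed.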

Developers

Anybody is welcome to improve Sotoki.

To run Sotoki from the git repository, you first need to download a few external dependencies that are bundled with the Python releases. To do so, run python src/sotoki/dependencies.py.

See requirements.txt for the list of Python dependencies.
