Licence: GPL-3.0


WikiTeam

We archive wikis, from Wikipedia to tiniest wikis

WikiTeam software is a set of tools for archiving wikis. They work on MediaWiki wikis, but we want to expand to other wiki engines. As of 2020, WikiTeam has preserved more than 250,000 wikis, several wikifarms, regular Wikipedia dumps and 34 TB of Wikimedia Commons images.

There are thousands of wikis on the Internet. Every day, some of them disappear from the public web and, for lack of backups, are lost forever. Millions of people download tons of media files (movies, music, books, etc.) from the Internet, which serves as a kind of distributed backup; wikis, most of them under free licenses, nonetheless vanish from time to time because nobody grabbed a copy of them. That is a shame we would like to fix.

WikiTeam is the Archive Team (GitHub) subcommittee on wikis. It was founded and originally developed by Emilio J. Rodríguez-Posada, a veteran Wikipedia editor and amateur archivist. Many people have helped by sending suggestions, reporting bugs, writing documentation, providing help on the mailing list, and making wiki backups. Thanks to all, especially to: Federico Leva, Alex Buie, Scott Boyd, Hydriz, Platonides, Ian McEwen, Mike Dupont, balr0g and PiRSquared17.

Documentation · Source code · Download available backups · Community · Follow us on Twitter

Quick guide

This is a very quick guide for the most used features of WikiTeam tools. For further information, read the tutorial and the rest of the documentation. You can also ask in the mailing list.

Requirements

Requires Python 2.7.

Install the required modules:

pip install --upgrade -r requirements.txt

or, if you don't have enough permissions for the above,

pip install --user --upgrade -r requirements.txt
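Since the tools target Python 2.7 specifically, a minimal, purely illustrative guard (not part of the WikiTeam scripts) can catch a version mismatch before a long download starts:

```python
import sys

# Illustrative only, not part of dumpgenerator.py: the WikiTeam tools
# target Python 2.7, so fail fast with a clear message on other versions.
def is_supported(version_info=sys.version_info):
    major, minor = version_info[0], version_info[1]
    return (major, minor) == (2, 7)

if not is_supported():
    sys.stderr.write('WikiTeam tools require Python 2.7\n')
```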

Download any wiki

To download any wiki, use one of the following options:

python dumpgenerator.py http://wiki.domain.org --xml --images (complete XML histories and images)

If the script can't find the API and/or index.php paths by itself, you can provide them:

python dumpgenerator.py --api=http://wiki.domain.org/w/api.php --xml --images

python dumpgenerator.py --api=http://wiki.domain.org/w/api.php --index=http://wiki.domain.org/w/index.php --xml --images
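When only the wiki root is given, MediaWiki installs typically expose the API at a handful of conventional paths, which is why the script can often guess them. A hypothetical guessing helper (this is not dumpgenerator's actual code; the path list is an assumption based on common MediaWiki layouts) might look like:

```python
# Hypothetical helper, not part of dumpgenerator.py: candidate locations
# where a MediaWiki install commonly exposes api.php, tried in order when
# only the wiki root URL is known.
def candidate_api_urls(base):
    base = base.rstrip('/')
    # /w/api.php is the Wikimedia-style layout; /api.php is a root install.
    return [base + path for path in ('/w/api.php', '/api.php', '/wiki/api.php')]

for url in candidate_api_urls('http://wiki.domain.org/'):
    print(url)
```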

If you only want the XML histories, just use --xml. For only the images, just --images. For only the current version of every page, --xml --curonly.
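One way such dumps are produced is through MediaWiki's Special:Export endpoint, whose form accepts a list of page titles and a current-revision-only flag. As a rough sketch (this is not WikiTeam's actual code; the field names are the standard Special:Export form fields), an export URL corresponding to --curonly could be built like this:

```python
# Illustrative sketch of a MediaWiki Special:Export request, not
# WikiTeam's actual implementation.
try:
    from urllib import urlencode           # Python 2
except ImportError:
    from urllib.parse import urlencode     # Python 3

def export_url(index_url, titles, curonly=False):
    """Build a Special:Export URL for the given page titles."""
    params = {
        'title': 'Special:Export',
        'pages': '\n'.join(titles),   # one title per line, as in the form
    }
    if curonly:
        params['curonly'] = '1'       # current revision only, like --curonly
    return index_url + '?' + urlencode(params)

print(export_url('http://wiki.domain.org/w/index.php',
                 ['Main Page'], curonly=True))
```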

You can resume an aborted download:

python dumpgenerator.py --api=http://wiki.domain.org/w/api.php --xml --images --resume --path=/path/to/incomplete-dump

See more options:

python dumpgenerator.py --help

Download Wikimedia dumps

To download Wikimedia XML dumps (Wikipedia, Wikibooks, Wikinews, etc.) you can run:

python wikipediadownloader.py (download all projects)

See more options:

python wikipediadownloader.py --help

Download Wikimedia Commons images

There is a script for this task, but we have already uploaded the tarballs to the Internet Archive, so it is more useful to reseed their torrents than to regenerate old ones with the script.

Developers


You can run the tests easily with the tox command. It may already be packaged for your operating system; you need at least version 1.6. If it is not, you can install it from PyPI with: pip install tox.

Example usage:

$ tox
py27 runtests: commands[0] | nosetests --nocapture --nologcapture
Checking http://wiki.annotation.jp/api.php
Trying to parse かずさアノテーション - ソーシャル・ゲノム・アノテーション.jpg from API
Retrieving image filenames
.    Found 266 images
.
-------------------------------------------
Ran 1 test in 2.253s

OK
_________________ summary _________________
  py27: commands succeeded
  congratulations :)
$