
Some MediaWiki bot examples, including Wikipedia and Wikidata bots, built with the MediaWiki module of the CeJS library.

Stars: ✭ 26 (-82.31%)

Mutual labels: mediawiki, wikipedia

Mediawiki

MediaWiki API wrapper in python http://pymediawiki.readthedocs.io/en/latest/

Stars: ✭ 89 (-39.46%)

Mutual labels: wikipedia, mediawiki

Apps Android Wikipedia

📱The official Wikipedia app for Android!

Stars: ✭ 1,350 (+818.37%)

Mutual labels: wikipedia, mediawiki

discord-wiki-bot

Wiki-Bot is a bot with the purpose to easily search for and link to wiki pages. Wiki-Bot shows short descriptions and additional info about the pages and is able to resolve redirects and follow interwiki links.

Stars: ✭ 69 (-53.06%)

Mutual labels: mediawiki, wikipedia

DiscordWikiBot

Discord bot for Wikimedia projects and MediaWiki wiki sites

Stars: ✭ 30 (-79.59%)

Mutual labels: mediawiki, wikipedia

Wikipedia Mirror

🌐 Guide and tools to run a full offline mirror of Wikipedia.org with three different approaches: Nginx caching proxy, Kiwix + ZIM dump, and MediaWiki/XOWA + XML dump

Stars: ✭ 160 (+8.84%)

Mutual labels: wikipedia, mediawiki

Wikiteam

Tools for downloading and preserving wikis. We archive wikis, from Wikipedia to tiniest wikis. As of 2020, WikiTeam has preserved more than 250,000 wikis.

Stars: ✭ 404 (+174.83%)

Mutual labels: wikipedia, mediawiki

Wptools

Wikipedia tools (for Humans): easily extract data from Wikipedia, Wikidata, and other MediaWikis

Stars: ✭ 371 (+152.38%)

Mutual labels: wikipedia, mediawiki

Linq To Wiki

.Net library to access MediaWiki API

Stars: ✭ 93 (-36.73%)

Mutual labels: wikipedia, mediawiki

Jwiki

📖 A library for effortlessly interacting with Wikipedia/MediaWiki

Stars: ✭ 69 (-53.06%)

Mutual labels: wikipedia, mediawiki

Mwclient

Python client library to interface with the MediaWiki API

Stars: ✭ 221 (+50.34%)

Mutual labels: wikipedia, mediawiki

cassandra-GLAM-tools

Support GLAMs in monitoring and evaluating their cooperation with Wikimedia projects

Stars: ✭ 17 (-88.44%)

Mutual labels: mediawiki, wikipedia

Mediawiki

🌻 The collaborative editing software that runs Wikipedia. Mirror from https://gerrit.wikimedia.org/g/mediawiki/core. See https://mediawiki.org/wiki/Developer_access for contributing.

Stars: ✭ 2,752 (+1772.11%)

Mutual labels: wikipedia, mediawiki

wikiapi

JavaScript MediaWiki API for node.js

Stars: ✭ 28 (-80.95%)

Mutual labels: mediawiki, wikipedia

copyvios

A copyright violation detector running on Wikimedia Cloud Services

Stars: ✭ 32 (-78.23%)

Mutual labels: mediawiki, wikipedia

Mwparserfromhell

A Python parser for MediaWiki wikicode

Stars: ✭ 440 (+199.32%)

Mutual labels: wikipedia, mediawiki

Mediawiker

Mediawiker is a plugin for the Sublime Text editor that lets you use it as a wiki editor on MediaWiki-based sites such as Wikipedia and many others.

Stars: ✭ 120 (-18.37%)

Mutual labels: wikipedia, mediawiki

Isbntools

python app/framework for 'all things ISBN' including metadata, descriptions, covers...

Stars: ✭ 122 (-17.01%)

Mutual labels: wikipedia


Infoboxer

Infoboxer is a pure-Ruby Wikipedia (and generic MediaWiki) client and parser, targeting information extraction (hence the name).

It can be useful in tasks like:

get a plaintext abstract of an article (the paragraphs before the first heading);
get structured data variables from a page's infobox;
list a page's sections and count the paragraphs, images and tables in them;
convert a huge "comparison table" to data;
and much, much more!
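To make the first task concrete, here is a minimal sketch of "paragraphs before the first heading" over a toy list of blocks. The Block struct and the abstract helper are hypothetical names made up for illustration; they are not part of Infoboxer's API, which works on a real parsed tree.

```ruby
# Toy model of a page body: an ordered list of typed blocks.
# Block and abstract are hypothetical, NOT Infoboxer classes or methods.
Block = Struct.new(:type, :text)

# Collect everything before the first heading, keep only paragraphs,
# and join their text with blank lines.
def abstract(blocks)
  blocks.take_while { |b| b.type != :Heading }
        .select { |b| b.type == :Paragraph }
        .map(&:text)
        .join("\n\n")
end

blocks = [
  Block.new(:Template, 'Infobox country'),
  Block.new(:Paragraph, 'Argentina is a country in South America.'),
  Block.new(:Paragraph, 'Its capital is Buenos Aires.'),
  Block.new(:Heading, 'History'),
  Block.new(:Paragraph, 'Later text is not part of the abstract.')
]

summary = abstract(blocks)
puts summary
```

The template before the first paragraph is skipped and everything after the first heading is ignored, which is the whole definition of the abstract in this toy model.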

The whole idea is this: you can have any Wikipedia page as a parsed tree with an obvious structure, you can navigate that tree easily, and you have a bunch of high-level helper methods, so typical information extraction tasks should be super easy: one-liners in the best cases.

(If you are already thinking "Why do this when we already have DBpedia?", please read the "Reasons" page in our wiki.)

Showcase

Infoboxer.wikipedia.
  get('Breaking Bad (season 1)').
  sections('Episodes').templates(name: 'Episode table').
  fetch('episodes').templates(name: /^Episode list/).
  fetch_hashes('EpisodeNumber', 'EpisodeNumber2', 'Title', 'ShortSummary')
# => [{"EpisodeNumber"=>#<Var(EpisodeNumber): 1>, "EpisodeNumber2"=>#<Var(EpisodeNumber2): 1>, "Title"=>#<Var(Title): Pilot>, "ShortSummary"=>#<Var(ShortSummary): Walter White, a 50-year old che...>},
#     {"EpisodeNumber"=>#<Var(EpisodeNumber): 2>, "EpisodeNumber2"=>#<Var(EpisodeNumber2): 2>, "Title"=>#<Var(Title): Cat's in the Bag...>, "ShortSummary"=>#<Var(ShortSummary): Walt and Jesse try to dispose o...>},
#     ...and so on

Do you feel it now?

You can also take a look at the Showcase.

Usage

Install gem

Install it as usual: add gem 'infoboxer' to your Gemfile, then run bundle install.

Or just run [sudo] gem install infoboxer if you prefer.

Grab the page

# From English Wikipedia
page = Infoboxer.wikipedia.get('Argentina')
# or
page = Infoboxer.wp.get('Argentina')

# From other language Wikipedia:
page = Infoboxer.wikipedia('fr').get('Argentina')

# From any wiki with the same engine:
page = Infoboxer.wiki('http://companywiki.com').get('Our Product')

See more examples and options in Retrieving pages.

Play with page

Basically, a page is a tree of Nodes; you can think of it as a kind of DOM.

So, you can navigate it:

# Simple traversal and inspection
node = page.children.first.children.first
node.to_tree
node.to_text

# Various lookups
page.lookup(:Template, name: /^Infobox/)

See Tree navigation basics.
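As a rough mental model of what a lookup by node type and name pattern does, consider the sketch below. The Node struct, the node helper and this lookup method are hypothetical stand-ins written for this example; they only mimic the shape of the call shown above and say nothing about how Infoboxer implements it.

```ruby
# Toy stand-in for a parsed page tree. These definitions are hypothetical
# and are NOT Infoboxer's own; they only illustrate type-and-name lookup.
Node = Struct.new(:type, :name, :children) do
  # Depth-first search over all descendants: keep nodes whose type matches
  # and, if a name pattern is given, whose name matches it.
  def lookup(type, name: nil)
    children.flat_map do |child|
      hit = child.type == type && (name.nil? || name === child.name.to_s)
      (hit ? [child] : []) + child.lookup(type, name: name)
    end
  end
end

# Small constructor helper so leaf nodes get an empty children list.
def node(type, name = nil, children = [])
  Node.new(type, name, children)
end

page = node(:Page, 'Argentina', [
  node(:Template, 'Infobox country'),
  node(:Paragraph, nil, [node(:Template, 'lang')]),
  node(:Heading, 'History')
])

infoboxes = page.lookup(:Template, name: /^Infobox/)
puts infoboxes.map(&:name).inspect
```

Using `===` for the name check lets the same argument be either a regular expression or an exact string, which is one plausible way to support both styles of query.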

On top of the basic navigation, Infoboxer adds some useful shortcuts for convenience and brevity, which allow things like this:

page.section('Episodes').tables.first

See Navigation shortcuts.

To put it all together, also take a look at Data extraction tips and tricks.
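To illustrate the idea behind a section shortcut, the sketch below picks out the nodes that sit under a given heading in a toy, flat page body. The Item struct and the section helper are hypothetical and simplified (real sections nest and real nodes carry more data); they are not Infoboxer's implementation.

```ruby
# Hypothetical flat page body: headings followed by the nodes they introduce.
# This is a toy model, not Infoboxer's real node classes.
Item = Struct.new(:type, :text)

# Return the nodes that follow the heading with the given title, stopping
# at the next heading (or at the end of the page).
def section(items, title)
  start = items.index { |i| i.type == :Heading && i.text == title }
  return [] if start.nil?

  items.drop(start + 1).take_while { |i| i.type != :Heading }
end

body = [
  Item.new(:Paragraph, 'Intro text'),
  Item.new(:Heading, 'Episodes'),
  Item.new(:Table, 'Episode table'),
  Item.new(:Paragraph, 'Notes on airing'),
  Item.new(:Heading, 'Reception'),
  Item.new(:Paragraph, 'Reviews')
]

# Rough equivalent of "first table in the Episodes section".
first_table = section(body, 'Episodes').find { |i| i.type == :Table }
puts first_table.text
```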

infoboxer executable

Just try the infoboxer command.

Without any options, it starts an IRB session with infoboxer required and included into the main namespace.

With the -w option, it provides a shortcut to the MediaWiki instance you want, like this:

$ infoboxer -w https://en.wikipedia.org/w/api.php
> get('Argentina')
 => #<Page(title: "Argentina", url: "https://en.wikipedia.org/wiki/Argentina"): ....

You can also use shortcuts like infoboxer -w wikipedia for common wikis (and, just for fun, infoboxer -wikipedia also works).

Advanced topics

Reasons for Infoboxer's creation;
Parsing quality (TL;DR: very good, but not ideal);
Performance (TL;DR: 0.1-0.4 sec to parse the largest pages);
Localization (TL;DR: for now, you'll need to do some work to use Infoboxer's most advanced features with non-English or non-Wikimedia wikis; basic and mid-level features always work);
If you plan to use data from Wikipedia or its sister projects in production, please consider Wikipedia's terms and conditions.

Compatibility

As of now, Infoboxer is reported to be compatible with any MRI Ruby since 2.0.0 (1.9.3 was supported previously, but dropped since Infoboxer 0.2.0). In Travis CI tests, JRuby fails due to a bug in old Java 7/Java 8 SSL certificate support (see here), and Rubinius fails 3 specs out of 500 for reasons that have not been investigated yet.

Therefore, those Ruby implementations are excluded from the Travis config, though they may still work for you.

License

MIT.


Stars: ✭ 147


molybdenum-99 / Infoboxer
