
wpoa / OA-signalling

License: GPL-3.0
A project to coordinate implementing a system to signal whether references cited on Wikipedia are free to reuse

Programming languages: Jupyter Notebook, HTML, Python

About

This repo is part of the OA Signalling project, which aims to build a system to signal whether references cited on Wikipedia are free to reuse.

Cited sources form an integral part of both scholarly communication and Wikipedia. They are meant to support statements made in the citing articles and invite readers to dive deeper into the subject at hand.

Enhancing the accessibility of cited sources thus contributes to the educational mission of the Wikimedia community. Many sources, however, are not accessible to the average Wikipedia reader because they sit behind paywalls, and many of those that are free to read cannot be freely reused.

For scholarly articles, a system that provides article-level licensing information is currently being developed by DOAJ and CrossRef. This resource could be tapped for signalling the openness of references cited on Wikipedia.
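As a minimal sketch of how such article-level licensing information might be turned into a signal, the snippet below classifies a licence URL as free to reuse or not. The function name `is_free_to_reuse`, the assumption that the licence arrives as a Creative Commons URL, and the choice of CC BY, CC BY-SA and CC0 as the reusable set are all illustrative assumptions; the source does not describe the format the DOAJ/CrossRef system will use.

```python
from urllib.parse import urlparse

# Assumption: licence families treated as "free to reuse" in this sketch.
# NC (non-commercial) and ND (no-derivatives) variants are deliberately excluded.
FREE_LICENSE_PREFIXES = (
    "/licenses/by/",
    "/licenses/by-sa/",
    "/publicdomain/zero/",
)


def is_free_to_reuse(license_url: str) -> bool:
    """Return True if the URL points to a licence in the assumed reusable set."""
    parsed = urlparse(license_url.strip().lower())
    host = parsed.netloc
    if host.startswith("www."):
        host = host[len("www."):]
    if host != "creativecommons.org":
        return False
    # Normalise the path so that a missing trailing slash still matches.
    path = parsed.path if parsed.path.endswith("/") else parsed.path + "/"
    return path.startswith(FREE_LICENSE_PREFIXES)


if __name__ == "__main__":
    for url in (
        "https://creativecommons.org/licenses/by/4.0/",
        "https://creativecommons.org/licenses/by-nc/4.0/",
    ):
        print(url, is_free_to_reuse(url))
```

Anything the function does not recognise is reported as not free to reuse, so an unknown licence is never signalled as open by mistake.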

It is the aim of this project to provide the technical infrastructure that would enable that, and to engage the Wikimedia and Open Access communities towards implementing it.

Workflow

Here is a short version of the envisaged workflow:

  1. listen to RecentChanges feed across all Wikimedia wikis (cf. event-data-wikipedia-agent)
  2. filter by bibliographic identifier for papers (currently only DOI, long-term also PubMed ID, PMC ID, arXiv ID, JSTOR ID and perhaps others)
  3. check whether paper was cited or uncited (all steps until here are included in CrossRef's live stream of DOI citations in Wikipedia)
  4. handle potential vandalism/spam, e.g. via Revision scoring
  5. pull paper metadata from suitable source (e.g. from CrossRef/DataCite for DOIs); Recitation bot does that, and so does Source, M.D.
  6. check whether that paper is available on Wikisource (initially only English, long-term other languages too)
  7. if so, check proper representation of paper and its metadata on Wikisource (as well as on Commons, Wikidata and Wikipedia) and in case of inconsistencies, notify someone (e.g. the original citer and/or a relevant WikiProject, or simply a tracking page)
  8. if not, check whether that paper is available in JATS (currently only via PubMed Central, but long-term from anywhere); Recitation bot does that
    1. if so, check licensing of the paper
      1. if license is open, convert paper's JATS XML to MediaWiki XML
        1. upload full text to Wikisource (Recitation bot does that — see contribution history, on-wiki page list and tracking categories)
          1. check for consistency with original (perhaps via fuzzy anchoring?)
        2. upload images and media to Wikimedia Commons; this requires duplicate detection, since many images and videos are already there (Recitation bot does that too, see contribution history and tracking categories; there is an unresolved issue with high-res images); for video or audio files (covered by the Open Access Media Importer), put a copy of the original file onto the Schnittserver
      2. if license is not open, notify OA Button (perhaps via OABOT?)
  9. start or update the Wikidata items for paper and/or authors as necessary, perhaps even for references cited in the paper (bib2wikidata can upload CSL)
  10. check whether the initial citation that was identified through the RecentChanges stream is pulling bibliographic metadata from Wikidata
    1. if so, purge page to refresh display of citation information
    2. if not, update original citation with licensing/ OA Button info and links to Wikisource, Commons, Wikidata, as necessary
  11. keep track of revisions of cited references via CrossMark and notify someone of retractions etc.
  12. keep track of further citations (of the same cited reference) from within and beyond Wikimedia, e.g. via the DOI Event Tracker and notify someone (including the Cite-o-Meter)
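Steps 2 and 3 of the list above can be sketched as follows: extract DOIs from the wikitext before and after an edit, then compare the two sets to see which papers were newly cited or uncited. This is only an illustration; the regular expression is a pragmatic approximation of DOI syntax, and the names `extract_dois` and `citation_changes` are hypothetical, not taken from CrossRef's live stream or any existing bot.

```python
import re
from typing import Dict, Set

# Assumption: a pragmatic DOI pattern ("10.", 4-9 digit registrant, "/", suffix).
# The suffix stops at whitespace and at characters that delimit wikitext templates.
DOI_PATTERN = re.compile(r'\b10\.\d{4,9}/[^\s|}\]<>"]+')


def extract_dois(wikitext: str) -> Set[str]:
    """Return the set of DOIs found in a piece of wikitext, lower-cased."""
    return {
        match.group(0).rstrip(".,;").lower()
        for match in DOI_PATTERN.finditer(wikitext)
    }


def citation_changes(old_wikitext: str, new_wikitext: str) -> Dict[str, Set[str]]:
    """Compare two revisions and report which DOIs were cited or uncited."""
    before = extract_dois(old_wikitext)
    after = extract_dois(new_wikitext)
    return {"cited": after - before, "uncited": before - after}


if __name__ == "__main__":
    old = "{{cite journal |doi=10.1371/journal.pone.0012345 |title=A}}"
    new = "{{cite journal |doi=10.1000/xyz123 |title=B}}"
    print(citation_changes(old, new))
```

DOIs are compared case-insensitively here because DOI names are case-insensitive; support for PubMed, PMC, arXiv or JSTOR identifiers would need additional patterns.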

Most of the components of this workflow already exist but need some tweaking or polishing to fit our purposes better or to be joined into a pipeline.
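To illustrate how existing pieces might be joined into a pipeline, the following sketch encodes only the branching of steps 6 to 8 as a pure function. The `PaperStatus` fields, the `Action` names and `next_action` are hypothetical; in a real pipeline the three flags would be filled in by the respective components (Wikisource lookup, JATS availability check, licence check). The workflow does not say what happens when no JATS version exists, so skipping the full-text steps in that case is an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    CHECK_CONSISTENCY = "check representation on Wikisource and sister projects"
    IMPORT_FULL_TEXT = "convert JATS XML and upload to Wikisource and Commons"
    NOTIFY_OA_BUTTON = "notify OA Button"
    SKIP_FULL_TEXT = "no full-text action; continue with Wikidata"  # assumption


@dataclass
class PaperStatus:
    on_wikisource: bool    # step 6: is the paper already on Wikisource?
    has_jats: bool         # step 8: is a JATS XML version available?
    license_is_open: bool  # step 8.1: does the licence permit reuse?


def next_action(status: PaperStatus) -> Action:
    """Pick the full-text action for a cited paper, following steps 6 to 8."""
    if status.on_wikisource:
        return Action.CHECK_CONSISTENCY
    if not status.has_jats:
        return Action.SKIP_FULL_TEXT
    if status.license_is_open:
        return Action.IMPORT_FULL_TEXT
    return Action.NOTIFY_OA_BUTTON


if __name__ == "__main__":
    paper = PaperStatus(on_wikisource=False, has_jats=True, license_is_open=True)
    print(next_action(paper).value)
```

Keeping the decision separate from the network calls makes the branching easy to test without touching any wiki.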
