xlcnd / isbnlib

Licence: other
Python library to validate, clean, transform and get metadata of ISBN strings (for devs).

Info

isbnlib is a (pure) Python library that provides several useful methods and functions to validate, clean, transform, hyphenate and get metadata for ISBN strings.

Install

From the command line, enter (in some cases you have to precede the command with sudo):

$ pip install isbnlib

If you use a Linux system, you can install the library with your distribution's package manager (all major distributions have python-isbnlib and python3-isbnlib packages); however, these packages are usually very old and no longer work well!

ISBN

The official form of an ISBN is something like ISBN 979-10-90636-07-1. However, for most applications only the numbers are important, and you can always 'mask' them if you need to (see below). This library works mainly with 'stripped' ISBNs (only digits and X) like '0826497527'. You can strip an ISBN-like string by using canonical(isbnlike), and you can 'mask' the ISBN by using mask(isbn). So in the examples below, when you see 'isbn' as the argument, it is a 'stripped' ISBN; when the argument is 'isbnlike', it is a string like ISBN 979-10-90636-07-1 or even something dirty like asdf 979-10-90636-07-1 bla bla.
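As a rough illustration of what 'stripping' means, here is a minimal standard-library sketch. The helper name strip_isbnlike is hypothetical and this is not how canonical is implemented; it simply drops every character that is not a digit or an uppercase X, without checking that the result is a plausible ISBN.

```python
import re


def strip_isbnlike(isbnlike: str) -> str:
    """Naive stripping: keep only digits and 'X' (hypothetical helper)."""
    # Everything else (letters, hyphens, spaces) is removed.
    # No validation is done, so stray digits in the text are kept too.
    return re.sub(r"[^0-9X]", "", isbnlike)


print(strip_isbnlike("ISBN 979-10-90636-07-1"))
print(strip_isbnlike("asdf 979-10-90636-07-1 bla bla"))
```

Both calls print 9791090636071. In real code, canonical(isbnlike) is the function to use.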

Two important concepts: a valid ISBN is an ISBN that was built according to the rules. This is distinct from an issued ISBN, which is an ISBN that has already been issued to a publisher (this is the usage of libraries and most web services). However isbn.org, probably for legal reasons, merges the two! So, according to isbn-international.org, '9786610326266' is not valid (because the block 978-66... has not been issued yet); however, if you use is_isbn13('9786610326266') you will get True (because '9786610326266' follows the rules of an ISBN). But the situation is even murkier: try meta('9786610326266') and you will see that this ISBN was already used!
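To make "follows the rules" concrete, the sketch below applies the standard ISBN-13 check: the digits are weighted 1, 3, 1, 3, ... and the weighted sum must be a multiple of 10. The function name follows_isbn13_rules is hypothetical and the code is independent of the library. It says nothing about whether an ISBN has been issued.

```python
def follows_isbn13_rules(isbn: str) -> bool:
    """Check the form and checksum of a stripped ISBN-13 (hypothetical helper)."""
    if len(isbn) != 13 or not (isbn.isascii() and isbn.isdigit()):
        return False
    if not isbn.startswith(("978", "979")):
        return False
    # Weights alternate 1, 3, 1, 3, ... over all 13 digits, check digit included.
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(isbn))
    return total % 10 == 0


print(follows_isbn13_rules("9786610326266"))  # True
print(follows_isbn13_rules("9786610326267"))  # False (wrong check digit)
```

For '9786610326266' the weighted sum is 120, so the string is well formed even though its block may not have been issued.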

If possible, work with ISBNs in the ISBN-13 format (since 2007, ISBNs have been issued only in the ISBN-13 format). You can always convert ISBN-10 to ISBN-13, but not the reverse (read this). Read more about ISBN at isbn-international.org or Wikipedia.
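The asymmetry can be seen in a small sketch of the two conversions, assuming stripped and already validated input. The names isbn10_to_isbn13 and isbn13_to_isbn10 are hypothetical; the library's own to_isbn13 and to_isbn10 may behave differently on edge cases. Going to ISBN-13 always works by prefixing 978 and recomputing the check digit, while going back is only possible for the 978 prefix.

```python
from typing import Optional


def isbn10_to_isbn13(isbn10: str) -> str:
    """Prefix '978' and recompute the check digit (hypothetical helper)."""
    body = "978" + isbn10[:9]
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(body))
    return body + str((10 - total % 10) % 10)


def isbn13_to_isbn10(isbn13: str) -> Optional[str]:
    """Return the ISBN-10 form, or None when there is none (979 prefix)."""
    if not isbn13.startswith("978"):
        return None
    body = isbn13[3:12]
    # ISBN-10 weights run from 10 down to 2 over the first nine digits.
    total = sum(int(d) * w for d, w in zip(body, range(10, 1, -1)))
    check = (11 - total % 11) % 11
    return body + ("X" if check == 10 else str(check))


print(isbn10_to_isbn13("0826497527"))     # 9780826497529
print(isbn13_to_isbn10("9780826497529"))  # 0826497527
print(isbn13_to_isbn10("9791090636071"))  # None
```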

Main Functions

is_isbn10(isbn10like)
Validates as ISBN-10.
is_isbn13(isbn13like)
Validates as ISBN-13.
to_isbn10(isbn13)
Transforms isbn-13 to isbn-10.
to_isbn13(isbn10)
Transforms isbn-10 to isbn-13.
canonical(isbnlike)
Keeps only digits and X. You will get strings like 9780321534965 and 954430603X.
clean(isbnlike)
Cleans ISBN (only legal characters).
notisbn(isbnlike, level='strict')
Checks an isbn-like string with the goal of invalidating it.
get_isbnlike(text, level='normal')
Extracts all substrings that seem like ISBNs (very useful for scraping).
get_canonical_isbn(isbnlike, output='bouth')
Extracts ISBNs and transforms them to the canonical form.
ean13(isbnlike)
Transforms an isbnlike string into an EAN13 number (validated canonical ISBN-13).
info(isbn)
Gets the language or country assigned to this ISBN.
mask(isbn, separator='-')
Masks (hyphenates) a canonical ISBN.
meta(isbn, service='default')
Gives you the main metadata associated with the ISBN. For the service parameter you can use 'goob' (the Google Books service, which is the default; no key is needed), 'wiki' (the wikipedia.org API; no key is needed) or 'openl' (the OpenLibrary.org API; no key is needed). You can enter API keys with config.add_apikey(service, apikey) (see example below). The output can be formatted as bibtex, csl (CSL-JSON), msword, endnote, refworks, opf or json (BibJSON) bibliographic formats with registry.bibformatters. You can also extend the functionality of this function by adding plugins, either new metadata providers or new bibliographic formatters (check for available plugins).
editions(isbn, service='merge')
Returns the list of ISBNs of editions related to this ISBN. By default it uses 'merge' (merges 'openl', 'thingl' and 'wiki'), but other providers are available: 'openl' (uses the search API from Open Library), 'thingl' (uses the service ThingISBN from LibraryThing), 'wiki' (uses the service Citation from Wikipedia) and 'any' (first tries 'wiki', if no data then 'openl').
isbn_from_words(words)
Returns the most probable ISBN from a list of words (for your geographic area).
goom(words)
Returns a list of references from Google Books for the given words (multiple references).
classify(isbn)
Returns a dictionary of classifiers for a canonical ISBN. For the meaning of these classifiers see OCLC. Most of the data in the underlying service are for books in English.
doi(isbn)
Returns a DOI's ISBN-A from an ISBN-13.
doi2tex(DOI)
Returns metadata formatted as BibTeX for a given DOI.
ren(filename)
Renames a file using metadata from an ISBN in its filename.
desc(isbn)
Returns a small description of the book. Almost all data available are for US books!
cover(isbn)
Returns a dictionary with the url for cover. Almost all data available are for US books!

See files test_core and test_ext for a lot of examples.
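To give a feel for what extracting ISBN-like substrings from text involves (the job described for get_isbnlike above), here is a loose standard-library sketch. The pattern and the name find_isbnlike are assumptions made for illustration and are not the library's patterns. The pattern accepts 10 to 13 digits with optional single hyphens or spaces between them and allows a final X.

```python
import re

# Hypothetical loose pattern: 9 to 12 digits, each optionally followed by a
# hyphen or a space, then a final digit or X. It can give false positives,
# so candidates should still be validated afterwards.
CANDIDATE = re.compile(r"\b(?:\d[- ]?){9,12}[\dXx]\b")


def find_isbnlike(text: str) -> list:
    """Return every substring of text that looks like an ISBN."""
    return [match.group(0) for match in CANDIDATE.finditer(text)]


print(find_isbnlike("asdf 979-10-90636-07-1 bla bla"))
print(find_isbnlike("see 0826497527 or 954430603X"))
```

The first call prints ['979-10-90636-07-1'] and the second ['0826497527', '954430603X'].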

Plugins

You can extend the functionality of the library by adding plugins (for now, just new metadata providers or new bibliographic formatters).

For available plugins check here.

After installation, your plugin will blend transparently into isbnlib (you will have more options in meta and bibformatters).

For Devs

API's Main Namespaces

In the namespace isbnlib you have access to the core functions: is_isbn10, is_isbn13, to_isbn10, to_isbn13, canonical, clean, notisbn, get_isbnlike, get_canonical_isbn, mask, info, check_digit10, check_digit13, doi and ean13.

In addition, you have access to metadata functions, namely: meta, editions, ren, desc, cover, goom, classify, doi2tex and isbn_from_words.

The exceptions raised by these methods can all be caught using ISBNLibException.

You can extend the lib by using the classes and functions exposed in namespace isbnlib.dev, namely:

  • WEBService a class that handles the access to web services (just by passing a URL) and supports gzip. You can subclass it to extend the functionality... but you probably don't need to use it! It is used in the next class.
  • WEBQuery a class that uses WEBService to retrieve and parse data from a web service. You can build a new metadata provider by subclassing this class. Its main methods accept custom functions (handlers) that specialize them for specific needs (data_checker and parser). It implements a throttling mechanism with a default rate of one call per second per service.
  • Metadata a class that structures, cleans and 'validates' metadata records. Its merge method lets you implement a simple merging procedure for records from different sources. The main features of this class can also be obtained by calling the stdmeta function instead!
  • vias exposes several functions for making calls to services, just by passing the name and a pointer to the service's query function. vias.parallel makes threaded calls, vias.serial makes serial calls and vias.multi uses several cores. The default is vias.serial.
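The difference between serial and threaded calls can be sketched with the standard library alone. This does not use isbnlib.dev.vias: call_serial, call_threaded and the fake_* query functions are hypothetical stand-ins, and the only idea borrowed from the description above is that each service is passed as a name plus a query function.

```python
from concurrent.futures import ThreadPoolExecutor


def call_serial(named_tasks, arg):
    """Call each (name, function) pair in turn and collect results by name."""
    return {name: task(arg) for name, task in named_tasks}


def call_threaded(named_tasks, arg):
    """Run every task in its own worker thread and collect results by name."""
    with ThreadPoolExecutor(max_workers=len(named_tasks) or 1) as pool:
        futures = {name: pool.submit(task, arg) for name, task in named_tasks}
        return {name: future.result() for name, future in futures.items()}


# Stand-in query functions; real ones would contact a web service.
def fake_openl(isbn):
    return {"ISBN-13": isbn, "source": "openl"}


def fake_wiki(isbn):
    return {"ISBN-13": isbn, "source": "wiki"}


TASKS = [("openl", fake_openl), ("wiki", fake_wiki)]
print(call_serial(TASKS, "9780446310789") == call_threaded(TASKS, "9780446310789"))
```

Both strategies return the same mapping. Threads only pay off when the query functions spend most of their time waiting on the network.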

The exceptions raised by these methods can all be caught using ISBNLibDevException (or, more generally, ISBNLibException). You shouldn't raise this exception in your code; only raise the specific exceptions exposed in isbnlib.dev whose names end in Error.

In isbnlib.dev.helpers you can find several methods that we found very useful; some of them are only used in isbntools (an app and framework that uses isbnlib).

With isbnlib.config you can read and set configuration options: change timeouts with seturlopentimeout and setthreadstimeout, access API keys with apikeys and add a new one with add_apikey, and access and set generic and user-defined options with options.get('OPTION1') and set_option.

Finally, from isbnlib.registry you can change the metadata service to be used by default (setdefaultservice), add a new service (add_service), access bibliographic formatters for metadata (bibformatters), set the default formatter (setdefaultbibformatter), add new formatters (add_bibformatter) and set a new cache (set_cache) (e.g. to switch off the cache, use set_cache(None)). The cache only works for calls made through metadata functions. These changes apply only to the 'current session', so they should always be made before calling other methods.

Let us make these points concrete with a small example.

Suppose you want a small script to get metadata using Open Library formatted in BibTeX.

A minimal script would be:

from isbnlib import meta
from isbnlib.registry import bibformatters

SERVICE = "openl"

# now you can use the service
isbn = "9780446310789"
bibtex = bibformatters["bibtex"]
print(bibtex(meta(isbn, SERVICE)))
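The set_cache hook mentioned above takes a cache object of your own. The exact interface it must offer is not spelled out here, so the sketch below rests on an assumption: that a mapping-style object (item lookup that raises KeyError on a miss, plus item assignment) is enough. BoundedCache is a hypothetical class that keeps at most maxlen entries and evicts the least recently used one.

```python
from collections import OrderedDict


class BoundedCache:
    """Dict-like cache that evicts the least recently used entry when full."""

    def __init__(self, maxlen=100):
        self._maxlen = maxlen
        self._data = OrderedDict()

    def __getitem__(self, key):
        value = self._data[key]      # raises KeyError on a miss
        self._data.move_to_end(key)  # mark as most recently used
        return value

    def __setitem__(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self._maxlen:
            self._data.popitem(last=False)  # drop the oldest entry

    def __contains__(self, key):
        return key in self._data

    def __len__(self):
        return len(self._data)


cache = BoundedCache(maxlen=2)
cache["a"] = 1
cache["b"] = 2
_ = cache["a"]   # "a" becomes the most recently used entry
cache["c"] = 3   # evicts "b"
print("b" in cache, len(cache))  # False 2
```

Under that assumption, an instance could be passed as registry.set_cache(BoundedCache()) before any metadata function is called.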

Patterns of Usage

The library implements a very simple API with sensible defaults, but there are cases that need your attention (see case 3 below).

  1. You only need core functions:
# import the core functions you need
from isbnlib import canonical, is_isbn10, is_isbn13

isbn = canonical("978-0446310789")
if is_isbn13(isbn):
    ...
...
  2. You also need metadata functions, with the default config:
from isbnlib import canonical, desc, meta

isbn = canonical("978-0446310789")
data = meta(isbn)
...
  3. You also need metadata functions, with a special config:

    Let's suppose you need to add an API key for a metadata plugin and change the cache too.

from myapp.utils import MyCache

# import the functions you need, plus 'config' and 'registry'
from isbnlib import canonical, config, meta, registry

# you should use 'config' first
config.add_apikey("isbndb", "kjshdfkjahsdflkjh")

# then 'registry'
registry.set_cache(MyCache())

# Only now should you use metadata functions
# (there are no adaptations for core functions,
#  so they can be used at any moment)
isbn = canonical("978-0446310789")
data = meta(isbn, service="isbndb")
...
  4. You want to build a plugin or use isbnlib.dev in your code:

    You should study the public methods in dir(isbnlib.dev) very carefully, start with this template and follow the instructions there. For inspiration take a look at goob.

    Most public bibliographic catalog services return data in SRU or Unimarc format. It is very easy to write a custom plugin for these services: just use porbase (SRU) or sbn (Unimarc) as templates and consult this project.

Caveats

  1. These classes are optimized for single calls to services and not for batch calls.
  2. If you inspect the library, you will see that there are a lot of private modules (their names start with '_'). These modules should not be accessed directly since, with high probability, your program will break with a future version of the library!

Projects using isbnlib

Open Library https://github.com/internetarchive/openlibrary

NYPL Library Simplified https://github.com/NYPL-Simplified

RERO ILS https://github.com/rero/rero-ils

CERN CDS RDM https://github.com/CERNDocumentServer/cds-rdm

ResearchHub https://github.com/ResearchHub/researchhub-backend

Manubot https://github.com/manubot

isbntools https://github.com/xlcnd/isbntools

isbnsrv https://github.com/xlcnd/isbnsrv

See the full list here.

Help

If you need help, please take a look at GitHub or post a question on Stack Overflow.
