mrmiguez / pymods

Licence: MIT license
process MODS records from Python

pymods

pymods is a utility module for working with the Library of Congress's MODS XML standard, the Metadata Object Description Schema (MODS). It is a wrapper around the lxml module for deserializing data out of MODSXML into Python data types.

If you need a module to serialize data into MODSXML, see the other pymods by Matt Cordial.

Installing

Recommended:

pip install pymods

Using

Basics

XML is parsed using the MODSReader class:

import pymods

mods_records = pymods.MODSReader('some_file.xml')

MODSReader is an iterator that yields each record as a MODSRecord object:

In [5]: for record in mods_records:
  ....:    print(record)
  ....:
<Element {http://www.loc.gov/mods/v3}mods at 0x47a69f8>
<Element {http://www.loc.gov/mods/v3}mods at 0x47fd908>
<Element {http://www.loc.gov/mods/v3}mods at 0x47fda48>

MODSReader will work with mods:modsCollection documents, outputs from OAI-PMH feeds, or individual MODSXML documents with mods:mods as the root element.
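
To illustrate why all three document shapes can be handled the same way, here is a minimal sketch that uses only the standard library's xml.etree.ElementTree rather than pymods itself. It makes no claim about how MODSReader is implemented; the helper name iter_mods_records and the sample documents are hypothetical.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"
MODS_TAG = f"{{{MODS_NS}}}mods"


def iter_mods_records(xml_text):
    """Yield every mods:mods element, whatever the root element is."""
    root = ET.fromstring(xml_text)
    # Element.iter() visits the root itself and all of its descendants, so a
    # bare mods:mods root and a wrapped collection are both covered.
    yield from root.iter(MODS_TAG)


# Hypothetical sample documents, for illustration only.
single = f'<mods xmlns="{MODS_NS}"><titleInfo><title>A</title></titleInfo></mods>'
collection = (
    f'<modsCollection xmlns="{MODS_NS}">'
    "<mods><titleInfo><title>B</title></titleInfo></mods>"
    "<mods><titleInfo><title>C</title></titleInfo></mods>"
    "</modsCollection>"
)
oai = (
    '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">'
    "<ListRecords><record><metadata>"
    f'<mods xmlns="{MODS_NS}"><titleInfo><title>D</title></titleInfo></mods>'
    "</metadata></record></ListRecords></OAI-PMH>"
)

single_count = len(list(iter_mods_records(single)))
collection_count = len(list(iter_mods_records(collection)))
oai_count = len(list(iter_mods_records(oai)))
print(single_count, collection_count, oai_count)
```

Because the search is keyed on the namespaced mods:mods tag, the wrapper elements around the records do not matter.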

pymods.MODSRecord

The MODSReader class parses each mods:mods element into a pymods.MODSRecord object. pymods.MODSRecord is a custom wrapper class for the lxml.etree.ElementBase class. All children of pymods.Record inherit the lxml.etree._Element and lxml.etree.ElementBase methods.

In [6]: record = next(pymods.MODSReader('example.xml'))
In [7]: print(record.nsmap)
{'dcterms': 'http://purl.org/dc/terms/', 'xsi': 'http://www.w3.org/2001/XMLSchema-instance', None: 'http://www.loc.gov/mods/v3', 'flvc': 'info:flvc/manifest/v1', 'xlink': 'http://www.w3.org/1999/xlink', 'mods': 'http://www.loc.gov/mods/v3'}
In [8]: for child in record.iterdescendants():
  ....:    print(child.tag)
  ....:
{http://www.loc.gov/mods/v3}identifier
{http://www.loc.gov/mods/v3}extension
{info:flvc/manifest/v1}flvc
{info:flvc/manifest/v1}owningInstitution
{info:flvc/manifest/v1}submittingInstitution
{http://www.loc.gov/mods/v3}titleInfo
{http://www.loc.gov/mods/v3}title
{http://www.loc.gov/mods/v3}name
{http://www.loc.gov/mods/v3}namePart
{http://www.loc.gov/mods/v3}role
{http://www.loc.gov/mods/v3}roleTerm
{http://www.loc.gov/mods/v3}roleTerm
{http://www.loc.gov/mods/v3}typeOfResource
{http://www.loc.gov/mods/v3}genre
...

Methods

All methods return data as a string, a list, or a list of named tuples. See the appropriate docstring or the API documentation for details.

In [9]: record.genre?
Type:        property
String form: <property object at 0x0000000004812C78>
Docstring:
Accesses mods:genre element.
:return: A list containing Genre elements with term, authority,
    authorityURI, and valueURI attributes.
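
For readers unfamiliar with named tuples, the sketch below shows how a list of them behaves. The Date type here is a hypothetical stand-in built with collections.namedtuple, with field names taken from the Date(text=..., type=...) values that record.dates prints in the Examples section; it is not imported from pymods.

```python
from collections import namedtuple

# Hypothetical stand-in mirroring the Date(text=..., type=...) output shown
# for record.dates; pymods defines its own types.
Date = namedtuple("Date", ["text", "type"])

dates = [
    Date(text="1966-12-08", type="{http://www.loc.gov/mods/v3}dateCreated"),
    Date(text="1987-02", type="{http://www.loc.gov/mods/v3}dateIssued"),
]

# Fields are available by name...
texts = [d.text for d in dates]

# ...and by position or unpacking, like any other tuple.
first_text, first_type = dates[0]

print(texts)
print(first_type)
```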

Examples

Importing

from pymods import MODSReader, MODSRecord

Parsing a file

In [10]: mods = MODSReader('example.xml')
In [11]: for record in mods:
   ....:    print(record.dates)
   ....:
[Date(text='1966-12-08', type='{http://www.loc.gov/mods/v3}dateCreated')]
None
[Date(text='1987-02', type='{http://www.loc.gov/mods/v3}dateIssued')]

Simple tasks

Generating a title list

In [14]: for record in mods:
   ....:     print(record.titles)
   ....:
['Fire Line System']
['$93,668.90. One Mill Tax Apportioned by Various Ways Proposed']
['Broward NOW News: National Organization for Women, February 1987']

Creating a subject list

In [17]: for record in mods:
   ....:     for subject in record.subjects:
   ....:         print(subject.text)
   ....:
Concert halls
Architecture
Architectural drawings
Structural systems
Structural systems drawings
Structural drawings
Safety equipment
Construction
Mechanics
Structural optimization
Architectural design
Fire prevention--Safety measures
Taxes
Tax payers
Tax collection
Organizations
Feminism
Sex discrimination against women
Women's rights
Equal rights amendments
Women--Societies and clubs
National Organization for Women

More complex tasks

Creating a list of subject URIs for LCSH subjects only

In [18]: for record in mods:
   ....:     for subject in record.subjects:
   ....:         if 'lcsh' == subject.authority:
   ....:             print(subject.uri)
   ....:
http://id.loc.gov/authorities/subjects/sh85082767
http://id.loc.gov/authorities/subjects/sh88004614
http://id.loc.gov/authorities/subjects/sh85132810
http://id.loc.gov/authorities/subjects/sh85147343
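
The same filter can be wrapped in a reusable function. This is a sketch, not part of pymods: the function name subject_uris is hypothetical, and the Subject and Record stand-ins exist only so the example runs without a MODS file. It relies on the text, authority, and uri attribute names used in the loops above, and it guards against a record whose subjects value is None, which is an assumption about how pymods reports missing elements.

```python
from collections import namedtuple


def subject_uris(records, authority):
    """Collect the URIs of subjects whose authority matches `authority`."""
    uris = []
    for record in records:
        # Guard against a missing value; whether pymods returns None or an
        # empty list for records without subjects is an assumption here.
        for subject in record.subjects or []:
            if subject.authority == authority and subject.uri:
                uris.append(subject.uri)
    return uris


# Hypothetical stand-ins so the sketch runs without pymods or a MODS file.
Subject = namedtuple("Subject", ["text", "authority", "uri"])
Record = namedtuple("Record", ["subjects"])

records = [
    Record(subjects=[
        Subject("Taxes", "lcsh", "http://example.org/subjects/1"),
        Subject("Tax payers", "local", None),
    ]),
    Record(subjects=None),
]

lcsh_uris = subject_uris(records, "lcsh")
print(lcsh_uris)
```

With pymods, the function could presumably be called as subject_uris(MODSReader('example.xml'), 'lcsh'), since it only needs an iterable of records.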

Getting URLs for objects that use the No Copyright - United States rightsstatements.org URI

In [23]: for record in mods:
   ....:     for rights_elem in record.rights:
   ....:         if rights_elem.uri == 'http://rightsstatements.org/vocab/NoC-US/1.0/':
   ....:             print(record.purl)
   ....:
http://purl.flvc.org/fsu/fd/FSU_MSS0204_B01_F10_09
http://purl.flvc.org/fsu/fd/FSU_MSS2008003_B18_F01_004