microformats / microformats-ruby

License: CC0-1.0
Ruby gem that parses HTML containing microformats/microformats2 and returns Ruby objects, a Ruby hash, or a JSON string

Microformats Ruby

A Ruby gem for parsing HTML documents containing microformats.

Key Features

Getting Started

Before installing and using microformats-ruby, you'll want to have Ruby 2.4.10 (or newer) installed. It's recommended that you use a Ruby version management tool like rbenv, chruby, or rvm.

microformats-ruby is developed using Ruby 2.7.1 and is additionally tested against versions 2.4, 2.5, 2.6, 2.7, 3.0, and 3.1 using GitHub Actions.

Installation

If you're using Bundler to manage gem dependencies, add microformats-ruby to your project's Gemfile:

source 'https://rubygems.org'

gem 'microformats', '~> 4.0', '>= 4.2.1'

…and then run:

bundle install

You may also install microformats-ruby directly using:

gem install microformats

Usage

An example working with a basic h-card:

require 'microformats'

source = '<div class="h-card"><p class="p-name">Jessica Lynn Suttles</p></div>'
collection = Microformats.parse(source)

# Get a copy of the canonical microformats hash structure
collection.to_hash

# The above as JSON in a string
collection.to_json

# Return a string if there is only one item found
collection.card.name #=> "Jessica Lynn Suttles"
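
For orientation, the hash returned by `to_hash` is expected to follow the JSON shape defined by the microformats2 parsing specification. The sketch below is not output captured from the gem: the `parsed` hash is written by hand to show the shape one would expect for the h-card above, and it relies only on Ruby's standard `json` library.

```ruby
require 'json'

# Hand-written stand-in for the hash expected from `collection.to_hash`.
# This is an assumption based on the microformats2 JSON format,
# not output captured from the gem.
parsed = {
  'items' => [
    {
      'type' => ['h-card'],
      'properties' => { 'name' => ['Jessica Lynn Suttles'] }
    }
  ],
  'rels' => {},
  'rel-urls' => {}
}

# Every property value is an array, even when only one value was found.
first_item = parsed['items'].first
card_name = first_item['properties']['name'].first

puts card_name
puts JSON.pretty_generate(parsed)
```

Because property values are always arrays in this structure, shortcut accessors such as `collection.card.name` save a `.first` call at every step.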

Below is a more complex markup structure using an h-entry with a nested h-card:

source = '<article class="h-entry">
  <h1 class="p-name">Microformats 2</h1>
  <div class="h-card p-author">
    <p class="p-name"><span class="p-first-name">Jessica</span> Lynn Suttles</p>
  </div>
</article>'

collection = Microformats.parse(source)

collection.entry.name.to_s #=> "Microformats 2"

# Accessing nested microformats
collection.entry.author.name.to_s #=> "Jessica Lynn Suttles"

# Nested microformats can be accessed using shortcuts or the expanded form
collection.entry.author.name #=> "Jessica Lynn Suttles"
collection.entry.properties.author.properties.name.to_s #=> "Jessica Lynn Suttles"

# Use `_` instead of `-` to return property values
collection.entry.author.first_name #=> "Jessica"
collection.rel_urls #=> {}
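
To make the relationship between the shortcut and expanded forms more concrete, here is a rough sketch in plain Ruby. The `entry` hash is a hand-written stand-in for the parsed h-entry (its shape is assumed from the microformats2 JSON format), and `first_value` is a hypothetical helper, not part of the gem. It shows how an underscore name such as `first_name` can map to the hyphenated `first-name` property.

```ruby
# Hand-written stand-in for the parsed h-entry (assumed shape,
# following the microformats2 JSON format; not captured gem output).
entry = {
  'type' => ['h-entry'],
  'properties' => {
    'name' => ['Microformats 2'],
    'author' => [
      {
        'type' => ['h-card'],
        'properties' => {
          'name' => ['Jessica Lynn Suttles'],
          'first-name' => ['Jessica']
        },
        'value' => 'Jessica Lynn Suttles'
      }
    ]
  }
}

# Hypothetical helper: translate a Ruby-style name such as :first_name
# into the hyphenated property key and return the first value, or nil.
def first_value(item, name)
  key = name.to_s.tr('_', '-')
  item['properties'].fetch(key, []).first
end

author = first_value(entry, :author)

puts first_value(entry, :name)         # Microformats 2
puts first_value(author, :first_name)  # Jessica
```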

Using the same markup patterns as above, here's an h-entry with multiple authors, each marked up as h-cards:

source = '<article class="h-entry">
  <h1 class="p-name">Microformats 2</h1>
  <div class="h-card p-author">
    <p class="p-name"><span class="p-first-name">Jessica</span> Lynn Suttles</p>
  </div>
  <div class="h-card p-author">
    <p class="p-name"><span class="p-first-name">Brandon</span> Edens</p>
  </div>
</article>'

collection = Microformats.parse(source)

# Arrays of items will always return the first item by default
collection.entry.author.name #=> "Jessica Lynn Suttles"
collection.entry.author(1).name #=> "Brandon Edens"

# Get the actual array of items by using `:all`
collection.entry.author(:all).count #=> 2
collection.entry.author(:all)[1].name #=> "Brandon Edens"
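
The accessor convention above (first item by default, an integer for a specific item, `:all` for the whole array) can be illustrated with a small standalone sketch. The `pick` method is hypothetical and is not how the gem is implemented; it only mirrors the behavior described in the comments above.

```ruby
# Hypothetical accessor mirroring the convention described above.
# It is an illustration only, not the gem's implementation.
def pick(values, selector = 0)
  return values if selector == :all

  values[selector]
end

authors = ['Jessica Lynn Suttles', 'Brandon Edens']

puts pick(authors)              # first item by default
puts pick(authors, 1)           # item at index 1
puts pick(authors, :all).count  # size of the whole array
```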

Command Line Interface

microformats-ruby also includes a command line program that parses HTML and returns a JSON representation of the microformats it contains.

microformats http://tantek.com

The program accepts URLs, file paths, or strings of HTML as an argument. Additionally, the script accepts piped input from other programs:

curl http://tantek.com | microformats

Implementation Status

Status Specification or Parsing Rule
Parse a document for microformats
Parsing a p- property
Parsing a u- property
Parsing a dt- property
Parsing an e- property
Parsing for implied properties
Nested properties
Nested microformat with associated property
Nested microformat without associated property
Recognize dynamically created properties
Support for rel attribute values
Normalizing u-* property values
Parse the value class pattern
Recognize vendor extensions
Support for classic microformats
Recognize the include pattern

Improving microformats-ruby

Have questions about using microformats-ruby? Found a bug? Have ideas for new or improved features? Want to pitch in and write some code?

Check out CONTRIBUTING.md for more on how you can help!

Acknowledgments

The microformats-ruby logo is derived from the microformats logo mark by Rémi Prévost.

microformats-ruby is written and maintained by:

License

microformats-ruby is dedicated to the public domain using the Creative Commons CC0 1.0 Universal license.

The authors waive all of their rights to the work worldwide under copyright law, including all related and neighboring rights, to the extent allowed by law. You can copy, modify, and distribute the work, even for commercial purposes, all without asking permission.

See LICENSE for more details.
