borderless / unfurl

Licence: other
Extract rich metadata from URLs

Programming Languages

typescript

Projects that are alternatives to or similar to unfurl

node-htmlmetaparser
A `htmlparser2` handler for parsing rich metadata from HTML. Includes HTML metadata, JSON-LD, RDFa, microdata, OEmbed, Twitter cards and AppLinks.
Stars: ✭ 44 (+7.32%)
Mutual labels:  metadata, rdfa, json-ld, microdata
Codemeta
Minimal metadata schemas for science software and code, in JSON-LD
Stars: ✭ 218 (+431.71%)
Mutual labels:  metadata, json-ld
Unfurl
Scraper for oEmbed, Twitter Cards and Open Graph metadata - fast and Promise-based ⚡️
Stars: ✭ 193 (+370.73%)
Mutual labels:  metadata, scraper
ControlledVocabularyManager
Rails application with Blazegraph for managing controlled vocabularies in RDF.
Stars: ✭ 20 (-51.22%)
Mutual labels:  rdf, json-ld
Image search
Python Library to download images and metadata from popular search engines.
Stars: ✭ 86 (+109.76%)
Mutual labels:  metadata, scraper
Codebook
Cook rmarkdown codebooks from metadata on R data frames
Stars: ✭ 105 (+156.1%)
Mutual labels:  metadata, json-ld
YouTube-MA
💾 YouTube video metadata archiver written in Golang
Stars: ✭ 17 (-58.54%)
Mutual labels:  metadata, scraper
Tropy
Research photo management
Stars: ✭ 337 (+721.95%)
Mutual labels:  metadata, rdf
jsonld-context-parser.js
Parses JSON-LD contexts
Stars: ✭ 20 (-51.22%)
Mutual labels:  rdf, json-ld
Awesome-meta-tags
📙 Awesome collection of meta tags
Stars: ✭ 18 (-56.1%)
Mutual labels:  metadata, microdata
mayktso
🌌 mayktso: encounters at an endpoint
Stars: ✭ 19 (-53.66%)
Mutual labels:  rdf, rdfa
Scrape
Distributed Scraper
Stars: ✭ 65 (+58.54%)
Mutual labels:  metadata, scraper
Emby.plugins.javscraper
A Japanese movie scraper plugin for Emby/Jellyfin that can fetch film information from certain websites.
Stars: ✭ 864 (+2007.32%)
Mutual labels:  metadata, scraper
Owmeta
Unified, simple data access python library for data & facts about C. elegans anatomy
Stars: ✭ 134 (+226.83%)
Mutual labels:  metadata, rdf
Puree
Metadata extraction from the Pure Research Information System.
Stars: ✭ 8 (-80.49%)
Mutual labels:  metadata, extraction
MangDL
The most inefficient Manga downloader for PC
Stars: ✭ 40 (-2.44%)
Mutual labels:  metadata, scraper
seomate
SEO, mate! It's important. That's why SEOMate provides the tools you need to craft all the meta tags, sitemaps and JSON-LD microdata you need - in one highly configurable, open and friendly package - with a super-light footprint.
Stars: ✭ 31 (-24.39%)
Mutual labels:  metadata, json-ld
rafy-rs
Rust library to download YouTube content and retrieve metadata
Stars: ✭ 46 (+12.2%)
Mutual labels:  metadata, content
tinyPornManager
Made for pornhub. Fork from tinyMediaManager v3
Stars: ✭ 57 (+39.02%)
Mutual labels:  metadata, scraper
takefive.css
The most advanced pure CSS lightbox – not one single line of JavaScript has been wasted
Stars: ✭ 123 (+200%)
Mutual labels:  rdfa, microdata

Unfurl

Extract rich metadata from URLs.

Installation

npm install @borderless/unfurl --save

Usage

Unfurl attempts to parse and extract rich structured metadata from URLs.

import { scraper, urlScraper } from "@borderless/unfurl";
import * as plugins from "@borderless/unfurl/dist/plugins";

Scraper

Accepts a request function and a list of plugins to use. The request function is expected to return a "page" object with the same shape as the input to scrape(page).

const scrape = scraper({
  request,
  plugins: [plugins.htmlmetaparser, plugins.exifdata],
});

const res = await fetch("http://example.com"); // E.g. the `fetch` exported by `popsicle`.

await scrape({
  url: res.url,
  status: res.status,
  headers: res.headers.asObject(),
  body: res.stream(), // Stream the response body instead of buffering it, to support large responses.
});
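
The request function itself is not shown above. As a rough sketch only, assuming the page object has just the four fields passed to scrape(page) in the example and that body should be a Node.js Readable, an adapter from a fetch-style response might look like the following. Page, FetchLikeResponse, toPage and makeRequest are hypothetical names used for illustration; they are not part of @borderless/unfurl.

```typescript
import { Readable } from "node:stream";

// Hypothetical "page" shape, inferred from the fields passed to scrape(page).
// The library's real type may differ.
interface Page {
  url: string;
  status: number;
  headers: Record<string, string>;
  body: Readable;
}

// Minimal structural view of a fetch-style response (an assumption for this sketch).
interface FetchLikeResponse {
  url: string;
  status: number;
  headers: { forEach(callback: (value: string, key: string) => void): void };
  body: AsyncIterable<Uint8Array> | null;
}

// Convert a fetch-style response into the assumed page shape without buffering the body.
function toPage(res: FetchLikeResponse): Page {
  const headers: Record<string, string> = {};
  res.headers.forEach((value, key) => {
    headers[key.toLowerCase()] = value;
  });
  // Readable.from wraps the async iterable lazily, so chunks are read on demand.
  const body = res.body ? Readable.from(res.body) : Readable.from([]);
  return { url: res.url, status: res.status, headers, body };
}

// Build a request(url) function from any fetch-style implementation.
function makeRequest(
  fetchImpl: (url: string) => Promise<FetchLikeResponse>
): (url: string) => Promise<Page> {
  return async (url) => toPage(await fetchImpl(url));
}
```

On Node.js 18 or later, passing a thin wrapper around the global fetch to makeRequest may be enough, possibly with a type cast. Whether the plugins accept this exact header and body representation is an assumption to verify against the library.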

URL Scraper

A simpler wrapper around scraper that automatically calls request(url) to fetch the page.

const scrape = urlScraper({ request });

await scrape("http://example.com");
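
Conceptually, the wrapper composes two steps: request the page, then scrape it. The sketch below illustrates that composition with generic types. composeUrlScraper is a hypothetical helper written for this explanation, not the library's implementation of urlScraper.

```typescript
// Hypothetical helper: given request(url) and scrape(page), return a scrape-by-URL function.
function composeUrlScraper<P, R>(
  request: (url: string) => Promise<P>,
  scrape: (page: P) => Promise<R>
): (url: string) => Promise<R> {
  // Fetch the page first, then hand it straight to the scraper.
  return async (url) => scrape(await request(url));
}
```

Keeping the two steps separate is what lets the lower-level scraper accept pages obtained by any HTTP client, while the URL variant covers the common case.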

License

Apache 2.0
