inventaire / Entities Search Engine

Scripts and microservice to feed an ElasticSearch with Wikidata and Inventaire entities, and keep those up-to-date

⚠️ This repository has been archived: the inventaire server itself now takes care of keeping the Elasticsearch entities and wikidata indexes up-to-date.

Entities Search Engine

Scripts and microservice to feed an ElasticSearch instance with Wikidata and Inventaire entities (see entities map), and keep those up-to-date, to answer questions like "give me all the humans with a name starting with xxx" in a super snappy way, typically for the needs of an autocomplete field.

For the Wikidata-only version, see the archived #wikidata-subset-search-engine branch.

Setup

see setup

Dependencies

see setup to install dependencies:

  • NodeJs >= v6.4
  • ElasticSearch (this repo was developed targeting ElasticSearch v2.4, but it should work with newer versions with some minimal changes)
  • Nginx
  • Let's Encrypt
  • already installed on any good *nix system: curl, gzip

Start server

see Wikidata and Inventaire per-entity import

Data imports

from scratch

add

Wikidata entities

3 ways to import Wikidata entities data into your ElasticSearch instance

Inventaire entities

update

To update any entity, simply re-add it, typically by posting its URI (e.g. 'wd:Q180736' for a Wikidata entity, or 'inv:9cf5fbb9affab552cd4fb77712970141' for an Inventaire one) to the server.
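
As a rough illustration of the step above, here is a minimal sketch that validates URIs and prepares a POST request. The server URL, the payload shape (`{ type: [uris] }`), and the helper names `isEntityUri` and `buildUpdateRequest` are assumptions made for this example, not the documented API of this repository. The URI patterns are only inferred from the two examples given above.

```javascript
// Hypothetical helpers: payload shape and URI patterns are assumptions
// inferred from the examples in this README, not a documented API.
const uriPatterns = [
  /^wd:Q[1-9]\d*$/, // Wikidata item, e.g. wd:Q180736
  /^inv:[0-9a-f]{32}$/ // Inventaire entity, 32 hex characters (inferred from the example)
]

function isEntityUri (uri) {
  return typeof uri === 'string' && uriPatterns.some(pattern => pattern.test(uri))
}

// Returns the URL and options of the request, without sending it
function buildUpdateRequest (serverUrl, type, uris) {
  const invalid = uris.filter(uri => !isEntityUri(uri))
  if (invalid.length > 0) throw new Error(`invalid URIs: ${invalid.join(', ')}`)
  return {
    url: serverUrl,
    options: {
      method: 'POST',
      headers: { 'content-type': 'application/json' },
      body: JSON.stringify({ [type]: uris })
    }
  }
}
```

With a recent NodeJS (v18+), the result could then be sent with `fetch(req.url, req.options)`, once the actual endpoint and payload expected by the server are confirmed.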

remove

To un-index entities that were mistakenly added, pass the path of a results JSON file, expected to contain an array of ids, to the delete-from-results script. All the documents with those ids will be deleted.

# Delete mistakenly added Wikidata humans
index=wikidata
type=humans
ids_json_array=./queries/results/mistakenly_added_wikidata_humans_ids.json
npm run delete-from-results "$index" "$type" "$ids_json_array"

# Delete mistakenly added Inventaire works
index=entities-prod
type=works
ids_json_array=./queries/results/mistakenly_added_inventaire_works_ids.json
npm run delete-from-results "$index" "$type" "$ids_json_array"
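
To make the deletion step more concrete, here is a minimal sketch of how an array of ids could be turned into a request body for the Elasticsearch bulk API (`POST /_bulk`), which accepts one `delete` action per line. This is not necessarily how the delete-from-results script works internally. The function name `buildBulkDeleteBody` is hypothetical.

```javascript
// Hypothetical helper: builds a newline-delimited JSON body for POST /_bulk,
// with one delete action per id. The bulk API requires a trailing newline.
function buildBulkDeleteBody (index, type, ids) {
  if (!Array.isArray(ids)) throw new TypeError('expected an array of ids')
  return ids
    .map(id => JSON.stringify({ delete: { _index: index, _type: type, _id: id } }) + '\n')
    .join('')
}

// Example with placeholder ids
const body = buildBulkDeleteBody('wikidata', 'humans', ['Q1', 'Q2'])
console.log(body)
```

Building the body separately from sending it keeps the sketch easy to inspect before anything is actually deleted.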

importing dumps

You can import dumps from the inventaire.io production Elasticsearch instance:

# Download Wikidata dump
wget -c https://dumps.inventaire.io/wd/elasticsearch/wikidata_data.json.gz
gzip -d wikidata_data.json.gz
# elasticdump should have been installed when running `npm install`
# --limit: increase the batch size (number of documents moved per operation)
./node_modules/.bin/elasticdump --input=./wikidata_data.json --output=http://localhost:9200/wikidata --limit 2000

# Same for Inventaire
wget -c https://dumps.inventaire.io/inv/elasticsearch/entities_data.json.gz
gzip -d entities_data.json.gz
./node_modules/.bin/elasticdump --input=./entities_data.json --output=http://localhost:9200/entities --limit 2000
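
Before importing, it can help to sanity-check a decompressed dump. The sketch below counts documents per type, assuming each non-empty line is a JSON object with a `_type` field, which is the line-delimited format elasticdump typically writes. The function name `countDocsByType` is hypothetical, and the sample lines are made up for illustration.

```javascript
// Hypothetical helper: counts documents per _type in a line-delimited JSON dump.
// Assumes one JSON document per line, as elasticdump usually produces.
function countDocsByType (ndjson) {
  const counts = new Map()
  for (const line of ndjson.split('\n')) {
    if (line.trim() === '') continue
    const doc = JSON.parse(line)
    const type = doc._type || 'unknown'
    counts.set(type, (counts.get(type) || 0) + 1)
  }
  return Object.fromEntries(counts)
}

// Made-up sample lines, for illustration only
const sample = [
  '{"_index":"wikidata","_type":"humans","_id":"Q1","_source":{}}',
  '{"_index":"wikidata","_type":"humans","_id":"Q2","_source":{}}',
  '{"_index":"wikidata","_type":"works","_id":"Q3","_source":{}}'
].join('\n')
console.log(countDocsByType(sample))
```

For a full dump, reading the file line by line (for instance with the `readline` module) would avoid loading everything in memory.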

Query ElasticSearch

curl "http://localhost:9200/wikidata/humans/_search?q=Victor%20Hugo"
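
The same request can be prepared from NodeJS. The sketch below builds the search URL used in the curl command above, plus a `match_phrase_prefix` query body of the kind an autocomplete field might send. The function names and the `labels` field name are assumptions for illustration. The actual field names depend on the index mappings.

```javascript
// Hypothetical helpers, not part of this repository

// Builds a URI search URL, URL-encoding the search text
function buildSearchUrl (host, index, type, text) {
  return `${host}/${index}/${type}/_search?q=${encodeURIComponent(text)}`
}

// Builds a prefix-matching query body. The field name is an assumption:
// check the index mappings for the fields that actually exist.
function buildPrefixQuery (field, text) {
  return { query: { match_phrase_prefix: { [field]: text } } }
}

console.log(buildSearchUrl('http://localhost:9200', 'wikidata', 'humans', 'Victor Hugo'))
console.log(JSON.stringify(buildPrefixQuery('labels', 'Victor Hu')))
```

The query body would be sent as JSON in a POST request to the same `_search` endpoint, without the `q` parameter.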

Donate

We are developing and maintaining tools to work with Wikidata from NodeJS, the browser, or simply the command line, with quality and ease of use at heart. Any donation will be interpreted as a "please keep going, your work is very much needed and awesome. PS: love".

See Also

You may also like

Do you know inventaire.io? It's a web app to share books with your friends, built on top of Wikidata! And it's libre software too.

License

AGPL-3.0
