magda-io / Magda

Licence: other

A federated, open-source data catalog for all your big data and small data

Labels

nodejs kubernetes postgresql elasticsearch open-data

Projects that are alternatives of or similar to Magda

Customizations for Adminer, the best database management tool written in PHP.

Stars: ✭ 99 (-48.7%)

Mutual labels: postgresql, elasticsearch

Configurable Extract, Transform, and Load

Stars: ✭ 125 (-35.23%)

Mutual labels: postgresql, elasticsearch

Spring Boot 2.x Examples

Spring Boot 2.x code examples

Stars: ✭ 104 (-46.11%)

Mutual labels: postgresql, elasticsearch

ASP.NET Core NLog MS SQL Server PostgreSQL MySQL Elasticsearch

Stars: ✭ 54 (-72.02%)

Mutual labels: postgresql, elasticsearch

Python fixtures and daemon managing tools for functional testing

Stars: ✭ 161 (-16.58%)

Mutual labels: postgresql, elasticsearch

Spring Examples

SpringBoot Examples

Stars: ✭ 67 (-65.28%)

Mutual labels: postgresql, elasticsearch

PostgreSQL data synchronization tool (implemented in Java)

Stars: ✭ 122 (-36.79%)

Mutual labels: postgresql, elasticsearch

macOS development environment setup: Easy-to-understand instructions with automated setup scripts for developer tools like Vim, Sublime Text, Bash, iTerm, Python data analysis, Spark, Hadoop MapReduce, AWS, Heroku, JavaScript web development, Android development, common data stores, and dev-based OS X defaults.

Stars: ✭ 5,590 (+2796.37%)

Mutual labels: postgresql, elasticsearch

Netflix like full-stack application with SPA client and backend implemented in service oriented architecture

Stars: ✭ 156 (-19.17%)

Mutual labels: postgresql, elasticsearch

Universal cheminformatics libraries, utilities and database search tools

Stars: ✭ 146 (-24.35%)

Mutual labels: postgresql, elasticsearch

Vagrant configuration for PHP7, Phalcon 3.x and Zephir development.

Stars: ✭ 43 (-77.72%)

Mutual labels: postgresql, elasticsearch

Inshop CRM / ERP API. It's powerful framework allows to build systems for business with different workflows. It has on board multi language support, clients management, projects & tasks, documents, simple accounting, inventory management, orders & invoice management, possibilities to integrate with third party software, REST API, and many other features.

Stars: ✭ 178 (-7.77%)

Mutual labels: postgresql, elasticsearch

Great Big Example Application

A full-stack example app built with JHipster, Spring Boot, Kotlin, Angular 4, ngrx, and Webpack

Stars: ✭ 899 (+365.8%)

Mutual labels: postgresql, elasticsearch

Sync data between persistence engines, like ETL only not stodgy

Stars: ✭ 1,175 (+508.81%)

Mutual labels: postgresql, elasticsearch

NewsBlur is a personal news reader that brings people together to talk about the world. A new sound of an old instrument.

Stars: ✭ 5,862 (+2937.31%)

Mutual labels: postgresql, elasticsearch

Haproxy Configs

80+ HAProxy Configs for Hadoop, Big Data, NoSQL, Docker, Elasticsearch, SolrCloud, HBase, MySQL, PostgreSQL, Apache Drill, Hive, Presto, Impala, Hue, ZooKeeper, SSH, RabbitMQ, Redis, Riak, Cloudera, OpenTSDB, InfluxDB, Prometheus, Kibana, Graphite, Rancher etc.

Stars: ✭ 106 (-45.08%)

Mutual labels: postgresql, elasticsearch

FeedHQ is a web-based feed reader

Stars: ✭ 525 (+172.02%)

Mutual labels: postgresql, elasticsearch

Research. Shared.

Stars: ✭ 528 (+173.58%)

Mutual labels: postgresql, elasticsearch

Easy Django integration with Elasticsearch through ZomboDB Postgres Extension

Stars: ✭ 136 (-29.53%)

Mutual labels: postgresql, elasticsearch

Usaspending Api

Server application to serve U.S. federal spending data via a RESTful API

Stars: ✭ 166 (-13.99%)

Mutual labels: postgresql, elasticsearch

Magda

Magda is a data catalog system that will provide a single place where all of an organization's data can be catalogued, enriched, searched, tracked and prioritized - whether big or small, internally or externally sourced, available as files, databases or APIs. Magda is designed specifically around the concept of federation - providing a single view across all data of interest to a user, regardless of where the data is stored or where it was sourced from. The system is able to quickly crawl external data sources, track changes, make automatic enhancements and make notifications when changes occur, giving data users a one-stop shop to discover all the data that's available to them.

Current Status

Magda is under active development by a small team - we often have to prioritise between making the open-source side of the project more robust and adding features to our own deployments, which can mean newer features aren't documented well, or require specific configuration to work. If you run into problems using Magda, we're always happy to help on Spectrum.

As an open data search engine

Magda has been used in production for over a year by data.gov.au, and is relatively mature for this use case.

As a data catalogue

Over the past 18 months, our focus has been to develop Magda into a more general-purpose data catalogue for use within organisations. If you want to use it as a data catalogue, please do, but expect some rough edges! If you'd like to contribute to the project with issues or PRs, we'd love to receive them.

Features

Powerful and scalable search based on ElasticSearch
Quick and reliable aggregation of external sources of datasets
An unopinionated central store of metadata, able to cater for most metadata schemas
Federated authentication via passport.js - log in via Google, Facebook, WSFed, AAF, CKAN, and easily create new providers.
Based on Kubernetes for cloud agnosticism - deployable to nearly any cloud, on-premises, or on a local machine.
Easy (as long as you know Kubernetes) installation and upgrades
Extensions are based on adding new docker images to the cluster, and hence can be developed in any language

Currently Under Development

A heavily automated, quick and easy-to-use data cataloguing process intended to produce high-quality metadata for discovery
A robust, policy-based authorization system built on Open Policy Agent - write flexible policies to restrict access to datasets and have them work across the system, including by restricting search results to what you're allowed to see.
Storage of datasets

Our current roadmap is available at https://magda.io/docs/roadmap

Architecture

Magda is built around a collection of microservices that are distributed as docker containers. This was done to provide easy extensibility - Magda can be customised by simply adding new services using any technology as docker images, and integrating them with the rest of the system via stable HTTP APIs. Using Helm and Kubernetes for orchestration means that configuration of a customised Magda instance can be stored and tracked as plain text, and instances with identical configuration can be quickly and easily reproduced.

Registry

Magda revolves around the Registry - an unopinionated datastore built on top of Postgres. The Registry stores records as a set of JSON documents called aspects. For instance, a dataset is represented as a record with a number of aspects - a basic one that records the name, description and so on, as well as more esoteric ones that might not be present for every dataset, like temporal coverage or determined data quality. Likewise, distributions (the actual data files, or URLs linking to them) are also modelled as records, with their own sets of aspects covering basic metadata as well as more specific details, like whether the URL to the file worked when last tested.
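
As a rough illustration of the record/aspect model described above, a record can be pictured as an id plus a mapping from aspect name to JSON document. The sketch below is plain Python, not the Registry's actual API; the aspect names (`basic`, `temporal-coverage`, `link-status`) and all field names are hypothetical.

```python
import json

# Hypothetical shape: a record is an id plus a mapping of aspect name -> JSON document.
dataset = {
    "id": "ds-1",
    "aspects": {
        "basic": {"name": "Rainfall 2019", "description": "Daily rainfall totals"},
        # A more esoteric aspect that not every dataset will carry.
        "temporal-coverage": {"start": "2019-01-01", "end": "2019-12-31"},
    },
}

# Distributions are modelled the same way, with their own aspects.
distribution = {
    "id": "dist-1",
    "aspects": {
        "basic": {"name": "rainfall.csv", "url": "https://example.com/rainfall.csv"},
        "link-status": {"ok": True},
    },
}


def has_aspect(record, aspect_name):
    """Return True if the record carries the named aspect."""
    return aspect_name in record["aspects"]


# Every aspect is an ordinary JSON document, so a record round-trips through JSON.
serialized = json.dumps(dataset)
```

Because both kinds of record share one shape, code that reads or writes aspects does not need to know whether it is dealing with a dataset or a distribution.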

Most importantly, aspects can be declared dynamically by other services by simply making a call with a name, description and JSON schema. This means that if you need to store extra information about a dataset or distribution, you can easily do so by declaring your own aspect. Because the system isn't opinionated about what a record is beyond a set of aspects, you can also use this to add new entities to the system that link together - for instance, we've used this to store projects with a name and description that link to a number of datasets.
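
To make the idea of declaring an aspect concrete, here is a minimal in-memory sketch. It is not the Registry's real HTTP API: the class `InMemoryRegistry`, its methods, and the `project` aspect are hypothetical, and the validation only checks the schema's `required` keys rather than implementing JSON Schema in full.

```python
class InMemoryRegistry:
    """Toy stand-in for the Registry (hypothetical, for illustration only)."""

    def __init__(self):
        self.aspect_definitions = {}
        self.records = {}

    def declare_aspect(self, aspect_id, name, description, json_schema):
        # An aspect is declared with a name, a description and a JSON schema.
        self.aspect_definitions[aspect_id] = {
            "name": name,
            "description": description,
            "json_schema": json_schema,
        }

    def put_aspect(self, record_id, aspect_id, data):
        if aspect_id not in self.aspect_definitions:
            raise KeyError(f"aspect {aspect_id!r} has not been declared")
        schema = self.aspect_definitions[aspect_id]["json_schema"]
        # Simplified check: only the schema's "required" keys are enforced.
        missing = [key for key in schema.get("required", []) if key not in data]
        if missing:
            raise ValueError(f"missing required fields: {missing}")
        self.records.setdefault(record_id, {})[aspect_id] = data


registry = InMemoryRegistry()
registry.declare_aspect(
    "project",
    name="Project",
    description="A project that links to a number of datasets",
    json_schema={"type": "object", "required": ["name", "description", "datasets"]},
)
registry.put_aspect(
    "project-1",
    "project",
    {"name": "Flood study", "description": "Flood modelling", "datasets": ["ds-1", "ds-2"]},
)
```

The `datasets` field holds ids of other records, which is one simple way to express the kind of link between entities described above.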

Connectors

Connectors go out to external datasources and copy their metadata into the Registry, so that it can be searched and have other aspects attached to it. A connector is simply a docker-based microservice that is invoked as a job. It scans the target datasource (usually an open-data portal), then completes and shuts down. We have connectors for a number of existing open data formats; otherwise, you can easily write and run your own.
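
The scan-copy-exit lifecycle of a connector can be sketched as a single function that runs once and returns. This is an illustrative sketch only: `run_connector`, the shape of the source items (`id`, `title`, `notes`), and the `source` aspect are assumptions, and a plain dict stands in for the Registry.

```python
def run_connector(source_id, source_datasets, registry):
    """Copy metadata from an external source into the registry, then finish.

    `source_datasets` stands in for whatever the external portal returns;
    `registry` is a plain dict of record id -> aspects (hypothetical shapes).
    """
    copied = 0
    for item in source_datasets:
        # Prefix ids with the source so records from different portals cannot collide.
        record_id = f"{source_id}-{item['id']}"
        registry[record_id] = {
            "basic": {"name": item["title"], "description": item.get("notes", "")},
            "source": {"id": source_id},
        }
        copied += 1
    return copied  # the job is complete once the scan ends


registry = {}
portal = [
    {"id": "a1", "title": "Bus timetables", "notes": "Weekly timetable export"},
    {"id": "b2", "title": "Tree census"},
]
count = run_connector("example-portal", portal, registry)
```

Running the same connector again overwrites the same record ids, so repeated scans keep the copied metadata in step with the source.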

Minions

A minion is a service that listens for new records or changes to existing records, performs some kind of operation and then writes the result back to the registry. For instance, we have a broken link minion that listens for changes to distributions, retrieves the URLs described, records whether they could be accessed successfully and then writes that back to the registry in its own aspect.
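
The listen, operate, write-back loop can be sketched with an in-process callback. This is an assumption-laden illustration rather than how Magda's minions are actually wired up: the `Registry` class, the aspect names `distribution-details` and `link-status`, and the injected `check_url` function (used here instead of a real HTTP request) are all hypothetical.

```python
class Registry:
    """Toy registry that notifies listeners whenever an aspect is written."""

    def __init__(self):
        self.records = {}
        self.listeners = []

    def subscribe(self, listener):
        self.listeners.append(listener)

    def put_aspect(self, record_id, aspect_id, data):
        self.records.setdefault(record_id, {})[aspect_id] = data
        for listener in self.listeners:
            listener(self, record_id, aspect_id)


def make_broken_link_minion(check_url):
    """Build a listener that records link health in its own aspect."""

    def on_change(registry, record_id, aspect_id):
        # Only react to distribution changes; ignoring other aspects (including
        # our own "link-status" writes) avoids an endless notification loop.
        if aspect_id != "distribution-details":
            return
        url = registry.records[record_id][aspect_id]["url"]
        registry.put_aspect(record_id, "link-status", {"url": url, "ok": check_url(url)})

    return on_change


registry = Registry()
reachable = {"https://example.com/data.csv"}  # pretend only this URL responds
registry.subscribe(make_broken_link_minion(lambda url: url in reachable))
registry.put_aspect("dist-1", "distribution-details", {"url": "https://example.com/data.csv"})
registry.put_aspect("dist-2", "distribution-details", {"url": "https://example.com/missing.csv"})
```

Keeping the result in a separate aspect means the minion never overwrites metadata that came from the original source.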

Other aspects exist that are written to by many minions - for instance, we have a "quality" aspect that contains a number of different quality ratings from different sources, which are averaged out and used by search.
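
A shared aspect of this kind might look like the sketch below, where each source writes its rating under its own key and the overall score is a plain average. The function names, the source names and the 0 to 1 rating scale are assumptions for illustration; the real "quality" aspect may be structured differently.

```python
from statistics import mean


def record_rating(record, source, score):
    """Store one source's rating inside the shared "quality" aspect."""
    record.setdefault("quality", {})[source] = score


def overall_quality(record):
    """Average all ratings, or return None if nothing has been rated yet."""
    ratings = list(record.get("quality", {}).values())
    return mean(ratings) if ratings else None


dataset_aspects = {}
record_rating(dataset_aspects, "format-minion", 0.5)
record_rating(dataset_aspects, "link-minion", 1.0)
score = overall_quality(dataset_aspects)  # 0.75
```

Keying ratings by source lets each minion update its own entry without disturbing the others.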

Search

Datasets and distributions in the registry are ingested into an ElasticSearch cluster, which indexes a few core aspects of each and exposes an API.
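
To give a feel for what "indexing a few core aspects" means, here is a toy inverted index over the name and description of each record. It is only a stand-in: ElasticSearch does far more (analysis, scoring, distribution), and the `basic` aspect and function names below are hypothetical.

```python
from collections import defaultdict


def build_index(records):
    """Map each lowercase token in a record's core fields to the ids containing it."""
    index = defaultdict(set)
    for record_id, aspects in records.items():
        basic = aspects.get("basic", {})
        text = f"{basic.get('name', '')} {basic.get('description', '')}"
        for token in text.lower().split():
            index[token].add(record_id)
    return index


def search(index, query):
    """Return ids of records that contain every token in the query."""
    tokens = query.lower().split()
    if not tokens:
        return set()
    return set.intersection(*(index.get(token, set()) for token in tokens))


records = {
    "ds-1": {"basic": {"name": "Rainfall 2019", "description": "Daily rainfall totals"}},
    "ds-2": {"basic": {"name": "Tree census", "description": "Street tree totals"}},
}
index = build_index(records)
hits = search(index, "rainfall totals")
```

Only the indexed fields are searchable here, which mirrors the idea that search works from a subset of aspects rather than the full record.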

User Interface

Magda provides a user interface, which is served from its own microservice and consumes the APIs. We're planning to make the UI itself extensible with plugins at some point in the future.

To try the latest version (with prebuilt images)

Use https://github.com/magda-io/magda-config

To build and run from source

https://magda.io/docs/building-and-running

To get help with developing or running Magda

Start a discussion at https://spectrum.chat/magda. There's not a lot on there yet, but we monitor it closely :).

Want to get help deploying it into your organisation?

Email us at [email protected].

Want to contribute?

Great! Take a look at https://github.com/magda-io/magda/blob/master/.github/CONTRIBUTING.md :).

Stars: ✭ 193