projectnessie / Nessie

Licence: apache-2.0
Nessie provides Git-like capabilities for your Data Lake

Project Nessie

Project Nessie is a system that provides Git-like capabilities for Iceberg tables, Delta Lake tables, Hive tables and SQL views.

More information can be found at projectnessie.org.

Using Nessie

You can quickly get started with Nessie by using our small, fast Docker image:

docker pull projectnessie/nessie
docker run -p 19120:19120 projectnessie/nessie

Then install the Nessie CLI tool:

pip install pynessie
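
Before moving on, it can help to confirm that the container started above is actually answering on port 19120. The following is a minimal Python sketch, not part of Nessie or pynessie: the function names are hypothetical, and the `/api/v1/trees` path is an assumption about the REST API that may differ between Nessie versions.

```python
import time
import urllib.error
import urllib.request


def http_status(url, timeout=2.0):
    """Return the HTTP status code for a GET on url, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (for example 404).
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, and similar.
        return None


def wait_for_nessie(base_url="http://localhost:19120", path="/api/v1/trees",
                    attempts=10, delay=1.0, fetch=http_status, sleep=time.sleep):
    """Poll base_url + path until it returns HTTP 200 or attempts run out.

    The default path is an assumed endpoint; adjust it to your Nessie version.
    fetch and sleep are injectable so the loop can be exercised without a server.
    """
    url = base_url.rstrip("/") + path
    for attempt in range(attempts):
        if fetch(url) == 200:
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

Calling `wait_for_nessie()` with no arguments polls the local container for up to about ten seconds and returns `True` once it responds with HTTP 200.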

From there, you can use one of our technology integrations.

Have fun! We have a Google Group and a Slack channel for both developers and users.

Building and Developing Nessie

Requirements

  • JDK 11 or higher is needed to build Nessie (artifacts are built for Java 8)

Installation

Clone this repository and run Maven:

git clone https://github.com/projectnessie/nessie
cd nessie
./mvnw clean install

Delta Lake artifacts

Nessie requires some minor changes to Delta Lake for full support of branching and history. These changes are currently being integrated into the mainline repository. Until they are merged, custom builds from our fork are available for download from a separate Maven repository.

Distribution

To run:

  1. Adjust the configuration in servers/quarkus-server/src/main/resources/application.properties
  2. Execute ./mvnw quarkus:dev
  3. Go to http://localhost:19120

UI

To run the UI (from the ui directory):

  1. If you are running in test, ensure that setupProxy.js points to the correct API instance. This avoids CORS issues during testing
  2. npm install will install dependencies
  3. npm run start will start the UI in development mode via Node

To deploy the UI (from the ui directory):

  1. npm install will install dependencies
  2. npm run build will minify and collect the package for deployment in the build directory
  3. The build directory can be deployed to any static hosting environment or served locally with serve -s build

Docker image

Running mvn clean install also builds a Docker image named projectnessie/nessie, which can be started with docker run -p 19120:19120 projectnessie/nessie plus any relevant environment variables. Environment variables are resolved according to the MicroProfile Config default config sources: https://github.com/eclipse/microprofile-config/blob/master/spec/src/main/asciidoc/configsources.asciidoc#default-configsources
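
The MicroProfile Config rules linked above look up a property in the environment under three names: the exact property name, the name with every character that is neither alphanumeric nor `_` replaced by `_`, and that second form in upper case. A small Python sketch of that mapping follows. The helper names are hypothetical, and `quarkus.http.port` is used only as an illustrative property, assuming the server honours the standard Quarkus HTTP settings.

```python
import re


def env_var_candidates(property_name):
    """Return the names MicroProfile Config tries for a property, in lookup order."""
    # Replace every character that is not alphanumeric or "_" with "_".
    underscored = re.sub(r"[^A-Za-z0-9_]", "_", property_name)
    return [property_name, underscored, underscored.upper()]


def docker_run_command(overrides, image="projectnessie/nessie", port=19120):
    """Build a `docker run` argument list that passes overrides via -e flags.

    overrides maps property names to values; each is passed under its
    upper-case environment variable form.
    """
    command = ["docker", "run", "-p", f"{port}:{port}"]
    for property_name, value in overrides.items():
        env_name = env_var_candidates(property_name)[-1]
        command += ["-e", f"{env_name}={value}"]
    return command + [image]
```

For example, `docker_run_command({"quarkus.http.port": 19120})` yields the arguments for `docker run -p 19120:19120 -e QUARKUS_HTTP_PORT=19120 projectnessie/nessie`.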

AWS Lambda

You can also deploy Nessie as an AWS Lambda function by following the steps in servers/lambda/README.md
