Linkis helps easily connect to various back-end computation/storage engines (Spark, Python, TiDB...) and exposes various interfaces (REST, JDBC, Java...), with multi-tenancy, high performance, and resource control.

Stars: ✭ 2,459 (+3222.97%)

Mutual labels: jdbc, impala

Kyuubi

Kyuubi is a unified multi-tenant JDBC interface for large-scale data processing and analytics, built on top of Apache Spark

Stars: ✭ 363 (+390.54%)

Mutual labels: jdbc, odbc

datar

A Grammar of Data Manipulation in Python

Stars: ✭ 142 (+91.89%)

Mutual labels: dplyr, tidyverse

wasp

WASP is a framework to build complex real-time big data applications. It relies on a kind of Kappa/Lambda architecture, mainly leveraging Kafka and Spark. If you need to ingest huge amounts of heterogeneous data and analyze it through complex pipelines, this is the framework for you.

Stars: ✭ 19 (-74.32%)

Mutual labels: hadoop, jdbc

Trino

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)

Stars: ✭ 4,581 (+6090.54%)

Mutual labels: hadoop, jdbc

Sqli

ORM SQL interface: Criteria, CriteriaBuilder, ResultMapBuilder

Stars: ✭ 1,644 (+2121.62%)

Mutual labels: jdbc, impala

liquibase-impala

Liquibase extension to add Impala Database support

Stars: ✭ 23 (-68.92%)

Mutual labels: hadoop, impala

Tidyheatmap

Draw heatmap simply using a tidy data frame

Stars: ✭ 151 (+104.05%)

Mutual labels: dplyr, tidyverse

Spark With Python

Fundamentals of Spark with Python (using PySpark), code examples

Stars: ✭ 150 (+102.7%)

Mutual labels: hadoop, apache

yarn-prometheus-exporter

Export Hadoop YARN (resource-manager) metrics in prometheus format

Stars: ✭ 44 (-40.54%)

Mutual labels: hadoop, apache

Linkis

Stars: ✭ 2,323 (+3039.19%)

Mutual labels: jdbc, impala

Drill

Apache Drill is a distributed MPP query layer for self-describing data

Stars: ✭ 1,619 (+2087.84%)

Mutual labels: hadoop, jdbc

Tez

Apache Tez

Stars: ✭ 313 (+322.97%)

Mutual labels: hadoop, apache

Moderndive book

Statistical Inference via Data Science: A ModernDive into R and the Tidyverse

Stars: ✭ 527 (+612.16%)

Mutual labels: dplyr, tidyverse

Tidyquant

Bringing financial analysis to the tidyverse

Stars: ✭ 635 (+758.11%)

Mutual labels: dplyr, tidyverse

Addax

Addax is an open source universal ETL tool that supports most RDBMS and NoSQL databases, helping you transfer data from any one place to another.

Stars: ✭ 615 (+731.08%)

Mutual labels: hadoop, impala

R4ds Exercise Solutions

Exercise solutions to "R for Data Science"

Stars: ✭ 226 (+205.41%)

Mutual labels: dplyr, tidyverse

Nutch

Apache Nutch is an extensible and scalable web crawler

Stars: ✭ 2,277 (+2977.03%)

Mutual labels: hadoop, apache

eeguana

A package for manipulating EEG data in R.

Stars: ✭ 16 (-78.38%)

Mutual labels: dplyr, tidyverse

CSSS508

CSSS508: Introduction to R for Social Scientists

Stars: ✭ 28 (-62.16%)

Mutual labels: dplyr, tidyverse

Ibis

A pandas-like deferred expression system, with first-class SQL support

Stars: ✭ 1,630 (+2102.7%)

Mutual labels: hadoop, impala

parcours-r

Teaching kit for R training

Stars: ✭ 25 (-66.22%)

Mutual labels: dplyr, tidyverse

apache-flink-jdbc-streaming

Sample project for Apache Flink with Streaming Engine and JDBC Sink

Stars: ✭ 22 (-70.27%)

Mutual labels: jdbc, apache

Timetk

A toolkit for working with time series in R

Stars: ✭ 371 (+401.35%)

Mutual labels: dplyr, tidyverse

Tidylog

Tidylog provides feedback about dplyr and tidyr operations. It provides wrapper functions for the most common functions, such as filter, mutate, select, and group_by, and provides detailed output for joins.

Stars: ✭ 428 (+478.38%)

Mutual labels: dplyr, tidyverse

casewhen

Create reusable dplyr::case_when() functions

Stars: ✭ 64 (-13.51%)

Mutual labels: dplyr, tidyverse

Hive

Apache Hive

Stars: ✭ 4,031 (+5347.3%)

Mutual labels: hadoop, apache

hive to es

A small tool for syncing Hive data warehouse data to Elasticsearch

Stars: ✭ 21 (-71.62%)

Mutual labels: hadoop, impala

hive-bigquery-storage-handler

Hive Storage Handler for interoperability between BigQuery and Apache Hive

Stars: ✭ 16 (-78.38%)

Mutual labels: hadoop, apache

jmx exporter-cloudera-hadoop

Prometheus jmx_exporter configurations for Cloudera Hadoop

Stars: ✭ 33 (-55.41%)

Mutual labels: hadoop

hadoop-ecosystem

Visualizations of the Hadoop Ecosystem

Stars: ✭ 20 (-72.97%)

Mutual labels: hadoop

R-data-wrangling

Materials for my R data workshop. https://cengel.github.io/R-data-wrangling/

Stars: ✭ 17 (-77.03%)

Mutual labels: tidyverse

skein

A tool and library for easily deploying applications on Apache YARN

Stars: ✭ 128 (+72.97%)

Mutual labels: hadoop

java

📚 Resources for learning Java

Stars: ✭ 31 (-58.11%)

Mutual labels: jdbc

expresso-php

Fast and simple Docker setup for all your PHP development. Quick but not dirty.

Stars: ✭ 31 (-58.11%)

Mutual labels: apache

xxhadoop

Data analysis using Hadoop, Spark, Storm, Elasticsearch, machine learning, etc. These are my daily notes, code, and demos. Don't fork, just star!

Stars: ✭ 37 (-50%)

Mutual labels: hadoop

DBISProject

Library Management System using Java and MySQL

Stars: ✭ 27 (-63.51%)

Mutual labels: jdbc

trafficserver-ingress-controller

Apache Traffic Server Ingress Controller for Kubernetes

Stars: ✭ 29 (-60.81%)

Mutual labels: apache

pypyodbc

pypyodbc is a pure Python cross platform ODBC interface module (pyodbc compatible as of 2017)

Stars: ✭ 39 (-47.3%)

Mutual labels: odbc

datasqueeze

Hadoop utility to compact small files

Stars: ✭ 18 (-75.68%)

Mutual labels: hadoop

openwhisk-runtime-go

Apache OpenWhisk Runtime Go supports Apache OpenWhisk functions written in Go

Stars: ✭ 31 (-58.11%)

Mutual labels: apache

hadoop-etl-udfs

The Hadoop ETL UDFs are the main way to load data from Hadoop into EXASOL

Stars: ✭ 17 (-77.03%)

Mutual labels: hadoop

spwrap

Simple Stored Procedure call wrapper with no framework dependencies.

Stars: ✭ 24 (-67.57%)

Mutual labels: jdbc

Neo

ORM framework: a minimalist Java ORM framework based on the ActiveRecord pattern

Stars: ✭ 35 (-52.7%)

Mutual labels: jdbc

memex-gate

General Architecture for Text Engineering

Stars: ✭ 47 (-36.49%)

Mutual labels: hadoop

KBC--Kaun-Banega-Crorepati

A Core Java game based on the Indian television game show, with the best animation possible in Core Java (5000+ lines)

Stars: ✭ 38 (-48.65%)

Mutual labels: jdbc

docker-oxid6

Docker Container with PHP7, MySQL 5.7 and OXID eShop 6

Stars: ✭ 30 (-59.46%)

Mutual labels: apache

uima-uimaj

Apache UIMA Java SDK

Stars: ✭ 50 (-32.43%)

Mutual labels: apache

sparkucx

A high-performance, scalable and efficient ShuffleManager plugin for Apache Spark, utilizing UCX communication layer

Stars: ✭ 32 (-56.76%)

Mutual labels: hadoop

disq

A library for manipulating bioinformatics sequencing formats in Apache Spark

Stars: ✭ 29 (-60.81%)

Mutual labels: hadoop

corc

An ORC File Scheme for the Cascading data processing platform.

Stars: ✭ 14 (-81.08%)

Mutual labels: hadoop

odbc2parquet

A command line tool to query an ODBC data source and write the result into a parquet file.

Stars: ✭ 95 (+28.38%)

Mutual labels: odbc

BigInsights-on-Apache-Hadoop

Example projects for 'BigInsights for Apache Hadoop' on IBM Bluemix

Stars: ✭ 21 (-71.62%)

Mutual labels: hadoop

1-60 of 736 similar projects
