Data Accelerator for Apache Spark simplifies onboarding to big data streaming. It offers a rich, easy-to-use experience for creating, editing, and managing Spark jobs on Azure HDInsight or Databricks while enabling the full power of the Spark engine.

Stars: ✭ 247 (+92.97%)

Mutual labels: big-data

Esper Tv

Esper instance for TV news analysis

Stars: ✭ 37 (-71.09%)

Mutual labels: big-data

Aws Etl Orchestrator

A serverless architecture for orchestrating ETL jobs in arbitrarily-complex workflows using AWS Step Functions and AWS Lambda.

Stars: ✭ 245 (+91.41%)

Mutual labels: big-data

Delta

An open-source storage layer that brings scalable, ACID transactions to Apache Spark™ and big data workloads.

Stars: ✭ 3,903 (+2949.22%)

Mutual labels: big-data

Kafka Ui

Open-Source Web GUI for Apache Kafka Management

Stars: ✭ 230 (+79.69%)

Mutual labels: big-data

Richdem

High-performance Terrain and Hydrology Analysis

Stars: ✭ 127 (-0.78%)

Mutual labels: big-data

Eland

Python Client and Toolkit for DataFrames, Big Data, Machine Learning and ETL in Elasticsearch

Stars: ✭ 235 (+83.59%)

Mutual labels: big-data

Fluid

Fluid, elastic data abstraction and acceleration for big data/AI applications in the cloud

Stars: ✭ 265 (+107.03%)

Mutual labels: big-data

Lite Virtual List

Vue-based virtual list component library with waterfall flow support

Stars: ✭ 223 (+74.22%)

Mutual labels: big-data

Predictionio Template Text Classifier

Text Classification Engine

Stars: ✭ 30 (-76.56%)

Mutual labels: big-data

Usql

U-SQL Examples and Issue Tracking

Stars: ✭ 221 (+72.66%)

Mutual labels: big-data

Morpheus

Morpheus brings the leading graph query language, Cypher, onto the leading distributed processing platform, Spark.

Stars: ✭ 303 (+136.72%)

Mutual labels: big-data

Awkward 0.x

Manipulate arrays of complex data structures as easily as NumPy.

Stars: ✭ 216 (+68.75%)

Mutual labels: big-data

Dataengineeringproject

Example end-to-end data engineering project.

Stars: ✭ 82 (-35.94%)

Mutual labels: big-data

Helicalinsight

Helical Insight is the world's first open source Business Intelligence framework, helping you make sense of your data and make well-informed decisions.

Stars: ✭ 214 (+67.19%)

Mutual labels: big-data

Couchdb Fauxton

Apache CouchDB

Stars: ✭ 295 (+130.47%)

Mutual labels: big-data

Attic Predictionio Sdk Python

PredictionIO Python SDK

Stars: ✭ 196 (+53.13%)

Mutual labels: big-data

Qcportal

A client interface to the QCArchive Project (read-only image of QCFractal)

Stars: ✭ 29 (-77.34%)

Mutual labels: big-data

Data Science Live Book

An open source book to learn data science, data analysis and machine learning, suitable for all ages!

Stars: ✭ 193 (+50.78%)

Mutual labels: big-data

Smooks

An extensible Java framework for building XML and non-XML streaming applications

Stars: ✭ 293 (+128.91%)

Mutual labels: big-data

Gun

An open source cybersecurity protocol for syncing decentralized graph data.

Stars: ✭ 15,172 (+11753.13%)

Mutual labels: big-data

Spark R Notebooks

R on Apache Spark (SparkR) tutorials for Big Data analysis and Machine Learning as IPython / Jupyter notebooks

Stars: ✭ 109 (-14.84%)

Mutual labels: big-data

Flume

Mirror of Apache Flume

Stars: ✭ 2,200 (+1618.75%)

Mutual labels: big-data

Flink

Apache Flink is an open source project of The Apache Software Foundation (ASF). The Apache Flink project originated from the Stratosphere research project.

Stars: ✭ 17,781 (+13791.41%)

Mutual labels: big-data

Dvid

Distributed, Versioned, Image-oriented Dataservice

Stars: ✭ 174 (+35.94%)

Mutual labels: big-data

Spark

Apache Spark - A unified analytics engine for large-scale data processing

Stars: ✭ 31,618 (+24601.56%)

Mutual labels: big-data

Attic Predictionio

PredictionIO, a machine learning server for developers and ML engineers.

Stars: ✭ 12,522 (+9682.81%)

Mutual labels: big-data

Trino

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)

Stars: ✭ 4,581 (+3478.91%)

Mutual labels: big-data

Keyvi

Keyvi - the key value index. It is an in-memory FST-based data structure highly optimized for size and lookup performance.

Stars: ✭ 161 (+25.78%)

Mutual labels: big-data

Uproot4

ROOT I/O in pure Python and NumPy.

Stars: ✭ 80 (-37.5%)

Mutual labels: big-data

Presto

The official home of the Presto distributed SQL query engine for big data

Stars: ✭ 12,957 (+10022.66%)

Mutual labels: big-data

Parquet Dotnet

🏐 Apache Parquet for modern .NET

Stars: ✭ 276 (+115.63%)

Mutual labels: big-data

Spark.jl

Julia binding for Apache Spark

Stars: ✭ 153 (+19.53%)

Mutual labels: big-data

Phoenix

Mirror of Apache Phoenix

Stars: ✭ 867 (+577.34%)

Mutual labels: big-data

Fili

Easily make RESTful web services for time series reporting with Big Data analytics engines like Druid and SQL Databases.

Stars: ✭ 151 (+17.97%)

Mutual labels: big-data

Datahub

The Metadata Platform for the Modern Data Stack

Stars: ✭ 4,232 (+3206.25%)

Mutual labels: big-data

Parquetviewer

Simple Windows desktop application for viewing and querying Apache Parquet files

Stars: ✭ 145 (+13.28%)

Mutual labels: big-data

Hdfs Shell

HDFS Shell is an HDFS manipulation tool for working with the functions integrated in Hadoop DFS

Stars: ✭ 117 (-8.59%)

Mutual labels: big-data

Hydrograph

A visual ETL development and debugging tool for big data

Stars: ✭ 144 (+12.5%)

Mutual labels: big-data

bigstatsr

R package for statistical tools with big matrices stored on disk.

Stars: ✭ 139 (+8.59%)

Mutual labels: big-data

Storm Doc Zh

Chinese edition of the official Apache Storm documentation

Stars: ✭ 142 (+10.94%)

Mutual labels: big-data

Sparkjni

A heterogeneous Apache Spark framework.

Stars: ✭ 11 (-91.41%)

Mutual labels: big-data

Belajarpython.com

Open Source Indonesian Python Programming Tutorial Site

Stars: ✭ 141 (+10.16%)

Mutual labels: big-data

mmtf-workshop-2018

Structural Bioinformatics Training Workshop & Hackathon 2018

Stars: ✭ 50 (-60.94%)

Mutual labels: big-data

Hazelcast Go Client

Hazelcast IMDG Go Client

Stars: ✭ 140 (+9.38%)

Mutual labels: big-data

Setl

A simple Spark-powered ETL framework that just works 🍺

Stars: ✭ 79 (-38.28%)

Mutual labels: big-data

bigdata-fun

A complete (distributed) BigData stack, running in containers

Stars: ✭ 14 (-89.06%)

Mutual labels: big-data

Azuredatalake

Samples and Docs for Azure Data Lake Store and Analytics

Stars: ✭ 128 (+0%)

Mutual labels: big-data

Griffon Vm

Griffon Data Science Virtual Machine

Stars: ✭ 128 (+0%)

Mutual labels: big-data

Report

Automated, configurable reporting platform. Demo: http://58.87.112.247/report (account: visitor, password: 123456)

Stars: ✭ 123 (-3.91%)

Mutual labels: big-data

Just Dashboard

📊 📋 Dashboards using YAML or JSON files

Stars: ✭ 1,511 (+1080.47%)

Mutual labels: big-data

Logisland

Scalable stream processing platform for advanced real-time analytics on top of Kafka and Spark. LogIsland also supports MQTT and Kafka Streams (Flink is on the roadmap). The platform performs complex event processing and is suitable for time series analysis. A large set of ready-to-use processors, data sources, and sinks is available.

Stars: ✭ 97 (-24.22%)

Mutual labels: big-data

Attic Lens

Mirror of Apache Lens

Stars: ✭ 58 (-54.69%)

Mutual labels: big-data

Courses

Coursera quizzes and assignments

Stars: ✭ 454 (+254.69%)

Mutual labels: big-data

GDLibrary

MATLAB library for gradient descent algorithms: Version 1.0.1

Stars: ✭ 50 (-60.94%)

Mutual labels: big-data

301-360 of 369 similar projects