226 open source projects that are alternatives to or similar to centurion

Converts record-oriented data to columnar format.

Stars: ✭ 28 (-91.25%)

A cross-platform (Windows, Mac, Linux) desktop application for viewing common big data binary formats such as Parquet, ORC, and Avro. Supports the local file system, HDFS, AWS S3, Azure Blob Storage, etc.

Stars: ✭ 86 (-73.12%)

Mutual labels: bigdata, parquet

bigdata-tech-index

Big Data Technology Index

Stars: ✭ 24 (-92.5%)

Mutual labels: bigdata

DaFlow

Apache Spark-based data flow (ETL) framework that supports multiple read and write destinations of different types, as well as multiple categories of transformation rules.

Stars: ✭ 24 (-92.5%)

Mutual labels: parquet

StreamBench

Measuring the performance of popular streaming engines with Yahoo's Streaming Benchmark

Stars: ✭ 52 (-83.75%)

Mutual labels: bigdata

hadoopoffice

HadoopOffice - Analyze Office documents using the Hadoop ecosystem (Spark/Flink/Hive)

Stars: ✭ 56 (-82.5%)

Mutual labels: bigdata

BigDataTools

Tools for big data

Stars: ✭ 36 (-88.75%)

Mutual labels: bigdata

gan deeplearning4j

Automatic feature engineering using Generative Adversarial Networks with Deeplearning4j and Apache Spark.

Stars: ✭ 19 (-94.06%)

Mutual labels: bigdata

taller SparkR

SparkR workshop for the Jornadas de Usuarios de R (R Users Conference)

Stars: ✭ 12 (-96.25%)

Mutual labels: bigdata

amas

Amas is a recursive acronym for “Amas, monitor alert system”.

Stars: ✭ 77 (-75.94%)

Mutual labels: bigdata

meetups-archivos

Slides, code, and videos from the meetups, data science days, video calls, and workshops. Data Science Research is a non-profit organization that seeks to spread and decentralize knowledge of Data Science and Artificial Intelligence in Peru, giving opportunities to new talent through MeetUps, Workshops, and Semilleros …

Stars: ✭ 60 (-81.25%)

Mutual labels: bigdata

parquet-extra

A collection of Apache Parquet add-on modules

Stars: ✭ 30 (-90.62%)

Mutual labels: parquet

IMCtermite

Enables extraction of measurement data from binary files with extension 'raw' used by proprietary software imcFAMOS/imcSTUDIO and facilitates its storage in open source file formats

Stars: ✭ 20 (-93.75%)

Mutual labels: parquet

flokkr

Documentation placeholder and utilities for all the other containers.

Stars: ✭ 30 (-90.62%)

Mutual labels: bigdata

learning-spark

Tidied-up Spark and Hadoop tutorials.

Stars: ✭ 28 (-91.25%)

Mutual labels: bigdata

room-renting

Scrapes housing listings from Anjuke with Python and visualizes them with Amap (Gaode Maps)

Stars: ✭ 16 (-95%)

Mutual labels: bigdata

Notes

Study notes | Java fundamentals, JVM, source code, big data, interview experiences

Stars: ✭ 69 (-78.44%)

Mutual labels: bigdata

UnROOT.jl

Native Julia I/O package to work with CERN ROOT files

Stars: ✭ 52 (-83.75%)

Mutual labels: bigdata

anovos

Anovos - An open source library for scalable feature engineering using Apache Spark

Stars: ✭ 77 (-75.94%)

Mutual labels: bigdata

v6.dooring.public

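A large-screen data visualization solution that provides a visual editing engine, helping individuals and enterprises easily build their own large-screen visualization applications.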

Stars: ✭ 323 (+0.94%)

Mutual labels: bigdata

zdh web

Big data collection and extraction platform

Stars: ✭ 292 (-8.75%)

Mutual labels: bigdata

Parquet.jl

Julia implementation of Parquet columnar file format reader

Stars: ✭ 93 (-70.94%)

Mutual labels: parquet

dt-sql-parser

SQL parsers for big data, built with ANTLR4.

Stars: ✭ 135 (-57.81%)

Mutual labels: bigdata

SparkProgrammingInScala

Apache Spark Course Material

Stars: ✭ 57 (-82.19%)

Mutual labels: bigdata

2019 egu workshop jupyter notebooks

Short course on interactive analysis of Big Earth Data with Jupyter Notebooks

Stars: ✭ 29 (-90.94%)

Mutual labels: bigdata

pyorc

Python module for Apache ORC file format

Stars: ✭ 58 (-81.87%)

Mutual labels: orc

bigdata-doc

A compilation of big data study notes, learning roadmaps, and technical case studies.

Stars: ✭ 37 (-88.44%)

Mutual labels: bigdata

odbc2parquet

A command-line tool to query an ODBC data source and write the result into a Parquet file.

Stars: ✭ 95 (-70.31%)

Mutual labels: parquet

parquet-usql

A custom extractor designed to read Parquet files for Azure Data Lake Analytics

Stars: ✭ 13 (-95.94%)

Mutual labels: parquet

databricks-notebooks

Collection of Databricks and Jupyter Notebooks

Stars: ✭ 19 (-94.06%)

Mutual labels: parquet

datasphere-service

An open source dataworks platform

Stars: ✭ 20 (-93.75%)

Mutual labels: bigdata

coolplayflink

Flink: Stateful Computations over Data Streams

Stars: ✭ 14 (-95.62%)

Mutual labels: bigdata

Spark

Apache Spark is a fast, in-memory data processing engine with elegant and expressive development APIs that allow data workers to efficiently execute streaming, machine learning, or SQL workloads that require fast iterative access to datasets. This project will have sample programs for Spark in the Scala language.

Stars: ✭ 55 (-82.81%)

Mutual labels: parquet

big data

A collection of tutorials on Hadoop, MapReduce, Spark, Docker

Stars: ✭ 34 (-89.37%)

Mutual labels: bigdata

datacatalog-tag-manager

Python package to manage Google Cloud Data Catalog tags, loading metadata from external sources; currently supports the CSV file format

Stars: ✭ 17 (-94.69%)

Mutual labels: bigdata

Exposure

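Exposure is a library for exposure (impression) statistics. It makes it easy to instrument exposure events, and exposure tracking can be added with only minimal changes to existing code. It supports linear, grid, and staggered (waterfall) RV layouts, horizontally scrolling RVs, ScrollView, and other scrolling layouts. The effective exposure area of an item is configurable.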

Stars: ✭ 51 (-84.06%)

Mutual labels: bigdata

terraform-aws-kinesis-firehose

This code creates a Kinesis Firehose in AWS to send CloudWatch log data to S3.

Stars: ✭ 25 (-92.19%)

Mutual labels: parquet

ETL-Starter-Kit

📁 Extract, Transform, Load (ETL) 👷 refers to a process in database usage and especially in data warehousing. This repository contains a starter kit featuring ETL related work.

Stars: ✭ 21 (-93.44%)

Mutual labels: bigdata

qs-hadoop

Learning the big data ecosystem

Stars: ✭ 18 (-94.37%)

Mutual labels: bigdata

SparkTwitterAnalysis

An Apache Spark standalone application using the Spark API in Scala. The application uses Simple Build Tool (SBT) to build the project.

Stars: ✭ 29 (-90.94%)

Mutual labels: bigdata

dockerfiles

Multiple Docker container images for the main big data tools (Hadoop, Spark, Kafka, HBase, Cassandra, Zookeeper, Zeppelin, Drill, Flink, Hive, Hue, Mesos, ...)

Stars: ✭ 29 (-90.94%)

Mutual labels: bigdata

meepo

Data migration across heterogeneous storage systems

Stars: ✭ 29 (-90.94%)

Mutual labels: parquet

the-apache-ignite-book

All code samples, scripts, and more in-depth examples for The Apache Ignite Book. Includes Apache Ignite 2.6 or above.

Stars: ✭ 65 (-79.69%)

Mutual labels: bigdata

cds

Data syncing in golang for ClickHouse.

Stars: ✭ 839 (+162.19%)

Mutual labels: bigdata

jhdf

A pure Java HDF5 library

Stars: ✭ 83 (-74.06%)

Mutual labels: bigdata

bqv

The simplest tool to manage views of BigQuery.

Stars: ✭ 22 (-93.12%)

Mutual labels: bigdata

albis

Albis: High-Performance File Format for Big Data Systems

Stars: ✭ 20 (-93.75%)

Mutual labels: parquet

awesome-bigdata

A curated list of awesome big data frameworks, resources and other awesomeness.

Stars: ✭ 11,093 (+3366.56%)

Mutual labels: bigdata

163-bigdate-note

Big data notes

Stars: ✭ 38 (-88.12%)

Mutual labels: bigdata

pulsar-user-group-loc-cn

Workspace for China local user group.

Stars: ✭ 19 (-94.06%)

Mutual labels: bigdata

greycat

GreyCat - Data Analytics, Temporal data, What-if, Live machine learning

Stars: ✭ 104 (-67.5%)

Mutual labels: bigdata

wasp

WASP is a framework to build complex real time big data applications. It relies on a kind of Kappa/Lambda architecture mainly leveraging Kafka and Spark. If you need to ingest huge amount of heterogeneous data and analyze them through complex pipelines, this is the framework for you.

Stars: ✭ 19 (-94.06%)

Mutual labels: parquet

lectures-hse-spark

Scalable machine learning and big data analysis with Apache Spark

Stars: ✭ 20 (-93.75%)

Mutual labels: bigdata

parquet2

Fastest and safest Rust implementation of Parquet. `unsafe` free. Integration-tested against pyarrow.

Stars: ✭ 157 (-50.94%)

Mutual labels: parquet

TiBigData

TiDB connectors for Flink/Hive/Presto