
420 open source projects that are alternatives to or similar to Parquet Mr

Succinct
Enabling queries on compressed data.
Stars: ✭ 257 (-79.89%)
Mutual labels:  big-data
Detecting-Malicious-URL-Machine-Learning
No description or website provided.
Stars: ✭ 47 (-96.32%)
Mutual labels:  big-data
Node Parquet
Node.js module for accessing Apache Parquet format files
Stars: ✭ 46 (-96.4%)
Mutual labels:  parquet
Clickhouse
ClickHouse® is a free analytics DBMS for big data
Stars: ✭ 21,089 (+1550.16%)
Mutual labels:  big-data
bandar-log
Monitoring tool to measure flow throughput of data sources and processing components that are part of Data Ingestion and ETL pipelines.
Stars: ✭ 20 (-98.44%)
Mutual labels:  big-data
Vue Virtual Scroll List
⚡️ A Vue component that supports very large data lists with high, efficient rendering performance.
Stars: ✭ 3,201 (+150.47%)
Mutual labels:  big-data
Kafka Streams
equivalent to kafka-streams 🐙 for nodejs ✨🐢🚀✨
Stars: ✭ 613 (-52.03%)
Mutual labels:  big-data
Data Accelerator
Data Accelerator for Apache Spark simplifies onboarding to Streaming of Big Data. It offers a rich, easy to use experience to help with creation, editing and management of Spark jobs on Azure HDInsights or Databricks while enabling the full power of the Spark engine.
Stars: ✭ 247 (-80.67%)
Mutual labels:  big-data
bigtable
TypeScript Bigtable Client with 🔋🔋 included.
Stars: ✭ 13 (-98.98%)
Mutual labels:  big-data
Aws Etl Orchestrator
A serverless architecture for orchestrating ETL jobs in arbitrarily-complex workflows using AWS Step Functions and AWS Lambda.
Stars: ✭ 245 (-80.83%)
Mutual labels:  big-data
Panoptes
A Global Scale Network Telemetry Ecosystem
Stars: ✭ 80 (-93.74%)
Mutual labels:  big-data
Kafka Ui
Open-Source Web GUI for Apache Kafka Management
Stars: ✭ 230 (-82%)
Mutual labels:  big-data
bigdata-fun
A complete (distributed) BigData stack, running in containers
Stars: ✭ 14 (-98.9%)
Mutual labels:  big-data
Eland
Python Client and Toolkit for DataFrames, Big Data, Machine Learning and ETL in Elasticsearch
Stars: ✭ 235 (-81.61%)
Mutual labels:  big-data
Oozie
Mirror of Apache Oozie
Stars: ✭ 602 (-52.9%)
Mutual labels:  big-data
Lite Virtual List
Vue-based virtual list component library with waterfall-flow support
Stars: ✭ 223 (-82.55%)
Mutual labels:  big-data
aut
The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
Stars: ✭ 111 (-91.31%)
Mutual labels:  big-data
Usql
U-SQL Examples and Issue Tracking
Stars: ✭ 221 (-82.71%)
Mutual labels:  big-data
Moosefs
MooseFS – Open Source, Petabyte, Fault-Tolerant, Highly Performing, Scalable Network Distributed File System (Software-Defined Storage)
Stars: ✭ 1,025 (-19.8%)
Mutual labels:  big-data
Real Time Social Media Mining
DevOps pipeline for Real Time Social/Web Mining
Stars: ✭ 22 (-98.28%)
Mutual labels:  big-data
predictionio-template-java-ecom-recommender
PredictionIO E-Commerce Recommendation Engine Template (Java-based parallelized engine)
Stars: ✭ 36 (-97.18%)
Mutual labels:  big-data
Helicalinsight
Helical Insight is the world’s first open source Business Intelligence framework; it helps you make sense of your data and make well-informed decisions.
Stars: ✭ 214 (-83.26%)
Mutual labels:  big-data
Giraph
Mirror of Apache Giraph
Stars: ✭ 569 (-55.48%)
Mutual labels:  big-data
Attic Predictionio Sdk Python
PredictionIO Python SDK
Stars: ✭ 196 (-84.66%)
Mutual labels:  big-data
ibmpairs
Open source tools for interaction with IBM PAIRS
Stars: ✭ 23 (-98.2%)
Mutual labels:  big-data
Data Science Live Book
An open source book to learn data science, data analysis and machine learning, suitable for all ages!
Stars: ✭ 193 (-84.9%)
Mutual labels:  big-data
Rsparkling
RSparkling: Use H2O Sparkling Water from R (Spark + R + Machine Learning)
Stars: ✭ 65 (-94.91%)
Mutual labels:  big-data
Gun
An open source cybersecurity protocol for syncing decentralized graph data.
Stars: ✭ 15,172 (+1087.17%)
Mutual labels:  big-data
spark-acid
ACID Data Source for Apache Spark based on Hive ACID
Stars: ✭ 91 (-92.88%)
Mutual labels:  big-data
Pachyderm
Reproducible Data Science at Scale!
Stars: ✭ 5,305 (+315.1%)
Mutual labels:  big-data
Pretzel
Javascript full-stack framework for Big Data visualisation and analysis
Stars: ✭ 26 (-97.97%)
Mutual labels:  big-data
Oap
Optimized Analytics Package for Spark* Platform
Stars: ✭ 343 (-73.16%)
Mutual labels:  parquet
GDLibrary
Matlab library for gradient descent algorithms: Version 1.0.1
Stars: ✭ 50 (-96.09%)
Mutual labels:  big-data
Dvid
Distributed, Versioned, Image-oriented Dataservice
Stars: ✭ 174 (-86.38%)
Mutual labels:  big-data
AverageShiftedHistograms.jl
⚡ Lightning fast density estimation in Julia ⚡
Stars: ✭ 52 (-95.93%)
Mutual labels:  big-data
Attic Predictionio
PredictionIO, a machine learning server for developers and ML engineers.
Stars: ✭ 12,522 (+879.81%)
Mutual labels:  big-data
Quilt
Quilt is a self-organizing data hub for S3
Stars: ✭ 1,007 (-21.21%)
Mutual labels:  parquet
Keyvi
Keyvi - the key value index. It is an in-memory FST-based data structure highly optimized for size and lookup performance.
Stars: ✭ 161 (-87.4%)
Mutual labels:  big-data
hadoop-data-ingestion-tool
OLAP and ETL of Big Data
Stars: ✭ 17 (-98.67%)
Mutual labels:  big-data
Presto
The official home of the Presto distributed SQL query engine for big data
Stars: ✭ 12,957 (+913.85%)
Mutual labels:  big-data
Couchdb
Seamless multi-master syncing database with an intuitive HTTP/JSON API, designed for reliability
Stars: ✭ 5,166 (+304.23%)
Mutual labels:  big-data
Spark.jl
Julia binding for Apache Spark
Stars: ✭ 153 (-88.03%)
Mutual labels:  big-data
alluxio-py
Alluxio Python client - Access Any Data Source with Python
Stars: ✭ 18 (-98.59%)
Mutual labels:  big-data
Fili
Easily make RESTful web services for time series reporting with Big Data analytics engines like Druid and SQL Databases.
Stars: ✭ 151 (-88.18%)
Mutual labels:  big-data
Cookbook
The Data Engineering Cookbook
Stars: ✭ 9,829 (+669.09%)
Mutual labels:  big-data
centurion
Kotlin Bigdata Toolkit
Stars: ✭ 320 (-74.96%)
Mutual labels:  parquet
Hydrograph
A visual ETL development and debugging tool for big data
Stars: ✭ 144 (-88.73%)
Mutual labels:  big-data
Arkime
Arkime (formerly Moloch) is an open source, large scale, full packet capturing, indexing, and database system.
Stars: ✭ 4,994 (+290.77%)
Mutual labels:  big-data
Storm Doc Zh
Chinese translation of the official Apache Storm documentation
Stars: ✭ 142 (-88.89%)
Mutual labels:  big-data
meepo
Data migration across heterogeneous storage systems
Stars: ✭ 29 (-97.73%)
Mutual labels:  parquet
Egads
A Java package to automatically detect anomalies in large scale time-series data
Stars: ✭ 997 (-21.99%)
Mutual labels:  big-data
airavata-django-portal
Mirror of Apache Airavata Django Portal
Stars: ✭ 20 (-98.44%)
Mutual labels:  big-data
Stroom
Stroom is a highly scalable data storage, processing and analysis platform.
Stars: ✭ 344 (-73.08%)
Mutual labels:  big-data
lcbo-api
A crawler and API server for Liquor Control Board of Ontario retail data
Stars: ✭ 152 (-88.11%)
Mutual labels:  big-data
hotmap
WebGL Heatmap Viewer for Big Data and Bioinformatics
Stars: ✭ 13 (-98.98%)
Mutual labels:  big-data
Dataengineeringproject
Example end to end data engineering project.
Stars: ✭ 82 (-93.58%)
Mutual labels:  big-data
Sparksql Protobuf
Read SparkSQL parquet file as RDD[Protobuf]
Stars: ✭ 82 (-93.58%)
Mutual labels:  parquet
Setl
A simple Spark-powered ETL framework that just works 🍺
Stars: ✭ 79 (-93.82%)
Mutual labels:  big-data
Appdocs
Application Performance Optimization Summary
Stars: ✭ 1,169 (-8.53%)
Mutual labels:  big-data
Docker Spark Cluster
A Spark cluster setup running on Docker containers
Stars: ✭ 57 (-95.54%)
Mutual labels:  big-data