
mukunku / Parquetviewer

Licence: gpl-3.0
Simple windows desktop application for viewing & querying Apache Parquet files

Projects that are alternatives or similar to ParquetViewer

Bigdata Playground
A complete example of a big data application using: Kubernetes (kops/aws), Apache Spark SQL/Streaming/MLlib, Apache Flink, Scala, Python, Apache Kafka, Apache HBase, Apache Parquet, Apache Avro, Apache Storm, Twitter API, MongoDB, NodeJS, Angular, GraphQL
Stars: ✭ 177 (+22.07%)
Mutual labels:  big-data, apache-spark, parquet
Parquet Dotnet
🏐 Apache Parquet for modern .NET
Stars: ✭ 276 (+90.34%)
Mutual labels:  big-data, apache-spark, parquet
Morpheus
Morpheus brings the leading graph query language, Cypher, onto the leading distributed processing platform, Spark.
Stars: ✭ 303 (+108.97%)
Mutual labels:  big-data, apache-spark
Mist
Serverless proxy for Spark cluster
Stars: ✭ 309 (+113.1%)
Mutual labels:  big-data, apache-spark
Hydrograph
A visual ETL development and debugging tool for big data
Stars: ✭ 144 (-0.69%)
Mutual labels:  big-data, apache-spark
mmtf-workshop-2018
Structural Bioinformatics Training Workshop & Hackathon 2018
Stars: ✭ 50 (-65.52%)
Mutual labels:  big-data, apache-spark
Spark On Lambda
Apache Spark on AWS Lambda
Stars: ✭ 137 (-5.52%)
Mutual labels:  big-data, apache-spark
Parquet Format
Apache Parquet
Stars: ✭ 800 (+451.72%)
Mutual labels:  big-data, parquet
SparkProgrammingInScala
Apache Spark Course Material
Stars: ✭ 57 (-60.69%)
Mutual labels:  big-data, apache-spark
Drill
Apache Drill is a distributed MPP query layer for self describing data
Stars: ✭ 1,619 (+1016.55%)
Mutual labels:  big-data, parquet
Amazon S3 Find And Forget
Amazon S3 Find and Forget is a solution to handle data erasure requests from data lakes stored on Amazon S3, for example, pursuant to the European General Data Protection Regulation (GDPR)
Stars: ✭ 115 (-20.69%)
Mutual labels:  big-data, parquet
Scala Spark Tutorial
Project for James' Apache Spark with Scala course
Stars: ✭ 121 (-16.55%)
Mutual labels:  big-data, apache-spark
aut
The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
Stars: ✭ 111 (-23.45%)
Mutual labels:  big-data, apache-spark
leaflet heatmap
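A simple visualization of Huzhou call data. The data volume is assumed to be too large to draw a heatmap directly in the browser, so the heatmap rendering step is moved to offline computation. After the data is processed in parallel with Apache Spark, the heatmap is also rendered with Apache Spark, and leafletjs then loads the OpenStreetMap layer and the heatmap layer to provide good interactivity. Rendering is currently implemented with Apache Spark; either Apache Spark is not well suited to this kind of computation or I did not design the algorithm well, because the parallel computation is slower than the single-machine computation. The Apache Spark heatmap rendering and computation code is at https://github.com/yuanzhaokang/ParallelizeHeatmap.git .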
Stars: ✭ 13 (-91.03%)
Mutual labels:  big-data, apache-spark
pyspark-cheatsheet
PySpark Cheat Sheet - example code to help you learn PySpark and develop apps faster
Stars: ✭ 115 (-20.69%)
Mutual labels:  big-data, apache-spark
Parquet Cpp
Apache Parquet
Stars: ✭ 339 (+133.79%)
Mutual labels:  big-data, parquet
Eel Sdk
Big Data Toolkit for the JVM
Stars: ✭ 140 (-3.45%)
Mutual labels:  big-data, parquet
DaFlow
Apache Spark-based data flow (ETL) framework that supports multiple read and write destinations of different types, as well as multiple categories of transformation rules.
Stars: ✭ 24 (-83.45%)
Mutual labels:  apache-spark, parquet
SynapseML
Simple and Distributed Machine Learning
Stars: ✭ 3,355 (+2213.79%)
Mutual labels:  big-data, apache-spark
Parquet Mr
Apache Parquet
Stars: ✭ 1,278 (+781.38%)
Mutual labels:  big-data, parquet

ParquetViewer

Simple Windows desktop application for viewing & querying Apache Parquet files.

Main UI

Please also check out the Wiki for a detailed user guide: https://github.com/mukunku/ParquetViewer/wiki

Summary

This is a quick and dirty utility that I created to easily view Apache Parquet files on Windows desktop machines.

If you'd like to add any new features, feel free to send a pull request.

Some Key Features:

  • Run simple SQL-like queries on chunks of the file
  • Generate an ANSI SQL schema for opened files
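
To make the two features above concrete, here is a minimal sketch of what they amount to conceptually. It is not ParquetViewer's actual implementation (the application is written in C#); Python is used only for brevity. The function names `query_chunk` and `ansi_sql_schema`, the simplified type names, and the type mapping are all assumptions made for illustration.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]

# Hypothetical mapping from simplified column type names to ANSI SQL types.
ANSI_TYPES: Dict[str, str] = {
    "int32": "INTEGER",
    "int64": "BIGINT",
    "float": "REAL",
    "double": "DOUBLE PRECISION",
    "boolean": "BOOLEAN",
    "string": "VARCHAR(255)",
    "timestamp": "TIMESTAMP",
}


def query_chunk(rows: List[Row], offset: int, count: int,
                where: Callable[[Row], bool]) -> List[Row]:
    """Apply a WHERE-style predicate to rows [offset, offset + count) only."""
    chunk = rows[offset:offset + count]
    return [row for row in chunk if where(row)]


def ansi_sql_schema(table: str, columns: Dict[str, str]) -> str:
    """Build a CREATE TABLE statement from a {column name: type name} mapping."""
    lines = []
    for name, type_name in columns.items():
        if type_name not in ANSI_TYPES:
            # Complex types (structs, arrays, maps) have no entry in the mapping.
            raise ValueError(f"unsupported type for column {name!r}: {type_name}")
        lines.append(f'    "{name}" {ANSI_TYPES[type_name]}')
    return f'CREATE TABLE "{table}" (\n' + ",\n".join(lines) + "\n);"


if __name__ == "__main__":
    # Made-up sample data standing in for the rows of an opened file.
    rows = [{"id": i, "city": "Izmir" if i % 2 else "Oslo"} for i in range(10)]
    print(query_chunk(rows, offset=2, count=4, where=lambda r: r["city"] == "Oslo"))
    print(ansi_sql_schema("trips", {"id": "int64", "city": "string"}))
```

The query only ever sees the rows inside the requested chunk, which is why a filter can return fewer matches than the file as a whole contains.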

Limitations

This application can only open Parquet files located on the Windows machine the app is running on. It cannot connect to HDFS to read Parquet data.

Complex types such as structs, arrays and maps are not supported at this time.

Download

Pre-compiled releases can be found here: https://github.com/mukunku/ParquetViewer/releases

Visit the Wiki for details on how to use the utility: https://github.com/mukunku/ParquetViewer/wiki

Technical Details

The latest version of this project was written in C# using Visual Studio 2019 and .NET Framework 4.6.1.

Acknowledgements

This utility would not be possible without: https://github.com/elastacloud/parquet-dotnet
