Roapi: Create full-fledged APIs for static datasets without writing a single line of code.
Stars: 253 (+75.69%)
Vscode Data Preview: Data Preview extension for importing, viewing, slicing, dicing, charting & exporting large JSON array/config, YAML, Apache Arrow, Avro, Parquet & Excel data files
Stars: 245 (+70.14%)
Awkward 0.x: Manipulate arrays of complex data structures as easily as Numpy.
Stars: 216 (+50%)
graphique: GraphQL service for arrow tables and parquet data sets.
Stars: 28 (-80.56%)
Arrow.jl: Pure Julia implementation of the Apache Arrow data format (https://arrow.apache.org/)
Stars: 92 (-36.11%)
Android Expandicon: Nice and simple customizable implementation of Google style up/down expand arrow.
Stars: 871 (+504.86%)
Arrow: Λrrow - Functional companion to Kotlin's Standard Library
Stars: 4,771 (+3213.19%)
Leader Line: Draw a leader line in your web page.
Stars: 1,872 (+1200%)
Cudf: cuDF - GPU DataFrame Library
Stars: 4,370 (+2934.72%)
Pucket: Bucketing and partitioning system for Parquet
Stars: 29 (-79.86%)
Schemer: Schema registry for CSV, TSV, JSON, AVRO and Parquet schema. Supports schema inference and GraphQL API.
Stars: 97 (-32.64%)
Array Api: RFC document, tooling and other content related to the array API standard
Stars: 26 (-81.94%)
Amazon S3 Find And Forget: Amazon S3 Find and Forget is a solution to handle data erasure requests from data lakes stored on Amazon S3, for example, pursuant to the European General Data Protection Regulation (GDPR)
Stars: 115 (-20.14%)
Pyjanitor: Clean APIs for data cleaning. Python implementation of R package Janitor
Stars: 647 (+349.31%)
Parquet Mr: Apache Parquet
Stars: 1,278 (+787.5%)
Iceberg: Iceberg is a table format for large, slow-moving tabular data
Stars: 393 (+172.92%)
Petastorm: Petastorm library enables single machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as Tensorflow, Pytorch, and PySpark and can be used from pure Python code.
Stars: 1,108 (+669.44%)
Arrow: Parse JSON with style
Stars: 355 (+146.53%)
Pystore: Fast data store for Pandas time-series data
Stars: 325 (+125.69%)
Parquet Index: Spark SQL index for Parquet tables
Stars: 109 (-24.31%)
Rumble: Rumble 1.11.0 "Banyan Tree" for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more
Stars: 58 (-59.72%)
Elasticsearch loader: A tool for batch loading data files (json, parquet, csv, tsv) into ElasticSearch
Stars: 300 (+108.33%)
Kglab: Graph-Based Data Science: an abstraction layer in Python for building knowledge graphs, integrated with popular graph libraries - atop Pandas, RDFlib, pySHACL, RAPIDS, NetworkX, iGraph, PyVis, pslpython, pyarrow, etc.
Stars: 98 (-31.94%)
Pydata.kr: The official PyData Korea homepage. (In preparation)
Stars: 13 (-90.97%)
Drill: Apache Drill is a distributed MPP query layer for self-describing data
Stars: 1,619 (+1024.31%)
Neural Image Captioning: Implementation of Neural Image Captioning model using Keras with Theano backend
Stars: 12 (-91.67%)
Pyvtreat: vtreat is a data frame processor/conditioner that prepares real-world data for predictive modeling in a statistically sound manner. Distributed under a BSD-3-Clause license.
Stars: 92 (-36.11%)
Gaffer: A large-scale entity and relation database supporting aggregation of properties
Stars: 1,642 (+1040.28%)
React Archer: Draw arrows between React elements
Stars: 666 (+362.5%)
Open Arrow: Open Arrow is an open-source font that contains 112 arrow symbols from U+2190 to U+21ff
Stars: 89 (-38.19%)
Datafusion: DataFusion has now been donated to the Apache Arrow project
Stars: 611 (+324.31%)
Parquet Go: Go package to read and write Parquet files. Parquet is a file format to store nested data structures in a flat columnar data format. It can be used in the Hadoop ecosystem and with tools such as Presto and AWS Athena.
Stars: 114 (-20.83%)
Devops Python Tools: 80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Function, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc.
Stars: 406 (+181.94%)
Bigdata File Viewer: A cross-platform (Windows, Mac, Linux) desktop application to view common big data binary formats such as Parquet, ORC, AVRO, etc. Supports the local file system, HDFS, AWS S3, Azure Blob Storage, etc.
Stars: 86 (-40.28%)
Skale: High performance distributed data processing engine
Stars: 390 (+170.83%)
Eel Sdk: Big Data Toolkit for the JVM
Stars: 140 (-2.78%)
Choetl: ETL Framework for .NET / C# (Parser / Writer for CSV, Flat, Xml, JSON, Key-Value, Parquet, Yaml, Avro formatted files)
Stars: 372 (+158.33%)
Distributed: A distributed task scheduler for Dask
Stars: 1,168 (+711.11%)
Oap: Optimized Analytics Package for Spark* Platform
Stars: 343 (+138.19%)
Blazingsql: BlazingSQL is a lightweight, GPU accelerated, SQL engine for Python. Built on RAPIDS cuDF.
Stars: 1,652 (+1047.22%)
Arrows: Arrows is an animated custom view to give feedback about your UI sliding panels.
Stars: 338 (+134.72%)
Dask: Parallel computing with task scheduling
Stars: 9,309 (+6364.58%)
Parquet4s: Read and write Parquet in Scala. Use Scala classes as schema. No need to start a cluster.
Stars: 125 (-13.19%)
Gcs Tools: GCS support for avro-tools, parquet-tools and protobuf
Stars: 57 (-60.42%)
Ratatool: A tool for data sampling, data generation, and data diffing
Stars: 279 (+93.75%)
Parquet Dotnet: Apache Parquet for modern .NET
Stars: 276 (+91.67%)
Pymapd: Python client for OmniSci GPU-accelerated SQL engine and analytics platform
Stars: 109 (-24.31%)
Node Parquet: NodeJS module to access Apache Parquet format files
Stars: 46 (-68.06%)
Arrow: Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing
Stars: 8,828 (+6030.56%)
Stumpy: STUMPY is a powerful and scalable Python library for modern time series analysis
Stars: 2,019 (+1302.08%)
Scattertext Pydata: Notebooks for the Seattle PyData 2017 talk on Scattertext
Stars: 132 (-8.33%)