
hortonworks / hive-json

Licence: other
A rough prototype of a tool for discovering Apache Hive schemas from JSON documents.

Programming Languages

java
python

Hive JSON Schema Finder

This project is a rough prototype that I've written to analyze large collections of JSON documents and discover their Apache Hive schema. I've used it to analyze githubarchive.org's log data.
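To give a feel for what "discovering a schema" can mean, here is a minimal Python sketch of the general idea: infer a type for each document, then merge the per-document types into one. It is not this project's actual code or algorithm. The function names (`infer`, `merge`, `render`, `discover`) are hypothetical, and the conflict rules (widening `bigint` to `double`, falling back to `string` for any other mismatch, using `void` for nulls and empty arrays) are assumptions made for illustration only.

```python
import json


def infer(value):
    """Return a Hive-style type for one parsed JSON value (illustrative only)."""
    if value is None:
        return "void"          # assumption: placeholder for "no information yet"
    if isinstance(value, bool):  # must be checked before int, since bool is an int
        return "boolean"
    if isinstance(value, int):
        return "bigint"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        elem = "void"
        for item in value:
            elem = merge(elem, infer(item))
        return ("array", elem)
    if isinstance(value, dict):
        return ("struct", {key: infer(val) for key, val in value.items()})
    raise TypeError("unsupported JSON value: %r" % (value,))


def merge(a, b):
    """Combine two inferred types into one that can hold both."""
    if a == b:
        return a
    if a == "void":
        return b
    if b == "void":
        return a
    if isinstance(a, str) and isinstance(b, str):
        if {a, b} == {"bigint", "double"}:
            return "double"    # assumption: widen integers to doubles
        return "string"        # assumption: fall back to string on conflict
    if isinstance(a, tuple) and isinstance(b, tuple) and a[0] == b[0]:
        if a[0] == "array":
            return ("array", merge(a[1], b[1]))
        fields = dict(a[1])    # struct: union of fields, merging shared ones
        for name, ftype in b[1].items():
            fields[name] = merge(fields[name], ftype) if name in fields else ftype
        return ("struct", fields)
    return "string"            # assumption: mixed scalar/complex becomes string


def render(t):
    """Format an inferred type as a Hive type string."""
    if isinstance(t, str):
        return t
    kind, body = t
    if kind == "array":
        return "array<" + render(body) + ">"
    fields = ",".join(name + ":" + render(ft) for name, ft in sorted(body.items()))
    return "struct<" + fields + ">"


def discover(lines):
    """Merge the inferred types of many JSON documents, one document per line."""
    schema = "void"
    for line in lines:
        schema = merge(schema, infer(json.loads(line)))
    return schema


sample = [
    '{"id": 1, "actor": {"login": "a"}}',
    '{"id": 2.5, "tags": ["x"]}',
]
print(render(discover(sample)))
# prints: struct<actor:struct<login:string>,id:double,tags:array<string>>
```

Fields are sorted by name only to make the output stable. A real tool would probably handle conflicting types more carefully, for example with Hive's `uniontype`.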

To build the project, use Maven (3.0.x) from http://maven.apache.org/.

Building the jar:

% mvn package

Run the program:

% bin/find-json-schema *.json.gz

I've uploaded the discovered schema for githubarchive.org to https://gist.github.com/omalley/5125691.
