elite-libs / DataAnalyzer.app

Licence: other
✨🚀 DataAnalyzer.app - Convert JSON/CSV to Typed Data Interfaces - Automatically!

DataAnalyzer.app

DataAnalyzer understands and converts any JSON (or CSV) data into type-aware code for any language!

A passion project by Dan Levy

Introduction

If you consume popular APIs or use component libraries, you've probably faced the tedious task of re-implementing data structures you (hopefully) found in the documentation.

What happens in the all-too-common case when the docs are wrong, outdated or missing?

With internal or private APIs the situation is generally worse.

When facing unreliable docs, often all you can count on is the actual HTTP response data.

Solution

DataAnalyzer.app to the rescue!

It can ingest raw data and generate intelligent type-aware code.

The schema-analyzer uses a highly extensible adapter/template pattern which can accommodate almost any kind of output (e.g. SQL, ORM, GraphQL, Classes/Interfaces, Swagger JSON, JSON Schema, Protocol Buffers, and much more).
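
To make the adapter idea concrete, here is a minimal sketch of how one analyzed schema could feed several output writers. Every name in it (`SchemaSummary`, `OutputAdapter`, `typescriptAdapter`, `sqlAdapter`) is hypothetical and chosen for illustration; the actual schema-analyzer types and writers may look quite different.

```typescript
// Hypothetical shapes: illustrative only, not the schema-analyzer API.
type FieldType = 'String' | 'Number' | 'Boolean' | 'Date';

interface FieldSummary {
  name: string;
  type: FieldType;
  nullable: boolean;
}

interface SchemaSummary {
  name: string;
  fields: FieldSummary[];
}

// An adapter turns one analyzed schema into one kind of output text.
interface OutputAdapter {
  label: string;
  render(schema: SchemaSummary): string;
}

const typescriptAdapter: OutputAdapter = {
  label: 'TypeScript interface',
  render(schema: SchemaSummary): string {
    const tsTypes: Record<FieldType, string> = {
      String: 'string', Number: 'number', Boolean: 'boolean', Date: 'Date',
    };
    const lines = schema.fields.map(
      (f) => `  ${f.name}: ${tsTypes[f.type]}${f.nullable ? ' | null' : ''};`
    );
    return `interface ${schema.name} {\n${lines.join('\n')}\n}`;
  },
};

const sqlAdapter: OutputAdapter = {
  label: 'SQL CREATE TABLE',
  render(schema: SchemaSummary): string {
    const sqlTypes: Record<FieldType, string> = {
      String: 'TEXT', Number: 'NUMERIC', Boolean: 'BOOLEAN', Date: 'TIMESTAMP',
    };
    const cols = schema.fields.map(
      (f) => `  ${f.name} ${sqlTypes[f.type]}${f.nullable ? '' : ' NOT NULL'}`
    );
    return `CREATE TABLE ${schema.name} (\n${cols.join(',\n')}\n);`;
  },
};

// The same analysis result is rendered once per registered adapter.
const userSchema: SchemaSummary = {
  name: 'User',
  fields: [
    { name: 'id', type: 'Number', nullable: false },
    { name: 'email', type: 'String', nullable: true },
  ],
};

const adapters: OutputAdapter[] = [typescriptAdapter, sqlAdapter];
const outputs = adapters.map((a) => a.render(userSchema));
```

In this sketch, supporting a new target only means writing another object that satisfies `OutputAdapter`; the analysis step stays untouched.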

Contributors: View todo & idea list

Issues/Requests/PRs welcome! 💜💙💚💛🧡♥️

"Wait, there's more!"

DataAnalyzer has 3 Powerful Features to Explore:

1. Analyze column type & size stats from any JSON/CSV!

2. Generate auto-typed code & database interfaces, instantly!

3. Visualize results, explore & understand your data structure!

Features

The primary goal is to support any input JSON/CSV and infer as much as possible. More data will generally yield better results.

Completed

  • Heuristic type analysis for arrays of objects.
  • Nested data structure & multi-table relational output.
  • Browser-based (local, no server used.)
  • Automatic type detection for:
    • ID - Identifier column, by name and unique Integer check
    • BigInt/BigNumber
    • ObjectId (MongoDB's 96 bit/12 Byte ID. Legacy layout: 32bit timestamp + 24bit MachineID + 16bit ProcessID + 24bit Counter. The current spec uses a 32bit timestamp + 40bit random value + 24bit Counter)
    • UUID/GUID (Common 128 bit/16 Byte ID. Stored as a hex string, dash delimited in parts: 8, 4, 4, 4, 12)
    • Boolean (detects obvious strings true, false, Y, N)
    • Date (Smart detection via comprehensive regex pattern)
    • Timestamp (integer, number of milliseconds since the Unix epoch)
    • Currency (62 currency symbols supported)
    • Float (w/ scale & precision measurements)
    • Number
    • Null (sparse column data helps w/ certain inferences)
    • String (big text and variable character length awareness)
    • Array (includes min/max/avg length)
    • Object
    • Url/String
    • Latitude & Longitude (Coordinate pairs, GIS support)
    • Phone number (international patterns? configuration?)
    • Specialty Types
    • Email (falls back to string)
  • Detects column size minimum, maximum and average
  • Includes data points at the 30th, 60th and 90th percentiles (for detecting outliers and enum types!)
  • Handles some errors/outliers
  • Quantify # of unique values per column
  • Identify enum Fields w/ Values
  • Identify Not Null fields
  • Normalize structured JSON into flat typed objects.
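
A few of the detections listed above can be sketched as an ordered chain of checks. This is a simplified illustration, not the library's matcher code: the `detectType` function, the regex patterns, and the 2000 to 2100 timestamp window are all assumptions made for this example, and it only handles scalar values.

```typescript
// Hypothetical matcher chain; the real TypeMatchers may differ.
type DetectedType =
  | 'Null' | 'Boolean' | 'Timestamp' | 'Number'
  | 'UUID' | 'ObjectId' | 'Email' | 'String';

// UUID/GUID: hex string, dash delimited in parts of 8, 4, 4, 4, 12.
const UUID_PATTERN = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;
// ObjectId: 12 bytes written as 24 hex characters.
const OBJECT_ID_PATTERN = /^[0-9a-f]{24}$/i;
// Obvious boolean strings, compared case-insensitively.
const BOOLEAN_STRINGS = ['true', 'false', 'y', 'n'];
// Deliberately loose email check (assumption, not a full validator).
const EMAIL_PATTERN = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;

// Assumed plausibility window for millisecond timestamps.
const MIN_TIMESTAMP_MS = Date.UTC(2000, 0, 1);
const MAX_TIMESTAMP_MS = Date.UTC(2100, 0, 1);

function detectType(value: unknown): DetectedType {
  if (value === null || value === undefined) return 'Null';
  if (typeof value === 'boolean') return 'Boolean';
  if (typeof value === 'number') {
    const isWhole = Math.floor(value) === value;
    if (isWhole && value >= MIN_TIMESTAMP_MS && value <= MAX_TIMESTAMP_MS) {
      return 'Timestamp';
    }
    return 'Number';
  }
  // Most specific string patterns first, plain String last.
  const text = String(value).trim();
  if (UUID_PATTERN.test(text)) return 'UUID';
  if (OBJECT_ID_PATTERN.test(text)) return 'ObjectId';
  if (BOOLEAN_STRINGS.indexOf(text.toLowerCase()) !== -1) return 'Boolean';
  if (EMAIL_PATTERN.test(text)) return 'Email';
  return 'String';
}
```

Order matters here: the specific patterns run before the generic fallbacks, so an email or UUID never gets reported as a plain String.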

Note: CSV files must include column names.
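
The column size statistics mentioned in the feature list (minimum, maximum, average, plus the 30th, 60th and 90th percentiles) could be computed roughly like this. The `summarizeSizes` and `percentile` helpers are hypothetical, and the nearest-rank percentile method is an assumption; the library may use a different definition.

```typescript
interface SizeStats {
  min: number;
  max: number;
  avg: number;
  percentiles: { p30: number; p60: number; p90: number };
}

// Nearest-rank percentile on an ascending array (one of several common definitions).
function percentile(sortedAsc: number[], p: number): number {
  const rank = Math.ceil((p * sortedAsc.length) / 100);
  const index = Math.min(sortedAsc.length - 1, Math.max(0, rank - 1));
  return sortedAsc[index];
}

function summarizeSizes(sizes: number[]): SizeStats {
  if (sizes.length === 0) {
    throw new Error('summarizeSizes needs at least one value');
  }
  // Copy before sorting so the caller's array is left untouched.
  const sorted = sizes.slice().sort((a, b) => a - b);
  const total = sorted.reduce((sum, n) => sum + n, 0);
  return {
    min: sorted[0],
    max: sorted[sorted.length - 1],
    avg: total / sorted.length,
    percentiles: {
      p30: percentile(sorted, 30),
      p60: percentile(sorted, 60),
      p90: percentile(sorted, 90),
    },
  };
}

// String lengths for one column, including a single suspicious outlier.
const sizeStats = summarizeSizes([3, 4, 4, 5, 5, 6, 6, 7, 8, 120]);
```

For this sample the maximum is 120 while the 90th percentile is only 8, which is the kind of gap that flags an outlier row rather than a genuinely wide column.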

TODO

  • Move the demo button to a hover menu over the Data Input Panel (with a Clear button.)
  • Change the OutputButtons to use a scrolling grid of icons.
  • Convert CSS-in-JS to use Linaria.

Bugs by area/function

  • Library.TypeMatcher('Timestamp'): Aggregate calculations fail with only 1 match. Should fall back to the value.
  • SQL.writer: Use actual nested table ID Column in FOREIGN KEY.
  • SQL.writer: Null/nullable fields emit correctly.

Better code generator support

  • Render output using handlebars templates.
  • Support multiple output files.
  • Use AI to name subtypes based on their column names (when assigning the fully qualified path or de-duplicating types).
  • Add fuzzy matching of types if fields meet similarity threshold.
WHEN
  Completed TypeSummary processing.
  And SubType Shapes (column names) have `>= X%` similar columns.
GIVEN
  Nested type shapes:
    'latitude|longitude'
    'latitude|longitude|title'
    'latitude|longitude|url'
THEN
  1. Return an adjusted type with combined fields
    `latitude|longitude|title*|url*`
  2. Determine a new suggested name.
  3. Apply the Rename & Update fields with unified/composite type.
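
One possible way to implement the scenario above, as a rough sketch. The similarity measure (Jaccard overlap of column names), the greedy grouping against each group's first shape, and all function names are assumptions; the TODO item does not specify them. Naming the merged type (step 2) is left out.

```typescript
type Shape = string[];

// Jaccard similarity: shared column names divided by all distinct column names.
// Assumes a shape never repeats a column name.
function shapeSimilarity(a: Shape, b: Shape): number {
  const shared = a.filter((name) => b.indexOf(name) !== -1).length;
  const distinct = a.length + b.length - shared;
  return distinct === 0 ? 1 : shared / distinct;
}

// Combine shapes into one; fields missing from any input are marked optional with '*'.
function mergeShapes(shapes: Shape[]): string {
  const allNames: string[] = [];
  shapes.forEach((shape) => {
    shape.forEach((name) => {
      if (allNames.indexOf(name) === -1) allNames.push(name);
    });
  });
  return allNames
    .map((name) => {
      const inEvery = shapes.every((shape) => shape.indexOf(name) !== -1);
      return inEvery ? name : `${name}*`;
    })
    .join('|');
}

// Greedy grouping: a shape joins the first group whose seed shape is similar enough.
function unifySimilarShapes(shapeKeys: string[], threshold: number): string[] {
  const groups: Shape[][] = [];
  shapeKeys.forEach((key) => {
    const shape = key.split('|');
    const home = groups.filter((g) => shapeSimilarity(g[0], shape) >= threshold)[0];
    if (home) {
      home.push(shape);
    } else {
      groups.push([shape]);
    }
  });
  return groups.map(mergeShapes);
}

const unified = unifySimilarShapes(
  ['latitude|longitude', 'latitude|longitude|title', 'latitude|longitude|url', 'name|url'],
  0.6
);
```

With a threshold of 0.6, the three coordinate shapes collapse into `latitude|longitude|title*|url*` while `name|url` stays separate. Comparing against only the first shape in each group keeps the sketch short but makes the result depend on input order.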

Type inference & detection

  • Range option for precise Timestamp detection.
  • Option to visit Hypermedia URLs to discover nested types?
  • Custom type matchers/regex patterns.
  • De-duplicate similar shaped objects (example below)
type PokemonGame struct {
    Name string
    Url string
}

type PokemonMove struct {
    Name string
    Url string
}

Becomes the generic (possibly prefixed) struct:

type NameUrl struct {
    Name string
    Url string
}
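
A minimal sketch of how this de-duplication might work: group type definitions by a key built from their field names and types, then replace each group of identical shapes with one generic type. `TypeDef`, `shapeKey` and `dedupeTypes` are hypothetical names, and naming the result by joining field names is just one option suggested by the `NameUrl` example.

```typescript
interface TypeDef {
  name: string;
  fields: Record<string, string>; // field name -> field type
}

// Two types share a shape when their sorted "field:type" pairs are identical.
function shapeKey(def: TypeDef): string {
  return Object.keys(def.fields)
    .sort()
    .map((field) => `${field}:${def.fields[field]}`)
    .join('|');
}

function capitalize(word: string): string {
  return word.charAt(0).toUpperCase() + word.slice(1);
}

function dedupeTypes(defs: TypeDef[]): { types: TypeDef[]; renames: Record<string, string> } {
  const byShape: Record<string, TypeDef[]> = {};
  defs.forEach((def) => {
    const key = shapeKey(def);
    byShape[key] = byShape[key] || [];
    byShape[key].push(def);
  });

  const types: TypeDef[] = [];
  const renames: Record<string, string> = {};
  Object.keys(byShape).forEach((key) => {
    const group = byShape[key];
    if (group.length === 1) {
      types.push(group[0]); // unique shape, keep as-is
      return;
    }
    // Name the shared type after its fields, e.g. Name + Url -> NameUrl.
    const genericName = Object.keys(group[0].fields).map(capitalize).join('');
    types.push({ name: genericName, fields: group[0].fields });
    group.forEach((def) => {
      renames[def.name] = genericName;
    });
  });
  return { types, renames };
}

const deduped = dedupeTypes([
  { name: 'PokemonGame', fields: { Name: 'string', Url: 'string' } },
  { name: 'PokemonMove', fields: { Name: 'string', Url: 'string' } },
  { name: 'Pokemon', fields: { Id: 'int', Name: 'string' } },
]);
```

The `renames` map records which original names now point at the generic type, so references elsewhere in the generated code can be updated. The sketch does not guard against name collisions, which is where a prefix would come in.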

Web App Interface

  • Migrate leftover Bootstrap utility classes to Material.
  • Add a "Schema Editor" table-like view to tune & view the results.
  • Fix options & overall menu
  • Add App Bar for config, or use router, modal - anything to get away from z-index BS.
  • Complete Web Worker for Background Processing.
  • Add confirmation for processing lots of data. (Rows and raw MB limit?)
  • Set up Plausible analytics.

Code Writers

  • Add TypeScript+Mongoose Support (Possibly write all templates in TypeScript first, using tsc to emit JS code as needed?)
  • SQL CREATE TABLE
  • Added Zod support (like Yup or Joi)
  • JSON Schemas (for libraries like ajv)
  • Swagger YAML Reader/Writer
  • Binary Encoders (protocol buffers, thrift, avro)
  • Java Persistence API
  • Rails Models

Project Goals

  • Support SQL & noSQL systems!
  • Automatic type detection!
  • Detects String & Number size constraints (for SQL, Binary encoding)!
  • Handles error/outliers intelligently
  • Ignores error/outlier records!
  • Smart field name formatting, snake-case vs. camel-case!
  • Detects unique columns!
  • Detects enum Fields!
  • Detects Not Null fields!
  • Extensible design, add new output/target with ease!
  • Nested data structure & multi-table relational output!
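
As an illustration of the field name formatting goal in the list above, here is a small sketch that converts between snake-case and camel-case. The helper names are hypothetical and the word-splitting rule is an assumption: it does not handle runs of capitals such as `HTTPResponse` gracefully.

```typescript
// Split a field name into lowercase words, whatever its original style.
function splitWords(fieldName: string): string[] {
  return fieldName
    .replace(/([a-z0-9])([A-Z])/g, '$1 $2') // break camelCase humps
    .split(/[^A-Za-z0-9]+/) // break on underscores, dashes, spaces, dots
    .filter((word) => word.length > 0)
    .map((word) => word.toLowerCase());
}

function toSnakeCase(fieldName: string): string {
  return splitWords(fieldName).join('_');
}

function toCamelCase(fieldName: string): string {
  return splitWords(fieldName)
    .map((word, i) => (i === 0 ? word : word.charAt(0).toUpperCase() + word.slice(1)))
    .join('');
}
```

A SQL writer would likely call something like `toSnakeCase` for column names, while a TypeScript or JavaScript writer would prefer `toCamelCase`.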

Output Support

Tips & Notes

For enum detection, adjust the relevant thresholds if you know (approximately) the expected number of unique enum values. For more accurate results, provide a randomized sample of 100+ rows. Accuracy increases (and speed decreases) greatly with 1,000+ rows.

  • Enumeration detection.
    • Set the minimum row count required before detection runs (default 100 rows).
    • Set the enum limit: the maximum number of unique values a field may have.
      • For example, with an enum limit of 10:
      • Only fields with a uniqueCount <= 10 will 'match' as enumerations and include an enum property.
  • Not Null detection.
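
The two detections above can be sketched as follows. The option names (`minRowCount`, `maxEnumItems`) and the functions are hypothetical; the 100-row default comes from the text, while the limit of 10 is only the value used in the example, not a documented default.

```typescript
type Cell = string | number | null;

interface EnumOptions {
  minRowCount: number; // rows required before enum detection runs
  maxEnumItems: number; // maximum number of unique values allowed
}

// Assumed defaults for this sketch only.
const DEFAULT_ENUM_OPTIONS: EnumOptions = { minRowCount: 100, maxEnumItems: 10 };

// Returns the enum values for a column, or null when it does not qualify.
function detectEnum(
  values: Cell[],
  options: EnumOptions = DEFAULT_ENUM_OPTIONS
): Array<string | number> | null {
  if (values.length < options.minRowCount) return null; // too little data to trust
  const unique: Array<string | number> = [];
  for (let i = 0; i < values.length; i++) {
    const value = values[i];
    if (value === null || unique.indexOf(value) !== -1) continue;
    unique.push(value);
    // Bail out early once uniqueCount exceeds the limit.
    if (unique.length > options.maxEnumItems) return null;
  }
  return unique.length > 0 ? unique : null;
}

// A column is Not Null when it has rows and none of them are null.
function isNotNull(values: Cell[]): boolean {
  return values.length > 0 && values.every((value) => value !== null);
}
```

Raising `maxEnumItems` catches larger enumerations at the cost of more false positives on free-text columns with few rows, which is why a larger randomized sample helps.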

For more info on the Schema Analyzer (core library) powering the DataAnalyzer.app, check out the schema-analyzer docs!

Included Type Matchers

Some of these (Email) are aliases of a base type (String). See code for more details on structure/relationship.

  • Unknown
  • ObjectId
  • UUID
  • Boolean
  • Date
  • Timestamp
  • Currency
  • Float
  • Number
  • BigNumber
  • Email
  • String
  • Array
  • Object
  • Null
