bogdandm / json2python-models

Licence: MIT license
Generate Python model classes (pydantic, attrs, dataclasses) based on JSON datasets with typing module support

json2python-models

json2python-models is a Python tool that generates Python model classes (pydantic, dataclasses, attrs) from JSON datasets.

Features

  • Full typing module support
  • Type merging - if a field contains data of different types, it is represented as a Union type
  • Field and model name generation (unicode support included)
  • Generalization of similar models
  • Handling of recursive data structures (e.g. a family tree)
  • Detection of string-serializable types (e.g. datetime or stringified numbers)
  • Detection of fields containing string constants (Literal['foo', 'bar'])
  • Generation of models as a list (flat model structure) or a tree (nested models)
  • Specifying when dictionaries should be processed as dict type (by default every dict is considered as some model)
  • CLI API with a lot of options
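
To make the type-merging feature more concrete, here is a minimal standalone sketch of the idea. It is not the library's implementation: the function name `infer_field_types` and the sample records are hypothetical, and the real tool handles many more cases (nested models, lists, string converters).

```python
from typing import Any, Dict, List


def infer_field_types(records: List[Dict[str, Any]]) -> Dict[str, str]:
    """Return an annotation string for every field seen in `records` (illustrative only)."""
    seen: Dict[str, set] = {}
    for record in records:
        for name, value in record.items():
            seen.setdefault(name, set()).add(type(value).__name__)

    annotations: Dict[str, str] = {}
    for name, type_names in seen.items():
        ordered = sorted(type_names)
        # More than one observed type collapses into a Union.
        annotation = ordered[0] if len(ordered) == 1 else f"Union[{', '.join(ordered)}]"
        # Assumption: a field missing from some records is treated as optional.
        if any(name not in record for record in records):
            annotation = f"Optional[{annotation}]"
        annotations[name] = annotation
    return annotations


records = [
    {"id": 1, "price": 10},
    {"id": 2, "price": 9.5, "note": "sale"},
]
result = infer_field_types(records)
print(result)
```

Here `price` holds both an int and a float, so it becomes a Union, while `note` is absent from the first record and so becomes Optional.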

Table of Contents

Examples

Part of Path of Exile public items API

from pydantic import BaseModel, Field
from typing import List, Optional
from typing_extensions import Literal


class Tab(BaseModel):
    id_: str = Field(..., alias="id")
    public: bool
    stash_type: Literal["CurrencyStash", "NormalStash", "PremiumStash"] = Field(..., alias="stashType")
    items: List['Item']
    account_name: Optional[str] = Field(None, alias="accountName")
    last_character_name: Optional[str] = Field(None, alias="lastCharacterName")
    stash: Optional[str] = None
    league: Optional[Literal["Hardcore", "Standard"]] = None

F1 Season Results

driver_standings.json
[
    {
        "season": "2019",
        "round": "3",
        "DriverStandings": [
            {
                "position": "1",
                "positionText": "1",
                "points": "68",
                "wins": "2",
                "Driver": {
                    "driverId": "hamilton",
                    "permanentNumber": "44",
                    "code": "HAM",
                    "url": "http://en.wikipedia.org/wiki/Lewis_Hamilton",
                    "givenName": "Lewis",
                    "familyName": "Hamilton",
                    "dateOfBirth": "1985-01-07",
                    "nationality": "British"
                },
                "Constructors": [
                    {
                        "constructorId": "mercedes",
                        "url": "http://en.wikipedia.org/wiki/Mercedes-Benz_in_Formula_One",
                        "name": "Mercedes",
                        "nationality": "German"
                    }
                ]
            },
            ...
        ]
    }
]
json2models -f pydantic -l DriverStandings - driver_standings.json
r"""
generated by json2python-models v0.2.0 at Mon May  4 17:46:30 2020
command: /opt/projects/json2python-models/venv/bin/json2models -f pydantic -s flat -l DriverStandings - driver_standings.json
"""
from pydantic import BaseModel, Field
from typing import List
from typing_extensions import Literal

class DriverStandings(BaseModel):
    season: int
    round_: int = Field(..., alias="round")
    DriverStandings: List['DriverStanding']

class DriverStanding(BaseModel):
    position: int
    position_text: int = Field(..., alias="positionText")
    points: int
    wins: int
    driver: 'Driver' = Field(..., alias="Driver")
    constructors: List['Constructor'] = Field(..., alias="Constructors")

class Driver(BaseModel):
    driver_id: str = Field(..., alias="driverId")
    permanent_number: int = Field(..., alias="permanentNumber")
    code: str
    url: str
    given_name: str = Field(..., alias="givenName")
    family_name: str = Field(..., alias="familyName")
    date_of_birth: str = Field(..., alias="dateOfBirth")
    nationality: str

class Constructor(BaseModel):
    constructor_id: str = Field(..., alias="constructorId")
    url: str
    name: str
    nationality: Literal["Austrian", "German", "American", "British", "Italian", "French"]
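
In the dataset above, values such as "2019" and "44" are strings, yet the generated fields are typed as int. The following sketch shows one way such string-serializable values could be detected. It is only an illustration under that assumption, the helper name `classify_string` is hypothetical, and the tool's own detection logic may differ.

```python
def classify_string(value: str) -> str:
    """Guess which Python type a string value could be parsed into (illustrative only)."""
    try:
        int(value)
        return "int"
    except ValueError:
        pass
    try:
        # Note: float() also accepts "nan" and "inf"; a stricter check may be needed.
        float(value)
        return "float"
    except ValueError:
        return "str"


for sample in ["2019", "44", "68.5", "hamilton", "1985-01-07"]:
    print(sample, "->", classify_string(sample))
```

Date strings such as "1985-01-07" stay plain str in this sketch, which matches the generated `date_of_birth: str` field above, where the --datetime option was not used.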

Swagger

swagger.json from any online API (I tested a file generated by drf-yasg and another one for the Spotify API)

It requires a bit of tweaking:

  • Some fields store routes/models specs as dicts
  • There are a lot of optional fields, so we reduce the merging threshold
  • Disable string literals
json2models -f dataclasses -m Swagger testing_tools/swagger.json \
    --dict-keys-fields securityDefinitions paths responses definitions properties \
    --merge percent_50 number --max-strings-literals 0
r"""
generated by json2python-models v0.2.0 at Mon May  4 18:08:09 2020
command: /opt/projects/json2python-models/json_to_models/__main__.py -s flat -f dataclasses -m Swagger testing_tools/swagger.json --max-strings-literals 0 --dict-keys-fields securityDefinitions paths responses definitions properties --merge percent_50 number
"""
from dataclasses import dataclass, field
from json_to_models.dynamic_typing import FloatString
from typing import Any, Dict, List, Optional, Union


@dataclass
class Swagger:
    swagger: FloatString
    info: 'Info'
    host: str
    schemes: List[str]
    base_path: str
    consumes: List[str]
    produces: List[str]
    security_definitions: Dict[str, 'Parameter_SecurityDefinition']
    security: List['Security']
    paths: Dict[str, 'Path']
    definitions: Dict[str, 'Definition_Schema']


@dataclass
class Info:
    title: str
    description: str
    version: str


@dataclass
class Security:
    api_key: Optional[List[Any]] = field(default_factory=list)
    basic: Optional[List[Any]] = field(default_factory=list)


@dataclass
class Path:
    parameters: List['Parameter_SecurityDefinition']
    post: Optional['Delete_Get_Patch_Post_Put'] = None
    get: Optional['Delete_Get_Patch_Post_Put'] = None
    put: Optional['Delete_Get_Patch_Post_Put'] = None
    patch: Optional['Delete_Get_Patch_Post_Put'] = None
    delete: Optional['Delete_Get_Patch_Post_Put'] = None


@dataclass
class Property:
    type_: str
    format_: Optional[str] = None
    xnullable: Optional[bool] = None
    items: Optional['Item_Schema'] = None


@dataclass
class Property_2E:
    type_: str
    title: Optional[str] = None
    read_only: Optional[bool] = None
    max_length: Optional[int] = None
    min_length: Optional[int] = None
    items: Optional['Item'] = None
    enum: Optional[List[str]] = field(default_factory=list)
    maximum: Optional[int] = None
    minimum: Optional[int] = None
    format_: Optional[str] = None


@dataclass
class Item:
    title: Optional[str] = None
    type_: Optional[str] = None
    ref: Optional[str] = None
    max_length: Optional[int] = None
    min_length: Optional[int] = None


@dataclass
class Parameter_SecurityDefinition:
    name: Optional[str] = None
    in_: Optional[str] = None
    required: Optional[bool] = None
    schema: Optional['Item_Schema'] = None
    description: Optional[str] = None
    type_: Optional[str] = None


@dataclass
class Delete_Get_Patch_Post_Put:
    operation_id: str
    description: str
    parameters: List['Parameter_SecurityDefinition']
    responses: Dict[str, 'Response']
    tags: List[str]


@dataclass
class Item_Schema:
    ref: str


@dataclass
class Response:
    description: str
    schema: Optional[Union['Item_Schema', 'Definition_Schema']] = None


@dataclass
class Definition_Schema:
    type_: str
    required: Optional[List[str]] = field(default_factory=list)
    properties: Optional[Dict[str, Union['Property', 'Property_2E']]] = field(default_factory=dict)
    ref: Optional[str] = None
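
The command above lowers the merging threshold with `--merge percent_50 number`. As a rough illustration of what such a policy decides, the sketch below checks whether two models share enough field names to be merged. The function `should_merge` is hypothetical, and measuring the percentage against the union of both field sets is an assumption rather than the tool's documented behaviour.

```python
from typing import Optional, Set


def should_merge(
    a: Set[str],
    b: Set[str],
    percent: Optional[float] = 70,  # mirrors the documented default percent_70
    number: Optional[int] = 10,     # mirrors the documented default number_10
) -> bool:
    """Decide whether two models could be merged based on shared field names."""
    matched = len(a & b)
    total = len(a | b)  # assumption: percentage is taken over all distinct field names
    if percent is not None and total and matched / total * 100 >= percent:
        return True
    if number is not None and matched >= number:
        return True
    return False


first = {"title", "type", "ref", "max_length"}
second = {"title", "type", "ref", "min_length"}
print(should_merge(first, second))              # 3 of 5 names match (60%)
print(should_merge(first, second, percent=50))  # same models, lower threshold
```

With the default 70% threshold these two models stay separate, while the lowered 50% threshold lets them merge.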

Github-actions config files

Github-actions model based on files from starter-workflows

json2models -m Actions "./starter-workflows/ci/*.yml" -s flat -f pydantic -i yaml --dkf env with jobs
r"""
generated by json2python-models v0.2.3 at Tue Jul 13 19:52:43 2021
command: /opt/projects/json2python-models/venv/bin/json2models -m Actions ./starter-workflows/ci/*.yml -s flat -f pydantic -i yaml --dkf env with jobs
"""
from pydantic import BaseModel, Field
from typing import Dict, List, Optional, Union
from typing_extensions import Literal


class Actions(BaseModel):
    on: Union['On', List[Literal["push"]]]
    jobs: Dict[str, 'Job']
    name: Optional[str] = None
    env: Optional[Dict[str, Union[int, str]]] = {}


class On(BaseModel):
    push: Optional['Push'] = None
    pull_request: Optional['PullRequest'] = None
    release: Optional['Release'] = None
    schedule: Optional[List['Schedule']] = []
    workflow_dispatch: Optional[None] = None


class Push(BaseModel):
    branches: List[Literal["$default-branch"]]
    tags: Optional[List[Literal["v*.*.*"]]] = []


class PullRequest(BaseModel):
    branches: List[Literal["$default-branch"]]


class Release(BaseModel):
    types: List[Literal["created", "published"]]


class Schedule(BaseModel):
    cron: Literal["$cron-daily"]


class Job(BaseModel):
    runson: Literal["${{ matrix.os }}", "macOS-latest", "macos-latest", "ubuntu-18.04", "ubuntu-latest", "windows-latest"] = Field(..., alias="runs-on")
    steps: List['Step']
    name: Optional[str] = None
    environment: Optional[Literal["production"]] = None
    outputs: Optional['Output'] = None
    container: Optional['Container'] = None
    needs: Optional[Literal["build"]] = None
    permissions: Optional['Permission'] = None
    strategy: Optional['Strategy'] = None
    defaults: Optional['Default'] = None
    env: Optional[Dict[str, str]] = {}


class Step(BaseModel):
    uses: Optional[str] = None
    name: Optional[str] = None
    with_: Optional[Dict[str, Union[bool, float, str]]] = Field({}, alias="with")
    run: Optional[str] = None
    env: Optional[Dict[str, str]] = {}
    workingdirectory: Optional[str] = Field(None, alias="working-directory")
    id_: Optional[Literal["build-image", "composer-cache", "deploy-and-expose", "image-build", "login-ecr", "meta", "push-to-registry", "task-def"]] = Field(None, alias="id")
    if_: Optional[str] = Field(None, alias="if")
    shell: Optional[Literal["Rscript {0}"]] = None


class Output(BaseModel):
    route: str = Field(..., alias="ROUTE")
    selector: str = Field(..., alias="SELECTOR")


class Container(BaseModel):
    image: Literal["crystallang/crystal", "erlang:22.0.7"]


class Permission(BaseModel):
    contents: Literal["read"]
    packages: Literal["write"]


class Strategy(BaseModel):
    matrix: Optional['Matrix'] = None
    maxparallel: Optional[int] = Field(None, alias="max-parallel")
    failfast: Optional[bool] = Field(None, alias="fail-fast")


class Matrix(BaseModel):
    rversion: Optional[List[float]] = Field([], alias="r-version")
    pythonversion: Optional[List[float]] = Field([], alias="python-version")
    deno: Optional[List[Literal["canary", "v1.x"]]] = []
    os: Optional[List[Literal["macOS-latest", "ubuntu-latest", "windows-latest"]]] = []
    rubyversion: Optional[List[float]] = Field([], alias="ruby-version")
    nodeversion: Optional[List[Literal["12.x", "14.x", "16.x"]]] = Field([], alias="node-version")
    configuration: Optional[List[Literal["Debug", "Release"]]] = []


class Default(BaseModel):
    run: 'Run'


class Run(BaseModel):
    shell: Literal["bash"]
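
The Literal[...] annotations in the models above come from fields that only ever hold a handful of distinct strings. The sketch below illustrates that decision. The helper `string_annotation` is hypothetical, and treating the limit as inclusive is an assumption; the default of 10 and the cap of 15 are taken from the --max-strings-literals description.

```python
from typing import Iterable

HARD_LIMIT = 15  # per-field cap mentioned for --max-strings-literals


def string_annotation(values: Iterable[str], max_literals: int = 10) -> str:
    """Return a Literal[...] annotation for few distinct strings, else plain str."""
    distinct = sorted(set(values))
    limit = min(max_literals, HARD_LIMIT)
    # Assumption: the limit is inclusive; a limit of 0 disables literals entirely.
    if distinct and len(distinct) <= limit:
        quoted = ", ".join(f'"{value}"' for value in distinct)
        return f"Literal[{quoted}]"
    return "str"


print(string_annotation(["published", "created", "created"]))
print(string_annotation(["published", "created"], max_literals=0))
```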

Example with preamble

A simple example to demonstrate adding extra code before the class list.

json2models -f pydantic --preamble "# set up defaults
USERNAME = 'user'
SERVER_IP = '127.0.0.1'
" -m Swagger testing_tools/swagger.json
r"""
generated by json2python-models v0.2.5 at Tue Aug 23 08:55:09 2022
command: json2models -f pydantic --preamble # set up defaults
USERNAME = 'user'
SERVER_IP = '127.0.0.1'
 -m Swagger testing_tools/swagger.json -o output.py
"""
from pydantic import BaseModel, Field
from typing import Any, List, Literal, Optional, Union


# set up defaults
USERNAME = 'user'
SERVER_IP = '127.0.0.1'



class Swagger(BaseModel):
    # etc.

Installation

Note: this project supports only Python 3.7 and higher.

To install it, use pip:

pip install json2python-models

Or you can build it from source:

git clone https://github.com/bogdandm/json2python-models.git
cd json2python-models
python setup.py install

Usage

CLI

For regular usage, the CLI tool is the best option. After you install this package, you can run it as json2models <arguments> or python -m json_to_models <arguments>. For example:

json2models -m Car car_*.json -f attrs > car.py

Arguments:

  • -h, --help - Show help message and exit

  • -m, --model - Model name and its JSON data as a path or unix-like path pattern. The *, ** and ? pattern symbols are supported.

    • Format: -m <Model name> [<JSON files> ...]
    • Example: -m Car audi.json reno.json or -m Car audi.json -m Car reno.json (results will be the same)
  • -l, --list - Like -m, but the given JSON file should contain a list of model data (a dataset). If the file contains a dict with a nested list, you can pass a <JSON key> to look it up. Deep lookups are supported via a dot-separated path. If no lookup is needed, pass - as <JSON key>.

    • Format: -l <Model name> <JSON key> <JSON file>
    • Example: -l Car - cars.json -l Person fetch_results.items.persons result.json
    • Note: Model names under these arguments should be unique.
  • -i, --input-format - Input file format (parser). The default is the JSON parser. The YAML parser requires PyYaml or ruamel.yaml to be installed. The INI parser uses the built-in configparser. To implement a new one, add a new method to cli.FileLoaders (and create a pull request :) )

    • Format: -i {json, yaml, ini}
    • Example: -i yaml
    • Default: -i json
  • -o, --output - Output file

    • Format: -o <FILE>
    • Example: -o car_model.py
  • -f, --framework - Model framework for which Python code is generated. base (default) means no framework, so code is generated without any decorators or additional metadata.

    • Format: -f {base, pydantic, attrs, dataclasses, custom}
    • Example: -f pydantic
    • Default: -f base
  • -s, --structure - Models composition style.

    • Format: -s {flat, nested}
    • Example: -s nested
    • Default: -s flat
  • --preamble - Additional material to be added after the module imports.

    • Format: --preamble "<formatted python code string to be added after module imports>"
    • Example:
    --preamble "# set up defaults
    USERNAME = 'user'
    SERVER = '127.0.0.1'"
    • Optional
  • --datetime - Enable datetime/date/time strings parsing.

    • Default: disabled
    • Warning: This can cause a 6-7x slowdown on large datasets. Make sure you really need this option.
  • --disable-unicode-conversion, --no-unidecode - Disable unicode conversion in field labels and class names

    • Default: enabled
  • --strings-converters - Enable generation of string type converters (e.g. IsoDatetimeString or BooleanString).

    • Default: disabled
  • --max-strings-literals - Generate Literal['foo', 'bar'] when a field has fewer than NUMBER string constants as values.

    • Format: --max-strings-literals <NUMBER>
    • Default: 10 (generator classes could override it)
    • Example: --max-strings-literals 5 - only 5 literals will be saved and used for code generation
    • Note: There cannot be more than 15 literals per field (for performance reasons)
    • Note: The attrs code generator does not use Literals and generates str fields instead
  • --merge - Merge policy settings. Possible values are:

    • Format: --merge MERGE_POLICY [MERGE_POLICY ...]
    • Possible values (MERGE_POLICY):
      • percent[_<percent>] - two models are merged if a certain percentage of their field names match. A custom value could be e.g. percent_95.
      • number[_<number>] - two models are merged if a certain number of their field names match.
      • exact - two models must have exactly the same field names to be merged.
    • Example: --merge percent_95 number_20 - merge if 95% of the fields match or 20 fields match
    • Default: --merge percent_70 number_10
  • --dict-keys-regex, --dkr - List of regular expressions (Python syntax). If all keys of a dict match one of the patterns, the dict is marked as a dict field rather than a nested model.

    • Format: --dkr RegEx [RegEx ...]
    • Example: --dkr node_\d+ \d+_\d+_\d+
    • Note: ^ and $ (string borders) tokens will be added automatically but you have to escape other special characters manually.
    • Optional
  • --dict-keys-fields, --dkf - List of model field names that will be marked as dict fields

    • Format: --dkf FIELD_NAME [FIELD_NAME ...]
    • Example: --dkf "dict_data" "mapping"
    • Optional
  • --code-generator - Absolute import path to GenericModelCodeGenerator subclass.

    • Format: --code-generator CODE_GENERATOR
    • Example: -f custom --code-generator mypackage.mymodule.DjangoModelsGenerator
    • Note: Ignored without -f custom, but required with it.
  • --code-generator-kwargs - List of GenericModelCodeGenerator subclass arguments (for the __init__ method, see the docs of the specific subclass). Each argument should be in the following format: argument_name=value or "argument_name=value with space". Boolean values should be passed in JS style: true or false

    • Format: --code-generator-kwargs [NAME=VALUE [NAME=VALUE ...]]
    • Example: --code-generator-kwargs kwarg1=true kwarg2=10 "kwarg3=It is string with spaces"
    • Optional
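
The --dict-keys-regex rule described above can be sketched as follows. This is an illustration, not the tool's code: the helper `is_plain_dict` is hypothetical, `re.fullmatch` stands in for the automatically added ^ and $ borders, and letting different keys match different patterns is an assumption.

```python
import re
from typing import Iterable, List


def is_plain_dict(keys: Iterable[str], patterns: List[str]) -> bool:
    """Return True if every key fully matches at least one of the patterns."""
    compiled = [re.compile(pattern) for pattern in patterns]
    keys = list(keys)
    # Assumption: an empty dict is not treated as a dict field.
    return bool(keys) and all(
        any(regex.fullmatch(key) for regex in compiled) for key in keys
    )


patterns = [r"node_\d+", r"\d+_\d+_\d+"]  # same patterns as the --dkr example
print(is_plain_dict(["node_1", "node_22"], patterns))
print(is_plain_dict(["node_1", "name"], patterns))
```

The first call prints True because both keys match `node_\d+`, and the second prints False because "name" matches neither pattern, so that data would be treated as a nested model.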

At least one of the model arguments (-m or -l) is required.
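
For the -l argument, the dot-separated <JSON key> lookup can be pictured with the small sketch below. The function `lookup` and the sample data are hypothetical, and the tool's own lookup code may handle errors and edge cases differently.

```python
from typing import Any


def lookup(data: Any, path: str) -> Any:
    """Follow a dot-separated key path into nested dicts; "-" means no lookup."""
    if path == "-":
        return data
    for key in path.split("."):
        data = data[key]  # raises KeyError if the path does not exist
    return data


result_json = {"fetch_results": {"items": {"persons": [{"name": "Ann"}, {"name": "Bob"}]}}}
persons = lookup(result_json, "fetch_results.items.persons")
print(persons)
```

This mirrors the `-l Person fetch_results.items.persons result.json` example: the list found at that path would be used as the dataset for the Person model.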

Low level API

-

Tests

To run the tests, clone the project and run the setup.py script:

git clone https://github.com/bogdandm/json2python-models.git
cd json2python-models
python setup.py test -a '<pytest additional arguments>'

I also recommend installing pytest-sugar to pretty-print the test results.

Test examples

You can find some usage examples for this project at testing_tools/real_apis/...

Each file contains functions that download data from an online API (references are included at the top of the file) and a main function that generates and prints code. Some examples may print debug data before the actual code. Downloaded data is saved at testing_tools/real_apis/<name of example>/<dataset>.json

Contributing

Feel free to open pull requests with new features or bug fixes. Just follow a few rules:

  1. Always use some code formatter (black or PyCharm built-in)
  2. Keep code coverage above 95-98%
  3. All existing tests should pass (including the test examples from testing_tools/real_apis)
  4. Use typing module
  5. Fix codacy issues from your PR

License

This project is licensed under the MIT License. See the LICENSE file for details.
