Implementation Details
File Formats
JSON Dataset
The PydanticJsonDataset dumps your
model as a self-describing JSON file.
In order for the dataset to be self-describing, we add the field "class" to your model, which is your class's full import path.
So if you have a Python class called Foo defined in your_module, the resulting
JSON file will contain the serialized fields of Foo plus the entry "class": "your_module.Foo".
Note: All json_encoders defined on your model will still be used.
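The idea of a self-describing file can be sketched with the standard library alone. This is an illustration, not the code pydantic-kedro uses: Foo is a hypothetical stand-in written as a dataclass instead of a pydantic model, and dump_self_describing and resolve_class are made-up helper names.

```python
import importlib
import json
from dataclasses import asdict, dataclass


@dataclass
class Foo:
    """Hypothetical stand-in for a model class (a dataclass, to stay stdlib-only)."""

    foo: str = "bar"


def dump_self_describing(obj) -> str:
    """Serialize the object's fields and add its class import path under "class"."""
    cls = type(obj)
    payload = asdict(obj)
    payload["class"] = f"{cls.__module__}.{cls.__qualname__}"
    return json.dumps(payload, indent=4)


def resolve_class(import_path: str) -> type:
    """Turn an import path such as "your_module.Foo" back into the class object."""
    module_name, _, class_name = import_path.rpartition(".")
    return getattr(importlib.import_module(module_name), class_name)


# The "class" value depends on the module in which Foo is defined.
print(dump_self_describing(Foo()))
```

Because the import path is stored inside the file, a reader does not need to be told which class to instantiate: it can call something like resolve_class on the "class" value and pass the remaining fields to the result.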
Folder and Zip Datasets
The PydanticZipDataset is based on the
PydanticFolderDataset and just zips
the folder.
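As a rough illustration of "zipping the folder", the sketch below archives a directory with the standard library. It is not the dataset's actual implementation; zip_folder is a hypothetical helper, and the folder holds only an empty meta.json for demonstration.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def zip_folder(folder: Path, zip_path: Path) -> Path:
    """Archive the contents of `folder` into `zip_path` (which should end in .zip)."""
    # make_archive appends the ".zip" suffix itself, so strip it from the base name.
    base_name = str(zip_path.with_suffix(""))
    return Path(shutil.make_archive(base_name, "zip", root_dir=folder))


with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp) / "dataset"
    folder.mkdir()
    (folder / "meta.json").write_text("{}")

    # Write the archive next to the folder, not inside it.
    archive = zip_folder(folder, Path(tmp) / "dataset.zip")
    with zipfile.ZipFile(archive) as zf:
        names = zf.namelist()

print(names)
```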
The directory contains a meta.json file at its root, next to the files and folders that hold the data referenced by the catalog.
The meta.json file has 3 main fields:
- "model_class" is the class import path, as in the JSON dataset.
- "model_info" is the JSON serialization of the model, except that all types are "encoded" to the string "__DATA_PLACEHOLDER__".
- "catalog" is the pseudo-definition of the Kedro catalog. The difference is in the relative_path argument.
The rest of the files/folders are the relative paths specified in the catalog.
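To make the three fields concrete, here is a minimal sketch of how such a meta.json could be assembled. Only the three top-level keys, the placeholder string, and the relative_path argument come from the description above. The rest is assumed for illustration: the build_meta helper, the catalog key names, the "type" strings, the example paths, and the reading that the placeholder stands in for fields stored as separate files.

```python
import json

DATA_PLACEHOLDER = "__DATA_PLACEHOLDER__"


def build_meta(model_class: str, json_fields: dict, data_fields: dict) -> dict:
    """Assemble a meta.json-like dict (hypothetical helper).

    json_fields: field name -> JSON-serializable value, kept inline.
    data_fields: field name -> (dataset type, relative path), stored as separate files.
    """
    model_info = dict(json_fields)
    catalog = {}
    for name, (dataset_type, relative_path) in data_fields.items():
        # The real value lives in its own file, so only a placeholder is kept inline.
        model_info[name] = DATA_PLACEHOLDER
        # Assumed entry shape: like a Kedro catalog entry, but with relative_path.
        catalog[name] = {"type": dataset_type, "relative_path": relative_path}
    return {"model_class": model_class, "model_info": model_info, "catalog": catalog}


meta = build_meta(
    model_class="your_module.Foo",
    json_fields={"name": "example"},
    data_fields={"table": ("pandas.CSVDataset", "table.csv")},  # illustrative values
)
print(json.dumps(meta, indent=4))
```

In this sketch, the relative paths listed under "catalog" are exactly the extra files and folders that sit beside meta.json.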
TODO: Is that all? Do we add model_schema or something similar?
This is subject to change as pydantic-kedro matures.