Implementation Details
File Formats
JSON Dataset
The PydanticJsonDataSet dumps your model as a self-describing JSON file.
In order for the dataset to be self-describing, we add the field "class"
to your model, which is your class's full import path.
So if you have a Python class called Foo defined in your_module, the resulting JSON file will contain your model's fields plus the entry "class": "your_module.Foo".
Note: All json_encoders defined on your model will still be used.
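The self-describing idea can be sketched with the standard library alone. This is a minimal illustration, not pydantic-kedro's implementation: a dataclass stands in for a pydantic model so the snippet runs without extra dependencies, and the helper names (import_path, dump_self_describing, load_self_describing) are hypothetical.

```python
import importlib
import json
from dataclasses import asdict, dataclass


@dataclass
class Foo:
    """Hypothetical stand-in for a pydantic model."""

    foo: str = "bar"


def import_path(obj) -> str:
    """Return the full import path of the object's class."""
    cls = type(obj)
    return f"{cls.__module__}.{cls.__qualname__}"


def dump_self_describing(obj) -> str:
    """Serialize the fields and add a "class" entry with the import path."""
    payload = asdict(obj)
    payload["class"] = import_path(obj)
    return json.dumps(payload, indent=4)


def load_self_describing(text: str):
    """Rebuild the object by importing the class named in "class"."""
    payload = json.loads(text)
    module_name, _, class_name = payload.pop("class").rpartition(".")
    cls = getattr(importlib.import_module(module_name), class_name)
    return cls(**payload)


dumped = dump_self_describing(Foo())
print(dumped)
```

Because the import path travels with the data, a reader does not need to be told which class to load, which is what makes the file self-describing.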
Folder and Zip Datasets
The PydanticZipDataSet is based on the PydanticFolderDataSet and just zips the folder.
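To illustrate what "just zips the folder" means in practice, here is a small sketch using only Python's zipfile module. It is an assumption about the general approach, not the library's actual code; zip_folder and unzip_folder are hypothetical helpers, and the folder contents are placeholders.

```python
import tempfile
import zipfile
from pathlib import Path


def zip_folder(folder: Path, zip_path: Path) -> Path:
    """Archive every file in `folder`, keeping paths relative to it."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(folder.rglob("*")):
            if path.is_file():
                archive.write(path, path.relative_to(folder).as_posix())
    return zip_path


def unzip_folder(zip_path: Path, target: Path) -> Path:
    """Extract the archive into `target`, recreating the folder layout."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    return target


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    folder = root / "model"
    folder.mkdir()
    # Placeholder content; a real folder dataset would hold more files.
    (folder / "meta.json").write_text("{}")
    archive_path = zip_folder(folder, root / "model.zip")
    restored = unzip_folder(archive_path, root / "restored")
    restored_names = sorted(p.name for p in restored.iterdir())
```

Storing paths relative to the folder keeps the archive relocatable: it can be extracted anywhere and read as an ordinary folder dataset.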
The folder contains a meta.json file alongside the files and folders that hold the model's data.
The meta.json file has 3 main fields:
- "model_class" is the class import path, as in the JSON dataset.
- "model_info" is the JSON serialization of the model, except that all types are "encoded" to the string "__DATA_PLACEHOLDER__".
- "catalog" is the pseudo-definition of the Kedro catalog. The difference is in the relative_path argument.
The rest of the files/folders are the relative paths specified in the catalog.
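The three fields can be pictured with a short sketch. Everything beyond the field names and the placeholder string is an assumption made for illustration: the build_meta helper, the shape of each catalog entry, the "pickle.PickleDataSet" type, and the ".blob" naming scheme are hypothetical and may not match what pydantic-kedro writes.

```python
import json

PLACEHOLDER = "__DATA_PLACEHOLDER__"
# In this flat sketch only scalars count as JSON-native values.
JSON_NATIVE = (str, int, float, bool, type(None))


def build_meta(model_class: str, values: dict) -> dict:
    """Split flat model fields into JSON-friendly info and a pseudo-catalog."""
    model_info = {}
    catalog = {}
    for name, value in values.items():
        if isinstance(value, JSON_NATIVE):
            model_info[name] = value
        else:
            # The value is stored elsewhere in the folder, so the JSON
            # only keeps a placeholder in its position.
            model_info[name] = PLACEHOLDER
            catalog[name] = {
                "type": "pickle.PickleDataSet",  # assumed dataset type
                "relative_path": f".{name}",  # assumed naming scheme
            }
    return {
        "model_class": model_class,
        "model_info": model_info,
        "catalog": catalog,
    }


meta = build_meta("your_module.Foo", {"name": "demo", "blob": {1, 2, 3}})
print(json.dumps(meta, indent=4))
```

Using a path relative to the dataset folder, rather than an absolute file path, is what lets the whole folder be moved or zipped without rewriting the catalog.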
TODO: Is that all? Do we add model_schema or something similar?
This format is subject to change as pydantic-kedro matures.