2025-04-10

(Un)commonly used Pydantic features

In the spirit of "parse, don't validate", here are some Pydantic features I use a lot but rarely see used elsewhere.

These days I tend to feed all external data through Pydantic as early as possible, for those sweet, sweet types.

A classic place to use these features is parsing a CSV. Instead of manually validating numbers, datetimes, etc., just set up your types (here MyType is a pydantic.TypeAdapter) and:

typed_rows = [MyType.validate_python(row) for row in csv.reader(...)]
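To make that one-liner concrete, here's a rough sketch of the whole round trip. The row shape (name, quantity, order date) and the inline CSV text are made up for illustration; a real script would pass an open file to csv.reader instead of an io.StringIO.

```python
import csv
import datetime as dt
import io

import pydantic

# Hypothetical row shape: (name, quantity, order date).
Row = pydantic.TypeAdapter(tuple[str, int, dt.date])

# Stand-in for an open CSV file.
raw = io.StringIO("widget,3,2024-05-01\ngadget,12,2024-05-02\n")

# csv.reader yields list[str]; Pydantic turns each one into a typed tuple.
typed_rows = [Row.validate_python(row) for row in csv.reader(raw)]

assert typed_rows[0] == ("widget", 3, dt.date(2024, 5, 1))
assert typed_rows[1] == ("gadget", 12, dt.date(2024, 5, 2))
```

A malformed cell (say, "three" in the quantity column) would raise a pydantic.ValidationError at the row where it occurs, rather than leaking a bad string further into the program.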

Features used:

- Custom parsing/serialisation of weird strings.
  - Just raise a ValueError on errors - Pydantic will handle 'em (they surface as a ValidationError)
  - You only need to do the initial parsing - in this case, Pydantic itself is still doing list[str] -> list[float]
- Usage of TypeAdapter for when the thing we're trying to deserialize is list-like and we can't use a pydantic.BaseModel.
- Usage of .dump_python(..., mode="json") for e.g. preparing data for an ORM insert.
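On the "just raise a ValueError" point, here's a small sketch of what that looks like from the caller's side. The strict_underscore_split validator and the error_types helper are hypothetical names invented for this illustration; the sketch only assumes Pydantic v2.

```python
from typing import Annotated, Any

import pydantic


def strict_underscore_split(v: Any) -> Any:
    # Hypothetical stricter variant: reject empty strings up front.
    if isinstance(v, str):
        if not v.strip():
            raise ValueError("expected underscore-separated numbers, got an empty string")
        return v.split("_")
    return v


Floats = pydantic.TypeAdapter(
    Annotated[list[float], pydantic.BeforeValidator(strict_underscore_split)]
)


def error_types(raw: Any) -> list[str]:
    """Return the Pydantic error types raised for raw, or [] if it parses."""
    try:
        Floats.validate_python(raw)
    except pydantic.ValidationError as exc:
        return [err["type"] for err in exc.errors()]
    return []


assert error_types("3.14_2.72") == []
# Our own ValueError is wrapped into a ValidationError entry.
assert error_types("") == ["value_error"]
# Pydantic's list[str] -> list[float] step reports its own failures.
assert error_types("3.14_oops") == ["float_parsing"]
```

Either way the caller only ever has to catch pydantic.ValidationError, whether the failure came from the custom validator or from Pydantic's own conversion.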

import datetime as dt
from typing import Annotated, Any
import pydantic


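def underscore_split(v: Any) -> Any:
    # Runs before Pydantic's own validation: only split the raw string,
    # and let Pydantic convert the resulting list[str] to list[float].
    if isinstance(v, str):
        return v.split("_")
    # Anything else (e.g. an existing list) is passed through untouched.
    return v


def underscore_serialize(vs: list[float]) -> str:
    # Inverse of underscore_split: [3.14, 2.72] -> "3.14_2.72"
    return "_".join(str(v) for v in vs)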


UnderscoreFloats = Annotated[
    list[float],
    pydantic.BeforeValidator(underscore_split),
    pydantic.PlainSerializer(underscore_serialize),
]
MyTuple = pydantic.TypeAdapter(
    tuple[int, dt.datetime, UnderscoreFloats],
)

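# Every element starts out as a string, as it would coming from a CSV row.
my_tuple = MyTuple.validate_python(("42", "2012-01-30", "3.14_2.72"))
assert my_tuple == (42, dt.datetime(2012, 1, 30), [3.14, 2.72])
# mode="json" emits only JSON-compatible types: tuple -> list, datetime -> ISO 8601 string.
assert MyTuple.dump_python(my_tuple, mode="json") == [42, "2012-01-30T00:00:00", "3.14_2.72"]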
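
For comparison, here's a minimal sketch of what mode="json" changes relative to the default dump mode. The Row adapter and its values are made up for this illustration and use only built-in types, so the difference comes purely from the mode.

```python
import datetime as dt
import json

import pydantic

# Hypothetical two-column row: (id, timestamp).
Row = pydantic.TypeAdapter(tuple[int, dt.datetime])
row = Row.validate_python(("42", "2012-01-30T12:30:00"))

# Default mode keeps rich Python objects.
as_python = Row.dump_python(row)
# mode="json" reduces everything to JSON-compatible types.
as_json = Row.dump_python(row, mode="json")

assert as_python == (42, dt.datetime(2012, 1, 30, 12, 30))
assert as_json == [42, "2012-01-30T12:30:00"]
# The JSON-mode result can go straight through json.dumps.
assert json.dumps(as_json) == '[42, "2012-01-30T12:30:00"]'
```

The JSON-mode output is handy whenever the next layer only understands plain strings, numbers and lists; if the consumer accepts datetime objects directly, the default mode may be the better fit.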