2020-03-15




I've since addressed a lot of these points in an experimental Python library.


What nested SQL queries might look like

A friend was talking to me about ORM design recently. I'm a heavy SQLAlchemy user, but see all the reasons people dislike ORMs, so on a recent project I had a go at dipping back down to raw SQL land to see how I'd cope.

I ended up finding the experience pretty frustrating - I'm sure someone will have written it before, but there's definitely some Greenspun's eleventh rule about any sufficiently complex SQL string munging codebase turning into a half-arsed, bug-ridden ORM. I would include in this (although to a smaller degree) sufficiently complex code using only SQL query builder libraries (like knex or SQLAlchemy core).

Here's the features that postgres (or whatevever) would have to provide for me to ditch the ORM - although I'm still imagining using some kind of SQL query builder. They're vaguely in order of preference, also I've in no way thought through the details :-)

Nested join queries

When I do joins between things that map to entites, I don't want to then have to manually munge them back into nested entites in my host language, I want to do something like:

SELECT NESTED {
    customer.name,
    customer.email,
    [
        {
            product.id,
            product.cost,
        }
    ]
    sum(product.cost) AS total_cost,
}
FROM customer
LEFT OUTER JOIN NESTED product USING(customer_id)
GROUP BY customer.customer_id

Looks kinda similar to graphql I guess, but I want it in SQL!

A slightly absurd thing to me is that despite all its normalisation chat, the result of a SELECT x.*, y.* FROM x JOIN y SQL query will have x.* values repeated over and over. I guess this has negative performance implications when x is wide and there are many ys for each x.

Nested insertion

Without this I have to do:

INSERT INTO customer (name, email) VALUES ... RETURNING customer_id

Then do a separate query to insert products using the customer_ids I got back. In this case with only one level of nesting, this is fine, but things get very hairy when you start wanting to insert things with 2+ levels of nesting.

I want to do something like:

INSERT INTO customer
INSERT NESTED product
VALUES
{
    customer.name='oli',
    customer.email='oli@oli.com',
    [
        {
            product.customer_id=customer.customer_id,
            product.cost=4.23,
        }
    ]
}

The syntax here seems harder to get right - this seems a bit ambiguous/inflexible.

Problems/Prior work