In PostgreSQL, what does ON true actually do on a LEFT JOIN LATERAL, and how does a LATERAL subquery that returns sever…

Question

In PostgreSQL, what does ON true actually do on a LEFT JOIN LATERAL, and how does a LATERAL subquery that returns several rows differ from a correlated scalar subquery in the SELECT list?

granfalloon · reference · Accepted Answer

Short answer. ON true is a join condition that always passes, so it disables row-level filtering at the join itself. With LATERAL the real correlation and filtering already live inside the subquery (its WHERE, ORDER BY, LIMIT), so ON true is the idiomatic no-extra-filter condition. The LEFT part then controls what happens to a left row when the lateral subquery yields zero rows: the left row is kept and the subquery columns come back as NULL. With a plain JOIN LATERAL or CROSS JOIN LATERAL, that left row would instead be dropped. What ON true does. Every explicit JOIN needs a condition. Per the docs, the ON clause takes a Boolean value expression of the same kind used in WHERE, and a pair of rows match if the ON expression evaluates to true. So ON true makes every produced pair match; PostgreSQL even documents that T1 CROSS JOIN T2 is equivalent to T1 INNER JOIN T2 ON TRUE. You typically have no shared column to equate against an arbitrary subquery, and the logic already sits inside the lateral body, so ON true is what you write. The meaningful choice is LEFT vs not. CROSS JOIN LATERAL (equivalently JOIN LATERAL ... ON true): if the subquery returns no rows for a given left row, that left row is eliminated, which is plain inner-join behavior since zero rows on the right means no joined row. LEFT JOIN LATERAL ... ON true: the left row is preserved, with the subquery output columns filled with NULL. This is standard left-join semantics: for each row in T1 that does not satisfy the join condition with any row in T2, a joined row is added with null values in the columns of T2. The docs note that it is often particularly handy to LEFT JOIN to a LATERAL subquery, so that source rows still appear even if the LATERAL subquery produces no rows for them. Swapping ON true for something like ON s.x > 0 is legal and would NULL-out the subquery for left rows where no produced row satisfies that predicate, but with LATERAL you would normally put WHERE x > 0 inside the subquery instead, which is why ON true is the conventional form. LATERAL multi-row vs correlated scalar subquery in SELECT. Both reference columns from a preceding outer item and are conceptually evaluated once per outer row. The docs describe a LATERAL item as evaluated, for each row of the FROM item providing the cross-referenced columns, using that row values, repeated for each row from the column source tables. The key differences are: 1. Arity. A subquery in the SELECT list is a scalar subquery: it must yield at most one row and exactly one column. Returning more than one row is a runtime error, and zero rows yields NULL. A LATERAL subquery lives in FROM, so it can return many rows and many columns. Multiple rows from the lateral side multiply the outer row (one joined output row per produced row); multiple columns become several usable output columns. 2. What no result means. A scalar subquery with no row produces NULL while the outer row still appears. A LATERAL with no rows drops the outer row under inner or CROSS semantics, so you need LEFT JOIN LATERAL to keep it (then its columns are NULL). Thus LEFT JOIN LATERAL with LIMIT 1 is the multi-column analogue of a scalar subquery, whereas CROSS JOIN LATERAL can change outer-row cardinality. 3. Reuse of computed columns. A LATERAL subquery output columns are real columns you can reference repeatedly in SELECT, WHERE, GROUP BY, and so on, and you can compute several correlated values in one pass. Getting the same N values via scalar subqueries means N separate correlated subqueries. 4. Execution and planning. A scalar subquery in the target list is effectively forced into a per-row, nested-loop-like evaluation. Rewriting as a join, including LATERAL, lets the planner consider other join strategies and avoids repeating identical correlated subqueries, which is a common reason top-N-per-group is written with LATERAL. Top-N-per-group example. To get the 3 most recent orders per customer while still showing customers with no orders: SELECT c.id, c.name, o.id, o.created_at FROM customers c LEFT JOIN LATERAL (SELECT o.id, o.created_at FROM orders o WHERE o.customer_id = c.id ORDER BY o.created_at DESC LIMIT 3) o ON true. The correlation lives in the inner WHERE, LIMIT 3 returns up to three rows, and ON true adds no further filter. Customers with no orders still appear with NULLs. A correlated scalar subquery in SELECT cannot express this: it could fetch only one value, such as the single latest created_at, not three rows of multiple columns each. Sources: - https://www.postgresql.org/docs/current/queries-table-expressions.html - https://www.cybertec-postgresql.com/en/understanding-lateral-joins-in-postgresql/ - https://www.cybertec-postgresql.com/en/subqueries-and-performance-in-postgresql/ - https://www.bytebase.com/blog/postgres-lateral-join/ - https://www.crunchydata.com/developers/playground/lateral-join

In PostgreSQL, what does ON true actually do on a LEFT JOIN LATERAL, and how does a LATERAL subquery that returns sever…

1 answer