
adrian17 7 days

> The SQL above results in a plan similar to the DuckDB optimized plan, but it is wordier and more error-prone to write, which can potentially lead to bugs.

FWIW, aside from manual filter pushdown, I consider the JOIN variant the canonical / "default" way to merge multiple tables; it keeps all the join-related logic in one place, while mixing both joining conditions and filtering conditions in WHERE always felt more error-prone to me.
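To make the comparison concrete, here is a minimal sketch of the two styles run against an in-memory SQLite database. The `customers` and `orders` tables and their contents are made up for illustration and do not come from the article; the point is only that both forms return the same rows, while the JOIN form keeps the join condition apart from the filter.

```python
import sqlite3

# Hypothetical toy schema, not taken from the article.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'EU'), (2, 'US'), (3, 'EU');
    INSERT INTO orders VALUES (10, 1, 5.0), (11, 2, 7.5), (12, 3, 2.5), (13, 1, 1.0);
""")

# Implicit join: the join condition and the filter share one WHERE clause.
implicit_sql = """
    SELECT o.id, o.amount
    FROM orders o, customers c
    WHERE o.customer_id = c.id AND c.region = 'EU'
    ORDER BY o.id
"""

# Explicit join: the join condition lives in ON, the filter stays in WHERE.
explicit_sql = """
    SELECT o.id, o.amount
    FROM orders o
    JOIN customers c ON o.customer_id = c.id
    WHERE c.region = 'EU'
    ORDER BY o.id
"""

implicit_rows = conn.execute(implicit_sql).fetchall()
explicit_rows = conn.execute(explicit_sql).fetchall()
print(implicit_rows == explicit_rows)
```

Forgetting `o.customer_id = c.id` in the first form silently yields a cross product, whereas a `JOIN` without `ON` is at least easy to spot, which is one reading of the "error-prone" concern.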

lmeyerov 7 days

Always a fan of query plan articles!

Note: the dig at dataframe libs is worth some care, in case you read it as meaning DuckDB can optimize and they cannot.

Dask, Polars, and others pick a lazy default in order to make distribution and other optimizations easier. When staying in their pure fragments ('vectorized'), the same scheduler rewriting opportunity is here.

This is a subtle but important distinction when looking at these frameworks. We are making our new graph query language 'gfql' dataframe-native so it can run naturally & natively as a step of pipelines people are already running, and also to ensure we automatically run as optimized CPU/GPU columnar ops. At the same time, because we want to leave room for query plan optimization, we are staying declarative / lazy, even if the generated & interpreted code uses an eager DF runtime. I'm optimistic that lazy DF systems, as output targets, will do the query planner work for us long-term, but for the eager framework targets, the query planning has to be on our side.

unwind 7 days

Meta: I think the title would be better here if it came out and said query optimizers.

That gives a less subtle clue that it's about databases than looking at the domain.

ikesau 7 days

> This means your optimizations need to be applied by hand, which is sustainable if your data starts changing.

Seems like a missing "un" here

Compelling article! I've already found DuckDB to be the most ergonomic tool for quick and dirty wrangling; it's good to know it can handle massive jobs too.

kwillets 6 days

One possible hand optimization is to push the aggregation below the joins, which makes the latter a few hundred rows.
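As a rough sketch of that idea, the snippet below compares aggregating after the join with pre-aggregating one input before it. The tables and values are hypothetical stand-ins rather than the article's dataset, and the rewrite only works here because SUM can be computed in stages.

```python
import sqlite3

# Hypothetical toy schema, not taken from the article.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'EU'), (2, 'US'), (3, 'EU');
    INSERT INTO orders VALUES (10, 1, 5.0), (11, 2, 7.5), (12, 3, 2.5), (13, 1, 1.0);
""")

# Aggregate last: every order row flows through the join.
agg_after_join_sql = """
    SELECT c.region, SUM(o.amount) AS total
    FROM orders o
    JOIN customers c ON o.customer_id = c.id
    GROUP BY c.region
    ORDER BY c.region
"""

# Aggregate first: orders collapse to one row per customer before the join,
# then the partial sums are combined per region.
agg_before_join_sql = """
    SELECT c.region, SUM(pre.total) AS total
    FROM (
        SELECT customer_id, SUM(amount) AS total
        FROM orders
        GROUP BY customer_id
    ) AS pre
    JOIN customers c ON pre.customer_id = c.id
    GROUP BY c.region
    ORDER BY c.region
"""

late_rows = conn.execute(agg_after_join_sql).fetchall()
early_rows = conn.execute(agg_before_join_sql).fetchall()
print(late_rows == early_rows)
```

With only four orders the saving is invisible, but when the fact table is large and the grouping key has few distinct values, the join sees far fewer rows in the second form.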