Derived columns #367

annie · 2023-04-27T17:54:50Z

Part of https://github.com/observablehq/observablehq/issues/11623

Description

This PR adds support for derived columns in the __table function. In this first version, we won't support derived columns for database sources.

Notable changes:

Pulled the logic for inferring types and coercing rows into a separate applyTypes function. We now run this twice, first on the source dataset, and second on the derived dataset, before merging the two. Type inference/coercion must be done separately on the derived dataset because it may depend on values in the source dataset already being coerced.
Added a .fullSchema property to the return value of __table. This property contains the schema information for all columns in the dataset, whether or not they are selected (.schema only contains the schema info for selected columns). In https://github.com/observablehq/observablehq/pull/11214, we switch to using the cell value to get the table schema, because derived columns aren't available on the original dataset and their types may be dynamic, so we need to look at their evaluated runtime values. We read .fullSchema when fetching the table schema so that type information is always available for the full set of columns, which lets users, for example, reselect a deselected column in the Columns menu.
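
To illustrate the two-pass typing described above, here is a minimal sketch. It is not the code in src/table.js: inferType, this applyTypes body, deriveTable, and the shape of the derivations argument are simplified, hypothetical stand-ins. The point it tries to show is that derived formulas are evaluated against already-coerced source rows, and that the derived values then get their own inference/coercion pass before the two are merged.

```javascript
// Hypothetical, simplified type inference: "number" if every present value
// parses as a finite number, otherwise "string".
function inferType(values) {
  const present = values.filter((v) => v != null && v !== "");
  return present.length > 0 && present.every((v) => Number.isFinite(Number(v)))
    ? "number"
    : "string";
}

// Hypothetical stand-in for applyTypes: infer a schema for the given columns
// and return coerced copies of the rows alongside that schema.
function applyTypes(rows, columns) {
  const schema = columns.map((name) => ({
    name,
    type: inferType(rows.map((row) => row[name]))
  }));
  const coerced = rows.map((row) => {
    const out = {...row};
    for (const {name, type} of schema) {
      const v = row[name];
      if (v == null || v === "") out[name] = null;
      else out[name] = type === "number" ? Number(v) : String(v);
    }
    return out;
  });
  return {rows: coerced, schema};
}

// Assumed shape for derivations: [{name, value: (row) => any}, ...].
function deriveTable(source, sourceColumns, derivations) {
  // Pass 1: infer and coerce the source dataset.
  const typedSource = applyTypes(source, sourceColumns);
  // Evaluate each formula against the coerced row; later formulas can see
  // columns derived earlier in the same row.
  const derivedRows = typedSource.rows.map((row) => {
    const derived = {};
    for (const {name, value} of derivations) {
      derived[name] = value({...row, ...derived});
    }
    return derived;
  });
  // Pass 2: infer and coerce the derived dataset on its own.
  const typedDerived = applyTypes(derivedRows, derivations.map((d) => d.name));
  // Merge the two datasets and their schemas.
  return {
    rows: typedSource.rows.map((row, i) => ({...row, ...typedDerived.rows[i]})),
    schema: [...typedSource.schema, ...typedDerived.schema]
  };
}

const result = deriveTable(
  [{price: "10", qty: "3"}, {price: "2.5", qty: "4"}],
  ["price", "qty"],
  [
    {name: "sum", value: (d) => d.price + d.qty},
    {name: "doubled", value: (d) => d.sum * 2}
  ]
);
```

In this example the source values arrive as strings, so evaluating `d.price + d.qty` before coercion would concatenate them; running the first pass beforehand is what makes `sum` numeric.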

Review notes

I would love some feedback on the .fullSchema change! I'm not sure if it's the best way to make all the column types available, and I'm also not sure if I'm missing any major pitfalls with switching to use the cell value to fetch the table schema, instead of looking at the original data source as we do today. The main pitfall I experienced when testing is that, if there's an error in a derived formula, we no longer have a table schema available because the cell throws an error. I addressed that in the monorepo PR by adding a fallback that goes back to using the loaded data source + an approximation of the derived columns schema, which I think is the best we can do in that case. But perhaps there are other issues with this approach that I'm missing...
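
For reviewers who want something concrete to react to, the relationship between .schema and .fullSchema, and the fallback used when the cell throws, could be sketched roughly as follows. Everything here is an assumption made for illustration: withSchemas and getTableSchema are hypothetical names, the real consumer lives in the monorepo PR, and using "other" as the placeholder type for derived columns is a guess at what "an approximation" might look like.

```javascript
// Hypothetical: attach both schemas to the projected rows.
// schema covers only the selected columns; fullSchema covers every column.
function withSchemas(rows, fullSchema, selected) {
  const schema = fullSchema.filter((c) => selected.includes(c.name));
  const projected = rows.map((row) =>
    Object.fromEntries(selected.map((name) => [name, row[name]]))
  );
  return Object.assign(projected, {schema, fullSchema});
}

// Hypothetical consumer: prefer the evaluated cell value's fullSchema.
// If the cell threw (no usable value), fall back to the loaded data source's
// schema plus placeholder entries for the derived columns.
function getTableSchema(cellValue, sourceSchema, derivedNames) {
  if (cellValue && Array.isArray(cellValue.fullSchema)) {
    return cellValue.fullSchema;
  }
  return [
    ...sourceSchema,
    ...derivedNames.map((name) => ({name, type: "other"})) // assumed placeholder type
  ];
}

const table = withSchemas(
  [{a: 1, b: "x", c: 2}],
  [
    {name: "a", type: "number"},
    {name: "b", type: "string"},
    {name: "c", type: "number"}
  ],
  ["a", "c"]
);
const fromValue = getTableSchema(table, [], []);
const fromFallback = getTableSchema(undefined, [{name: "a", type: "number"}], ["c"]);
```

Because fullSchema still lists the deselected column b, a Columns menu built from it can offer b for reselection even though the projected rows no longer carry it.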


annie · 2023-05-10T18:03:33Z

just pushed up a new commit that catches runtime errors and returns them in the function output (although it seems like i need to update tests!). will put up a corresponding monorepo PR shortly!

update: monorepo PR here
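
A rough sketch of the idea in this comment, catching runtime errors from derived formulas and returning them in the function output, is below. It is an illustration under assumptions rather than the committed code: deriveWithErrors is a hypothetical name, and the {name, error} shape of each columnErrors entry is a guess (the commit list only says that columnErrors is an array rather than a map).

```javascript
// Assumed shape for derivations: [{name, value: (row) => any}, ...].
function deriveWithErrors(rows, derivations) {
  const columnErrors = []; // assumed entry shape: {name, error}
  const derivedRows = rows.map((row) => {
    const out = {...row};
    for (const {name, value} of derivations) {
      try {
        out[name] = value(out);
      } catch (error) {
        // Leave the cell empty and record the first error seen per column.
        out[name] = undefined;
        if (!columnErrors.some((e) => e.name === name)) {
          columnErrors.push({name, error});
        }
      }
    }
    return out;
  });
  return {rows: derivedRows, columnErrors};
}

const output = deriveWithErrors(
  [{x: 1}, {x: 2}],
  [
    {name: "ok", value: (d) => d.x * 2},
    {name: "bad", value: (d) => d.missing.length} // throws a TypeError
  ]
);
```

Returning the errors instead of throwing means the caller still receives rows and can report which formula failed.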

mkfreeman and others added 13 commits February 15, 2023 12:41

Basic handling of column derivations

88dc5e0

Derive value before running __table

e227e49

Add applyTypes method

e805f51

Return the full schema on the source

1e0c4d9

Merge branch 'main' into mkfreeman/derive-columns

aa3f98b

move derivations into __table

b212f2e

cleanup; add comments

b01f5f0

use hidden flag for deselected columns; rm fullSchema

84e1ee4

add test for derive; clean up other tests

dcb7b43

allow derived columns to reference previously derived columns

dd8c1f6

go back to using .fullSchema

bfba41e

refine comment

3cbd557

fix derivedSource; refine fullSchema; add fullSchema to tests

4f8e535

annie requested review from mbostock and libbey-observable April 27, 2023 17:54

handle usage of renamed columns in derived formulas

b094acb

libbey-observable reviewed May 9, 2023


catch and return runtime errors

8eebe3c

use array instead of map for columnErrors; update unit tests

0b1f2db

annie requested a review from libbey-observable May 11, 2023 17:28

libbey-observable approved these changes May 11, 2023


Merge branch 'main' into mkfreeman/derive-columns

65fb665

annie merged commit 63bce4e into main May 11, 2023

annie deleted the mkfreeman/derive-columns branch May 11, 2023 22:46
