Skip to content

window skipNaN #996

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Jul 15, 2022
Merged

window skipNaN #996

merged 5 commits into from
Jul 15, 2022

Conversation

Fil
Copy link
Contributor

@Fil Fil commented Jul 15, 2022

Re: #993 (comment)

There is a case for saying that having a window containing a NaN is the same thing as a window that is on an edge: the values "before the beginning" or "after the end" are unavailable and we either ignore them (and apply the reducer on valid values) or return NaN.

Put it differently, if you accept windows that "extend", then you probably don't care too much about NaNs and statistical consistency, and prefer a “best guess” result with no holes.

In that sense, we should support strict: true to mean "no extension, and return NaN if any value is NaN", and strict: false the opposite. (This being the default, we would have gone from a very strict concept of window transform to a very lenient one.)

Of course this is opinionated, but it's consistent with the notion that a time series (employment 1990-2017) is in itself a window in time.

@Fil Fil requested a review from mbostock July 15, 2022 10:45
Copy link
Member

@mbostock mbostock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One comment re. mean.

Also I think we can do this without passing a new parameter to the map function. I will give it a shot and push to this PR if successful.

Thanks for doing this! Teamwork! 👏

map(I, S, T) {
sum.map(I, S, T);
map(I, S, T, skipNaN) {
sum.map(I, S, T, skipNaN);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we won’t be able to write mean in terms of sum here because if we ignore NaN then the denominator should change (e.g., if k is 5 but one value in the window is NaN then the denominator should be 4 and the mean is the sum divided by 4). Otherwise we are saying that undefined values are zero (which could be valid but should be an alternate representation).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤦🏻

mbostock and others added 4 commits July 15, 2022 16:03
* generalize non-strict reducers

* tweak

* don’t coerce for mode

* non-strict reducers

* DRY

* mode can be undefined, not NaN

* fix first/last defined logic

* don’t coerce for first and last

* numbers cannot be null

* shorter

* fix comment

Co-authored-by: Philippe Rivière <[email protected]>
@mbostock mbostock merged commit 9d9ba91 into main Jul 15, 2022
@mbostock mbostock deleted the fil/window-nan branch July 15, 2022 20:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants