Skip to content

RTK Query - query level customization/normalization of cache key #1668

@ethanresnick

Description

@ethanresnick

The Problem

Imagine a query where args is an options object, and some of the keys have default values. For example, maybe there's a getPosts query that has args of { query: string, recentOnly?: boolean }; when the caller omits recentOnly in the args, the query function acts as though it were false.

The result is that there are two ways to fetch the same data:

useGetPostsQuery({ query: "react" }); // recentOnly is implicity false
useGetPostsQuery({ query: "react", recentOnly: false }); // recentOnly provided explicitly

Conceptually, both of these hooks fetch the same data, but, right now, those two calls will generate different cache keys. This presents a few issues:

  1. Most obviously, the cache just won't quite as work as well. I.e., if one component calls useGetPostsQuery({ query: "react" }), and another does useGetPostsQuery({ query: "react", recentOnly: false }), the second component won't be able benefit from the cached response from the first hook invocation. Moreover, these two cached responses can easily get out of sync (e.g., if one's polling for updates and the other isn't), when there's really no need for them to.

  2. I think the bigger issue, though, is that now manual cache updates are unreliable. The cached data to update might exist under either of the two cache keys, or both, depending on how exactly the query was invoked. Any code doing manual/optimistic updates has to make sure to account for all possible cache keys, which is bug-prone. (Plus, as there are more optional properties in args, the number of possible cache keys for the same data explodes.)

  3. Because the cached responses are not linked to one another and can get out of sync (e.g., if the manual/optimistic update code does not account for all possible cache keys), refactors that will seem to developers like they ought to be safe — like replacing one version of the hook call from my example with the other — will end up causing breaking changes.

Proposed Solution

This is a classic case where two different representations (i.e., { query: "react" } and { query: "react", recentOnly: false }) actually mean the same thing, so the query ought to be able to define sort of normalization step so that they resolve to the same cache key.

I propose extending the API for defining a query to support a normalizeQueryArgs property, like this:

getPosts: build.query<Post[], { query: string, recentOnly: boolean }>({
   query: (args) => ({ url: `/posts?limit=${args.recentOnly? "50" : "all"}` }),
   normalizeQueryArgs: (rawArgs: { query: string, recentOnly?: boolean }) => ({ recentOnly: false, ...rawArgs })
})     

The idea is that the argument passed to a useXXXQuery hook, or to a direct initiate call, would go through normalizeQueryArgs, with the result value being passed to the query function. The cache key generation logic would stay identical to today, but it would now be based on a normalized representation of the arguments.

Potential Alternatives

I can think of a couple alternatives here:

First, RTK query's docs could simply warn of these caching issues, and say that any query where args allows multiple values to have the same meaning is an anti-pattern. Then, users would have to avoid/refactor such queries.

My hypothetical getPosts query might get refactored by splitting it into two queries, like getLatestPosts({ query: ".." }) and getAllTimePosts({ query: "..." }). Or, maybe it would be refactored so that recentOnly is always required, so that there are no longer two different cases (missing + false) with the same meaning.

The problem with this approach, I think, is that queries with optional arguments are useful, and their refactored forms are worse/less ergonomic. Consider splitting getPosts into getLatestPosts and getAllTimePosts: if the only difference between getLatestsPosts and getAllTimePosts is the value used for the ?limit query parameter when calling the API (as in my example above), then having two queries is going to introduce a lot of needless duplication (in transformResponse, providesTags, etc). Meanwhile, making keys that could have default values (like recentOnly) required just makes every caller have to do more typing, and requires updating every query call site if a new could-have-been-optional key is added to the options object.

Another approach could be handling this in the global serializeQueryArgs function, but that doesn't seem right, as that function applies to all queries (so it'd need conditional logic based on which query's arguments it's dealing with). It's also defined very far away from the place where each query defines its arguments, so it'd be hard to keep it correct.

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Type

    No type

    Projects

    No projects

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions