Skip to content

[ADD] util/update_table_from_dict #297

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

Pirols
Copy link
Contributor

@Pirols Pirols commented Jul 18, 2025

A recurrent challenge in writing upgrade scripts is that of updating values in a table based on some form of already available mapping from the id (or another identifier) to the new value, this is often addressed with an iterative solution in the form:

for key, value in mapping.items():
    cr.execute(
        """
        UPDATE table
           SET col = %s
         WHERE key_col = %s
        """,
        [value, key],
    )

or in a more efficient (only issuing a single query) but hacky way:

cr.execute(
    """
    UPDATE table
       SET col = (%s::jsonb)->>(key_col::text)
     WHERE key_col = ANY(%s)
    """,
    [json.dumps(mapping), list(mapping)],
)

With the former being ineffective for big mappings and the latter often requiring some comments at review time to get it right. This commit introduces a util meant to make it easier to efficiently perform such updates.

@robodoo
Copy link
Contributor

robodoo commented Jul 18, 2025

Pull request status dashboard

@Pirols
Copy link
Contributor Author

Pirols commented Jul 18, 2025

Usage example: https://github.com/odoo/upgrade/pull/8079

@KangOl
Copy link
Contributor

KangOl commented Jul 18, 2025

upgradeci retry with always only base

@Pirols Pirols force-pushed the master-add_update_table_from_dict-pied branch 5 times, most recently from 1e5c9a3 to 31ecb56 Compare July 22, 2025 13:15
@Pirols Pirols requested review from KangOl and aj-fuentes July 22, 2025 14:55
@Pirols Pirols force-pushed the master-add_update_table_from_dict-pied branch 3 times, most recently from e3ab180 to 876a5d9 Compare July 22, 2025 15:02
@Pirols Pirols force-pushed the master-add_update_table_from_dict-pied branch 3 times, most recently from 5c81d74 to 20f04c9 Compare July 29, 2025 15:27
:param str key_col: column to match against keys of `mapping`
:param int bucket_size: maximum number of rows to update per single query
"""
_validate_table(table)
Copy link
Contributor

@KangOl KangOl Jul 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also like to have a validation of the mapping

Suggested change
_validate_table(table)
_validate_table(table)
assert isinstance(mapping, dict) and all(isinstance(key, int) and isinstance(value, (list, tuple)) for key, value in mapping.items())

Copy link
Contributor Author

@Pirols Pirols Jul 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to keep key_col and limit its potential targets to integer columns or remove it entirely?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think key_col should always be an int, but I like to have it variable. It would allow to update a table based in a FK instead of a PK. To decide you can grep the repo and see if we currently use the dict trick for other columns than id... I think we did but my memory could betray me.

Copy link
Contributor Author

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, then maybe it's worth taking the type from the key_col in the table directly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That should not be necessary since JSONB_EACH returns the key as a text1.
Unless we want to validate the type of the keys of mapping. In which case I would need to define (or reuse, if already available) a mapping from pg types to python types. Wdyt?

Footnotes

  1. https://www.postgresql.org/docs/current/functions-json.html#FUNCTIONS-JSON-PROCESSING-TABLE

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well I just noticed you are casting the key_col to text before comparing with the actual json dict key, then it's fine. :) What I had in mind was casting the dict key to the column type. In principle my suggestion was to block/fail if we get an invalid json dict (incompatible with the table), I have no strong opinion against current approach TBH. @KangOl wdyt?

@Pirols Pirols force-pushed the master-add_update_table_from_dict-pied branch from 20f04c9 to 4bbaf95 Compare July 29, 2025 17:08
@Pirols Pirols force-pushed the master-add_update_table_from_dict-pied branch 4 times, most recently from 54d1ef6 to dbf2889 Compare July 30, 2025 09:53
A recurrent challenge in writing upgrade scripts is that of updating values in a
table based on some form of already available mapping from the id (or another
identifier) to the new value, this is often addressed with an iterative solution
in the form:

```python
for key, value in mapping.items():
    cr.execute(
        """
        UPDATE table
           SET col = %s
         WHERE key_col = %s
        """,
        [value, key],
    )
```

or in a more efficient (only issuing a single query) but hacky way:

```python
cr.execute(
    """
    UPDATE table
       SET col = (%s::jsonb)->>(key_col::text)
     WHERE key_col = ANY(%s)
    """,
    [json.dumps(mapping), list(mapping)],
)
```

With the former being ineffective for big mappings and the latter often
requiring some comments at review time to get it right.
This commit introduces a util meant to make it easier to efficiently perform
such updates.
@Pirols Pirols force-pushed the master-add_update_table_from_dict-pied branch from dbf2889 to a3160cb Compare July 30, 2025 13:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants