BUG on main: DeprecationWarning triggered by internal read_orc/parquet code #56171

@twoertwein

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd

# any call to
pd.read_orc(...)
# and any call to
pd.read_parquet(...)
# triggers:
# DeprecationWarning: Passing a BlockManager to DataFrame is deprecated and will raise in a future version. Use public APIs instead.

Issue Description

The above happens only on main.

From read_parquet

../../.cache/pypoetry/virtualenvs/pandas-stubs-DrIM1v70-py3.11/lib/python3.11/site-packages/pandas/io/parquet.py:671: in read_parquet
    return impl.read(
../../.cache/pypoetry/virtualenvs/pandas-stubs-DrIM1v70-py3.11/lib/python3.11/site-packages/pandas/io/parquet.py:280: in read
    result = pa_table.to_pandas(**to_pandas_kwargs)
pyarrow/array.pxi:884: in pyarrow.lib._PandasConvertible.to_pandas
    ???
pyarrow/table.pxi:4196: in pyarrow.lib.Table._to_pandas
    ???
pyarrow/pandas-shim.pxi:112: in pyarrow.lib._PandasAPIShim.data_frame
    ???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <[AttributeError("'DataFrame' object has no attribute '_mgr'") raised in repr()] DataFrame object at 0x7fdf70635210>
data = BlockManager
Items: Index(['a', 'b'], dtype='object')
Axis 1: Index([0, 1, 2], dtype='int64')
NumpyBlock: slice(0, 1, 1), 1 x 3, dtype: int64
NumpyBlock: slice(1, 2, 1), 1 x 3, dtype: float64
index = None, columns = None, dtype = None, copy = None

    def __init__(
        self,
        data=None,
        index: Axes | None = None,
        columns: Axes | None = None,
        dtype: Dtype | None = None,
        copy: bool | None = None,
    ) -> None:
        allow_mgr = False
        if dtype is not None:
            dtype = self._validate_dtype(dtype)

        if isinstance(data, DataFrame):
            data = data._mgr
            allow_mgr = True
            if not copy:
                # if not copying data, ensure to still return a shallow copy
                # to avoid the result sharing the same Manager
                data = data.copy(deep=False)

        if isinstance(data, (BlockManager, ArrayManager)):
            if not allow_mgr:
                # GH#52419
>               warnings.warn(
                    f"Passing a {type(data).__name__} to {type(self).__name__} "
                    "is deprecated and will raise in a future version. "
                    "Use public APIs instead.",
                    DeprecationWarning,
                    stacklevel=1,  # bump to 2 once pyarrow 15.0 is released with fix
                )
E               DeprecationWarning: Passing a BlockManager to DataFrame is deprecated and will raise in a future version. Use public APIs instead.
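
The `stacklevel=1` in the traceback above means the warning is attributed to the `warnings.warn` line inside `DataFrame.__init__` itself rather than to whatever called the constructor. A minimal standard-library sketch of that difference follows; `_construct` and `reported_line` are hypothetical stand-ins for illustration, not pandas or pyarrow code.

```python
import warnings


def _construct(stacklevel: int) -> None:
    # Hypothetical stand-in for the DataFrame.__init__ code path that warns.
    warnings.warn(
        "Passing a BlockManager to DataFrame is deprecated",
        DeprecationWarning,
        stacklevel=stacklevel,
    )


def reported_line(stacklevel: int) -> int:
    """Return the line number the recorded warning is attributed to."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        _construct(stacklevel)  # the "caller" of the warning code
    return caught[0].lineno


# stacklevel=1: attributed to the warnings.warn call inside _construct
line_level_1 = reported_line(1)
# stacklevel=2: attributed to the call site inside reported_line
line_level_2 = reported_line(2)
print(line_level_1, line_level_2)
```

With `stacklevel=1` the reported location stays inside the library function, so filters keyed on the calling module would not distinguish internal callers (such as the pyarrow shim in the traceback) from user code.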

xref pandas-dev/pandas-stubs#819

Expected Behavior

No warning
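
Until the internal call sites stop triggering it, a test suite that escalates warnings to errors could filter this one message. The sketch below uses only the standard library and is an assumption about a possible workaround, not a recommended fix; `emit` is a hypothetical helper that merely simulates the warning text from this report.

```python
import warnings

# Prefix of the message quoted in this issue; used as a regex by the filter.
MESSAGE_PREFIX = "Passing a BlockManager to DataFrame is deprecated"


def emit() -> None:
    # Hypothetical helper: simulates the warning raised via read_parquet/read_orc.
    warnings.warn(
        MESSAGE_PREFIX + " and will raise in a future version. "
        "Use public APIs instead.",
        DeprecationWarning,
    )


def count_warnings(ignore: bool) -> int:
    """Count the warnings recorded when emit() runs, optionally filtered."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        if ignore:
            # Inserted ahead of the "always" filter, so it takes precedence.
            warnings.filterwarnings(
                "ignore",
                message=r"Passing a (BlockManager|ArrayManager) to",
                category=DeprecationWarning,
            )
        emit()
    return len(caught)


unfiltered = count_warnings(ignore=False)
filtered = count_warnings(ignore=True)
print(unfiltered, filtered)
```

The `message` argument is matched against the start of the warning text, so the filter stays narrow and leaves other deprecation warnings visible.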

Installed Versions

On main

Labels: Bug, Needs Triage
