Commit 9c5c378

Merge branch 'main' into share-datetime-parsing-format-paths
2 parents 891ab4e + eff6566 commit 9c5c378

37 files changed, with 243 additions and 112 deletions

.github/actions/setup-conda/action.yml

Lines changed: 1 addition & 1 deletion
@@ -18,7 +18,7 @@ runs:
     - name: Set Arrow version in ${{ inputs.environment-file }} to ${{ inputs.pyarrow-version }}
       run: |
         grep -q ' - pyarrow' ${{ inputs.environment-file }}
-        sed -i"" -e "s/ - pyarrow<10/ - pyarrow=${{ inputs.pyarrow-version }}/" ${{ inputs.environment-file }}
+        sed -i"" -e "s/ - pyarrow/ - pyarrow=${{ inputs.pyarrow-version }}/" ${{ inputs.environment-file }}
         cat ${{ inputs.environment-file }}
       shell: bash
       if: ${{ inputs.pyarrow-version }}
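
To see why the sed pattern has to change together with the environment files, here is a minimal Python sketch that mimics both substitutions with `re.sub`. The environment file snippets and the `pin` helper are made up for illustration; only the two patterns come from the diff above.

```python
import re


def pin(env_text: str, pattern: str, version: str) -> str:
    """Replace the first match of `pattern` on each line, like sed's s/.../.../."""
    return "\n".join(
        re.sub(pattern, f" - pyarrow={version}", line, count=1)
        for line in env_text.splitlines()
    )


OLD_PATTERN = r" - pyarrow<10"  # pattern used before this commit
NEW_PATTERN = r" - pyarrow"     # pattern used after this commit

# Hypothetical environment files, with and without the "<10" cap.
capped = "dependencies:\n - psycopg2\n - pyarrow<10\n - pymysql"
uncapped = "dependencies:\n - psycopg2\n - pyarrow\n - pymysql"

pinned_old = pin(capped, OLD_PATTERN, "9")     # old pattern, old file: pinned
unpinned = pin(uncapped, OLD_PATTERN, "10")    # old pattern, new file: no match
pinned_new = pin(uncapped, NEW_PATTERN, "10")  # new pattern, new file: pinned
```

In this sketch the old pattern silently leaves the uncapped file untouched, which is the failure the new pattern avoids. Conversely, applying the new pattern to a file that still contained `pyarrow<10` would yield `pyarrow=10<10`, so the cap has to be dropped from the environment files in the same change.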

.github/workflows/ubuntu.yml

Lines changed: 1 addition & 1 deletion
@@ -29,7 +29,7 @@ jobs:
       matrix:
         env_file: [actions-38.yaml, actions-39.yaml, actions-310.yaml]
         pattern: ["not single_cpu", "single_cpu"]
-        pyarrow_version: ["7", "8", "9"]
+        pyarrow_version: ["7", "8", "9", "10"]
         include:
           - name: "Downstream Compat"
             env_file: actions-38-downstream_compat.yaml
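
As a rough sketch of what the extra matrix entry costs, the base matrix is the Cartesian product of its axes. The snippet below counts only the three axes visible in this hunk; it ignores any `include` or `exclude` entries elsewhere in the workflow, so the real job count may differ.

```python
from itertools import product

# Axes copied from the hunk above.
env_files = ["actions-38.yaml", "actions-39.yaml", "actions-310.yaml"]
patterns = ["not single_cpu", "single_cpu"]


def matrix_size(pyarrow_versions):
    """Number of combinations produced by the three matrix axes alone."""
    return len(list(product(env_files, patterns, pyarrow_versions)))


before = matrix_size(["7", "8", "9"])        # 3 * 2 * 3 combinations
after = matrix_size(["7", "8", "9", "10"])   # 3 * 2 * 4 combinations
```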

ci/deps/actions-310.yaml

Lines changed: 1 addition & 1 deletion
@@ -42,7 +42,7 @@ dependencies:
   - psycopg2
   - pymysql
   - pytables
-  - pyarrow<10
+  - pyarrow
   - pyreadstat
   - python-snappy
   - pyxlsb

ci/deps/actions-38-downstream_compat.yaml

Lines changed: 1 addition & 1 deletion
@@ -40,7 +40,7 @@ dependencies:
   - openpyxl
   - odfpy
   - psycopg2
-  - pyarrow<10
+  - pyarrow
   - pymysql
   - pyreadstat
   - pytables

ci/deps/actions-38.yaml

Lines changed: 1 addition & 1 deletion
@@ -40,7 +40,7 @@ dependencies:
   - odfpy
   - pandas-gbq
   - psycopg2
-  - pyarrow<10
+  - pyarrow
   - pymysql
   - pyreadstat
   - pytables

ci/deps/actions-39.yaml

Lines changed: 1 addition & 1 deletion
@@ -41,7 +41,7 @@ dependencies:
   - pandas-gbq
   - psycopg2
   - pymysql
-  - pyarrow<10
+  - pyarrow
   - pyreadstat
   - pytables
   - python-snappy

ci/deps/circle-38-arm64.yaml

Lines changed: 1 addition & 1 deletion
@@ -40,7 +40,7 @@ dependencies:
   - odfpy
   - pandas-gbq
   - psycopg2
-  - pyarrow<10
+  - pyarrow
   - pymysql
   # Not provided on ARM
   #- pyreadstat

doc/source/whatsnew/v2.0.0.rst

Lines changed: 4 additions & 1 deletion
@@ -466,7 +466,8 @@ Other API changes
 - :meth:`Index.astype` now allows casting from ``float64`` dtype to datetime-like dtypes, matching :class:`Series` behavior (:issue:`49660`)
 - Passing data with dtype of "timedelta64[s]", "timedelta64[ms]", or "timedelta64[us]" to :class:`TimedeltaIndex`, :class:`Series`, or :class:`DataFrame` constructors will now retain that dtype instead of casting to "timedelta64[ns]"; timedelta64 data with lower resolution will be cast to the lowest supported resolution "timedelta64[s]" (:issue:`49014`)
 - Passing ``dtype`` of "timedelta64[s]", "timedelta64[ms]", or "timedelta64[us]" to :class:`TimedeltaIndex`, :class:`Series`, or :class:`DataFrame` constructors will now retain that dtype instead of casting to "timedelta64[ns]"; passing a dtype with lower resolution for :class:`Series` or :class:`DataFrame` will be cast to the lowest supported resolution "timedelta64[s]" (:issue:`49014`)
-- Passing a ``np.datetime64`` object with non-nanosecond resolution to :class:`Timestamp` will retain the input resolution if it is "s", "ms", or "ns"; otherwise it will be cast to the closest supported resolution (:issue:`49008`)
+- Passing a ``np.datetime64`` object with non-nanosecond resolution to :class:`Timestamp` will retain the input resolution if it is "s", "ms", "us", or "ns"; otherwise it will be cast to the closest supported resolution (:issue:`49008`)
+- Passing a string in ISO-8601 format to :class:`Timestamp` will retain the resolution of the parsed input if it is "s", "ms", "us", or "ns"; otherwise it will be cast to the closest supported resolution (:issue:`49737`)
 - The ``other`` argument in :meth:`DataFrame.mask` and :meth:`Series.mask` now defaults to ``no_default`` instead of ``np.nan`` consistent with :meth:`DataFrame.where` and :meth:`Series.where`. Entries will be filled with the corresponding NULL value (``np.nan`` for numpy dtypes, ``pd.NA`` for extension dtypes). (:issue:`49111`)
 - Changed behavior of :meth:`Series.quantile` and :meth:`DataFrame.quantile` with :class:`SparseDtype` to retain sparse dtype (:issue:`49583`)
 - When creating a :class:`Series` with a object-dtype :class:`Index` of datetime objects, pandas no longer silently converts the index to a :class:`DatetimeIndex` (:issue:`39307`, :issue:`23598`)
@@ -807,6 +808,7 @@ Datetimelike
 - Bug in :func:`to_datetime` was throwing ``ValueError`` when parsing dates with ISO8601 format where some values were not zero-padded (:issue:`21422`)
 - Bug in :func:`to_datetime` was giving incorrect results when using ``format='%Y%m%d'`` and ``errors='ignore'`` (:issue:`26493`)
 - Bug in :func:`to_datetime` was failing to parse date strings ``'today'`` and ``'now'`` if ``format`` was not ISO8601 (:issue:`50359`)
+- Bug in :meth:`Timestamp.round` when the ``freq`` argument has zero-duration (e.g. "0ns") returning incorrect results instead of raising (:issue:`49737`)
 - Bug in :func:`to_datetime` was not raising ``ValueError`` when invalid format was passed and ``errors`` was ``'ignore'`` or ``'coerce'`` (:issue:`50266`)
 - Bug in :class:`DateOffset` was throwing ``TypeError`` when constructing with milliseconds and another super-daily argument (:issue:`49897`)
 -
@@ -839,6 +841,7 @@ Conversion
 - Bug in :meth:`Series.convert_dtypes` not converting dtype to nullable dtype when :class:`Series` contains ``NA`` and has dtype ``object`` (:issue:`48791`)
 - Bug where any :class:`ExtensionDtype` subclass with ``kind="M"`` would be interpreted as a timezone type (:issue:`34986`)
 - Bug in :class:`.arrays.ArrowExtensionArray` that would raise ``NotImplementedError`` when passed a sequence of strings or binary (:issue:`49172`)
+- Bug in :meth:`Series.astype` raising ``pyarrow.ArrowInvalid`` when converting from a non-pyarrow string dtype to a pyarrow numeric type (:issue:`50430`)
 - Bug in :func:`to_datetime` was not respecting ``exact`` argument when ``format`` was an ISO8601 format (:issue:`12649`)
 - Bug in :meth:`TimedeltaArray.astype` raising ``TypeError`` when converting to a pyarrow duration type (:issue:`49795`)
 -
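
The "retain the resolution if it is supported, otherwise cast to the closest supported one" rule from the new whatsnew entry can be sketched in plain Python. This is an illustration only, not the pandas implementation: the helper names are hypothetical, and inferring the unit from the number of fractional-second digits is an assumption about how the parser picks its best unit.

```python
import re

# numpy datetime64 units, ordered from coarsest to finest.
UNITS = ["Y", "M", "W", "D", "h", "m", "s", "ms", "us", "ns", "ps", "fs", "as"]


def closest_supported_unit(unit: str) -> str:
    """Clamp a unit into the supported range "s".."ns"."""
    coarsest, finest = UNITS.index("s"), UNITS.index("ns")
    return UNITS[min(max(UNITS.index(unit), coarsest), finest)]


def inferred_unit(iso_string: str) -> str:
    """Guess the unit from the fractional-second digits (assumption)."""
    match = re.search(r"\.(\d+)", iso_string)
    digits = len(match.group(1)) if match else 0
    if digits == 0:
        # No fraction: treat a time component as seconds, a bare date as days.
        raw = "s" if ":" in iso_string else "D"
    else:
        # 1-3 digits -> ms, 4-6 -> us, 7-9 -> ns, 10-12 -> ps, and so on.
        raw = UNITS[UNITS.index("s") + min((digits + 2) // 3, 6)]
    return closest_supported_unit(raw)
```

Under these assumptions a bare date such as `"2020-01-01"` is coarser than seconds and is clamped up to `"s"`, while a string with ten fractional digits is finer than nanoseconds and is clamped down to `"ns"`.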

environment.yml

Lines changed: 1 addition & 1 deletion
@@ -43,7 +43,7 @@ dependencies:
   - odfpy
   - py
   - psycopg2
-  - pyarrow<10
+  - pyarrow
   - pymysql
   - pyreadstat
   - pytables

pandas/_libs/tslibs/conversion.pyx

Lines changed: 33 additions & 18 deletions
@@ -405,7 +405,8 @@ cdef _TSObject convert_datetime_to_tsobject(


 cdef _TSObject _create_tsobject_tz_using_offset(npy_datetimestruct dts,
-                                                int tzoffset, tzinfo tz=None):
+                                                int tzoffset, tzinfo tz=None,
+                                                NPY_DATETIMEUNIT reso=NPY_FR_ns):
     """
     Convert a datetimestruct `dts`, along with initial timezone offset
     `tzoffset` to a _TSObject (with timezone object `tz` - optional).
@@ -416,6 +417,7 @@ cdef _TSObject _create_tsobject_tz_using_offset(npy_datetimestruct dts,
     tzoffset: int
     tz : tzinfo or None
         timezone for the timezone-aware output.
+    reso : NPY_DATETIMEUNIT, default NPY_FR_ns

     Returns
     -------
@@ -427,16 +429,19 @@ cdef _TSObject _create_tsobject_tz_using_offset(npy_datetimestruct dts,
         datetime dt
         Py_ssize_t pos

-    value = npy_datetimestruct_to_datetime(NPY_FR_ns, &dts)
+    value = npy_datetimestruct_to_datetime(reso, &dts)
     obj.dts = dts
     obj.tzinfo = timezone(timedelta(minutes=tzoffset))
-    obj.value = tz_localize_to_utc_single(value, obj.tzinfo)
+    obj.value = tz_localize_to_utc_single(
+        value, obj.tzinfo, ambiguous=None, nonexistent=None, creso=reso
+    )
+    obj.creso = reso
     if tz is None:
-        check_overflows(obj, NPY_FR_ns)
+        check_overflows(obj, reso)
         return obj

     cdef:
-        Localizer info = Localizer(tz, NPY_FR_ns)
+        Localizer info = Localizer(tz, reso)

     # Infer fold from offset-adjusted obj.value
     # see PEP 495 https://www.python.org/dev/peps/pep-0495/#the-fold-attribute
@@ -454,6 +459,7 @@ cdef _TSObject _create_tsobject_tz_using_offset(npy_datetimestruct dts,
                   obj.dts.us, obj.tzinfo, fold=obj.fold)
     obj = convert_datetime_to_tsobject(
         dt, tz, nanos=obj.dts.ps // 1000)
+    obj.ensure_reso(reso)  # TODO: more performant to get reso right up front?
     return obj


@@ -490,7 +496,7 @@ cdef _TSObject _convert_str_to_tsobject(object ts, tzinfo tz, str unit,
         int out_local = 0, out_tzoffset = 0, string_to_dts_failed
         datetime dt
         int64_t ival
-        NPY_DATETIMEUNIT out_bestunit
+        NPY_DATETIMEUNIT out_bestunit, reso

     if len(ts) == 0 or ts in nat_strings:
         ts = NaT
@@ -513,19 +519,26 @@ cdef _TSObject _convert_str_to_tsobject(object ts, tzinfo tz, str unit,
             &out_tzoffset, False
         )
         if not string_to_dts_failed:
+            reso = get_supported_reso(out_bestunit)
             try:
-                check_dts_bounds(&dts, NPY_FR_ns)
+                check_dts_bounds(&dts, reso)
                 if out_local == 1:
-                    return _create_tsobject_tz_using_offset(dts,
-                                                            out_tzoffset, tz)
+                    return _create_tsobject_tz_using_offset(
+                        dts, out_tzoffset, tz, reso
+                    )
                 else:
-                    ival = npy_datetimestruct_to_datetime(NPY_FR_ns, &dts)
+                    ival = npy_datetimestruct_to_datetime(reso, &dts)
                     if tz is not None:
                         # shift for _localize_tso
-                        ival = tz_localize_to_utc_single(ival, tz,
-                                                         ambiguous="raise")
-
-                    return convert_to_tsobject(ival, tz, None, False, False)
+                        ival = tz_localize_to_utc_single(
+                            ival, tz, ambiguous="raise", nonexistent=None, creso=reso
+                        )
+                    obj = _TSObject()
+                    obj.dts = dts
+                    obj.value = ival
+                    obj.creso = reso
+                    maybe_localize_tso(obj, tz, obj.creso)
+                    return obj

             except OutOfBoundsDatetime:
                 # GH#19382 for just-barely-OutOfBounds falling back to dateutil
@@ -538,10 +551,12 @@ cdef _TSObject _convert_str_to_tsobject(object ts, tzinfo tz, str unit,
                 pass

     try:
-        dt = parse_datetime_string(ts, dayfirst=dayfirst,
-                                   yearfirst=yearfirst)
-    except (ValueError, OverflowError):
-        raise ValueError("could not convert string to Timestamp")
+        # TODO: use the one that returns reso
+        dt = parse_datetime_string(
+            ts, dayfirst=dayfirst, yearfirst=yearfirst
+        )
+    except (ValueError, OverflowError) as err:
+        raise ValueError("could not convert string to Timestamp") from err

     return convert_datetime_to_tsobject(dt, tz)
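
The last hunk switches the fallback path to `raise ... from err`. A minimal pure-Python sketch of what that buys the caller is shown here; `parse_int` is a hypothetical stand-in for the string parser, not part of pandas.

```python
def parse_int(text: str) -> int:
    """Hypothetical parser that re-raises with the original error attached."""
    try:
        return int(text)
    except (ValueError, OverflowError) as err:
        # "from err" stores the original exception on __cause__, so the
        # traceback shows both the low-level failure and the friendly message.
        raise ValueError("could not convert string to int") from err


try:
    parse_int("not-a-number")
except ValueError as exc:
    chained = exc  # keep a reference so the chain can be inspected
```

With explicit chaining the traceback reads "The above exception was the direct cause of the following exception", rather than the implicit "During handling of the above exception, another exception occurred", and the underlying error stays reachable through `__cause__`.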
