to_datetime() ns precision inconsistencies between s and ns #15817

@mchwalisz

Code Sample, a copy-pastable example if possible

import pandas as pd
pd.options.display.float_format = '{:.20f}'.format
with open('example.csv', 'w') as f:
    f.write('''
full,ts,nano
1490195805.433502912,1490195805,433502912
1490195805.933358907,1490195805,933358907
1490195806.433445930,1490195806,433445930
1490195806.933351039,1490195806,933351039
''')
df = pd.read_csv('example.csv')
df['from two ints'] = pd.to_datetime(df['ts'] * 10**9 + df['nano'], unit='ns')
df['from float to s'] = pd.to_datetime(df['full'], unit='s')
df['from float^9 to ns'] = pd.to_datetime(df['full'] * 10**9, unit='ns')

print('Types:')
print(df.dtypes)
print('Data:')
print(df.iloc[0])

Output

Types:
full                         float64
ts                             int64
nano                           int64
from two ints         datetime64[ns]
from float to s       datetime64[ns]
from float^9 to ns    datetime64[ns]
dtype: object
Data:
full                 1490195805.43350267410278320312
ts                                        1490195805
nano                                       433502912
from two ints          2017-03-22 15:16:45.433502912
from float to s           2017-03-22 15:16:45.433503
from float^9 to ns     2017-03-22 15:16:45.433502720
Name: 0, dtype: object
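
The trailing `433502720` in the `from float^9 to ns` row is consistent with plain float64 arithmetic rather than anything specific to `to_datetime()`. The sketch below is a stdlib-only illustration (it assumes Python 3.9+ for `math.ulp`, and it takes the `full` value exactly as printed above); it only mimics the multiplication and does not reproduce what pandas does internally.

```python
import math

# Value of the `full` column as printed in the output above,
# i.e. the float64 that read_csv produced for the first row.
full = float("1490195805.43350267410278320312")

# Same arithmetic as df['full'] * 10**9 for a single element.
scaled = full * 10**9

# Gap between neighbouring float64 values near 1.49e18.
step = math.ulp(scaled)

print(int(scaled))  # integer nanoseconds the float actually holds
print(step)         # representable values are this many ns apart
```

Near 1.49e18 a float64 can only hold multiples of 256, so the nanosecond digits are lost before `to_datetime()` ever sees them.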

Problem description

I would expect all three conversions above to give the same result. The documentation states that epoch times will be rounded to the nearest nanosecond. I don't understand why converting from a float with unit='s' rounds differently from converting with unit='ns'. The float parsed from the CSV also looks wrong: the full column prints as 1490195805.43350267..., not the 1490195805.433502912 written in the file.
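
Part of the mismatch can be explained without pandas at all: a float64 holding roughly 1.49e9 seconds cannot resolve individual nanoseconds. The following is a minimal sketch using only the standard library (assuming Python 3.9+ for `math.ulp`); the variable names are illustrative and not taken from pandas.

```python
import math
from decimal import Decimal

text = "1490195805.433502912"   # first `full` value from example.csv
full = float(text)              # nearest float64, as parsed by Python

# Gap between neighbouring float64 values near 1.49e9 seconds.
spacing = math.ulp(full)

# Exact distance between the stored float and the decimal text.
error = abs(Decimal(full) - Decimal(text))

print(spacing)  # in seconds; about 238 ns
print(error)    # nonzero: the text is not exactly representable
```

Python's own parser picks the nearest representable value, so its error is at most half the spacing. The value printed in the output above is further away than that, which suggests the default `read_csv` float converter is not correctly rounded; `read_csv` accepts `float_precision='round_trip'`, which may help with the parsing step but cannot add precision that float64 does not have.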

I expect the whole issue to be connected to #7307.

Expected Output

Types:
full                         float64
ts                             int64
nano                           int64
from two ints         datetime64[ns]
from float to s       datetime64[ns]
from float^9 to ns    datetime64[ns]
dtype: object
Data:
full                            1490195805.433502912
ts                                        1490195805
nano                                       433502912
from two ints          2017-03-22 15:16:45.433502912
from float to s        2017-03-22 15:16:45.433502912
from float^9 to ns     2017-03-22 15:16:45.433502912
Name: 0, dtype: object
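
One way to get the expected nanosecond digits is to avoid float64 entirely and go from the decimal text to integer nanoseconds, which is what the `from two ints` column effectively does. The helper below is a hypothetical sketch (`epoch_text_to_ns` is not a pandas function) and assumes the timestamp is available as a string, for example by reading the column with `dtype={'full': str}`.

```python
from decimal import Decimal


def epoch_text_to_ns(text):
    """Convert a decimal epoch-seconds string to integer nanoseconds.

    Hypothetical helper for illustration; not part of pandas.
    """
    # Shift the decimal point nine places right, then round to an integer.
    return int(Decimal(text).scaleb(9).to_integral_value())


rows = ["1490195805.433502912", "1490195805.933358907"]
for row in rows:
    print(epoch_text_to_ns(row))
```

The resulting integers can be passed to `pd.to_datetime(..., unit='ns')` in the same way as `df['ts'] * 10**9 + df['nano']` in the code sample.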

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 2.7.12.final.0
python-bits: 64
OS: Linux
OS-release: 4.9.10-040910-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: None.None

pandas: 0.19.2
nose: None
pip: 9.0.1
setuptools: 34.3.2
Cython: 0.25.2
numpy: 1.10.4
scipy: 0.19.0
statsmodels: None
xarray: None
IPython: 5.3.0
sphinx: 1.4.8
patsy: None
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: None
tables: 3.3.0
numexpr: 2.6.2
matplotlib: 2.0.0
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: 4.5.3
html5lib: 0.999999999
httplib2: 0.9.1
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.9.5
boto: None
pandas_datareader: None
