Blaze support #10

@s-celles

Hello,

Blaze (http://blaze.pydata.org/) is very efficient when you want to connect to a database and
retrieve data from a very long table.

It would be nice if gtabview (and maybe tabview too) could display a blaze.interactive.InteractiveSymbol (named dat here, or sometimes ds).

http://blaze.pydata.org/en/latest/quickstart.html
http://blaze.pydata.org/en/latest/rosetta-pandas.html

With Blaze, you can connect to a database table using

import blaze as bz
table_uri = 'dialect+driver://user:password@host:port/databasename::tablename'
dat = bz.Data(table_uri)

This is much more efficient than

import pandas as pd
import sqlalchemy
db_uri = 'dialect+driver://user:password@host:port/databasename'
engine = sqlalchemy.create_engine(db_uri)
query = 'SELECT * FROM tablename'
df = pd.read_sql(query, con=engine)

which will retrieve the whole table into memory.

Passing a table_uri to gtabview would then display part of the table content, without retrieving the whole table into memory.
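To make the request concrete, here is a minimal sketch of what "display part of the table" could mean for a viewer: fetch only the rows that are currently visible. It is not Blaze's API and not gtabview's code. It uses the standard-library sqlite3 module as a stand-in backend, and the helper `fetch_window`, the table name `weather`, and its columns are hypothetical.

```python
import sqlite3


def fetch_window(conn, table, offset, limit):
    """Return only `limit` rows starting at `offset` (hypothetical helper)."""
    # The table name is interpolated into the query, so it must come from
    # trusted code; offset and limit are passed as bound parameters.
    query = f'SELECT * FROM "{table}" ORDER BY rowid LIMIT ? OFFSET ?'
    return conn.execute(query, (limit, offset)).fetchall()


# Build a small in-memory table to stand in for a long remote one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE weather (ts INTEGER, temp REAL)")
conn.executemany(
    "INSERT INTO weather VALUES (?, ?)",
    [(i, 10.0 + i % 5) for i in range(1000)],
)

# A viewer showing rows 200..219 would only request those 20 rows.
visible = fetch_window(conn, "weather", offset=200, limit=20)
```

With a Blaze-backed table the slicing would presumably be expressed on the Blaze object instead of in SQL, but the principle is the same: the viewer asks for a window, not the whole table.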

Blaze comes with a very convenient tool named odo http://odo.readthedocs.org/

DataFrames can be constructed chunk by chunk using `odo`:

import pandas as pd
from blaze import odo, chunks

# ds is the Blaze data object (called dat above)
chunksize = 500
for chunk in odo(ds, chunks(pd.DataFrame), chunksize=chunksize):
    print(chunk)

With odo(..., chunks(pd.DataFrame)), you only have one chunk in memory at a time.
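The one-chunk-at-a-time behaviour can also be sketched without odo, as a generator over a DB-API cursor. This is only an illustration of the idea, assuming a sqlite3 in-memory table as the data source; `iter_chunks` and the `weather` table are hypothetical names, not part of Blaze or odo.

```python
import sqlite3


def iter_chunks(cursor, chunksize):
    """Yield lists of at most `chunksize` rows (hypothetical helper)."""
    while True:
        rows = cursor.fetchmany(chunksize)
        if not rows:
            break
        # Only this list of rows is handed to the caller at a time.
        yield rows


# Small in-memory table standing in for a long one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE weather (ts INTEGER, temp REAL)")
conn.executemany(
    "INSERT INTO weather VALUES (?, ?)",
    [(i, 10.0) for i in range(1200)],
)

cursor = conn.execute("SELECT ts, temp FROM weather ORDER BY ts")
chunk_sizes = [len(chunk) for chunk in iter_chunks(cursor, chunksize=500)]
```

Because the generator is lazy, a consumer that processes and discards each chunk never holds more than `chunksize` rows from the result set at once.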

If you don't have a long table to try this with, I can provide a fairly large MySQL table of Poitiers weather conditions (from 2011-03-07 to 2015-06-02, every 10 minutes, more than 200,000 rows).

Kind regards
