Hello,
Blaze (http://blaze.pydata.org/) is very efficient when you want to connect to a database and
retrieve data from a very long table.
It would be nice if `gtabview` (and maybe `tabview` also) could display a `blaze.interactive.InteractiveSymbol` (named `dat` here, or sometimes `ds`).
http://blaze.pydata.org/en/latest/quickstart.html
http://blaze.pydata.org/en/latest/rosetta-pandas.html
With Blaze, you can connect to a database table using
import blaze as bz
table_uri = 'dialect+driver://user:password@host:port/databasename::tablename'
dat = bz.Data(table_uri)
This is much more efficient than
import pandas as pd
import sqlalchemy
db_uri = 'dialect+driver://user:password@host:port/databasename'
engine = sqlalchemy.create_engine(db_uri)
query = 'SELECT * FROM tablename'
df = pd.read_sql(query, con=engine)
which will retrieve the whole table into memory.
Passing a table_uri to gtabview could then display part of the table content (without retrieving the whole table into memory).
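To illustrate what "displaying part of the table" could mean in practice, here is a minimal sketch that fetches only one window of rows with `LIMIT`/`OFFSET`. It uses the standard-library `sqlite3` module and an in-memory table as a stand-in for a long remote table. The function `fetch_window`, the `weather` table, and its columns are hypothetical names chosen for this example; this is not how Blaze or gtabview work internally.

```python
import sqlite3


def fetch_window(conn, table, order_col, offset, limit):
    """Return `limit` rows of `table`, ordered by `order_col`, starting at `offset`.

    Hypothetical helper: only the requested window is transferred, not the whole table.
    """
    # Identifiers cannot be bound as SQL parameters, so quote them explicitly.
    def quote(name):
        return '"' + name.replace('"', '""') + '"'

    sql = "SELECT * FROM {} ORDER BY {} LIMIT ? OFFSET ?".format(
        quote(table), quote(order_col)
    )
    return conn.execute(sql, (limit, offset)).fetchall()


# In-memory table standing in for a long database table (assumed schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE weather (id INTEGER PRIMARY KEY, temp REAL)")
conn.executemany(
    "INSERT INTO weather (id, temp) VALUES (?, ?)",
    [(i, 10.0 + i % 5) for i in range(1000)],
)

# Fetch only the 50 rows a viewer would currently show.
window = fetch_window(conn, "weather", "id", offset=200, limit=50)
print(len(window), window[0])
```

A viewer could call such a helper again with a new offset each time the user scrolls, so memory use depends on the window size rather than on the table length.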
Blaze comes with a very convenient tool named odo
http://odo.readthedocs.org/
DataFrames can be constructed chunk by chunk using `odo`:
from blaze import odo, chunks
chunksize = 500
for chunk in odo(ds, chunks(pd.DataFrame), chunksize=chunksize):
    print(chunk)
With `odo(..., chunks(pd.DataFrame))`, you only have one chunk in memory at a time.
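The same "one chunk in memory at a time" idea can be sketched without Blaze or odo, using only the standard-library `sqlite3` module and `cursor.fetchmany`. The generator `iter_chunks` and the `weather` table are hypothetical names for this illustration, and the in-memory database only stands in for a real long table.

```python
import sqlite3


def iter_chunks(conn, query, chunksize=500):
    """Yield lists of at most `chunksize` rows from `query`, one list at a time."""
    cur = conn.execute(query)
    while True:
        rows = cur.fetchmany(chunksize)
        if not rows:
            # The cursor is exhausted, so stop iterating.
            break
        yield rows


# In-memory table standing in for a long database table (assumed schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE weather (id INTEGER PRIMARY KEY, temp REAL)")
conn.executemany(
    "INSERT INTO weather (id, temp) VALUES (?, ?)",
    [(i, 10.0) for i in range(1200)],
)

# Only one chunk is held by the loop at any moment.
sizes = []
for chunk in iter_chunks(conn, "SELECT * FROM weather ORDER BY id", chunksize=500):
    sizes.append(len(chunk))
print(sizes)
```

Each iteration releases the previous chunk before fetching the next one, which is the property that makes chunked access attractive for a table viewer.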
If you don't have a long table to test with, I can provide a fairly large MySQL table of Poitiers weather conditions (one record every 10 minutes from 2011-03-07 to 2015-06-02, more than 200,000 rows).
Kind regards