Open
Labels
DataScience: Some issue about the application in data science
DiDi: The issue publisher is from DiDi
Description
Hi, I ran into some issues while developing SQLFlow models:
- Analysts usually manipulate data with a DataFrame and pass it directly to the Keras model, which is convenient for debugging. SQLFlow tf-codegen uses a dataset instead, which adds a learning cost.
- Debugging against SQLFlow is cumbersome. To debug a model configured under SQLFlow models locally, you have to implement a train.py yourself, including reading data, defining feature columns, and so on. However, the locally written train.py and the train.py generated by SQLFlow do not always behave consistently.
- An analysis task usually consists of feature engineering -> data preprocessing -> model training (prediction). At present, the model zoo covers only the last step. In practice, sharing a model across operations means sharing the entire data processing chain. I hope SQLFlow will also support custom data preprocessing and include it in the design of the model zoo.
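
To make the last point more concrete, here is a rough sketch of what I mean by sharing the whole chain rather than only the model. It is plain Python and does not use any SQLFlow API. `SharedPipeline`, `add_speed`, `scale_speed`, `mean_speed_model`, and the `distance`/`duration` columns are all hypothetical names made up for illustration, and the steps are kept stateless for brevity.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Row = Dict[str, float]
Step = Callable[[List[Row]], List[Row]]


@dataclass
class SharedPipeline:
    """Hypothetical bundle of ordered data-processing steps (not a SQLFlow class)."""
    steps: List[Step] = field(default_factory=list)

    def transform(self, rows: List[Row]) -> List[Row]:
        # Apply every step in order, so local debugging and any generated
        # code would see exactly the same processed data.
        for step in self.steps:
            rows = step(rows)
        return rows

    def fit(self, rows: List[Row], train_fn: Callable[[List[Row]], float]) -> float:
        # Training always goes through the same transform chain.
        return train_fn(self.transform(rows))


def add_speed(rows: List[Row]) -> List[Row]:
    # Feature engineering: derive a new column from two raw columns.
    return [{**r, "speed": r["distance"] / r["duration"]} for r in rows]


def scale_speed(rows: List[Row]) -> List[Row]:
    # Preprocessing: min-max scale the derived column to [0, 1].
    values = [r["speed"] for r in rows]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero when all values are equal
    return [{**r, "speed": (r["speed"] - lo) / span} for r in rows]


def mean_speed_model(rows: List[Row]) -> float:
    # Stand-in for model training: the "model" is just the mean of one feature.
    return sum(r["speed"] for r in rows) / len(rows)


pipeline = SharedPipeline(steps=[add_speed, scale_speed])
raw = [
    {"distance": 10.0, "duration": 2.0},
    {"distance": 30.0, "duration": 3.0},
    {"distance": 45.0, "duration": 3.0},
]
processed = pipeline.transform(raw)
model = pipeline.fit(raw, mean_speed_model)
```

If something like this pipeline object were the unit published to the model zoo, whoever reuses the model would get the feature engineering and preprocessing along with it, instead of having to re-implement them in a separate train.py.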