Are there plans to support this [new SOTA open source vision model](https://sharegpt4v.github.io/)? Despite its compact size, the model is able to extract text from images with incredible accuracy.