Very cool new open-source LLM with these capabilities:
- Understanding diagrams, charts, and graphs
- Doing OCR on screens
- Outputting bounding boxes for the locations of objects on screens
- Answering UI-based questions
Very impressive, congrats to the Adept team and open-source contributors.
@naoto_shibata_morph @keita_mitsuhashi_morph the chart-understanding capabilities might be of interest.
Congratulations, Team Fuyu-8B, on your successful launch on Product Hunt. Your multimodal model is very impressive! As an enhancement, how about a feature that offers insights into the emotional context of an image, making image captioning more interactive and empathetic? Good luck moving forward!