VLM

  • This post explores ZeroGUI, an online learning framework that trains GUI agents without manual data annotation. It uses Vision-Language Models for automated task generation and reward estimation, and achieves significant performance improvements.
  • This blog post from Fireworks.ai introduces Document Inlining, a new compound AI system designed to improve how Large Language Models (LLMs) interact with non-textual data such as PDFs and images.