Tech Xplore
Visual abilities of language models found to be lacking depth
Pooyan Rahmanzadehgervi, Logan Bolton, Anh Totti Nguyen and Mohammad Reza Taesiri have tested four of the most popular vision language models (VLMs), GPT-4o, Gemini-1.5 Pro, Claude-3 Sonnet, and Claude-3.5 Sonnet, on ...
5 days ago