# VLM-Lens

Probes and test cases for Qwen-VL and InternLM, with a focus on color naming, language gaps, and modality gaps in vision-language models.


VLM-Lens studies the behavior of vision-language models through targeted probes and evaluation cases. My contribution focused on building and analyzing examples for systems such as Qwen-VL and InternLM.

The project emphasizes several failure modes that are especially interesting from a linguistic and interpretability perspective: color-naming behavior, cross-lingual inconsistencies, and gaps between visual and textual competence (the modality gap).
