Vision demo

Locate & describe

Take or upload a photo, then ask each model. LocateAnything draws boxes; Cosmos answers in text.

Photo

Locate prompts — what to find (one or more)

Cosmos prompt — the question

Mode

LocateAnything

Cosmos

Raw model output