Meta's largest multimodal model, with 90B parameters. Accepts text and image inputs and offers strong reasoning capabilities.