Amazon's highly capable multimodal model, balancing accuracy, speed, and cost. Processes text and images with a 300K-token context window.