Alibaba's Qwen3.5 native vision-language model with 122B total / 10B active parameters using a hybrid linear-attention sparse mixture-of-experts architecture.