Hacker News

It is much faster though. On my M1 Max, describing a picture (a quick way to build up a pretty large context):

Qwen 3.6 35b a3b: 34 tok/sec

Qwen 3.5 27b: 10 tok/sec

Qwen 3.5 35b a3b: doesn't support image input




I've been using Qwen 3.5 35B-A3B with images as input, so I suspect you didn't include the vision part of the model during testing (I use llama.cpp and learned I needed to load the separate mmproj file).
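For reference, this is roughly what the invocation looks like with llama.cpp's multimodal CLI (filenames here are placeholders; the key point is passing the separate mmproj projector alongside the main model weights):

```shell
# Both the main model GGUF and the separate multimodal projector
# (mmproj) must be passed, or the model runs text-only.
# Filenames below are placeholders for whatever quant you downloaded.
llama-mtmd-cli \
  -m Qwen3.5-35B-A3B.gguf \
  --mmproj mmproj-Qwen3.5-35B-A3B.gguf \
  --image photo.jpg \
  -p "Describe this picture."
```

The mmproj file is usually published next to the main GGUF on the same model page; without it there is nothing to encode the image into embeddings the language model can consume.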

What is the quantization level of your Qwen 3.6 35b model?


