I have my local LLM set up, and quality-wise it runs just as well as Sonnet 4.6; performance-wise it is slightly slower, but still faster than I can respond.
This is on a Strix Halo APU with 128 GB of unified memory, running the latest Qwen3.6 models via llama.cpp.
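For anyone curious, a setup like this typically boils down to one `llama-server` command. This is a hedged sketch, not my exact invocation: the model filename and context size are placeholders, and the flags shown are standard llama.cpp options.

```shell
# Sketch of serving a Qwen GGUF with llama.cpp on a unified-memory APU.
# Model path below is a placeholder; substitute whatever GGUF you downloaded.
llama-server \
  -m ./models/qwen3-q4_k_m.gguf \
  -ngl 99 \                        # offload all layers to the iGPU
  -c 16384 \                       # context window size
  --host 127.0.0.1 --port 8080     # local OpenAI-compatible endpoint
```

With 128 GB of unified memory the GPU can address far larger models than a typical discrete card, which is what makes this class of hardware interesting for local inference.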