AMD Ryzen AI Max+ 395, also known as Strix Halo, is currently the most powerful x86 APU available and offers a significant performance boost over other options. It features 16 Zen 5 CPU cores, over 50 peak AI TOPS from its XDNA 2 NPU, and a large integrated GPU with 40 AMD RDNA 3.5 Compute Units. This makes it a major upgrade for high-end thin-and-light devices. The Ryzen AI Max+ 395 is available with unified memory options ranging from 32 GB to 128 GB, with up to 96 GB of that assignable as VRAM via AMD Variable Graphics Memory.

The Ryzen AI Max+ 395 performs especially well in consumer AI workloads such as LM Studio, which is powered by llama.cpp. LM Studio is becoming a popular choice for running language models on your own device, even for users with no technical background, and it makes new AI text and vision models easy to use right away.
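To make the "no technical background" point concrete: LM Studio ships a local HTTP server that speaks the OpenAI chat-completions API, so any standard client code can talk to a model running on the laptop. The sketch below assumes the default local address and a placeholder model identifier; substitute whatever model you have loaded.

```python
import requests

# LM Studio exposes an OpenAI-compatible server (default: http://localhost:1234).
# The model name below is a placeholder -- use the identifier of the loaded model.
response = requests.post(
    "http://localhost:1234/v1/chat/completions",
    json={
        "model": "llama-3.2-3b-instruct",
        "messages": [{"role": "user", "content": "Summarize what an APU is in one sentence."}],
        "temperature": 0.7,
    },
    timeout=120,
)
print(response.json()["choices"][0]["message"]["content"])
```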

The new AMD Ryzen AI Max series, built on the Strix Halo platform, continues to lead in LM Studio performance.

As a primer: a model's size is dictated by its parameter count and the precision of its weights. Generally speaking, doubling the number of parameters (on the same architecture) or doubling the precision also doubles the model size. Most competitors' current-generation offerings in this space max out at 32 GB of on-package memory, which leaves roughly 16 GB of shared graphics memory for running large language models.
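A quick back-of-the-envelope sketch of that rule of thumb (weights only; it ignores runtime overhead such as the KV cache and activations):

```python
def model_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Rough weight-only size estimate: parameter count x precision.

    Ignores runtime overhead such as the KV cache and activations.
    """
    bytes_total = params_billions * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9

# Doubling parameters or precision doubles the size:
print(model_size_gb(7, 4))    # ~3.5 GB
print(model_size_gb(14, 4))   # ~7.0 GB
print(model_size_gb(7, 8))    # ~7.0 GB
```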

Benchmarking Text and Vision Language Models in LM Studio 

For this comparison, we used the Asus ROG Flow Z13 with 64 GB of unified memory. We limited the language model size to 16 GB so the same models would also fit on a competitor's 32 GB laptop. We measured latency as time to first token (how long the model takes to start responding) and throughput as tokens per second.
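For readers who want to reproduce these two metrics themselves, here is a rough sketch of one way to measure them against LM Studio's OpenAI-compatible streaming endpoint. The URL and model name are assumptions about a default local setup, and counting one streamed chunk as one token is an approximation.

```python
import json
import time
import requests

start = time.perf_counter()
first_token_at = None
token_count = 0

with requests.post(
    "http://localhost:1234/v1/chat/completions",
    json={
        "model": "llama-3.2-3b-instruct",  # assumed: whichever model is loaded
        "messages": [{"role": "user", "content": "Explain unified memory briefly."}],
        "stream": True,
    },
    stream=True,
    timeout=300,
) as resp:
    for line in resp.iter_lines():
        # Server-sent events arrive as lines of the form "data: {...}".
        if not line or not line.startswith(b"data: "):
            continue
        payload = line[len(b"data: "):]
        if payload == b"[DONE]":
            break
        delta = json.loads(payload)["choices"][0]["delta"]
        if delta.get("content"):
            if first_token_at is None:
                first_token_at = time.perf_counter()  # first visible output
            token_count += 1  # approximation: one streamed chunk ~ one token

if first_token_at is not None:
    total = time.perf_counter() - first_token_at
    print(f"Time to first token: {first_token_at - start:.2f} s")
    print(f"Throughput: {token_count / max(total, 1e-6):.1f} tokens/s")
```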

The results show that the Asus ROG Flow Z13, which pairs the integrated Radeon 8060S with 256 GB/s of memory bandwidth, easily achieves 2.2 times the token throughput of the Intel Arc 140V.

The performance uplift is very consistent across model types, whether you are running chain-of-thought models such as the DeepSeek R1 distills or standard models such as Microsoft Phi 4, and across parameter sizes.

In time-to-first-token benchmarks, the AMD Ryzen AI Max+ 395 processor is up to 4x faster than the competition on smaller models like Llama 3.2 3B Instruct.

For larger models in the 7-to-8-billion-parameter range, such as DeepSeek R1 Distill Qwen 7B and DeepSeek R1 Distill Llama 8B, the Ryzen AI Max+ 395 is up to 9.1 times faster. With 14-billion-parameter models, about the largest that fit on a standard 32 GB laptop, the Asus ROG Flow Z13 is up to 12.2 times faster than a laptop with an Intel Core Ultra 258V, putting it more than an order of magnitude ahead of the competition.

The larger the LLM, the bigger the AMD Ryzen AI Max+ 395 processor's advantage when responding to your queries. Whether you are chatting with the model or giving it large summarization tasks with thousands of tokens, the AMD system responds much more quickly, and the advantage grows as the prompt gets longer: the more demanding the task, the greater the speed difference.

Vision models are also well served. IBM Granite Vision is one example, and the recently launched Google Gemma 3 family of models is another; both bring highly capable vision features to next-generation AMD AI PCs, and both run performantly on an AMD Ryzen AI Max+ 395 processor.

An interesting point to note here: when running vision models, the time to first token also includes how long the model takes to analyze your image. The Ryzen AI Max+ 395 is faster in IBM Granite Vision 3.2 3B, up to 4.6x faster in Google Gemma 3 4B, and up to 6x faster in Google Gemma 3 12B. Because the Asus ROG Flow Z13 tested here has the 64 GB memory option, it can also effortlessly run the Google Gemma 3 27B vision model, currently considered SOTA (state-of-the-art) in vision.
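Sending an image to one of these vision models uses the same OpenAI-style chat format, with the image embedded as a base64 data URL. The sketch below assumes the default LM Studio endpoint, a placeholder model name, and a local JPEG file.

```python
import base64
import requests

# Encode a local image as a base64 data URL (file path is an assumption).
with open("photo.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

response = requests.post(
    "http://localhost:1234/v1/chat/completions",
    json={
        "model": "gemma-3-27b-it",  # placeholder: use your loaded vision model
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
            ],
        }],
    },
    timeout=300,
)
print(response.json()["choices"][0]["message"]["content"])
```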

Another example is running DeepSeek R1 Distill Qwen 32B at 6-bit precision, whereas 4-bit is the industry standard for most users; coding often benefits from the accuracy of higher precision. With this setup, you can code a gaming classic in about 5 minutes.
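The arithmetic from the primer above shows why that 6-bit 32B configuration is out of reach for 16 GB of shared graphics memory but comfortable within the 96 GB Variable Graphics Memory budget; a minimal weight-only fit check:

```python
def fits_in_vram(params_billions: float, bits: float, vram_gb: float) -> bool:
    """Weight-only check; real usage adds KV cache and runtime overhead."""
    weights_gb = params_billions * 1e9 * bits / 8 / 1e9
    return weights_gb <= vram_gb

# A 32B model at 6-bit is roughly 24 GB of weights:
print(fits_in_vram(32, 6, 96))   # True  -- Ryzen AI Max+ 395 with 96 GB VGM
print(fits_in_vram(32, 6, 16))   # False -- ~16 GB shared memory on 32 GB rivals
```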

Source: AMD Ryzen™ AI MAX+ 395 Processor: Breakthrough AI Performance in Thin and Light 
