
NVIDIA’s New AI: A Revolution...For Free!
AI Summary
NVIDIA has released Nemotron 3 Super, a free and open-source AI assistant, together with a 51-page research paper detailing how it was built and what data it was trained on. This is a significant departure from most proprietary AI systems, which require subscriptions and keep their internal workings and training data undisclosed. Nemotron 3 Super is a 120-billion-parameter model trained on 25 trillion tokens. Its intelligence level is comparable to that of leading closed frontier models from about a year and a half ago, which cost billions to train and were kept secret.
Nemotron 3 Super performs well in most tests, matching some of the best open models, though it lags slightly in a few areas. A notable innovation is its speed. The model comes in two versions, BF16 and NVFP4, with similar accuracy. However, the NVFP4 version is approximately 3.5 times faster than the BF16 version and up to 7 times faster than other similarly smart open models. This speed improvement, without a significant loss in accuracy, is a major breakthrough.
The paper reveals four key secrets behind this performance. First, NVFP4 achieves its speed by compressing the numbers it computes with. It rounds off the less significant digits of long numbers, which reduces the computational work. Normally this would lead to significant accuracy loss, but the researchers applied the rounding only where it would not cause problems. The result is a version roughly 3.5 times faster than BF16, and up to 7 times faster than comparable open models, with no meaningful accuracy loss.
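The rounding idea can be illustrated with a toy sketch. It assumes a 4-bit floating-point grid (magnitudes 0, 0.5, 1, 1.5, 2, 3, 4, 6) and one scale factor shared by each block of 16 values. These specifics and all function names are assumptions made for illustration, not details taken from the summary above, and the sketch does not model which parts of the network are kept in higher precision.

```python
# Toy block-scaled 4-bit rounding. The grid and block size are assumptions
# for illustration, not a description of the released model.
FP4_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]  # assumed magnitudes


def quantize_block(block):
    """Return (rounded grid values, scale) for one block of floats."""
    max_abs = max(abs(x) for x in block)
    if max_abs == 0.0:
        return [0.0] * len(block), 1.0
    # Choose the scale so the largest magnitude lands on the top grid point.
    scale = max_abs / FP4_GRID[-1]
    codes = []
    for x in block:
        # Nearest grid point to the scaled magnitude.
        mag = min(FP4_GRID, key=lambda g: abs(abs(x) / scale - g))
        codes.append(mag if x >= 0 else -mag)
    return codes, scale


def dequantize_block(codes, scale):
    """Map grid values back to the original range."""
    return [c * scale for c in codes]


def quantize(values, block_size=16):
    """Round a list of floats block by block and return the approximations."""
    out = []
    for i in range(0, len(values), block_size):
        codes, scale = quantize_block(values[i:i + block_size])
        out.extend(dequantize_block(codes, scale))
    return out
```

Values that already sit on the grid, such as `quantize([6.0, 3.0, -1.5, 0.5])`, come back unchanged, while a value like 5.2 in a block whose maximum is 6.0 is rounded to 6.0. Each stored value needs only a few bits plus a shared scale, which is where the reduction in work comes from.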
Second, the system employs "multi-token prediction." Whereas other AI techniques generate a response one token at a time (a token is roughly a word or word fragment), Nemotron 3 Super drafts several future tokens at once (specifically, 7) and then verifies them in a single pass, leading to another massive speed increase.
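A minimal sketch of the draft-then-verify loop follows, under stated assumptions. `target_next` is a hypothetical stand-in for the full model and `draft_tokens` a hypothetical stand-in for the cheap multi-token guess; neither is the model's real interface. Verification is written as a loop for readability, whereas a real system would check the whole draft in one batched pass.

```python
def target_next(prefix):
    """Hypothetical full model: next token is the sum of the last two, mod 10."""
    return (prefix[-1] + prefix[-2]) % 10


def draft_tokens(prefix, k):
    """Hypothetical cheap draft of k tokens that is sometimes wrong."""
    seq = list(prefix)
    out = []
    for _ in range(k):
        guess = (seq[-1] + seq[-2]) % 10
        if len(seq) % 5 == 0:  # inject an occasional wrong guess
            guess = (guess + 1) % 10
        out.append(guess)
        seq.append(guess)
    return out


def verify(prefix, draft):
    """Keep the longest correct prefix of the draft, fixing the first error."""
    seq = list(prefix)
    accepted = []
    for tok in draft:
        expected = target_next(seq)
        if tok != expected:
            accepted.append(expected)  # replace the wrong guess and stop
            return accepted
        accepted.append(tok)
        seq.append(tok)
    return accepted


def generate(prompt, n_new, k=7):
    """Generate n_new tokens; return (sequence, number of verify rounds)."""
    seq = list(prompt)
    rounds = 0
    while len(seq) - len(prompt) < n_new:
        seq.extend(verify(seq, draft_tokens(seq, k)))
        rounds += 1
    return seq[:len(prompt) + n_new], rounds


def generate_plain(prompt, n_new):
    """Baseline: one token per call to the full model."""
    seq = list(prompt)
    for _ in range(n_new):
        seq.append(target_next(seq))
    return seq
```

In this toy setup, `generate([1, 1], 20)` produces exactly the same tokens as `generate_plain([1, 1], 20)` but needs 5 verification rounds instead of 20 single-token steps. The output never changes; only the number of expensive steps does.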
Third, the model incorporates "Mamba layers" to address the memory issues common in traditional AI systems. Instead of constantly re-reading the whole conversation, these layers keep highly compressed notes, remembering important details while discarding filler words. This allows the system to handle long inputs efficiently.
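The "compressed notes" idea can be sketched as a fixed-size state that is updated once per token. This is a toy gated recurrence, not the actual Mamba layer: the update rule, the `gate` values, and the function names are all assumptions for illustration. In a real layer the gate would be computed from the input rather than supplied by hand.

```python
def update_state(state, x, gate):
    """Blend a new input into the state; gate=0 ignores it, gate=1 overwrites."""
    return [(1.0 - gate) * s + gate * xi for s, xi in zip(state, x)]


def run(sequence, gates, dim=2):
    """Process a sequence with constant memory, returning the final state."""
    state = [0.0] * dim  # the size never grows with sequence length
    for x, g in zip(sequence, gates):
        state = update_state(state, x, g)
    return state


# One important token followed by two "filler" tokens with gate 0.
final = run([[1.0, 2.0], [9.0, 9.0], [9.0, 9.0]], [1.0, 0.0, 0.0])
```

Here `final` is `[1.0, 2.0]`: the important token is retained and the filler tokens leave no trace. The state stays the same size no matter how long the sequence is, which is the contrast with approaches that must keep and re-read every earlier token.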
Finally, because answers are generated step by step, small numerical rounding errors can compound from one step to the next. To combat this, the researchers applied "stochastic rounding." This technique adds carefully crafted random noise that averages to zero, so that while individual steps might be slightly off, the overall calculation remains accurate over many steps.
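A small sketch shows why rounding with zero-mean noise helps when many small steps are accumulated. The step size, the update value, and the function names are illustrative assumptions, not details from the paper.

```python
import math
import random


def round_nearest(x, step):
    """Deterministic rounding to the nearest multiple of step."""
    return round(x / step) * step


def round_stochastic(x, step, rng):
    """Round down or up at random so that the expected result equals x."""
    scaled = x / step
    low = math.floor(scaled)
    frac = scaled - low  # probability of rounding up
    return (low + (1 if rng.random() < frac else 0)) * step


def accumulate(update, n_steps, rounder):
    """Add the same small update n_steps times, rounding after each addition."""
    total = 0.0
    for _ in range(n_steps):
        total = rounder(total + update)
    return total


rng = random.Random(0)
# Adding 0.3 a thousand times on a grid of whole numbers (true sum: 300).
nearest_total = accumulate(0.3, 1000, lambda v: round_nearest(v, 1.0))
stochastic_total = accumulate(0.3, 1000, lambda v: round_stochastic(v, 1.0, rng))
```

With nearest rounding every update of 0.3 is rounded away, so `nearest_total` stays at 0.0. With stochastic rounding each step rounds up about 30% of the time, so `stochastic_total` lands close to the true sum of 300 even though no single step is exact.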
Despite these advancements, the system is not perfect; complex tasks, like assembling robotic cows with extensive math, can still take nearly an hour to process. Nevertheless, the release of Nemotron 3 Super signifies a major shift in the AI landscape, challenging the dominance of closed systems. NVIDIA's commitment to investing billions in fully open systems like this suggests a future with more accessible and powerful AI for everyone.