Rebeca Moen
Jun 12, 2025 06:53
NVIDIA’s TensorRT SDK significantly boosts the performance of Stable Diffusion 3.5, reducing VRAM requirements by 40% and doubling performance on RTX GPUs.
NVIDIA has announced TensorRT-optimized versions of Stable Diffusion 3.5 for NVIDIA GeForce RTX and RTX PRO GPUs. TensorRT is NVIDIA's software development kit (SDK) for high-performance AI inference. According to NVIDIA, the optimization doubles the performance of the AI model and reduces VRAM usage by 40%.
Revolutionizing AI Performance
Generative AI continues to transform digital content creation, with models growing in complexity and VRAM demands. The latest Stable Diffusion 3.5 Large model initially required over 18GB of VRAM, limiting its accessibility. NVIDIA has addressed this by collaborating with Stability AI to apply quantization techniques, particularly FP8 quantization, to reduce VRAM consumption significantly.
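To make the VRAM claim concrete, the short sketch below applies the reported 40% reduction to the 18GB starting point. It assumes the 40% applies to that 18GB figure, and since the article says "over 18GB", the result is a rough lower bound rather than a measured value. The function name is illustrative only.

```python
def vram_after_reduction(baseline_gb: float, reduction: float) -> float:
    """Return the VRAM that remains after cutting `reduction` (a fraction) from the baseline."""
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be a fraction between 0 and 1")
    return baseline_gb * (1.0 - reduction)


# Figures taken from the article; 18 GB is treated as a floor ("over 18GB").
BASELINE_GB = 18.0
REDUCTION = 0.40  # assumption: the 40% saving applies to the 18 GB baseline

estimated_gb = vram_after_reduction(BASELINE_GB, REDUCTION)
print(f"Estimated VRAM after reduction: about {estimated_gb:.1f} GB")
```

Under these assumptions the model would need roughly 11GB, which is what brings it within reach of GPUs with less memory than the original requirement.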
The newly optimized Stable Diffusion 3.5 Large and Medium models use the TensorRT SDK to improve performance. The SDK optimizes model weights and execution graphs specifically for RTX GPUs, resulting in a 2.3x performance boost for SD3.5 Large and a 1.7x increase for SD3.5 Medium compared with running the models in PyTorch.
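A speedup factor can be easier to reason about as a change in generation time. The sketch below converts the reported factors into relative time per image, assuming each factor is a throughput ratio measured on the same workload. The article does not state absolute timings, so only relative values are computed.

```python
def relative_time(speedup: float) -> float:
    """Fraction of the original generation time that remains after a given speedup."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return 1.0 / speedup


# Speedup factors reported in the article, relative to PyTorch.
SPEEDUPS = {"SD3.5 Large": 2.3, "SD3.5 Medium": 1.7}

for model, speedup in SPEEDUPS.items():
    remaining = relative_time(speedup)
    print(f"{model}: {remaining:.0%} of the original time "
          f"({1.0 - remaining:.0%} less time per image)")
```

Read this way, a 2.3x boost means an image takes a little under half the original time, and a 1.7x boost means it takes roughly three fifths.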
TensorRT for RTX: A Game Changer
Unveiled at Microsoft Build, TensorRT for RTX is now available as a standalone SDK, enabling developers to integrate and optimize AI models on RTX GPUs. This new version supports just-in-time (JIT) compilation, significantly reducing the time required to optimize models for different GPU classes.
The SDK’s compact size and compatibility with Windows ML make it an attractive option for developers seeking to deploy high-performance AI applications. By integrating TensorRT, developers can achieve substantial performance improvements with minimal memory usage, paving the way for more efficient AI-driven applications.
Broader Implications and Future Prospects
NVIDIA’s collaboration with Stability AI extends beyond optimizations. The companies are working to release Stable Diffusion 3.5 as an NVIDIA NIM microservice, facilitating easier deployment for creators and developers. This microservice is expected to be available in July, offering a streamlined approach to implementing AI models in various applications.
As NVIDIA continues to innovate, its efforts in AI and machine learning are set to redefine the capabilities of generative AI models. With ongoing advancements, stakeholders can anticipate more robust and efficient AI solutions that cater to the growing demands of digital content creation and beyond.