Novita AI's Partnership with SGLang Revolutionizes AI Inference Capabilities

Novita AI Partners with SGLang to Enhance AI Inference

On May 22, 2025, Novita AI, a global artificial intelligence cloud platform provider, officially announced a strategic alliance with SGLang, an inference engine for large language models and vision models. The collaboration aims to boost AI inference performance by drawing on Novita AI's high-performance GPU cloud resources to support SGLang's research, benchmarking, and optimization efforts.

About SGLang

SGLang is an inference engine that pairs a structured generation language with a highly optimized runtime. Runtime techniques such as RadixAttention cache reuse and zero-overhead batch scheduling deliver significant performance improvements, and they let developers build advanced generation workflows, multi-modal applications, and parallel inference pipelines while maintaining reliability and scalability. SGLang has attracted support from technology companies including NVIDIA, AMD, xAI, Oracle Cloud, and Google Cloud, as well as academic institutions such as Stanford University and the University of California system, which reflects broad engagement and adoption across industry and academia.
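To make the idea of prefix cache reuse concrete, here is a minimal Python sketch of the general technique: a prefix tree over token IDs that reports how much of a new request has already been seen. It is an illustration only, not SGLang's implementation. The names `PrefixCache`, `match_prefix`, and `tokens_to_compute` are hypothetical, and an actual radix tree would store cached attention state and compress chains of nodes, which this toy version does not do.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Maps the next token id to the child node for that token.
    children: dict = field(default_factory=dict)


class PrefixCache:
    """Toy token-level prefix tree (hypothetical, for illustration only)."""

    def __init__(self):
        self.root = Node()

    def match_prefix(self, tokens):
        """Return how many leading tokens are already recorded in the tree."""
        node, matched = self.root, 0
        for tok in tokens:
            child = node.children.get(tok)
            if child is None:
                break
            node, matched = child, matched + 1
        return matched

    def insert(self, tokens):
        """Record a token sequence so later requests can reuse its prefix."""
        node = self.root
        for tok in tokens:
            node = node.children.setdefault(tok, Node())


def tokens_to_compute(cache, tokens):
    """Count tokens not covered by a cached prefix, then record the sequence."""
    reused = cache.match_prefix(tokens)
    cache.insert(tokens)
    return len(tokens) - reused
```

In this sketch, a second request that shares a prompt prefix with an earlier one only needs work for its unshared suffix, which is the intuition behind reusing cached state across requests.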

The Strategic Partnership

In a statement about the collaboration, Junyu Huang, Co-Founder and COO of Novita AI, expressed excitement about the performance gains it could bring across multiple industries. He highlighted that SGLang integrates language-level control with backend optimizations, which opens up new possibilities for developers.

As part of this partnership, Novita AI has already been instrumental in the development of SGLang's first end-to-end multi-turn reinforcement learning (RL) framework and its multi-LLM serving system. This support is intended to sustain SGLang's pace of innovation and improve inference performance across a range of applications. Huang further noted that the two are also collaborating on a large-scale expert parallelism project, which aims to push past the performance thresholds outlined in the DeepSeek blog.

Commitment to Open Ecosystem

Novita AI's support for an open ecosystem of inference engines like SGLang reflects its dedication to promoting diverse research initiatives and shared infrastructure. The collaboration also reflects Novita AI's mission of democratizing access to AI technology by making state-of-the-art inference capabilities available to developers around the globe.

Learn More About Novita AI

Novita AI simplifies the deployment of AI models through an easy-to-use API, backed by reliable and cost-effective GPU cloud infrastructure. The company advocates for open-source libraries focused on large language model inference and serving, and it aims to foster ongoing innovation across the technology sector. To learn more about its offerings, visit www.novita.ai.

As the partnership unfolds, industry observers will be watching for advances that could further position Novita AI and SGLang at the forefront of AI inference technology.