Contextual Language Model (CLM) Leverages the WEKA Data Platform to Power More Secure, Accurate, and Efficient Use of Enterprise AI
WekaIO (WEKA), the AI-native data platform company, announced it is working with Contextual AI, the company building AI to change how the world works, to provide the data infrastructure underpinning Contextual AI’s Contextual Language Models (CLMs). The CLMs are trained using RAG 2.0, Contextual AI’s proprietary next-generation retrieval-augmented generation (RAG) approach, and that training is now powered by the WEKA® Data Platform. CLMs power safe, accurate, and trustworthy AI applications for Fortune 500 enterprises on Contextual AI’s platform.
Developing the Next Generation of Enterprise AI Models
Founded in 2023, Contextual AI delivers a turnkey platform for building enterprise AI applications powered by its state-of-the-art RAG 2.0 technology. Unlike traditional RAG pipelines, which stitch together a frozen model for embeddings, a vector database for retrieval, and a black-box generation model, RAG 2.0 provides a single, integrated end-to-end system, enabling higher accuracy, better compliance, fewer hallucinations, and the ability to attribute answers back to source documents.
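For readers unfamiliar with the traditional pipeline described above, the following is a minimal Python sketch of its three separately built stages. It is illustrative only and does not represent Contextual AI’s RAG 2.0 or any vendor’s API: the names `embed`, `VectorStore`, `generate`, and `traditional_rag` are hypothetical, the "embedding model" is a simple word-count vector, and the "generation model" merely echoes its prompt.

```python
import math
from collections import Counter


def embed(text):
    """Stand-in for a frozen embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class VectorStore:
    """Stand-in for a vector database: brute-force similarity search."""

    def __init__(self):
        self.items = []  # list of (vector, document) pairs

    def add(self, doc):
        self.items.append((embed(doc), doc))

    def search(self, query, k=1):
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[0]), reverse=True)
        return [doc for _, doc in ranked[:k]]


def generate(prompt):
    """Stand-in for a black-box generation model: it only echoes the prompt."""
    return "ANSWER BASED ON: " + prompt


def traditional_rag(store, question, k=1):
    """Glue the three independent stages together: embed, retrieve, generate."""
    context = store.search(question, k)
    prompt = " | ".join(context) + " || Q: " + question
    return generate(prompt), context


store = VectorStore()
store.add("checkpoints are written to shared storage")
store.add("gpus train the model")
answer, sources = traditional_rag(store, "where are checkpoints written")
```

Each stage in this sketch is built and tuned in isolation, which is the stitching the paragraph above contrasts with a single end-to-end system. Returning `sources` alongside the answer shows one simple way a pipeline can attribute an answer to the documents it retrieved.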
Generative AI workloads have significant performance, data management, and computational power requirements that can make them time- and resource-intensive to train and serve. Contextual AI leverages large, diverse datasets to train its CLMs. During training, the company initially encountered performance bottlenecks and scaling challenges that caused poor GPU utilization and slowed its AI model development.
Architecting a Data Management System to Maximize GPU Utilization
Increasing GPU utilization is critical to ensuring AI systems and workloads run at peak efficiency. The WEKA Data Platform’s AI-native architecture is purpose-built to accelerate every step of the AI pipeline, keeping GPUs saturated with data so that AI workloads run faster and more sustainably. Cloud- and hardware-agnostic, WEKA’s software is designed to be deployed anywhere, and its zero-copy, zero-tune architecture dynamically supports every AI workload profile in a single data platform, from metadata operations across millions of small files during model training to massive write performance during model checkpoint operations.
Contextual AI deployed the WEKA Data Platform on Google Cloud to create a high-performance data infrastructure layer that manages all of its datasets, 100 TB in total, for AI model training. The WEKA platform delivered a significant leap in data performance that correlated directly with higher developer productivity and faster model training times.
In addition to fast data movement from storage to accelerator, the WEKA platform provided Contextual AI with seamless metadata handling, checkpointing, and data preprocessing capabilities that have eliminated performance bottlenecks in its training processes, improved GPU utilization, and helped lower its cloud costs.
“Training large-scale AI models in the cloud requires a modern data management solution that can deliver high GPU utilization and accelerate the wall clock time for model development,” said Amanpreet Singh, CTO & co-founder of Contextual AI. “With the WEKA Data Platform, we now have the robust data pipelines needed to power next-gen GPUs and build state-of-the-art generative AI solutions at scale. It works like magic to turn fast, ephemeral storage into persistent, affordable data.”
Key Outcomes Achieved With the WEKA Data Platform:
- 3x Performance Improvements: Achieved a threefold increase in performance for key AI use cases thanks to a significant increase in GPU utilization.
- 4x Faster AI Model Checkpointing: Eliminated delays in model checkpoint completion to achieve a 4x improvement in checkpointing processes, dramatically improving developer productivity.
- 38% Cost Reduction: Associated cloud storage costs were reduced by 38 percent per terabyte.
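To make the reported ratios concrete, the short Python sketch below applies them to baseline figures. The baselines (a 600-second checkpoint and a storage cost of 100 units per terabyte) are hypothetical placeholders chosen for easy arithmetic, not figures from Contextual AI or WEKA. Only the 4x and 38 percent ratios come from the outcomes listed above.

```python
def time_after_speedup(baseline_seconds, speedup):
    """Duration of a task after it becomes `speedup` times faster."""
    return baseline_seconds / speedup


def cost_after_reduction(baseline_cost, reduction_percent):
    """Cost remaining after a percentage reduction."""
    return baseline_cost * (100 - reduction_percent) / 100


# Hypothetical baselines, for illustration only.
baseline_checkpoint_seconds = 600.0
baseline_cost_per_tb = 100.0

# Ratios reported in the outcomes above: 4x faster checkpointing, 38% lower cost per TB.
checkpoint_seconds = time_after_speedup(baseline_checkpoint_seconds, 4)
cost_per_tb = cost_after_reduction(baseline_cost_per_tb, 38)
```

Under these assumed baselines, a checkpoint that took 10 minutes would finish in 2.5 minutes, and each terabyte would cost 62 percent of its previous price.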
“Generative AI holds virtually unlimited potential to unlock insights and new value creation for enterprises, but many are still challenged with where to begin and how to move their AI projects forward,” said Jonathan Martin, president at WEKA. “Contextual AI is innovating the future of enterprise AI by creating advanced generative AI solutions that help organizations tap AI’s potential much, much faster. WEKA is proud to be helping Contextual AI overcome critical data management challenges to accelerate training of reliable, trustworthy AI models that will advance the AI revolution.”