Case Study: GenAI for 7.7M Customers with Galileo
INDUSTRY
Education
COMPANY OVERVIEW
A world-leading learning company offers digital and physical textbook rentals, online tutoring, and other student services. The company primarily serves high school and college students, and it aims to improve academic outcomes and save students money by giving them direct access to a wide range of educational resources and tools.
CHALLENGE
When the company set out to develop and launch new generative AI products, it partnered with Galileo. The company had two key product initiatives: a chat interface capable of handling up to 10,000 student requests per minute, and an agent-based product that develops personalized learning journeys for students.
Before partnering with Galileo, the team struggled to scale prompt engineering and experimentation. They tried mapping existing prompt libraries to their needs and use cases, but manually and repeatedly testing and refining prompts quickly proved tedious, time-consuming, and error-prone.
Furthermore, as a provider of high-quality education to students worldwide, the company needed to manage trust and safety issues at production scale, including handling PII, preventing prompt injections, and ensuring on-brand responses. However, they had yet to adopt observability and monitoring tools because they were uncertain how industry metrics correlated with their internal performance metrics.
SOLUTION
With Galileo, their AI team streamlined their LLM development workflow. Using Galileo Evaluate, ML engineers collaborated with product teams to accelerate prompt and system experimentation. Feedback from product managers and subject matter experts was integrated directly through the Galileo UI, allowing rapid iteration on prompts and retrieval mechanisms. They tailored their evaluation framework using both Galileo’s evaluation models and their own custom ones. These capabilities integrated with the company’s action store, which is designed to orchestrate AI-driven prompts, tools, and skills, enhancing both agent capabilities and user experience.
RESULTS
Using Galileo’s end-to-end evaluation and observability platform, their team has been able to apply a scalable and consistent evaluation framework across their GenAI project lifecycle. In particular, they have built confidence in their prompt store and prompt optimization workflows. Their evaluation framework, powered by Galileo and their own evaluation models and coupled with streamlined, collaborative development workflows, has helped the company accelerate development, experimentation, and time-to-market.
"We’re developing proprietary LLMs and AI agents with a very high focus on content quality and accuracy. Galileo’s end-to-end platform has proven instrumental as we work to productionize our next generation of products at global scale."
– AI/ML Tech Lead