LLAMA 3.1 8B, 70B and 405B

Scribd, Inc.

CASE STUDY

Delivering faster, cheaper and more accurate results with a Llama-powered AI content discovery assistant

At a glance

Industry: Technology

Use case: Enhancing real-time content discovery

Goal: Improve discovery and engagement with generative AI

Llama versions: Llama 3.1 8B, 70B and 405B

Deployment: Amazon Web Services (AWS) and Databricks batch inference

• +76% faster throughput in tokens per second vs. the previous model
• 97.7% accurate Macro-F1 score on intent detection
• 33% compute cost savings with JSON output

*All results are self-reported and not identifiably repeatable. Generally expected individual results will differ.

THEIR STORY

Sparking human curiosity with content

Scribd, Inc. is a multinational tech company focused on the written and spoken word. The company’s three products — Everand, Scribd and SlideShare — deliver knowledge, information and inspiration to over 200 million monthly unique visitors across the globe. The Everand library is home to millions of bestselling and emerging audiobooks, ebooks, podcasts and magazines.

THEIR GOAL

Enabling faster, more personalized and engaging content discovery

Scribd, Inc. had developed an AI-powered content discovery assistant, Ask AI, to help its Everand digital library users overcome limited search functionality. The company wanted to upgrade the solution to deliver personalized recommendations by combining knowledge of more than 195 million content pieces on Everand with a nuanced understanding of each customer.

THEIR SOLUTION

An AI content discovery assistant powered by Llama

Llama enabled Scribd, Inc. to enhance Ask AI’s responses, optimize performance and manage infrastructure costs. As an open-source product, Llama was cost-effective and highly adaptable. Compared with other closed and open-source models, Llama delivered the essential component for success: high accuracy. Llama’s semantic understanding and unmatched accuracy provided the nuanced responses to transform Ask AI.

Thanks to Llama offering the most comprehensive deployment options of any model provider, the team quickly found an approach compatible with its existing systems: Amazon Web Services (AWS) and Databricks batch inference. Scribd, Inc. integrated Llama into its Ask AI assistant workflow without any major infrastructure changes.


The new Ask AI experience provides on-point responses to nuanced questions.

THEIR APPROACH

A multi-layered model distillation strategy

The Scribd, Inc. team used three Llama models to create the new Ask AI: Llama 3.1 405B, 70B and 8B. To push beyond closed model limitations and achieve deeper customization, the team used Llama 3.1 405B to create synthetic training data, then fine-tuned Llama 3.1 8B. With the fine-tuned 8B model, Scribd, Inc. was able to deliver better results with minimal latency for real-time components of Ask AI while managing the model’s footprint and computing demands.
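
The distillation flow described above can be sketched roughly as follows. This is a minimal illustration, not Scribd, Inc.'s pipeline: `teacher_label` is a hypothetical stand-in for a call to a large teacher model such as Llama 3.1 405B, and the intent names, sample queries and record layout are all assumptions made for the example.

```python
import json
from typing import Callable

# Hypothetical intent labels; the case study does not list the real ones.
INTENTS = ("find_title", "get_recommendation", "ask_question")


def teacher_label(query: str) -> str:
    """Stand-in for the teacher model (assumption: a real system would
    prompt a large model here and parse its answer)."""
    text = query.lower()
    if "recommend" in text or "something like" in text:
        return "get_recommendation"
    if text.endswith("?"):
        return "ask_question"
    return "find_title"


def build_training_records(
    queries: list[str], teacher: Callable[[str], str]
) -> list[dict[str, str]]:
    """Label each query with the teacher and keep only valid labels."""
    records = []
    for query in queries:
        label = teacher(query)
        if label in INTENTS:  # drop anything outside the label set
            records.append({"prompt": query, "completion": label})
    return records


def to_jsonl(records: list[dict[str, str]]) -> str:
    """Serialize records as JSON Lines, a common fine-tuning input format."""
    return "\n".join(json.dumps(r, sort_keys=True) for r in records)


queries = [
    "Recommend a thriller for a long flight",
    "Who wrote Dune?",
    "The Great Gatsby audiobook",
]
records = build_training_records(queries, teacher_label)
print(to_jsonl(records))
```

The resulting JSON Lines text is the kind of synthetic dataset a smaller student model could then be fine-tuned on; the fine-tuning step itself depends on the training stack and is left out here.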

In the background, Llama 3.1 70B generates content metadata for Ask AI’s knowledge base to improve content discovery and answer accuracy.


Ask AI combines multiple Llama models, each handling a different part of the workload.

THEIR SUCCESS

Streamlined content discovery strengthens customer relationships

The new Llama-powered Ask AI streamlines content discovery by generating intuitive recommendations quickly and delivering a better, more personalized customer experience. At the same time, Llama has improved the application’s speed, accuracy and cost for Scribd, Inc.:

• +76% faster throughput in tokens per second vs. the previous model
• 97.7% accurate Macro-F1 score on intent detection
• 33% compute cost savings with JSON output
*All results are self-reported and not identifiably repeatable. Generally expected individual results will differ.
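
For readers unfamiliar with the metric, Macro-F1 is the unweighted mean of the per-class F1 scores, so rare intents count as much as common ones. The sketch below computes it in plain Python; the intent labels and predictions are made-up illustrations, not Scribd, Inc.'s data.

```python
def macro_f1(y_true: list[str], y_pred: list[str]) -> float:
    """Unweighted mean of per-class F1 over every class seen in either list."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    classes = sorted(set(y_true) | set(y_pred))
    if not classes:
        raise ValueError("at least one label is required")
    scores = []
    for cls in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == cls and p == cls for t, p in pairs)  # true positives
        fp = sum(t != cls and p == cls for t, p in pairs)  # false positives
        fn = sum(t == cls and p != cls for t, p in pairs)  # false negatives
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Illustrative labels only (hypothetical intents).
y_true = ["search", "search", "recommend", "question"]
y_pred = ["search", "recommend", "recommend", "question"]
print(round(macro_f1(y_true, y_pred), 3))  # 0.778
```

Here the per-class F1 scores are 1.0, 2/3 and 2/3, which average to 7/9.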
"Llama stood out because of its superior performance in understanding broad user intents, which is critical for content discovery. It offered a cost-effective solution with high accuracy and speed, outperforming other options."

Steve Neola Tarazi, Senior Director of Product, Generative AI, Scribd, Inc.

Models used

Create generative AI applications for business with open-source large language models that bring unmatched control, customization and flexibility.

Llama 3.1 8B, 70B and 405B

• State-of-the-art multilingual open-source large language model
• Llama Guard 3 8B and Prompt Guard are included
*Licensed under Llama 3.1 Community License Agreement