
LLAMA 3.1 8B INSTRUCT AND 70B INSTRUCT

Major health system

CASE STUDY

Llama-powered clinical documentation expands knowledge bases for medical researchers

At a glance

Industry: Healthcare

Use case: Generating clinical annotations for biospecimen registries

Goal: Reduce manual review and annotation while maintaining patient confidentiality

Llama versions: Llama 3.1 8B Instruct and 70B Instruct

Deployment: Kubernetes orchestration, on-premises infrastructure

Concept proven: Automated, accurate clinical annotation is possible with Llama

Data secured: Protopia AI’s Stained Glass Transform (SGT) protects data during training and inference

70%–80% less manual annotation: Potential savings of up to $176 per patient record

*All results are self-reported and not identifiably repeatable. Generally, expected individual results will differ.

THEIR STORY

Advancing healthcare and biomedical research

Major Health System (MHS) is an affiliate of one of the leading medical schools in the United States. It provides comprehensive care in more than 100 medical specialties and supports biomedical research across the healthcare spectrum.

Company size: 29,000 employees, 2,681 licensed beds


THEIR GOAL

Automate biospecimen cataloging with AI

Medical researchers depend on biospecimen registries — large collections of biological samples that include patient demographics and health records — for genetic, longitudinal and population-based studies. Creating new entries requires hours of expert annotation and can cost more than $220 per patient.

The MHS team wanted to test AI’s ability to automate the annotation process, reduce costs and increase the number of cases researchers can catalog. Since biospecimen registries are open resources for medical research, the solution would have to share data with third parties while providing airtight data privacy to protect patient records and meet industry privacy regulations.

THEIR SOLUTION

An open-source pipeline that automates clinical annotation

The MHS team built a RAG workflow to extract clinical details from EHR data — including diagnoses, treatments and outcomes — and generate data inputs for biospecimen registries. In experiments with proprietary and open-source LLMs, open-source models performed just as well at far lower costs.
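The retrieve-then-prompt pattern behind a RAG workflow can be illustrated with a minimal sketch. This is not the MHS pipeline: the sample notes are synthetic, the function names (`retrieve`, `build_prompt`) are hypothetical, and simple word-overlap scoring stands in for whatever retriever and embedding model the team actually used. The resulting prompt would then be sent to an LLM such as Llama 3.1 Instruct, a step omitted here.

```python
import math
import re
from collections import Counter

# Synthetic, made-up notes standing in for EHR text (assumption, not real data).
NOTES = [
    "Diagnosis: stage II colon adenocarcinoma confirmed by biopsy.",
    "Patient prefers morning appointments.",
    "Treatment: adjuvant chemotherapy started after surgery.",
]

def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, notes, k=2):
    """Return the k notes most similar to the query (toy retriever)."""
    q = Counter(tokenize(query))
    ranked = sorted(notes, key=lambda n: cosine(q, Counter(tokenize(n))), reverse=True)
    return ranked[:k]

def build_prompt(field, notes, k=2):
    """Assemble an extraction prompt from the retrieved context."""
    context = "\n".join(f"- {c}" for c in retrieve(field, notes, k))
    return f"Context:\n{context}\n\nExtract the patient's {field} from the context above."

print(build_prompt("diagnosis", NOTES, k=1))
```

In a production setting the retriever would typically use dense embeddings and a vector index rather than word overlap; the sketch only shows how retrieved context and an extraction instruction are combined into one prompt.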

Based on ease of training and the accuracy of inference results, Llama foundation models, fine-tuned Llama models and the medical-specific Me-Llama model are the leading candidates for production deployment. “We got performance on par with the top proprietary models using the most cost-efficient infrastructure available,” says the Medical Director of Data Science for MHS.

THEIR APPROACH

Keeping patient data private and secure was key

The MHS solution needed to process confidential patient information without exposing sensitive data to the AI model. To protect privacy and heighten security, the team integrated Stained Glass Transform (SGT) from Protopia AI into the pipeline. SGT transforms prompts and context into machine-readable, randomized representations before injecting them into the AI model.


MHS integrated Protopia Stained Glass Transform with a RAG pipeline to preserve patient privacy.

THEIR SUCCESS

Automating clinical annotation can save time, reduce costs and expand services

Initial results show that MHS’s Llama-powered solution can reduce manual annotation by 70%–80%, lowering annotation costs from $220 per patient to $44–$66. By proving automated annotation can deliver high-quality results and protect patient confidentiality, the MHS team is helping the medical research community catalog more biospecimens at far lower costs.
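The cost figures follow from simple arithmetic, sketched below. The sketch assumes that annotation cost scales linearly with the amount of manual work, which the case study implies but does not state; the function name `automated_cost_range` is illustrative only.

```python
def automated_cost_range(baseline_cost, reduction_low_pct, reduction_high_pct):
    """Return (lowest, highest) remaining cost per record.

    Assumes cost is proportional to the remaining manual annotation effort.
    """
    lowest = baseline_cost * (100 - reduction_high_pct) / 100
    highest = baseline_cost * (100 - reduction_low_pct) / 100
    return lowest, highest

baseline = 220  # reported manual annotation cost per patient, in dollars
low, high = automated_cost_range(baseline, 70, 80)
max_savings = baseline - low

print(f"${low:.0f} to ${high:.0f} per patient, savings up to ${max_savings:.0f}")
# prints "$44 to $66 per patient, savings up to $176"
```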

    • Concept proven: automated, accurate clinical annotation is possible with Llama

    • Data secured: Protopia AI’s SGT protects data during training and inference

    • 70%–80% less manual annotation, creating potential savings of up to $176 per patient record

“Manual clinical annotation is inefficient, expensive and often has issues with inter-reader reliability. By automating even a portion of the data abstraction, significant time and resources can be saved when creating biospecimen registries.”

Medical Director of Data Science, Major Health System
