LLAMA 3.1 8B INSTRUCT AND 70B INSTRUCT
Major Health System
Llama-powered clinical documentation expands knowledge bases for medical researchers
At a glance
Industry: Healthcare
Use case: Generating clinical annotations for biospecimen registries
Goal: Reduce manual review and annotation while maintaining patient confidentiality
Llama versions: Llama 3.1 8B Instruct and 70B Instruct
Deployment: Kubernetes orchestration, on-premises infrastructure
Concept proven
Data secured
70%–80% less manual annotation
*All results are self-reported and not identifiably repeatable. Individual results are generally expected to differ.
Advancing healthcare and biomedical research
Major Health System (MHS) is an affiliate of one of the leading medical schools in the United States. It provides comprehensive care in more than 100 medical specialties and supports biomedical research across the healthcare spectrum.
Company size: 29,000 employees, 2,681 licensed beds
THEIR GOAL
Automate biospecimen cataloging with AI
Medical researchers depend on biospecimen registries — large collections of biological samples that include patient demographics and health records — for genetic, longitudinal and population-based studies. Creating new entries requires hours of expert annotation and can cost more than $220 per patient.
The MHS team wanted to test AI’s ability to automate the annotation process, reduce costs and increase the number of cases researchers can catalog. Since biospecimen registries are open resources for medical research, the solution would have to share data with third parties while providing airtight data privacy to protect patient records and meet industry privacy regulations.
THEIR SOLUTION
An open-source pipeline that automates clinical annotation
The MHS team built a RAG workflow to extract clinical details from EHR data — including diagnoses, treatments and outcomes — and generate data inputs for biospecimen registries. In experiments with proprietary and open-source LLMs, open-source models performed just as well at far lower costs.
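The retrieve-then-generate pattern described above can be sketched roughly as follows. This is a minimal illustration, not the MHS pipeline: the case study does not describe the retriever, prompts or data model, so the keyword-overlap scoring, the `retrieve` and `build_prompt` helpers and the synthetic notes are all assumptions made for the example. A production system would more likely use embedding-based retrieval and pass the assembled prompt to a Llama model.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(query, chunk):
    """Count the tokens shared by the query and a chunk (toy relevance score)."""
    q = Counter(tokenize(query))
    c = Counter(tokenize(chunk))
    return sum(min(q[t], c[t]) for t in q)


def retrieve(query, chunks, k=2):
    """Return the k chunks with the highest overlap score."""
    ranked = sorted(chunks, key=lambda ch: score(query, ch), reverse=True)
    return ranked[:k]


def build_prompt(field, chunks):
    """Assemble a prompt asking the model to fill one registry field."""
    context = "\n".join(f"- {c}" for c in chunks)
    return (
        f"Using only the context below, state the patient's {field}.\n"
        f"Context:\n{context}\n"
        "Answer:"
    )


# Synthetic example notes (hypothetical, not real patient data).
notes = [
    "Pathology report: invasive ductal carcinoma, left breast, grade 2.",
    "Patient received four cycles of chemotherapy followed by radiation.",
    "Follow-up at 12 months: no evidence of recurrence.",
    "Allergies: penicillin.",
]

# Retrieve the most relevant note for the "diagnosis" registry field,
# then build the prompt that would be sent to the language model.
top = retrieve("diagnosis pathology carcinoma", notes, k=1)
prompt = build_prompt("diagnosis", top)
print(prompt)
```

Running one retrieval per registry field (diagnosis, treatment, outcome) keeps each prompt short and limits the context the model sees to the passages relevant to that field.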
Based on ease of training and the accuracy of inference results, Llama foundation models, fine-tuned Llama models and the medical-specific Me-Llama model are the leading candidates for production deployment. “We got performance on par with the top proprietary models using the most cost-efficient infrastructure available,” says the Medical Director of Data Science for MHS.
THEIR APPROACH
Keeping patient data private and secure was key
The MHS solution needed to process confidential patient information without exposing sensitive data to the AI model. To protect privacy and heighten security, the team integrated Stained Glass Transform (SGT) from Protopia AI into the pipeline. SGT transforms prompts and context into machine-readable, randomized representations before injecting them into the AI model.
MHS integrated Protopia Stained Glass Transform with a RAG pipeline to preserve patient privacy.
THEIR SUCCESS
Automating clinical annotation can save time, reduce costs and expand services
Initial results show that MHS’s Llama-powered solution can reduce manual annotation by 70%–80%, dropping annotation costs from $220 per patient to $44–$66. By proving automated annotation can deliver high-quality results and protect patient confidentiality, the MHS team is helping the medical research community catalog more biospecimens at far lower costs.
• Concept proven: automated, accurate clinical annotation is possible with Llama
• Data secured: Protopia AI’s SGT protects data during training and inference
• 70%–80% less manual annotation: potential savings of up to $176 per patient record
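The cost figures above follow from simple arithmetic, sketched here under one assumption that the case study implies but does not state outright: annotation cost scales linearly with the amount of manual annotation. The `remaining_cost` helper is illustrative only.

```python
BASELINE_COST = 220.0  # USD per patient, as reported in the case study


def remaining_cost(baseline, reduction):
    """Cost left after cutting manual annotation by `reduction` (0 to 1).

    Assumes cost is proportional to manual annotation effort.
    """
    return baseline * (1.0 - reduction)


for reduction in (0.70, 0.80):
    cost = remaining_cost(BASELINE_COST, reduction)
    saved = BASELINE_COST - cost
    print(f"{reduction:.0%} reduction: ${cost:.2f} per patient, ${saved:.2f} saved")
```

A 70% reduction leaves $66 per patient and an 80% reduction leaves $44, which matches the reported $44–$66 range and the maximum saving of $176 per record.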
“Manual clinical annotation is inefficient, expensive and often has issues with inter-reader reliability. By automating even a portion of the data abstraction, significant time and resources can be saved when creating biospecimen registries.”
Medical Director of Data Science, Major Health System