Goal:
Conduct research on domain-specific fine-tuning of large language models (LLMs) for radiology to improve accuracy in processing, summarizing, and generating diagnostic reports from medical imaging data, evaluating performance gains for potential AI-assisted diagnostic support.
Impact Keywords:
* Radiology AI
* Diagnostic Modeling
* LLM Fine-Tuning
* QLoRA
* Clinical NLP
* Medical Imaging
* Healthcare AI
* Domain Adaptation
* De-Identification
Approach:
LunarTech Lab researched and developed RadiologyLlama-70B, a fine-tuned LLM prototype based on Llama 3-70B, using 6.5 million de-identified radiology reports from MGH (2008–2018) covering CT, MRI, X-ray, and fluoroscopy. The study explored both full fine-tuning and QLoRA for efficient adaptation; the fine-tuned variants outperformed the base model at generating concise impressions from findings.
1. Dataset Preparation:
De-identified reports were structured as findings-impression pairs. Preprocessing covered privacy (regex filtering) and quality control, and synthetic instruction variations were added for robustness.
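The preparation steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project's actual pipeline: the patterns in `PHI_PATTERNS`, the `FINDINGS:`/`IMPRESSION:` section headers, the prompts in `INSTRUCTIONS`, and the output field names are all hypothetical, and a production de-identification pass would need a much broader, validated rule set.

```python
import random
import re

# Hypothetical PHI patterns for illustration only; real de-identification
# needs a far more complete and clinically validated set of rules.
PHI_PATTERNS = [
    (re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"), "[DATE]"),
    (re.compile(r"\bMRN[:#]?\s*\d+\b", re.IGNORECASE), "[MRN]"),
    (re.compile(r"\bDr\.\s+[A-Z][a-z]+\b"), "[PHYSICIAN]"),
]

# Hypothetical instruction variants; one is sampled per training example.
INSTRUCTIONS = [
    "Derive the impression from the findings in this radiology report.",
    "Summarize the following findings into a concise impression.",
    "Write the impression section for these radiology findings.",
]

SECTION_RE = re.compile(
    r"FINDINGS:\s*(.*?)\s*IMPRESSION:\s*(.*)", re.DOTALL | re.IGNORECASE
)


def scrub(text):
    """Replace every match of each PHI pattern with its placeholder."""
    for pattern, placeholder in PHI_PATTERNS:
        text = pattern.sub(placeholder, text)
    return text


def split_report(report):
    """Return (findings, impression), or None if either section is missing."""
    match = SECTION_RE.search(report)
    if match is None:
        return None
    findings, impression = match.group(1).strip(), match.group(2).strip()
    if not findings or not impression:
        return None  # basic quality control: drop incomplete reports
    return findings, impression


def build_example(report, rng):
    """Scrub a raw report and turn it into one instruction-tuning record."""
    pair = split_report(scrub(report))
    if pair is None:
        return None
    findings, impression = pair
    return {
        "instruction": rng.choice(INSTRUCTIONS),
        "input": findings,
        "output": impression,
    }


sample_report = (
    "FINDINGS: Seen by Dr. Smith on 03/14/2015, MRN: 123456. "
    "No focal consolidation.\n"
    "IMPRESSION: No acute cardiopulmonary process."
)
example = build_example(sample_report, random.Random(0))
```

Sampling the instruction per record is one simple way to vary prompts; how the original variations were generated is not described in this summary.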
2. Model & Training:
Base: Llama 3-70B Instruct. Techniques: full fine-tuning for maximum accuracy; QLoRA (4-bit quantization) for efficiency. Training setup: DeepSpeed ZeRO-3 and BF16 mixed precision on 8x NVIDIA H100 GPUs. Metrics: ROUGE-L, BERTScore, and GPT-4o scoring.
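A rough back-of-the-envelope calculation shows why 4-bit quantization with low-rank adapters is attractive at this scale. The sketch below is arithmetic only, under stated assumptions: the parameter count is the nominal 70 billion implied by the model name, the 8192 x 8192 projection and rank 16 are hypothetical values chosen for illustration, and the memory figure covers weights alone (no quantization constants, gradients, optimizer states, or activations). None of these numbers are measurements from the study.

```python
def weight_memory_gb(num_params, bits_per_param):
    """Memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


def lora_param_count(d_in, d_out, rank):
    """Trainable parameters LoRA adds to one d_in x d_out weight matrix.

    LoRA learns two low-rank factors of shape (d_in, rank) and (rank, d_out).
    """
    return rank * (d_in + d_out)


NUM_PARAMS = 70e9  # nominal count implied by "70B" (assumption)

bf16_gb = weight_memory_gb(NUM_PARAMS, 16)  # weights stored in BF16
int4_gb = weight_memory_gb(NUM_PARAMS, 4)   # weights stored in 4 bits

# Hypothetical square projection and adapter rank, for illustration only.
full_matrix_params = 8192 * 8192
adapter_params = lora_param_count(8192, 8192, 16)
trainable_fraction = adapter_params / full_matrix_params

print(f"BF16 weights:  {bf16_gb:.0f} GB")
print(f"4-bit weights: {int4_gb:.0f} GB")
print(f"LoRA trainable fraction per matrix: {trainable_fraction:.4%}")
```

Under these assumptions the frozen weights shrink by a factor of four, and each adapted matrix trains well under 1% of its original parameter count.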
3. Evaluation Results:
Fine-tuned variants outperformed the baseline (e.g., ROUGE-L: 0.292 vs. 0.149; BERTScore F1: 0.877 vs. 0.846, fine-tuned vs. baseline). QLoRA matched full fine-tuning with 50% less compute while also reducing hallucinations and improving clinical phrasing.
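For readers unfamiliar with ROUGE-L, the metric scores a generated impression by its longest common subsequence (LCS) with the reference. The sketch below is a simplified, dependency-free version that assumes lowercased whitespace tokenization and the balanced F1 form; the summary does not state which ROUGE-L variant, tokenizer, or stemming the study used, so scores from this code are not comparable to the reported figures.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, candidate):
    """Sentence-level ROUGE-L F1 with lowercased whitespace tokens."""
    ref = reference.lower().split()
    cand = candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


reference = "No acute cardiopulmonary process"
candidate = "No acute process"
score = rouge_l_f1(reference, candidate)
print(f"ROUGE-L F1: {score:.3f}")
```

In this toy pair the LCS is three tokens, giving precision 3/3 and recall 3/4. For scale, the reported ROUGE-L values (0.292 vs. 0.149) correspond to roughly a 1.96x relative improvement over the baseline.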
4. Applications & Compliance:
Explored potential uses in report drafting, training support, data normalization, and decision support. Emphasized on-premise deployment for HIPAA/GDPR compliance, with encrypted, auditable processes.
Summary:
This research initiative highlights LunarTech Lab's exploration of LLM adaptation for healthcare. It presents RadiologyLlama-70B as a proof of concept that could ease radiologist workloads, preserve privacy through on-premise deployment, and advance AI-driven diagnostics through efficient, domain-specific fine-tuning.
