BraTS teaches you to segment a single scan. But real patients are scanned repeatedly — before surgery, after radiation, every three months for years. This week you learn to organize, register, and segment longitudinal brain MRI, measure treatment response using RANO criteria, and navigate the imaging challenges that make post-treatment segmentation fundamentally harder than pre-treatment.
The BraTS challenge gives you one MRI per patient, taken before surgery. In the real world, a glioblastoma patient may have 15–30 MRI scans over the course of their treatment — baseline, 48 hours post-surgery, post-radiation baseline, then every 2–3 months for surveillance. The clinical question is never “what does the tumor look like right now?” but “is it getting better, worse, or staying the same?”
This shift from single-timepoint segmentation to longitudinal treatment monitoring introduces challenges that don’t exist in BraTS: the tumor changes appearance after each treatment, the brain itself changes shape (resection, edema resolution), and treatment creates imaging artifacts that look like tumor but aren’t (pseudoprogression, radiation necrosis). Understanding these challenges is essential for building clinically useful AI.
The Response Assessment in Neuro-Oncology (RANO) criteria are the international standard for determining whether a brain tumor is responding to treatment. They define four response categories based on imaging measurements and clinical status. Understanding RANO is essential because it defines what “success” means for your segmentation model in a clinical context.
The original criteria use bidimensional measurements: multiply the longest diameter of the enhancing tumor by its perpendicular diameter. Response is defined as:

Complete Response (CR): disappearance of all enhancing tumor.
Partial Response (PR): ≥50% decrease in the product of perpendicular diameters.
Stable Disease (SD): neither PR nor PD criteria met.
Progressive Disease (PD): ≥25% increase in the product, or new lesions.
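These thresholds are easy to express in code. The sketch below applies only the size-based rules; real RANO assessment also requires clinical status, steroid dose, and confirmation scans, which are deliberately omitted here.

```python
def rano_2d_response(baseline_product_mm2, current_product_mm2, new_lesions=False):
    """Classify response from bidimensional products (longest diameter x
    perpendicular diameter, in mm^2) using the original RANO thresholds.
    Clinical status and steroid dose, which RANO also requires, are omitted."""
    if new_lesions:
        return "PD"
    if current_product_mm2 == 0:
        return "CR"  # disappearance of all enhancing tumor
    change = (current_product_mm2 - baseline_product_mm2) / baseline_product_mm2
    if change >= 0.25:
        return "PD"  # >=25% increase in the product
    if change <= -0.50:
        return "PR"  # >=50% decrease in the product
    return "SD"
```

For example, a tumor whose product shrinks from 400 mm² to 180 mm² (a 55% decrease) classifies as PR, while growth from 400 mm² to 520 mm² (a 30% increase) classifies as PD.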
Published in the Journal of Clinical Oncology, RANO 2.0 is a major update informed by data from 1,106 glioblastoma patients. The key changes represent lessons learned from a decade of clinical experience:
RANO 2.0 uses the post-radiotherapy MRI as baseline, not the post-surgical scan. This change alone improved the correlation between progression-free survival (PFS) and overall survival (OS) from 0.53 to 0.67. The rationale: comparing pre-treatment to post-treatment scans introduces too much noise from treatment effects.
RANO 2.0 formally introduces volumetric measurement as an option alongside traditional 2D. This is a direct acknowledgment that bidimensional measurements are unreliable for irregularly shaped tumors. A study comparing RANO 2.0 vs mRANO vs RANO 1.0 found that RANO 2.0 provided the strongest survival risk stratification (HR=3.6). The vorasidenib trial (NEJM 2023) used volumetric RANO for low-grade gliomas, demonstrating clinical trial adoption.
Confirmation scans are now mandatory only within the first 12 weeks post-radiotherapy (when pseudoprogression incidence is highest at 20–30%), not at all timepoints. The previous iRANO approach required 3-month confirmation for all suspected progression, which censored >50% of patients without improving outcomes.
T2/FLAIR evaluation is eliminated for IDH wild-type glioblastoma (except with anti-angiogenic therapy). Data showed that adding FLAIR evaluation did not improve the correlation between PFS and OS. For IDH-mutant tumors with large non-enhancing components, both enhancing and non-enhancing tumor must still be evaluated.
RANO-BM (Lancet Oncology, 2015) defines separate criteria for metastases: measurable disease is ≥10mm in longest diameter (or ≥5mm with thin-slice MRI). Up to 5 target CNS lesions are measured. PR requires ≥30% decrease in the sum of longest diameters; PD requires ≥20% increase or new lesions. CNS response is assessed independently from systemic disease.
Every treatment changes how the tumor looks on MRI. A model trained only on pre-operative BraTS data will fail on post-treatment scans because it has never seen these appearances. Understanding them is critical for building models that work longitudinally.
The resection cavity appears as a fluid-filled void where the tumor was. Blood products at the cavity rim are hyperintense on T1 (mimicking enhancing tumor) and variable on FLAIR. Post-operative enhancement along the cavity margin is normal and does not indicate residual tumor. Imaging should be acquired within 48 hours post-surgery to establish the post-surgical baseline before enhancement develops. Segmentation models need a separate “surgical cavity” label.
Pseudoprogression (3–6 months post-chemoradiation): transient increase in enhancement caused by treatment-induced endothelial injury and inflammation. Occurs in 20–30% of newly diagnosed GBM patients and is actually associated with better prognosis. On imaging, it’s often indistinguishable from true progression on conventional MRI alone. DSC perfusion (low rCBV, sensitivity 90%, specificity 88%) and DCE perfusion help differentiate.
Radiation necrosis (>6 months): heterogeneous “frond-like” enhancement with central necrosis. rCBV achieves 91% sensitivity and 100% specificity when combined with DTI and MRS. A meta-analysis of AI approaches found overall accuracy of 80% (sensitivity 85%, specificity 69%) for distinguishing pseudoprogression from true progression.
Pseudoresponse: bevacizumab normalizes abnormal blood-brain barrier permeability, causing rapid decrease in enhancement without true tumor response. Response rates of 35–63% in trials, but no overall survival benefit. rCBV changes at 2 weeks predicted outcome (P=0.002) when enhancement-based RANO could not (P=0.86). Non-enhancing tumor often progresses on T2/FLAIR while enhancement decreases — making enhancement an unreliable metric in bevacizumab-treated patients.
Immune checkpoint inhibitors can trigger inflammatory responses that mimic tumor progression on imaging. The previous iRANO criteria proposed mandatory 3-month confirmation scans, but validation showed this censored >50% of patients without improving PFS-OS correlation. RANO 2.0 now limits confirmation requirements to the first 12 weeks post-RT.
Pre-treatment BraTS has three foreground labels: enhancing tumor (ET), necrosis (NCR), and peritumoral edema (ED). Post-treatment segmentation requires additional classes to capture treatment effects:
Label 0: Background.
Label 1: Necrotic/non-enhancing tumor core.
Label 2: Peritumoral edema.
Label 4: Enhancing tumor.

These labels capture a treatment-naive tumor's anatomy.
All pre-treatment labels plus:

Surgical cavity: distinct from necrosis; a fluid-filled void where tumor was resected.
Radiation necrosis: distinct from tumor necrosis; treatment-induced tissue death.
Non-enhancing residual tumor: distinct from edema; residual tumor that doesn't enhance on T1ce.
Blood products: acute vs chronic.

These additional classes are critical because conflating surgical cavity with tumor necrosis, or edema with non-enhancing tumor, leads to incorrect volume measurements and wrong RANO classifications.
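One way to organize the expanded scheme is a lookup table that extends the BraTS IDs. The BraTS IDs (0, 1, 2, 4) are standard; the IDs ≥ 5 below are placeholders chosen for this example, not an established convention.

```python
# BraTS pre-treatment labels (standard IDs).
PRE_TREATMENT_LABELS = {
    0: "background",
    1: "necrotic/non-enhancing tumor core",
    2: "peritumoral edema",
    4: "enhancing tumor",
}

# Expanded post-treatment scheme. IDs 5-8 are hypothetical placeholders
# for this sketch; pick IDs consistent with your annotation protocol.
POST_TREATMENT_LABELS = {
    **PRE_TREATMENT_LABELS,
    5: "surgical cavity",
    6: "radiation necrosis",
    7: "non-enhancing residual tumor",
    8: "blood products",
}
```

Keeping both schemes in one module makes it explicit which model (pre- or post-treatment) a given segmentation came from.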
Inter-rater agreement is lower for post-treatment labels (Dice 0.60–0.75) compared to pre-treatment (0.74–0.85), reflecting the genuine ambiguity of post-treatment imaging. Your model’s ceiling is limited by this human disagreement.
Longitudinal data is inherently more complex than single-timepoint data. Each patient has multiple scans, each scan has multiple modalities, and the clinical context (treatment dates, molecular markers, RANO assessments) must travel with the imaging data.
# Recommended folder structure for longitudinal neuro-oncology data
Patient_001/
├── clinical/
│ ├── demographics.json # Age, sex, IDH, MGMT, 1p/19q
│ └── treatment_timeline.json # Surgery, RT, chemo dates
├── T0_PreOp/ # Baseline pre-operative
│ ├── T1.nii.gz
│ ├── T1ce.nii.gz
│ ├── T2.nii.gz
│ ├── FLAIR.nii.gz
│ └── seg.nii.gz # Pre-treatment labels
├── T1_PostOp_48hr/ # 48-hour post-surgery
│ └── ... # Post-treatment labels
├── T2_PostRT_Baseline/ # Post-radiation (RANO 2.0 baseline)
├── T3_FollowUp_3mo/
├── T4_FollowUp_6mo/
├── T5_FollowUp_9mo/
└── T6_Progression/
This nested layout differs from flat training conventions such as nnU-Net's imagesTr. Mixing pre- and post-treatment data in the same training set without careful stratification will hurt performance. Consider training separate models for pre-treatment (BraTS-style labels) and post-treatment (expanded labels with surgical cavity, radiation necrosis), or use a single model with the full post-treatment label set if you have sufficient post-treatment annotations.

XNAT is the most widely used platform for longitudinal neuroimaging research. It tracks subjects, experiments (scans), and assessments across time, integrates with PACS, and supports automated processing pipelines. The Heidelberg automated RANO pipeline uses XNAT as its backbone, processing data from 34 institutions. Flywheel is a cloud-based alternative with built-in DICOM management. The LUMIERE dataset provides longitudinal glioblastoma MRI with expert RANO ratings, serving as a benchmark for automated response assessment.
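A small helper can turn the per-patient folder layout shown above into an in-memory index that downstream steps (registration, segmentation, volumetry) can iterate over. This is a sketch: it assumes timepoint folders start with `T` plus a digit (T0_PreOp, T1_PostOp_48hr, ...), as in the example tree.

```python
from pathlib import Path

MODALITIES = ("T1", "T1ce", "T2", "FLAIR")

def index_patient(patient_dir):
    """Build a {timepoint_name: {modality: path}} index from the folder
    layout above. Skips non-timepoint folders such as clinical/."""
    timepoints = {}
    for tp_dir in sorted(Path(patient_dir).iterdir()):
        if not (tp_dir.is_dir()
                and tp_dir.name[:1] == "T"
                and tp_dir.name[1:2].isdigit()):
            continue  # e.g. clinical/ holds JSON metadata, not scans
        scans = {m: tp_dir / f"{m}.nii.gz" for m in MODALITIES
                 if (tp_dir / f"{m}.nii.gz").exists()}
        seg = tp_dir / "seg.nii.gz"
        if seg.exists():
            scans["seg"] = seg
        timepoints[tp_dir.name] = scans
    return timepoints
```

Because the folder names sort lexically (T0, T1, ..., T6), the returned dict preserves temporal order, which the pairwise registration step depends on.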
Comparing tumor volume between two timepoints requires registering the scans to a common space. But after surgery, the brain has shifted — there’s a cavity where tumor used to be, edema may have resolved, ventricles may have expanded. Standard rigid registration is insufficient.
Rigid registration: corrects for head position differences only. Fast and reliable, but doesn't account for tissue changes. Use as a first step.
Affine registration: adds global scaling and shearing. Better for accounting for brain shift.
Deformable registration (ANTs SyN, FSL FNIRT): models local tissue deformation. Necessary when anatomy has changed significantly (resection, edema resolution), but over-aggressive deformation can warp tumor regions incorrectly.

Quality control of registration is critical: misalignment leads to false volume changes that mimic progression or response.
For longitudinal RANO assessment, pairwise registration between consecutive timepoints is often preferred over registering everything to a common template. Each scan is registered to its immediately preceding scan, preserving the temporal chain. The 2025 ACR guidelines emphasize that the post-radiotherapy scan should serve as the registration reference (new baseline), not the pre-operative scan.
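The chaining idea can be illustrated with the affine part of the problem. In practice you would use ANTs SyN and compose deformable warps, but the bookkeeping is the same as composing 4×4 affine matrices: each pairwise transform maps one timepoint into its predecessor's space, and the product maps the latest scan back to the baseline. The toy transforms below are illustrative, not real registration outputs.

```python
import numpy as np

def compose_chain(pairwise_affines):
    """Compose 4x4 affine transforms. pairwise_affines[k] maps timepoint
    k+1 into the space of timepoint k, so the product maps the latest
    timepoint all the way back to the baseline (timepoint 0)."""
    T = np.eye(4)
    for A in pairwise_affines:  # the baseline-most transform ends up applied last
        T = T @ A
    return T

# Toy example: each follow-up scan is shifted +2 mm along x relative to
# the previous one, so each pairwise registration corrects a -2 mm shift.
shift = np.eye(4)
shift[0, 3] = -2.0
T_total = compose_chain([shift, shift, shift])

point_t3 = np.array([10.0, 0.0, 0.0, 1.0])  # a position at timepoint 3
point_t0 = T_total @ point_t3               # the same anatomy in baseline space
```

Composing pairwise transforms rather than registering every scan directly to a distant baseline keeps each individual registration between anatomically similar images, which is exactly why the temporal chain is preferred.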
The landmark Kickingereder et al. study (Lancet Oncology, 2019) is the single most important paper demonstrating why AI-based volumetry matters for treatment response. Tested on 596 patients from the EORTC-26101 trial across 38 institutions:
Automated volumetric progression detection was a superior surrogate endpoint for overall survival compared to RANO-defined progression. A separate validation on 760 pre-operative and 504 post-operative patients achieved ICC of 0.959 for enhancing tumor volume and 0.960 for cavity volume, with processing up to 20 times faster than manual delineation.
The relationship between 2D and 3D measurements is non-linear because tumors grow irregularly. Studies consistently show ~20% disagreement between 2D and volumetric response classification for pediatric gliomas. A 50% reduction in bidimensional product does not correspond to a consistent volumetric change. This non-linearity is why RANO 2.0 now endorses volumetric assessment.
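A toy calculation makes the non-linearity concrete. Under the idealized assumption of isotropic shrinkage by a linear factor k (real tumors shrink anisotropically, which is precisely why 2D and 3D classifications disagree), the bidimensional product scales as k² while the volume scales as k³:

```python
import numpy as np

# Tumor shrinks isotropically by linear factor k, chosen so the
# bidimensional product falls by exactly 50% (the 2D PR threshold).
k = np.sqrt(0.5)

product_change = k**2 - 1   # -0.50: exactly at the 2D PR cutoff
volume_change = k**3 - 1    # about -0.646: well past a 50% volumetric cutoff
```

So the same physical change sits exactly on the 2D PR boundary but corresponds to roughly a 65% volume reduction, and the mismatch grows as tumors depart from this idealized geometry.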
This is one of the most clinically consequential problems in neuro-oncology imaging. Misclassifying pseudoprogression as true progression may lead to premature treatment changes. Misclassifying true progression as pseudoprogression delays life-saving intervention.
A 2025 meta-analysis of 26 studies (1,972 patients) found that AI algorithms achieve an overall accuracy of 80% (sensitivity 85%, specificity 69%) for differentiating pseudoprogression from true progression. Specificity remains the bottleneck: the models are better at detecting true progression than confirming pseudoprogression. The most successful approaches combine multiple imaging modalities: a Vision Transformer using pre- and post-operative contrast-enhanced T1 achieved an AUC of 0.952, and adding clinical data pushed this to 0.993. Combined perfusion parameters (rCBV + ASL) achieved an AUC of 0.948.
Conventional MRI has modest performance in the post-treatment setting (sensitivity 68%, specificity 77% per ACR 2025 guidelines). The most powerful discriminators require advanced imaging: DSC perfusion (relative cerebral blood volume), DCE perfusion, diffusion-weighted imaging (ADC maps), MR spectroscopy (Cho/NAA ratios), and amino acid PET. A combined DSC + MRS approach achieved AUC of 0.994 for distinguishing tumor recurrence from radiation necrosis. Future multimodal AI models that integrate these sequences with temporal context will likely achieve near-clinician performance.
Receive DICOM data from PACS. Extract scan date, sequence type, acquisition parameters. Classify the timepoint: pre-op, post-op, post-RT baseline, follow-up. Automated scan-type classifiers achieve 99.7% accuracy.
DICOM → NIfTI conversion, skull stripping (HD-BET/SynthStrip), N4 bias field correction, intensity normalization, resampling to 1mm³ isotropic. Same pipeline as Week 2 but applied to every timepoint.
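Most of these steps are handled by dedicated tools (dcm2niix, HD-BET or SynthStrip, N4 in ANTs/SimpleITK), but the intensity normalization step is simple enough to show directly. This is one common choice (z-scoring over brain voxels only, after skull stripping), not the only valid scheme:

```python
import numpy as np

def zscore_normalize(volume, brain_mask):
    """Z-score intensity normalization computed over brain voxels only.
    `volume` and `brain_mask` are 3D arrays of the same shape; the mask
    comes from skull stripping (e.g. HD-BET or SynthStrip)."""
    brain = volume[brain_mask > 0]
    normalized = (volume - brain.mean()) / (brain.std() + 1e-8)
    normalized[brain_mask == 0] = 0.0  # zero out non-brain background
    return normalized
```

Restricting the statistics to the brain mask matters: including background air would drag the mean toward zero and inflate apparent intensity differences between timepoints.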
Register each scan to its preceding timepoint using rigid + deformable registration (ANTs SyN). Quality-control the registration visually or with automated metrics. Use the post-RT scan as the RANO 2.0 baseline reference.
Use a pre-treatment model for pre-operative scans and a post-treatment model for all subsequent timepoints. Post-treatment models produce surgical cavity, residual tumor, radiation necrosis, and edema labels. Run nnU-Net or BraTumIA on each timepoint independently.
Compute volumes for each tumor compartment at each timepoint: enhancing tumor volume, total tumor volume, cavity volume. Calculate bidimensional products for RANO 1.0 compatibility. Generate volume-over-time plots.
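Volume computation itself is just voxel counting scaled by voxel size. A minimal sketch, assuming the segmentation is an integer label array and the voxel spacing is known from the NIfTI header:

```python
import numpy as np

def compartment_volumes_ml(seg, voxel_spacing_mm):
    """Volume of each non-background label in a segmentation, in mL.
    `seg` is an integer 3D array; `voxel_spacing_mm` is (sx, sy, sz),
    typically read from the NIfTI header (1 mm isotropic after resampling)."""
    voxel_ml = float(np.prod(voxel_spacing_mm)) / 1000.0  # mm^3 -> mL
    labels, counts = np.unique(seg, return_counts=True)
    return {int(l): float(c) * voxel_ml
            for l, c in zip(labels, counts) if l != 0}
```

With 1 mm isotropic spacing, a label occupying 1,000 voxels corresponds to 1.0 mL, so volumes read directly off the voxel counts after the Week 2-style resampling.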
Apply RANO 2.0 thresholds to volume changes relative to the post-RT baseline: ≥25% increase → PD, ≥50% decrease → PR. Flag equivocal cases within the 12-week post-RT window for pseudoprogression confirmation. Integrate clinical data (steroid dose, neurological status) for final classification.
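The thresholding and flagging logic of this step can be sketched as follows. The volumetric cutoffs are the ones stated above, and the function deliberately ignores the clinical inputs (steroid dose, neurological status) that a complete RANO classification requires.

```python
def rano_volumetric(baseline_ml, current_ml, weeks_since_rt, new_lesions=False):
    """Classify volume change relative to the post-RT baseline.
    Returns (category, needs_confirmation). Clinical factors are omitted."""
    if new_lesions:
        category = "PD"
    elif baseline_ml == 0:
        category = "PD" if current_ml > 0 else "CR"
    else:
        change = (current_ml - baseline_ml) / baseline_ml
        if change >= 0.25:
            category = "PD"
        elif change <= -0.50:
            category = "CR" if current_ml == 0 else "PR"
        else:
            category = "SD"
    # Within the first 12 weeks post-RT, apparent progression may be
    # pseudoprogression and should be flagged for a confirmation scan.
    needs_confirmation = (category == "PD" and weeks_since_rt <= 12)
    return category, needs_confirmation
```

For example, a 30% volume increase at 8 weeks post-RT classifies as PD but is flagged for confirmation, whereas the same increase at 20 weeks is not.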
Generate structured report with volume trends, RANO classification, and uncertainty flags. Push results (segmentation + volume chart + RANO assessment) back to PACS. The Heidelberg pipeline demonstrated this end-to-end across 34 institutions in a clinical trial.
Pediatric brain tumors present unique longitudinal challenges: the brain is still developing (making registration harder), tumor biology differs from adults (more posterior fossa tumors, different molecular subtypes), and treatment protocols are different. The RAPNO (Response Assessment in Pediatric Neuro-Oncology) criteria are analogous to RANO for adults, and AI-RAPNO is an emerging initiative applying AI to pediatric response assessment.
A deep learning pipeline for pediatric tumors (794 pre-operative + 1,003 post-operative MRIs) achieved automated RAPNO scores with ICC of 0.851–0.909 vs manual measurements, and was superior in repeatability for patients with multiple lesions. Two companion Lancet Oncology reviews (2025) laid out the state of the art and the challenges for clinical translation, emphasizing that pediatric datasets are scarce, imaging protocols are heterogeneous, and validated AI tools are urgently needed.