BraTS teaches you to segment a single scan. But real patients are scanned repeatedly — before surgery, after radiation, every three months for years. This week you learn to organize, register, and segment longitudinal brain MRI, measure treatment response using RANO criteria, and navigate the imaging challenges that make post-treatment segmentation fundamentally harder than pre-treatment.
The BraTS challenge gives you one MRI per patient, taken before surgery. In the real world, a glioblastoma patient may have 15–30 MRI scans over the course of their treatment — baseline, 48 hours post-surgery, post-radiation baseline, then every 2–3 months for surveillance. The clinical question is never “what does the tumor look like right now?” but “is it getting better, worse, or staying the same?”
This shift from single-timepoint segmentation to longitudinal treatment monitoring introduces challenges that don’t exist in BraTS: the tumor changes appearance after each treatment, the brain itself changes shape (resection, edema resolution), and treatment creates imaging artifacts that look like tumor but aren’t (pseudoprogression, radiation necrosis). Understanding these challenges is essential for building clinically useful AI.
The Response Assessment in Neuro-Oncology (RANO) criteria are the international standard for determining whether a brain tumor is responding to treatment. They define four response categories based on imaging measurements and clinical status. Understanding RANO is essential because it defines what “success” means for your segmentation model in a clinical context.
The original criteria use bidimensional measurements: multiply the longest diameter of the enhancing tumor by its perpendicular diameter. Response is defined as:

Complete Response (CR): disappearance of all enhancing tumor.
Partial Response (PR): ≥50% decrease in the product of perpendicular diameters.
Stable Disease (SD): neither PR nor PD criteria met.
Progressive Disease (PD): ≥25% increase in the product, or new lesions.
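These thresholds are easy to express in code. The sketch below applies only the size-based rules; real RANO assessment also requires clinical status, steroid dose, and confirmation scans, which are deliberately omitted here.

```python
def rano_2d_response(baseline_product_mm2, current_product_mm2, new_lesions=False):
    """Classify response from bidimensional products (longest diameter x
    perpendicular diameter, in mm^2) using the original RANO thresholds.
    Clinical status and steroid dose, which RANO also requires, are omitted."""
    if new_lesions:
        return "PD"
    if current_product_mm2 == 0:
        return "CR"  # disappearance of all enhancing tumor
    change = (current_product_mm2 - baseline_product_mm2) / baseline_product_mm2
    if change >= 0.25:
        return "PD"  # >=25% increase in the product
    if change <= -0.50:
        return "PR"  # >=50% decrease in the product
    return "SD"
```

For example, a tumor whose product shrinks from 400 mm² to 180 mm² (a 55% decrease) classifies as PR, while growth from 400 mm² to 520 mm² (a 30% increase) classifies as PD.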
Published in the Journal of Clinical Oncology, RANO 2.0 is a major update informed by data from 1,106 glioblastoma patients. The key changes represent lessons learned from a decade of clinical experience:
RANO 2.0 uses the post-radiotherapy MRI as baseline, not the post-surgical scan. This change alone improved the correlation between progression-free survival (PFS) and overall survival (OS) from 0.53 to 0.67. The rationale: comparing pre-treatment to post-treatment scans introduces too much noise from treatment effects.
RANO 2.0 formally introduces volumetric measurement as an option alongside traditional 2D. This is a direct acknowledgment that bidimensional measurements are unreliable for irregularly shaped tumors. A study comparing RANO 2.0 vs mRANO vs RANO 1.0 found that RANO 2.0 provided the strongest survival risk stratification (HR=3.6). The vorasidenib trial (NEJM 2023) used volumetric RANO for low-grade gliomas, demonstrating clinical trial adoption.
Confirmation scans are now mandatory only within the first 12 weeks post-radiotherapy (when pseudoprogression incidence is highest at 20–30%), not at all timepoints. The previous iRANO approach required 3-month confirmation for all suspected progression, which censored >50% of patients without improving outcomes.
T2/FLAIR evaluation is eliminated for IDH wild-type glioblastoma (except with anti-angiogenic therapy). Data showed that adding FLAIR evaluation did not improve the correlation between PFS and OS. For IDH-mutant tumors with large non-enhancing components, both enhancing and non-enhancing tumor must still be evaluated.
RANO-BM (Lancet Oncology, 2015) defines separate criteria for metastases: measurable disease is ≥10mm in longest diameter (or ≥5mm with thin-slice MRI). Up to 5 target CNS lesions are measured. PR requires ≥30% decrease in the sum of longest diameters; PD requires ≥20% increase or new lesions. CNS response is assessed independently from systemic disease.
Every treatment changes how the tumor looks on MRI. A model trained only on pre-operative BraTS data will fail on post-treatment scans because it has never seen these appearances. Understanding them is critical for building models that work longitudinally.
The resection cavity appears as a fluid-filled void where the tumor was. Blood products at the cavity rim are hyperintense on T1 (mimicking enhancing tumor) and variable on FLAIR. Post-operative enhancement along the cavity margin is normal and does not indicate residual tumor. Imaging should be acquired within 48 hours post-surgery to establish the post-surgical baseline before enhancement develops. Segmentation models need a separate “surgical cavity” label.
Pseudoprogression (3–6 months post-chemoradiation): transient increase in enhancement caused by treatment-induced endothelial injury and inflammation. Occurs in 20–30% of newly diagnosed GBM patients and is actually associated with better prognosis. On imaging, it’s often indistinguishable from true progression on conventional MRI alone. DSC perfusion (low rCBV, sensitivity 90%, specificity 88%) and DCE perfusion help differentiate.
Radiation necrosis (>6 months): heterogeneous “frond-like” enhancement with central necrosis. rCBV achieves 91% sensitivity and 100% specificity when combined with DTI and MRS. A meta-analysis of AI approaches found overall accuracy of 80% (sensitivity 85%, specificity 69%) for distinguishing pseudoprogression from true progression.
Pseudoresponse: bevacizumab normalizes abnormal blood-brain barrier permeability, causing rapid decrease in enhancement without true tumor response. Response rates of 35–63% in trials, but no overall survival benefit. rCBV changes at 2 weeks predicted outcome (P=0.002) when enhancement-based RANO could not (P=0.86). Non-enhancing tumor often progresses on T2/FLAIR while enhancement decreases — making enhancement an unreliable metric in bevacizumab-treated patients.
Immune checkpoint inhibitors can trigger inflammatory responses that mimic tumor progression on imaging. The previous iRANO criteria proposed mandatory 3-month confirmation scans, but validation showed this censored >50% of patients without improving PFS-OS correlation. RANO 2.0 now limits confirmation requirements to the first 12 weeks post-RT.
Pre-treatment BraTS has three foreground labels: enhancing tumor (ET), necrosis (NCR), and peritumoral edema (ED). Post-treatment segmentation requires additional classes to capture treatment effects:
Label 0: Background.
Label 1: Necrotic/non-enhancing tumor core.
Label 2: Peritumoral edema.
Label 4: Enhancing tumor.

These labels capture a treatment-naive tumor's anatomy.
All pre-treatment labels plus:

Surgical cavity: distinct from necrosis; a fluid-filled void where tumor was resected.
Radiation necrosis: distinct from tumor necrosis; treatment-induced tissue death.
Non-enhancing residual tumor: distinct from edema; residual tumor that doesn't enhance on T1ce.
Blood products: acute vs chronic.

These additional classes are critical because conflating surgical cavity with tumor necrosis, or edema with non-enhancing tumor, leads to incorrect volume measurements and wrong RANO classifications.
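One way to organize the expanded scheme is a lookup table that extends the BraTS IDs. The BraTS IDs (0, 1, 2, 4) are standard; the IDs ≥ 5 below are placeholders chosen for this example, not an established convention.

```python
# BraTS pre-treatment labels (standard IDs).
PRE_TREATMENT_LABELS = {
    0: "background",
    1: "necrotic/non-enhancing tumor core",
    2: "peritumoral edema",
    4: "enhancing tumor",
}

# Expanded post-treatment scheme. IDs 5-8 are hypothetical placeholders
# for this sketch; pick IDs consistent with your annotation protocol.
POST_TREATMENT_LABELS = {
    **PRE_TREATMENT_LABELS,
    5: "surgical cavity",
    6: "radiation necrosis",
    7: "non-enhancing residual tumor",
    8: "blood products",
}
```

Keeping both schemes in one module makes it explicit which model (pre- or post-treatment) a given segmentation came from.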
Inter-rater agreement is lower for post-treatment labels (Dice 0.60–0.75) compared to pre-treatment (0.74–0.85), reflecting the genuine ambiguity of post-treatment imaging. Your model’s ceiling is limited by this human disagreement.
Longitudinal data is inherently more complex than single-timepoint data. Each patient has multiple scans, each scan has multiple modalities, and the clinical context (treatment dates, molecular markers, RANO assessments) must travel with the imaging data.
# Recommended folder structure for longitudinal neuro-oncology data
Patient_001/
├── clinical/
│ ├── demographics.json # Age, sex, IDH, MGMT, 1p/19q
│ └── treatment_timeline.json # Surgery, RT, chemo dates
├── T0_PreOp/ # Baseline pre-operative
│ ├── T1.nii.gz
│ ├── T1ce.nii.gz
│ ├── T2.nii.gz
│ ├── FLAIR.nii.gz
│ └── seg.nii.gz # Pre-treatment labels
├── T1_PostOp_48hr/ # 48-hour post-surgery
│ └── ... # Post-treatment labels
├── T2_PostRT_Baseline/ # Post-radiation (RANO 2.0 baseline)
├── T3_FollowUp_3mo/
├── T4_FollowUp_6mo/
├── T5_FollowUp_9mo/
└── T6_Progression/
This nested layout differs from flat training conventions such as nnU-Net's imagesTr. Mixing pre- and post-treatment data in the same training set without careful stratification will hurt performance. Consider training separate models for pre-treatment (BraTS-style labels) and post-treatment (expanded labels with surgical cavity, radiation necrosis), or use a single model with the full post-treatment label set if you have sufficient post-treatment annotations.

XNAT is the most widely used platform for longitudinal neuroimaging research. It tracks subjects, experiments (scans), and assessments across time, integrates with PACS, and supports automated processing pipelines. The Heidelberg automated RANO pipeline uses XNAT as its backbone, processing data from 34 institutions. Flywheel is a cloud-based alternative with built-in DICOM management. The LUMIERE dataset provides longitudinal glioblastoma MRI with expert RANO ratings, serving as a benchmark for automated response assessment.
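A small helper can turn the per-patient folder layout shown above into an in-memory index that downstream steps (registration, segmentation, volumetry) can iterate over. This is a sketch: it assumes timepoint folders start with `T` plus a digit (T0_PreOp, T1_PostOp_48hr, ...), as in the example tree.

```python
from pathlib import Path

MODALITIES = ("T1", "T1ce", "T2", "FLAIR")

def index_patient(patient_dir):
    """Build a {timepoint_name: {modality: path}} index from the folder
    layout above. Skips non-timepoint folders such as clinical/."""
    timepoints = {}
    for tp_dir in sorted(Path(patient_dir).iterdir()):
        if not (tp_dir.is_dir()
                and tp_dir.name[:1] == "T"
                and tp_dir.name[1:2].isdigit()):
            continue  # e.g. clinical/ holds JSON metadata, not scans
        scans = {m: tp_dir / f"{m}.nii.gz" for m in MODALITIES
                 if (tp_dir / f"{m}.nii.gz").exists()}
        seg = tp_dir / "seg.nii.gz"
        if seg.exists():
            scans["seg"] = seg
        timepoints[tp_dir.name] = scans
    return timepoints
```

Because the folder names sort lexically (T0, T1, ..., T6), the returned dict preserves temporal order, which the pairwise registration step depends on.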
Comparing tumor volume between two timepoints requires registering the scans to a common space. But after surgery, the brain has shifted — there’s a cavity where tumor used to be, edema may have resolved, ventricles may have expanded. Standard rigid registration is insufficient.
Rigid registration: corrects for head position differences only. Fast and reliable, but doesn't account for tissue changes. Use as a first step.
Affine registration: adds global scaling and shearing. Better for accounting for brain shift.
Deformable registration (ANTs SyN, FSL FNIRT): models local tissue deformation. Necessary when anatomy has changed significantly (resection, edema resolution), but over-aggressive deformation can warp tumor regions incorrectly.

Quality control of registration is critical: misalignment leads to false volume changes that mimic progression or response.
For longitudinal RANO assessment, pairwise registration between consecutive timepoints is often preferred over registering everything to a common template. Each scan is registered to its immediately preceding scan, preserving the temporal chain. The 2025 ACR guidelines emphasize that the post-radiotherapy scan should serve as the registration reference (new baseline), not the pre-operative scan.
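The chaining idea can be illustrated with the affine part of the problem. In practice you would use ANTs SyN and compose deformable warps, but the bookkeeping is the same as composing 4×4 affine matrices: each pairwise transform maps one timepoint into its predecessor's space, and the product maps the latest scan back to the baseline. The toy transforms below are illustrative, not real registration outputs.

```python
import numpy as np

def compose_chain(pairwise_affines):
    """Compose 4x4 affine transforms. pairwise_affines[k] maps timepoint
    k+1 into the space of timepoint k, so the product maps the latest
    timepoint all the way back to the baseline (timepoint 0)."""
    T = np.eye(4)
    for A in pairwise_affines:  # the baseline-most transform ends up applied last
        T = T @ A
    return T

# Toy example: each follow-up scan is shifted +2 mm along x relative to
# the previous one, so each pairwise registration corrects a -2 mm shift.
shift = np.eye(4)
shift[0, 3] = -2.0
T_total = compose_chain([shift, shift, shift])

point_t3 = np.array([10.0, 0.0, 0.0, 1.0])  # a position at timepoint 3
point_t0 = T_total @ point_t3               # the same anatomy in baseline space
```

Composing pairwise transforms rather than registering every scan directly to a distant baseline keeps each individual registration between anatomically similar images, which is exactly why the temporal chain is preferred.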
The landmark Kickingereder et al. study (Lancet Oncology, 2019) is the single most important paper demonstrating why AI-based volumetry matters for treatment response. Tested on 596 patients from the EORTC-26101 trial across 38 institutions:
Automated volumetric progression detection was a superior surrogate endpoint for overall survival compared to RANO-defined progression. A separate validation on 760 pre-operative and 504 post-operative patients achieved ICC of 0.959 for enhancing tumor volume and 0.960 for cavity volume, with processing up to 20 times faster than manual delineation.
The relationship between 2D and 3D measurements is non-linear because tumors grow irregularly. Studies consistently show ~20% disagreement between 2D and volumetric response classification for pediatric gliomas. A 50% reduction in bidimensional product does not correspond to a consistent volumetric change. This non-linearity is why RANO 2.0 now endorses volumetric assessment.
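A toy calculation makes the non-linearity concrete. Under the idealized assumption of isotropic shrinkage by a linear factor k (real tumors shrink anisotropically, which is precisely why 2D and 3D classifications disagree), the bidimensional product scales as k² while the volume scales as k³:

```python
import numpy as np

# Tumor shrinks isotropically by linear factor k, chosen so the
# bidimensional product falls by exactly 50% (the 2D PR threshold).
k = np.sqrt(0.5)

product_change = k**2 - 1   # -0.50: exactly at the 2D PR cutoff
volume_change = k**3 - 1    # about -0.646: well past a 50% volumetric cutoff
```

So the same physical change sits exactly on the 2D PR boundary but corresponds to roughly a 65% volume reduction, and the mismatch grows as tumors depart from this idealized geometry.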
This is one of the most clinically consequential problems in neuro-oncology imaging. Misclassifying pseudoprogression as true progression may lead to premature treatment changes. Misclassifying true progression as pseudoprogression delays life-saving intervention.
A 2025 meta-analysis of 26 studies (1,972 patients) found that AI algorithms achieve an overall accuracy of 80% (sensitivity 85%, specificity 69%) for differentiating pseudoprogression from true progression. Specificity remains the bottleneck: the models are better at detecting true progression than confirming pseudoprogression. The most successful approaches combine multiple imaging modalities: a Vision Transformer using pre- and post-operative contrast-enhanced T1 achieved an AUC of 0.952, and adding clinical data pushed this to 0.993. Combined perfusion parameters (rCBV + ASL) achieved an AUC of 0.948.
Conventional MRI has modest performance in the post-treatment setting (sensitivity 68%, specificity 77% per ACR 2025 guidelines). The most powerful discriminators require advanced imaging: DSC perfusion (relative cerebral blood volume), DCE perfusion, diffusion-weighted imaging (ADC maps), MR spectroscopy (Cho/NAA ratios), and amino acid PET. A combined DSC + MRS approach achieved AUC of 0.994 for distinguishing tumor recurrence from radiation necrosis. Future multimodal AI models that integrate these sequences with temporal context will likely achieve near-clinician performance.
Receive DICOM data from PACS. Extract scan date, sequence type, acquisition parameters. Classify the timepoint: pre-op, post-op, post-RT baseline, follow-up. Automated scan-type classifiers achieve 99.7% accuracy.
DICOM → NIfTI conversion, skull stripping (HD-BET/SynthStrip), N4 bias field correction, intensity normalization, resampling to 1mm³ isotropic. Same pipeline as Week 2 but applied to every timepoint.
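Most of these steps are handled by dedicated tools (dcm2niix, HD-BET or SynthStrip, N4 in ANTs/SimpleITK), but the intensity normalization step is simple enough to show directly. This is one common choice (z-scoring over brain voxels only, after skull stripping), not the only valid scheme:

```python
import numpy as np

def zscore_normalize(volume, brain_mask):
    """Z-score intensity normalization computed over brain voxels only.
    `volume` and `brain_mask` are 3D arrays of the same shape; the mask
    comes from skull stripping (e.g. HD-BET or SynthStrip)."""
    brain = volume[brain_mask > 0]
    normalized = (volume - brain.mean()) / (brain.std() + 1e-8)
    normalized[brain_mask == 0] = 0.0  # zero out non-brain background
    return normalized
```

Restricting the statistics to the brain mask matters: including background air would drag the mean toward zero and inflate apparent intensity differences between timepoints.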
Register each scan to its preceding timepoint using rigid + deformable registration (ANTs SyN). Quality-control the registration visually or with automated metrics. Use the post-RT scan as the RANO 2.0 baseline reference.
Use a pre-treatment model for pre-operative scans and a post-treatment model for all subsequent timepoints. Post-treatment models produce surgical cavity, residual tumor, radiation necrosis, and edema labels. Run nnU-Net or BraTumIA on each timepoint independently.
Compute volumes for each tumor compartment at each timepoint: enhancing tumor volume, total tumor volume, cavity volume. Calculate bidimensional products for RANO 1.0 compatibility. Generate volume-over-time plots.
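Volume computation itself is just voxel counting scaled by voxel size. A minimal sketch, assuming the segmentation is an integer label array and the voxel spacing is known from the NIfTI header:

```python
import numpy as np

def compartment_volumes_ml(seg, voxel_spacing_mm):
    """Volume of each non-background label in a segmentation, in mL.
    `seg` is an integer 3D array; `voxel_spacing_mm` is (sx, sy, sz),
    typically read from the NIfTI header (1 mm isotropic after resampling)."""
    voxel_ml = float(np.prod(voxel_spacing_mm)) / 1000.0  # mm^3 -> mL
    labels, counts = np.unique(seg, return_counts=True)
    return {int(l): float(c) * voxel_ml
            for l, c in zip(labels, counts) if l != 0}
```

With 1 mm isotropic spacing, a label occupying 1,000 voxels corresponds to 1.0 mL, so volumes read directly off the voxel counts after the Week 2-style resampling.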
Apply RANO 2.0 thresholds to volume changes relative to the post-RT baseline: ≥25% increase → PD, ≥50% decrease → PR. Flag equivocal cases within the 12-week post-RT window for pseudoprogression confirmation. Integrate clinical data (steroid dose, neurological status) for final classification.
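The thresholding and flagging logic of this step can be sketched as follows. The volumetric cutoffs are the ones stated above, and the function deliberately ignores the clinical inputs (steroid dose, neurological status) that a complete RANO classification requires.

```python
def rano_volumetric(baseline_ml, current_ml, weeks_since_rt, new_lesions=False):
    """Classify volume change relative to the post-RT baseline.
    Returns (category, needs_confirmation). Clinical factors are omitted."""
    if new_lesions:
        category = "PD"
    elif baseline_ml == 0:
        category = "PD" if current_ml > 0 else "CR"
    else:
        change = (current_ml - baseline_ml) / baseline_ml
        if change >= 0.25:
            category = "PD"
        elif change <= -0.50:
            category = "CR" if current_ml == 0 else "PR"
        else:
            category = "SD"
    # Within the first 12 weeks post-RT, apparent progression may be
    # pseudoprogression and should be flagged for a confirmation scan.
    needs_confirmation = (category == "PD" and weeks_since_rt <= 12)
    return category, needs_confirmation
```

For example, a 30% volume increase at 8 weeks post-RT classifies as PD but is flagged for confirmation, whereas the same increase at 20 weeks is not.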
Generate structured report with volume trends, RANO classification, and uncertainty flags. Push results (segmentation + volume chart + RANO assessment) back to PACS. The Heidelberg pipeline demonstrated this end-to-end across 34 institutions in a clinical trial.
Pediatric brain tumors present unique longitudinal challenges: the brain is still developing (making registration harder), tumor biology differs from adults (more posterior fossa tumors, different molecular subtypes), and treatment protocols are different. The RAPNO (Response Assessment in Pediatric Neuro-Oncology) criteria are analogous to RANO for adults, and AI-RAPNO is an emerging initiative applying AI to pediatric response assessment.
A deep learning pipeline for pediatric tumors (794 pre-operative + 1,003 post-operative MRIs) achieved automated RAPNO scores with ICC of 0.851–0.909 vs manual measurements, and was superior in repeatability for patients with multiple lesions. Two companion Lancet Oncology reviews (2025) laid out the state of the art and the challenges for clinical translation, emphasizing that pediatric datasets are scarce, imaging protocols are heterogeneous, and validated AI tools are urgently needed.