GLOSSARY
Imaging Data Harmonization
Imaging data harmonization is the process of making medical images acquired under different conditions — from different scanners, sites, or time points — comparable enough to be analysed together. Without harmonization, scanner-induced differences can distort endpoint measurements, inflate noise, and reduce the statistical power of a clinical trial.
What is imaging data harmonization?
Imaging data harmonization — also called medical image harmonization or, in European clinical contexts, imaging data harmonisation — is the process of making medical images acquired under different conditions comparable enough to be analysed together as a unified dataset in a clinical trial or research study.
Harmonization does not make all images identical. It corrects for systematic technical differences in image quality, signal intensity, contrast, and resolution that arise because of hardware variability, scanner manufacturer differences, and protocol inconsistencies across sites. The goal is to ensure that measurements derived from those images reflect biology — disease progression, treatment response — rather than scanner artefact.
Unaddressed scanner variability directly compromises endpoint reliability. In a multi-site neuroimaging trial measuring brain atrophy, for example, a mid-trial software update at one site can produce a measurable shift in signal intensity that registers as apparent volume change. That apparent change reflects the software update, not disease progression — and without harmonization, it contaminates the endpoint dataset. Pre-specifying the harmonization method in the statistical analysis plan, and validating it before data lock, is a standard best practice in well-controlled clinical trials — consistent with principles in ICH E9(R1), which require pre-specification of analysis methods.¹
Regulatory agencies such as the FDA expect imaging-derived endpoints and their preprocessing methods — including harmonization — to be clearly defined and validated in advance of analysis.
When is harmonization needed?
Harmonization is most critical in four situations:
Multi-site trials — where images are collected across sites with different scanner models, field strengths, and local acquisition practices. This is the most common use case and the primary driver of harmonization methodology development.
Longitudinal studies with scanner drift — where a single site's scanner undergoes a software update, hardware replacement, or coil change during a multi-year study, introducing a measurable signal shift between early and late timepoints.
Retrospective datasets — where images were collected under routine clinical care rather than a standardised research protocol, producing heterogeneous data that requires correction before research analysis.
Mixed field-strength studies — where images from 1.5T and 3T scanners are combined in the same dataset, requiring explicit harmonization or separate analysis streams.
Harmonization vs. standardization — what is the difference?
These terms are often used interchangeably but describe different points of intervention.
Standardization
Applied before data collection. Defines a single imaging protocol — sequence parameters, field strength, slice thickness, acquisition order — and requires all sites to follow it precisely.
Standardization is the preferred approach in prospective clinical trials because it prevents the variability problem from occurring in the first place. It is, however, insufficient on its own: even well-standardized trials experience scanner drift over multi-year follow-up as sites upgrade hardware or update acquisition software.
Harmonization
Applied after data collection. Addresses variability in images that have already been acquired — because standardization was incomplete, because scanners changed during the trial, or because the study uses retrospective or real-world data collected outside a controlled protocol.
In practice, both are needed together. Standardization reduces the magnitude of the harmonization problem; harmonization corrects for what standardization could not prevent.
What causes imaging variability across sites?
Scanner manufacturer and model
MRI scanners from Siemens, GE, Philips, United Imaging, and Canon use different hardware architectures and image reconstruction algorithms. The same anatomical structure imaged on two different 3T scanners from different manufacturers will produce images with measurably different signal characteristics, even under identical nominal acquisition parameters.
Field strength
1.5T, 3T, and 7T MRI scanners produce images with different signal-to-noise ratios, tissue contrast profiles, and susceptibility artefacts. Combining data from different field strengths in a single analysis requires explicit harmonization or separate analysis streams per field strength.
Software updates
Scanner manufacturers periodically update image reconstruction software. These updates can change signal characteristics subtly — and in longitudinal trials, a mid-study update at one site introduces a measurable signal shift between earlier and later timepoints at that site that is indistinguishable from biological change without correction.
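As a rough illustration (not a validated QC procedure), a step shift around a known software-update date can be flagged by comparing the post-update mean of a longitudinal QC metric against the pre-update baseline. The function, metric, and threshold below are illustrative assumptions, not a specific vendor's or trial's method:

```python
import numpy as np

def detect_shift(values, change_idx, z_thresh=3.0):
    """Flag a step change in a site's longitudinal QC metric.

    values:     QC metric values (e.g. mean phantom signal) ordered by date.
    change_idx: index of the first scan acquired after the suspected update.
    Returns (shifted, z): whether the post-update mean deviates from the
    pre-update mean by more than z_thresh pre-update standard errors.
    """
    before = np.asarray(values[:change_idx], dtype=float)
    after = np.asarray(values[change_idx:], dtype=float)
    # Standard error of the post-update mean, using pre-update variability.
    se = before.std(ddof=1) / np.sqrt(len(after))
    z = (after.mean() - before.mean()) / se
    return abs(z) > z_thresh, z
```

A metric that drifts gradually rather than stepping would need a different model (e.g. trend fitting); this sketch only targets the abrupt shift a software update typically produces.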
Operator variability
Patient positioning, coil placement, and acquisition parameter selection by site radiographers introduce variability even when nominal protocol parameters are standardized. This is partly addressable through site training and qualification, but not fully eliminable across a large site network.
Harmonization methods
Three main approaches are used in clinical research, each with different trade-offs between effectiveness and implementation complexity.
Phantom-based (calibration-based) harmonization
Applies correction factors to image intensities or derived measurements based on calibration phantom measurements or reference scan comparisons. Relatively straightforward to implement and does not require paired training data from multiple sites. Most commonly used in prospective trials where a standardised phantom protocol can be established alongside the clinical imaging protocol: the phantom scans are acquired throughout the trial, and the resulting correction factors are applied to the data after collection.
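In its simplest form, a phantom-based correction assumes a purely multiplicative gain difference between a site's scanner and the trial-wide reference — real phantom programs model this more carefully (per-tissue, per-sequence, sometimes non-linear). A minimal sketch under that linear-gain assumption:

```python
def phantom_correction_factor(measured_phantom, reference_phantom):
    """Multiplicative factor mapping a site's phantom reading onto the
    trial-wide reference value (assumes a simple linear gain model)."""
    return reference_phantom / measured_phantom

def harmonize_measurement(value, factor):
    """Apply a site's correction factor to a subject-derived measurement."""
    return value * factor
```

For example, a site whose phantom reads 95.0 against a reference of 100.0 gets a factor of 100/95, which is then applied to every measurement from that site.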
Statistical harmonization (ComBat and variants)
Models site-specific technical effects statistically and removes them while preserving biological signal. ComBat — originally developed for genomics batch correction — has been validated for MRI data harmonization and is widely used in neuroimaging research.² It is most commonly applied to derived imaging features such as cortical thickness or radiomics measurements rather than to raw image data directly. ComBat and its neuroimaging extensions (neuroComBat, longComBat) have been used in studies supporting regulatory submissions when appropriately validated and pre-specified in the statistical analysis plan. LongComBat is specifically designed to preserve within-subject change over time in longitudinal studies.
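The core idea behind ComBat can be sketched as a per-feature location-scale adjustment: remove each site's mean offset and rescale its variance toward the pooled distribution. The sketch below deliberately omits ComBat's empirical-Bayes shrinkage and its preservation of biological covariates, so it is an illustration of the principle only — real analyses should use a validated implementation such as neuroCombat:

```python
import numpy as np

def simple_site_adjust(features, sites):
    """Simplified location-scale site correction (ComBat without the
    empirical-Bayes shrinkage or covariate-preservation steps).

    features: (n_subjects, n_features) array of derived measurements,
              e.g. regional cortical thickness values.
    sites:    length-n_subjects array of site labels.
    """
    X = np.asarray(features, dtype=float).copy()
    sites = np.asarray(sites)
    grand_mean = X.mean(axis=0)
    pooled_sd = X.std(axis=0, ddof=1)
    for s in np.unique(sites):
        m = sites == s
        site_mean = X[m].mean(axis=0)
        site_sd = X[m].std(axis=0, ddof=1)
        # Standardize within site, then map onto the pooled distribution.
        X[m] = (X[m] - site_mean) / site_sd * pooled_sd + grand_mean
    return X
```

Because this naive version also removes any genuine between-site biological differences (e.g. unbalanced diagnosis groups across sites), full ComBat first models biological covariates explicitly and only removes the residual site effect — which is precisely why covariate specification matters when applying it in a trial.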
AI-based harmonization
Uses deep learning models trained on multi-site data to transform images from one site's appearance into a reference domain, normalising scanner-specific characteristics. More powerful than statistical methods for complex, non-linear variability patterns — but requires large training datasets, careful validation to avoid introducing artefacts, and explicit documentation of the model architecture and training procedure for regulatory submissions.
Harmonization may be applied at the image level or at the level of derived quantitative features, depending on the analysis pipeline and the method used. Regardless of approach, the harmonization method must be pre-specified in the statistical analysis plan and validated before data lock.¹ Post-hoc harmonization — applied after results are known — is generally not accepted as a primary analysis method in regulatory submissions.
Need help designing a harmonization strategy for your imaging trial?
QMENTA's imaging scientists can advise on protocol design, method selection, and statistical analysis plan documentation for your regulatory submission.
Speak to an expert →
Harmonization in multi-site imaging trials
In a multi-site imaging trial, harmonization is typically an analytical requirement when sites use heterogeneous imaging hardware or protocols — not an optional refinement. Images from 20 or 50 sites — acquired on different scanner models, at different field strengths, with different software versions — cannot reliably be combined in a single endpoint analysis without addressing the systematic technical differences between them. These differences are often systematic (site-specific bias) rather than random noise, which is why harmonization is required rather than simple averaging.
The standard pipeline is: standardize the acquisition protocol as thoroughly as possible before the trial starts, qualify sites against that protocol, apply automated quality checking at the point of image receipt, and pre-specify a validated harmonization method in the statistical analysis plan for use before endpoint analysis.
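The "automated quality checking at image receipt" step above amounts to comparing each acquisition's parameters against the trial protocol, with tolerances where the protocol allows them. The field names, spec format, and tolerances in this sketch are hypothetical, not any platform's actual API:

```python
def check_protocol(params, spec):
    """Compare acquired parameters against the trial protocol spec.

    params: dict of acquired values, e.g. {"slice_thickness": 1.05, ...}
    spec:   dict mapping parameter -> (nominal, tolerance); a tolerance of
            None means the value must match exactly (e.g. a sequence name).
    Returns a list of human-readable deviations (empty list = pass).
    """
    deviations = []
    for key, (nominal, tol) in spec.items():
        value = params.get(key)
        if value is None:
            deviations.append(f"{key}: missing")
        elif tol is None:
            if value != nominal:
                deviations.append(f"{key}: {value!r} != {nominal!r}")
        elif abs(value - nominal) > tol:
            deviations.append(f"{key}: {value} outside {nominal}±{tol}")
    return deviations
```

Running a check like this at upload, rather than at analysis time, lets the site re-scan while the patient is still available — which is the practical reason deviation detection belongs at image receipt.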
QMENTA's Imaging Hub supports this pipeline with automated protocol deviation detection at upload, integrated quality dashboards across all sites, and support for established harmonization workflows, including ComBat-based approaches. Algorithm versions used in harmonization and analysis are typically locked at trial initiation, with any changes controlled through formal versioning procedures.
Key takeaways
- Harmonization corrects for scanner and site variability after image collection; standardization prevents it before — both are needed in multi-site trials
- Unaddressed scanner variability can introduce measurement artefacts that are indistinguishable from biological change — directly compromising endpoint validity
- The four situations requiring harmonization: multi-site trials, longitudinal studies with scanner drift, retrospective datasets, mixed field-strength studies
- ComBat and its neuroimaging variants are the most widely validated statistical harmonization method for MRI data
- Harmonization methods must be pre-specified in the statistical analysis plan — post-hoc harmonization is generally not accepted as a primary analysis method in regulatory submissions
- Both American (harmonization) and British (harmonisation) spellings are used in clinical research literature; they refer to the same process
By Paulo Rodrigues, PhD, Chief Technology Officer and Co-Founder at QMENTA
Paulo Rodrigues leads technology strategy at QMENTA and writes about imaging clinical trials, protocol standardization, real-time QC, and compliance-ready neuroimaging workflows for multi-site studies.
¹ ICH E9(R1). Addendum on Estimands and Sensitivity Analysis in Clinical Trials. 2019. ich.org
² Fortin JP, et al. Harmonization of cortical thickness measurements across scanners and sites. NeuroImage. 2018;167:104–120. doi:10.1016/j.neuroimage.2017.11.024
Managing imaging data across multiple sites?
QMENTA's Imaging Hub automates protocol deviation detection, quality checking, and harmonization workflow support — keeping your dataset analytically consistent from first patient to last.

Frequently asked questions
Why is imaging data harmonization important in clinical trials?
Imaging data harmonization is important because unaddressed scanner and site variability inflates measurement noise, reduces statistical power, and can introduce artefacts that are indistinguishable from true biological signal. In a multi-site neuroimaging trial measuring brain atrophy, for example, a mid-trial software update at one site can produce an apparent shift in measured atrophy that reflects the update rather than disease progression. Without harmonization, this artefact contaminates the endpoint dataset and can bias the trial's results in either direction. Harmonization corrects for these systematic technical differences so that measured endpoint changes reflect biology rather than scanner or site effects.
What is the difference between imaging harmonization and standardization?
Standardization is applied before data collection — it means defining a single imaging protocol and requiring all sites to follow it precisely. Harmonization is applied after data collection to address variability in images that have already been acquired. Both are needed in practice: even well-standardized trials experience scanner drift over multi-year follow-up as sites upgrade hardware or update acquisition software. Standardization reduces the magnitude of the harmonization problem; harmonization corrects for what standardization could not prevent.
Does imaging harmonization change the underlying image data?
It depends on the harmonization method. Image-level harmonization methods — such as AI-based approaches that transform pixel values — modify the image data directly. Feature-level or statistical harmonization methods — such as ComBat applied to extracted measurements — correct biomarker values without altering the raw image. For regulatory submissions, the harmonization method and its validation must be documented in the statistical analysis plan regardless of which approach is used. Post-hoc harmonization applied after results are known is generally not accepted as a primary analysis method in regulatory submissions.
When should harmonization be applied — before or after biomarker extraction?
Both approaches are used in practice. Image-level harmonization applied to raw DICOM data before analysis corrects for scanner effects at the source and affects all downstream measurements consistently. Feature-level harmonization applied to extracted measurements is computationally simpler and more commonly used in large multi-site trials. The choice should be pre-specified in the statistical analysis plan and driven by the specific biomarker, the degree of scanner variability in the dataset, and the regulatory context.
Is harmonization required for single-site trials?
Single-site trials using the same scanner throughout the study generally do not require formal harmonization unless scanner conditions change during the study. However, long-running single-site studies should include longitudinal quality checks — such as periodic phantom scans — to detect signal shifts caused by scanner software updates, hardware replacements, or coil changes. If a significant shift is detected mid-trial, a correction method should be applied and documented in the statistical analysis plan.