
The combination of artificial intelligence with Monte Carlo simulations may represent a quantum leap in computational radiotherapy. This claim, made in the preface to the second edition of Monte Carlo Techniques in Radiation Therapy (CRC Press, 2022), is no exaggeration — recent advances in deep neural networks are transforming how we generate, process, and apply MC simulation data in clinical practice. For a comprehensive overview of all MC techniques in radiotherapy, see our complete guide to Monte Carlo in Radiotherapy.


Artificial intelligence interface applied to dose planning in radiotherapy using deep neural networks
Photo: Tara Winstead / Pexels

MC simulations are inherently statistical and produce massive volumes of data — features that make the field a natural candidate for deep learning approaches. Deep neural networks (DNNs) have already demonstrated the ability to learn complex statistical correlations in domains like computer vision, and now this same methodological infrastructure is being directed at specific medical physics problems.

This article explores how AI is being integrated into the Monte Carlo ecosystem, from direct dose distribution estimation to simulation acceleration and CBCT image correction. We also discuss future trends, including the prediction that the MC method will remain an essential component of scientific infrastructure in radiotherapy.

Deep Neural Networks and Monte Carlo Simulation

To understand how AI connects to Monte Carlo, it helps to revisit the basic mechanics of neural networks. A DNN is organized in layers of neurons connected by adjustable weights. Each neuron applies a nonlinear activation function to the weighted sum of its inputs — without this nonlinearity, the entire network would be just a giant linear function, incapable of capturing complex patterns.
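To make the role of the activation function concrete, here is a minimal NumPy sketch (illustrative only, not taken from any of the works discussed) showing that two stacked layers without an activation collapse into a single linear map, while a ReLU in between breaks that equivalence:

```python
import numpy as np

def dense(x, W, b, activation=None):
    """One fully connected layer: weighted sum plus optional nonlinearity."""
    z = W @ x + b
    return z if activation is None else activation(z)

relu = lambda z: np.maximum(0.0, z)

rng = np.random.default_rng(0)
x = rng.normal(size=3)
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)
W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=2)

# Without activations, two layers collapse into one linear function of x:
linear_stack = dense(dense(x, W1, b1), W2, b2)
collapsed = (W2 @ W1) @ x + (W2 @ b1 + b2)
assert np.allclose(linear_stack, collapsed)

# With ReLU between the layers, the composition is no longer linear in x.
nonlinear_stack = dense(relu(dense(x, W1, b1)), W2, b2)
```

The `collapsed` line is exactly the "giant linear function" the text warns about: without the nonlinearity, depth adds nothing.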

When dealing with image-like data, such as dose distributions in voxels or CT images, convolutional networks (CNNs) excel. Convolution operations interspersed between layers capture local spatial structure, and stacking them gives deeper layers progressively larger receptive fields. The U-Net architecture, widely used in medical segmentation, combines a contracting path (downsampling) with an expansive path (upsampling), making it ideal for image-to-image transformations. Variational autoencoders and generative adversarial networks (GANs) complete the arsenal of architectures employed in the works discussed here.

In practice, training a network involves tuning hyperparameters such as learning rate, batch size, and penalty weights. This is nontrivial — but frameworks like Keras, TensorFlow, and PyTorch, originally developed by major tech companies, have drastically lowered the barrier for medical physics researchers.

An MC simulation in radiotherapy can be viewed as a mapping: from a CT image to a dose distribution, for instance. No explicit analytical expression exists for this mapping, but a neural network can learn it from sufficient data. The simulation involves particle tracking, scoring, and binning — operations yielding results with inherent statistical variance for any fixed set of initial conditions.
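The statistical character of that mapping is easy to demonstrate with a toy score (a stand-in for real particle transport, not a physical model): repeating the "simulation" with identical initial conditions but different random seeds yields results that differ within statistical variance while approximating the same underlying mapping.

```python
import numpy as np

def toy_mc_dose(n_histories, seed):
    """Toy 1-D 'simulation': each history deposits energy in a depth bin
    drawn from an exponential attenuation profile (illustrative only)."""
    rng = np.random.default_rng(seed)
    depths = rng.exponential(scale=5.0, size=n_histories)   # cm
    dose, _ = np.histogram(depths, bins=20, range=(0, 20))
    return dose / n_histories  # normalised per history

# Identical initial conditions, different seeds -> different scored results.
d1 = toy_mc_dose(10_000, seed=1)
d2 = toy_mc_dose(10_000, seed=2)
assert not np.allclose(d1, d2)         # inherent statistical variance
assert np.allclose(d1, d2, atol=0.03)  # both approximate the same mapping
```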

AI for Dose Estimation from MC Simulations

Several research groups have investigated CNNs for estimating dose distributions across different contexts — internal radiotherapy, external beam, and brachytherapy — using MC simulations as training reference and validation.

Radiotherapy equipment with linear accelerator used in treatment planning and dose calculation
Photo: Jo McNamara / Pexels

Lee et al. (2019) proposed Deep-Dose, a U-Net-based network trained on PET and CT image patches associated with dose distributions calculated by GATE (the GEANT4 Application for Tomographic Emission). The database comprised ten patients with eight PET/CT time points after intravenous injection of 68Ga-NOTA-RGD, covering 1 to 62 minutes post-injection. Accuracy was within 3% of the reference computation, reducing calculation time from hours to minutes.

Götz et al. (2019) combined U-Net with empirical mode decomposition, using CT images and dose maps estimated by the MIRD protocol (organ S-value method) from SPECT images for 177Lu internal radiotherapy treatments. Performance exceeded the conventional fast dose-volume-kernel method.

In external beam radiotherapy and brachytherapy, Nguyen et al. (2019) used structure contours, prescriptions, and delivered doses as training data for head and neck VMAT treatments. Liu et al. (2019) investigated models for nasopharyngeal cancer helical tomotherapy. Mao et al. (2020) developed RapidBrachyDL for fast brachytherapy dose calculations. All reported accurate predictions.

A crucial point: neural networks trained on MC data can never fully replace MC — they will always depend on simulations for training datasets. The real goal is accelerating computation to clinically viable levels (minutes instead of hours). Extensive training and MC reference simulations can be performed offline. However, careful construction of training datasets is essential: they must cover a sufficiently wide range of clinical cases. For more details on how external photon beams are modeled with Monte Carlo, see our dedicated article.

Another frequently overlooked aspect: in a pure MC calculation, statistical uncertainty is known. With an MC-based neural network, this uncertainty is implicitly present but invisible to the user.

Deep Learning-Based Monte Carlo Dose Denoising

Unlike the methods in the previous section — which attempt to replace MC by mapping directly from images to dose — denoising acts as post-processing. The network receives noisy dose distributions (obtained from few MC histories) and generates smoothed maps equivalent to simulations with far more particles.

MC denoising methods have existed for some time. El Naqa et al. (2005) demonstrated that smoothing statistical fluctuations can reduce computation time. The "noise" in the computed dose is related to the variance of the deposited energy and decreases at a rate of $1/\sqrt{N}$, where $N$ is the number of simulated particles. Achieving low fluctuation in low-dose regions therefore requires an enormous number of histories.
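The $1/\sqrt{N}$ behaviour is easy to verify numerically. The sketch below (a generic statistics demonstration, not tied to any particular MC code) measures the empirical noise of a mean energy-deposit score at two history counts:

```python
import numpy as np

rng = np.random.default_rng(42)

def dose_estimate_std(n_particles, n_repeats=200):
    """Empirical standard deviation of a mean energy-deposit score
    across repeated runs of n_particles histories each."""
    deposits = rng.exponential(scale=1.0, size=(n_repeats, n_particles))
    return deposits.mean(axis=1).std()

s_small = dose_estimate_std(100)
s_large = dose_estimate_std(10_000)

# 100x more particles -> roughly 10x smaller statistical noise (1/sqrt(N)).
assert 6 < s_small / s_large < 16
```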

Traditional filtering techniques include 3D wavelets, advanced mean-median filtering, and anisotropic diffusion. They work reasonably well, but effective acceleration depends heavily on dose distribution characteristics.

With deep learning, the principle is to train a CNN on pairs of high-noise/low-noise dose distributions obtained from low- and high-statistics MC simulations, respectively. Most works employ U-Net variants, although DenseNet and conveying path convolutional encoder-decoders have also been studied.
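How such high-noise/low-noise pairs might be assembled can be sketched as follows; the toy `mc_dose` function is an invented stand-in for a real dose engine, and the U-Net itself is omitted:

```python
import numpy as np

def mc_dose(n_histories, rng, n_bins=32):
    """Toy depth-dose score (stand-in for a real MC dose engine)."""
    depths = rng.normal(loc=10.0, scale=3.0, size=n_histories).clip(0, 16)
    dose, _ = np.histogram(depths, bins=n_bins, range=(0, 16))
    return dose / n_histories

rng = np.random.default_rng(0)

# Paired samples: same geometry, low statistics (network input)
# versus high statistics (training target).
pairs = [(mc_dose(1_000, rng), mc_dose(1_000_000, rng)) for _ in range(8)]

noisy_stack = np.stack([p[0] for p in pairs])
clean_stack = np.stack([p[1] for p in pairs])

# The low-statistics maps fluctuate far more than the high-statistics ones.
assert noisy_stack.std(axis=0).mean() > 5 * clean_stack.std(axis=0).mean()
```

The expensive high-statistics targets only need to be simulated once, offline; at inference time the network sees only a cheap low-statistics map.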

| Application | Authors | Modality | Indication |
| --- | --- | --- | --- |
| Photons | Peng et al. (2019), Fornander (2019), Neph et al. (2019), Kontaxis et al. (2020) | EBRT, MRgRT | Brain, head and neck, liver, lung, prostate |
| Protons | Javaid et al. (2019), Madrigal (2018) | Proton therapy | Various indications |
| MR-guided | Neph et al. (2019) | MRgRT (magnetic field) | Dose delocalization from charged particles in magnetic fields |

Source: Monte Carlo Techniques in Radiation Therapy (2nd ed., CRC Press, 2022)

Results are encouraging. CNNs produced noise-equivalent dose maps with 10–100 times fewer particles than originally needed. Evaluation metrics included peak signal-to-noise ratio, gamma index, and dose-volume histograms (DVH).
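Of these metrics, peak signal-to-noise ratio is the simplest to state; a minimal implementation (generic image-quality code, with synthetic dose maps standing in for MC output) looks like this:

```python
import numpy as np

def psnr(reference, test, data_range=None):
    """Peak signal-to-noise ratio in dB between a reference dose map
    and a noisy or denoised estimate."""
    if data_range is None:
        data_range = reference.max() - reference.min()
    mse = np.mean((reference - test) ** 2)
    return 10.0 * np.log10(data_range ** 2 / mse)

rng = np.random.default_rng(3)
clean = rng.random((64, 64))
very_noisy = clean + rng.normal(scale=0.10, size=clean.shape)
less_noisy = clean + rng.normal(scale=0.01, size=clean.shape)

# Less residual noise -> higher PSNR; each 10x noise reduction adds ~20 dB.
assert psnr(clean, less_noisy) > psnr(clean, very_noisy) + 15
```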

Challenges remain. Results depend on the size and complexity of training datasets, and generalization to other scenarios is not yet guaranteed. Crucially, denoised dose maps must preserve dose gradient features — and it is not yet fully clear how to guarantee this in all situations. To better understand how dose is calculated in the patient, see our article on patient dose calculation with Monte Carlo.

The denoising problem is not limited to dose distributions. Methods investigated for low-dose CT imaging — such as those by Wolterink et al. (2017) and Yang et al. (2018) — may serve as inspiration for radiotherapy applications.

AI for Imaging Detector and Source Modelling

The works presented so far depend on MC simulation outputs without altering the simulation itself. The next frontier is more ambitious: replacing parts of the MC simulation with neural networks, accelerating particle transport through specific geometrical components.

Radiotherapy treatment room with linear accelerator and planning system for Monte Carlo dose calculation
Photo: Jo McNamara / Pexels

Sarrut et al. (2018) proposed a DNN to learn the Angular Response Function (ARF) of a SPECT collimator-detector system. Instead of explicitly simulating photon transport in the imaging head, the network receives kinematic photon properties (energy and direction) crossing a virtual plane and returns detection probabilities in each energy window. Compared to histogram-based ARFs, the neural network approach depends less on training data statistics, requires no explicit binning, and needs less training data. The speedup relative to analog MC ranged from 10 to 3,000: more efficient in low-count regions (speedup 1,000–3,000) than in high-count areas (speedup 20–300), and more efficient for high-energy radionuclides such as 131I. This implementation is available in the GATE platform.
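The histogram-based ARF baseline that the network improves upon can be sketched as a lookup table over binned photon properties. Everything below is synthetic for illustration, including the toy detection model; a real ARF would be tabulated from full MC transport through the collimator-detector head:

```python
import numpy as np

rng = np.random.default_rng(7)

# Synthetic training photons crossing the virtual plane:
# energy (keV), incidence angle (deg), and a detected/not-detected flag.
energy = rng.uniform(100, 400, size=200_000)
angle = rng.uniform(0, 30, size=200_000)
# Invented detection model: probability rises with energy, falls with angle.
p_true = 0.5 * (energy / 400) * np.cos(np.radians(angle))
detected = rng.random(200_000) < p_true

# Histogram-based ARF: bin (energy, angle), store the detection frequency.
e_bins = np.linspace(100, 400, 16)
a_bins = np.linspace(0, 30, 11)
hits, _, _ = np.histogram2d(energy[detected], angle[detected], bins=(e_bins, a_bins))
total, _, _ = np.histogram2d(energy, angle, bins=(e_bins, a_bins))
arf_table = np.divide(hits, total, out=np.zeros_like(hits), where=total > 0)

# Lookup for a photon near 290 keV at 10.5 degrees incidence.
i = np.searchsorted(e_bins, 290.0) - 1
j = np.searchsorted(a_bins, 10.5) - 1
expected = 0.5 * (290 / 400) * np.cos(np.radians(10.5))
assert abs(arf_table[i, j] - expected) < 0.06
```

The explicit binning here is exactly what the neural-network ARF avoids: a small MLP mapping (energy, direction) to per-window detection probabilities interpolates smoothly and needs fewer training photons per "bin".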

In PET imaging, neural networks have been proposed to estimate depth of interaction (DOI) and event position in pixelated or continuous monolithic scintillators (Zatcepin et al. 2020; Berg and Cherry 2018; Müller et al. 2019). Incorporating DOI into image reconstruction improves PET image quality.

Perhaps the most daring application is the use of GANs for phase space generation. Sarrut et al. (2019) employed generative adversarial networks to learn the phase space distribution from a linac simulation. The properties (energy, position, and direction) of all particles reaching an exit plane are stored in phase space files, which are typically tens of gigabytes in size and difficult to use efficiently. Statistical limitations from particle recycling also arise when more particles are needed than stored.

After training, the generator network “G” produces particles belonging to the original phase space probability distribution, occupying about 10 MB instead of several GB. Tests showed good dosimetric accuracy, including for prostate brachytherapy. For details on beam modeling and phase spaces, see our article on Monte Carlo fundamentals in radiotherapy.
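The compression idea can be illustrated with a deliberately crude parametric model: a single multivariate Gaussian fitted to a toy phase space. This is only a stand-in for the trained generator "G" (real phase spaces need far richer models, such as GANs or the Gaussian mixtures mentioned below), but it shows both the storage ratio and the ability to draw unlimited fresh, non-recycled particles:

```python
import numpy as np

rng = np.random.default_rng(11)
n = 1_000_000

# Toy "phase space": energy, position (x, y), direction cosines (u, v).
phsp = np.column_stack([
    rng.gamma(shape=4.0, scale=1.5, size=n),  # energy (MeV)
    rng.normal(0.0, 2.0, size=n),             # x (cm)
    rng.normal(0.0, 2.0, size=n),             # y (cm)
    rng.normal(0.0, 0.05, size=n),            # u
    rng.normal(0.0, 0.05, size=n),            # v
])

# Crude parametric "generator": mean vector + covariance matrix.
mean = phsp.mean(axis=0)
cov = np.cov(phsp, rowvar=False)

# Stored file versus stored model (counted in floating-point values).
file_floats = phsp.size                # 5 million values
model_floats = mean.size + cov.size    # 30 values
assert model_floats < file_floats / 100_000

# Unlimited fresh samples from the compressed model, no particle recycling:
generated = rng.multivariate_normal(mean, cov, size=10_000)
assert abs(generated[:, 0].mean() - phsp[:, 0].mean()) < 0.15
```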

Open questions remain: it is unclear whether the same GAN architecture works for all phase space types, and training requires delicate hyperparameter tuning. Alternative methods such as Gaussian mixture models may also be useful.

Neural Network-Based CBCT Scatter Correction

Cone beam computed tomography (CBCT) is inseparable from modern radiotherapy, yet suffers from inferior image quality and scatter artifacts. The imager panel captures not only attenuated primary photons from the X-ray source but also coherently and incoherently scattered photons from within the patient. For accurate image reconstruction, the scatter contribution would need to be known and subtracted — but in practice, the panel provides only a non-discriminative cumulative intensity signal.
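Once a scatter estimate is available (however obtained), the correction itself is a simple subtraction in the projection domain. The sketch below uses invented projection data and assumes a scatter estimate with a small relative error, such as one an MC-trained network might supply:

```python
import numpy as np

rng = np.random.default_rng(5)

# Toy projection: the panel measures primary + scatter, indistinguishably.
primary = np.exp(-rng.uniform(1.0, 3.0, size=(128, 128)))  # attenuated primaries
scatter = 0.3 * np.full((128, 128), primary.mean())        # smooth scatter floor
measured = primary + scatter

# Assumed scatter estimate with ~2% error (hypothetical network output).
scatter_estimate = scatter * (1 + rng.normal(0, 0.02, size=scatter.shape))
corrected = np.clip(measured - scatter_estimate, 0, None)

err_before = np.abs(measured - primary).mean()
err_after = np.abs(corrected - primary).mean()
assert err_after < 0.1 * err_before
```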

MC simulations offer a conceptual solution: they can specifically tag scattered photons, producing perfect scatter-free projections. Jarry et al. (2006) already used MC for CBCT scatter estimation. The problem is that direct simulation of kV photons is too slow for clinical integration, even with variance reduction techniques (Mainegra-Hing and Kawrakow 2008).

Recent works propose deep convolutional networks learning from MC-simulated CBCT projections. The networks generate estimated scatter images (projections) as output from raw projections as input (Lee et al. 2019; van der Heyden et al. 2020; Lalonde et al. 2020; Maier et al. 2019). All report promising results. These methods rely exclusively on MC simulations for training, where primary photons can be distinguished from scattered ones — impossible with experimental projections alone.

Another approach operates in the image domain: it takes the CBCT as input and generates a synthetic CT as output. These synthetic images contain far fewer artifacts than the original CBCT images.

Perspectives and the Future of Monte Carlo

Integrating AI with Monte Carlo brings a paradigm shift to medical physics. To some extent, researchers must relinquish the instinct to mathematically master the phenomenon under investigation and instead rely on large volumes of data for heuristic learning. This transition will demand new skills — the ability to implement and manage complex computational tasks will become as important as radiation physics modeling.

However, as this article has demonstrated, deep learning works in radiotherapy heavily rely on MC-generated training data. The simulation must be skillfully set up and evaluated. Generating suitable datasets may become a skill in itself, similar to commissioning a treatment planning system (TPS).

Currently, AI operates at the “high level” of MC — connecting input (CT) to output (dose) of a simulation or replacing intermediate steps (such as GANs for phase space). Whether AI will be useful deeper inside MC code — at the level of particle transport — remains to be seen. One can imagine neural networks replacing handcrafted numerical models for fitting complex measured data or probability distribution functions for certain types of interactions.

The high-energy physics (HEP) community is already actively exploring AI: deep learning for nuclear interaction modeling (Ciardiello et al. 2020), neural networks in condensed matter physics (Carrasquilla and Melko 2017), and GANs for fast simulation of particle showers in electromagnetic calorimeters (Paganini et al. 2018). Knowledge exchange between HEP and radiotherapy would be highly beneficial.

As for the Monte Carlo method itself, Amdahl’s law will continue to limit massively parallel machines, but single-chip speed gains continue following Moore’s law. Algorithm development is harder to predict — historical data show unexplained productivity jumps: a factor of 2.8 in 1991 and 1.6 in 1998, illustrating the chaotic nature of progress in this field.
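Amdahl's law itself is a one-line formula; the following sketch shows why the serial fraction of an MC run caps the benefit of massively parallel hardware:

```python
def amdahl_speedup(parallel_fraction, n_processors):
    """Maximum speedup when only part of the work parallelises (Amdahl's law):
    S = 1 / ((1 - p) + p / n)."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / n_processors)

# Even with 95% of a run parallelisable, speedup saturates near 20x,
# no matter how many processors are thrown at it:
assert round(amdahl_speedup(0.95, 1_000)) == 20
assert amdahl_speedup(0.95, 1_000_000) < 20.0
```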

Since 2005, new codes have appeared — mostly user interfaces to existing codes such as TOPAS and GATE — and Monte Carlo publications continue to grow rapidly. GEANT4 in particular shows strong usage growth. As the book’s preface noted, “we have not yet seen the end of potential applications, guaranteeing endless fun for generations to come.”

The picture is clear: network training time remains large and demands heavy GPU computing. Generalization of learned models to datasets beyond training remains uncertain. Final accuracy does not always reach conventional MC levels. Nevertheless, this is an extremely promising field, and publications at the intersection of AI and MC for radiotherapy should grow exponentially in coming years. To explore how Monte Carlo applies to other modalities like proton therapy and advanced QA, see our article on protons and advanced QA with Monte Carlo.
