{"id":18173,"date":"2026-06-10T13:08:37","date_gmt":"2026-06-10T16:08:37","guid":{"rendered":"https:\/\/rtmedical.com.br\/tmp-en-1781107716981\/"},"modified":"2026-06-10T13:08:45","modified_gmt":"2026-06-10T16:08:45","slug":"ai-radiotherapy-dose-calculation-monte-carlo","status":"publish","type":"post","link":"https:\/\/rtmedical.com.br\/en\/ai-radiotherapy-dose-calculation-monte-carlo\/","title":{"rendered":"AI Dose Calculation: Surrogate Models and Clinical Limits"},"content":{"rendered":"<p>The incorporation of artificial intelligence (AI) algorithms into the radiotherapy planning workflow represents one of the most profound transformations experienced by medical physics in recent decades. For more than thirty years, the dose calculation engine has been synonymous with deterministic or stochastic physics: analytical convolutions, particle transport by Boltzmann equations or Monte Carlo (MC) simulation. These methods operate on explicit models of radiation transport, with parameters derived from commissioning data and validated against independent dosimetric measurements. Now a new category of engines emerges\u2014models trained from data\u2014whose capabilities and limitations do not naturally fit into the quality assurance (QA) protocols developed for deterministic algorithms.<\/p>\n<p>What is generically called &#8220;AI in dose calculation&#8221; encompasses very different technological realities: neural networks that predict dose distributions based on structure geometries, reinforcement learning models for plan optimization, and emulators that reproduce outputs from slow engines \u2014 such as MC \u2014 with fractions of a second latency. None of these models transport particles. They learn statistical correlations between inputs (CT images, contours, beams) and outputs (dose distributions) from a training set. 
The clinically relevant question is not &#8220;AI or Monte Carlo?&#8221; but rather: under what conditions can a surrogate model be used with confidence, and what safeguards are needed to detect when it silently fails?<\/p>\n<figure class=\"wp-block-image size-large dose-algorithm-infographic\"><img alt=\"AI dose surrogate model with clinical validation guardrails\" decoding=\"async\" data-src=\"https:\/\/rtmedical.com.br\/wp-content\/uploads\/2026\/06\/ai-dose-guardrails.jpg\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" class=\"lazyload\" style=\"--smush-placeholder-width: 1600px; --smush-placeholder-aspect-ratio: 1600\/900;\" \/><figcaption>Technical infographic from the dose-calculation algorithm cluster.<\/figcaption><\/figure>\n<p>This article examines these questions from the perspective of medical physicists, dosimetrists, and radiation oncologists who need to make adoption or oversight decisions for AI-based tools. The text differentiates physical description of the phenomenon, commercial implementation and published validation evidence \u2014 three dimensions often confused in discussions on the topic. It is not intended to recommend specific products, but to provide a conceptual map for critical evaluation of these technologies.<\/p>\n<div class=\"toc\">\n<h2>In this Article<\/h2>\n<ul>\n<li><a href=\"#what-it-means-to-use-ai-as-a-dose-surrogate-model\">1. What it means to use AI as a dose surrogate model<\/a><\/li>\n<li><a href=\"#difference-between-predicting-dose-and-transporting-particles\">2. Difference between predicting dose and transporting particles<\/a><\/li>\n<li><a href=\"#training-data-bias-and-validity-domain\">3. Training Data, Bias, and Validity Domain<\/a><\/li>\n<li><a href=\"#generalization-to-machines-energies-and-anatomies\">4. 
Generalization to machines, energies and anatomies<\/a><\/li>\n<li><a href=\"#uncertainty-outlier-detection-and-silent-failures\">5. Uncertainty, outlier detection and silent failures<\/a><\/li>\n<li><a href=\"#how-to-compare-ai-monte-carlo-and-deterministic-solvers\">6. How to compare AI, Monte Carlo and deterministic solvers<\/a><\/li>\n<li><a href=\"#clinical-validation-governance-and-responsible-use\">7. Clinical validation, governance and responsible use<\/a><\/li>\n<li><a href=\"#faq\">8. FAQ<\/a><\/li>\n<li><a href=\"#references\">9. References<\/a><\/li>\n<\/ul>\n<\/div>\n<h2 id=\"what-it-means-to-use-ai-as-a-dose-surrogate-model\">What it means to use AI as a dose surrogate model<\/h2>\n<p>A surrogate model (or <em>emulator<\/em>) is a computational system trained to reproduce the behavior of another, more expensive or slower system, accepting the same inputs and producing approximate outputs. In the dose context, the &#8220;expensive&#8221; system is typically a high-fidelity MC engine or a linear Boltzmann transport equation (LBTE) solver such as Acuros XB. The surrogate model (typically a deep convolutional neural network, often with a U-Net-like architecture) learns, from (input, reference output) pairs, a mapping that can be evaluated in milliseconds rather than minutes or hours.<\/p>\n<p>It is important to distinguish two sub-cases that the literature often conflates. In the first, the network predicts the dose <em>based on already optimized treatment plans<\/em>, serving as a quick check or as the starting point for an initial plan (<em>knowledge-based planning<\/em>). In the second, the network directly replaces the calculation engine within the treatment planning system (TPS) and is invoked at each optimization iteration. 
The second case imposes much more severe requirements on accuracy and robustness: a systematic error in a critical region, even a small one, propagates into the optimization. If the surrogate overestimates the dose there, the resulting plans have lower actual coverage than projected, with no warning signal to the user.<\/p>\n<p>There are prototypes and products that use machine learning in the planning and dose estimation stages, but the intended use must be verified in the documentation for each version. The distinction between &#8220;AI-accelerated&#8221; and &#8220;MC\/LBTE-calculated with hardware acceleration&#8221; is crucial. GPUMCD, for example, is Monte Carlo on GPU, not a neural network.<\/p>\n<p>Reduced latency can support adaptive workflows and repeated calculations. The cost is that part of the performance guarantee shifts to the training data, the validity domain, and the failure-detection controls.<\/p>\n<h2 id=\"difference-between-predicting-dose-and-transporting-particles\">Difference between predicting dose and transporting particles<\/h2>\n<p>Transporting particles, in the physical sense, means solving \u2014 exactly, approximately or stochastically \u2014 the Boltzmann equation for radiation transport, considering interaction cross sections dependent on the material crossed, locally deposited energy and secondary scattering. MC samples individual trajectories of photons, electrons and secondary particles. LBTE\/Acuros XB solves the equation in its deterministic form over a spatial mesh. Pencil Beam decomposes the beam into narrow pencil beams and applies water-calibrated scattering kernels, with empirical corrections for inhomogeneities. The AAA (<em>Anisotropic Analytical Algorithm<\/em>) uses separate convolution models for primary photons, extra-focal photons, and contaminating electrons. 
All of these algorithms have parameters with direct physical meaning and can be, at least in principle, commissioned and validated against independent phantom measurements.<\/p>\n<p>A dose prediction neural network does not solve any of these equations. It learns a function \u2014 potentially of very high dimensionality \u2014 that maps the geometry of the problem (CT morphology in Hounsfield units, structure contours, beam configuration) to a dose distribution, minimizing a loss functional over the training set. The learned mapping is, by construction, an interpolation over the manifold of cases seen during training. Outside of this manifold \u2014 an unusual anatomy, an unrepresented combination of energies, an atypical beam geometry \u2014 the network will extrapolate in an unpredictable manner, with no guarantee of physical coherence.<\/p>\n<p>This distinction has direct implications for concepts such as <em>dose to medium<\/em> (Dm) and <em>dose to water<\/em> (Dw). Algorithms such as Acuros XB allow you to explicitly choose which quantity is calculated, with clinical consequences discussed in the literature especially in bone-tissue interfaces and in proton therapy. A surrogate model trained on Dm outputs implicitly &#8220;learns&#8221; this convention, but will not make it explicit. A convention change in the reference engine during retraining may go unnoticed \u2014 a structural example of silent failure.<\/p>\n<p>Another relevant aspect is incremental convergence: in MC, more particle histories equate to lower statistical uncertainty, and the user can balance calculation time and accuracy in a controlled way. 
In an ML model, there is no equivalent mechanism \u2014 the output is deterministic for a given input, and the model&#8217;s uncertainty is fixed, determined by the training phase.<\/p>\n<h2 id=\"training-data-bias-and-validity-domain\">Training Data, Bias, and Validity Domain<\/h2>\n<p>The performance of any surrogate model is fundamentally limited by the quality, quantity, and diversity of the training data. For dose prediction, the data set is generally clinically approved plans at one or more institutions, with dose distributions calculated by institutional TPS as the label (<em>ground truth<\/em>). Two structural problems immediately emerge.<\/p>\n<p>First, the label is not the actual dose \u2014 it is the dose calculated by the TPS algorithm, with its own uncertainties and approximations. If TPS used Pencil Beam for lung cases with severe heterogeneities, and the model learns to reproduce Pencil Beam, there is no gain in physical accuracy; there is only acceleration of an imprecise method. Second, the training data reflects local planning patterns and biases: preferred beam topologies, normalization criteria, margin philosophies. 
A model trained in a highly specialized center may not generalize to a center with different patient populations, equipment or practices.<\/p>\n<p>The table below summarizes the most relevant sources of bias in training datasets for dose models:<\/p>\n<table>\n<thead>\n<tr>\n<th>Source of bias<\/th>\n<th>Description<\/th>\n<th>Potential clinical impact<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>Case selection bias<\/td>\n<td>Atypical or difficult cases are absent from the set of clinically approved plans<\/td>\n<td>Model underestimates complexity; failure in difficult scenarios<\/td>\n<\/tr>\n<tr>\n<td>Reference algorithm bias<\/td>\n<td><em>Ground truth<\/em> generated by an engine with known limitations (e.g., Pencil Beam in lung)<\/td>\n<td>Preserves systematic errors from the original engine<\/td>\n<\/tr>\n<tr>\n<td>Institutional bias<\/td>\n<td>Single-center planning patterns<\/td>\n<td>Low generalizability to other institutions<\/td>\n<\/tr>\n<tr>\n<td>Anatomical selection bias<\/td>\n<td>Underrepresentation of rare or post-surgical anatomies<\/td>\n<td>Silent failure in cases outside the distribution<\/td>\n<\/tr>\n<tr>\n<td>Temporal bias<\/td>\n<td>Changes in protocols, fixtures or equipment over the data collection period<\/td>\n<td>Inconsistency in training labels<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>The concept of <em>domain of validity<\/em> \u2014 the space of inputs over which the model can be considered reliable \u2014 is analogous to the commissioning scope of a physical engine, but much more difficult to delimit. For a conventional TPS, commissioning explicitly defines the energies, field sizes, phantom geometries, and tissues for which the engine has been validated. 
For an ML model, this space is implicitly defined by the distribution of the training data, and there is no standardized protocol to formally characterize it.<\/p>\n<h2 id=\"generalization-to-machines-energies-and-anatomies\">Generalization to machines, energies and anatomies<\/h2>\n<p>One of the most practical challenges for clinical adoption is the transferability of models between linear accelerators, beam energies and patient populations. A model trained on data from a specific accelerator with 6 MV FFF has, <em>a priori<\/em>, no guarantee of correct behavior on a different platform, at 10 MV, or with flattened (filtered) beams. Differences in the shape of the energy spectrum, electron contamination, virtual source size and beam profiles result in qualitatively distinct dose distributions in regions of build-up, penumbra and inhomogeneities.<\/p>\n<p>The literature describes approaches based on <em>transfer learning<\/em> and <em>domain adaptation<\/em> to reduce the cost of re-training when migrating to a new machine, but validation evidence for clinical use is still limited and mostly comes from academic groups. Commercial implementations must be evaluated for the exact scope of machines and energies for which the model has been validated by the manufacturer \u2014 information that should appear in the system&#8217;s technical documentation, not in marketing material.<\/p>\n<p>The anatomical dimension is equally critical. Models trained predominantly on prostate cases tend to perform better at this site and worse in the head and neck, where proximity to critical OARs and anatomical variability are greater. 
The following table summarizes the relationship between case complexity and extrapolation risk:<\/p>\n<table>\n<thead>\n<tr>\n<th>Case category<\/th>\n<th>Relative complexity<\/th>\n<th>Model extrapolation risk<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>Conventional prostate (7-field IMRT)<\/td>\n<td>Low<\/td>\n<td>Low, if represented in training<\/td>\n<\/tr>\n<tr>\n<td>Head and neck (VMAT)<\/td>\n<td>High<\/td>\n<td>Moderate to high<\/td>\n<\/tr>\n<tr>\n<td>Lung with severe heterogeneities<\/td>\n<td>High<\/td>\n<td>High \u2014 especially for Dm\/Dw differences and penumbra<\/td>\n<\/tr>\n<tr>\n<td>Post-surgery with metallic prostheses<\/td>\n<td>Very high<\/td>\n<td>High \u2014 CT artifacts out of distribution<\/td>\n<\/tr>\n<tr>\n<td>Pediatric<\/td>\n<td>Medium-high<\/td>\n<td>High \u2014 anatomy underrepresented in most sets<\/td>\n<\/tr>\n<tr>\n<td>Re-irradiation<\/td>\n<td>High<\/td>\n<td>High \u2014 accumulated dose not modeled in training<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Post-surgical anatomies, the presence of metallic implants with CT artifacts, and pediatric cases represent high-risk extrapolation scenarios that deserve specific escalation protocols for verification by an independent physics-based engine.<\/p>\n<h2 id=\"uncertainty-outlier-detection-and-silent-failures\">Uncertainty, outlier detection and silent failures<\/h2>\n<p>A limitation of classical deterministic engines (AAA, Acuros XB, Pencil Beam) is that they produce a single dose value per voxel, with no estimate of the uncertainty of the model itself; the only bounds come from commissioning measurements. Paradoxically, machine learning methods offer tools for estimating predictive uncertainty: <em>Monte Carlo Dropout<\/em>, <em>deep ensembles<\/em>, <em>conformal prediction<\/em> and probabilistic models such as Bayesian neural networks. 
When implemented, these techniques allow the model to indicate regions of greater uncertainty, a valuable diagnostic signal that deterministic engines do not provide.<\/p>\n<p>The problem is that these techniques are rarely available in commercial implementations and still lack robust clinical validation. The clinically more dangerous risk is <em>silent failure<\/em>: the model produces a dose distribution that is plausible in appearance (passing simple DVH and isodose checks) but systematically wrong in specific regions, without any warning indicator. Documented examples include errors in regions of high heterogeneity (air-tissue interfaces, lung), superficial build-up, and small fields \u2014 exactly the regions where simpler algorithms like Pencil Beam also fail, but for well-understood and auditable physical reasons.<\/p>\n<p>Outlier detection, that is, identifying cases outside the domain of validity before the prediction is used, is an active area of research. Metrics such as distance in latent feature space, anomaly scores based on autoencoders, and comparison with training distributions have been explored. In the absence of automatic tools, the practical approach is to: (1) define explicit exclusion criteria based on the characteristics of the training set; (2) require independent verification by a physics-based engine for cases in high-risk categories; and (3) implement discrepancy reporting processes as part of routine QA.<\/p>\n<h2 id=\"how-to-compare-ai-monte-carlo-and-deterministic-solvers\">How to compare AI, Monte Carlo and deterministic solvers<\/h2>\n<p>The comparison between calculation engines must be structured in at least three independent dimensions: physical accuracy, computational performance and clinical validation maturity. 
Often, discussions about AI versus MC inappropriately collapse these dimensions, generating claims that are true in one dimension and misleading in the others.<\/p>\n<p>The AAPM report <a href=\"https:\/\/www.aapm.org\/pubs\/reports\/RPT_105.pdf\">TG-105<\/a> establishes a methodological framework for MC commissioning in radiotherapy that remains relevant as a reference for any high-fidelity engine. The proposed acceptance criteria \u2014 gamma comparisons, DVH analyses, specific test scenarios \u2014 can and should be applied to surrogate models when they are used as the primary calculation engine. The fundamental difference is that, for MC, statistical uncertainty can be reduced with more particle histories; for an ML model, there is no equivalent self-refinement mechanism at inference time.<\/p>\n<p>Gamma analysis is common but, on its own, does not demonstrate clinical equivalence. Assessment should include DVHs, per-structure metrics, error maps, worst-performing cases, and out-of-distribution testing, with criteria defined before validation.<\/p>\n<p>The <a href=\"https:\/\/pubmed.ncbi.nlm.nih.gov\/33227715\/\">proton physics literature<\/a> specifically discusses validation challenges where range uncertainties add a dimension that analytical algorithms address in a simplified way and MC addresses more fully. Surrogate models for protons face the additional challenge of correctly modeling the Bragg peak region and halo effects, which are highly sensitive to tissue composition: exactly the type of variability that may not be well represented in the training set.<\/p>\n<h2 id=\"clinical-validation-governance-and-responsible-use\">Clinical validation, governance and responsible use<\/h2>\n<p>Clinical validation of a dose surrogate model goes beyond technical commissioning. 
It covers the complete process of introducing a new technology into patient care, including risk assessment, staff training, definition of scope of use and continuous monitoring mechanisms. The concept of <em>digital twins<\/em> in oncology, discussed in <a href=\"https:\/\/pubmed.ncbi.nlm.nih.gov\/35602761\/\">recent reviews<\/a>, illustrates the ambition for personalized models of treatment response \u2014 but also highlights the gap between technological promise and the clinical evidence available for routine use.<\/p>\n<p>From a regulatory perspective, classification and responsibilities depend on jurisdiction, intended use and commercial configuration. On-site retraining, in-house integration, or out-of-scope use may change applicable obligations. The institution must involve its quality, regulatory affairs and safety teams before clinical use.<\/p>\n<p>Internal governance must establish, at a minimum:<\/p>\n<ul>\n<li><strong>Commissioning protocol<\/strong> with acceptance criteria defined in advance and not adjustable <em>post hoc<\/em>;<\/li>\n<li><strong>Documented definition of the clinical scope of use<\/strong> (anatomical sites, techniques, energies, age groups);<\/li>\n<li><strong>Escalation process<\/strong> for cases that exceed the scope, with independent engine verification;<\/li>\n<li><strong>Periodic audits<\/strong> comparing surrogate model outputs with independent calculations on a sample of real clinical cases;<\/li>\n<li><strong>Reporting and investigation process for discrepancies<\/strong>, integrated into the institution&#8217;s quality management system.<\/li>\n<\/ul>\n<p>The underlying ethical issue is that radiotherapy planning involves decisions with consequences for the patient. 
Gaining speed is only clinically useful when uncertainty, validity domain, oversight, and accountability are defined.<\/p>\n<h2 id=\"faq\">FAQ<\/h2>\n<h3>Can an AI model with a high gamma pass rate with respect to MC be considered equivalent to MC for clinical use?<\/h3>\n<p>Not necessarily. High gamma agreement on the validation set demonstrates average performance over the tested cases, but does not guarantee correct behavior outside the training domain. Clinical equivalence requires validation on cases representative of <em>the entire<\/em> range of situations in which the model will be used, including edge cases and adverse scenarios \u2014 not just typical cases. Furthermore, MC has an incremental convergence mechanism (more histories, lower statistical uncertainty); the ML model does not. The comparison should include worst-case analysis and DVH metrics per structure, not just the median gamma pass rate.<\/p>\n<h3>How can one tell from the TPS documentation whether the engine uses AI or GPU acceleration?<\/h3>\n<p>Search the technical documentation for the terms &#8220;machine learning&#8221;, &#8220;neural network&#8221;, &#8220;deep learning&#8221; or &#8220;trained model&#8221;. GPU-accelerated engines like GPUMCD are stochastic MC on GPU; their documentation will describe particle sampling, cross sections, and statistical convergence. Documentation for an ML model will describe network architecture, training data, and validation metrics. In case of ambiguity, ask the manufacturer for the <em>Intended Use Statement<\/em> and the clinical validation documentation for the specific engine \u2014 documents that must exist for any regulated device.<\/p>\n<h3>What is the impact of the dose-to-medium versus dose-to-water distinction on dose surrogate models?<\/h3>\n<p>The model learns to reproduce the convention of the engine that generated the training data (Dm or Dw), but rarely makes this convention explicit to the user. 
If the reference engine is Acuros XB set to Dw, the model will output Dw implicitly; if set to Dm, it will output Dm. In anatomies with a high proportion of cortical bone or air-tissue interface, the difference between Dm and Dw may be clinically relevant. The user must track and document which convention the model reproduces, ensuring that the plan acceptance criteria are consistent with it.<\/p>\n<h3>Is it possible to use an AI-based dose model trained at another institution without local re-training?<\/h3>\n<p>Transferability depends on population, equipment, energy and protocols. Even with multicenter training, it is necessary to validate performance in the local environment with representative cases and adequate independent references. The validation scope must match the intended use.<\/p>\n<h3>What are the highest risk scenarios for silent failure in dose surrogate models?<\/h3>\n<p>Highest risk scenarios include small fields, high heterogeneity, superficial build-up, post-surgical anatomies, implants and re-irradiation. In these cases, the protocol should require additional controls proportionate to the risk, including independent comparison when technically appropriate.<\/p>\n<h2 id=\"references\">References<\/h2>\n<ul>\n<li>\n<p>AAPM Task Group 105. <em>Issues associated with clinical implementation of Monte Carlo-based photon and electron external beam treatment planning<\/em>. AAPM Report 105. Available at: <a href=\"https:\/\/www.aapm.org\/pubs\/reports\/RPT_105.pdf\">https:\/\/www.aapm.org\/pubs\/reports\/RPT_105.pdf<\/a><\/p>\n<\/li>\n<li>\n<p>Paganetti H, et al. <em>Roadmap: proton therapy physics and biology<\/em>. Physics in Medicine &amp; Biology, 2021. Available at: <a href=\"https:\/\/pubmed.ncbi.nlm.nih.gov\/33227715\/\">https:\/\/pubmed.ncbi.nlm.nih.gov\/33227715\/<\/a><\/p>\n<\/li>\n<li>\n<p>Elhalawani H, et al. <em>Digital twins in clinical oncology<\/em>. Nature Medicine, 2022. 
Available at: <a href=\"https:\/\/pubmed.ncbi.nlm.nih.gov\/35602761\/\">https:\/\/pubmed.ncbi.nlm.nih.gov\/35602761\/<\/a><\/p>\n<\/li>\n<li>\n<p>AAPM Task Group 218. <em>Tolerance limits and methodologies for IMRT measurement-based verification QA<\/em>. Medical Physics, 2018. Available at: <a href=\"https:\/\/www.aapm.org\/pubs\/reports\/RPT_218.pdf\">https:\/\/www.aapm.org\/pubs\/reports\/RPT_218.pdf<\/a><\/p>\n<\/li>\n<li>\n<p>FDA. <em>Artificial Intelligence and Machine Learning in Software as a Medical Device<\/em>. US Food &amp; Drug Administration. Available at: <a href=\"https:\/\/www.fda.gov\/medical-devices\/software-medical-device-samd\/artificial-intelligence-and-machine-learning-software-medical-device\">https:\/\/www.fda.gov\/medical-devices\/software-medical-device-samd\/artificial-intelligence-and-machine-learning-software-medical-device<\/a><\/p>\n<\/li>\n<li>\n<p>Vassiliev ON, et al. <em>Feasibility of a multigroup deterministic solution method for three-dimensional radiotherapy dose calculations<\/em>. Physics in Medicine &amp; Biology, 2010. Available at: <a href=\"https:\/\/pubmed.ncbi.nlm.nih.gov\/20124709\/\">https:\/\/pubmed.ncbi.nlm.nih.gov\/20124709\/<\/a><\/p>\n<\/li>\n<li>\n<p>Zhu J, Liu X, Chen L. <em>A preliminary study of a photon dose calculation algorithm using a convolutional neural network<\/em>. Physics in Medicine &amp; Biology, 2020. Available at: <a href=\"https:\/\/pubmed.ncbi.nlm.nih.gov\/33063695\/\">https:\/\/pubmed.ncbi.nlm.nih.gov\/33063695\/<\/a><\/p>\n<\/li>\n<li>\n<p>Pastor-Serrano O, Perk\u00f3 Z. <em>Millisecond speed deep learning based proton dose calculation<\/em>. Physics in Medicine &amp; Biology, 2022. 
Available at: <a href=\"https:\/\/pubmed.ncbi.nlm.nih.gov\/35447605\/\">https:\/\/pubmed.ncbi.nlm.nih.gov\/35447605\/<\/a><\/p>\n<\/li>\n<\/ul>\n<aside aria-label=\"Dose-calculation algorithm map\" class=\"dose-cluster-nav\">\n<h2>Dose-calculation algorithm map<\/h2>\n<h3>Methods and algorithms<\/h3>\n<ul>\n<li><a href=\"https:\/\/rtmedical.com.br\/en\/photon-dose-calculation-algorithms\/\">Complete guide<\/a><\/li>\n<li><a href=\"https:\/\/rtmedical.com.br\/en\/empirical-broad-beam-dose-calculation\/\">Empirical methods and Batho<\/a><\/li>\n<li><a href=\"https:\/\/rtmedical.com.br\/en\/superposition-clarkson-terma-dose\/\">Clarkson, superposition, and TERMA<\/a><\/li>\n<li><a href=\"https:\/\/rtmedical.com.br\/en\/pencil-beam-radiotherapy-limitations\/\">Pencil Beam<\/a><\/li>\n<li><a href=\"https:\/\/rtmedical.com.br\/en\/collapsed-cone-convolution-kernels\/\">Collapsed Cone<\/a><\/li>\n<li><a href=\"https:\/\/rtmedical.com.br\/en\/aaa-eclipse-algorithm-explained\/\">AAA<\/a><\/li>\n<li><a href=\"https:\/\/rtmedical.com.br\/en\/acuros-xb-lbte-dose-calculation\/\">Acuros XB<\/a><\/li>\n<li><a href=\"https:\/\/rtmedical.com.br\/en\/dose-to-medium-vs-dose-to-water-radiotherapy\/\">Dose to medium vs dose to water<\/a><\/li>\n<li><a href=\"https:\/\/rtmedical.com.br\/en\/monte-carlo-radiotherapy-guide\/\">Monte Carlo<\/a><\/li>\n<\/ul>\n<h3>Advanced applications<\/h3>\n<ul>\n<li><a href=\"https:\/\/rtmedical.com.br\/en\/monaco-gpumcd-dose-to-medium-dose-to-water\/\">Monaco and GPUMCD<\/a><\/li>\n<li><a href=\"https:\/\/rtmedical.com.br\/en\/electron-dose-algorithms-pencil-beam-emc-monte-carlo\/\">Electron dose algorithms<\/a><\/li>\n<li><a href=\"https:\/\/rtmedical.com.br\/en\/protons-pencil-beam-vs-monte-carlo-dose-calculation\/\">Protons: Pencil Beam vs Monte Carlo<\/a><\/li>\n<li><a href=\"https:\/\/rtmedical.com.br\/en\/mr-linac-magnetic-field-dose-calculation-monte-carlo\/\">MR-Linac dose calculation<\/a><\/li>\n<li><a 
href=\"https:\/\/rtmedical.com.br\/en\/adaptive-radiotherapy-dose-recalculation-cbct-synthetic-ct\/\">Adaptive recalculation on CBCT and synthetic CT<\/a><\/li>\n<li><a href=\"https:\/\/rtmedical.com.br\/en\/ai-radiotherapy-dose-calculation-monte-carlo\/\">AI dose calculation<\/a><\/li>\n<li><a href=\"https:\/\/rtmedical.com.br\/en\/commissioning-qa-dose-algorithm-comparison\/\">Commissioning and QA<\/a><\/li>\n<\/ul>\n<\/aside>\n","protected":false},"excerpt":{"rendered":"<p>Where neural networks can accelerate dose calculation, what they do not replace, and how to validate generalization and uncertainty.<\/p>\n","protected":false},"author":1,"featured_media":18144,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"om_disable_all_campaigns":false,"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"ngg_post_thumbnail":0,"fifu_image_url":"","fifu_image_alt":"","footnotes":""},"categories":[99,230],"tags":[],"class_list":{"0":"post-18173","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-radiotherapy","8":"category-software-en"},"aioseo_notices":[],"rt_seo":{"title":"AI Dose Calculation: Surrogate Models and Clinical Limits","description":"AI in radiotherapy dose calculation: Monte Carlo surrogate models, generalization, uncertainty, validation, and clinical limits.","canonical":"https:\/\/rtmedical.com.br\/en\/ai-radiotherapy-dose-calculation-monte-carlo\/","og_image":"https:\/\/rtmedical.com.br\/wp-content\/uploads\/2026\/06\/ai-dose-guardrails.jpg","robots":"index,follow","schema_type":"Article","include_in_llms":true,"llms_label":"Technical guide","llms_summary":"Where neural networks can accelerate dose calculation, what they do not replace, and how to validate generalization and uncertainty.","faq_items":[{"q":"Can an AI model with a high 
gamma pass rate with respect to MC be considered equivalent to MC for clinical use?","a":"Not necessarily. High gamma agreement on the validation set demonstrates average performance over the tested cases, but does not guarantee correct behavior outside the training domain. Clinical equivalence requires validation on cases representative of the entire range of situations in which the model will be used, including edge cases and adverse scenarios \u2014 not just typical cases. Furthermore, MC has an incremental convergence mechanism (more histories, lower statistical uncertainty); the ML model does not. The comparison should include worst-case analysis and DVH metrics per structure, not just the median gamma pass rate."},{"q":"How can one tell from the TPS documentation whether the engine uses AI or GPU acceleration?","a":"Search the technical documentation for the terms \"machine learning\", \"neural network\", \"deep learning\" or \"trained model\". GPU-accelerated engines like GPUMCD are stochastic MC on GPU; their documentation will describe particle sampling, cross sections, and statistical convergence. Documentation for an ML model will describe network architecture, training data, and validation metrics. In case of ambiguity, ask the manufacturer for the Intended Use Statement and the clinical validation documentation for the specific engine \u2014 documents that must exist for any regulated device."},{"q":"What is the impact of the dose-to-medium versus dose-to-water distinction on dose surrogate models?","a":"The model learns to reproduce the convention of the engine that generated the training data (Dm or Dw), but rarely makes this convention explicit to the user. If the reference engine is Acuros XB set to Dw, the model will output Dw implicitly; if set to Dm, it will output Dm. In anatomies with a high proportion of cortical bone or air-tissue interfaces, the difference between Dm and Dw may be clinically relevant. 
The user must track and document which convention the model reproduces, ensuring that the plan acceptance criteria are consistent with it."},{"q":"Is it possible to use an AI-based dose model trained at another institution without local re-training?","a":"Transferability depends on population, equipment, energy and protocols. Even with multicenter training, it is necessary to validate performance in the local environment with representative cases and adequate independent references. The validation scope must match the intended use."},{"q":"What are the highest risk scenarios for silent failure in dose surrogate models?","a":"Highest risk scenarios include small fields, high heterogeneity, superficial build-up, post-surgical anatomies, implants and re-irradiation. In these cases, the protocol should require additional controls proportionate to the risk, including independent comparison when technically appropriate."}],"video":[],"gtin":"","mpn":"","brand":"","aggregate_rating":[]},"_links":{"self":[{"href":"https:\/\/rtmedical.com.br\/en\/wp-json\/wp\/v2\/posts\/18173\/"}],"collection":[{"href":"https:\/\/rtmedical.com.br\/en\/wp-json\/wp\/v2\/posts\/"}],"about":[{"href":"https:\/\/rtmedical.com.br\/en\/wp-json\/wp\/v2\/types\/post\/"}],"author":[{"embeddable":true,"href":"https:\/\/rtmedical.com.br\/en\/wp-json\/wp\/v2\/users\/1\/"}],"replies":[{"embeddable":true,"href":"https:\/\/rtmedical.com.br\/en\/wp-json\/wp\/v2\/comments\/?post=18173"}],"version-history":[{"count":1,"href":"https:\/\/rtmedical.com.br\/en\/wp-json\/wp\/v2\/posts\/18173\/revisions\/"}],"predecessor-version":[{"id":18175,"href":"https:\/\/rtmedical.com.br\/en\/wp-json\/wp\/v2\/posts\/18173\/revisions\/18175\/"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/rtmedical.com.br\/en\/wp-json\/wp\/v2\/media\/18144\/"}],"wp:attachment":[{"href":"https:\/\/rtmedical.com.br\/en\/wp-json\/wp\/v2\/media\/?parent=18173"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/rtmed
ical.com.br\/en\/wp-json\/wp\/v2\/categories\/?post=18173"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/rtmedical.com.br\/en\/wp-json\/wp\/v2\/tags\/?post=18173"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}