{"id":18304,"date":"2026-06-15T19:21:27","date_gmt":"2026-06-15T22:21:27","guid":{"rendered":"https:\/\/rtmedical.com.br\/tmp-en-1781562086644\/"},"modified":"2026-06-15T19:26:03","modified_gmt":"2026-06-15T22:26:03","slug":"validate-ai-predicted-dose-qa-commissioning","status":"publish","type":"post","link":"https:\/\/rtmedical.com.br\/en\/validate-ai-predicted-dose-qa-commissioning\/","title":{"rendered":"How to Validate AI-Predicted Dose: QA, Commissioning, and Clinical Risk"},"content":{"rendered":"<p>Validating AI-predicted dose requires more than running gamma agreement on average cases. The correct question is: for which intended use, patient population, TPS, energies, structures, and failure limits will the model be accepted?<\/p>\n<p>This checklist complements the comparison of <a href=\"https:\/\/rtmedical.com.br\/en\/ai-predicted-dose-mvision-raystation-optiplan\/\">MVision, RayStation, and OptiPlan<\/a> and the <a href=\"https:\/\/rtmedical.com.br\/en\/doserad2026-monte-carlo-ai-dose-calculation\/\">DoseRAD2026<\/a> benchmark. The goal is to turn enthusiasm for AI into an auditable medical physics process.<\/p>\n<figure class=\"wp-block-image size-large dose-algorithm-infographic\"><img alt=\"Clinical validation guardrails for AI-predicted dose models\" decoding=\"async\" data-src=\"https:\/\/rtmedical.com.br\/wp-content\/uploads\/2026\/06\/ai-dose-validation-qa.jpg\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" class=\"lazyload\" style=\"--smush-placeholder-width: 1600px; --smush-placeholder-aspect-ratio: 1600\/900;\" \/><figcaption>Original RT Medical Systems infographic for the AI-predicted dose cluster.<\/figcaption><\/figure>\n<h2>1. Define intended use before metrics<\/h2>\n<p>The same model may be acceptable for pre-planning and unacceptable for automatic approval. 
Document whether the output will be used as a visual estimate, optimization reference dose, difficult-case triage, secondary dose, adaptive support, or an operational substitute for a physical calculation.<\/p>\n<ul>\n<li>Required input: CT, MRI, sCT, structures, prescription, beam geometry, or complete plan.<\/li>\n<li>Allowed output: beam dose, plan dose, optimization objectives, or warning.<\/li>\n<li>Authorized user: dosimetrist, physicist, physician, or automated pipeline.<\/li>\n<li>Allowed action: inform, suggest, automate, or block approval.<\/li>\n<\/ul>\n<h2>2. Build a local validation cohort<\/h2>\n<p>Multicenter training does not remove the need for local validation. The sample must cover real protocols, anatomical extremes, implants, prostheses, air, cortical bone, re-irradiation, PTV near OAR, and TPS version changes. Easy cases matter, but edge cases test safety.<\/p>\n<figure class=\"wp-block-table\">\n<table>\n<thead>\n<tr>\n<th>Risk<\/th>\n<th>Example<\/th>\n<th>Minimum control<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>Out of domain<\/td>\n<td>Post-surgical anatomy or metal implant<\/td>\n<td>OOD detector and mandatory physics review<\/td>\n<\/tr>\n<tr>\n<td>Localized physical error<\/td>\n<td>Air-tissue interface, build-up, small field<\/td>\n<td>Region-specific metric and independent comparison<\/td>\n<\/tr>\n<tr>\n<td>Clinically hidden error<\/td>\n<td>OAR DVH worsens with acceptable gamma<\/td>\n<td>D2%, Dmean, Vx, and structure-level review<\/td>\n<\/tr>\n<tr>\n<td>Version change<\/td>\n<td>New MLC, TPS, or protocol<\/td>\n<td>Regression revalidation before use<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/figure>\n<h2>3. Use layered metrics<\/h2>\n<p>Gamma is useful, but insufficient alone. Combine voxel metrics, DVH, structure-level error, and worst-case analysis. For fast beam-level models, include segment or beamlet error. 
For full planning, include prescription, coverage, and OAR metrics.<\/p>\n<ul>\n<li>3D local gamma with documented dose threshold and strict criteria.<\/li>\n<li>MAE in high-, mid-, and low-dose regions.<\/li>\n<li>D98%, V95%, D2%, Dmean, and protocol-specific metrics.<\/li>\n<li>Error at interfaces and high-density materials.<\/li>\n<li>Inference time and batch failure rate.<\/li>\n<\/ul>\n<h2>4. Treat OOD as a safety requirement<\/h2>\n<p>AI models can fail silently. A mature protocol must state when the model should refuse, warn, or require independent review. Examples include anatomy outside training, missing contours, non-standard names, degraded MRI, unusual isocenter, and incompatible prescription.<\/p>\n<h2>5. Separate scientific validation from clinical validation<\/h2>\n<p>A benchmark such as <a href=\"https:\/\/doserad2026.grand-challenge.org\/\" rel=\"noopener\" target=\"_blank\">DoseRAD2026<\/a> measures performance under controlled rules. Clinical validation must also include DICOM integration, permissions, logs, traceability, model updates, cybersecurity, user training, and rollback.<\/p>\n<h2>6. Suggested acceptance structure<\/h2>\n<p>Final thresholds belong to the department and intended use. Still, validation must define limits before final testing, not after. Include automatic rejection, physics-review, and assistive-use approval criteria.<\/p>\n<ul>\n<li>No critical case with clinically relevant DVH error without warning.<\/li>\n<li>Stratified performance by site, protocol, and complexity.<\/li>\n<li>Reproducibility after TPS, model, or library update.<\/li>\n<li>Failure log and periodic technical committee review.<\/li>\n<\/ul>\n<h2>FAQ<\/h2>\n<h3>What is the biggest validation mistake?<\/h3>\n<p>Validating only the average. A model may look good overall and fail exactly in the rare cases that require stronger physics control.<\/p>\n<h3>Can the model be accepted if gamma is high?<\/h3>\n<p>Not automatically. 
Gamma must be combined with DVH, structure-level assessment, error in critical regions, and out-of-domain analysis.<\/p>\n<h3>When should revalidation happen?<\/h3>\n<p>After updates to TPS, model, protocol, structure set, scanner, imaging modality, energy, MLC, or treated population.<\/p>\n<section class=\"dose-ai-references\">\n<h2>References<\/h2>\n<ol>\n<li>AAPM TG-218. <a href=\"https:\/\/www.aapm.org\/pubs\/reports\/RPT_218.pdf\" rel=\"noopener\" target=\"_blank\">https:\/\/www.aapm.org\/pubs\/reports\/RPT_218.pdf<\/a><\/li>\n<li>FDA AI\/ML-enabled Software as a Medical Device. <a href=\"https:\/\/www.fda.gov\/medical-devices\/software-medical-device-samd\/artificial-intelligence-and-machine-learning-software-medical-device\" rel=\"noopener\" target=\"_blank\">https:\/\/www.fda.gov\/medical-devices\/software-medical-device-samd\/artificial-intelligence-and-machine-learning-software-medical-device<\/a><\/li>\n<li>DoseRAD2026 metrics and ranking. <a href=\"https:\/\/doserad2026.grand-challenge.org\/metrics-and-ranking\/\" rel=\"noopener\" target=\"_blank\">https:\/\/doserad2026.grand-challenge.org\/metrics-and-ranking\/<\/a><\/li>\n<li>RaySearch deep learning planning. <a href=\"https:\/\/www.raysearchlabs.com\/media\/publications\/white-papers\/deep-learning-planning\/\" rel=\"noopener\" target=\"_blank\">https:\/\/www.raysearchlabs.com\/media\/publications\/white-papers\/deep-learning-planning\/<\/a><\/li>\n<li>MVision Dose+. 
<a href=\"https:\/\/mvision.ai\/dose\/\" rel=\"noopener\" target=\"_blank\">https:\/\/mvision.ai\/dose\/<\/a><\/li>\n<\/ol>\n<\/section>\n<aside aria-label=\"AI dose prediction series\" class=\"dose-ai-series\">\n<h2>Series: AI-predicted dose<\/h2>\n<ul>\n<li><a href=\"https:\/\/rtmedical.com.br\/en\/ai-radiotherapy-dose-calculation-monte-carlo\/\">Hub: AI dose calculation<\/a><\/li>\n<li><a href=\"https:\/\/rtmedical.com.br\/en\/ai-predicted-dose-mvision-raystation-optiplan\/\">MVision, RayStation, and OptiPlan<\/a><\/li>\n<li><a href=\"https:\/\/rtmedical.com.br\/en\/doserad2026-monte-carlo-ai-dose-calculation\/\">DoseRAD2026 and Monte Carlo<\/a><\/li>\n<li><strong>Validation, QA, and commissioning<\/strong><\/li>\n<\/ul>\n<\/aside>\n<aside aria-label=\"Dose-calculation algorithm map\" class=\"dose-cluster-nav\">\n<h2>Dose-calculation algorithm map<\/h2>\n<h3>Methods and algorithms<\/h3>\n<ul>\n<li><a href=\"https:\/\/rtmedical.com.br\/en\/photon-dose-calculation-algorithms\/\">Complete guide<\/a><\/li>\n<li><a href=\"https:\/\/rtmedical.com.br\/en\/empirical-broad-beam-dose-calculation\/\">Empirical methods and Batho<\/a><\/li>\n<li><a href=\"https:\/\/rtmedical.com.br\/en\/superposition-clarkson-terma-dose\/\">Clarkson, superposition, and TERMA<\/a><\/li>\n<li><a href=\"https:\/\/rtmedical.com.br\/en\/pencil-beam-radiotherapy-limitations\/\">Pencil Beam<\/a><\/li>\n<li><a href=\"https:\/\/rtmedical.com.br\/en\/collapsed-cone-convolution-kernels\/\">Collapsed Cone<\/a><\/li>\n<li><a href=\"https:\/\/rtmedical.com.br\/en\/aaa-eclipse-algorithm-explained\/\">AAA<\/a><\/li>\n<li><a href=\"https:\/\/rtmedical.com.br\/en\/acuros-xb-lbte-dose-calculation\/\">Acuros XB<\/a><\/li>\n<li><a href=\"https:\/\/rtmedical.com.br\/en\/dose-to-medium-vs-dose-to-water-radiotherapy\/\">Dose to medium vs dose to water<\/a><\/li>\n<li><a href=\"https:\/\/rtmedical.com.br\/en\/monte-carlo-radiotherapy-guide\/\">Monte Carlo<\/a><\/li>\n<\/ul>\n<h3>Advanced 
applications<\/h3>\n<ul>\n<li><a href=\"https:\/\/rtmedical.com.br\/en\/monaco-gpumcd-dose-to-medium-dose-to-water\/\">Monaco and GPUMCD<\/a><\/li>\n<li><a href=\"https:\/\/rtmedical.com.br\/en\/electron-dose-algorithms-pencil-beam-emc-monte-carlo\/\">Electron dose algorithms<\/a><\/li>\n<li><a href=\"https:\/\/rtmedical.com.br\/en\/protons-pencil-beam-vs-monte-carlo-dose-calculation\/\">Protons: Pencil Beam vs Monte Carlo<\/a><\/li>\n<li><a href=\"https:\/\/rtmedical.com.br\/en\/mr-linac-magnetic-field-dose-calculation-monte-carlo\/\">MR-Linac dose calculation<\/a><\/li>\n<li><a href=\"https:\/\/rtmedical.com.br\/en\/adaptive-radiotherapy-dose-recalculation-cbct-synthetic-ct\/\">Adaptive recalculation on CBCT and synthetic CT<\/a><\/li>\n<li><a href=\"https:\/\/rtmedical.com.br\/en\/ai-radiotherapy-dose-calculation-monte-carlo\/\">AI dose calculation<\/a><\/li>\n<li><a href=\"https:\/\/rtmedical.com.br\/en\/commissioning-qa-dose-algorithm-comparison\/\">Commissioning and QA<\/a><\/li>\n<\/ul>\n<\/aside>\n","protected":false},"excerpt":{"rendered":"<p>Clinical validation checklist for AI-predicted dose: intended use, reference data, DVH, gamma, OOD, and governance.<\/p>\n","protected":false},"author":1,"featured_media":18290,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"om_disable_all_campaigns":false,"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"ngg_post_thumbnail":0,"fifu_image_url":"","fifu_image_alt":"","footnotes":""},"categories":[99,230],"tags":[],"class_list":{"0":"post-18304","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-radiotherapy","8":"category-software-en"},"aioseo_notices":[],"rt_seo":{"title":"How to Validate AI-Predicted Dose: QA, Commissioning, and Cl","description":"How to validate AI-predicted dose models in 
radiotherapy: QA, commissioning, OOD, gamma, DVH, and clinical limits.","canonical":"https:\/\/rtmedical.com.br\/en\/validate-ai-predicted-dose-qa-commissioning\/","og_image":"https:\/\/rtmedical.com.br\/wp-content\/uploads\/2026\/06\/ai-dose-validation-qa.jpg","robots":"index,follow","schema_type":"Article","include_in_llms":true,"llms_label":"Technical guide","llms_summary":"Clinical validation checklist for AI-predicted dose: intended use, reference data, DVH, gamma, OOD, and governance.","faq_items":[{"q":"What is the biggest validation mistake?","a":"Validating only the average. Rare and out-of-domain cases are the most important safety tests."},{"q":"Can the model be accepted if gamma is high?","a":"Not automatically. Gamma must be combined with DVH, structure-level analysis, regional error, and edge cases."},{"q":"When should revalidation happen?","a":"After changes to TPS, model, protocol, structure set, scanner, imaging, energy, MLC, or treated population."}],"video":[],"gtin":"","mpn":"","brand":"","aggregate_rating":[]},"_links":{"self":[{"href":"https:\/\/rtmedical.com.br\/en\/wp-json\/wp\/v2\/posts\/18304\/"}],"collection":[{"href":"https:\/\/rtmedical.com.br\/en\/wp-json\/wp\/v2\/posts\/"}],"about":[{"href":"https:\/\/rtmedical.com.br\/en\/wp-json\/wp\/v2\/types\/post\/"}],"author":[{"embeddable":true,"href":"https:\/\/rtmedical.com.br\/en\/wp-json\/wp\/v2\/users\/1\/"}],"replies":[{"embeddable":true,"href":"https:\/\/rtmedical.com.br\/en\/wp-json\/wp\/v2\/comments\/?post=18304"}],"version-history":[{"count":1,"href":"https:\/\/rtmedical.com.br\/en\/wp-json\/wp\/v2\/posts\/18304\/revisions\/"}],"predecessor-version":[{"id":18306,"href":"https:\/\/rtmedical.com.br\/en\/wp-json\/wp\/v2\/posts\/18304\/revisions\/18306\/"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/rtmedical.com.br\/en\/wp-json\/wp\/v2\/media\/18290\/"}],"wp:attachment":[{"href":"https:\/\/rtmedical.com.br\/en\/wp-json\/wp\/v2\/media\/?parent=18304"}],"wp:term":[{"taxonomy":
"category","embeddable":true,"href":"https:\/\/rtmedical.com.br\/en\/wp-json\/wp\/v2\/categories\/?post=18304"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/rtmedical.com.br\/en\/wp-json\/wp\/v2\/tags\/?post=18304"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}