Insecure xml.dom.minidom Usage (XXE)

Severity: MEDIUM

xml.dom.minidom is vulnerable to XML External Entity (XXE) attacks. Use defusedxml.minidom for safe XML parsing.

Rule Information

Language: Python
Category: Python Core
Author: Shivasurya
Last Updated: 2026-03-22
Tags: python, minidom, xml, xxe, xml-external-entity, CWE-611, OWASP-A05

Interactive Playground

Experiment with the vulnerable code and security rule below. Edit the code to see how the rule detects different vulnerability patterns.

pathfinder scan --ruleset python/PYTHON-LANG-SEC-091 --project .

About This Rule

Understanding the vulnerability and how it is detected

Python's xml.dom.minidom provides a DOM-based XML parser that can expose applications to XML External Entity (XXE) injection when it parses untrusted input. minidom uses the expat parser internally, and whether external entity references are resolved depends on the Python version and parser configuration.

minidom is commonly used for XML generation and pretty-printing (toprettyxml()) as well as parsing. When used to parse untrusted XML input via minidom.parse() or minidom.parseString(), XXE attacks can lead to local file disclosure, server-side request forgery, and denial-of-service.
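As a harmless illustration of the mechanism, the sketch below uses only the standard library to parse a document whose inline DTD declares an internal entity. The sample document and the helper name `text_of` are made up for this example. It does not perform an external-entity attack; it only shows that minidom acts on declarations supplied by whoever wrote the document, which is the surface that XXE and entity-expansion attacks build on.

```python
from xml.dom import minidom

# An inline DTD declaring one internal entity. Harmless here, but the
# declaration comes from the document author, not from the application.
DOC = """<?xml version="1.0"?>
<!DOCTYPE note [
  <!ENTITY greeting "hello from the DTD">
]>
<note>&greeting;</note>"""


def text_of(node):
    """Concatenate the data of all text nodes directly under `node`."""
    return "".join(
        child.data
        for child in node.childNodes
        if child.nodeType == child.TEXT_NODE
    )


doc = minidom.parseString(DOC)

# The entity reference has been replaced by the text declared in the DTD.
print(text_of(doc.documentElement))

# The declaration itself is also recorded on the doctype node.
print(doc.doctype.entities.length)
```

The application never mentioned the `greeting` entity, yet its replacement text ends up in the DOM. Nested or very long entity definitions exploit the same expansion step.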

defusedxml.minidom provides a safe replacement that prevents XXE and other XML attacks while maintaining API compatibility.

Security Implications

Potential attack scenarios if this vulnerability is exploited

1. File Disclosure via External Entities

When the parser resolves external entities, an entity that points at a local file is expanded into the DOM. The file's contents are then returned whenever the application reads the affected text nodes or attributes.

2. SSRF via External Entity URLs

External entities referencing http:// or https:// URLs cause the parser to make outbound HTTP requests. This enables SSRF attacks against internal services, cloud metadata endpoints, and internal APIs.

3. Out-of-Band Data Exfiltration

XXE payloads built on XML parameter entities can exfiltrate file contents to attacker-controlled servers even when the parsed XML result is never returned to the attacker.

4. Denial of Service via DTD Attacks

DTD-based attacks (Billion Laughs, quadratic blowup) cause exponential or quadratic memory consumption during parsing, crashing the application and potentially affecting other services on the same host.
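All four scenarios above start with a DTD in the incoming document. The sketch below is a standard-library-only illustration of the idea behind refusing such documents: it runs expat once with a handler that aborts as soon as a DOCTYPE declaration begins, and builds the DOM only if that pass succeeds. The names `ForbiddenXmlConstruct`, `reject_dtd`, and `parse_without_dtd` are hypothetical. This is not how defusedxml is implemented internally, and it is not a substitute for it.

```python
from xml.dom import minidom
from xml.parsers import expat


class ForbiddenXmlConstruct(ValueError):
    """Raised when a document carries a DOCTYPE declaration (hypothetical name)."""


def _reject_doctype(name, system_id, public_id, has_internal_subset):
    # Called by expat when a DOCTYPE declaration starts, before any entity
    # declared inside it has been processed.
    raise ForbiddenXmlConstruct(f"DOCTYPE declaration not allowed: {name!r}")


def reject_dtd(data: bytes) -> None:
    """Scan `data` with expat and raise if it contains a DOCTYPE."""
    parser = expat.ParserCreate()
    parser.StartDoctypeDeclHandler = _reject_doctype
    parser.Parse(data, True)  # exceptions raised in handlers propagate here


def parse_without_dtd(data: bytes):
    """Build a minidom DOM only for documents that carry no DTD."""
    reject_dtd(data)
    return minidom.parseString(data)


if __name__ == "__main__":
    print(parse_without_dtd(b"<order><id>42</id></order>").documentElement.tagName)
    try:
        parse_without_dtd(b'<!DOCTYPE r [<!ENTITY a "x">]><r>&a;</r>')
    except ForbiddenXmlConstruct as exc:
        print("rejected:", exc)
```

The trade-off in this sketch is that every accepted document is parsed twice. A maintained library such as defusedxml avoids that cost and covers more cases.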

How to Fix

Recommended remediation steps

  1. Replace xml.dom.minidom.parse() and minidom.parseString() with defusedxml.minidom.parse() and defusedxml.minidom.parseString().
  2. Install defusedxml: pip install defusedxml. The API is compatible with standard minidom.
  3. Note that xml.dom.minidom.Document() for XML generation (not parsing) is safe and does not need to be replaced.
  4. Validate the XML structure and content after safe parsing to ensure it meets expected schema constraints.
  5. For large XML documents, consider SAX-based parsing (defusedxml.sax), which uses less memory than DOM parsing.
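The first two steps can be sketched as a small wrapper. It assumes the third-party defusedxml package is installed. The helper name `parse_untrusted` and the choice to return `None` for rejected documents are illustrative, not part of any library.

```python
try:
    # Third-party dependency: pip install defusedxml
    from defusedxml import DefusedXmlException
    from defusedxml import minidom as safe_minidom
except ImportError:
    # Keeps this sketch importable when the package is missing.
    safe_minidom = None
    DefusedXmlException = ValueError


def parse_untrusted(xml_text):
    """Parse untrusted XML; return None if defusedxml rejects it.

    Hypothetical helper: a real application may prefer to log the
    rejection or surface an error to the caller instead.
    """
    if safe_minidom is None:
        raise RuntimeError("defusedxml is required to parse untrusted XML")
    try:
        return safe_minidom.parseString(xml_text)
    except DefusedXmlException:
        # Raised for forbidden constructs such as entity declarations.
        return None


if __name__ == "__main__" and safe_minidom is not None:
    ok = parse_untrusted("<order><id>42</id></order>")
    print(ok.documentElement.tagName)

    blocked = parse_untrusted('<!DOCTYPE r [<!ENTITY a "x">]><r>&a;</r>')
    print(blocked)
```

The returned object is an ordinary minidom Document, so existing traversal code such as getElementsByTagName() should continue to work unchanged.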

Detection Scope

How Code Pathfinder analyzes your code for this vulnerability

This rule detects calls to xml.dom.minidom.parse(), xml.dom.minidom.parseString(), and the abbreviated import forms in Python source code. XML generation via Document() is not flagged.
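For orientation, the snippet below shows the call shapes this description appears to cover. The exact set of import forms the rule matches is an assumption here, in particular the direct function import. `SAMPLE` stands in for untrusted input.

```python
import xml.dom.minidom
from xml.dom import minidom
from xml.dom.minidom import parseString

SAMPLE = "<config><name>demo</name></config>"  # stands in for untrusted input

# Parsing calls: the kind of call the rule is described as detecting.
doc_a = xml.dom.minidom.parseString(SAMPLE)  # fully qualified form
doc_b = minidom.parseString(SAMPLE)          # module imported from xml.dom
doc_c = parseString(SAMPLE)                  # assumption: direct function import

# Generation only: described as not flagged, since nothing is parsed.
out = minidom.Document()
root = out.createElement("config")
out.appendChild(root)
```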

Compliance & Standards

Industry frameworks and regulations that require detection of this vulnerability

OWASP Top 10: A05:2021 - Security Misconfiguration (XXE)
CWE Top 25: CWE-611 - Improper Restriction of XML External Entity Reference
PCI DSS v4.0: Requirement 6.2.4 - Protect against injection attacks including XXE
NIST SP 800-53: SI-10: Information Input Validation

Frequently Asked Questions

Common questions about Insecure xml.dom.minidom Usage (XXE)

toprettyxml() serializes a DOM tree to a string. If the DOM was created by parsing untrusted XML using defusedxml.minidom, the output is safe since XXE was blocked during parsing. If the DOM was created from unsafe minidom parsing, the XXE payload has already been executed. The safety depends on how the DOM was created, not on toprettyxml() itself.
xml.dom.minidom does not include built-in XML schema validation. For schema validation with DOM-style parsing, use lxml with an XMLSchema object. Alternatively, parse with defusedxml.ElementTree and validate the resulting element tree against an expected structure.
xml.dom.minidom is not deprecated as of Python 3.12 but it is considered a legacy API. For new XML processing code, xml.etree.ElementTree (or defusedxml.ElementTree) is generally preferred for its simpler API and better performance.
ElementTree is generally faster and uses less memory than minidom for both parsing and traversal. minidom loads the entire document into a full DOM tree with Node objects, while ElementTree uses a lighter-weight element representation. For performance-sensitive parsing, prefer defusedxml.ElementTree over defusedxml.minidom.
Replace the import: instead of from xml.dom import minidom, use import defusedxml.minidom as minidom. The parse() and parseString() APIs are compatible. The Document class for XML generation is still imported from xml.dom.minidom directly.
The rule flags minidom.parse() and minidom.parseString() which are the parsing functions vulnerable to XXE. xml.dom.minidom.Document() and the DOM manipulation API used for XML generation are not flagged as they do not parse external input.
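To make the generation-only case concrete, here is a minimal sketch that builds a document with xml.dom.minidom.Document and serializes it with toprettyxml(). The function name `build_report` and the element names are invented for the example. No parse call is involved, so this is the kind of code the answers above describe as not flagged.

```python
from xml.dom.minidom import Document


def build_report(items):
    """Build a small XML report from plain strings and pretty-print it."""
    doc = Document()
    root = doc.createElement("report")
    doc.appendChild(root)
    for name in items:
        item = doc.createElement("item")
        # Text nodes are escaped on serialization, so "<" becomes "&lt;".
        item.appendChild(doc.createTextNode(name))
        root.appendChild(item)
    return doc.toprettyxml(indent="  ")


if __name__ == "__main__":
    print(build_report(["alpha", "a < b"]))
```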
