pathfinder scan --ruleset python/PYTHON-LANG-SEC-090 --project .
About This Rule
Understanding the vulnerability and how it is detected
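Python's xml.etree.ElementTree and related modules (xml.dom.minidom, xml.sax, xml.parsers.expat) are not hardened against maliciously constructed XML. Depending on the Python and Expat versions in use, parsing untrusted input with them can expose an application to XML External Entity (XXE) injection and entity-expansion attacks. XXE attacks occur when an XML parser processes external entity references in untrusted XML content.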
A malicious XML document can define external entities that reference local files (/etc/passwd, /etc/shadow, SSH private keys) or internal network resources. When the parser resolves these entities, the file contents are embedded in the parsed XML and may be returned to the attacker. Alternatively, the parser issues HTTP/HTTPS requests to internal network addresses, which enables server-side request forgery (SSRF).
The defusedxml library provides safe drop-in replacements for all of Python's XML parsers that disable external entities, DTD processing, and other dangerous XML features.
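A minimal sketch of the swap, assuming defusedxml is installed (pip install defusedxml). The import is guarded so the snippet still runs where the package is missing, and the helper name parse_untrusted is hypothetical, not part of defusedxml.

```python
try:
    # Third-party package (assumption: installed via `pip install defusedxml`).
    from defusedxml import DefusedXmlException
    from defusedxml.ElementTree import fromstring as safe_fromstring
except ImportError:
    DefusedXmlException = None
    safe_fromstring = None

XXE_PAYLOAD = b"""<?xml version="1.0"?>
<!DOCTYPE data [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<data>&xxe;</data>"""


def parse_untrusted(xml_bytes):
    """Parse untrusted XML with defusedxml; return None if it is rejected."""
    if safe_fromstring is None:
        raise RuntimeError("defusedxml is not installed")
    try:
        return safe_fromstring(xml_bytes)
    except DefusedXmlException:
        # Base class for defusedxml's rejections (entity declarations,
        # external references, and DTDs when forbid_dtd=True is passed).
        return None


if safe_fromstring is not None:
    print(parse_untrusted(b"<data>ok</data>").text)  # benign input parses normally
    print(parse_untrusted(XXE_PAYLOAD))              # entity declaration is rejected
```

Because the module mirrors the standard library's function names, most call sites only need their import line changed.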
Security Implications
Potential attack scenarios if this vulnerability is exploited
Local File Disclosure via XXE
An XXE payload with <!ENTITY xxe SYSTEM "file:///etc/passwd"> reads local files and embeds their content in the parsed XML. Attackers can read application configuration files, SSH private keys, database credentials, and any file readable by the web server process.
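For concreteness, a complete payload document built around that entity declaration might look like the one below. The sketch also lists the external entities a document declares without resolving them, using only the standard library. The helper name list_external_entities is hypothetical; the sketch relies on pyexpat not fetching external entities when no ExternalEntityRefHandler is set.

```python
from xml.parsers import expat

XXE_PAYLOAD = b"""<?xml version="1.0"?>
<!DOCTYPE user [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<user><name>&xxe;</name></user>"""


def list_external_entities(xml_bytes):
    """Return (entity_name, system_id) pairs for external entity declarations."""
    found = []

    def on_entity_decl(name, is_parameter_entity, value, base,
                       system_id, public_id, notation_name):
        # Internal entities carry a value; external ones carry a system_id.
        if system_id is not None:
            found.append((name, system_id))

    parser = expat.ParserCreate()
    parser.EntityDeclHandler = on_entity_decl
    parser.Parse(xml_bytes, True)  # no external handler set, so nothing is fetched
    return found


print(list_external_entities(XXE_PAYLOAD))  # [('xxe', 'file:///etc/passwd')]
```

A check like this can be useful in logging or triage to see which files or URLs a rejected document was trying to reach.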
Server-Side Request Forgery (SSRF)
External entity references can point to internal network URLs (http://169.254.169.254/ for AWS metadata, http://10.0.0.1/admin). The XML parser makes HTTP requests to these addresses, enabling internal network discovery and access to cloud metadata credentials.
Denial of Service via Billion Laughs
The "Billion Laughs" attack uses nested entity expansion to cause exponential memory consumption and CPU usage. A small XML document of a few kilobytes can exhaust all available memory, crashing the parser and the application.
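The growth can be worked out arithmetically. The sketch below assumes the classic layout of the attack: a base entity holding "lol" and nine further entities that each reference the previous one ten times. It builds the document text and computes the expanded size without parsing anything; the function name billion_laughs_stats is hypothetical.

```python
def billion_laughs_stats(levels=9, fanout=10, seed="lol"):
    """Build the nested-entity document and compute its fully expanded size.

    Nothing is parsed here; the expansion size is pure arithmetic.
    """
    decls = ['<!ENTITY lol0 "%s">' % seed]
    for i in range(1, levels + 1):
        refs = ("&lol%d;" % (i - 1)) * fanout  # each level repeats the previous one
        decls.append('<!ENTITY lol%d "%s">' % (i, refs))
    document = (
        '<?xml version="1.0"?>\n<!DOCTYPE lolz [\n'
        + "\n".join(decls)
        + "\n]>\n<lolz>&lol%d;</lolz>" % levels
    )
    expanded_chars = len(seed) * fanout ** levels
    return len(document), expanded_chars


doc_size, expanded = billion_laughs_stats()
print(doc_size < 1024)  # True: the document itself is under 1 KB
print(expanded)         # 3000000000 characters after full expansion
```

Each level multiplies the size by the fan-out, so nine levels of ten references turn a 3-character string into 3 x 10^9 characters, roughly 3 GB of text from a sub-kilobyte input.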
Blind XXE via Out-of-Band Channels
When the application does not return parsed content or error messages to the caller, attackers use out-of-band XXE: an entity references a URL they control (for example xxe.attacker.com), and file contents are exfiltrated through the DNS lookups or HTTP requests the parser makes to that server, even without direct output.
How to Fix
Recommended remediation steps
1. Replace xml.etree.ElementTree with defusedxml.ElementTree for all XML parsing of untrusted input.
2. Install defusedxml: pip install defusedxml. It provides safe drop-in replacements for all standard library XML parsers.
3. For configuration files or SOAP/REST XML responses, validate the parsed document against an XML schema (for example with lxml's etree.XMLSchema or the xmlschema package).
4. Limit the size of XML input accepted from external sources to prevent denial-of-service attacks.
5. Use JSON instead of XML for new API designs to avoid XML security complexity entirely.
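The size limit from the steps above can be enforced before any parser sees the data. The following stdlib-only sketch pairs it with an optional DOCTYPE pre-check; it is meant as a complement to defusedxml, not a substitute. The names MAX_XML_BYTES, UnsafeXML, read_limited, and reject_doctype are hypothetical, and the 1 MiB limit is an arbitrary assumption.

```python
import io
from xml.parsers import expat

MAX_XML_BYTES = 1024 * 1024  # assumed limit (1 MiB); tune per application


class UnsafeXML(ValueError):
    """Raised when input is too large or declares a DOCTYPE."""


def read_limited(stream, max_bytes=MAX_XML_BYTES):
    """Read at most max_bytes from a binary stream, rejecting larger inputs."""
    data = stream.read(max_bytes + 1)  # one extra byte reveals oversize input
    if len(data) > max_bytes:
        raise UnsafeXML("XML input exceeds %d bytes" % max_bytes)
    return data


def reject_doctype(xml_bytes):
    """Raise UnsafeXML if the document contains a DOCTYPE declaration."""
    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Fires before any entity declaration in the DTD is processed.
        raise UnsafeXML("DOCTYPE declarations are not accepted")

    parser = expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(xml_bytes, True)  # malformed XML raises expat.ExpatError


data = read_limited(io.BytesIO(b"<order><id>42</id></order>"))
reject_doctype(data)  # passes: no DOCTYPE, so there are no entities to expand
```

After both checks pass, data can be handed to defusedxml.ElementTree.fromstring for the actual parse.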
Detection Scope
How Code Pathfinder analyzes your code for this vulnerability
This rule detects calls to xml.etree.ElementTree.parse(), xml.etree.ElementTree.fromstring(), xml.etree.ElementTree.fromstringlist(), and xml.etree.ElementTree.iterparse() in Python source code. These functions use the vulnerable default parser unless explicitly replaced.
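To make the scope concrete, here is a simplified approximation of this kind of check using Python's ast module (ast.unparse requires Python 3.9 or later). It is not Code Pathfinder's implementation: the function name find_etree_parse_calls is hypothetical, and alias handling is deliberately minimal.

```python
import ast

FLAGGED = {"parse", "fromstring", "fromstringlist", "iterparse"}
MODULE = "xml.etree.ElementTree"


def find_etree_parse_calls(source):
    """Return sorted (line, function_name) pairs for flagged ElementTree calls."""
    tree = ast.parse(source)
    module_aliases = set()   # local names bound to the ElementTree module
    function_aliases = {}    # local name -> flagged function it refers to

    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name == MODULE and alias.asname:
                    module_aliases.add(alias.asname)
        elif isinstance(node, ast.ImportFrom):
            if node.module == "xml.etree":
                for alias in node.names:
                    if alias.name == "ElementTree":
                        module_aliases.add(alias.asname or alias.name)
            elif node.module == MODULE:
                for alias in node.names:
                    if alias.name in FLAGGED:
                        function_aliases[alias.asname or alias.name] = alias.name

    findings = []
    for node in ast.walk(tree):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if isinstance(func, ast.Attribute) and func.attr in FLAGGED:
            # Matches ET.parse(...) and xml.etree.ElementTree.parse(...)
            dotted = ast.unparse(func.value)
            if dotted in module_aliases or dotted == MODULE:
                findings.append((node.lineno, func.attr))
        elif isinstance(func, ast.Name) and func.id in function_aliases:
            findings.append((node.lineno, function_aliases[func.id]))
    return sorted(findings)


SAMPLE = """\
import xml.etree.ElementTree as ET
from xml.etree.ElementTree import fromstring
import defusedxml.ElementTree as SafeET

tree = ET.parse("upload.xml")
root = fromstring(body)
safe = SafeET.fromstring(body)
"""

print(find_etree_parse_calls(SAMPLE))  # [(5, 'parse'), (6, 'fromstring')]
```

The defusedxml call on the last line of the sample is not reported, because its receiver does not resolve to xml.etree.ElementTree.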
Similar Rules
Explore related security rules for Python
Insecure xml.dom.minidom Usage (XXE)
xml.dom.minidom is vulnerable to XML External Entity (XXE) attacks. Use defusedxml.minidom for safe XML parsing.
Insecure xmlrpc Usage (XXE Risk)
xmlrpc.client.ServerProxy and xmlrpc.server modules are vulnerable to XXE attacks via malicious XML-RPC payloads. Use defusedxml.xmlrpc for protection.