Regex DoS Risk

LOW

Detects re.compile(), re.match(), re.search(), and re.findall() calls that should be audited for catastrophic backtracking patterns.

Rule Information

Language

Python

pathfinder scan --ruleset python/PYTHON-LANG-SEC-103 --project .

About This Rule

Understanding the vulnerability and how it is detected

This rule flags calls to re.compile(), re.match(), re.search(), and re.findall() as audit items. Regular expressions with nested quantifiers, overlapping alternatives, or unbounded repetition can cause catastrophic backtracking where matching time grows exponentially with input length.

An attacker who controls the input string can craft a payload that causes a single regex match to consume 100% CPU for minutes or hours, effectively creating a denial-of-service condition. This is known as ReDoS (Regular Expression Denial of Service).

The rule operates at audit level because not all regex patterns are vulnerable. Review each flagged pattern for nested quantifiers like (a+)+, (a|a)+, or (a*)*. Consider using the re2 library for untrusted input, which guarantees linear-time matching.

Security Implications

Potential attack scenarios if this vulnerability is exploited

Denial of Service via Catastrophic Backtracking

A regex pattern like ^(a+)+$ takes exponential time on inputs like "aaaaaaaaaaaaaaaaab". Each additional 'a' doubles the matching time. An attacker can send a short string that locks up a worker thread for minutes.
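One way to see where the doubling comes from is to count how many ways a run of n 'a' characters can be divided into the non-empty groups that (a+)+ iterates over. The sketch below is a simplified model of the search space, not a trace of the re engine; count_splits is a hypothetical helper introduced only for illustration. When the trailing "b" prevents $ from matching, a backtracking engine has to reject every one of these splits before it can report failure.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_splits(n: int) -> int:
    """Count the ways to split a run of n 'a' characters into non-empty groups."""
    if n == 0:
        return 1  # nothing left to split, so this branch is complete
    # The first group takes k characters (1..n); the remainder is split recursively.
    return sum(count_splits(n - k) for k in range(1, n + 1))


for n in (1, 2, 3, 4, 10, 20):
    print(f"{n:>2} characters -> {count_splits(n)} possible splits")
```

The count is 2^(n-1), so each extra 'a' doubles the number of candidate splits, which matches the growth described above.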

Application Unavailability

In web applications, a ReDoS attack ties up request-handling threads. A handful of crafted requests can exhaust the thread pool and make the application unresponsive to legitimate users.

Amplification in Validation Logic

Regex patterns in input validation (email, URL, phone number) are common ReDoS targets because they process user-controlled input directly. A vulnerability in a validation regex affects every request that triggers it.

How to Fix

Recommended remediation steps

1. Audit regex patterns for nested quantifiers such as (a+)+, (a|a)+, or (.*a){10}.
2. Use atomic groups or possessive quantifiers where supported (the standard re module accepts them from Python 3.11) to prevent backtracking.
3. Consider the re2 library (google-re2) for patterns that process untrusted input, as it guarantees linear-time matching.
4. Set input length limits before applying regex patterns to user-controlled strings.
5. For defense in depth, bound matching time. The standard re module's match() and search() do not accept a timeout parameter; the third-party regex module does, or matching can run in a separate process that is terminated when it overruns.
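Two of the steps above, limiting input length and removing the nested quantifier, can be sketched as follows. This is a minimal illustration rather than a complete validator: MAX_INPUT_LENGTH, SAFE_PATTERN, and safe_match are hypothetical names, and the cap of 256 is an arbitrary value to be chosen per field.

```python
import re
import sys

# Hypothetical cap; pick a limit that fits the field being validated.
MAX_INPUT_LENGTH = 256

# ^(a+)+$ accepts the same strings as ^a+$, but the flat form has no
# nested quantifier for the engine to backtrack through.
SAFE_PATTERN = re.compile(r"^a+$")

# Possessive quantifiers are only accepted by re on Python 3.11 and later,
# so this alternative is compiled behind a version check.
if sys.version_info >= (3, 11):
    POSSESSIVE_PATTERN = re.compile(r"^(?:a++)+$")


def safe_match(text: str) -> bool:
    """Reject oversized input before the regex engine ever sees it."""
    if len(text) > MAX_INPUT_LENGTH:
        return False
    return SAFE_PATTERN.match(text) is not None


print(safe_match("aaaa"))          # short valid input
print(safe_match("a" * 30 + "b"))  # near-miss input that stalls ^(a+)+$
print(safe_match("a" * 10_000))    # rejected by the length check
```

The length check runs first so that even a pattern that was missed in the audit only ever sees bounded input.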

Detection Scope

How Code Pathfinder analyzes your code for this vulnerability

This rule matches all calls to re.compile(), re.match(), re.search(), and re.findall() via the QueryType pattern ReModule.method("compile", "match", "search", "findall"). It operates as an audit rule that flags all regex usage for manual review of the pattern string.
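As an illustration of that scope, each of the four calls in the hypothetical file below would be expected to produce an audit finding, going by the description above. The patterns and inputs are made up for the example, and none of them is actually prone to catastrophic backtracking, which is why the findings need manual review.

```python
import re

# Hypothetical sample file: every call below uses one of the four
# functions the rule description lists.
WORD = re.compile(r"\w+")                        # re.compile()
starts = re.match(r"\d+", "123abc")              # re.match()
found = re.search(r"abc", "123abc")              # re.search()
words = re.findall(r"[a-z]+", "one two three")   # re.findall()

print(WORD.pattern, starts.group(), found.start(), words)
```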

Compliance & Standards

Industry frameworks and regulations that require detection of this vulnerability

OWASP Top 10

A06:2021 - Vulnerable and Outdated Components

CWE

CWE-1333 - Inefficient Regular Expression Complexity

NIST SP 800-53

SC-5: Denial of Service Protection

References

External resources and documentation

CWE-1333: Inefficient Regular Expression Complexity
OWASP ReDoS Prevention
Python re module documentation
Google RE2 library

Similar Rules

Explore related security rules for Python

HIGH

Dangerous eval() Usage Detected

eval() executes arbitrary Python expressions from strings, enabling remote code execution when called with untrusted input.

HIGH

Unverified SSL Context Created

ssl._create_unverified_context() disables certificate verification entirely, making TLS connections vulnerable to man-in-the-middle attacks.

Frequently Asked Questions

Common questions about Regex DoS Risk

No. Simple patterns without nested quantifiers or overlapping alternatives are safe. The risk comes from patterns like (a+)+, (a|a)+, or (.+)+ where the engine can try exponentially many matching paths.

Try matching your pattern against a string that almost matches but doesn't. For example, if your pattern expects a valid email, try a long string of valid-looking characters followed by an invalid character. If matching time grows noticeably with input length, the pattern is vulnerable.
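A rough version of that test can be scripted. The sketch below is one possible harness, not part of Code Pathfinder: time_match and probe are hypothetical helpers, the "!" suffix stands in for the invalid character, and the input sizes are kept small so that even an exponential pattern finishes quickly.

```python
import re
import time


def time_match(pattern: str, text: str, repeats: int = 3) -> float:
    """Return the best-of-N wall-clock time for one match attempt."""
    compiled = re.compile(pattern)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        compiled.match(text)
        best = min(best, time.perf_counter() - start)
    return best


def probe(pattern: str, sizes=(10, 14, 18, 20)):
    """Time the pattern on near-miss inputs: n valid characters, then one invalid."""
    return [(n, time_match(pattern, "a" * n + "!")) for n in sizes]


for name, pattern in (("nested", r"^(a+)+$"), ("flat", r"^a+$")):
    for n, seconds in probe(pattern):
        print(f"{name:<6} n={n:>2} {seconds:.6f}s")
```

If the time climbs steeply between sizes, as it should for the nested pattern here, keep the sizes small when probing further, since a few more characters can turn seconds into minutes.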

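The standard-library re module does not provide a timeout parameter on re.match(), re.search(), or its other functions. What Python 3.11+ added is support for atomic groups and possessive quantifiers, which let you rule out backtracking in the pattern itself. For a hard time limit, use the third-party regex module, whose matching functions accept a timeout argument, or run regex matching in a separate process that can be terminated (a thread cannot be forcibly stopped in Python), or use the google-re2 library.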

Not all regex usage is vulnerable. This is an audit rule that flags regex calls for review. The actual severity depends on whether the pattern has nested quantifiers and whether the input is user-controlled.

Run: pathfinder ci --ruleset python/lang --project .
