Django Path Traversal via os.path.join()

HIGH

User input flows into os.path.join() and then into file operations, enabling path traversal to files outside the intended directory.

Rule Information

Language
Python
Category
Django
Author
Shivasurya
Last Updated
2026-03-22
Tags
python, django, path-traversal, directory-traversal, os-path-join, file-access, taint-analysis, inter-procedural, CWE-22, OWASP-A01
Interactive Playground

Experiment with the vulnerable code and security rule below. Edit the code to see how the rule detects different vulnerability patterns.

pathfinder scan --ruleset python/PYTHON-DJANGO-SEC-041 --project .

About This Rule

Understanding the vulnerability and how it is detected

This rule detects path traversal vulnerabilities in Django applications where untrusted user input from HTTP request parameters flows into os.path.join() before being used in file system operations.

os.path.join() is often assumed to be safe for constructing file paths because it handles path separator differences. However, it has a security-relevant behavior: if any component is an absolute path (on POSIX, one that starts with /), all previous components are discarded and joining continues from the absolute component. This means os.path.join('/var/uploads', '/etc/passwd') returns '/etc/passwd'.
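The behavior is easy to confirm by calling the join function directly. The short sketch below uses posixpath, the POSIX implementation behind os.path on Unix-like systems, so the result is the same on any platform; the directory and file names are illustrative only.

```python
import posixpath  # POSIX flavor of os.path, so results do not depend on the host OS

base = "/var/uploads"

# A relative component is appended to the base, as most people expect.
relative = posixpath.join(base, "report.pdf")

# An absolute component discards everything that came before it.
absolute = posixpath.join(base, "/etc/passwd")

print(relative)  # /var/uploads/report.pdf
print(absolute)  # /etc/passwd
```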

Additionally, os.path.join() does not strip or normalize '../' traversal sequences, so os.path.join('/var/uploads', '../../../etc/passwd') returns a path that resolves outside the intended directory. These behaviors make os.path.join() with user input a path traversal vulnerability unless the result is resolved with os.path.realpath() and then checked against the intended base directory.
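A similar check shows the relative case. This is a minimal sketch, again using posixpath for platform-independent results; normpath() is used here only to make the final location visible, not as a recommended defense.

```python
import posixpath

base = "/var/uploads"

# join() keeps the ../ segments verbatim.
joined = posixpath.join(base, "../../../etc/passwd")

# Collapsing the segments lexically shows where the path actually points.
normalized = posixpath.normpath(joined)

# The normalized path no longer sits under the base directory.
escaped = not normalized.startswith(base + "/")

print(joined)      # /var/uploads/../../../etc/passwd
print(normalized)  # /etc/passwd
```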

Security Implications

Potential attack scenarios if this vulnerability is exploited

1. Absolute Path Override via os.path.join() Semantics

os.path.join('/uploads', user_input) where user_input is '/etc/passwd' returns '/etc/passwd' -- the upload directory prefix is silently discarded. This is a common misunderstanding of os.path.join() safety. Attackers who know this behavior can supply absolute paths to bypass any attempt to restrict access to a subdirectory.

2. Relative Traversal via ../ Sequences

os.path.join('/uploads', '../../../etc/shadow') returns a path that, when passed to open(), resolves to /etc/shadow. Standard traversal payloads work unchanged through os.path.join() because the function makes no attempt to normalize or restrict component values.

3. Symlink-Based Escape Even After basename()

If symbolic links are present within the intended directory, a path constructed with os.path.join() may resolve outside it even when basename() was applied first. The realpath() check is essential to catch symlink-based traversal that escapes the intended directory boundary.

4. Write Access to Critical Files

If os.path.join() output feeds into open() in write mode, log rotation scripts, or file deletion operations, path traversal enables overwriting or deleting configuration files, log files, or application code. On servers that auto-reload when source files change, overwriting a .py file can lead to code execution.
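The symlink scenario (item 3) can be reproduced in a throwaway directory. The sketch below assumes a platform where the process may create symbolic links (typical on Linux and macOS); is_within and demo_symlink_escape are hypothetical helper names, not part of any library.

```python
import os
import tempfile
from typing import Tuple


def is_within(base_dir: str, candidate: str) -> bool:
    """Return True if candidate resolves to a location inside base_dir."""
    real_base = os.path.realpath(base_dir)
    real_candidate = os.path.realpath(candidate)
    return real_candidate.startswith(real_base + os.sep)


def demo_symlink_escape() -> Tuple[bool, bool]:
    """Build a temporary layout with a symlink that points outside the base."""
    with tempfile.TemporaryDirectory() as root:
        uploads = os.path.join(root, "uploads")
        secret = os.path.join(root, "secret.txt")
        os.mkdir(uploads)
        with open(secret, "w") as handle:
            handle.write("outside the upload directory")

        # A link inside uploads/ whose target lives outside it.
        os.symlink(secret, os.path.join(uploads, "innocent.txt"))

        # basename() leaves a plain file name unchanged, so it cannot help here.
        name = os.path.basename("innocent.txt")
        candidate = os.path.join(uploads, name)

        # A purely lexical check is fooled; the realpath() check is not.
        lexically_inside = os.path.normpath(candidate).startswith(uploads + os.sep)
        really_inside = is_within(uploads, candidate)
        return lexically_inside, really_inside
```

Calling demo_symlink_escape() is expected to report that the path looks inside the directory lexically while the resolved path is not, which is the gap the realpath() check closes.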

How to Fix

Recommended remediation steps

  • 1. After os.path.join(), always call os.path.realpath() to resolve symlinks and normalize the path, then verify the result starts with the intended base directory + os.sep.
  • 2. Apply os.path.basename() to user-provided filename components before passing them to os.path.join() to strip directory components, including absolute path prefixes and ../ sequences.
  • 3. Use an explicit allowlist of permitted filenames or file IDs stored in the database rather than constructing paths from user input at all.
  • 4. Avoid accepting file paths as request parameters; instead, accept file IDs that map to paths stored securely in the database.
  • 5. When serving media files, use Django's storage API (default_storage.url()) rather than constructing raw filesystem paths.
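Steps 1 and 2 above can be combined into one small helper. The following is a minimal sketch, not a definitive implementation: resolve_under and UnsafePathError are hypothetical names introduced for illustration, and the check assumes the base directory is not the filesystem root.

```python
import os
import tempfile


class UnsafePathError(ValueError):
    """Raised when a requested name cannot be safely placed under the base."""


def resolve_under(base_dir: str, user_filename: str) -> str:
    """Return a resolved path for user_filename that is confined to base_dir."""
    # Step 2: keep only the final component, dropping directories and ../ parts.
    name = os.path.basename(user_filename)
    if name in ("", ".", ".."):
        raise UnsafePathError("no usable file name in request")

    # Step 1: join, then resolve symlinks and normalize.
    real_base = os.path.realpath(base_dir)
    candidate = os.path.realpath(os.path.join(real_base, name))

    # Boundary check: the trailing separator stops /uploads_evil matching /uploads.
    if not candidate.startswith(real_base + os.sep):
        raise UnsafePathError("resolved path escapes the base directory")
    return candidate


# Usage: a traversal payload is reduced to a plain name inside the directory.
with tempfile.TemporaryDirectory() as uploads:
    safe = resolve_under(uploads, "../../etc/passwd")
    stays_inside = safe == os.path.join(os.path.realpath(uploads), "passwd")
```

Raising an exception rather than returning None is a design choice in this sketch; it makes it harder for a caller to ignore a rejected path by accident.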

Detection Scope

How Code Pathfinder analyzes your code for this vulnerability

This rule performs inter-procedural taint analysis with global scope, following taint across file and module boundaries. Sources include calls("request.GET.get"), calls("request.POST.get"), calls("request.GET.__getitem__"), calls("request.POST.__getitem__"), calls("request.body"), and calls("request.read"). The sink is calls("os.path.join") where any tainted value appears in the arguments (tracked via .tracks(0)); the rule then tracks the joined path forward into file operation functions. Sanitizers include os.path.basename() applied before the join, combined with os.path.realpath() and a startswith() verification.
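To make the source-to-sink flow concrete, here is a sketch of the kind of code this description covers, with the tainted value crossing function boundaries. The names download, build_path, read_file, and UPLOAD_DIR are hypothetical, and a SimpleNamespace stands in for Django's HttpRequest so the snippet runs without Django installed. Whether the engine reports this exact snippet is an assumption, not something verified here.

```python
import os
import tempfile
from types import SimpleNamespace

UPLOAD_DIR = "/var/uploads"  # hypothetical base directory


def build_path(name):
    # Sink: a tainted value reaches os.path.join() as an argument.
    return os.path.join(UPLOAD_DIR, name)


def read_file(path):
    # File operation that consumes the joined path.
    with open(path, "rb") as handle:
        return handle.read()


def download(request):
    # Source: request.GET.get() returns attacker-controlled text.
    filename = request.GET.get("file", "")
    return read_file(build_path(filename))


# Demonstration with a stand-in request object (not a real HttpRequest).
with tempfile.TemporaryDirectory() as outside_dir:
    target = os.path.join(outside_dir, "secret.txt")
    with open(target, "wb") as handle:
        handle.write(b"not an upload")

    # An absolute path in the parameter makes join() drop UPLOAD_DIR entirely.
    fake_request = SimpleNamespace(GET={"file": target})
    leaked = download(fake_request)
```

The view never mentions the temporary directory, yet it returns that file's contents, because the absolute path supplied in the parameter replaces UPLOAD_DIR during the join.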

Compliance & Standards

Industry frameworks and regulations that require detection of this vulnerability

CWE Top 25
CWE-22 ranked #8 in 2023 Most Dangerous Software Weaknesses
OWASP Top 10
A01:2021 - Broken Access Control
PCI DSS v4.0
Requirement 6.2.4 - protect against path traversal attacks
NIST SP 800-53
SI-10: Information Input Validation; AC-3: Access Enforcement
ISO 27001
A.9.4.1: Information access restriction

Frequently Asked Questions

Common questions about Django Path Traversal via os.path.join()

os.path.join() handles path separators for portability, not for security. Its documented behavior has two consequences: if an argument is an absolute path (starts with / on Unix), all preceding components are discarded, and ../ components are not stripped. So os.path.join('/safe/dir', '/etc/passwd') returns '/etc/passwd' and os.path.join('/safe/dir', '../../etc/passwd') returns a path outside /safe/dir. Neither is safe with user input unless the result is resolved with realpath() and checked against the base directory.
os.path.normpath() collapses ../ sequences lexically but does not resolve symbolic links, so an attacker can use a symlink within the allowed directory that points outside it. os.path.realpath() resolves symlinks to their true filesystem locations, which is what makes the startswith() check against the resolved base directory (plus os.sep) reliable. Use realpath(), not normpath(), for security-critical path validation.
Only usages where user-controlled data is an argument to os.path.join() are at risk. If the paths are constructed entirely from configuration constants and hardcoded strings, they are not vulnerable. This rule only flags cases where tainted data from request parameters reaches os.path.join() arguments.
HTML and URL escaping functions do not protect against path traversal. They exist to prevent XSS or URL-encoding issues and do not strip ../ or absolute path prefixes from filenames. The correct sanitizers for path traversal are os.path.basename() and os.path.realpath() with a directory boundary check.
Store files using Django's FileField with UUID-based names (use upload_to with a callable that generates UUIDs). Serve downloads by looking up the FileField value from the model (which contains a safe, validated storage path) using the file ID from the URL, not a user-provided filename. This design eliminates the need for any path construction from user input.
Custom storage backends that use os.path.join() for path construction should implement the same basename + realpath + startswith validation pattern, or an equivalent boundary check. Django's built-in FileSystemStorage confines paths to its storage location by building them with django.utils._os.safe_join(), which raises SuspiciousFileOperation when the result falls outside the base; custom backends need a comparable safeguard. Review any custom storage backend implementations for path traversal vulnerabilities during security audits.
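The UUID-based naming suggested above for FileField can be sketched as an upload_to callable. Django calls such a callable with the model instance and the client-supplied filename; uuid_upload_to, the uploads/ prefix, and the ALLOWED_EXTENSIONS set are assumptions made for this example rather than anything prescribed by Django or by the rule.

```python
import os
import uuid

ALLOWED_EXTENSIONS = {".pdf", ".png", ".txt"}  # hypothetical allowlist


def uuid_upload_to(instance, filename):
    """Callable for FileField(upload_to=...): ignore the client-supplied name."""
    # Only the extension of the final component is considered, and only
    # when it is on the allowlist; everything else from the client is dropped.
    extension = os.path.splitext(os.path.basename(filename))[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        extension = ""

    # The stored name is generated server-side, so it cannot carry ../ or /.
    return f"uploads/{uuid.uuid4().hex}{extension}"
```

A model would reference it as FileField(upload_to=uuid_upload_to), and downloads would look the file up by its database ID instead of by any name taken from the request.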
