# PYTHON-FLASK-SEC-007: Flask Path Traversal via open()

> **Severity:** HIGH | **CWE:** CWE-22 | **OWASP:** A01:2021

- **Language:** Python
- **Category:** Flask
- **URL:** https://codepathfinder.dev/registry/python/flask/PYTHON-FLASK-SEC-007
- **Detection:** `pathfinder scan --ruleset python/PYTHON-FLASK-SEC-007 --project .`

## Description

This rule detects path traversal vulnerabilities in Flask applications where
user-controlled input from HTTP request parameters flows to Python's open() or
io.open() built-in functions without path sanitization. When an attacker supplies
directory traversal sequences (../, ..\, URL-encoded variants %2e%2e%2f) in a
filename or path parameter, the application reads or writes files outside the
intended directory.

Path traversal is one of the most common file-handling bugs in web applications.
In Flask apps it typically appears in file download endpoints (open the file,
return its contents) and file processing utilities (open the file, parse it).
The vulnerability is not always obvious because the user-supplied value may pass
through os.path.join() before reaching open(), and join() does not sanitize it:
os.path.join('uploads/', '../../../etc/passwd') returns
'uploads/../../../etc/passwd', which resolves outside uploads/, and
os.path.join('uploads/', '/etc/passwd') returns '/etc/passwd' because join()
discards earlier components when a later component is an absolute path.
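
The join behavior can be confirmed in an interpreter. The sketch below uses posixpath (the implementation behind os.path on Linux and macOS) so the results do not depend on the host operating system; the directory names are illustrative only.

```python
import posixpath

# A relative traversal payload is appended as-is; nothing is stripped.
joined = posixpath.join("uploads/", "../../../etc/passwd")
print(joined)                       # uploads/../../../etc/passwd

# Normalizing the result shows that it points outside uploads/.
normalized = posixpath.normpath(joined)
print(normalized)                   # ../../etc/passwd

# An absolute payload makes join() discard the base directory entirely.
absolute = posixpath.join("/var/app/uploads", "/etc/passwd")
print(absolute)                     # /etc/passwd
```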

The rule traces tainted data from Flask request sources to the filename argument
of open() and io.open() at position 0. Flows through os.path.basename() or
werkzeug's secure_filename() are recognized as sanitizers because basename()
keeps only the final path component and secure_filename() replaces path
separators and removes unsafe characters that could be used in traversal attempts.


## Vulnerable Code

```python
from flask import Flask, request

app = Flask(__name__)

@app.route('/read')
def read_file():
    # VULNERABLE: the user-controlled value reaches open() without any validation
    filename = request.args.get('file')
    content = open(filename, 'r').read()
    return content
```

## Secure Code

```python
from flask import Flask, request, abort
from werkzeug.utils import secure_filename
import os

app = Flask(__name__)
UPLOAD_DIR = os.path.realpath('/var/app/uploads')

@app.route('/download')
def download_file():
    filename = request.args.get('file', '')

    # SAFE step 1: secure_filename strips directory separators and traversal sequences
    safe_name = secure_filename(filename)
    if not safe_name:
        abort(400, 'Invalid filename')

    # SAFE step 2: join with the restricted directory
    filepath = os.path.join(UPLOAD_DIR, safe_name)

    # SAFE step 3: realpath check ensures symlinks cannot escape the directory
    if not os.path.realpath(filepath).startswith(UPLOAD_DIR + os.sep):
        abort(403, 'Access denied')

    with open(filepath, 'rb') as f:
        return f.read()

```

## Detection Rule (Python SDK)

```python
from rules.python_decorators import python_rule
from codepathfinder import calls, flows, QueryType
from codepathfinder.presets import PropagationPresets

class Builtins(QueryType):
    fqns = ["builtins"]

class IOModule(QueryType):
    fqns = ["io"]


@python_rule(
    id="PYTHON-FLASK-SEC-007",
    name="Flask Path Traversal via open()",
    severity="HIGH",
    category="flask",
    cwe="CWE-22",
    tags="python,flask,path-traversal,file-access,OWASP-A01,CWE-22",
    message="User input flows to open(). Use os.path.basename() or werkzeug.utils.secure_filename().",
    owasp="A01:2021",
)
def detect_flask_path_traversal():
    """Detects Flask request data flowing to file open()."""
    return flows(
        from_sources=[
            calls("request.args.get"),
            calls("request.form.get"),
            calls("request.values.get"),
            calls("request.get_json"),
        ],
        to_sinks=[
            Builtins.method("open").tracks(0),
            IOModule.method("open").tracks(0),
            calls("open"),
        ],
        sanitized_by=[
            calls("os.path.basename"),
            calls("secure_filename"),
            calls("werkzeug.utils.secure_filename"),
        ],
        propagates_through=PropagationPresets.standard(),
        scope="global",
    )
```

## How to Fix

- Apply werkzeug.utils.secure_filename() to any user-supplied filename before constructing a file path -- it replaces directory separators and strips leading dots, so the result is a single flat filename with no traversal sequences.
- After constructing the full path with os.path.join(), resolve symlinks with os.path.realpath() and verify the result starts with the intended base directory prefix.
- Maintain an allowlist of permitted filenames or file extensions rather than relying on path sanitization alone -- reject anything not in the allowlist.
- Separate file storage from the application root directory so a traversal beyond the upload directory cannot reach application source files or server configuration.
- Use send_from_directory() from Flask instead of open() for serving files from a directory -- it joins the path safely and responds with a 404 error if the result would fall outside that directory.
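
The basename, allowlist, and realpath steps above can be combined into one framework-independent helper. This is a minimal sketch rather than a definitive implementation: the function name `resolve_inside` and the `ALLOWED_EXTENSIONS` set are hypothetical and should be adapted to the application.

```python
import os

# Hypothetical allowlist of extensions; adjust for your application.
ALLOWED_EXTENSIONS = {".txt", ".pdf", ".png"}


def resolve_inside(base_dir: str, user_name: str) -> str:
    """Return the resolved path for user_name inside base_dir, or raise ValueError."""
    base = os.path.realpath(base_dir)
    # basename() keeps only the final path component of the user input.
    name = os.path.basename(user_name)
    if name in ("", ".", ".."):
        raise ValueError("invalid filename")
    if os.path.splitext(name)[1].lower() not in ALLOWED_EXTENSIONS:
        raise ValueError("extension not allowed")
    # realpath() resolves symlinks before the boundary comparison.
    candidate = os.path.realpath(os.path.join(base, name))
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError("path escapes base directory")
    return candidate
```

A caller would open the returned path and translate ValueError into a 400 or 403 response. os.path.commonpath() compares whole path components, which avoids the prefix pitfall where /var/app/uploads-old would pass a plain startswith('/var/app/uploads') check.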

## Security Implications

- **Arbitrary File Read:** An attacker who supplies ../../../etc/passwd as a filename reads the system
password file. On servers where application secrets are stored in files
(/etc/ssl/private/server.key, .env, config.ini, ~/.aws/credentials), the
attacker reads production secrets in a single request.

- **Source Code Disclosure:** Python applications store business logic, database schemas, and hardcoded
credentials in .py files. A path traversal from /uploads/ to ../app.py or
../config.py exposes source code that enables more targeted subsequent attacks.

- **Arbitrary File Write (Write Mode):** If the open() call uses write mode ('w', 'a', 'wb'), the attacker can write
arbitrary content to arbitrary paths. Writing to /etc/cron.d/ installs a
cron job. Writing to the application's own Python files modifies source code.
Writing to server configuration files redirects traffic.

- **Log File Poisoning:** Traversal to application log files combined with write mode allows log
injection -- inserting fake log entries that confuse audit trails, or
injecting content into log files that are subsequently parsed by log
processing tools.


## FAQ

**Q: Does os.path.join() protect against path traversal?**

No. os.path.join('/uploads/', '../../../etc/passwd') returns
'/uploads/../../../etc/passwd', which the operating system resolves to
/etc/passwd. join() does not remove traversal sequences, and it discards earlier
components entirely when a later component is an absolute path, so
os.path.join('/uploads/', '/etc/passwd') returns '/etc/passwd'.
Use secure_filename() on the user-supplied part before joining, then verify the
resolved absolute path starts with the intended base directory.


**Q: What is the difference between os.path.basename() and secure_filename()?**

os.path.basename() returns only the last component of a path, discarding all
directory components. secure_filename() from werkzeug instead flattens the whole
input into a single safe name: it replaces path separators and whitespace with
underscores, strips leading and trailing dots, and removes special and non-ASCII
characters that could cause unexpected behavior on different operating systems.
For web application file uploads, secure_filename() is the more robust choice.
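
A short comparison makes the difference concrete. The sketch assumes werkzeug is installed and uses posixpath and ntpath to show the POSIX and Windows behavior of basename() regardless of the host system; the payloads are illustrative.

```python
import ntpath
import posixpath

from werkzeug.utils import secure_filename

payload = "../../../etc/passwd"
print(posixpath.basename(payload))      # passwd
print(secure_filename(payload))         # etc_passwd

# On POSIX a backslash is an ordinary character, so basename() keeps it.
win_payload = "..\\..\\secret.txt"
print(posixpath.basename(win_payload))  # ..\..\secret.txt
print(ntpath.basename(win_payload))     # secret.txt

# basename() returns ".." unchanged; secure_filename() reduces it to "".
print(posixpath.basename(".."))         # ..
print(repr(secure_filename("..")))      # ''
```

Because basename() can still return '..', a check for '.', '..', and the empty string is worth adding when basename() is the only filter.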


**Q: Is open() in read mode safe if I use basename()?**

basename() alone is sufficient to prevent directory traversal in most cases.
However, the realpath() check is also important for symlink attacks: an attacker
who can write a symlink into the uploads directory (via another vulnerability)
can point it outside the directory. basename() does not protect against this.
Always combine basename() or secure_filename() with a realpath() boundary check.
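
The symlink case can be reproduced in a scratch directory. The following sketch assumes a POSIX system where the process may create symlinks; all file and directory names are made up for the demonstration.

```python
import os
import tempfile

root = tempfile.mkdtemp()
uploads = os.path.realpath(os.path.join(root, "uploads"))
os.mkdir(uploads)

# A file outside the upload directory, and a symlink to it planted inside.
secret = os.path.join(root, "secret.txt")
with open(secret, "w") as f:
    f.write("top secret")
os.symlink(secret, os.path.join(uploads, "innocent.txt"))

name = os.path.basename("innocent.txt")  # passes the basename() filter
filepath = os.path.join(uploads, name)
resolved = os.path.realpath(filepath)

# The unresolved path looks fine; the resolved path is outside uploads.
looks_inside = filepath.startswith(uploads + os.sep)
really_inside = resolved.startswith(uploads + os.sep)
print(looks_inside, really_inside)       # True False
```

Only the comparison on the resolved path detects the escape, which is why the boundary check has to run after realpath().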


**Q: Does this rule catch open() in write mode as well?**

Yes. The rule tracks taint to open() at argument position 0 regardless of the
mode argument. Both read and write opens with tainted paths are flagged. Write
mode path traversal is often more severe because it allows file creation and
content modification.


**Q: We store files in cloud object storage (S3, GCS). Does path traversal still apply?**

Cloud storage uses key names rather than filesystem paths. Key injection
(../config or arbitrary key names) is a related but separate issue. If you
pass user-supplied values as S3 key names, ensure the key is constrained to
a prefix using startswith() checks. This rule focuses on Python's open() and
io.open() for local filesystem operations.
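
One way to apply the prefix constraint is sketched below. The helper name `build_object_key` and the example prefix are hypothetical, and the sketch assumes keys use '/' as the delimiter. Object stores generally treat keys as opaque strings, so the normalization step mainly protects code that later interprets the key as a path.

```python
import posixpath


def build_object_key(prefix: str, user_key: str) -> str:
    """Return the normalized key under prefix, or raise ValueError if it escapes."""
    prefix = prefix.rstrip("/") + "/"
    # Collapse "." and ".." segments before comparing against the prefix.
    key = posixpath.normpath(posixpath.join(prefix, user_key))
    if not key.startswith(prefix):
        raise ValueError("key escapes the allowed prefix")
    return key


print(build_object_key("users/42", "report.pdf"))  # users/42/report.pdf
```

Inputs such as '../43/report.pdf', '/etc/passwd', or an empty string raise ValueError because the normalized key no longer starts with 'users/42/'.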


**Q: How do I serve user-uploaded files safely in Flask?**

Use Flask's send_from_directory(directory, path) function. It joins the two
arguments safely and responds with a 404 error if the requested path falls
outside the specified directory. Pass the upload directory as the first argument
and the sanitized filename (from secure_filename()) as the second. Do not call
open() manually for serving files.
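
A minimal sketch of such an endpoint follows, assuming Flask 2.0 or later. The route, the `UPLOAD_DIR` config key, and the directory path are illustrative choices, not part of Flask itself.

```python
from flask import Flask, abort, request, send_from_directory
from werkzeug.utils import secure_filename

app = Flask(__name__)
# Hypothetical storage location; adjust for your deployment.
app.config["UPLOAD_DIR"] = "/var/app/uploads"


@app.route("/download")
def download_file():
    safe_name = secure_filename(request.args.get("file", ""))
    if not safe_name:
        abort(400, "Invalid filename")
    # send_from_directory joins the path safely and responds with 404
    # when the file is missing or the path would leave the directory.
    return send_from_directory(
        app.config["UPLOAD_DIR"], safe_name, as_attachment=True
    )
```

With this handler, a request for ?file=../../etc/passwd is reduced to the name etc_passwd and answered with 404 unless a file of that name exists in the upload directory.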


**Q: What should I do if secure_filename() returns an empty string?**

secure_filename() returns an empty string when nothing safe remains after
sanitization, for example when the input is '..' or '///'. Note that
'../../../etc/passwd' does not become empty; it becomes 'etc_passwd'.
Always check for an empty return value and reject the request with a 400 error.
Never use an empty string as a file path.


## References

- [CWE-22: Path Traversal](https://cwe.mitre.org/data/definitions/22.html)
- [OWASP Path Traversal](https://owasp.org/www-community/attacks/Path_Traversal)
- [Werkzeug secure_filename documentation](https://werkzeug.palletsprojects.com/en/latest/utils/#werkzeug.utils.secure_filename)
- [Flask send_from_directory documentation](https://flask.palletsprojects.com/en/latest/api/#flask.send_from_directory)
- [OWASP File Upload Cheat Sheet](https://cheatsheetseries.owasp.org/cheatsheets/File_Upload_Cheat_Sheet.html)
- [Python open() documentation](https://docs.python.org/3/library/functions.html#open)

