# PYTHON-FLASK-SEC-011: Flask SSRF via Tainted URL Host

> **Severity:** HIGH | **CWE:** CWE-918 | **OWASP:** A10:2021

- **Language:** Python
- **Category:** Flask
- **URL:** https://codepathfinder.dev/registry/python/flask/PYTHON-FLASK-SEC-011
- **Detection:** `pathfinder scan --ruleset python/PYTHON-FLASK-SEC-011 --project .`

## Description

This rule detects Server-Side Request Forgery (SSRF) in Flask applications where
user-controlled input from HTTP request parameters is used as the host component
of a URL that is then passed to outbound HTTP request functions (requests.get,
requests.post, requests.put, requests.delete, and stdlib urllib equivalents).

This is distinct from PYTHON-FLASK-SEC-006, which covers cases where the entire
URL is user-supplied. This rule covers a subtler pattern: the application constructs
a URL with an attacker-controlled hostname, for example:

    host = request.args.get('service')
    url = f'https://{host}/api/data'
    response = requests.get(url)

The developer may believe this pattern is safe because the scheme and path are
hardcoded. However, the attacker controls where the request is sent, and a value
containing `/`, `?`, `#`, or `@` can also displace the hardcoded path or the
apparent host. By supplying 169.254.169.254 as the host, they direct the request
at the cloud metadata address. By supplying an internal service hostname, they
pivot to unexposed internal APIs.
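
To see why hardcoding the scheme and path does not pin the destination, the following sketch parses a few URLs built the same way as the example above. The `build_url` helper and the `evil.example` and `trusted.example` names are illustrative assumptions, not part of the rule or of any library.

```python
from urllib.parse import urlsplit


def build_url(host: str) -> str:
    """Mirror the vulnerable pattern: only the host is interpolated."""
    return f"https://{host}/api/data"


# Hypothetical attacker-supplied values for the host parameter.
SUPPLIED_VALUES = (
    "inventory.service.internal",    # what the developer expects
    "169.254.169.254",               # cloud metadata address
    "evil.example/steal?x=",         # '/' and '?' displace the hardcoded path
    "trusted.example@evil.example",  # '@' turns the first part into userinfo
)

for supplied in SUPPLIED_VALUES:
    parts = urlsplit(build_url(supplied))
    # hostname is where the connection goes; path is what gets requested.
    print(f"{supplied!r}: hostname={parts.hostname!r} path={parts.path!r}")
```

In the last two cases the effective hostname is `evil.example`, even though the supplied string does not look like a bare hostname swap.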

The taint analysis traces the host value from Flask request sources through string
interpolation and f-string construction to the URL argument of requests functions
at position 0. Flows through validate_host() or is_safe_url() are recognized as
sanitizers because these functions typically implement hostname allowlist checks.
The rule matches these sanitizers by function name, so the function must actually
enforce an allowlist for the suppression to be justified.
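
As a rough illustration of what a function named is_safe_url() might do to earn that trust, here is a minimal sketch. The allowlist contents, the HTTPS-only policy, and the port restriction are assumptions made for this example; the rule itself only looks at the function name.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist, in the style of the secure example in this document.
ALLOWED_HOSTS = frozenset({
    "api.payments.internal",
    "inventory.service.internal",
})


def is_safe_url(url: str) -> bool:
    """Return True only for https URLs whose parsed hostname is allowlisted."""
    try:
        parts = urlsplit(url)
        port = parts.port  # raises ValueError for a malformed port
    except ValueError:
        return False
    if parts.scheme != "https":
        return False
    # Reject userinfo so 'trusted@evil.example' tricks cannot confuse reviewers.
    if parts.username is not None or parts.password is not None:
        return False
    if port not in (None, 443):
        return False
    # urlsplit lowercases the hostname, so the comparison is case-insensitive.
    return parts.hostname in ALLOWED_HOSTS
```

Checking the parsed hostname, rather than searching the raw string for a trusted name, is the design choice that matters here: substring checks are satisfied by values such as `api.payments.internal@evil.example`.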


## Vulnerable Code

```python
from flask import Flask, request
import requests as http_requests

app = Flask(__name__)

@app.route('/api')
def api_call():
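    # SOURCE: attacker-controlled query parameter
    host = request.args.get('host')
    # VULNERABLE: tainted host is concatenated into the URL with no allowlist check
    url = "https://" + host + "/api/data"
    # SINK: the outbound request goes to whatever host the attacker chose
    resp = http_requests.get(url)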
    return resp.text
```

## Secure Code

```python
from flask import Flask, request, abort
import requests

app = Flask(__name__)

# Explicit allowlist of trusted service hostnames
ALLOWED_SERVICE_HOSTS = frozenset({
    'api.payments.internal',
    'inventory.service.internal',
    'auth.service.internal',
})

def validate_host(host):
    """Allowlist-based host validation. Raises ValueError for untrusted hosts."""
    if host not in ALLOWED_SERVICE_HOSTS:
        raise ValueError(f'Host {host!r} is not in the trusted service allowlist')
    return host

@app.route('/proxy')
def proxy_service():
    service_host = request.args.get('service', '')
    try:
        # SAFE: validate_host() checks against explicit allowlist before URL construction
        safe_host = validate_host(service_host)
    except ValueError as e:
        abort(400, str(e))
    url = f'https://{safe_host}/api/data'
    response = requests.get(url, timeout=10)
    return response.json()
```

## Detection Rule (Python SDK)

```python
from rules.python_decorators import python_rule
from codepathfinder import calls, flows, QueryType
from codepathfinder.presets import PropagationPresets

class RequestsLib(QueryType):
    fqns = ["requests"]


@python_rule(
    id="PYTHON-FLASK-SEC-011",
    name="Flask Tainted URL Host",
    severity="HIGH",
    category="flask",
    cwe="CWE-918",
    tags="python,flask,ssrf,url-host,OWASP-A10,CWE-918",
    message="User input used in URL host construction. Validate against an allowlist of hosts.",
    owasp="A10:2021",
)
def detect_flask_tainted_url_host():
    """Detects Flask request data used in URL host construction flowing to HTTP requests."""
    return flows(
        from_sources=[
            calls("request.args.get"),
            calls("request.form.get"),
            calls("request.values.get"),
            calls("request.get_json"),
        ],
        to_sinks=[
            RequestsLib.method("get", "post", "put", "delete").tracks(0),
            calls("http_requests.get"),
            calls("http_requests.post"),
            calls("urllib.request.urlopen"),
        ],
        sanitized_by=[
            calls("*.validate_host"),
            calls("*.is_safe_url"),
        ],
        propagates_through=PropagationPresets.standard(),
        scope="global",
    )
```

## How to Fix

- Maintain an explicit allowlist of service hostnames that the application is permitted to call and reject any host not on the list before constructing the URL.
- Consider removing the host parameter entirely if the set of callable services is small and known at deployment time -- hardcode the URLs in configuration rather than accepting them from HTTP requests.
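- When the allowlisted hosts are expected to be public, resolve the hostname after allowlist validation, verify that the address is not in a private, loopback, or link-local range, and connect to the address that was checked. A check followed by a second, independent DNS lookup remains exposed to DNS rebinding.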
- Use a service registry or service discovery mechanism (Consul, Kubernetes service names) that constrains reachable services at the infrastructure level rather than relying on application validation alone.
- Set a short timeout and a response size limit on all server-side HTTP requests to limit the impact when a request is steered to a slow or large-response internal service.
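
The last item above can be sketched as follows. The 1 MB cap, the timeout values, and the helper names `read_capped` and `fetch_capped` are assumptions for illustration; `stream=True`, `iter_content`, `timeout`, and `allow_redirects` are existing `requests` features.

```python
from typing import Iterable

MAX_BODY_BYTES = 1_000_000  # assumed cap; tune per upstream service


class ResponseTooLarge(Exception):
    """Raised when an upstream body exceeds the configured cap."""


def read_capped(chunks: Iterable[bytes], max_bytes: int = MAX_BODY_BYTES) -> bytes:
    """Accumulate streamed chunks, aborting once the cap is exceeded."""
    body = bytearray()
    for chunk in chunks:
        body.extend(chunk)
        if len(body) > max_bytes:
            raise ResponseTooLarge(f"upstream body exceeded {max_bytes} bytes")
    return bytes(body)


def fetch_capped(url: str) -> bytes:
    """Fetch an already-validated URL with timeouts and a body size cap."""
    import requests  # imported lazily so read_capped() works without requests

    # timeout=(connect, read) in seconds; redirects are not followed, so a
    # validated host cannot bounce the request to an unvalidated one.
    with requests.get(url, timeout=(3, 10), stream=True,
                      allow_redirects=False) as resp:
        resp.raise_for_status()
        return read_capped(resp.iter_content(chunk_size=8192))
```

Note that the read timeout in `requests` applies to each wait for data, not to the total download time, so the size cap is what bounds a slow but steady response.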

## Security Implications

- **Cloud Metadata Service Credential Theft:** An attacker supplies 169.254.169.254 (AWS/GCP/Azure metadata endpoint) as the
host parameter. The Flask server makes a request to its own metadata service,
receives IAM credentials (AWS: /latest/meta-data/iam/security-credentials/),
and the application may return this response to the attacker. This is one of
the highest-impact SSRF attacks in cloud-hosted applications.

- **Internal Service Discovery:** The attacker iterates internal hostnames and IP addresses. Services that
are not exposed to the internet -- databases, admin panels, internal APIs,
container orchestration endpoints -- respond to requests from the Flask
container because they trust traffic from within the internal network.
The attacker maps the internal topology through timing and response differences.

- **Authentication Bypass on Internal Services:** Internal services often trust requests from the application server by source
IP. An attacker who controls the host parameter can make the Flask server
request any internal endpoint with the server's trusted identity, bypassing
network-level access controls that would otherwise block external access.

- **Server-Side Port Scanning via Error Differentiation:** Different error responses for connection refused, connection timeout, and
successful connection reveal which ports are open on internal hosts. An
attacker who can control the host (and optionally port) component of the
URL can build a comprehensive port map of the internal network through the
Flask endpoint.
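
One way to reduce the error-differentiation signal described in the last item is to collapse every upstream failure into a single generic response. The sketch below assumes a hypothetical `call_upstream` wrapper and a 502 status; it relies on the fact that `requests.exceptions.RequestException` and the built-in connection errors all derive from `OSError`.

```python
import logging
from typing import Callable, Tuple

log = logging.getLogger("proxy")

# Single response returned for every kind of upstream failure (assumed values).
GENERIC_FAILURE: Tuple[int, str] = (502, "Upstream request failed")


def call_upstream(fetch: Callable[[str], str], url: str) -> Tuple[int, str]:
    """Run fetch(url) and hide the failure reason from the caller."""
    try:
        return 200, fetch(url)
    except OSError as exc:
        # Details go to the server log only, never into the HTTP response.
        log.warning("upstream call to %s failed: %r", url, exc)
        return GENERIC_FAILURE
```

This removes the difference in response bodies and status codes, but not the timing difference between a refused connection and a timeout, so it complements host validation rather than replacing it.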


## FAQ

**Q: How is this different from PYTHON-FLASK-SEC-006?**

SEC-006 catches cases where the entire URL is user-supplied (e.g., requests.get(user_url)).
SEC-011 catches cases where only the host component is user-supplied and the
application constructs the URL by interpolating the host into a template string
(e.g., requests.get(f'https://{user_host}/api')). The latter is often mistakenly
considered safe because the scheme and path are hardcoded. Both rules should run
together for complete SSRF coverage.


**Q: The host comes from a dropdown on the frontend. Is that safe?**

Frontend validation can be bypassed by sending raw HTTP requests. An attacker can send
any string as the host parameter regardless of what the dropdown contains. Server-side
allowlist validation is mandatory. The dropdown limits what legitimate users submit;
it does not limit what attackers submit.


**Q: We use IMDSv2 on AWS. Does that prevent SSRF to the metadata endpoint?**

IMDSv2 requires a PUT request to obtain a session token, which must then be sent
in a header on every metadata GET request. A naive
requests.get('http://169.254.169.254/...') therefore fails under IMDSv2. However, an
SSRF primitive that lets the attacker control the HTTP method and request headers
can still complete the two-step flow. IMDSv2 raises the bar for exploitation but
does not eliminate the need for host allowlist validation.
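
For reference, the two requests that make up the IMDSv2 flow look roughly like this. The sketch only constructs the request objects and sends nothing; the helper names are assumptions, while the token path and header names follow the AWS documentation linked in the references.

```python
from urllib.request import Request

IMDS = "http://169.254.169.254"


def build_token_request(ttl_seconds: int = 21600) -> Request:
    """Step 1: a PUT that asks the metadata service for a session token."""
    return Request(
        f"{IMDS}/latest/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def build_metadata_request(token: str, path: str = "/latest/meta-data/") -> Request:
    """Step 2: a GET that must carry the token from step 1 in a header."""
    return Request(
        f"{IMDS}{path}",
        headers={"X-aws-ec2-metadata-token": token},
    )
```

A sink such as `requests.get(url)` with a fixed method and no attacker-controlled headers can produce neither request, which is why IMDSv2 blocks the simplest SSRF payloads.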


**Q: Does this rule fire when the host is extracted from a config file rather than the request?**

No. The rule only tracks taint from Flask request sources (request.args.get,
request.form.get, etc.). A host value read from a config file, environment variable,
or database that is not tainted from a request source does not trigger the rule.


**Q: Can I use a regex to validate the hostname instead of an allowlist?**

Regex-based hostname validation is fragile and error-prone. A regex that matches
valid hostnames may also match 169.254.169.254 or internal service names depending
on how it is written. An explicit allowlist (frozenset of trusted hostnames) is
simpler, easier to audit, and has far fewer edge cases. Use the allowlist approach.
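
A small comparison makes the difference concrete. The pattern below is a hypothetical "looks like a hostname" regex of the kind often written for this purpose, and the allowlist entries are assumed; neither comes from the rule.

```python
import re

# Hypothetical pattern: dot-separated labels of letters, digits, and hyphens.
LABEL = r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?"
HOSTNAME_RE = re.compile(rf"{LABEL}(?:\.{LABEL})*")

# Assumed allowlist of trusted service hosts.
ALLOWED_HOSTS = frozenset({"api.payments.internal", "auth.service.internal"})


def regex_accepts(host: str) -> bool:
    """Syntactic check: is this string shaped like a hostname?"""
    return HOSTNAME_RE.fullmatch(host) is not None


def allowlist_accepts(host: str) -> bool:
    """Policy check: is this one of the hosts we intend to call?"""
    return host in ALLOWED_HOSTS


for candidate in ("api.payments.internal", "169.254.169.254", "localhost"):
    print(candidate, regex_accepts(candidate), allowlist_accepts(candidate))
```

The regex accepts all three candidates because an IPv4 literal and `localhost` are syntactically valid hostnames; only the allowlist expresses which destinations are actually intended.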


**Q: How should I handle the case where the set of valid service hosts changes at runtime?**

Store the allowlist in a configuration file or environment variable that is loaded
at startup. If it needs to change at runtime (e.g., in a microservice environment
with dynamic service discovery), use a trusted service registry like Consul or
Kubernetes service names and validate against the registry's response -- never trust
the client to tell you what service to call.
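
For the startup-loaded case, a minimal sketch might read a comma-separated environment variable. The variable name `ALLOWED_SERVICE_HOSTS` and the fail-closed behavior on an empty list are assumptions for this example.

```python
import os


def load_allowed_hosts(env_var: str = "ALLOWED_SERVICE_HOSTS") -> frozenset:
    """Parse a comma-separated host list from the environment at startup."""
    raw = os.environ.get(env_var, "")
    # Normalize whitespace and case so lookups are predictable.
    hosts = frozenset(h.strip().lower() for h in raw.split(",") if h.strip())
    if not hosts:
        # Fail closed: refuse to start rather than run with no allowlist.
        raise RuntimeError(f"{env_var} is empty or unset")
    return hosts
```

Because the result is an immutable frozenset built once at startup, request handlers cannot widen it, and a misconfigured deployment fails at boot instead of silently allowing every host.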


**Q: Does DNS-based validation (resolve and check IP range) replace the hostname allowlist?**

DNS-based IP range validation (reject private IPs after resolution) is a useful
defense-in-depth layer but does not replace the allowlist. An attacker who
controls a domain can have it resolve to a public address during validation and
to a private address when the request is actually made (DNS rebinding). Both
controls together provide stronger protection.
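
The resolve-and-check layer could look like the sketch below, which uses the standard `ipaddress` and `socket` modules. The helper names are assumptions, and the check is only meaningful for services expected to live on public addresses.

```python
import ipaddress
import socket
from typing import List


def is_public_ip(address: str) -> bool:
    """Return True if the address is outside private and special-use ranges."""
    ip = ipaddress.ip_address(address)
    return not (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local
        or ip.is_multicast
        or ip.is_reserved
        or ip.is_unspecified
    )


def resolve_public_addresses(host: str, port: int = 443) -> List[str]:
    """Resolve host and reject it if any returned address is non-public."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    addresses = sorted({info[4][0] for info in infos})
    if not addresses or not all(is_public_ip(a) for a in addresses):
        raise ValueError(f"{host!r} resolves to a non-public address")
    return addresses
```

If the HTTP client performs its own lookup after this check, the answer can change between the two resolutions, so the check narrows the rebinding window rather than closing it unless the connection is made to one of the returned addresses.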


## References

- [CWE-918: Server-Side Request Forgery](https://cwe.mitre.org/data/definitions/918.html)
- [OWASP SSRF Prevention Cheat Sheet](https://cheatsheetseries.owasp.org/cheatsheets/Server_Side_Request_Forgery_Prevention_Cheat_Sheet.html)
- [AWS IMDSv2 and SSRF protection](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/configuring-instance-metadata-service.html)
- [GCP metadata server and SSRF](https://cloud.google.com/compute/docs/metadata/overview)
- [requests library documentation](https://requests.readthedocs.io/en/latest/)
- [PortSwigger SSRF lab](https://portswigger.net/web-security/ssrf)

