# PYTHON-LAMBDA-SEC-020: Lambda XSS via Tainted HTML Response Body

> **Severity:** HIGH | **CWE:** CWE-79 | **OWASP:** A03:2021

- **Language:** Python
- **Category:** AWS Lambda
- **URL:** https://codepathfinder.dev/registry/python/aws_lambda/PYTHON-LAMBDA-SEC-020
- **Detection:** `pathfinder scan --ruleset python/PYTHON-LAMBDA-SEC-020 --project .`

## Description

This rule detects Cross-Site Scripting (XSS) vulnerabilities in AWS Lambda functions
where untrusted event data is embedded directly in the HTML response body returned to
API Gateway without HTML-escaping.

Lambda functions acting as API Gateway backends frequently generate HTML responses
dynamically. When the response is served with Content-Type: text/html and its body
(the 'body' key in the Lambda return value) contains unescaped event data, the browser
renders any injected HTML or JavaScript. Event sources include event.get("queryStringParameters"),
event.get("body"), event["pathParameters"], and event["headers"], all of which are
attacker-controllable through API Gateway requests.

Unlike web frameworks that escape template variables by default, Lambda handlers
that construct HTML strings manually have no automatic escaping layer. The developer
must explicitly call html.escape() on every piece of event data embedded in an HTML
response. Failure to do so enables reflected XSS, where an attacker crafts a URL
or form that causes the Lambda to reflect malicious script back to the victim's
browser.
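
As a minimal illustration of what html.escape() does, the sketch below escapes a sample payload using only the standard library. The payload string is an illustrative assumption. Note that html.escape() targets HTML element content and quoted attribute values; data placed inside script blocks or URLs needs context-specific encoding instead.

```python
import html

# Sample attacker-style input (illustrative only).
payload = '<img src=x onerror="alert(1)">'

# html.escape() replaces &, <, > and (with the default quote=True) both quote characters.
escaped = html.escape(payload)
print(escaped)  # &lt;img src=x onerror=&quot;alert(1)&quot;&gt;
```

After escaping, the browser displays the payload as literal text rather than parsing it as markup.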


## Vulnerable Code

```python
import json

# SEC-020: tainted HTML response
def handler_html_response(event, context):
    # VULNERABLE: event data is attacker-controlled and is never HTML-escaped
    name = event.get('name')
    body = f"<html><body>Hello {name}</body></html>"
    return {
        "statusCode": 200,
        "body": json.dumps({"html": body}),
        "headers": {"Content-Type": "text/html"}
    }
```
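
To see why this pattern is exploitable, the following sketch invokes a hypothetical handler (`vulnerable_handler`, written for illustration and reading from `queryStringParameters`) with a simulated API Gateway event. It assumes the same json.dumps() plus text/html combination as the example above; JSON encoding is not HTML escaping, so angle brackets pass through unchanged.

```python
import json

def vulnerable_handler(event, context):
    # Hypothetical handler mirroring the vulnerable pattern: no html.escape().
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "World")
    body = f"<html><body>Hello {name}</body></html>"
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "text/html"},
        # json.dumps() JSON-encodes the string but leaves < and > untouched.
        "body": json.dumps({"html": body}),
    }

# Simulated API Gateway event carrying an attacker-supplied query parameter.
malicious_event = {"queryStringParameters": {"name": "<script>alert(1)</script>"}}
response = vulnerable_handler(malicious_event, None)

# The payload survives verbatim, so a browser parsing the body as HTML would run it.
payload_reflected = "<script>alert(1)</script>" in response["body"]
print(payload_reflected)
```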

## Secure Code

```python
import html

def lambda_handler(event, context):
    params = event.get('queryStringParameters', {}) or {}
    name = params.get('name', 'World')
    message = params.get('message', '')

    # SECURE: Escape all event data with html.escape() before embedding in HTML
    safe_name = html.escape(name)
    safe_message = html.escape(message)

    body = f'''<!DOCTYPE html>
<html>
<head><title>Greeting</title></head>
<body>
  <h1>Hello, {safe_name}!</h1>
  <p>{safe_message}</p>
</body>
</html>'''

    return {
        'statusCode': 200,
        'headers': {
            'Content-Type': 'text/html; charset=utf-8',
            'X-Content-Type-Options': 'nosniff',
            'Content-Security-Policy': "default-src 'self'"
        },
        'body': body
    }

```

## Detection Rule (Python SDK)

```python
from rules.python_decorators import python_rule
from codepathfinder import calls, flows, QueryType
from codepathfinder.presets import PropagationPresets

_LAMBDA_SOURCES = [
    calls("event.get"),
    calls("event.items"),
    calls("event.values"),
    calls("*.get"),
]


@python_rule(
    id="PYTHON-LAMBDA-SEC-020",
    name="Lambda Tainted HTML Response",
    severity="MEDIUM",
    category="aws_lambda",
    cwe="CWE-79",
    tags="python,aws,lambda,xss,html,OWASP-A03,CWE-79",
    message="Lambda event data in HTML response body. Sanitize output with html.escape().",
    owasp="A03:2021",
)
def detect_lambda_html_response():
    """Detects Lambda event data in HTML response returned to API Gateway."""
    return flows(
        from_sources=_LAMBDA_SOURCES,
        to_sinks=[
            calls("json.dumps"),
        ],
        sanitized_by=[
            calls("html.escape"),
            calls("escape"),
            calls("markupsafe.escape"),
        ],
        propagates_through=PropagationPresets.standard(),
        scope="global",
    )
```

## How to Fix

- Call html.escape() on every piece of Lambda event data before embedding it in an HTML response body.
- Add a Content-Security-Policy header to API Gateway responses to limit the impact of any XSS that bypasses output escaping.
- Set X-Content-Type-Options: nosniff to prevent MIME type sniffing that could enable XSS via non-HTML responses.
- Return JSON responses instead of HTML wherever possible; if the client needs HTML, render it client-side with a trusted JavaScript framework that escapes content automatically.
- Validate and restrict the format of event fields that appear in HTML responses to further reduce the injection surface.
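
The steps above can be combined in a small handler. This is a sketch under stated assumptions, not a definitive implementation: the allow-list pattern `_NAME_PATTERN`, the `SECURITY_HEADERS` dict, and the `html_response` helper are hypothetical names introduced here, and the accepted character set should be adapted to the application's real requirements.

```python
import html
import re

# Hypothetical allow-list: letters, digits, spaces, hyphens, underscores; 1 to 64 characters.
_NAME_PATTERN = re.compile(r"[A-Za-z0-9 _-]{1,64}")

# Headers applied to every HTML response (values are illustrative).
SECURITY_HEADERS = {
    "Content-Type": "text/html; charset=utf-8",
    "X-Content-Type-Options": "nosniff",
    "Content-Security-Policy": "default-src 'self'",
}

def html_response(status_code, body):
    """Build an API Gateway style response dict with the shared security headers."""
    return {"statusCode": status_code, "headers": dict(SECURITY_HEADERS), "body": body}

def lambda_handler(event, context):
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "World")

    # Validation: reject anything outside the expected format.
    if not _NAME_PATTERN.fullmatch(name):
        return html_response(400, "<!DOCTYPE html><html><body><p>Invalid name.</p></body></html>")

    # Escaping is still applied after validation, as a second layer.
    safe_name = html.escape(name)
    return html_response(
        200, f"<!DOCTYPE html><html><body><h1>Hello, {safe_name}!</h1></body></html>"
    )
```

Validation narrows what reaches the template, while html.escape() remains the control that makes the output safe if the allow-list is later loosened.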

## Security Implications

- **Reflected XSS via API Gateway:** An attacker who controls any query parameter, path parameter, or request body
field can inject <script> tags or event handler attributes (onerror, onload)
into the HTML response. The victim's browser executes the injected JavaScript
when they visit the attacker-crafted URL, enabling session hijacking, credential
theft, and malicious redirects.

- **Session Token Theft:** JavaScript injected via Lambda XSS can read localStorage, sessionStorage, and
any cookie not marked HttpOnly (through document.cookie) to steal session tokens
and authentication credentials. These tokens can be exfiltrated to
attacker-controlled infrastructure in a single request, giving the attacker
access to the victim's account for as long as the tokens remain valid.

- **Phishing and Content Injection:** XSS allows attackers to modify the DOM to display fake login forms, error
messages, or instructions that trick users into submitting credentials or
installing malware. The attack occurs on the legitimate domain served by the
Lambda-backed API, making it harder for users to recognize.

- **SameSite Cookie Bypass:** SameSite=Lax cookie protections against CSRF do not help once XSS is present.
Injected script executes in the application's own origin, so the requests it
makes are same-site, carry the victim's cookies, and do not trigger CSRF protections.


## FAQ

**Q: Why doesn't Lambda automatically escape HTML in response bodies?**

Lambda returns response bodies as strings without any processing. Unlike template
engines that can escape variables automatically (Django templates by default, Jinja2
when autoescaping is enabled), Lambda handlers that construct HTML manually have no
automatic escaping layer. The developer must explicitly call html.escape() on every
piece of event data. Lambda's design as a generic compute service means it does not
impose any output encoding conventions.


**Q: Is XSS in a Lambda-backed API as dangerous as XSS in a traditional web application?**

Yes. The flaw is in the handler code, but the injected script executes in the browser,
not on the server. When the Lambda returns an HTML response that the browser renders,
injected scripts execute in the context of the application's origin. Session cookies,
localStorage data, and API tokens accessible to that origin are all at risk. The
Lambda execution model on the server side is irrelevant to the browser-side XSS impact.


**Q: Can I use Content-Security-Policy headers to prevent XSS instead of html.escape()?**

CSP is a valuable defense-in-depth measure, but it is not a substitute for output
encoding. CSP can be bypassed in various scenarios (JSONP endpoints, allowed CDN
domains, policy misconfigurations). The correct primary defense is html.escape()
applied to all event data before embedding in HTML. Add CSP headers as an
additional layer of protection.


**Q: What if my Lambda uses a Jinja2 template to generate HTML?**

Jinja2's autoescaping feature, when enabled (Environment(autoescape=True) or
using the select_autoescape helper), HTML-escapes variables automatically.
If autoescaping is enabled and you are not using the |safe filter on untrusted
data, Jinja2 prevents XSS. If autoescaping is disabled or you use |safe on
event data, the vulnerability remains.


**Q: My Lambda returns JSON. Is it still vulnerable to XSS?**

If the response uses Content-Type: application/json, browsers typically do not
render it as HTML and XSS is unlikely. However, if the JSON response is later
rendered by a client-side JavaScript framework that uses innerHTML or similar
DOM manipulation without escaping, XSS can still occur on the client side.
Set Content-Type correctly: never serve a JSON body with Content-Type: text/html,
because the browser will then parse any markup inside it as HTML.
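
A minimal sketch of a JSON-returning handler is shown below. The `greeting` field name is an assumption made for illustration. The body is served as application/json, so the browser does not parse it as HTML; the consuming client should still insert the value with a text-only API such as textContent rather than innerHTML.

```python
import json

def lambda_handler(event, context):
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "World")
    return {
        "statusCode": 200,
        "headers": {
            # Declares the body as JSON so browsers do not render it as HTML.
            "Content-Type": "application/json",
            "X-Content-Type-Options": "nosniff",
        },
        # The raw value is returned as data; escaping happens where it is rendered.
        "body": json.dumps({"greeting": f"Hello, {name}!"}),
    }
```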


## References

- [CWE-79: Cross-Site Scripting](https://cwe.mitre.org/data/definitions/79.html)
- [OWASP XSS Prevention Cheat Sheet](https://cheatsheetseries.owasp.org/cheatsheets/Cross_Site_Scripting_Prevention_Cheat_Sheet.html)
- [OWASP Cross Site Scripting](https://owasp.org/www-community/attacks/xss/)
- [AWS Lambda Security Best Practices](https://docs.aws.amazon.com/lambda/latest/dg/best-practices.html)
- [Python html.escape() documentation](https://docs.python.org/3/library/html.html#html.escape)
- [Content Security Policy](https://developer.mozilla.org/en-US/docs/Web/HTTP/CSP)

