# PYTHON-LAMBDA-SEC-003: Lambda Command Injection via os.spawn*()

> **Severity:** CRITICAL | **CWE:** CWE-78 | **OWASP:** A03:2021

- **Language:** Python
- **Category:** AWS Lambda
- **URL:** https://codepathfinder.dev/registry/python/aws_lambda/PYTHON-LAMBDA-SEC-003
- **Detection:** `pathfinder scan --ruleset python/PYTHON-LAMBDA-SEC-003 --project .`

## Description

This rule detects OS command injection vulnerabilities in AWS Lambda functions where
untrusted event data flows into os.spawn*() family functions: os.spawnl(),
os.spawnle(), os.spawnlp(), os.spawnlpe(), os.spawnv(), os.spawnve(), os.spawnvp(),
and os.spawnvpe().

Lambda functions receive input from the event dictionary populated by API Gateway,
SQS, SNS, S3, DynamoDB Streams, and other AWS triggers. Fields such as
event.get("body"), event.get("queryStringParameters"), and event["Records"] are
fully attacker-controllable in public-facing deployments. There is no sanitization
layer between the raw event payload and application code.

The os.spawn*() family is a lower-level process spawning API that predates the
subprocess module. Variants with a 'p' in the name (spawnlp, spawnlpe, spawnvp,
spawnvpe) search PATH for the executable, making them susceptible to PATH
manipulation attacks when combined with user-controlled executable names. The 'l'
variants take arguments as individual strings and the 'v' variants take them as a
single list or tuple; in either form, when any argument position contains
attacker-controlled event data, the attacker controls a process argument. In
Lambda, successful exploitation yields access to the execution role's AWS
credentials stored in environment variables, enabling further attacks against the
broader AWS environment.
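
As a rough illustration of the naming scheme described above, the sketch below runs the same harmless child process through an 'l' variant, a 'v' variant, and a 'p' variant. It uses sys.executable as a stand-in for a fixed, trusted binary (an assumption made only so the example runs wherever Python does), and the helper names are hypothetical.

```python
import os
import sys

# Harmless child: the current Python interpreter exiting with status 3.
# sys.executable is only a stand-in for a fixed, trusted binary.
CHILD = [sys.executable, "-c", "import sys; sys.exit(3)"]


def run_with_spawnl():
    # 'l' form: every argv entry is its own positional argument.
    return os.spawnl(os.P_WAIT, CHILD[0], *CHILD)


def run_with_spawnv():
    # 'v' form: argv is passed as one list.
    return os.spawnv(os.P_WAIT, CHILD[0], CHILD)


def run_with_spawnvp():
    # 'p' form: a bare name would be searched for on PATH. A name that
    # contains a directory component, as here, is used as given.
    return os.spawnvp(os.P_WAIT, CHILD[0], CHILD)
```

With os.P_WAIT each call returns the child's exit status; with os.P_NOWAIT, as in the vulnerable handler, the call returns the child's process ID instead.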


## Vulnerable Code

```python
import os

# VULNERABLE: event data chooses the executable and its argv[0]
def handler_spawn(event, context):
    prog = event.get('program')
    os.spawnl(os.P_NOWAIT, prog, prog)
    return {"statusCode": 200}
```

## Secure Code

```python
import subprocess
import re
import json

ALLOWED_BINARIES = {
    'convert': '/usr/bin/convert',
    'ffmpeg': '/usr/bin/ffmpeg',
}

def lambda_handler(event, context):
    # API Gateway delivers "body": null when the request has no body
    body = json.loads(event.get('body') or '{}')
    binary = body.get('tool', '')
    input_file = body.get('input', '')

    # SECURE: Validate binary against a hardcoded allowlist with full paths
    if binary not in ALLOWED_BINARIES:
        return {'statusCode': 400, 'body': 'Unknown tool'}

    # SECURE: Validate filename with strict regex (fullmatch rejects a trailing newline)
    if not re.fullmatch(r'[a-zA-Z0-9_.\-]+', input_file):
        return {'statusCode': 400, 'body': 'Invalid filename'}

    # SECURE: Use subprocess.run with list — no os.spawn*, no shell
    result = subprocess.run(
        [ALLOWED_BINARIES[binary], '--', f'/tmp/{input_file}'],
        capture_output=True,
        text=True,
        timeout=30
    )
    return {'statusCode': 200, 'body': result.stdout}

```
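
A subtlety worth knowing when validating filenames with regular expressions: in Python, `$` also matches just before a trailing newline, so `re.match` with a `^...$` pattern accepts `name\n`, whereas `re.fullmatch` does not. The character class used for filenames also admits `.` and `..`. The sketch below shows both effects; `is_safe_filename` is a hypothetical helper, not part of the rule.

```python
import re

ANCHORED = r'^[a-zA-Z0-9_.\-]+$'
BARE = r'[a-zA-Z0-9_.\-]+'


def accepts_with_match(name):
    # '$' matches at the end of the string or just before a final newline.
    return re.match(ANCHORED, name) is not None


def accepts_with_fullmatch(name):
    # fullmatch requires the whole string to match the pattern.
    return re.fullmatch(BARE, name) is not None


def is_safe_filename(name):
    # Hypothetical stricter check: whole-string match, and no leading dot,
    # which rules out '.', '..', and hidden files.
    return accepts_with_fullmatch(name) and not name.startswith('.')
```

Whether hidden files should be rejected is a policy choice; the leading-dot rule is shown here only because it is a simple way to exclude `..`.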

## Detection Rule (Python SDK)

```python
from rules.python_decorators import python_rule
from codepathfinder import calls, flows, QueryType
from codepathfinder.presets import PropagationPresets

class OSModule(QueryType):
    fqns = ["os"]

# Lambda event sources — event dict is the primary untrusted input
_LAMBDA_SOURCES = [
    calls("event.get"),
    calls("event.items"),
    calls("event.values"),
    calls("event.keys"),
    calls("*.get"),
]


@python_rule(
    id="PYTHON-LAMBDA-SEC-003",
    name="Lambda Command Injection via os.spawn*()",
    severity="CRITICAL",
    category="aws_lambda",
    cwe="CWE-78",
    tags="python,aws,lambda,command-injection,os-spawn,OWASP-A03,CWE-78",
    message="Lambda event data flows to os.spawn*(). Use subprocess with list args instead.",
    owasp="A03:2021",
)
def detect_lambda_os_spawn():
    """Detects Lambda event data flowing to os.spawn*() functions."""
    return flows(
        from_sources=_LAMBDA_SOURCES,
        to_sinks=[
            OSModule.method("spawnl", "spawnle", "spawnlp", "spawnlpe",
                            "spawnv", "spawnve", "spawnvp", "spawnvpe"),
        ],
        sanitized_by=[
            calls("shlex.quote"),
        ],
        propagates_through=PropagationPresets.standard(),
        scope="global",
    )
```

## How to Fix

- Replace all os.spawn*() calls with subprocess.run(), passing the command as a list of arguments and leaving shell=False (the default).
- When using spawn variants that search PATH ('p' variants), always pass the full absolute path to the executable instead of relying on PATH lookup.
- Validate all event fields with strict allowlists or regular expressions before they appear in any argument position of a spawn or subprocess call.
- Apply least-privilege IAM policies to the Lambda execution role to minimize the scope of credential exfiltration if injection occurs.
- Monitor Lambda CloudWatch logs for unexpected process output or error patterns that may indicate active exploitation.
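
The first two fixes can be sketched as follows: a blocking os.spawnv(os.P_WAIT, ...) call becomes subprocess.run(), and a fire-and-forget os.spawnv(os.P_NOWAIT, ...) call becomes subprocess.Popen(), both with an argument list and an absolute path. TOOL and the function names are placeholders; the interpreter path is used only so the sketch is runnable.

```python
import subprocess
import sys

# Placeholder for a fixed, absolute path to a trusted binary. The current
# interpreter is an assumption made only to keep the sketch runnable.
TOOL = sys.executable


def run_and_wait(args):
    """Rough replacement for os.spawnv(os.P_WAIT, TOOL, [TOOL, *args])."""
    completed = subprocess.run(
        [TOOL, *args],
        shell=False,          # the default, spelled out for reviewers
        capture_output=True,
        text=True,
        timeout=30,
        check=False,
    )
    return completed.returncode, completed.stdout


def start_in_background(args):
    """Rough replacement for os.spawnv(os.P_NOWAIT, TOOL, [TOOL, *args])."""
    return subprocess.Popen([TOOL, *args], shell=False)
```

The args list still needs the validation described in the third fix; switching APIs does not by itself make event data safe to pass through.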

## Security Implications

- **Direct Process Spawning with Attacker-Controlled Arguments:** os.spawn*() spawns a new process directly. When event data controls argument
positions, an attacker can influence the behavior of the spawned process.
Argument injection remains possible if the target binary interprets
attacker-controlled flags as commands (e.g., passing '-e cmd' to certain binaries).

- **PATH Manipulation via spawnlp/spawnvp Variants:** The 'p' variants (spawnlp, spawnvp, spawnlpe, spawnvpe) search the PATH
environment variable for the executable. If an attacker can control the executable
name argument and the PATH contains writable directories, the spawn can be
redirected to an attacker-controlled binary in /tmp.

- **AWS Credential Exfiltration:** Spawned processes in the Lambda environment inherit the parent's environment
variables, including AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and
AWS_SESSION_TOKEN. An attacker who controls process arguments can spawn a
process that reads and exfiltrates these credentials via the network.

- **Execution Environment Reconnaissance:** Injected arguments can cause spawned processes to list filesystem contents, read
configuration files, or probe the network. Lambda execution environments contain
deployed code, environment variables with secrets, and /tmp data that can be
enumerated and exfiltrated through injected process arguments.
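
One way to limit the credential inheritance described above, separate from what this rule detects, is to give the child process an explicit, minimal environment instead of the handler's own. The sketch below assumes the spawned tool does not need AWS access; the allowlist of variable names and both function names are illustrative.

```python
import os
import subprocess
import sys

# Illustrative allowlist of variables the child genuinely needs.
# AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_SESSION_TOKEN are
# deliberately absent.
SAFE_ENV_KEYS = ("PATH", "LD_LIBRARY_PATH", "LANG", "LC_ALL", "TMPDIR")


def minimal_env(source=None):
    """Copy only allowlisted variables from the parent environment."""
    source = os.environ if source is None else source
    return {key: source[key] for key in SAFE_ENV_KEYS if key in source}


def run_without_credentials(argv, parent_env=None):
    """Run argv (a list) with the reduced environment."""
    return subprocess.run(
        argv,
        env=minimal_env(parent_env),
        capture_output=True,
        text=True,
        timeout=30,
        check=False,
    )
```

This reduces what a hijacked child can read from its own environment; it does not replace input validation or a least-privilege execution role.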


## FAQ

**Q: How does os.spawn*() differ from os.system() in terms of injection risk?**

os.system() always invokes the shell, so any shell metacharacter in user input
can chain additional commands. os.spawn*() executes the program directly without
a shell, so metacharacters in arguments reach the child as literal text and cannot
chain commands. However, spawn is still dangerous when event data controls the
executable or an argument position: the target binary may interpret injected flags
as commands, and PATH-searching variants (spawnlp, spawnlpe, spawnvp, spawnvpe) can
be directed to attacker-controlled executables.
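
Both halves of this answer can be checked with a small experiment. In the sketch below, the first function passes a string containing shell metacharacters and the child confirms it arrived as one literal argument; the second shows that a value beginning with a dash is still interpreted as an option by the target program. The Python interpreter stands in for the target binary, and all names are hypothetical.

```python
import os
import sys

PAYLOAD = "report.txt; echo pwned"

# Child program: exits 0 only if it received exactly one extra argument
# identical to PAYLOAD, i.e. nothing split or interpreted the ';'.
CHILD_CODE = "import sys; sys.exit(0 if sys.argv[1:] == [%r] else 1)" % PAYLOAD


def spawn_with_metacharacters():
    argv = [sys.executable, "-c", CHILD_CODE, PAYLOAD]
    return os.spawnv(os.P_WAIT, sys.executable, argv)


def spawn_interpreter_with(value):
    # Intended use: run the script named by `value`. If `value` starts
    # with '-', the interpreter treats it as an option instead.
    return os.spawnv(os.P_WAIT, sys.executable, [sys.executable, value])
```

Calling spawn_interpreter_with("--version") makes the interpreter print its version and exit successfully rather than run a script, which is the same mechanism an attacker relies on when injecting flags into a real tool.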


**Q: When should I use os.spawn*() versus subprocess?**

Python's official documentation recommends using the subprocess module in
preference to os.spawn*(). subprocess.run() with a list provides the same
process-spawning capability with a cleaner interface, better error handling,
and well-documented security considerations. For new Lambda code there is
generally no reason to choose os.spawn*() over subprocess.run().


**Q: Are the 'p' variants of os.spawn more dangerous than non-'p' variants?**

The 'p' variants (spawnlp, spawnvp, spawnlpe, spawnvpe) search the PATH
environment variable for the executable. This creates an additional attack vector:
if the executable name is attacker-controlled or if /tmp appears in PATH, the
spawn can execute an unintended binary. Non-'p' variants never search PATH; they
run exactly the path they are given, which removes this specific risk as long as
that path is hardcoded. Both kinds are dangerous when argument positions contain
event data.
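
To check whether a PATH search could land in a directory an attacker might write to, the directories that would be searched can be listed with os.get_exec_path() and tested for write access. This is a minimal sketch; writable_path_dirs is a hypothetical helper, and it only reports what the current process can write to.

```python
import os


def writable_path_dirs(env=None):
    """Return PATH entries that the current process could write to.

    env is an optional environment mapping; by default the process
    environment is used.
    """
    writable = []
    for directory in os.get_exec_path(env):
        # An empty PATH entry refers to the current working directory.
        candidate = directory or os.curdir
        if os.path.isdir(candidate) and os.access(candidate, os.W_OK):
            writable.append(candidate)
    return writable
```

A non-empty result inside a Lambda function (for example because /tmp was added to PATH) means a 'p' variant called with a bare name could resolve to a file dropped there.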


**Q: What if I need to spawn a process from Lambda that processes an S3 object?**

Download the S3 object to /tmp using the boto3 S3 client with a sanitized,
hardcoded local filename. Then pass that known-safe local path to subprocess.run()
with a list. Never derive the local filename or subprocess arguments directly
from the S3 key, bucket name, or event metadata without strict validation.
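
The approach above might look like the following sketch. It assumes s3_client is a boto3 S3 client (only its download_file method is used) and that tool_argv is a fixed, trusted command list chosen by the developer; process_s3_record and both parameter names are hypothetical.

```python
import os
import subprocess
import uuid
from urllib.parse import unquote_plus


def process_s3_record(record, s3_client, tool_argv, workdir="/tmp"):
    """Download one S3 event record's object and run a fixed tool on it."""
    bucket = record["s3"]["bucket"]["name"]
    # Object keys in S3 event notifications arrive URL-encoded.
    key = unquote_plus(record["s3"]["object"]["key"])

    # The local name is generated here and never derived from the key,
    # so nothing attacker-chosen reaches the command line.
    local_path = os.path.join(workdir, f"input-{uuid.uuid4().hex}")
    s3_client.download_file(Bucket=bucket, Key=key, Filename=local_path)
    try:
        return subprocess.run(
            [*tool_argv, local_path],
            capture_output=True,
            text=True,
            timeout=30,
            check=False,
        )
    finally:
        # /tmp persists across warm invocations, so clean up explicitly.
        if os.path.exists(local_path):
            os.remove(local_path)
```

The bucket and key are passed only to the S3 API call, where they are data rather than process arguments.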


**Q: How do I audit existing Lambda functions for os.spawn*() usage?**

Search all Lambda function source files for 'os.spawn'. Review each call site to
determine whether any argument position can be influenced by event data,
environment variables, or database values. For each finding, replace the spawn
call with subprocess.run() using a list and add input validation.
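
A plain text search can be complemented with a small script built on Python's ast module, which reports the line of each os.spawn*() call and ignores matches in comments and strings. This is a minimal sketch, and find_os_spawn_calls is a hypothetical helper: it only recognizes calls written as os.spawn...(), so imports such as `from os import spawnl` or `import os as o` still need a manual search or the dataflow rule above.

```python
import ast


def find_os_spawn_calls(source, filename="<unknown>"):
    """Return sorted (line, name) pairs for calls written as os.spawn*()."""
    findings = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if (
            isinstance(func, ast.Attribute)
            and isinstance(func.value, ast.Name)
            and func.value.id == "os"
            and func.attr.startswith("spawn")
        ):
            findings.append((node.lineno, f"os.{func.attr}"))
    return sorted(findings)
```

Each reported line is a starting point for the manual review described above, not proof that the call is exploitable.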


## References

- [CWE-78: OS Command Injection](https://cwe.mitre.org/data/definitions/78.html)
- [OWASP OS Command Injection Defense Cheat Sheet](https://cheatsheetseries.owasp.org/cheatsheets/OS_Command_Injection_Defense_Cheat_Sheet.html)
- [Python os.spawn* documentation](https://docs.python.org/3/library/os.html#os.spawnl)
- [AWS Lambda Security Best Practices](https://docs.aws.amazon.com/lambda/latest/dg/best-practices.html)
- [OWASP Command Injection](https://owasp.org/www-community/attacks/Command_Injection)
- [Python subprocess — Recommended replacement for os.spawn*](https://docs.python.org/3/library/subprocess.html)

