# PYTHON-LAMBDA-SEC-011: Lambda SQL Injection via psycopg2 cursor.execute()

> **Severity:** CRITICAL | **CWE:** CWE-89 | **OWASP:** A03:2021

- **Language:** Python
- **Category:** AWS Lambda
- **URL:** https://codepathfinder.dev/registry/python/aws_lambda/PYTHON-LAMBDA-SEC-011
- **Detection:** `pathfinder scan --ruleset python/PYTHON-LAMBDA-SEC-011 --project .`

## Description

This rule detects SQL injection vulnerabilities in AWS Lambda functions where
untrusted event data flows into psycopg2 cursor.execute() calls without proper
parameterization, enabling SQL injection against RDS PostgreSQL, Aurora PostgreSQL,
or any PostgreSQL-compatible backend.

Lambda functions frequently connect to RDS PostgreSQL or Aurora PostgreSQL using
psycopg2, the standard Python PostgreSQL adapter. Event data from API Gateway,
SQS, SNS, S3, and other triggers (event.get("body"), event.get("queryStringParameters"),
event["Records"]) is attacker-controllable and must not be embedded in SQL strings.

When event data is concatenated or f-stringed into the SQL query string before being
passed as the first argument to cursor.execute(), an attacker can inject arbitrary
SQL. PostgreSQL is particularly powerful as an injection target: attackers can use
COPY TO/FROM PROGRAM to execute OS commands (on unmanaged PostgreSQL; RDS restricts
this but not all extensions), pg_read_file() to read server files, and UNION SELECT
to exfiltrate arbitrary tables. Lambda functions connecting to PostgreSQL typically
lack the ORM layer that encourages parameterization, making raw psycopg2 calls
a common source of injection vulnerabilities.


## Vulnerable Code

```python
import json

# SEC-011: psycopg2
def handler_psycopg2(event, context):
    import psycopg2
    conn = psycopg2.connect("dbname=app")
    cursor = conn.cursor()
    name = event.get('name')
    cursor.execute(f"SELECT * FROM users WHERE name = '{name}'")
    return {"statusCode": 200}
```

## Secure Code

```python
import json
import psycopg2
import psycopg2.extras

def get_db_connection():
    return psycopg2.connect(
        host='rds-host.cluster.us-east-1.rds.amazonaws.com',
        dbname='appdb',
        user='lambda_user',
        password='password',
        sslmode='require'
    )

def lambda_handler(event, context):
    params = event.get('queryStringParameters', {}) or {}
    username = params.get('username', '')
    status = params.get('status', '')

    # SECURE: Use %s placeholders and pass values as a tuple to cursor.execute()
    conn = get_db_connection()
    try:
        with conn.cursor(cursor_factory=psycopg2.extras.RealDictCursor) as cursor:
            cursor.execute(
                "SELECT id, username, email FROM users WHERE username = %s AND status = %s",
                (username, status)
            )
            rows = cursor.fetchall()
    finally:
        conn.close()
    return {'statusCode': 200, 'body': json.dumps(rows, default=str)}

```

## Detection Rule (Python SDK)

```python
from rules.python_decorators import python_rule
from codepathfinder import calls, flows, QueryType
from codepathfinder.presets import PropagationPresets

class DBCursor(QueryType):
    fqns = ["sqlite3.Cursor", "mysql.connector.cursor.MySQLCursor",
            "psycopg2.extensions.cursor", "pymysql.cursors.Cursor"]
    patterns = ["*Cursor"]
    match_subclasses = True

_LAMBDA_SOURCES = [
    calls("event.get"),
    calls("event.items"),
    calls("event.values"),
    calls("*.get"),
]


@python_rule(
    id="PYTHON-LAMBDA-SEC-011",
    name="Lambda SQL Injection via psycopg2 Cursor",
    severity="CRITICAL",
    category="aws_lambda",
    cwe="CWE-89",
    tags="python,aws,lambda,sql-injection,psycopg2,OWASP-A03,CWE-89",
    message="Lambda event data flows to psycopg2 cursor.execute(). Use parameterized queries.",
    owasp="A03:2021",
)
def detect_lambda_psycopg2_sqli():
    """Detects Lambda event data flowing to psycopg2 cursor."""
    return flows(
        from_sources=_LAMBDA_SOURCES,
        to_sinks=[
            DBCursor.method("execute", "executemany", "mogrify").tracks(0),
            calls("cursor.execute"),
            calls("cursor.mogrify"),
        ],
        sanitized_by=[
            calls("escape"),
        ],
        propagates_through=PropagationPresets.standard(),
        scope="global",
    )
```

## How to Fix

- Always pass Lambda event data as the second argument to cursor.execute() as a tuple, never by concatenating it into the SQL string.
- Use psycopg2's %s placeholders for all parameterized queries; never use Python string formatting (%, .format(), f-strings) to build SQL strings.
- Grant the Lambda's PostgreSQL user the minimum necessary privileges (CONNECT, SELECT on specific tables) and avoid SUPERUSER or CREATEROLE grants.
- Enable RDS PostgreSQL's pg_stat_activity logging and set log_min_duration_statement to detect slow or unusual queries that may indicate injection.
- Use RDS Proxy for connection pooling to prevent connection exhaustion from Lambda's per-invocation connection model.

## Security Implications

- **Full Database Exfiltration via UNION SELECT:** An attacker who controls any SQL fragment can use UNION SELECT to read from any
table accessible to the Lambda's database user, including user credentials, session
tokens, PII, and financial records stored in the RDS PostgreSQL database.

- **PostgreSQL Extension Abuse:** PostgreSQL supports extensions that can be leveraged via SQL injection. dblink
and postgres_fdw can be used to connect to other database servers. Large object
functions (lo_import, lo_export) can read and write server files on unmanaged
PostgreSQL. Even on RDS where these are restricted, schema enumeration and
data exfiltration remain fully possible.

- **Authentication and Authorization Bypass:** Lambda functions that validate user credentials or check authorization via
SQL queries are vulnerable to ' OR '1'='1 style bypasses that cause the query
to return all rows. This can grant attackers administrative access or allow
them to act as any user in the system.

- **Second-Order Injection via Stored Data:** Injection payloads stored in the database by one Lambda function can be triggered
by a different Lambda function that reads the data and passes it to another
cursor.execute() without re-parameterizing. This cross-function second-order
injection is harder to detect and requires end-to-end taint tracking.


## FAQ

**Q: Why does psycopg2 use %s as a placeholder rather than ? like some other drivers?**

psycopg2 implements the Python DB-API 2.0 with 'pyformat' paramstyle, using %s
as the placeholder regardless of the actual SQL parameter type. The %s is not
Python string formatting; it is a psycopg2-specific placeholder that is replaced
by the driver with properly escaped values. Never confuse this with Python's
%-string formatting, which would introduce injection risk. Always pass values
as the second tuple argument to cursor.execute(), not via Python string formatting.


**Q: Is psycopg2's execute_values() or executemany() also safe for bulk inserts?**

Both are safe when values are passed as separate parameters, not embedded in the
SQL string. execute_values() takes a SQL template with %s placeholders and a
sequence of value tuples. As long as the SQL template is a static string and event
data appears only in the values sequence, injection is prevented. Never build the
SQL template string from event data.


**Q: What if I need dynamic column names or ORDER BY directions from event data?**

Column names and ORDER BY directions cannot be parameterized in SQL. Use an
explicit allowlist: define a set of permitted column names and sort directions,
validate the event value against the allowlist, and use the validated value to
select from a pre-built dictionary of safe SQL fragments. Never interpolate event
data directly into column name or ORDER BY positions.


**Q: Does RDS PostgreSQL restrict dangerous PostgreSQL extensions that could escalate injection?**

Amazon RDS restricts the most dangerous superuser-level operations. COPY TO/FROM
PROGRAM (server-side OS execution) is not available on RDS. However, many other
injection impacts remain fully exploitable on RDS: UNION SELECT for data exfiltration,
stacked queries for data modification, and schema enumeration via information_schema.
Parameterization is always required regardless of the database environment.


**Q: How do I handle Lambda cold starts efficiently with psycopg2 connections?**

Initialize the database connection outside the lambda_handler() function at module
level, and reuse it across warm invocations. Test the connection before use and
reconnect if it has been closed. For high-concurrency Lambda functions, use RDS
Proxy to pool connections, as Lambda's scaling model can exhaust PostgreSQL's
connection limits quickly.


## References

- [CWE-89: SQL Injection](https://cwe.mitre.org/data/definitions/89.html)
- [OWASP SQL Injection Prevention Cheat Sheet](https://cheatsheetseries.owasp.org/cheatsheets/SQL_Injection_Prevention_Cheat_Sheet.html)
- [psycopg2 Passing Parameters to SQL Queries](https://www.psycopg.org/docs/usage.html#passing-parameters-to-sql-queries)
- [AWS Lambda with RDS Best Practices](https://docs.aws.amazon.com/lambda/latest/dg/services-rds.html)
- [OWASP SQL Injection](https://owasp.org/www-community/attacks/SQL_Injection)
- [Python DB-API 2.0 Parameterized Queries](https://peps.python.org/pep-0249/#paramstyle)

---

Source: https://codepathfinder.dev/registry/python/aws_lambda/PYTHON-LAMBDA-SEC-011
Code Pathfinder — Open source, type-aware SAST with cross-file dataflow analysis
