# PYTHON-LAMBDA-SEC-014: Lambda SQL Injection via SQLAlchemy execute()

> **Severity:** CRITICAL | **CWE:** CWE-89 | **OWASP:** A03:2021

- **Language:** Python
- **Category:** AWS Lambda
- **URL:** https://codepathfinder.dev/registry/python/aws_lambda/PYTHON-LAMBDA-SEC-014
- **Detection:** `pathfinder scan --ruleset python/PYTHON-LAMBDA-SEC-014 --project .`

## Description

This rule detects SQL injection vulnerabilities in AWS Lambda functions where
untrusted event data flows into SQLAlchemy session.execute() or connection.execute()
calls without bound parameters, enabling SQL injection against RDS MySQL, PostgreSQL,
SQL Server, or any SQLAlchemy-supported backend.

SQLAlchemy is widely used in Lambda functions for its ORM capabilities and support
for multiple database backends. When used correctly with the ORM or with text()
and bound parameters, SQLAlchemy prevents injection. However, two unsafe patterns
are common in Lambda handlers: passing a raw f-string SQL query to text() without
bound parameters, and passing event data directly to execute() as an unparameterized
string.

Lambda functions receive event data from API Gateway, SQS, SNS, S3, and other
triggers (event.get("body"), event.get("queryStringParameters"), event["Records"]).
This data is attacker-controllable. When it is embedded in the SQL string argument
to text() or execute() via f-strings or concatenation, the injection bypasses all
SQLAlchemy protections since SQLAlchemy only escapes values that are passed through
its bound parameter mechanism (text() with :parameter syntax and bindparams()).


## Vulnerable Code

```python
import json

# SEC-014: SQLAlchemy session.execute
def handler_sqlalchemy(event, context):
    search = event.get('search')
    result = session.execute(f"SELECT * FROM items WHERE name = '{search}'")
    return {"statusCode": 200}
```

## Secure Code

```python
import json
from sqlalchemy import create_engine, text
from sqlalchemy.orm import sessionmaker

engine = create_engine('postgresql+psycopg2://user:pass@rds-host/appdb')
Session = sessionmaker(bind=engine)

def lambda_handler(event, context):
    params = event.get('queryStringParameters', {}) or {}
    user_id = params.get('user_id', '')
    status = params.get('status', '')

    # SECURE: Validate type before query
    try:
        user_id = int(user_id)
    except (ValueError, TypeError):
        return {'statusCode': 400, 'body': 'Invalid user ID'}

    session = Session()
    try:
        # SECURE: Use text() with :named params and pass values via execute()
        result = session.execute(
            text("SELECT id, username, email FROM users WHERE id = :uid AND status = :status"),
            {'uid': user_id, 'status': status}
        )
        rows = [dict(row._mapping) for row in result]
    finally:
        session.close()
    return {'statusCode': 200, 'body': json.dumps(rows)}

```

## Detection Rule (Python SDK)

```python
from rules.python_decorators import python_rule
from codepathfinder import calls, flows, QueryType
from codepathfinder.presets import PropagationPresets

class SQLAlchemySession(QueryType):
    fqns = ["sqlalchemy.orm.Session", "sqlalchemy.orm.session.Session"]
    patterns = ["*Session"]
    match_subclasses = True

_LAMBDA_SOURCES = [
    calls("event.get"),
    calls("event.items"),
    calls("event.values"),
    calls("*.get"),
]


@python_rule(
    id="PYTHON-LAMBDA-SEC-014",
    name="Lambda SQL Injection via SQLAlchemy",
    severity="CRITICAL",
    category="aws_lambda",
    cwe="CWE-89",
    tags="python,aws,lambda,sql-injection,sqlalchemy,OWASP-A03,CWE-89",
    message="Lambda event data flows to SQLAlchemy session.execute(). Use text() with params.",
    owasp="A03:2021",
)
def detect_lambda_sqlalchemy_sqli():
    """Detects Lambda event data flowing to SQLAlchemy session.execute()."""
    return flows(
        from_sources=_LAMBDA_SOURCES,
        to_sinks=[
            SQLAlchemySession.method("execute").tracks(0),
            calls("session.execute"),
            calls("*.execute"),
        ],
        sanitized_by=[
            calls("text"),
            calls("sqlalchemy.text"),
        ],
        propagates_through=PropagationPresets.standard(),
        scope="global",
    )
```

## How to Fix

- Always use SQLAlchemy's text() with :parameter_name syntax and pass values as a dictionary to execute(), never embed event data in the SQL string.
- Prefer SQLAlchemy ORM methods (session.query(), session.get(), filter()) over raw SQL to benefit from automatic parameterization.
- Validate and type-convert event fields before use (e.g., int() for numeric IDs) as a defense-in-depth measure.
- Grant the Lambda's database user the minimum necessary privileges; avoid broad permissions even when using SQLAlchemy ORM.
- Initialize the SQLAlchemy engine at module level for connection reuse across warm Lambda invocations.

## Security Implications

- **False Safety from SQLAlchemy's Reputation:** Developers often assume that using SQLAlchemy automatically prevents SQL injection.
This is true for ORM operations (query(), filter(), Session.add()) but not for raw
SQL via execute(). When event data is embedded in a string passed to text() or
execute() directly, SQLAlchemy provides no injection protection. The false sense
of security from using SQLAlchemy can cause security reviews to overlook these
vulnerable call sites.

- **Multi-Database Backend Exposure:** SQLAlchemy supports MySQL, PostgreSQL, SQL Server, Oracle, and SQLite. A Lambda
function using SQLAlchemy with any of these backends is equally vulnerable to
injection through unparameterized execute() calls. The attacker's injection
payload adapts to the specific SQL dialect, enabling database-specific attack
techniques on each backend.

- **Full Database Access via Session Context:** SQLAlchemy sessions are often configured with broad database permissions for
ORM convenience. An injected SQL string through session.execute() executes with
all permissions of the session's database user, which may include read/write
access to all tables in the schema.

- **ORM Bypass via Raw execute():** The presence of SQLAlchemy ORM models and safe ORM queries in the same Lambda
function does not protect against injection in raw execute() calls. Both safe and
unsafe call sites can coexist in the same Lambda handler, and only the unsafe
raw execute() calls with event data need to be fixed.


## FAQ

**Q: Does using SQLAlchemy ORM automatically prevent SQL injection?**

Yes for ORM operations: session.query(User).filter(User.id == user_id), session.get(),
and relationship-based queries all use parameterization internally. No for raw SQL:
session.execute("SELECT * FROM users WHERE id = " + user_id) bypasses all ORM
protections. The rule specifically targets raw SQL execute() calls with event data
embedded in the string argument, which are the dangerous pattern even when ORM is
used elsewhere in the same Lambda function.


**Q: What is the correct way to use text() with parameters in SQLAlchemy?**

Use named parameters with colon syntax in the SQL string and pass values as a
dictionary: session.execute(text("SELECT * FROM t WHERE id = :id"), {"id": user_id}).
SQLAlchemy's bindparams() method on the text() object is an alternative:
text("SELECT * FROM t WHERE id = :id").bindparams(id=user_id). Both correctly
use parameterization. Never use Python f-strings or % formatting to embed values
in the string passed to text().


**Q: How do I manage SQLAlchemy connection pools with Lambda's ephemeral execution model?**

Create the engine at module level so it is reused across warm invocations. Use
pool_pre_ping=True to test connections before use. For high-concurrency Lambda
functions, set pool_size and max_overflow conservatively or use NullPool to
create new connections per invocation. AWS RDS Proxy is the recommended solution
for connection pooling at Lambda scale, as it handles connection reuse outside
the Lambda execution environment.


**Q: Can I use SQLAlchemy's Core expression language to build dynamic queries safely?**

Yes. SQLAlchemy's Core expression language (select(), where(), and_(), or_())
builds parameterized queries programmatically using column objects rather than
string concatenation. Dynamic filters built with and_() and column objects are
safe. The injection risk is specifically with raw SQL strings (text() with
f-strings, or execute() with concatenated strings). Use Core expressions for
dynamic query construction instead.


**Q: What if my Lambda uses multiple SQLAlchemy engines for different RDS backends?**

The injection risk applies equally to all backends. The fix is the same for
each engine: use text() with :param syntax and pass values as dictionaries.
If different Lambda functions connect to different backends, each one must be
audited independently. Consider using a shared database utilities module that
enforces parameterization for all execute() calls across all backends.


## References

- [CWE-89: SQL Injection](https://cwe.mitre.org/data/definitions/89.html)
- [OWASP SQL Injection Prevention Cheat Sheet](https://cheatsheetseries.owasp.org/cheatsheets/SQL_Injection_Prevention_Cheat_Sheet.html)
- [SQLAlchemy Using Textual SQL](https://docs.sqlalchemy.org/en/14/core/tutorial.html#using-textual-sql)
- [AWS Lambda with RDS Best Practices](https://docs.aws.amazon.com/lambda/latest/dg/services-rds.html)
- [OWASP SQL Injection](https://owasp.org/www-community/attacks/SQL_Injection)
- [SQLAlchemy Security Considerations](https://docs.sqlalchemy.org/en/14/core/sqlelement.html#sqlalchemy.sql.expression.text)

---

Source: https://codepathfinder.dev/registry/python/aws_lambda/PYTHON-LAMBDA-SEC-014
Code Pathfinder — Open source, type-aware SAST with cross-file dataflow analysis
