# PYTHON-LANG-SEC-080: psycopg2 SQL Injection via String Formatting

> **Severity:** CRITICAL | **CWE:** CWE-89 | **OWASP:** A03:2021

- **Language:** Python
- **Category:** Python Core
- **URL:** https://codepathfinder.dev/registry/python/lang/PYTHON-LANG-SEC-080
- **Detection:** `pathfinder scan --ruleset python/PYTHON-LANG-SEC-080 --project .`

## Description

psycopg2 is the most widely used PostgreSQL adapter for Python. SQL injection occurs
when user-supplied data is concatenated or formatted into the SQL query string instead
of being passed as separate parameters. When the query is constructed with string
formatting (% operator, .format(), or f-strings), an attacker can embed SQL syntax
characters in their input to modify the query structure.

The correct approach with psycopg2 is to use parameterized queries: pass the SQL
template as a literal string with %s placeholders and user data as a separate tuple
argument to cursor.execute(). psycopg2 handles proper escaping and quoting of
parameters at the database protocol level.

This rule detects patterns where SQL strings built with string formatting are passed
to psycopg2's cursor.execute() or cursor.executemany() methods.


## Vulnerable Code

```python
import sqlite3

# SEC-080: psycopg2
import psycopg2
pg_conn = psycopg2.connect("dbname=test")
pg_cursor = pg_conn.cursor()
name = "user_input"
pg_cursor.execute("SELECT * FROM users WHERE name = '" + name + "'")
```

## Secure Code

```python
import psycopg2

# INSECURE: String formatting in SQL
# cursor.execute(f"SELECT * FROM users WHERE username = '{username}'")
# cursor.execute("SELECT * FROM orders WHERE id = " + order_id)

# SECURE: Parameterized queries with %s placeholders
def get_user(conn: psycopg2.extensions.connection, username: str) -> dict:
    with conn.cursor() as cursor:
        cursor.execute(
            "SELECT id, username, email FROM users WHERE username = %s",
            (username,)
        )
        row = cursor.fetchone()
        if row is None:
            return {}
        return {"id": row[0], "username": row[1], "email": row[2]}

def search_orders(conn: psycopg2.extensions.connection,
                  user_id: int, status: str) -> list:
    with conn.cursor() as cursor:
        cursor.execute(
            "SELECT id, total, created_at FROM orders "
            "WHERE user_id = %s AND status = %s",
            (user_id, status)
        )
        return cursor.fetchall()

```

## Detection Rule (Python SDK)

```python
from rules.python_decorators import python_rule
from codepathfinder import calls, flows, QueryType
from codepathfinder.presets import PropagationPresets

class Psycopg2Cursor(QueryType):
    fqns = ["psycopg2.extensions.cursor", "psycopg2.extras.RealDictCursor"]
    patterns = ["*cursor*"]
    match_subclasses = True


@python_rule(
    id="PYTHON-LANG-SEC-080",
    name="psycopg2 SQL Injection",
    severity="CRITICAL",
    category="lang",
    cwe="CWE-89",
    tags="python,psycopg2,sql-injection,database,OWASP-A03,CWE-89",
    message="String formatting in psycopg2 query. Use parameterized queries: cursor.execute(sql, params).",
    owasp="A03:2021",
)
def detect_psycopg2_sqli():
    """Detects potential SQL injection in psycopg2 cursor.execute()."""
    return Psycopg2Cursor.method("execute", "executemany")
```

## How to Fix

- Always use parameterized queries with %s placeholders and pass user data as the second argument tuple to cursor.execute().
- Never construct SQL strings with % operator, .format(), or f-strings when user-supplied values are involved.
- Use psycopg2.sql module (psycopg2.sql.SQL, psycopg2.sql.Identifier) for dynamic column names, table names, or ORDER BY directions that cannot be parameterized.
- Grant the PostgreSQL database user the minimum required privileges to prevent SQL injection from accessing unintended schemas or escalating to system functions.
- Enable PostgreSQL's pg_audit extension to log all SQL statements for security monitoring.

## Security Implications

- **Database Records Exfiltration:** UNION SELECT injection allows an attacker to retrieve data from any table in the
PostgreSQL database. This includes user credentials, session tokens, private data,
and application secrets stored in the database.

- **Authentication Bypass:** Login queries built with string concatenation can be bypassed with ' OR '1'='1
payloads or similar techniques, granting access without valid credentials.

- **Data Manipulation and Destruction:** Stacked queries or subquery injection can issue INSERT, UPDATE, or DELETE
statements, allowing attackers to alter data, create backdoor accounts, or
wipe database tables.

- **PostgreSQL Server Escalation:** PostgreSQL's COPY TO/FROM PROGRAM (superuser), pg_read_file(), and pg_read_binary_file()
can read server filesystem files. On misconfigured servers, SQL injection can
escalate to OS command execution.


## FAQ

**Q: What is the correct way to parameterize queries in psycopg2?**

Use %s as a placeholder in the SQL template string (regardless of the actual data
type — psycopg2 handles type conversion). Pass user values as a tuple (or list)
as the second argument to cursor.execute(): cursor.execute("SELECT * FROM t WHERE id = %s", (user_id,)).
Never use Python's % string formatting operator on the SQL string itself.


**Q: How do I dynamically specify column names or table names safely?**

Column names and table names cannot be parameterized with %s — they are SQL identifiers,
not values. Use psycopg2.sql.Identifier for safe identifier quoting:
from psycopg2 import sql; cursor.execute(sql.SQL("SELECT {} FROM t").format(sql.Identifier(col_name))).
Additionally, validate identifiers against an allowlist of known-valid names.


**Q: Is it safe to use %s in psycopg2 if the value is an integer?**

Yes, and it is required. Always use %s for all parameterized values including integers.
psycopg2 casts the Python value to the appropriate PostgreSQL type. Using string
interpolation for integers (f"WHERE id = {user_id}") is still vulnerable if the
variable could contain a non-integer value due to type coercion or logic bugs.


**Q: Can this rule detect SQL injection that spans multiple lines or functions?**

Yes. Code Pathfinder's inter-procedural taint analysis traces data flow from HTTP
request parameters through variable assignments and function calls to cursor.execute().
It catches cases where the SQL string is built in one function and executed in another.


**Q: What about psycopg2's mogrify() method — is it safe?**

cursor.mogrify() formats a query string with parameters using psycopg2's safe escaping,
returning the formatted bytes. The result is safe to use in cursor.execute() directly.
However, using mogrify() to build SQL strings for manual concatenation is still
dangerous. Use cursor.execute(sql, params) directly instead of mogrify().


**Q: Does this apply to psycopg3 (psycopg) as well?**

psycopg3 uses the same parameterization approach but with different syntax: %s for
positional parameters (as in psycopg2) or %(name)s for named parameters. The same
rule applies: use parameterized queries with separate parameter tuples, never string
formatting in SQL strings.


## References

- [CWE-89: SQL Injection](https://cwe.mitre.org/data/definitions/89.html)
- [psycopg2 docs: Passing parameters to SQL queries](https://www.psycopg.org/docs/usage.html#passing-parameters-to-sql-queries)
- [psycopg2 docs: The problem with the query parameters](https://www.psycopg.org/docs/usage.html#the-problem-with-the-query-parameters)
- [OWASP SQL Injection Prevention Cheat Sheet](https://cheatsheetseries.owasp.org/cheatsheets/SQL_Injection_Prevention_Cheat_Sheet.html)
- [OWASP Top 10 A03:2021 Injection](https://owasp.org/Top10/A03_2021-Injection/)

---

Source: https://codepathfinder.dev/registry/python/lang/PYTHON-LANG-SEC-080
Code Pathfinder — Open source, type-aware SAST with cross-file dataflow analysis
