# PYTHON-LANG-SEC-073: multiprocessing Connection.recv() Usage

> **Severity:** MEDIUM | **CWE:** CWE-502 | **OWASP:** A08:2021

- **Language:** Python
- **Category:** Python Core
- **URL:** https://codepathfinder.dev/registry/python/lang/PYTHON-LANG-SEC-073
- **Detection:** `pathfinder scan --ruleset python/PYTHON-LANG-SEC-073 --project .`

## Description

Python's multiprocessing.connection.Connection.recv() method receives a message and
deserializes it with pickle. When data arrives on the Connection object, recv() calls
pickle.loads() internally to reconstruct the Python object from the received bytes.
Because unpickling can invoke arbitrary callables, a sender who can craft a malicious
pickle payload can execute arbitrary Python code in the receiving process.
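
To make the risk concrete, here is a minimal sketch of recv() invoking a callable chosen by the sender. The Payload class and the demo_recv_runs_callable() function are hypothetical names used only for illustration, and the payload is deliberately harmless: it evaluates 6 * 7 where a real attacker would run a shell command.

```python
import pickle
from multiprocessing import Pipe


class Payload:
    """Hypothetical stand-in for an attacker-crafted object."""

    def __reduce__(self):
        # pickle stores "call eval('6 * 7')" as the recipe for rebuilding
        # this object; the call itself happens on the receiving side.
        return (eval, ("6 * 7",))


def demo_recv_runs_callable():
    # Unidirectional pipe: the first end only receives, the second only sends.
    receiver, sender = Pipe(duplex=False)
    try:
        # The sender does not have to use send(): any bytes that form a
        # valid pickle stream are unpickled by recv() on the other end.
        sender.send_bytes(pickle.dumps(Payload()))
        return receiver.recv()
    finally:
        sender.close()
        receiver.close()
```

Calling demo_recv_runs_callable() returns 42, the result of the eval() call, rather than a Payload instance. The receiver never imported or referenced Payload; the callable ran purely because recv() unpickled the bytes.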

multiprocessing Connections are designed for inter-process communication between trusted
Python processes in the same application. They become dangerous when exposed over a
network socket (via multiprocessing.connection.Client()/Listener()) to untrusted parties.

Use recv_bytes() to receive raw bytes without pickle deserialization, then parse the
bytes with a safe format like JSON.


## Vulnerable Code

```python
from multiprocessing.connection import Client

# INSECURE: recv() unpickles whatever the peer on the other end sends
conn = Client(('localhost', 6000))
data = conn.recv()
```

## Secure Code

```python
import json
import os
from multiprocessing.connection import Connection, Listener

# INSECURE: recv() uses pickle on received data
# data = conn.recv()  # pickle.loads() internally

# SECURE: Use recv_bytes() and parse with JSON
def receive_json(conn: Connection) -> dict:
    raw = conn.recv_bytes()
    data = json.loads(raw.decode("utf-8"))
    if not isinstance(data, dict):
        raise ValueError("Expected JSON object")
    return data

# SECURE: Sender uses send_bytes() with JSON
def send_json(conn: Connection, data: dict) -> None:
    conn.send_bytes(json.dumps(data).encode("utf-8"))

# SECURE: Listener with authentication key (defense in depth)
def create_authenticated_listener(address: tuple) -> Listener:
    authkey = os.environ.get("IPC_AUTH_KEY", "").encode()
    if not authkey:
        raise RuntimeError("IPC_AUTH_KEY not configured")
    return Listener(address, authkey=authkey)
```
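
As a possible hardening step on top of the pattern above, the sketch below caps the message size by passing the maxlength argument to recv_bytes(), which raises OSError when a message is larger than the limit. The MAX_MESSAGE_BYTES value and the receive_json_bounded() and round_trip() names are assumptions for illustration, not part of the rule.

```python
import json
from multiprocessing import Pipe

MAX_MESSAGE_BYTES = 64 * 1024  # hypothetical cap; tune for your protocol


def send_json(conn, data: dict) -> None:
    conn.send_bytes(json.dumps(data).encode("utf-8"))


def receive_json_bounded(conn, max_bytes: int = MAX_MESSAGE_BYTES) -> dict:
    # recv_bytes(maxlength) raises OSError if the incoming message is
    # larger than maxlength, so oversized input never reaches the parser.
    raw = conn.recv_bytes(max_bytes)
    data = json.loads(raw.decode("utf-8"))
    if not isinstance(data, dict):
        raise ValueError("Expected JSON object")
    return data


def round_trip(message: dict) -> dict:
    """Send one JSON message through a local pipe and read it back."""
    receiver, sender = Pipe(duplex=False)
    try:
        send_json(sender, message)
        return receive_json_bounded(receiver)
    finally:
        sender.close()
        receiver.close()
```

After an oversized message the connection may no longer be readable, so callers would typically close it and reconnect instead of retrying the read.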

## Detection Rule (Python SDK)

```python
from rules.python_decorators import python_rule
from codepathfinder import calls, QueryType


@python_rule(
    id="PYTHON-LANG-SEC-073",
    name="multiprocessing Connection.recv()",
    severity="MEDIUM",
    category="lang",
    cwe="CWE-502",
    tags="python,multiprocessing,recv,deserialization,CWE-502",
    message="Connection.recv() uses pickle internally. Not safe for untrusted connections.",
    owasp="A08:2021",
)
def detect_conn_recv():
    """Detects multiprocessing Connection.recv() which uses pickle."""
    return calls("*.recv", "multiprocessing.connection.Connection.recv")
```

## How to Fix

- Replace Connection.recv() with Connection.recv_bytes() and parse the received bytes with json.loads() to avoid pickle deserialization.
- Always bind multiprocessing Listener() to localhost (127.0.0.1) and not to 0.0.0.0 or a network-accessible address.
- Use the authkey parameter for all multiprocessing connections to require HMAC authentication, even between trusted processes.
- For inter-process communication that must cross network boundaries, use a proper message queue (Redis, RabbitMQ) with authenticated, schema-validated payloads.
- Audit all multiprocessing.connection.Listener() configurations to ensure they are not exposed to untrusted network connections.
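
The first three items can be combined as in the following sketch: a Listener bound to 127.0.0.1 with an authkey, exchanging JSON via send_bytes() and recv_bytes(). The loopback_json_exchange() helper is hypothetical, the key is generated in-process only to keep the example self-contained (a real deployment would load it from a secret store), and port 0 asks the OS for a free ephemeral port.

```python
import json
import secrets
import threading
from multiprocessing.connection import Client, Listener


def loopback_json_exchange(message: dict) -> dict:
    """Send one JSON message over an authenticated loopback connection."""
    authkey = secrets.token_bytes(32)  # illustration only; load from config in practice
    received = {}

    # Bind to loopback only; listener.address reports the port the OS picked.
    with Listener(("127.0.0.1", 0), authkey=authkey) as listener:

        def serve_once():
            # accept() performs the HMAC challenge before returning.
            with listener.accept() as conn:
                raw = conn.recv_bytes()  # raw bytes, no pickle involved
                received.update(json.loads(raw.decode("utf-8")))

        worker = threading.Thread(target=serve_once)
        worker.start()

        with Client(listener.address, authkey=authkey) as conn:
            conn.send_bytes(json.dumps(message).encode("utf-8"))

        worker.join(timeout=5)

    return received
```

A thread stands in for the second process here so the example runs in a single file; the same calls apply when the client lives in a separate process.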

## Security Implications

- **Pickle Deserialization via IPC Channel:** recv() calls pickle.loads() on the received bytes. An attacker who can write to
the connection (either as a man-in-the-middle on the network path or as a malicious
process connecting to an exposed Listener) can send a malicious pickle payload that
executes arbitrary code in the receiving process.

- **Listener Exposed to Network:** multiprocessing.connection.Listener() can bind to a TCP address. If exposed to
a network-accessible address (not localhost), any client can send arbitrary pickle
payloads to the Listener, achieving remote code execution.

- **Authentication Bypass:** multiprocessing connections support HMAC-based authentication via the authkey
parameter. Without authkey, there is no authentication and any connecting client
can send pickle payloads. Even with authkey, compromised keys allow exploitation.

- **Forked Process Trust Assumption:** Code that assumes recv() is safe because "it's only used with forked processes"
may be vulnerable if the connection is also exposed to a network or if the expected
trusted sender can be replaced by an attacker-controlled process.


## FAQ

**Q: Is recv() safe when used only between processes I control?**

recv() carries less risk when both ends are trusted Python processes in the same deployment.
However, if the multiprocessing Listener is bound to a network-accessible address
without authkey, any process can connect and send malicious data. Even with authkey,
a compromised key or a compromised trusted process enables attacks. Using recv_bytes()
with JSON eliminates the deserialization risk entirely.


**Q: What is authkey and does it make recv() safe?**

The authkey parameter enables HMAC-based challenge-response authentication on
multiprocessing connections. It prevents unauthorized clients from connecting to
a Listener. However, it does not encrypt the connection, and if the authkey is
compromised (e.g., via environment variable exposure), unauthorized clients can
connect and send malicious pickle payloads. authkey is defense-in-depth, not a
substitute for using recv_bytes() + JSON.
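
The handshake behavior can be observed with a short sketch: a client that presents a different key is refused with multiprocessing.AuthenticationError before any message is exchanged. The client_is_rejected() helper is a hypothetical name, and a thread plays the server role to keep the example in one file.

```python
import threading
from multiprocessing import AuthenticationError
from multiprocessing.connection import Client, Listener


def client_is_rejected(server_key: bytes, client_key: bytes) -> bool:
    """Return True if the handshake fails for the given pair of keys."""
    with Listener(("127.0.0.1", 0), authkey=server_key) as listener:

        def serve_once():
            try:
                listener.accept().close()
            except AuthenticationError:
                pass  # the server side also sees the failed handshake

        worker = threading.Thread(target=serve_once)
        worker.start()
        try:
            Client(listener.address, authkey=client_key).close()
            return False
        except AuthenticationError:
            return True
        finally:
            worker.join(timeout=5)
```

With mismatched keys the function returns True; with identical keys it returns False. The check covers only the connection handshake, which is why the answer above still recommends recv_bytes() with JSON for the messages themselves.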


**Q: What is the difference between recv() and recv_bytes()?**

recv() calls pickle.loads() on the received data to return a Python object.
recv_bytes() returns the raw bytes without deserialization. Use recv_bytes()
and then parse the bytes with a safe format (json.loads, msgpack.loads with
no Python object extension). This avoids pickle deserialization entirely.
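
The difference can be seen in a small sketch: bytes read with recv_bytes() stay inert, and a strict JSON parser rejects a pickle stream without executing it. The classify_untrusted_message() and demo_recv_bytes() names are hypothetical.

```python
import json
from multiprocessing import Pipe


def classify_untrusted_message(raw: bytes) -> str:
    """Return 'json' for a valid JSON object, otherwise 'rejected'."""
    try:
        data = json.loads(raw.decode("utf-8"))
    except ValueError:  # covers UnicodeDecodeError and json.JSONDecodeError
        return "rejected"
    return "json" if isinstance(data, dict) else "rejected"


def demo_recv_bytes():
    receiver, sender = Pipe(duplex=False)
    try:
        sender.send({"op": "ping"})            # send() pickles the dict
        pickled = receiver.recv_bytes()        # raw pickle bytes, not unpickled
        sender.send_bytes(b'{"op": "ping"}')   # explicit JSON bytes
        as_json = receiver.recv_bytes()
        return (
            classify_untrusted_message(pickled),
            classify_untrusted_message(as_json),
        )
    finally:
        sender.close()
        receiver.close()
```

demo_recv_bytes() returns ("rejected", "json"): the pickled message fails UTF-8 decoding and is discarded, while the JSON message parses normally.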


**Q: When might a multiprocessing Connection be exposed to untrusted data?**

Risk scenarios include: (1) Listener bound to 0.0.0.0 or a network interface
accessible to external clients. (2) Shared queue or pipe where external input
can reach the sender process. (3) Test infrastructure that connects to production
Listeners. (4) Container environments where network policies allow cross-container
multiprocessing connections.
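
For the first scenario, a startup guard along these lines could refuse non-loopback bind addresses before a Listener is created. The is_loopback_host() helper is a hypothetical name, and treating the literal hostname "localhost" as loopback is an assumption about the host's resolver configuration.

```python
import ipaddress


def is_loopback_host(host: str) -> bool:
    """Return True only for addresses that stay on the local machine."""
    if host == "localhost":
        # Assumption: "localhost" resolves to a loopback address on this host.
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Empty strings and other hostnames are treated as not loopback.
        return False
```

For example, is_loopback_host("127.0.0.1") and is_loopback_host("::1") are True, while "0.0.0.0", "10.0.0.5", and the empty string (which binds to all interfaces) are False.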


**Q: Is multiprocessing.Queue.get() also vulnerable?**

Yes. multiprocessing.Queue uses pickle for serialization. Queue.get() is equivalent
to recv() in terms of deserialization risk. If an attacker can put malicious data
into a Queue (through a shared queue server or process injection), Queue.get()
will execute the malicious pickle payload.


**Q: What is the recommended replacement for inter-process communication?**

For same-host IPC: use a Unix domain socket with JSON messages, or a shared memory
segment (multiprocessing.shared_memory) for raw data. For network IPC: use gRPC
with Protocol Buffers, or a message broker (Redis, RabbitMQ) with authenticated,
schema-validated JSON payloads. Avoid pickle-based IPC for any data that could
be influenced by external input.
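
As one possible shape for the same-host option, the sketch below sends JSON messages over a stream socket with a 4-byte big-endian length prefix. The framing format, the MAX_FRAME_BYTES cap, and the send_frame() and recv_frame() names are assumptions for illustration; any socket object works, including one connected over a Unix domain socket.

```python
import json
import socket
import struct

HEADER = struct.Struct("!I")   # 4-byte big-endian length prefix (assumed framing)
MAX_FRAME_BYTES = 1 << 20      # hypothetical 1 MiB cap


def send_frame(sock: socket.socket, message: dict) -> None:
    body = json.dumps(message).encode("utf-8")
    sock.sendall(HEADER.pack(len(body)) + body)


def _read_exact(sock: socket.socket, count: int) -> bytes:
    # Stream sockets may return fewer bytes than requested, so loop.
    buf = bytearray()
    while len(buf) < count:
        chunk = sock.recv(count - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection mid-frame")
        buf += chunk
    return bytes(buf)


def recv_frame(sock: socket.socket) -> dict:
    (length,) = HEADER.unpack(_read_exact(sock, HEADER.size))
    if length > MAX_FRAME_BYTES:
        raise ValueError("frame exceeds size limit")
    message = json.loads(_read_exact(sock, length).decode("utf-8"))
    if not isinstance(message, dict):
        raise ValueError("Expected JSON object")
    return message
```

A quick way to try it is with socket.socketpair(), which returns two already-connected sockets: call send_frame() on one and recv_frame() on the other.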


## References

- [CWE-502: Deserialization of Untrusted Data](https://cwe.mitre.org/data/definitions/502.html)
- [Python docs: multiprocessing.connection](https://docs.python.org/3/library/multiprocessing.html#connection-objects)
- [OWASP Deserialization Cheat Sheet](https://cheatsheetseries.owasp.org/cheatsheets/Deserialization_Cheat_Sheet.html)
- [OWASP Top 10 A08:2021 Software and Data Integrity Failures](https://owasp.org/Top10/A08_2021-Software_and_Data_Integrity_Failures/)
- [Python docs: pickle, restricting globals](https://docs.python.org/3/library/pickle.html#restricting-globals)

