pathfinder scan --ruleset python/PYTHON-LANG-SEC-044 --project .
About This Rule
Understanding the vulnerability and how it is detected
Python's marshal module serializes Python objects in the binary format used internally for .pyc bytecode files and the import system. Unlike pickle, marshal cannot serialize arbitrary class instances, but it can serialize code objects (compiled Python bytecode), which run arbitrary Python when passed to exec() or eval().
Python's documentation explicitly states: "The marshal module is not intended to be secure against erroneous or maliciously constructed data. Never unmarshal data received from an untrusted or unauthenticated source."
Applications that call marshal.loads() or marshal.load() on data from external sources may be vulnerable to code execution if the input contains serialized code objects that the application later executes.
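The behavior described above can be seen in a minimal sketch. It uses a harmless statement in place of an attacker's payload; the variable names and the `Point` class are illustrative only.

```python
import marshal

# Compile a harmless statement into a code object.
code = compile("result = 6 * 7", "<demo>", "exec")

# marshal can round-trip the code object through bytes.
blob = marshal.dumps(code)
loaded = marshal.loads(blob)

# Loading alone does not run the code; it runs only when passed to exec()/eval().
namespace = {}
exec(loaded, namespace)
print(namespace["result"])

# Arbitrary class instances, by contrast, are rejected at dump time.
class Point:
    pass

try:
    marshal.dumps(Point())
    instance_rejected = False
except ValueError as exc:
    instance_rejected = True
    print("cannot marshal instance:", exc)
```

If `blob` had come from an attacker instead of from `compile()`, the same `exec()` call would run whatever bytecode the attacker chose.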
Security Implications
Potential attack scenarios if this vulnerability is exploited
Code Object Deserialization
marshal can serialize and deserialize Python code objects. An attacker who controls marshal input can supply a crafted code object that runs arbitrary Python code with the process's privileges once the application passes the deserialized object to exec() or eval().
Bytecode Execution via Code Objects
marshal.loads() on attacker-controlled data can produce a code object containing malicious bytecode. If the application subsequently executes this object (e.g., in a dynamic import or eval context), the attacker achieves code execution.
Process Crash via Malformed Data
The Python documentation warns that marshal is not safe against erroneous data. Malformed marshal streams can crash the Python interpreter with segmentation faults or cause memory corruption in the CPython implementation.
Bytecode Cache Poisoning
Applications that cache compiled code objects using marshal in shared storage (Redis, memcached, filesystem) are vulnerable if an attacker can write to that storage, replacing legitimate bytecode with malicious code objects.
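One possible way to reduce the cache-poisoning risk is to authenticate each cache entry before deserializing it. The sketch below is an assumption-laden illustration, not something this rule checks for: `CACHE_KEY`, `pack_code`, and `unpack_code` are hypothetical names, and the key would need to come from a secret store that the attacker cannot read.

```python
import hashlib
import hmac
import marshal

# Hypothetical key; in practice load it from a secret store, not source code.
CACHE_KEY = b"replace-with-a-real-secret"
TAG_SIZE = hashlib.sha256().digest_size  # 32 bytes for SHA-256

def pack_code(code_obj, key=CACHE_KEY):
    """Serialize a code object and prepend an HMAC-SHA256 tag."""
    payload = marshal.dumps(code_obj)
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return tag + payload

def unpack_code(blob, key=CACHE_KEY):
    """Verify the tag before handing any bytes to marshal.loads()."""
    tag, payload = blob[:TAG_SIZE], blob[TAG_SIZE:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("cache entry failed integrity check")
    return marshal.loads(payload)

# A legitimate entry round-trips.
entry = pack_code(compile("x = 1", "<cached>", "exec"))
code_obj = unpack_code(entry)

# Flipping a single bit in the stored entry causes it to be rejected.
tampered = entry[:-1] + bytes([entry[-1] ^ 0x01])
rejected = False
try:
    unpack_code(tampered)
except ValueError:
    rejected = True
```

The check only helps if the key is kept out of the shared storage; an attacker who can read the key can forge valid tags.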
How to Fix
Recommended remediation steps
1. Never use marshal.loads() or marshal.load() for data received from external sources, networks, or file uploads.
2. Use JSON or MessagePack for external data interchange instead of marshal.
3. For internal Python object caching, use pickle with HMAC signing rather than marshal.
4. If marshal is used for .pyc caching, ensure the cache directory is not writable by untrusted users and has proper filesystem permissions.
5. Audit all uses of the marshal module to confirm they only process data from trusted, internal Python processes.
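As a rough illustration of the second step, external input can be parsed with the standard json module, which can only produce plain data types (dict, list, str, numbers, booleans, None) and never code objects. The function name `decode_external` and the sample payload are hypothetical.

```python
import json

def decode_external(raw: bytes) -> dict:
    """Parse untrusted bytes as JSON; only plain data types can come out."""
    value = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on bad input
    if not isinstance(value, dict):
        raise ValueError("expected a JSON object")
    return value

# Stand-in for a request body received from the network.
request_body = b'{"user": "alice", "roles": ["admin"]}'
data = decode_external(request_body)
print(data["user"])
```

Parsing is only the first step; the application still needs to validate the fields it reads from the resulting dictionary.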
Detection Scope
How Code Pathfinder analyzes your code for this vulnerability
This rule detects calls to marshal.loads() and marshal.load() from the Python marshal module. All call sites are flagged since marshal is explicitly documented as unsafe for untrusted data and its use for external data is almost always a security concern.
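To make the idea of flagging every call site concrete, here is a small sketch built on Python's ast module. It is not Code Pathfinder's implementation, only an approximation of the pattern: the helper `find_marshal_calls` is hypothetical, and it matches only the direct `marshal.load(...)` and `marshal.loads(...)` spelling, so aliased imports such as `from marshal import loads` are missed.

```python
import ast

TARGETS = {"load", "loads"}

def find_marshal_calls(source: str):
    """Return (line, name) for each direct marshal.load()/marshal.loads() call."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if (
            isinstance(func, ast.Attribute)
            and func.attr in TARGETS
            and isinstance(func.value, ast.Name)
            and func.value.id == "marshal"
        ):
            hits.append((node.lineno, f"marshal.{func.attr}"))
    return sorted(hits)

# The sample is only parsed, never executed, so undefined names are fine.
sample = """import marshal
data = marshal.loads(blob)
with open("cache.bin", "rb") as fh:
    obj = marshal.load(fh)
safe = json.loads(text)
"""

findings = find_marshal_calls(sample)
for line, name in findings:
    print(f"line {line}: {name}")
```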
Similar Rules
Explore related security rules for Python
Pickle Deserialization of Untrusted Data
pickle.loads() and pickle.load() can execute arbitrary Python code during deserialization. Never unpickle data from untrusted sources.
PyYAML Unsafe Load Function
yaml.load() and yaml.unsafe_load() can construct arbitrary Python objects during YAML parsing, which can lead to code execution. Use yaml.safe_load() instead.
shelve Module Usage Detected
shelve.open() uses pickle internally for value serialization and is not safe for storing or retrieving data from untrusted sources.