Interactive Playground
Experiment with the vulnerable code and security rule below. Edit the code to see how the rule detects different vulnerability patterns.
pathfinder scan --ruleset python/PYTHON-LANG-SEC-040 --project .

About This Rule
Understanding the vulnerability and how it is detected
Python's pickle module serializes and deserializes Python objects by encoding them as a stream of opcodes that are executed by a virtual stack machine during unpickling. The __reduce__() and __reduce_ex__() methods on objects can define arbitrary Python code to be executed when the object is deserialized.
This means that deserializing a pickle stream from an untrusted source is equivalent to executing arbitrary Python code. An attacker who can control the pickled data can achieve full remote code execution, read files, spawn processes, and exfiltrate data. There is no safe subset of pickle operations — the entire pickle format is a code execution vector.
The Python documentation explicitly warns: "The pickle module is not secure. Only unpickle data you trust." Use JSON, MessagePack, or Protocol Buffers for data interchange with untrusted parties.
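For plain data, switching away from pickle is usually mechanical. A minimal sketch of the JSON swap (the `record` payload is illustrative):

```python
import json

# json handles only data: dicts, lists, strings, numbers, bools, None.
# Deserializing untrusted JSON cannot trigger code execution, because
# there is no mechanism for a JSON document to name a callable.
record = {"user": "alice", "roles": ["admin"], "active": True}

wire = json.dumps(record)        # instead of pickle.dumps(record)
restored = json.loads(wire)      # instead of pickle.loads(wire)
assert restored == record
```

The trade-off is that JSON round-trips only basic types; tuples become lists and custom classes need explicit encoding, which is exactly what makes it safe.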
Security Implications
Potential attack scenarios if this vulnerability is exploited
Arbitrary Code Execution via __reduce__
The pickle __reduce__ protocol allows any pickleable object to specify a callable and arguments to be invoked during deserialization. An attacker crafts a pickle stream that calls os.system(), subprocess.Popen(), or exec() with malicious arguments, achieving RCE simply by having the pickle stream deserialized.
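The mechanism fits in a few lines. This sketch uses a harmless `eval` of an arithmetic expression where an attacker would substitute `os.system` or `subprocess.Popen`:

```python
import pickle

class Gadget:
    # __reduce__ returns (callable, args). pickle records both, and the
    # unpickler invokes the callable with those args during
    # deserialization -- the Gadget object is never reconstructed.
    def __reduce__(self):
        return (eval, ("2 + 2",))

payload = pickle.dumps(Gadget())
print(pickle.loads(payload))  # -> 4: code ran just by deserializing
```

Note that the victim never calls any `Gadget` method; `pickle.loads()` alone is enough to execute the attacker-chosen callable.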
No Sanitization Is Possible
Unlike SQL injection or XSS where sanitization can be effective, there is no way to safely sanitize or validate a pickle stream before deserializing it. Parsing the pickle stream to check for dangerous opcodes requires implementing a pickle interpreter, which can itself be bypassed by encoding techniques.
Session and Cache Poisoning
Applications that store pickled objects in Redis, Memcached, or cookies for session management are vulnerable if an attacker can write to those stores. Session poisoning via pickle injection in shared cache stores has been used in real attacks.
File Upload and Deserialization Chain
Applications that accept file uploads and deserialize them with pickle (e.g., ML model files, scientific data, serialized objects) are vulnerable to malicious uploads that execute code on the server when the file is loaded.
How to Fix
Recommended remediation steps
1. Replace pickle with JSON, MessagePack, or Protocol Buffers for data received from any external source.
2. For ML model serialization, use format-specific safe formats: ONNX, SavedModel (TensorFlow), TorchScript, or weights_only=True for PyTorch.
3. If pickle must be used for internal IPC, sign all pickle payloads with HMAC using a secret key and verify the signature before deserializing.
4. Never accept pickle data in file uploads, API endpoints, message queues, or any interface reachable by external parties.
5. For scientific data (NumPy, pandas), use safe alternatives: np.load() with allow_pickle=False, df.to_parquet()/pd.read_parquet(), or HDF5.
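Step 3 (HMAC-signed pickle for internal IPC) can be sketched with the standard library. The key name and 32-byte tag layout are choices made for this example, not a prescribed wire format:

```python
import hashlib
import hmac
import pickle

SECRET_KEY = b"replace-with-a-real-secret"  # load from config/env in practice

def sign_pickle(obj) -> bytes:
    """Serialize obj and prepend an HMAC-SHA256 tag (32 bytes)."""
    payload = pickle.dumps(obj)
    tag = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    return tag + payload

def verified_loads(blob: bytes):
    """Verify the tag before unpickling; reject tampered payloads."""
    tag, payload = blob[:32], blob[32:]
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    # compare_digest is constant-time, avoiding timing side channels
    if not hmac.compare_digest(tag, expected):
        raise ValueError("pickle payload failed HMAC verification")
    return pickle.loads(payload)

blob = sign_pickle({"task": "resize", "size": (64, 64)})
print(verified_loads(blob))  # round-trips only if the tag checks out
```

The signature must be verified before `pickle.loads()` is ever reached; it protects integrity between trusted peers, but does nothing if the secret key leaks or if untrusted parties can obtain valid signatures.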
Detection Scope
How Code Pathfinder analyzes your code for this vulnerability
This rule detects calls to pickle.loads(), pickle.load(), pickle.Unpickler(), and equivalent methods from the pickle module (including cPickle). All call sites are flagged since the safety depends entirely on the trust level of the data source, which requires human review.
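The flagged call patterns look like this in code (illustrative; here `data` is locally produced, but the rule flags each site because only a reviewer can judge where the bytes actually come from):

```python
import io
import pickle

data = pickle.dumps([1, 2, 3])  # stands in for bytes from any source

obj1 = pickle.loads(data)                         # flagged: loads from bytes
obj2 = pickle.load(io.BytesIO(data))              # flagged: load from a file-like object
obj3 = pickle.Unpickler(io.BytesIO(data)).load()  # flagged: explicit Unpickler
assert obj1 == obj2 == obj3 == [1, 2, 3]
```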
Compliance & Standards
Industry frameworks and regulations that require detection of this vulnerability
References
External resources and documentation
Similar Rules
Explore related security rules for Python
PyYAML Unsafe Load Function
yaml.load() and yaml.unsafe_load() can execute arbitrary Python objects during YAML parsing. Use yaml.safe_load() instead.
jsonpickle Deserialization Detected
jsonpickle.decode() can execute arbitrary Python code during deserialization. Use the standard json module for untrusted data.
ruamel.yaml Unsafe Loader Configuration
ruamel.yaml configured with typ='unsafe' can instantiate arbitrary Python objects during YAML parsing. Use typ='safe' or the default round-trip loader.
Frequently Asked Questions
Common questions about Pickle Deserialization of Untrusted Data