Django Insecure Deserialization of Request Data

CRITICAL

User input flows to unsafe deserialization functions (pickle, yaml.load, dill, shelve), enabling arbitrary code execution during deserialization.

Rule Information

Language

Python

To run this rule against a project:

pathfinder scan --ruleset python/PYTHON-DJANGO-SEC-072 --project .

About This Rule

Understanding the vulnerability and how it is detected

This rule detects insecure deserialization vulnerabilities in Django applications where untrusted user input from HTTP requests flows into unsafe deserialization functions: pickle.loads(), yaml.load(), dill.loads(), shelve.open() on attacker-influenced shelf data, or similar functions that can execute code during deserialization.

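Python's pickle module, by design, can call arbitrary functions during deserialization. PyYAML's yaml.load() with an unsafe loader (yaml.Loader or yaml.UnsafeLoader, and yaml.FullLoader in PyYAML releases before 5.4) constructs arbitrary Python objects from tags such as !!python/object. Before PyYAML 5.1 this was the default behavior of yaml.load(); since PyYAML 6.0 the Loader argument must be passed explicitly. These are not bugs. They are features intended for trusted internal data, and they become critical vulnerabilities when applied to user-controlled data.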

An attacker who can supply a crafted pickle or YAML payload can execute arbitrary Python code on the server during the deserialization call, with the same privileges as the Django process. This is equivalent in severity to eval() or exec() injection.
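As a minimal sketch of the safer alternative, the function below parses a request body as JSON instead of unpickling it. The function name load_preferences and the "must be a JSON object" check are assumptions made for illustration; in a Django view the bytes would typically come from request.body.

```python
import json


def load_preferences(raw_body: bytes) -> dict:
    """Parse an untrusted request body as JSON (hypothetical helper).

    json.loads() only builds dicts, lists, strings, numbers, booleans
    and None, so parsing cannot trigger code execution.
    """
    try:
        data = json.loads(raw_body)
    except (UnicodeDecodeError, json.JSONDecodeError) as exc:
        raise ValueError("body is not valid JSON") from exc
    # Validate the shape explicitly; JSON being safe to parse does not
    # mean the content is what the application expects.
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    return data


print(load_preferences(b'{"theme": "dark", "page_size": 25}'))
```

Parsing is only the first step: the returned values are still user input and need the usual field-level validation before use.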

Security Implications

Potential attack scenarios if this vulnerability is exploited

Direct Remote Code Execution via Pickle

When pickle serializes an object, it records the callable and arguments returned by the object's __reduce__ method; when the data is unpickled, that callable is invoked. An attacker does not need a real object: a hand-crafted payload can name os.system(), subprocess.Popen(), or any other importable function, and it runs at deserialization time. This is one of the best-known RCE vectors in Python web applications.
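The mechanism can be seen with a deliberately harmless callable. In this sketch the class name Payload is hypothetical and the payload only calls the built-in len(); a real attack would name something like os.system instead.

```python
import pickle


class Payload:
    """Hypothetical object whose pickle form is 'call len() on this string'."""

    def __reduce__(self):
        # (callable, args): the unpickler will call callable(*args).
        return (len, ("attacker-chosen argument",))


blob = pickle.dumps(Payload())

# The receiving side never defined or asked for len(); the payload chose it.
result = pickle.loads(blob)
print(type(result).__name__, result)
```

Note that pickle.loads() does not return a Payload instance at all. It returns whatever the embedded call returned, which is why type checks performed after loading come too late.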

YAML Deserialization RCE via Python Tags

yaml.load() with an unsafe loader (yaml.Loader or yaml.UnsafeLoader, and yaml.FullLoader in PyYAML releases before 5.4) honors Python-specific tags such as !!python/object/apply:os.system ['whoami']. An attacker who controls the YAML input can therefore execute arbitrary OS commands at load time. yaml.safe_load() constructs only standard YAML types and rejects these tags.
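A minimal sketch of the safe path, assuming PyYAML is installed (it is a third-party package, so the import is guarded here). The wrapper name parse_untrusted_yaml is hypothetical; the only PyYAML calls used are yaml.safe_load() and the yaml.YAMLError base exception.

```python
try:
    import yaml  # PyYAML, a third-party package
except ImportError:  # keeps the sketch importable without PyYAML
    yaml = None


def parse_untrusted_yaml(text: str):
    """Parse user-supplied YAML with the safe loader (hypothetical helper)."""
    if yaml is None:
        raise RuntimeError("PyYAML is not installed")
    try:
        # safe_load builds only plain types; Python-specific tags raise
        # a ConstructorError, which is a subclass of yaml.YAMLError.
        return yaml.safe_load(text)
    except yaml.YAMLError as exc:
        raise ValueError("rejected YAML input") from exc


if yaml is not None:
    print(parse_untrusted_yaml("name: alice\nroles: [admin, dev]"))
```

With this wrapper, a document such as !!python/object/apply:os.system ['echo hi'] is rejected with ValueError instead of being executed.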

dill and shelve Deserialization

The dill library extends pickle to serialize even more kinds of Python objects, so dill.loads() on user input is at least as dangerous as pickle.loads(). shelve stores every value as a pickle, so opening a shelf file that a user supplied, or whose contents a user can influence, and then reading from it creates the same risk.
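The following sketch illustrates why shelve inherits pickle's risk: it writes one value through shelve, then reads the same file through the lower-level dbm module and shows that the stored bytes are an ordinary pickle. The file name demo_shelf is arbitrary, and the data unpickled here was written by the script itself.

```python
import dbm
import os
import pickle
import shelve
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo_shelf")

    # Write one value through the normal shelve API.
    with shelve.open(path) as shelf:
        shelf["profile"] = {"user": "alice"}

    # Read the same entry through dbm, bypassing shelve entirely.
    with dbm.open(path, "r") as raw:
        stored = raw[b"profile"]

# The raw bytes are a pickle; shelve runs this same step on every read.
raw_value = pickle.loads(stored)
print(raw_value)
```

Because every read from a shelf is an unpickle, a shelf file should be treated with the same trust requirements as a pickle file.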

Persistent Backdoor Installation

Deserialization RCE can be used to install persistent backdoors: writing malicious code to application directories, modifying the codebase, adding admin accounts, or establishing reverse shell connections that persist after the initial attack.

How to Fix

Recommended remediation steps

1. Replace pickle.loads(), dill.loads(), and similar calls with json.loads() for data exchange between systems.
2. When YAML is required, always use yaml.safe_load(), which restricts parsing to standard YAML types and does not execute Python constructors.
3. Never deserialize data from HTTP requests using pickle, dill, or yaml.load() with any Loader that supports Python object construction.
4. Store application state and user data in the database using Django's ORM, not as serialized Python objects.
5. If pickle must be used for caching or internal IPC, sign the serialized data with HMAC and verify the signature before deserializing to prevent tampering.
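The signing step for internal pickle data could look roughly like the sketch below. The key value, the function names sign and verify_and_load, and the "MAC followed by payload" layout are all assumptions for illustration; in a real deployment the key would come from secret storage, not source code.

```python
import hashlib
import hmac
import pickle

# Hypothetical key for illustration only.
SECRET_KEY = b"replace-with-a-real-secret"
MAC_SIZE = hashlib.sha256().digest_size  # 32 bytes


def sign(payload: bytes, key: bytes = SECRET_KEY) -> bytes:
    """Prefix the payload with an HMAC-SHA256 tag."""
    mac = hmac.new(key, payload, hashlib.sha256).digest()
    return mac + payload


def verify_and_load(blob: bytes, key: bytes = SECRET_KEY):
    """Unpickle only if the tag matches; otherwise refuse."""
    mac, payload = blob[:MAC_SIZE], blob[MAC_SIZE:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    # compare_digest avoids leaking the match position through timing.
    if len(mac) != MAC_SIZE or not hmac.compare_digest(mac, expected):
        raise ValueError("signature mismatch; refusing to unpickle")
    return pickle.loads(payload)


blob = sign(pickle.dumps({"cart": [1, 2, 3]}))
print(verify_and_load(blob))
```

The important property is ordering: the signature is checked before pickle.loads() ever sees the bytes. This protects integrity only. It does not make pickle suitable for data that users author.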

Detection Scope

How Code Pathfinder analyzes your code for this vulnerability

This rule performs inter-procedural taint analysis with global scope. Sources include calls("request.GET.get"), calls("request.POST.get"), calls("request.GET.__getitem__"), calls("request.POST.__getitem__"), calls("request.body"), and calls("request.read"). Sinks include calls("pickle.loads"), calls("pickle.load"), calls("dill.loads"), calls("dill.load"), calls("yaml.load"), calls("yaml.full_load"), and calls("yaml.unsafe_load") with tainted input tracked via .tracks(0). Sanitizers include calls("yaml.safe_load") and calls("json.loads") which parse data safely without code execution. The rule follows taint across file and module boundaries.
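To make the source-to-sink description concrete, here is a deliberately vulnerable sketch of the flow shape described above: request.body is read in one function and reaches pickle.loads() in another. The function names are hypothetical, and a SimpleNamespace stands in for Django's request object so the snippet runs without Django. Whether a given engine version reports this exact snippet is an assumption, not something verified here.

```python
import pickle
from types import SimpleNamespace


def restore_state(blob: bytes):
    # Sink: pickle.loads() receives a value that came from the request.
    return pickle.loads(blob)  # UNSAFE: shown only to illustrate the flow


def import_state_view(request):
    blob = request.body         # Source: attacker-controlled bytes
    return restore_state(blob)  # Taint crosses a function boundary


# Stand-in for an HTTP request carrying a benign pickle in its body.
fake_request = SimpleNamespace(body=pickle.dumps({"theme": "dark"}))
print(import_state_view(fake_request))
```

Replacing the call in restore_state() with json.loads() would route the data through one of the listed sanitizers instead of a sink.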

Compliance & Standards

Industry frameworks and regulations that require detection of this vulnerability

CWE Top 25

CWE-502 ranked in Most Dangerous Software Weaknesses list

OWASP Top 10

A08:2021 - Software and Data Integrity Failures (includes insecure deserialization)

PCI DSS v4.0

Requirement 6.2.4 - protect against all known attack types including deserialization

NIST SP 800-53

SI-10: Information Input Validation; SI-3: Malicious Code Protection

SOC 2 Type II

CC6.1 - Logical access controls; CC7.1 - System components monitored for threats

References

External resources and documentation

CWE-502: Deserialization of Untrusted Data
OWASP Deserialization Cheat Sheet
Python pickle security warning
PyYAML safe_load vs load
OWASP Software and Data Integrity Failures
Exploiting Python Deserialization Vulnerabilities

Similar Rules

Explore related security rules for Python

CRITICAL

Django Code Injection via eval()

User input flows to eval(), enabling arbitrary Python code execution on the server.

CRITICAL

Django Code Injection via exec()

User input flows to exec(), enabling arbitrary Python statement execution on the server.

Frequently Asked Questions

Common questions about Django Insecure Deserialization of Request Data

Why is pickle.loads() with user input always critical regardless of context?

Python's pickle documentation explicitly states: "The pickle module is not secure. Only unpickle data you trust." During deserialization, pickle.loads() invokes whatever callables the payload names, which is how __reduce__ results are reconstructed. A crafted payload can therefore call os.system(), exec(), or any other function. There is no safe way to call pickle.loads() on data from untrusted sources; the only fix is to use a different serialization format.

What is the difference between yaml.load() and yaml.safe_load()?

yaml.load() with an unsafe loader (yaml.Loader or yaml.UnsafeLoader, and yaml.FullLoader in PyYAML releases before 5.4) processes Python-specific YAML tags like !!python/object and !!python/object/apply that instantiate Python objects and call functions during loading. yaml.safe_load() restricts parsing to standard YAML types: mappings, sequences, strings, numbers, booleans, null. It raises a ConstructorError for any !!python/ tags, making it safe for user-controlled input.

Are there cases where pickle is safe to use in Django applications?

Pickle is safe for internal use: caching Python objects in Redis or Memcached where the data never leaves the application's trust boundary, deserializing objects that the application itself serialized in the same request or trusted internal pipeline, and scientific computing workflows where data provenance is controlled. The key requirement is that the serialized data must never be influenced by user input.

Can signing pickle data with HMAC make it safe to deserialize from users?

Only partially. HMAC signing provides integrity protection: it ensures the data was not modified after signing. If users can receive a signed pickle blob and send it back, HMAC prevents them from modifying the blob but does not prevent them from replaying it. More importantly, if users ever obtain the signing key or a signing oracle, the protection is lost. For external data exchange, JSON is always safer.

Is Django's built-in session framework vulnerable to this?

Not with default settings. Django serializes session data as JSON by default, which is safe. Older Django configurations may use pickle for session serialization (configured via SESSION_SERIALIZER); PickleSerializer was deprecated in Django 4.1 and removed in Django 5.0. If your settings include SESSION_SERIALIZER = 'django.contrib.sessions.serializers.PickleSerializer', change it to the JSON serializer immediately.

How does this vulnerability compare to SQL injection in severity?

Insecure deserialization via pickle is generally more severe than SQL injection. SQL injection is limited to database operations (unless the database has OS-level features). Pickle deserialization executes arbitrary Python code in the application process directly, with access to all imports, environment variables, the filesystem, and network. It is equivalent to eval() injection: direct Remote Code Execution.