PyYAML Unsafe Load Function

Severity: HIGH

yaml.load() and yaml.unsafe_load() can instantiate arbitrary Python objects and invoke arbitrary callables during YAML parsing. Use yaml.safe_load() instead.

Rule Information

Language: Python
Category: Python Core
Author: Shivasurya
Last Updated: 2026-03-22
Tags: python, yaml, pyyaml, unsafe-load, deserialization, CWE-502, OWASP-A08

Interactive Playground

Experiment with the vulnerable code and security rule below. Edit the code to see how the rule detects different vulnerability patterns.

pathfinder scan --ruleset python/PYTHON-LANG-SEC-041 --project .

About This Rule

Understanding the vulnerability and how it is detected

PyYAML's yaml.load() function, when called without an explicit Loader argument or with Loader=yaml.Loader/yaml.UnsafeLoader, can instantiate arbitrary Python objects during parsing using YAML's !!python/object and !!python/object/apply tags. This enables remote code execution when processing YAML from untrusted sources.

The vulnerability is triggered by YAML content such as !!python/object/apply:os.system ["id"], or by more sophisticated payloads built on subprocess or socket. PyYAML versions before 5.1 used the unsafe loader by default; from 5.1 a YAMLLoadWarning is issued unless an explicit Loader is provided, and from 6.0 the Loader argument is mandatory.
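The mechanism can be observed with a deliberately harmless callable. The sketch below is illustrative only: it substitutes os.getcwd for os.system, so nothing happens beyond returning the current directory, and it assumes PyYAML 5.1 or later, where yaml.UnsafeLoader is available.

```python
import os

import yaml

# Harmless stand-in for a malicious payload: the tag names a callable and
# the (empty) sequence supplies its positional arguments.
payload = "!!python/object/apply:os.getcwd []"

# UnsafeLoader resolves the tag, imports os, and calls os.getcwd()
# while the document is still being parsed.
result = yaml.load(payload, Loader=yaml.UnsafeLoader)

# The parsed "value" is whatever the callable returned.
print(result == os.getcwd())
```

Swapping the callable name and its arguments is all it takes to turn this into the os.system payload quoted above, which is why the loader choice matters more than any filtering of the input.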

The safe alternative is yaml.safe_load() or yaml.load(data, Loader=yaml.SafeLoader), which only processes YAML scalars, sequences, and mappings without instantiating Python objects.
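As a minimal sketch of that behavior, the snippet below feeds the same payload shape to yaml.safe_load() and shows that it is rejected, while ordinary data loads normally. The payload uses os.getcwd as a harmless stand-in, and the sample configuration keys are made up for illustration.

```python
import yaml

# Same shape as an attack payload, with a harmless callable as a stand-in.
payload = "!!python/object/apply:os.getcwd []"

try:
    yaml.safe_load(payload)
    rejected = False
except yaml.constructor.ConstructorError:
    # SafeLoader has no constructor registered for python/* tags,
    # so the document is refused instead of being executed.
    rejected = True

# Plain scalars, sequences, and mappings load as ordinary Python types.
config = yaml.safe_load("name: demo\nports: [80, 443]\ndebug: false")

print(rejected, config)
```

Catching yaml.YAMLError instead would also work, since ConstructorError derives from it; the narrower exception is used here to make the reason for the rejection explicit.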

Security Implications

Potential attack scenarios if this vulnerability is exploited

1. Remote Code Execution via !!python Tags

The !!python/object/apply tag in YAML invokes arbitrary Python callables. An attacker who can control YAML input can execute os.system(), subprocess.Popen(), or any other callable, achieving full RCE with a single YAML document.

2. Configuration File Injection

Applications that load YAML configuration files and process them with yaml.load() are vulnerable if an attacker can modify the configuration file, inject content through environment variable expansion, or write to the configuration directory.

3. API and Webhook Payload Injection

REST APIs, CI/CD pipelines, and infrastructure-as-code tools that accept YAML input from users and parse it with yaml.load() are directly exploitable. This is a common vector in DevOps tooling.

4. Kubernetes and Helm Chart Injection

Tools that process Kubernetes manifests or Helm chart values using PyYAML's unsafe loader can be exploited through crafted chart values or manifest files submitted by unprivileged users.

How to Fix

Recommended remediation steps

  1. Replace all yaml.load() calls with yaml.safe_load() or yaml.load(data, Loader=yaml.SafeLoader).
  2. Never use yaml.unsafe_load() or yaml.load() with yaml.Loader/yaml.UnsafeLoader on external input.
  3. If custom Python objects must be serialized to YAML, use explicit schema validation rather than relying on YAML's !!python tags.
  4. Audit all YAML parsing in CI/CD pipelines, configuration loaders, and API endpoints that accept YAML input.
  5. Consider restricting YAML features to a safe subset (scalars, sequences, mappings) by using yaml.safe_load() universally.
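One way to combine the first and third steps is to parse with the safe loader and then build the application object from validated plain data. The sketch below is an assumption-laden illustration: ServerConfig, load_server_config, and the host/port fields are hypothetical names, not part of PyYAML or of this rule.

```python
from dataclasses import dataclass

import yaml


@dataclass
class ServerConfig:  # hypothetical schema, for illustration only
    host: str
    port: int


def load_server_config(text: str) -> ServerConfig:
    """Parse YAML with the safe loader, then validate shape and types."""
    data = yaml.safe_load(text)
    if not isinstance(data, dict):
        raise ValueError("expected a mapping at the top level")
    unknown = set(data) - {"host", "port"}
    if unknown:
        raise ValueError(f"unexpected keys: {sorted(map(str, unknown))}")
    host, port = data.get("host"), data.get("port")
    if not isinstance(host, str):
        raise ValueError("host must be a string")
    # bool is a subclass of int, so exclude it explicitly.
    if not isinstance(port, int) or isinstance(port, bool):
        raise ValueError("port must be an integer")
    return ServerConfig(host=host, port=port)


cfg = load_server_config("host: example.internal\nport: 8080")
print(cfg)
```

The object is constructed by application code from checked values, so the YAML document never decides which class or callable gets used.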

Detection Scope

How Code Pathfinder analyzes your code for this vulnerability

This rule detects calls to yaml.load() without an explicit Loader argument or with unsafe Loader values, and calls to yaml.unsafe_load(). The rule flags these patterns as they enable Python object instantiation during YAML parsing.
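To make the flagged patterns concrete, here is a rough approximation of such a check using Python's standard ast module. It is not Code Pathfinder's implementation or rule syntax; find_unsafe_yaml_calls and the set of unsafe loader names are assumptions made for this sketch, and it only recognizes calls written literally as yaml.load(...) or yaml.unsafe_load(...).

```python
import ast

# Loader names treated as unsafe in this sketch (an assumption).
UNSAFE_LOADERS = {"yaml.Loader", "yaml.UnsafeLoader"}


def _dotted_name(node):
    """Return 'a.b' for a simple attribute access such as a.b, else None."""
    if isinstance(node, ast.Attribute) and isinstance(node.value, ast.Name):
        return f"{node.value.id}.{node.attr}"
    return None


def find_unsafe_yaml_calls(source: str):
    """Return sorted (line, reason) pairs for risky yaml calls in source."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = _dotted_name(node.func)
        if func == "yaml.unsafe_load":
            findings.append((node.lineno, "yaml.unsafe_load()"))
        elif func == "yaml.load":
            # Loader may be a keyword or the second positional argument.
            loader = next(
                (kw.value for kw in node.keywords if kw.arg == "Loader"), None
            )
            if loader is None and len(node.args) >= 2:
                loader = node.args[1]
            if loader is None:
                findings.append((node.lineno, "yaml.load() without Loader"))
            elif _dotted_name(loader) in UNSAFE_LOADERS:
                findings.append((node.lineno, "yaml.load() with unsafe Loader"))
    return sorted(findings)


sample = """\
import yaml
a = yaml.load(data)
b = yaml.load(data, Loader=yaml.Loader)
c = yaml.load(data, Loader=yaml.SafeLoader)
d = yaml.unsafe_load(data)
e = yaml.safe_load(data)
"""

findings = find_unsafe_yaml_calls(sample)
for line, reason in findings:
    print(line, reason)
```

A purely syntactic check like this misses aliased imports (for example from yaml import load) and loaders passed through variables, which is the kind of gap a dedicated analyzer is meant to close.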

Compliance & Standards

Industry frameworks and regulations that require detection of this vulnerability

CWE Top 25: CWE-502 - Deserialization of Untrusted Data
OWASP Top 10: A08:2021 - Software and Data Integrity Failures
NIST SP 800-53: SI-10: Information Input Validation
PCI DSS v4.0: Requirement 6.2.4 - Protect against deserialization attacks

Frequently Asked Questions

Common questions about PyYAML Unsafe Load Function

yaml.safe_load() uses SafeLoader, which supports only standard YAML tags and built-in Python types (str, int, float, list, dict, None, bool, datetime). yaml.load() with yaml.Loader uses the unrestricted loader, which supports the !!python/object and !!python/object/apply tags and therefore allows arbitrary Python objects to be instantiated during parsing.
yaml.BaseLoader loads all values as strings without interpreting any YAML tags. It is safe but does not perform type coercion (numbers remain strings). yaml.SafeLoader is usually the right choice as it handles type coercion for standard YAML types while blocking Python-specific tags.
A simple payload: !!python/object/apply:os.system ["id"]. More sophisticated payloads use subprocess.Popen with encoded commands, or construct chains through __reduce__ methods. The !! prefix denotes a YAML tag, and python/object/apply invokes the specified callable with the given arguments during parsing.
From PyYAML 5.1, calling yaml.load() without an explicit Loader argument emits a YAMLLoadWarning, and from PyYAML 6.0 the Loader argument is required. However, the warning is often ignored or suppressed in practice. The rule flags any yaml.load() call that does not explicitly pass a safe loader, whether or not a Loader argument is present, since the absence of an explicit SafeLoader indicates potential risk.
ruamel.yaml has its own unsafe loading behavior when configured with typ='unsafe'. See PYTHON-LANG-SEC-043 for ruamel.yaml-specific guidance. When using ruamel.yaml, always use the default round-trip loader or the safe loader (typ='safe').
yaml.CSafeLoader is the C-accelerated version of SafeLoader; it is equally safe and faster, but it exists only when PyYAML is built with the libyaml C extension. yaml.safe_load() always uses the pure-Python SafeLoader, so pass the loader explicitly, as in yaml.load(data, Loader=yaml.CSafeLoader), when parsing performance matters in production.
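The loader differences described in these answers can be compared side by side. This is a small sketch with a made-up two-key document; FastSafeLoader is a hypothetical local name, and the getattr fallback assumes that yaml.CSafeLoader is simply absent when the libyaml bindings are not installed.

```python
import yaml

doc = "port: 8080\nenabled: true"

# BaseLoader ignores type resolution: every scalar stays a string.
as_strings = yaml.load(doc, Loader=yaml.BaseLoader)

# SafeLoader resolves standard YAML types but rejects python/* tags.
typed = yaml.safe_load(doc)

# Prefer the C-accelerated safe loader when present, otherwise fall back
# to the pure-Python SafeLoader.
FastSafeLoader = getattr(yaml, "CSafeLoader", yaml.SafeLoader)
fast = yaml.load(doc, Loader=FastSafeLoader)

print(as_strings, typed, fast == typed)
```

Both safe loaders produce the same values for standard YAML, so the fallback changes parsing speed rather than results.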
