# PYTHON-LANG-SEC-043: ruamel.yaml Unsafe Loader Configuration

> **Severity:** HIGH | **CWE:** CWE-502 | **OWASP:** A08:2021

- **Language:** Python
- **Category:** Python Core
- **URL:** https://codepathfinder.dev/registry/python/lang/PYTHON-LANG-SEC-043
- **Detection:** `pathfinder scan --ruleset python/PYTHON-LANG-SEC-043 --project .`

## Description

ruamel.yaml is a YAML parser that supports multiple loading modes via the typ parameter:
'rt' (round-trip, default), 'safe', 'base', 'unsafe', and 'full'. When configured with
typ='unsafe' or typ='full', ruamel.yaml enables Python-specific YAML tags (such as
!!python/object and !!python/object/apply) that can instantiate arbitrary Python classes
or call arbitrary functions during parsing.

Like PyYAML's unsafe loader, this creates a remote code execution vulnerability when
processing YAML from untrusted sources. An attacker who can control the YAML content can
craft a document that executes arbitrary Python code when the YAML object's load() method
is called.

The default typ='rt' (round-trip) loader does not support Python object instantiation and
is safe for parsing untrusted YAML content. Always use typ='safe' or the default when
processing external data.


## Vulnerable Code

```python
from ruamel.yaml import YAML

# SEC-043: ruamel.yaml configured with the unsafe loader
ym = YAML(typ="unsafe")
# Any later ym.load(...) on attacker-controlled YAML can execute arbitrary code.
```

## Secure Code

```python
import io

from ruamel.yaml import YAML

# INSECURE: ruamel.yaml with unsafe typ
# yaml = YAML(typ='unsafe')
# yaml = YAML(typ='full')
# data = yaml.load(user_input)

# SECURE: Use default round-trip loader (safe for standard YAML)
def parse_config(yaml_content: str) -> dict:
    yaml = YAML()  # default typ='rt' - does not instantiate Python objects
    data = yaml.load(io.StringIO(yaml_content))
    if not isinstance(data, dict):
        raise ValueError("Expected YAML mapping")
    return data

# SECURE: Explicitly use typ='safe' for maximum safety
def parse_user_yaml(yaml_content: str):
    yaml = YAML(typ='safe')
    return yaml.load(io.StringIO(yaml_content))
```

## Detection Rule (Python SDK)

```python
from rules.python_decorators import python_rule
from codepathfinder import calls, QueryType

class YamlModule(QueryType):
    fqns = ["yaml"]

class RuamelYamlModule(QueryType):
    fqns = ["ruamel.yaml"]


@python_rule(
    id="PYTHON-LANG-SEC-043",
    name="ruamel.yaml Unsafe Usage",
    severity="HIGH",
    category="lang",
    cwe="CWE-502",
    tags="python,ruamel,yaml,deserialization,rce,CWE-502",
    message="ruamel.yaml with unsafe typ detected. Use typ='safe' instead.",
    owasp="A08:2021",
)
def detect_ruamel_unsafe():
    """Detects ruamel.yaml YAML() with unsafe typ."""
    # Note: this predicate matches typ="unsafe" only; typ="full" is not covered here.
    return RuamelYamlModule.method("YAML").where("typ", "unsafe")
```
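
For a quick audit outside the scanner, the same idea can be approximated with the standard-library `ast` module. The sketch below is not the Code Pathfinder implementation: the function name `find_unsafe_yaml_calls` is hypothetical, it matches any call whose final name is `YAML` (so unrelated classes named `YAML` would also be flagged), and it only sees `typ` values written as string literals.

```python
import ast

# typ values this document treats as dangerous for untrusted input.
RISKY_TYPS = {"unsafe", "full"}


def find_unsafe_yaml_calls(source: str) -> list[tuple[int, str]]:
    """Return (line number, typ) for each YAML(typ=<risky literal>) call."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Accept both `YAML(...)` and `ruamel.yaml.YAML(...)` by the last name.
        if isinstance(func, ast.Name):
            name = func.id
        elif isinstance(func, ast.Attribute):
            name = func.attr
        else:
            continue
        if name != "YAML":
            continue
        for kw in node.keywords:
            value = kw.value
            if (
                kw.arg == "typ"
                and isinstance(value, ast.Constant)
                and isinstance(value.value, str)
                and value.value in RISKY_TYPS
            ):
                findings.append((node.lineno, value.value))
    return sorted(findings)
```

Because this is purely syntactic, a `typ` value passed through a variable or read from configuration is missed, which is the kind of case a type-aware, dataflow-based rule is better suited to.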

## How to Fix

- Replace YAML(typ='unsafe') with YAML(typ='safe') or YAML() (default round-trip) for all external YAML parsing.
- Never use typ='full' or typ='unsafe' for YAML content that could be influenced by external users.
- If Python object serialization is required for internal use, consider alternative approaches such as explicit JSON schemas or Protocol Buffers.
- Audit all ruamel.yaml usage in CI/CD tools, configuration loaders, and infrastructure scripts.
- Validate YAML structure and types after safe loading to ensure the input matches the expected schema.
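
The last point, validating structure after a safe load, could look roughly like the sketch below. The schema, its key names, and the `validate_config` helper are hypothetical and only illustrate the idea; real projects may prefer a dedicated validation library.

```python
from typing import Any

# Hypothetical schema for illustration: required key -> expected built-in type.
EXPECTED_SCHEMA = {"name": str, "replicas": int, "tags": list}


def validate_config(data: Any, schema: dict = EXPECTED_SCHEMA) -> dict:
    """Check that safely loaded YAML data is a mapping with the expected keys and types."""
    if not isinstance(data, dict):
        raise ValueError("expected a YAML mapping at the top level")
    unknown = set(data) - set(schema)
    if unknown:
        raise ValueError(f"unexpected keys: {sorted(map(str, unknown))}")
    for key, expected in schema.items():
        if key not in data:
            raise ValueError(f"missing key: {key}")
        value = data[key]
        # bool is a subclass of int, so reject it explicitly for non-bool fields.
        if isinstance(value, bool) and expected is not bool:
            raise ValueError(f"{key}: expected {expected.__name__}, got bool")
        if not isinstance(value, expected):
            raise ValueError(
                f"{key}: expected {expected.__name__}, got {type(value).__name__}"
            )
    return data
```

A caller would pass the result of a safe or round-trip `load()` to `validate_config` and treat `ValueError` as a rejected document.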

## Security Implications

- **Python Object Instantiation via YAML Tags:** With typ='unsafe', ruamel.yaml processes !!python/object and !!python/object/apply YAML
tags that import and instantiate arbitrary Python classes or call arbitrary functions. An attacker can craft YAML
with !!python/object/apply:os.system to execute system commands when the YAML is parsed.

- **Configuration File Attack Surface:** Infrastructure tools, DevOps scripts, and application configuration loaders that
use ruamel.yaml to process user-editable or network-sourced YAML files with typ='unsafe'
are vulnerable. This is especially common in Python-based configuration management tools.

- **CI/CD Pipeline Exploitation:** CI/CD tools written in Python that parse pipeline definition files or workflow
configurations using ruamel.yaml with unsafe mode allow users who can submit pipeline
files to execute code with the pipeline runner's privileges.

- **Infrastructure-as-Code Attacks:** Ansible, SaltStack, and similar Python-based IaC tools use YAML for playbooks and
configuration. If any component uses ruamel.yaml with unsafe mode to process playbooks,
submitted playbook content could execute code on the management node.


## FAQ

**Q: What is the difference between ruamel.yaml typ values?**

'rt' (round-trip, default): preserves comments and formatting, does not instantiate
Python objects. 'safe': equivalent to PyYAML's SafeLoader, only standard YAML types.
'base': minimal loader that resolves no types, so every scalar is loaded as a string.
'unsafe': enables all Python tags including !!python/object/apply. 'full': similar to
unsafe, supports full Python type set. Use 'safe' or 'rt' (default) for untrusted data.


**Q: Is ruamel.yaml's default mode safe?**

Yes. ruamel.yaml's default typ='rt' (round-trip) does not process Python-specific
YAML tags and is safe for parsing untrusted YAML. The vulnerability only occurs with
typ='unsafe' or typ='full'. If you're using YAML() without a typ argument, you are
using the round-trip loader and are not affected by this rule.


**Q: What is ruamel.yaml used for that requires the unsafe loader?**

The unsafe loader is intended for deserializing Python objects that were serialized
using ruamel.yaml's unsafe dumper, typically for Python-to-Python data exchange where
both ends are trusted. It is not intended for processing user-provided YAML or any
data from external sources.


**Q: How does ruamel.yaml compare to PyYAML for security?**

Both have equivalent YAML deserialization risks when configured with unsafe loaders.
ruamel.yaml's default mode is safer than old PyYAML (pre-5.1) since it does not
process Python tags by default. When explicitly configured with typ='unsafe', they
have the same RCE risk. Prefer ruamel.yaml's default mode or typ='safe' for all
external YAML parsing.


**Q: Can ruamel.yaml safe mode handle all standard YAML features?**

Yes. typ='safe' handles all standard YAML 1.1 and 1.2 features including anchors,
aliases, multi-document streams, and all standard scalar types (strings, integers,
floats, booleans, nulls, timestamps). The only restriction is Python-specific tags.


**Q: How do I serialize Python objects safely if I can't use typ='unsafe'?**

Implement explicit serialization using typ='safe' YAML with manual dict/list conversion.
Convert objects to plain dicts before YAML serialization, for example with
dataclasses.asdict() for dataclasses or .model_dump() for pydantic models. This produces
portable YAML that can be loaded with any safe YAML parser without Python-specific tag
dependencies.
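
A minimal sketch of the conversion step, using only the standard library. The `ServiceConfig` type and the `to_plain` / `from_plain` helpers are hypothetical names chosen for illustration.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class ServiceConfig:  # hypothetical example type
    name: str
    replicas: int
    tags: list = field(default_factory=list)


def to_plain(cfg: ServiceConfig) -> dict:
    """Convert to a dict of built-in types that a safe YAML dumper can emit untagged."""
    return asdict(cfg)


def from_plain(data: dict) -> ServiceConfig:
    """Rebuild the object explicitly from safely loaded data, coercing each field."""
    return ServiceConfig(
        name=str(data["name"]),
        replicas=int(data["replicas"]),
        tags=[str(tag) for tag in data.get("tags", [])],
    )
```

The dict returned by `to_plain` can be handed to `YAML(typ='safe').dump(...)`, and the result of a safe `load()` can be passed to `from_plain`, so no Python-specific tags appear in the YAML at any point.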


## References

- [CWE-502: Deserialization of Untrusted Data](https://cwe.mitre.org/data/definitions/502.html)
- [ruamel.yaml documentation](https://yaml.readthedocs.io/docs/)
- [OWASP Deserialization Cheat Sheet](https://cheatsheetseries.owasp.org/cheatsheets/Deserialization_Cheat_Sheet.html)
- [OWASP Top 10 A08:2021 Software and Data Integrity Failures](https://owasp.org/Top10/A08_2021-Software_and_Data_Integrity_Failures/)

