shelve Module Usage Detected

MEDIUM

shelve.open() uses pickle internally for value serialization and is not safe for storing or retrieving data from untrusted sources.

Rule Information

Language

Python

Scan Command

pathfinder scan --ruleset python/PYTHON-LANG-SEC-045 --project .

About This Rule

Understanding the vulnerability and how it is detected

Python's shelve module provides a persistent dictionary interface backed by a dbm file. Values stored in a shelve database are serialized using pickle when written and deserialized using pickle when read. This means that reading values from a shelve database is equivalent to calling pickle.loads() on those values.
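The pickle dependency can be observed directly. The following is a minimal sketch, assuming a scratch directory and the hypothetical file name `demo_shelf`: it writes one value through shelve, then reads the same entry back through dbm and unpickles it by hand.

```python
import dbm
import os
import pickle
import shelve
import tempfile

# Hypothetical scratch location; any writable path would do.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "demo_shelf")

# Write one value through the normal shelve API.
with shelve.open(path) as shelf:
    shelf["user"] = {"name": "alice", "roles": ["admin"]}

# Reopen the same database with dbm, bypassing shelve entirely.
# shelve encodes string keys as UTF-8 bytes by default.
with dbm.open(path, "r") as raw:
    stored_bytes = raw[b"user"]

# The stored bytes are a pickle stream; unpickling them by hand
# yields the value shelve would have returned.
recovered = pickle.loads(stored_bytes)
print(recovered)
```

The value on disk is nothing more than pickle output, so whatever pickle would do with those bytes, a shelf read does too.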

If an attacker can write data to the shelve database file (via file upload, directory traversal, shared filesystem access, or any other means), they can cause arbitrary code execution the next time the application opens the database with shelve.open() and reads an affected value.

Shelve databases are also not portable between Python versions or platforms due to their reliance on pickle and dbm. For persistent data storage, use SQLite, JSON files, or a proper database engine.

Security Implications

Potential attack scenarios if this vulnerability is exploited

Pickle-based Deserialization Risk

Every shelf[key] read unpickles the stored value, which is equivalent to calling pickle.loads() on it. An attacker who can write to the shelve database file can plant malicious pickle payloads that execute arbitrary code when read. The innocent-looking dictionary-style access hides the underlying pickle deserialization.
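A deliberately harmless sketch of this scenario follows. The class name `HarmlessPayload`, the file name `app_shelf`, and the use of `eval("6 * 7")` as the stand-in "attack" are all illustrative assumptions; a real payload could name any importable callable.

```python
import dbm
import os
import pickle
import shelve
import tempfile


class HarmlessPayload:
    """Illustrative stand-in for a malicious object."""

    def __reduce__(self):
        # pickle records "call eval('6 * 7')" instead of object state.
        return (eval, ("6 * 7",))


workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "app_shelf")

# The application stores an ordinary string.
with shelve.open(path) as shelf:
    shelf["greeting"] = "hello"

# Someone with write access to the file replaces the entry,
# going through dbm so that shelve is never involved.
with dbm.open(path, "w") as raw:
    raw[b"greeting"] = pickle.dumps(HarmlessPayload())

# The application later performs a normal-looking read.
with shelve.open(path, flag="r") as shelf:
    value = shelf["greeting"]

print(value)  # 42: the callable named in the file ran during the read
```

The application code never mentions pickle or eval, yet the read invoked a callable chosen by whoever last wrote the file.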

Shared Filesystem Attack

Applications using shelve on shared filesystems (NFS, container-shared volumes, cloud storage) are vulnerable if other tenants or processes with filesystem access can modify the database files. The attacker replaces a legitimate entry with a malicious pickle payload.

Backup and Restore Injection

Restoring a shelve database from an attacker-controlled backup source places attacker-chosen pickle payloads in the database. They are deserialized as soon as the application reads the restored values, enabling code execution through the restore process.

Unpredictable File Format

shelve relies on dbm, which selects among several backends (dbm.gnu, dbm.ndbm, dbm.dumb, and others depending on the Python build and version), each with its own file extensions and on-disk format. A database created on one platform or Python version may fail to open, or be misread, on another, which makes shelve unsuitable for reliable production use.
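To see which backend a given interpreter picked, dbm.whichdb() can be pointed at an existing shelf. This is a small sketch with a hypothetical file name (`probe_shelf`); the backend name and the files created will differ between machines, which is the point.

```python
import dbm
import os
import shelve
import tempfile

workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "probe_shelf")

# Create a shelf with whatever backend this interpreter defaults to.
with shelve.open(path) as shelf:
    shelf["k"] = "v"

# whichdb() reports the dbm module that can open the database.
backend = dbm.whichdb(path)

# The files on disk (names and extensions) depend on that backend.
files = sorted(os.listdir(workdir))

print(backend, files)
```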

How to Fix

Recommended remediation steps

1. Replace shelve with SQLite + JSON for persistent key-value storage, which is portable, safe, and version-independent.
2. If shelve must be used, ensure the database files are stored on a filesystem accessible only to the application process with no external write access.
3. Never restore shelve database files from untrusted or external backup sources without treating the restore as equivalent to executing arbitrary code.
4. Consider using a proper database (SQLite, PostgreSQL) for production data storage instead of file-based shelve.
5. For simple configuration persistence, use JSON files with appropriate filesystem permissions.
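As one possible shape for the first step, here is a minimal sketch of a dict-like store built on sqlite3 and json. The class name `JsonStore` and the table name `kv` are hypothetical, and the sketch assumes values are JSON-representable (tuples come back as lists, and dictionary keys become strings).

```python
import json
import sqlite3


class JsonStore:
    """Hypothetical dict-like store: SQLite for persistence, JSON for values."""

    def __init__(self, path):
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS kv ("
            "key TEXT PRIMARY KEY, value TEXT NOT NULL)"
        )
        self._conn.commit()

    def __setitem__(self, key, value):
        # json.dumps raises TypeError for objects JSON cannot represent.
        payload = json.dumps(value)
        with self._conn:  # commits on success, rolls back on error
            self._conn.execute(
                "INSERT OR REPLACE INTO kv (key, value) VALUES (?, ?)",
                (key, payload),
            )

    def __getitem__(self, key):
        row = self._conn.execute(
            "SELECT value FROM kv WHERE key = ?", (key,)
        ).fetchone()
        if row is None:
            raise KeyError(key)
        # json.loads only builds plain data; it never calls arbitrary code.
        return json.loads(row[0])

    def close(self):
        self._conn.close()


# ":memory:" keeps the example self-contained; pass a file path to persist.
store = JsonStore(":memory:")
store["settings"] = {"theme": "dark", "retries": 3}
loaded = store["settings"]
print(loaded)
```

Reading from this store parses JSON text, so a tampered file can produce wrong data but not code execution through deserialization.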

Detection Scope

How Code Pathfinder analyzes your code for this vulnerability

This rule detects calls to shelve.open() from the Python shelve module. All call sites are flagged since shelve uses pickle for all value serialization, and any shelve database that can be written to by untrusted parties enables code execution through subsequent reads.

Compliance & Standards

Industry frameworks and regulations that require detection of this vulnerability

CWE Top 25

CWE-502 - Deserialization of Untrusted Data

OWASP Top 10

A08:2021 - Software and Data Integrity Failures

NIST SP 800-53

SI-10: Information Input Validation

PCI DSS v4.0

Requirement 6.2.4 - Protect against deserialization attacks

References

External resources and documentation

CWE-502: Deserialization of Untrusted Data
Python docs: shelve module
Python docs: pickle security warning
OWASP Deserialization Cheat Sheet
OWASP Top 10 A08:2021 Software and Data Integrity Failures

Similar Rules

Explore related security rules for Python

HIGH

Pickle Deserialization of Untrusted Data

pickle.loads() and pickle.load() execute arbitrary Python code during deserialization. Never unpickle data from untrusted sources.

MEDIUM

marshal Deserialization Detected

marshal.loads() and marshal.load() are not secure against erroneous or malicious data and should not be used to deserialize untrusted input.

HIGH

dill Deserialization Detected

dill.loads() and dill.load() extend pickle with broader serialization capabilities and can execute arbitrary code when deserializing untrusted data.

Frequently Asked Questions

Common questions about shelve Module Usage Detected

When is it safe to use shelve?

shelve is safe when the database file is: (1) only written by trusted application code, (2) stored on a filesystem with strict permissions preventing external writes, (3) never restored from untrusted backup sources, and (4) not used to store data derived from external input. In practice, these constraints are difficult to guarantee reliably, making SQLite + JSON a safer alternative for most use cases.
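Condition (2) can be approximated on POSIX systems by keeping the shelf in a directory that only the owning user can enter. The sketch below uses hypothetical names (`shelf_data`, `state`) and addresses that one condition only; it does nothing about backups or other processes running as the same user.

```python
import os
import shelve
import stat
import tempfile

# Hypothetical application data directory.
base = tempfile.mkdtemp()
data_dir = os.path.join(base, "shelf_data")

# Create the directory, then set the mode explicitly, because the
# mode passed to makedirs() is filtered by the process umask.
os.makedirs(data_dir, mode=0o700, exist_ok=True)
os.chmod(data_dir, 0o700)

# Restricting the directory covers every file the dbm backend creates,
# whatever names and extensions it uses.
with shelve.open(os.path.join(data_dir, "state")) as shelf:
    shelf["counter"] = 1

dir_mode = stat.S_IMODE(os.stat(data_dir).st_mode)
print(oct(dir_mode))
```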

How does shelve differ from a regular dict?

shelve provides persistence: data is stored to disk and survives process restarts. It has the same interface as a dict but is backed by a dbm file. The key difference from a security perspective is that reading from shelve implicitly unpickles the stored value, making every read a potential deserialization attack vector if the storage is compromised.

Can I use shelve for web session storage?

No. Web session storage requires concurrent access safety, proper expiry, and protection from session fixation. shelve provides none of these. Use a proper session backend such as Redis, memcached, or database-backed sessions through your web framework, configured so that session data is not deserialized with pickle.

Is using dbm directly safer than shelve?

dbm.open() stores raw bytes and does not use pickle, so it is safer than shelve. However, dbm still has the same portability issues and limited functionality. For production use, SQLite via the sqlite3 module provides a more robust and portable alternative to both shelve and raw dbm.
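For illustration, a raw dbm store can hold JSON-encoded bytes, which keeps pickle out of the picture entirely. This is a sketch with a hypothetical file name (`raw_store`); the portability caveats about dbm backends still apply.

```python
import dbm
import json
import os
import tempfile

workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "raw_store")

# dbm stores bytes, so the value is serialized explicitly as JSON text.
with dbm.open(path, "c") as db:
    db[b"profile"] = json.dumps({"name": "alice", "visits": 3}).encode("utf-8")

# Reading returns the same bytes; json.loads only builds plain data.
with dbm.open(path, "r") as db:
    profile = json.loads(db[b"profile"].decode("utf-8"))

print(profile)
```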

How does shelve compare to SQLite on performance?

For single-threaded small datasets, shelve and SQLite have comparable performance. SQLite has advantages for concurrent access, complex queries, transactions, and larger datasets. The security benefits of SQLite + JSON (no pickle deserialization, portable format, explicit schema) justify any minor performance tradeoff for most applications.

How do I migrate from shelve to SQLite?

Open the shelve database, iterate over all keys, and insert each key-value pair into SQLite with json.dumps() for serialization. Because reading the shelf unpickles every value, run the migration script only in a controlled environment where the shelve database is known to be untampered. After migration, delete the shelve database files and update all code to use SQLite.
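The migration might look roughly like the sketch below. The function name `migrate_shelf_to_sqlite`, the table name `kv`, and the sample data are hypothetical, and the sketch assumes every stored value is JSON-representable; anything else makes json.dumps() raise TypeError and rolls the transaction back.

```python
import json
import os
import shelve
import sqlite3
import tempfile


def migrate_shelf_to_sqlite(shelf_path, sqlite_path):
    """Copy every shelf entry into a SQLite table; return the row count."""
    conn = sqlite3.connect(sqlite_path)
    try:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS kv ("
            "key TEXT PRIMARY KEY, value TEXT NOT NULL)"
        )
        count = 0
        # Reading the shelf unpickles each value, so the source file
        # must be trusted. "with conn" commits once at the end.
        with shelve.open(shelf_path, flag="r") as shelf, conn:
            for key in shelf:
                conn.execute(
                    "INSERT OR REPLACE INTO kv (key, value) VALUES (?, ?)",
                    (key, json.dumps(shelf[key])),
                )
                count += 1
        return count
    finally:
        conn.close()


# Build a small sample shelf to migrate (illustrative data only).
workdir = tempfile.mkdtemp()
shelf_path = os.path.join(workdir, "old_shelf")
sqlite_path = os.path.join(workdir, "new_store.sqlite3")

with shelve.open(shelf_path) as shelf:
    shelf["alice"] = {"visits": 3}
    shelf["bob"] = {"visits": 1}

migrated = migrate_shelf_to_sqlite(shelf_path, sqlite_path)

# Read the rows back to confirm the copy.
conn = sqlite3.connect(sqlite_path)
rows = {k: json.loads(v) for k, v in conn.execute("SELECT key, value FROM kv")}
conn.close()

print(migrated, rows)
```

Deleting the old shelf files is left out on purpose, since the file names depend on the dbm backend in use.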
