Insecure MD5 Hash Usage

MEDIUM

MD5 is cryptographically broken and unsuitable for security-sensitive purposes. Use SHA-256 or SHA-3 instead.

Rule Information

Language

Python

pathfinder scan --ruleset python/PYTHON-LANG-SEC-030 --project .

About This Rule

Understanding the vulnerability and how it is detected

MD5 (Message Digest Algorithm 5) was once widely used for cryptographic hashing but is now considered cryptographically broken. Practical collision attacks against MD5 were demonstrated in 2004, and later chosen-prefix collision attacks, in which the two colliding inputs start with different attacker-chosen content, are feasible on commodity hardware. MD5 should not be used for any security-sensitive purpose.

MD5 is broken for: digital signatures (a signature obtained on one document is also valid for a colliding document), certificate fingerprinting (an attacker can craft two certificates that share the same fingerprint), password storage (rainbow tables and GPU cracking recover many MD5-hashed passwords in seconds), and file integrity verification when the attacker can choose file content.

MD5 remains suitable for non-security purposes such as checksums to detect accidental corruption (not adversarial modification), content-addressed storage keys where collision resistance is not required, and hash table keys. However, it must never be used where an adversary could craft colliding inputs.

Security Implications

Potential attack scenarios if this vulnerability is exploited

Collision Attacks

An attacker can generate two different inputs with the same MD5 hash in seconds to minutes on modern hardware. This allows forging digital signatures, creating malicious files that match expected checksums, and bypassing integrity checks.

Password Cracking

MD5 is extremely fast (billions of hashes per second on GPUs) and provides no built-in salting or key stretching. Unsalted MD5 password hashes fall quickly to rainbow tables, and even salted ones fall to dictionary or brute-force attacks.

Certificate Forgery

MD5 collisions have been used to forge X.509 certificates and create rogue CA certificates. Applications using MD5 for certificate fingerprinting or verification can be deceived by crafted certificates.

Integrity Check Bypass

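File or message integrity checks using MD5 can be bypassed by an attacker who can influence the content being hashed. The attacker prepares a legitimate-looking file and a malicious file that share the same MD5 hash, gets the legitimate one accepted, and later substitutes the malicious one without changing the checksum.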

How to Fix

Recommended remediation steps

1. Replace hashlib.md5() with hashlib.sha256() or hashlib.sha3_256() for all security-sensitive hashing operations.
2. For password hashing specifically, use hashlib.pbkdf2_hmac(), bcrypt, scrypt, or argon2; never use bare MD5 or any other fast hash.
3. For file integrity verification against adversarial modification, use SHA-256 or SHA-512.
4. Reserve MD5 only for non-security checksums where collision resistance is not a security requirement, and document why it is acceptable.
5. Audit all places where MD5 digests are compared to expected values to determine if they are security-sensitive.
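Steps 1 and 3 might look like the following minimal sketch. The helper names sha256_file and verify_file and the 64 KiB chunk size are illustrative assumptions, not part of the rule or of hashlib.

```python
import hashlib
import hmac


def sha256_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read until f.read() returns an empty bytes object (end of file).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_file(path: str, expected_hex: str) -> bool:
    """Compare the file's digest to an expected value in constant time."""
    return hmac.compare_digest(sha256_file(path), expected_hex.lower())
```

Reading in chunks keeps memory use flat for large files, and hmac.compare_digest avoids leaking how many leading characters matched.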

Detection Scope

How Code Pathfinder analyzes your code for this vulnerability

This rule detects calls to hashlib.md5() in Python source code. Every call site is flagged, because deciding whether a given use is security-sensitive requires human review. Non-security uses may be suppressed, with documentation explaining why the use is acceptable.
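To make the idea of flagging every call site concrete, here is a rough sketch using Python's standard ast module. It is not how Code Pathfinder is implemented. The function find_md5_calls is hypothetical, and it only matches the literal form hashlib.md5(...), so aliased imports such as "from hashlib import md5" would be missed.

```python
import ast

# Sample source to analyze; it is parsed, never executed.
SOURCE = '''
import hashlib

token = hashlib.md5(b"session-data").hexdigest()
etag = hashlib.md5(b"body", usedforsecurity=False).hexdigest()
safe = hashlib.sha256(b"session-data").hexdigest()
'''


def find_md5_calls(source: str) -> list[int]:
    """Return the line numbers of direct hashlib.md5(...) calls."""
    lines = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == "md5"
            and isinstance(node.func.value, ast.Name)
            and node.func.value.id == "hashlib"
        ):
            lines.append(node.lineno)
    return sorted(lines)


flagged_lines = find_md5_calls(SOURCE)
```

In this sketch both MD5 lines are reported, including the one that passes usedforsecurity=False, which mirrors the point that the decision about security sensitivity is left to a human reviewer.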

Compliance & Standards

Industry frameworks and regulations that require detection of this vulnerability

NIST SP 800-131A

Revision 2: MD5 is disallowed for digital signatures and key derivation

OWASP Top 10

A02:2021 - Cryptographic Failures

PCI DSS v4.0

Requirement 4.2.1 - Use strong cryptography for data in transit

FIPS 140-3

MD5 is not an approved hash algorithm for security applications

References

External resources and documentation

CWE-327: Use of a Broken or Risky Cryptographic Algorithm
Python docs: hashlib module
MD5 Collision Attacks - Wang et al. 2004
OWASP Cryptographic Storage Cheat Sheet
NIST SP 800-107 Revision 1 - Recommendation for Applications Using Approved Hash Algorithms

Similar Rules

Explore related security rules for Python

MEDIUM

Insecure SHA-1 Hash Usage

SHA-1 is cryptographically weak due to practical collision attacks. Use SHA-256 or SHA-3 for security-sensitive hashing.

MEDIUM

Insecure Hash via hashlib.new()

hashlib.new() with an insecure algorithm name (MD5, SHA1, SHA-224) creates a cryptographically weak hash. Use SHA-256 or SHA-3.

LOW

SHA-224 or SHA3-224 Weak Hash Usage

SHA-224 and SHA3-224 provide only 112-bit collision resistance, which is below the 128-bit minimum recommended by NIST for new applications.

Frequently Asked Questions

Common questions about Insecure MD5 Hash Usage

Is MD5 completely forbidden or are there acceptable uses?

MD5 is broken for security purposes: passwords, digital signatures, certificate fingerprints, and integrity verification against adversarial attacks. It remains acceptable for non-security checksums (detecting accidental file corruption), content-addressed storage keys (deduplication), and hash table keys where an attacker cannot control input and collision resistance is not security-critical.

What should I use instead of MD5 for password hashing?

Never use bare MD5 (or any other general-purpose hash) for password hashing. Use a dedicated password hashing function: Argon2 via argon2-cffi (recommended), bcrypt, or hashlib.scrypt(). These functions are deliberately slow, and Argon2 and scrypt are also memory-intensive, which makes GPU-based brute-force attacks far more expensive.
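As one possible illustration using only the standard library, the sketch below hashes and verifies a password with hashlib.scrypt(). The function names and the cost parameters (n, r, p, dklen) are assumptions chosen for the example and should be tuned for the target hardware.

```python
import hashlib
import hmac
import os

# Illustrative cost parameters (an assumption, not a recommendation from the rule).
SCRYPT_PARAMS = {"n": 2**14, "r": 8, "p": 1, "dklen": 32}


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, derived_key) for storage; the salt is random per password."""
    salt = os.urandom(16)
    key = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return salt, key


def verify_password(password: str, salt: bytes, expected_key: bytes) -> bool:
    """Re-derive the key with the stored salt and compare in constant time."""
    key = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return hmac.compare_digest(key, expected_key)
```

The salt and derived key are stored together; the plaintext password is never stored.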

Is MD5 safe with a salt?

Adding a salt prevents rainbow table attacks but does not fix MD5's collision vulnerability or its excessive speed. Salted MD5 is still trivially brute-forced with GPUs. Use a proper password hashing function instead.

How do I migrate existing MD5 hashes in a database?

For password hashes: implement a migration that rehashes passwords using a secure algorithm when users next log in. For file checksums: recompute checksums for all files using SHA-256 and update the stored values. Maintain backward compatibility during the transition period by accepting both old MD5 and new SHA-256 hashes.
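The rehash-on-login approach could be sketched as follows. The record layout (a dict with "scheme", "salt", and "hash" keys), the function names, and the iteration count are all hypothetical and would need to match the real schema and current guidance. The legacy branch assumes unsalted hex MD5 digests.

```python
import hashlib
import hmac
import os

PBKDF2_ITERATIONS = 600_000  # assumed work factor; tune for your environment


def _pbkdf2(password: str, salt: bytes) -> str:
    """Derive a hex PBKDF2-HMAC-SHA256 key for the given password and salt."""
    key = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, PBKDF2_ITERATIONS
    )
    return key.hex()


def login_and_upgrade(record: dict, password: str) -> bool:
    """Verify a login; upgrade a legacy MD5 record in place on success."""
    if record["scheme"] == "md5":
        # Legacy path: check against the old unsalted MD5 hex digest.
        candidate = hashlib.md5(password.encode("utf-8")).hexdigest()
        if not hmac.compare_digest(candidate, record["hash"]):
            return False
        # The plaintext is available only now, so rehash it immediately.
        salt = os.urandom(16)
        record.update(
            scheme="pbkdf2_sha256", salt=salt.hex(), hash=_pbkdf2(password, salt)
        )
        return True
    # Upgraded path: verify against the stored PBKDF2 key.
    salt = bytes.fromhex(record["salt"])
    return hmac.compare_digest(_pbkdf2(password, salt), record["hash"])
```

In a real system the updated record would be written back to the database in the same request, and accounts that never log in again would keep their MD5 hash until they are reset or expired.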

Is HMAC-MD5 safe?

No practical attacks against HMAC-MD5 are known, and it is still found in legacy protocols such as the TLS 1.0 MAC. However, it is deprecated in modern standards and should be replaced with HMAC-SHA-256 in new code. MD5's collision weaknesses do not directly carry over to HMAC constructions, but SHA-256 is strongly preferred.
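For new code, an HMAC-SHA-256 tag can be produced with the standard hmac module. This is a minimal sketch; the function names sign and verify are illustrative.

```python
import hashlib
import hmac


def sign(key: bytes, message: bytes) -> str:
    """Return the hex HMAC-SHA-256 tag of message under key."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(key: bytes, message: bytes, tag: str) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(sign(key, message), tag)
```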

Does hashlib.md5(usedforsecurity=False) suppress this finding?

Python 3.9+ added the usedforsecurity=False parameter to hashlib constructors for use in FIPS mode systems. Code using this parameter explicitly acknowledges that MD5 is not being used for security. This is appropriate for non-security checksums and may be used to document and suppress findings for legitimate MD5 uses.
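A non-security use might be annotated like this. The cache_key helper is a hypothetical example, and the code requires Python 3.9 or later.

```python
import hashlib


def cache_key(payload: bytes) -> str:
    """Derive a cache key from payload bytes.

    MD5 is used here only to spread keys evenly; nothing relies on
    collision resistance, so the call is marked as non-security.
    """
    return hashlib.md5(payload, usedforsecurity=False).hexdigest()
```

The keyword argument does not change the digest; it records intent and lets the call work on builds where MD5 is otherwise blocked for security use.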
