Insecure MD5 Hash Usage

MEDIUM

MD5 is cryptographically broken and unsuitable for security-sensitive purposes. Use SHA-256 or SHA-3 instead.

Rule Information

Language: Python
Category: Python Core
Author: Shivasurya
Last Updated: 2026-03-22
Tags: python, md5, weak-hash, cryptography, hashlib, CWE-327, OWASP-A02
CWE References
CWE-327: Use of a Broken or Risky Cryptographic Algorithm

Interactive Playground

Experiment with the vulnerable code and security rule below. Edit the code to see how the rule detects different vulnerability patterns.

pathfinder scan --ruleset python/PYTHON-LANG-SEC-030 --project .

About This Rule

Understanding the vulnerability and how it is detected

MD5 (Message Digest Algorithm 5) was once widely used for cryptographic hashing but is now considered cryptographically broken. Practical collision attacks against MD5 were demonstrated in 2004, and chosen-prefix collision attacks, which let an attacker start from two different meaningful prefixes, are now practical on commodity hardware. MD5 should not be used for any security-sensitive purpose.

MD5 is unsuitable for: digital signatures (collision attacks allow a signature on one document to be reused for another), certificate fingerprinting (collisions allow creating malicious certificates with the same fingerprint), password storage (MD5 is so fast that rainbow tables and GPU cracking recover weak passwords in seconds), and file integrity verification when the attacker can choose file content.

MD5 remains suitable for non-security purposes such as checksums to detect accidental corruption (not adversarial modification), content-addressed storage keys where collision resistance is not required, and hash table keys. However, it must never be used where an adversary could craft colliding inputs.

Security Implications

Potential attack scenarios if this vulnerability is exploited

1. Collision Attacks

An attacker can generate two different inputs with the same MD5 hash in seconds to minutes on modern hardware. When the attacker controls both inputs, this allows moving a digital signature from a benign document to a malicious one, preparing file pairs that share a checksum, and bypassing integrity checks.

2. Password Cracking

MD5 is extremely fast (billions of hashes per second on GPUs) and provides no salting or key stretching. Unsalted MD5 password hashes are trivially cracked using rainbow tables, and even salted ones fall quickly to dictionary or brute-force attacks.

3. Certificate Forgery

MD5 collisions have been used to forge X.509 certificates and create rogue CA certificates. Applications using MD5 for certificate fingerprinting or verification can be deceived by crafted certificates.

4. Integrity Check Bypass

File or message integrity checks using MD5 can be bypassed by an attacker who can influence the content being hashed. The attacker prepares a benign file and a malicious file with the same MD5 hash, gets the benign one approved, and later substitutes the malicious one without changing the checksum.
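To make the password-cracking scenario above concrete, here is a minimal sketch of a dictionary attack against unsalted MD5 hashes. The helper names, the "leaked" hash, and the four-word list are hypothetical and chosen for illustration; real attacks use GPU tools and wordlists with millions of entries.

```python
import hashlib
from typing import List, Optional


def md5_hex(password: str) -> str:
    """Unsalted MD5 of a password, as a legacy system might store it."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()


def dictionary_attack(stored_hash: str, candidates: List[str]) -> Optional[str]:
    """Return the first candidate whose MD5 matches the stored hash, else None."""
    for candidate in candidates:
        if md5_hex(candidate) == stored_hash:
            return candidate
    return None


# Hypothetical leaked hash and a tiny wordlist (illustration only).
leaked = md5_hex("sunshine1")
wordlist = ["password", "letmein", "sunshine1", "qwerty"]
recovered = dictionary_attack(leaked, wordlist)
```

Because the hash is unsalted and cheap to compute, the same wordlist can be tested against every leaked hash at once, which is why the remediation steps recommend a slow, salted password hashing function instead.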

How to Fix

Recommended remediation steps

  1. Replace hashlib.md5() with hashlib.sha256() or hashlib.sha3_256() for all security-sensitive hashing operations.
  2. For password hashing specifically, use hashlib.pbkdf2_hmac(), bcrypt, scrypt, or argon2; never use bare MD5 or any other fast hash.
  3. For file integrity verification against adversarial modification, use SHA-256 or SHA-512.
  4. Reserve MD5 only for non-security checksums where collision resistance is not a security requirement, and document why it is acceptable.
  5. Audit all places where MD5 digests are compared to expected values to determine if they are security-sensitive.
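As a rough illustration of steps 1 and 3 above, the sketch below computes a SHA-256 digest of a file in chunks and compares it with an expected value. The function names and the chunk size are assumptions made for this example, not part of the rule.

```python
import hashlib
import hmac


def sha256_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_file(path, expected_hex):
    """Compare the file's digest with an expected hex digest in constant time."""
    return hmac.compare_digest(sha256_file(path), expected_hex.lower())
```

Reading in chunks keeps memory use flat for large files, and hmac.compare_digest avoids leaking how many leading characters matched.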

Detection Scope

How Code Pathfinder analyzes your code for this vulnerability

This rule detects calls to hashlib.md5() in Python source code. Every call site is flagged, because deciding whether a given use is security-sensitive requires human review. Non-security uses may be suppressed, with documentation explaining why MD5 is acceptable there.

Compliance & Standards

Industry frameworks and regulations that require detection of this vulnerability

NIST SP 800-131A (Revision 2): MD5 is disallowed for digital signatures and key derivation
OWASP Top 10: A02:2021 - Cryptographic Failures
PCI DSS v4.0: Requirement 4.2.1 - Use strong cryptography for data in transit
FIPS 140-3: MD5 is not an approved hash algorithm for security applications

Frequently Asked Questions

Common questions about Insecure MD5 Hash Usage

MD5 is broken for security purposes: passwords, digital signatures, certificate fingerprints, and integrity verification against adversarial attacks. It remains acceptable for non-security checksums (detecting accidental file corruption), content-addressed storage keys (deduplication), and hash table keys where an attacker cannot control input and collision resistance is not security-critical.
Never use bare MD5 (or any other general-purpose hash) for password hashing. Use a dedicated password hashing function: Argon2 via argon2-cffi (recommended), bcrypt, or hashlib.scrypt(). These functions are deliberately slow, and Argon2 and scrypt are also memory-hard, which makes GPU-based brute-force attacks far more expensive.
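A minimal sketch of password hashing with the standard library's hashlib.scrypt() might look like the following. The cost parameters (n, r, p), the salt length, and the helper names are assumptions for illustration and should be tuned for the deployment; hashlib.scrypt() also requires a Python build linked against a sufficiently recent OpenSSL.

```python
import hashlib
import hmac
import os

# Assumed cost parameters (about 16 MiB of memory per hash); tune for your hardware.
SCRYPT_N, SCRYPT_R, SCRYPT_P = 2**14, 8, 1


def _derive(password, salt):
    """Derive a 32-byte key from the password and salt with scrypt."""
    return hashlib.scrypt(
        password.encode("utf-8"),
        salt=salt,
        n=SCRYPT_N,
        r=SCRYPT_R,
        p=SCRYPT_P,
        dklen=32,
    )


def hash_password(password):
    """Return (salt, key) for storage; a fresh random salt is used per password."""
    salt = os.urandom(16)
    return salt, _derive(password, salt)


def verify_password(password, salt, expected_key):
    """Recompute the key and compare it in constant time."""
    return hmac.compare_digest(_derive(password, salt), expected_key)
```

Store the salt and the cost parameters next to the derived key so the work factor can be raised later without invalidating existing records.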
Adding a salt prevents rainbow table attacks but does not fix MD5's collision vulnerability or its excessive speed. Salted MD5 is still trivially brute-forced with GPUs. Use a proper password hashing function instead.
For password hashes: implement a migration that rehashes passwords using a secure algorithm when users next log in. For file checksums: recompute checksums for all files using SHA-256 and update the stored values. Maintain backward compatibility during the transition period by accepting both old MD5 and new SHA-256 hashes.
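The rehash-on-login migration described above could be sketched as follows. The record layout (a dict with "scheme", "salt", and "hash" keys), the function names, and the iteration count are hypothetical; a real system would persist the updated record to its user store.

```python
import hashlib
import hmac
import os

PBKDF2_ITERATIONS = 600_000  # assumed work factor; tune for your hardware


def _pbkdf2(password, salt):
    """Derive a key with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, PBKDF2_ITERATIONS
    )


def check_and_upgrade(password, record):
    """Verify a login attempt; upgrade a legacy MD5 record in place on success.

    `record` is a hypothetical dict:
    {"scheme": "md5" or "pbkdf2", "salt": bytes, "hash": bytes}
    """
    if record["scheme"] == "md5":
        legacy = hashlib.md5(password.encode("utf-8")).digest()
        if not hmac.compare_digest(legacy, record["hash"]):
            return False
        # The plaintext is only available now, so rehash it with the new scheme.
        salt = os.urandom(16)
        record.update(scheme="pbkdf2", salt=salt, hash=_pbkdf2(password, salt))
        return True
    return hmac.compare_digest(_pbkdf2(password, record["salt"]), record["hash"])
```

Tagging each record with its scheme is what lets old and new hashes coexist during the transition; accounts that never log in again keep their MD5 hash and may need a forced reset.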
HMAC-MD5 is considered computationally secure against known attacks and is used in legacy protocols such as TLS 1.0's MAC. However, it is deprecated in modern standards and should be replaced with HMAC-SHA-256 in new code. The collision properties of MD5 do not directly apply to HMAC constructions, but SHA-256 is strongly preferred.
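For new code, a message authentication tag with HMAC-SHA-256 can be produced using the standard hmac module, roughly as below. The function names are illustrative assumptions.

```python
import hashlib
import hmac


def sign(key: bytes, message: bytes) -> str:
    """Return the hex HMAC-SHA-256 tag of a message under a secret key."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(key: bytes, message: bytes, tag_hex: str) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(sign(key, message), tag_hex)
```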
Python 3.9 added the keyword-only usedforsecurity parameter (default True) to hashlib constructors, mainly for systems running in FIPS mode. Passing usedforsecurity=False explicitly acknowledges that MD5 is not being used for security. This is appropriate for non-security checksums and may be used to document and suppress findings for legitimate MD5 uses.
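A non-security use annotated this way might look like the sketch below, which requires Python 3.9 or later. The cache_key helper is a hypothetical example; whether a given scanner configuration treats the parameter as a suppression should be checked against that tool's documentation.

```python
import hashlib


def cache_key(payload: bytes) -> str:
    """Derive a cache key from a payload.

    MD5 is used only to spread keys, not for security, so the call is
    marked with usedforsecurity=False (Python 3.9+).
    """
    return hashlib.md5(payload, usedforsecurity=False).hexdigest()
```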
