# GO-CRYPTO-001: Use of MD5 Weak Hash Algorithm

> **Severity:** HIGH | **CWE:** CWE-328, CWE-916 | **OWASP:** A02:2021

- **Language:** Go
- **Category:** Security
- **URL:** https://codepathfinder.dev/registry/golang/security/GO-CRYPTO-001
- **Detection:** `pathfinder scan --ruleset golang/GO-CRYPTO-001 --project .`

## Description

MD5 was designed in 1991 (published as RFC 1321 in 1992) and has been cryptographically
broken since Xiaoyun Wang et al. demonstrated practical collision attacks at CRYPTO 2004.
Refinements of the Wang-Yu attack, which uses differential cryptanalysis with modular
arithmetic differentials, now find two distinct inputs with the same MD5 digest in under
one second on commodity hardware.

The Flame malware (June 2012) is the highest-profile real-world exploitation: Flame's
operators used a chosen-prefix MD5 collision to forge a Microsoft code-signing certificate.
The forged certificate passed Windows Update's chain-of-trust validation, allowing Flame
to spread via a man-in-the-middle attack against Windows Update — the first documented
deployment of a live MD5 collision attack in the wild.

On an NVIDIA RTX 4090, hashcat computes 164.1 billion MD5 hashes per second. An 8-character
mixed-case alphanumeric password space (62^8, about 218 trillion combinations) is exhausted
in about 22 minutes at that rate. Any database of fast, unsalted MD5 password hashes is
practically unprotected against GPU cracking.

MD5 is not an approved hash algorithm for FIPS 140-2/3 validated cryptographic modules, and
NIST SP 800-131A Rev 2 does not list MD5 as acceptable for any cryptographic security
purpose. The Go standard library retains `crypto/md5` only for legacy interoperability;
the package documentation notes it is "cryptographically broken and should not be used
for secure applications."

**Acceptable uses**: Non-security file transfer checksums (detecting accidental bit-flips
when an adversary is not present), partition key derivation in distributed databases,
legacy cache key generation where collision does not have security consequences.


## Vulnerable Code

```go
// --- file: vulnerable.go ---
// GO-CRYPTO-001 positive test cases: all SHOULD be detected
package main

import "crypto/md5"

func weakMD5New() []byte {
	h := md5.New() // SINK: MD5 is collision-broken
	h.Write([]byte("data"))
	return h.Sum(nil)
}

func weakMD5Sum() {
	data := []byte("important data")
	hash := md5.Sum(data) // SINK: MD5 weak hash
	_ = hash
}

// --- file: go.mod ---
module example.com/go-crypto-001/positive

go 1.21

// --- file: go.sum (empty) ---
```

## Secure Code

```go
// Secure alternatives to crypto/md5, with imports consolidated so the file compiles.
package secure

import (
	"crypto/rand"
	"crypto/sha256"
	"fmt"

	"golang.org/x/crypto/argon2"
	"golang.org/x/crypto/bcrypt"
)

// SECURE: SHA-256 for general integrity hashing
func checksumFile(data []byte) string {
	h := sha256.Sum256(data)
	return fmt.Sprintf("%x", h)
}

// SECURE: bcrypt for password storage (minimum cost 12 for 2024+)
func hashPassword(password string) (string, error) {
	hash, err := bcrypt.GenerateFromPassword([]byte(password), 12)
	if err != nil {
		return "", err
	}
	return string(hash), nil
}

// SECURE: argon2id for new systems (PHC winner, memory-hard).
// Returns the derived hash and the random salt; store both.
func hashPasswordArgon2(password string) ([]byte, []byte, error) {
	salt := make([]byte, 16)
	if _, err := rand.Read(salt); err != nil {
		return nil, nil, err
	}
	// OWASP recommended: time=1, memory=46MiB (given in KiB), threads=1, 32-byte key
	hash := argon2.IDKey([]byte(password), salt, 1, 46*1024, 1, 32)
	return hash, salt, nil
}

// SECURE: crypto/rand for token generation
func generateToken() (string, error) {
	b := make([]byte, 32)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return fmt.Sprintf("%x", b), nil
}
```

## Detection Rule (Python SDK)

```python
"""GO-CRYPTO-001: Use of MD5 weak hash algorithm."""

from codepathfinder.go_rule import QueryType
from codepathfinder import flows
from codepathfinder.go_decorators import go_rule


class GoCryptoMD5(QueryType):
    fqns = ["crypto/md5"]
    patterns = ["md5.*"]
    match_subclasses = False


@go_rule(
    id="GO-CRYPTO-001",
    severity="HIGH",
    cwe="CWE-328",
    owasp="A02:2021",
    tags="go,security,crypto,md5,weak-hash,CWE-328,OWASP-A02",
    message=(
        "Detected use of the MD5 hash algorithm (crypto/md5). "
        "MD5 is cryptographically broken — it has known collision attacks and "
        "should not be used for any security-sensitive purpose. "
        "Use crypto/sha256 or crypto/sha512 instead."
    ),
)
def detect_md5_weak_hash():
    """Detect use of MD5 hashing (crypto/md5.New or md5.Sum)."""
    return GoCryptoMD5.method("New", "Sum")
```

## How to Fix

- Replace `md5.New()` and `md5.Sum()` with `sha256.New()` and `sha256.Sum256()` for integrity hashing.
- For password hashing, use `golang.org/x/crypto/bcrypt` (cost >= 12) or `argon2.IDKey`.
- Never use MD5 for digital signatures, certificate fingerprints, or token generation.
- If migrating from MD5 passwords: at next successful login, re-hash with bcrypt/argon2id.
- For TLS/HMAC: the Go `crypto/tls` package and `crypto/hmac` with SHA-256 are safe defaults.
- MD5 is acceptable ONLY for non-security checksums (detecting accidental bit-flips) where no adversary can substitute inputs.

## Security Implications

- **Forged Digital Signatures:** A chosen-prefix MD5 collision allows an attacker to create two documents with the same
MD5 digest but different content. If a CA signs one, the signature is valid for the
other. The Flame malware exploited exactly this to forge a Microsoft Windows Update
code-signing certificate (June 2012).

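- **File Integrity Bypass:** If MD5 is used to verify file integrity (firmware updates, software downloads), an
attacker who can influence the published file can prepare a benign and a malicious variant
that share the same MD5 hash, then swap in the malicious one by intercepting the download
or controlling the distribution server. Both variants pass the integrity check. (Collision
attacks require the attacker to craft both files; no practical second-preimage attack on
MD5 is known.)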

- **Password Database Cracking:** MD5 password hashes can be tested at about 164 billion attempts per second on a single
RTX 4090 GPU. Any 8-character mixed-case alphanumeric password is recovered within about
22 minutes. Rainbow tables for unsalted MD5 cover many common real-world passwords.
MD5-hashed password databases from breaches are routinely cracked and published within hours.

- **Certificate Spoofing:** PKI systems that use MD5 for certificate fingerprinting or signing are vulnerable to
certificate impersonation. In December 2008, Sotirov, Stevens et al. used an MD5 collision
to obtain a rogue CA certificate; CAs stopped issuing MD5-signed certificates afterwards,
and the CA/Browser Forum Baseline Requirements do not permit MD5.


## References

- [IACR: Wang et al. MD5 Collision Attack (2004)](https://eprint.iacr.org/2004/199.pdf)
- [IACR: Wang & Yu 'How to Break MD5' (Eurocrypt 2005)](https://iacr.org/archive/eurocrypt2005/34940019/34940019.pdf)
- [Flame malware collision attack explained — Microsoft MSRC](https://www.microsoft.com/en-us/msrc/blog/2012/06/flame-malware-collision-attack-explained)
- [Analyzing the MD5 collision in Flame — Trail of Bits](https://blog.trailofbits.com/2012/06/11/analyzing-the-md5-collision-in-flame/)
- [Flame, certificates, collisions. Oh my. — Matthew Green's Cryptography Blog](https://blog.cryptographyengineering.com/2012/06/05/flame-certificates-collisions-oh-my/)
- [HashClash — MD5 chosen-prefix collision tool (Marc Stevens, CWI)](https://github.com/cr-marcstevens/hashclash)
- [NIST SP 800-131A Revision 2 — Transitioning Cryptographic Algorithms](https://csrc.nist.gov/pubs/sp/800/131/a/r2/final)
- [Hashcat RTX 4090 benchmark (MD5: 164.1 GH/s)](https://gist.github.com/Chick3nman/32e662a5bb63bc4f51b847bb422222fd)
- [OWASP Password Storage Cheat Sheet (bcrypt, argon2id recommendations)](https://cheatsheetseries.owasp.org/cheatsheets/Password_Storage_Cheat_Sheet.html)
- [Go crypto/md5 package documentation](https://pkg.go.dev/crypto/md5)
- [Go crypto/sha256 package documentation](https://pkg.go.dev/crypto/sha256)
- [Go golang.org/x/crypto/bcrypt package](https://pkg.go.dev/golang.org/x/crypto/bcrypt)
- [Go golang.org/x/crypto/argon2 package](https://pkg.go.dev/golang.org/x/crypto/argon2)
- [RFC 9106 — Argon2 Memory-Hard Function for Password Hashing](https://www.rfc-editor.org/rfc/rfc9106)
- [CWE-328: Use of Weak Hash](https://cwe.mitre.org/data/definitions/328.html)

