The Quick Answer
Use SHA-256 for new implementations. MD5 is cryptographically broken and should be avoided for security purposes, though it persists in legacy systems and databases. For file hash lookups, SHA-256 offers stronger security and is supported by virtually all current tools, so it should be your default choice.
However, the decision isn't always binary: sometimes both hashes are used and sometimes legacy systems require MD5, so understanding the tradeoffs helps you make informed decisions.
MD5: The Deprecated Hashing Algorithm
History
MD5 (Message-Digest Algorithm 5) was designed by Ronald Rivest in 1991 as a cryptographic hash function and published as RFC 1321 in 1992. It was widely adopted and became a standard for file verification and integrity checking.
MD5 characteristics:
- Output: 128-bit hash (32 hexadecimal characters)
- Speed: Very fast
- Collision resistance: Broken since 2004
Why MD5 is Broken
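Collision attacks discovered (2004): Researchers showed how to find two different inputs with the same MD5 hash far faster than brute force. This breaks collision resistance, one of the core security properties of a cryptographic hash function.
Practical attacks:
- 2008: Researchers forged a rogue CA certificate for SSL using MD5 collisions
- Ongoing: MD5 collisions can be generated cheaply on ordinary hardware
- Real-world impact: An attacker can craft a harmless file and a malicious file that share the same MD5 hash
Note that these are collision attacks, where the attacker creates both files. Forging a match for an existing file that someone else created (a second-preimage attack) is not currently practical against MD5.
Example collision scenario (hash values are illustrative, not a real colliding pair):
File A: Harmless executable (crafted by the attacker)
MD5(File A) = "5d41402abc4b2a76b9719d911017c592"
File B: Malicious executable (crafted by the same attacker to collide with File A)
MD5(File B) = "5d41402abc4b2a76b9719d911017c592" (same!)
Attacker gets File A reviewed and its hash published, then distributes File B in its place
Integrity check passes because hashes match
Users execute malware thinking it's legitimate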
When MD5 Still Appears
Despite being broken, MD5 persists in:
- Legacy systems: Old software still using MD5
- Backward compatibility: Supporting old file formats
- Database records: Billions of MD5 hashes already in systems
- Non-security uses: File deduplication, checksums (where deliberate collisions are not a concern)
- Hash lookup databases: Many include MD5 entries for historical coverage
Examples:
- VirusTotal: Accepts MD5 lookups (though it primarily uses SHA-256)
- Linux distributions: Some still provide MD5 checksums (legacy reasons)
- Legacy security software: Older antivirus products used MD5
SHA-256: The Modern Standard
History
SHA-256 (Secure Hash Algorithm 256-bit) was published by NIST in 2001 as part of the SHA-2 family, with a larger digest and a more conservative design than MD5 and SHA-1.
SHA-256 characteristics:
- Output: 256-bit hash (64 hexadecimal characters)
- Speed: Fast (slower than MD5, but acceptable)
- Collision resistance: No known collisions; a generic birthday attack would need roughly 2^128 hash evaluations
- No known practical attacks
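The output sizes listed for MD5 and SHA-256 can be checked directly. A minimal sketch using Python's standard `hashlib` module; the sample input `b"hello"` is an arbitrary choice for illustration:

```python
import hashlib

data = b"hello"  # arbitrary sample input

md5_hex = hashlib.md5(data).hexdigest()
sha256_hex = hashlib.sha256(data).hexdigest()

# Each hexadecimal character encodes 4 bits, so bits = 4 * number of hex chars.
print("MD5    :", md5_hex, f"({len(md5_hex)} hex chars, {len(md5_hex) * 4} bits)")
print("SHA-256:", sha256_hex, f"({len(sha256_hex)} hex chars, {len(sha256_hex) * 4} bits)")
```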
Why SHA-256 is Secure
Design improvements over MD5:
- Larger output (256-bit vs 128-bit) makes collisions exponentially harder
- More complex mathematical operations
- Designed with modern cryptanalysis in mind
- Extensively studied and peer-reviewed
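The effect of the larger output can be put in numbers. For an ideal n-bit hash, a generic birthday search is expected to find a collision after roughly 2^(n/2) evaluations. The sketch below computes only that generic bound (the function name is made up for illustration); it says nothing about MD5's real collision cost, which is far lower because of its structural flaws.

```python
def generic_collision_work(output_bits: int) -> int:
    """Approximate hash evaluations for a birthday collision on an ideal hash."""
    return 2 ** (output_bits // 2)

bound_128 = generic_collision_work(128)  # ideal 128-bit hash: 2**64
bound_256 = generic_collision_work(256)  # ideal 256-bit hash: 2**128

print(f"128-bit ideal hash: ~{bound_128:.3e} evaluations")
print(f"256-bit ideal hash: ~{bound_256:.3e} evaluations")
# The ratio between the two bounds is itself 2**64.
print(f"ratio: 2**{(bound_256 // bound_128).bit_length() - 1}")
```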
Security properties:
- No known practical attacks
- No collision method discovered
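- Expected to remain secure for the foreseeable future (no specific expiry date can be guaranteed)
- Not resistant to length-extension attacks on its own (a property of its Merkle-Damgård construction); use HMAC-SHA-256 when a keyed hash is needed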
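Where length extension is a concern, that is, when a hash is meant to authenticate a message with a secret key, the usual construction is HMAC rather than hashing the key and message together directly. A minimal sketch with Python's standard `hmac` module; the key, messages, and function names are placeholders chosen for illustration:

```python
import hashlib
import hmac

def sign(key: bytes, message: bytes) -> str:
    """Return the HMAC-SHA-256 tag of message as a hex string."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()

def verify(key: bytes, message: bytes, tag_hex: str) -> bool:
    """Check a tag using a constant-time comparison."""
    return hmac.compare_digest(sign(key, message), tag_hex)

key = b"placeholder-secret-key"  # hypothetical key for illustration only
tag = sign(key, b"amount=100")

print(verify(key, b"amount=100", tag))          # True
print(verify(key, b"amount=100&admin=1", tag))  # False
```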
Real-world adoption:
- NIST standard
- TLS/SSL certificates
- Bitcoin blockchain
- Digital signatures
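- Password hashing (only inside a slow key-derivation function such as PBKDF2-HMAC-SHA256, never as a single bare hash)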
- File integrity verification
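Password hashing deserves a caveat: a single SHA-256 call is too fast to resist offline guessing, so SHA-256 is normally used inside a deliberately slow key-derivation function. A minimal sketch with PBKDF2-HMAC-SHA256 from Python's standard library; the iteration count and function names are assumptions for illustration, not values taken from this article:

```python
import hashlib
import hmac
import os
from typing import Tuple

ITERATIONS = 600_000  # assumed work factor; tune for your hardware and policy

def hash_password(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, derived_key) for storage alongside the user record."""
    salt = os.urandom(16)  # fresh random salt per password
    derived = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, derived

def check_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-derive the key with the stored salt and compare in constant time."""
    derived = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(derived, expected)

salt, stored = hash_password("correct horse battery staple")
print(check_password("correct horse battery staple", salt, stored))  # True
print(check_password("wrong guess", salt, stored))                   # False
```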
MD5 vs SHA-256: Detailed Comparison
| Aspect | MD5 | SHA-256 |
|---|---|---|
| Release Date | 1991 | 2001 |
| Output Size | 128-bit | 256-bit |
| Security Status | Cryptographically Broken | Secure |
| Known Attacks | Collision attacks practical | No practical attacks |
| Speed (rough, hardware-dependent) | Very fast (~600 MB/s) | Fast (~400 MB/s) |
| Collision Resistance | Failed | Secure |
| Preimage Resistance | Weakened in theory, no practical attack | Strong |
| Database Coverage | Legacy systems | Universal |
| Verification Use | Not recommended | Recommended |
| Certificate Signing | Deprecated | Standard |
| Recommended for New Systems | No | Yes |
When to Use Each
Use SHA-256
Always use SHA-256 for:
- New implementations
- Security-critical applications
- File integrity verification
- Digital signatures
- Password hashing (via a salted, slow KDF such as PBKDF2-HMAC-SHA256, not a single hash call)
- Certificate signing
- Hash lookups for malware detection
Examples:
# Verifying downloaded Linux ISO
sha256sum ubuntu-24.04-desktop-amd64.iso
# Checking file integrity after transfer
sha256sum important_document.pdf
# Verifying software authenticity
sha256sum software_installer.exe
# Hash lookup for security analysis
# On virustotal.com: upload the file or search for its SHA-256
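When the hash has to be computed inside a program rather than at the shell, the same result as `sha256sum` can be obtained with Python's `hashlib`. The sketch below reads the file in chunks so large files such as ISO images do not need to fit in memory; the function name and chunk size are illustrative choices, and the temporary file only stands in for a real download.

```python
import hashlib
import os
import tempfile

def sha256_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Demo on a temporary file standing in for a downloaded file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"example file contents")
demo_path = tmp.name

print(sha256_file(demo_path))
os.remove(demo_path)
```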
Use MD5 Only When
Legacy compatibility necessary:
- Supporting old systems that only provide MD5
- Integrating with systems that can't be updated
- Backward compatibility with existing databases
- During a migration from MD5 to SHA-256, while both hashes are still needed
Non-security uses:
- File deduplication (where a collision is not a security risk)
- Checksums for file transfer integrity (non-adversarial)
- Cache invalidation
- Database indexing (non-security)
Examples:
# Legacy system that requires MD5
# Old antivirus database lookup
# Supporting outdated API that only accepts MD5
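The deduplication use case can be sketched as grouping files by digest. Here MD5 serves purely as an index key in a non-adversarial setting; the function names are hypothetical, and if the files could come from an untrusted party it would be safer to swap in SHA-256 or confirm matches byte for byte.

```python
import hashlib
from collections import defaultdict
from pathlib import Path
from typing import Dict, List

def md5_file(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the MD5 hex digest of a file (non-security use only)."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def find_duplicates(root: str) -> Dict[str, List[Path]]:
    """Map each digest to the files sharing it, keeping only groups of 2+."""
    groups: Dict[str, List[Path]] = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file():
            groups[md5_file(path)].append(path)
    return {h: sorted(paths) for h, paths in groups.items() if len(paths) > 1}

# Usage: find_duplicates("/path/to/folder") returns {digest: [paths...]}.
```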
Never Use MD5 For
- ✗ Security-critical integrity checking
- ✗ Digital signatures or certificate signing
- ✗ Password hashing
- ✗ Malware detection hash lookup (as the sole identifier, when SHA-256 is available)
- ✗ Authenticating downloads
- ✗ Access control decisions
Hash Lookup: MD5 vs SHA-256
Hash Lookup Databases
VirusTotal:
- Accepts: MD5, SHA-1, SHA-256
- Recommends: SHA-256 or SHA-1
- Deprecating: MD5 for security-critical lookups
- Storage: Has records for billions of MD5 hashes (legacy coverage)
NSRL (National Software Reference Library):
- Primarily: SHA-1 and MD5
- Newer entries: Include SHA-256
- Legacy: Extensive MD5 coverage from decades of collection
YARA/Threat Intelligence:
- Modern implementations: SHA-256 primary
- Legacy: May include MD5
- Best practice: Use SHA-256
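Because these databases accept several hash types, a lookup tool often has to guess which one it was given. A minimal sketch that infers the likely algorithm from the hex length (32, 40, or 64 characters); the function name is hypothetical, and length is only a hint, since other algorithms produce digests of the same sizes.

```python
import re
from typing import Optional

# Hex digest lengths for the three hash types discussed above.
_HEX_LENGTHS = {32: "MD5", 40: "SHA-1", 64: "SHA-256"}

def classify_hash(value: str) -> Optional[str]:
    """Guess the hash type from its hex length; None if it matches none."""
    candidate = value.strip().lower()
    if not re.fullmatch(r"[0-9a-f]+", candidate):
        return None  # not a hexadecimal string
    return _HEX_LENGTHS.get(len(candidate))

print(classify_hash("5d41402abc4b2a76b9719d911017c592"))  # MD5
print(classify_hash("not-a-hash"))                        # None
```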
Recommendation for Hash Lookup
For current investigations: Use SHA-256
# Get SHA-256 of suspicious file
sha256sum suspicious_file.exe
# Look up in VirusTotal/Hybrid Analysis
For legacy searches: May need MD5
# If database only supports MD5
md5sum suspicious_file.exe
# Look up in older security tools
Best practice: Compute both
# Generate both hashes
sha256sum file.exe   # → abc123...
md5sum file.exe      # → def456...
# Check SHA-256 in modern databases first
# Fall back to MD5 if needed for legacy systems
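Computing both hashes does not require reading the file twice. The sketch below feeds each chunk to both digests in a single pass; the function name and the returned dictionary layout are illustrative choices.

```python
import hashlib
from typing import Dict

def md5_and_sha256(path: str, chunk_size: int = 1024 * 1024) -> Dict[str, str]:
    """Compute MD5 and SHA-256 of a file in one pass over its contents."""
    md5 = hashlib.md5()
    sha256 = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)      # legacy lookups only
            sha256.update(chunk)   # primary identifier
    return {"sha256": sha256.hexdigest(), "md5": md5.hexdigest()}

# Usage: hashes = md5_and_sha256("file.exe"); try hashes["sha256"] first.
```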
Migration Path: MD5 to SHA-256
Organizations should plan migration:
Phase 1: Dual Support (Current)
New systems use SHA-256
Legacy systems continue MD5
Both supported where applicable
Phase 2: Gradual Transition
Compute and store both hashes
Prioritize SHA-256 in new workflows
Maintain MD5 for backward compatibility
Phase 3: SHA-256 Primary
All new implementations: SHA-256
Legacy MD5 queries: Supported but not recommended
Documentation emphasizes SHA-256
Phase 4: MD5 Deprecation (Years Away)
MD5 support removed from security-critical functions
Legacy systems individually upgraded
MD5 retained only for non-security deduplication
Timeline: Expect roughly 5-10 years before MD5 is fully phased out of security systems.
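The dual-support phases can be sketched as an index that stores both hashes and prefers SHA-256 on lookup, falling back to MD5 only for records that have not been backfilled yet. All class, field, and label names here are hypothetical; a real system would back this with its own database.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

@dataclass
class FileRecord:
    name: str
    md5: str
    sha256: Optional[str] = None  # None until the record is backfilled

class HashIndex:
    def __init__(self) -> None:
        self._by_sha256: Dict[str, FileRecord] = {}
        self._by_md5: Dict[str, FileRecord] = {}

    def add(self, record: FileRecord) -> None:
        self._by_md5[record.md5.lower()] = record
        if record.sha256:
            self._by_sha256[record.sha256.lower()] = record

    def lookup(self, sha256: Optional[str] = None,
               md5: Optional[str] = None) -> Tuple[Optional[FileRecord], str]:
        """Prefer SHA-256; report when a hit came from the legacy MD5 path."""
        if sha256 and sha256.lower() in self._by_sha256:
            return self._by_sha256[sha256.lower()], "sha256"
        if md5 and md5.lower() in self._by_md5:
            return self._by_md5[md5.lower()], "md5-legacy"
        return None, "miss"
```

Returning the match source makes it possible to log how often the MD5 fallback is still used, which is one way to judge when a later phase is safe to start.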
Practical Hash Lookup Examples
Example 1: Verifying Downloaded Software
Scenario: Download Firefox installer, publisher provides SHA-256 hash
Process:
# Compute hash
sha256sum Firefox-Setup-130.0.exe
# Verify matches published hash
Published: 3a9d7b2c1e4f6a8b5c7d9e0f1a2b3c4d...
Computed: 3a9d7b2c1e4f6a8b5c7d9e0f1a2b3c4d...
Match: ✓ Verified
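Result: The file matches what the publisher released, so it is safe to install, provided the published hash itself came from a trusted channel such as the publisher's official HTTPS site.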
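The comparison step is easy to get wrong by hand, since published hashes may differ in letter case or carry stray whitespace. A minimal sketch of the check, with a hypothetical function name and stand-in bytes in place of the real installer; a truncated value such as the one shown above would be rejected rather than treated as a match.

```python
import hashlib
import hmac

def matches_published(computed_hex: str, published_hex: str) -> bool:
    """Compare a computed SHA-256 digest with a published one."""
    computed = computed_hex.strip().lower()
    published = published_hex.strip().lower()
    if len(published) != 64 or any(c not in "0123456789abcdef" for c in published):
        raise ValueError("published value is not a 64-character hex SHA-256 digest")
    return hmac.compare_digest(computed, published)

installer_bytes = b"example installer contents"  # stand-in for the downloaded file
computed = hashlib.sha256(installer_bytes).hexdigest()

print(matches_published(computed, computed.upper() + "\n"))  # True
print(matches_published(computed, "0" * 64))                 # False
```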
Example 2: Investigating Suspicious File
Scenario: Received suspicious email attachment, want to check if it's malware
Process:
# Compute SHA-256
sha256sum unknown_attachment.exe
abc123...
# Look up in VirusTotal
# Result: 42 malware detections, known as Trojan.Win32.Generic
Result: The file is known malware. Do not execute it; quarantine it.
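The lookup step can also be scripted. The sketch below assumes the VirusTotal API v3 file-report endpoint (`/api/v3/files/{hash}`), the `x-apikey` header, and the `last_analysis_stats` field in the response; these details should be checked against the current VirusTotal documentation before use. The function names are hypothetical and the API key must be supplied by the caller.

```python
import json
import re
import urllib.request

# Assumed endpoint layout for the VirusTotal API v3 file report.
VT_FILE_URL = "https://www.virustotal.com/api/v3/files/{}"

def build_vt_request(sha256_hex: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a file-report request for a SHA-256 digest."""
    digest = sha256_hex.strip().lower()
    if not re.fullmatch(r"[0-9a-f]{64}", digest):
        raise ValueError("expected a 64-character hexadecimal SHA-256 digest")
    return urllib.request.Request(VT_FILE_URL.format(digest),
                                  headers={"x-apikey": api_key})

def malicious_count(sha256_hex: str, api_key: str) -> int:
    """Send the request and return how many engines flagged the file."""
    request = build_vt_request(sha256_hex, api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        report = json.load(response)
    # Assumed response shape: data.attributes.last_analysis_stats.malicious
    return report["data"]["attributes"]["last_analysis_stats"]["malicious"]
```

An unknown hash is typically reported as an HTTP error rather than a zero count, so callers would need to handle `urllib.error.HTTPError` separately from a clean result.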
Example 3: Legacy System Hash Lookup
Scenario: Old antivirus tool only accepts MD5
Process:
# Compute MD5 (only option for this tool)
md5sum old_suspicious_file.exe
5d41402abc4b2a76b9719d911017c592
# Look up in legacy database
# Result: Known malware, quarantine
Note: On a modern system, you would use SHA-256 instead.
Why Hash Lookup Works Better with SHA-256
Coverage
Modern threat intelligence databases prioritize SHA-256:
- New malware samples: Submitted with SHA-256
- Modern tools: Generate SHA-256 hashes
- Future databases: SHA-256 native
MD5 has better historical coverage, but fewer new entries are being added for it.
Reliability
SHA-256 lookups are more reliable because:
- No practical collision risk (MD5 collisions, by contrast, can be generated on demand)
- Lower chance that two different files share an identifier, which reduces false matches
- Better integration with current detection tooling
- More database contributors use SHA-256
Integration
Modern security tools integrate SHA-256:
- VirusTotal API: Prefers SHA-256
- Hybrid Analysis: SHA-256 primary
- EDR platforms: Use SHA-256
- SIEM systems: SHA-256 standard
Conclusion
For hash lookup and all security purposes: Use SHA-256.
MD5 is cryptographically broken and should be avoided for anything security-related. While legacy systems and databases still contain MD5 hashes, and some tools still accept MD5 input, SHA-256 is the clear modern standard.
When investigating files or verifying integrity:
- Compute SHA-256 hash
- Look up in modern databases (VirusTotal, Hybrid Analysis, etc.)
- Use MD5 only if specifically required by legacy systems
- Plan migration away from MD5 in your organization
The extra 32 hexadecimal characters in a SHA-256 hash versus MD5 represent 128 additional bits of output, which raises the generic collision bound from about 2^64 to about 2^128 operations. That is a worthwhile investment for protecting your systems and data.


