Should I use MD5 or SHA-256 for lookup?

Compare MD5 and SHA-256 for hash lookup purposes and understand which algorithm to choose for your use case.

By Inventive HQ Team
The Quick Answer

Use SHA-256 for new implementations. MD5 is cryptographically broken and should be avoided for security purposes, though it persists in legacy systems and databases. For hash lookup of files, SHA-256 provides better security, is supported everywhere, and should be your default choice.

However, the decision isn't always binary. Some workflows use both, some legacy systems require MD5, and understanding the tradeoffs helps you make informed decisions.

MD5: The Deprecated Hashing Algorithm

History

MD5 (Message-Digest Algorithm 5) was designed by Ronald Rivest in 1991 as a cryptographic hash function and published as RFC 1321 in 1992. It was widely adopted and became a standard for file verification and integrity checking.

MD5 characteristics:

  • Output: 128-bit hash (32 hexadecimal characters)
  • Speed: Very fast
  • Collision resistance: Broken since 2004

Why MD5 is Broken

Collision attacks discovered (2004): MD5's collision resistance property was broken, meaning two different inputs can produce the same hash output. This fundamentally violates the core security property of hash functions.

Practical attacks:

  • 2008: Forged SSL certificates using MD5 collisions
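  • Ongoing: Identical-prefix MD5 collisions can be generated in seconds on ordinary hardware with public tools
  • Real-world impact: An attacker who prepares both files can craft a benign file and a malicious file that share one MD5 hash (matching the hash of a file the attacker did not create is a preimage attack, which remains impractical)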

Example collision scenario (illustrative; the hash shown is a placeholder, not a real colliding pair):

Attacker crafts File A (benign) and File B (malicious) together
MD5(File A) = "5d41402abc4b2a76b9719d911017c592"
MD5(File B) = "5d41402abc4b2a76b9719d911017c592" (same!)

Attacker gets File A reviewed and its MD5 published as trusted
Attacker then distributes File B in place of File A
Integrity check passes because hashes match
Users execute malware thinking it's legitimate

When MD5 Still Appears

Despite being broken, MD5 persists in:

  1. Legacy systems: Old software still using MD5
  2. Backward compatibility: Supporting old file formats
  3. Database records: Billions of MD5 hashes already in systems
  4. Non-security uses: File deduplication, checksums (where collisions are not a concern)
  5. Hash lookup databases: Many include MD5 entries for historical coverage

Examples:

  • VirusTotal: Accepts MD5 lookups (though uses SHA-256 primarily)
  • Linux distributions: Some still provide MD5 checksums (legacy reasons)
  • Legacy security software: Older antivirus products used MD5

SHA-256: The Modern Standard

History

SHA-256 (Secure Hash Algorithm 256-bit) was designed by the NSA and published by NIST in 2001 as part of the SHA-2 family. Its longer digest and more conservative design avoid the weaknesses later exploited in MD5 and SHA-1.

SHA-256 characteristics:

  • Output: 256-bit hash (64 hexadecimal characters)
  • Speed: Fast (slower than MD5, but acceptable)
  • Collision resistance: A generic (birthday) attack needs about 2^128 operations, far beyond any foreseeable computing capacity
  • No known practical attacks
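
The output sizes listed for MD5 and SHA-256 can be checked directly. The following is a minimal sketch using Python's standard hashlib module; the input "hello" is just an arbitrary sample.

```python
import hashlib

data = b"hello"  # arbitrary sample input

md5_hex = hashlib.md5(data).hexdigest()
sha256_hex = hashlib.sha256(data).hexdigest()

# Each hex character encodes 4 bits of the digest.
print(len(md5_hex), len(md5_hex) * 4)        # 32 characters, 128 bits
print(len(sha256_hex), len(sha256_hex) * 4)  # 64 characters, 256 bits
```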

Why SHA-256 is Secure

Design improvements over MD5:

  • Larger output (256-bit vs 128-bit) makes collisions exponentially harder
  • More complex mathematical operations
  • Designed with modern cryptanalysis in mind
  • Extensively studied and peer-reviewed
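
To put a number on "exponentially harder": for an ideal n-bit hash, a generic birthday attack is expected to find a collision after roughly 2^(n/2) evaluations. This is a back-of-the-envelope bound for ideal functions only; real attacks on MD5 are far cheaper than the bound suggests.

```latex
\[
\text{generic collision cost} \approx 2^{n/2}
\quad\Rightarrow\quad
\text{MD5: } 2^{128/2} = 2^{64},
\qquad
\text{SHA-256: } 2^{256/2} = 2^{128},
\qquad
\frac{2^{128}}{2^{64}} = 2^{64}
\]
```

Doubling the output size therefore multiplies the generic collision work by a factor of 2^64, not by 2.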

Security properties:

  • No known practical attacks
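  • No collision found for the full hash function
  • Expected to remain secure for the foreseeable future under current cryptanalysis
  • Not resistant to length-extension attacks on its own; use HMAC-SHA-256 when authenticating messages with a secret key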
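
On the length-extension point: the usual way to authenticate a message with SHA-256 and a secret key is HMAC, rather than hashing the key concatenated with the message. A minimal sketch with Python's standard hmac module follows; the key and message values are hypothetical placeholders.

```python
import hashlib
import hmac

def sign(key: bytes, message: bytes) -> str:
    """Return an HMAC-SHA256 tag for the message as a hex string."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()

def verify(key: bytes, message: bytes, tag_hex: str) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(sign(key, message), tag_hex)

key = b"example-key-not-for-production"  # hypothetical key for illustration
tag = sign(key, b"report.pdf:v1")

print(verify(key, b"report.pdf:v1", tag))  # True
print(verify(key, b"report.pdf:v2", tag))  # False
```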

Real-world adoption:

  • NIST standard
  • TLS/SSL certificates
  • Bitcoin blockchain
  • Digital signatures
  • Password hashing (as a building block inside key-derivation functions such as PBKDF2, not as a bare hash)
  • File integrity verification

MD5 vs SHA-256: Detailed Comparison

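| Aspect | MD5 | SHA-256 |
| --- | --- | --- |
| Release Date | 1991 | 2001 |
| Output Size | 128-bit | 256-bit |
| Security Status | Cryptographically Broken | Secure |
| Known Attacks | Collision attacks practical | No practical attacks |
| Speed | Very fast (~600 MB/s, hardware dependent) | Fast (~400 MB/s, hardware dependent) |
| Collision Resistance | Failed | Secure |
| Preimage Resistance | Weakened in theory (no practical attack) | Strong |
| Database Coverage | Legacy systems | Universal |
| Verification Use | Not recommended | Recommended |
| Certificate Signing | Deprecated | Standard |
| Recommended for New Systems | No | Yes |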

When to Use Each

Use SHA-256

Always use SHA-256 for:

  • New implementations
  • Security-critical applications
  • File integrity verification
  • Digital signatures
  • Password hashing (only inside a slow, salted key-derivation function such as PBKDF2-HMAC-SHA256; a single salted hash is too fast)
  • Certificate signing
  • Hash lookups for malware detection
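
A note on the password hashing item in this list: a single SHA-256 pass is designed to be fast, which helps attackers guess passwords. One common approach is to wrap SHA-256 in PBKDF2. The sketch below uses hashlib.pbkdf2_hmac from the standard library; the iteration count and salt length are assumptions to tune for your own hardware and policy.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 200_000  # assumed work factor; raise it as hardware allows

def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Derive a 32-byte key from the password; returns (salt, derived_key)."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt per password
    derived = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, derived

def check_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-derive with the stored salt and compare in constant time."""
    _, derived = hash_password(password, salt)
    return hmac.compare_digest(derived, expected)
```

Store the salt and the derived key together; the password itself is never stored.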

Examples:

# Verifying downloaded Linux ISO
sha256sum ubuntu-24.04-desktop-amd64.iso

# Checking file integrity after transfer
sha256sum important_document.pdf

# Verifying software authenticity
sha256sum software_installer.exe

# Hash lookup for security analysis
# Search virustotal.com by SHA-256 (or upload the file)
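
Where sha256sum is not available (for example inside an application), the same digest can be computed with a few lines of code. This is a minimal sketch using Python's hashlib; the function name and chunk size are illustrative choices, and reading in chunks keeps memory use flat for large files such as ISOs.

```python
import hashlib

def sha256_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Return the SHA-256 hex digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling sha256_file on a file should produce the same hex string that sha256sum prints for it.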

Use MD5 Only When

Legacy compatibility necessary:

  • Supporting old systems that only provide MD5
  • Integrating with systems that can't be updated
  • Backward compatibility with existing databases
  • Migrating from MD5 to SHA-256

Non-security uses:

  • File deduplication (where a collision is not a security risk)
  • Checksums for file transfer integrity (non-adversarial)
  • Cache invalidation
  • Database indexing (non-security)

Examples:

# Legacy system that requires MD5
# Old antivirus database lookup
# Supporting outdated API that only accepts MD5

Never Use MD5 For

  • ✗ Security-critical integrity checking
  • ✗ Digital signatures or certificate signing
  • ✗ Password hashing
  • ✗ Malware detection hash lookup
  • ✗ Authenticating downloads
  • ✗ Access control decisions

Hash Lookup: MD5 vs SHA-256

Hash Lookup Databases

VirusTotal:

  • Accepts: MD5, SHA-1, SHA-256
  • Recommends: SHA-256 or SHA-1
  • Deprecating: MD5 for security-critical lookups
  • Storage: Has records for billions of MD5 hashes (legacy coverage)

NSRL (National Software Reference Library):

  • Primarily: SHA-1 and MD5
  • Newer entries: Include SHA-256
  • Legacy: Extensive MD5 coverage from decades of collection

YARA/Threat Intelligence:

  • Modern implementations: SHA-256 primary
  • Legacy: May include MD5
  • Best practice: Use SHA-256
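
Because these databases accept several hash types, lookup tooling often has to guess which algorithm produced a given string. A rough heuristic is the hex length. The sketch below is an assumption-laden helper, not part of any of the tools above; other algorithms share these lengths (for example, SHA3-256 also yields 64 hex characters).

```python
import re
from typing import Optional

_HEX = re.compile(r"[0-9a-fA-F]+")
_LENGTH_TO_TYPE = {32: "md5", 40: "sha1", 64: "sha256"}

def guess_hash_type(value: str) -> Optional[str]:
    """Guess the hash algorithm from hex length; returns None if unrecognized."""
    value = value.strip()
    if not _HEX.fullmatch(value):
        return None
    return _LENGTH_TO_TYPE.get(len(value))

print(guess_hash_type("5d41402abc4b2a76b9719d911017c592"))  # md5
```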

Recommendation for Hash Lookup

For current investigations: Use SHA-256

# Get SHA-256 of suspicious file
sha256sum suspicious_file.exe
# Look up in VirusTotal/Hybrid Analysis

For legacy searches: May need MD5

# If database only supports MD5
md5sum suspicious_file.exe
# Look up in older security tools

Best practice: Compute both

# Generate both hashes
sha256sum file.exe   # prints abc123...
md5sum file.exe      # prints def456...

# Check SHA-256 in modern databases first
# Fall back to MD5 if needed for legacy systems
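
The compute-both, SHA-256-first workflow above can be sketched in code. In the example below, both digests are computed in one pass over the data, and the two dictionaries are hypothetical stand-ins for a modern and a legacy hash database.

```python
import hashlib
from typing import BinaryIO, Dict, Optional

def hash_both(stream: BinaryIO, chunk_size: int = 1024 * 1024) -> Dict[str, str]:
    """Compute SHA-256 and MD5 together so the data is read only once."""
    sha256 = hashlib.sha256()
    md5 = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        sha256.update(chunk)
        md5.update(chunk)
    return {"sha256": sha256.hexdigest(), "md5": md5.hexdigest()}

def lookup(hashes: Dict[str, str],
           modern_db: Dict[str, str],
           legacy_db: Dict[str, str]) -> Optional[str]:
    """Check the SHA-256 database first; fall back to MD5 only on a miss."""
    verdict = modern_db.get(hashes["sha256"])
    if verdict is not None:
        return verdict
    return legacy_db.get(hashes["md5"])
```

A hit that comes only from the MD5 fallback deserves less trust than a SHA-256 hit, since MD5 collisions can be manufactured.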

Migration Path: MD5 to SHA-256

Organizations should plan migration:

Phase 1: Dual Support (Current)

New systems use SHA-256
Legacy systems continue MD5
Both supported where applicable

Phase 2: Gradual Transition

Compute and store both hashes
Prioritize SHA-256 in new workflows
Maintain MD5 for backward compatibility

Phase 3: SHA-256 Primary

All new implementations: SHA-256
Legacy MD5 queries: Supported but not recommended
Documentation emphasizes SHA-256

Phase 4: MD5 Deprecation (Years Away)

MD5 support removed from security-critical functions
Legacy systems individually upgraded
MD5 retained only for non-security deduplication

Timeline: 5-10 years before MD5 truly phased out from security systems.

Practical Hash Lookup Examples

Example 1: Verifying Downloaded Software

Scenario: Download Firefox installer, publisher provides SHA-256 hash

Process:

# Compute hash
sha256sum Firefox-Setup-130.0.exe

# Verify matches published hash
Published: 3a9d7b2c1e4f6a8b5c7d9e0f1a2b3c4d...
Computed:  3a9d7b2c1e4f6a8b5c7d9e0f1a2b3c4d...
Match: ✓ Verified

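Result: The file matches what the publisher posted. It is safe to install, provided the published hash came from a trusted source (for example, the vendor's HTTPS site).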
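
Comparing long hex strings by eye is error-prone, so the comparison step is worth automating. The sketch below is a minimal illustration; it takes the bytes directly for brevity, and normalizes the published value because vendors sometimes post hashes in uppercase or with trailing whitespace.

```python
import hashlib
import hmac

def matches_published(data: bytes, published_hex: str) -> bool:
    """Return True if the SHA-256 of data equals the published hex digest."""
    computed = hashlib.sha256(data).hexdigest()
    expected = published_hex.strip().lower()  # tolerate case and whitespace
    return hmac.compare_digest(computed, expected)
```

The comparison must cover the full 64 characters; checking only the first or last few is not enough.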

Example 2: Investigating Suspicious File

Scenario: Received suspicious email attachment, want to check if it's malware

Process:

# Compute SHA-256
sha256sum unknown_attachment.exe
abc123...

# Look up in VirusTotal
# Result: 42 malware detections, known as Trojan.Win32.Generic

Result: File is malware, don't execute, quarantine.
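
The lookup step can also be scripted. The sketch below only builds the HTTP request; it assumes VirusTotal's public v3 REST API, where a file report is fetched from /api/v3/files/ followed by the hash and authenticated with an x-apikey header. Check the current VirusTotal API documentation before relying on it; the API key is a placeholder you must supply.

```python
import urllib.request

# Assumed endpoint from VirusTotal's public v3 API documentation.
VT_FILES_URL = "https://www.virustotal.com/api/v3/files/"

def build_vt_request(file_hash: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for a file report by hash."""
    return urllib.request.Request(
        VT_FILES_URL + file_hash.strip().lower(),
        headers={"x-apikey": api_key, "accept": "application/json"},
    )
```

Sending the request with urllib.request.urlopen should return a JSON report for a known hash, or an HTTP 404 error if the hash has never been seen.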

Example 3: Legacy System Hash Lookup

Scenario: Old antivirus tool only accepts MD5

Process:

# Compute MD5 (only option for this tool)
md5sum old_suspicious_file.exe
5d41402abc4b2a76b9719d911017c592

# Look up in legacy database
# Result: Known malware, quarantine

Note: On a modern system, use SHA-256 instead.

Why Hash Lookup Works Better with SHA-256

Coverage

Modern threat intelligence databases prioritize SHA-256:

  • New malware samples: Submitted with SHA-256
  • Modern tools: Generate SHA-256 hashes
  • Future databases: SHA-256 native

MD5 has better historical coverage but declining new entries.

Reliability

SHA-256 lookups are more reliable because:

  • No known collisions (MD5 collisions, by contrast, are practical to generate)
  • Stronger filtering of false positives
  • Better detection algorithm integration
  • More database contributors use SHA-256

Integration

Modern security tools integrate SHA-256:

  • VirusTotal API: Prefers SHA-256
  • Hybrid Analysis: SHA-256 primary
  • EDR platforms: Use SHA-256
  • SIEM systems: SHA-256 standard

Conclusion

For hash lookup and all security purposes: Use SHA-256.

MD5 is cryptographically broken and should be avoided for anything security-related. While legacy systems and databases still contain MD5 hashes, and some tools still accept MD5 input, SHA-256 is the clear modern standard.

When investigating files or verifying integrity:

  1. Compute SHA-256 hash
  2. Look up in modern databases (VirusTotal, Hybrid Analysis, etc.)
  3. Use MD5 only if specifically required by legacy systems
  4. Plan migration away from MD5 in your organization

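The extra 32 hexadecimal characters in a SHA-256 hash versus MD5 represent 128 additional bits of output, which raises generic collision resistance from 64 bits (already broken in practice for MD5) to 128 bits. That is a worthwhile investment for protecting your systems and data.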
