What Is HTML Encoding
HTML encoding (also called HTML entity encoding) converts special characters into their HTML entity equivalents so they display correctly in web pages rather than being interpreted as HTML markup. Characters like <, >, &, ", and ' have special meaning in HTML—they define tags, attributes, and entities. When these characters appear in user content, they must be encoded to prevent rendering issues and security vulnerabilities.
HTML encoding is one of the most important defenses against Cross-Site Scripting (XSS), one of the most common classes of web security vulnerability. When user input is inserted into a web page without encoding, an attacker can inject malicious HTML or JavaScript that executes in other users' browsers. Proper encoding neutralizes these attacks by ensuring special characters are treated as text, not code.
How HTML Encoding Works
HTML encoding replaces special characters with named or numeric entity references:
| Character | Named Entity | Numeric Entity | Context |
|---|---|---|---|
| `<` | `&lt;` | `&#60;` | Opening tag delimiter |
| `>` | `&gt;` | `&#62;` | Closing tag delimiter |
| `&` | `&amp;` | `&#38;` | Entity start character |
| `"` | `&quot;` | `&#34;` | Attribute value delimiter |
| `'` | `&apos;` | `&#39;` | Attribute value delimiter |
| `/` | `&sol;` | `&#47;` | Tag closing character |
| Non-breaking space | `&nbsp;` | `&#160;` | Preserved whitespace |
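As a rough illustration of the replacement described above, the following Python sketch encodes a sample string with named and with decimal numeric references and compares the result with the standard library's `html.escape`. The table `NAMED_ENTITIES` and the helpers `encode_named` and `encode_numeric` are hypothetical names introduced only for this example.

```python
import html

# Hypothetical lookup table covering the five core special characters.
# The apostrophe uses its numeric form, which is the most widely supported.
NAMED_ENTITIES = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#39;",
}


def encode_named(text: str) -> str:
    """Replace each special character with its entity reference."""
    return "".join(NAMED_ENTITIES.get(ch, ch) for ch in text)


def encode_numeric(text: str) -> str:
    """Replace each special character with a decimal numeric reference."""
    return "".join(f"&#{ord(ch)};" if ch in NAMED_ENTITIES else ch for ch in text)


sample = '<a href="x">Tom & Jerry</a>'
print(encode_named(sample))    # &lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&lt;/a&gt;
print(encode_numeric(sample))  # &#60;a href=&#34;x&#34;&#62;Tom &#38; Jerry&#60;/a&#62;
print(html.escape(sample))     # same as encode_named for this sample
```

Both forms decode back to the same text in a browser; in production code the built-in `html.escape` is usually the simpler choice.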
Encoding contexts matter: Different insertion points in HTML require different encoding strategies:
- HTML body: Encode `<`, `>`, `&`, `"`, `'`
- HTML attributes: Encode all non-alphanumeric characters as entities
- JavaScript context: Use JavaScript string escaping, not HTML encoding
- URL context: Use URL/percent encoding, not HTML encoding
- CSS context: Use CSS escaping
Using the wrong encoding for the context is a common source of XSS vulnerabilities.
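The difference between contexts can be sketched in Python using only the standard library. This is a minimal illustration, not a complete defense: the sample input and the `/search` path are assumptions made for the example, the `href` is assumed to be double-quoted, and CSS escaping is not covered.

```python
import html
import json
from urllib.parse import quote

user_input = 'Tom & "Jerry" <3'  # sample untrusted value (assumption)

# HTML body context: entity-encode the text.
body = f"<p>{html.escape(user_input)}</p>"

# URL context: percent-encode the value, then entity-encode the finished URL
# because it is placed inside a double-quoted HTML attribute.
url = "/search?q=" + quote(user_input, safe="")
link = f'<a href="{html.escape(url, quote=True)}">search</a>'

# JavaScript string context: json.dumps yields a quoted string literal;
# escaping "<" as well keeps "</script>" from ending an inline script block.
js_literal = json.dumps(user_input).replace("<", "\\u003c")

print(body)        # <p>Tom &amp; &quot;Jerry&quot; &lt;3</p>
print(link)        # <a href="/search?q=Tom%20%26%20%22Jerry%22%20%3C3">search</a>
print(js_literal)  # "Tom & \"Jerry\" \u003c3"
```

The same input produces three different outputs, which is why an encoder chosen for one context should not be reused in another.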
Common Use Cases
- XSS prevention: Encode user-supplied data before inserting it into HTML to prevent script injection
- Content display: Ensure code snippets, math formulas, and special characters render correctly on web pages
- Email templates: Encode special characters in HTML emails to prevent rendering issues across email clients
- CMS content: Safely display user-generated content (comments, forum posts, profiles) without allowing HTML injection
- API responses: Encode HTML entities in JSON responses that will be rendered in the browser
Best Practices
- Encode on output, not input — Store raw data and encode when rendering; this preserves data integrity and allows context-appropriate encoding
- Use context-appropriate encoding — HTML body encoding is different from attribute encoding, JavaScript encoding, and URL encoding
- Use framework auto-encoding — Modern frameworks (React, Angular, Vue) auto-encode by default; don't disable this protection
- Never rely on blocklist filtering — Trying to strip dangerous tags is fragile; encoding is the correct defense
- Double-check dangerouslySetInnerHTML / v-html — When frameworks require raw HTML insertion, sanitize with a library like DOMPurify first
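The "encode on output" practice might look like the following Python sketch. The in-memory `comments` list and the functions `save_comment` and `render_comments` are hypothetical stand-ins for a real data store and template layer.

```python
import html

# Hypothetical in-memory store; raw user text is kept unmodified.
comments: list[str] = []


def save_comment(raw: str) -> None:
    """Store the text exactly as submitted (no encoding at input time)."""
    comments.append(raw)


def render_comments() -> str:
    """Encode each comment at the moment it is written into HTML."""
    items = "".join(f"<li>{html.escape(c)}</li>" for c in comments)
    return f"<ul>{items}</ul>"


save_comment('<script>alert("XSS")</script>')
page = render_comments()
print(page)
# <ul><li>&lt;script&gt;alert(&quot;XSS&quot;)&lt;/script&gt;</li></ul>
```

Because the stored value stays raw, the same comment could later be rendered into a different context (JSON, plain-text email) with the encoding that context requires.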
Frequently Asked Questions
Common questions about the HTML Entity Encoder/Decoder
HTML entities encode special characters that have meaning in HTML: `<` becomes `&lt;`, `>` becomes `&gt;`, `&` becomes `&amp;`, `"` becomes `&quot;`, `'` becomes `&#39;` or `&apos;`. Why it matters: encoding prevents broken HTML structure, avoids XSS (cross-site scripting) attacks, displays reserved characters literally, and ensures proper rendering. Example: displaying the code `<script>` without executing it. Two formats: named entities (`&nbsp;`) and numeric entities (`&#160;` decimal, `&#xA0;` hex). Always encode user input before displaying it in HTML to prevent security vulnerabilities.
XSS (Cross-Site Scripting) injects malicious scripts into pages. Without encoding: the user input `<script>alert("XSS")</script>` executes as code. With encoding: `&lt;script&gt;alert(&quot;XSS&quot;)&lt;/script&gt;` displays as text. Attack vectors: form inputs, URL parameters, cookies, database content. Defense layers: (1) Encode output (HTML entities), (2) Validate input (allowlist), (3) Content Security Policy headers, (4) HttpOnly cookies. Encoding alone is not sufficient: use comprehensive XSS prevention, sanitize HTML if you allow markup, and use frameworks that auto-encode (React, Vue). This tool helps encode untrusted content before rendering.
HTML encoding: for HTML content, encodes `<`, `>`, `&`, `"`, `'` to entities, used in the HTML body and attributes, prevents HTML interpretation. URL encoding (percent encoding): for URLs, encodes a space as `%20` or `+` and special characters as `%XX` hex, used in query strings and paths, prevents URL parsing issues. Different contexts need different encoding: the HTML entity for `<` is `&lt;`, the URL encoding for `<` is `%3C`. Don't mix them up: URL-encoded text placed in HTML content looks wrong (`%20` displays as `%20`). When to use: HTML encoding in page content; URL encoding for values placed in URLs, followed by HTML attribute encoding when the URL goes into an href or src attribute; JavaScript strings need JavaScript string escaping rather than either of these. This tool does HTML entity encoding; use a separate tool for URL encoding.
Required in attribute values: a double quote becomes `&quot;` in double-quoted attributes, a single quote becomes `&#39;` in single-quoted attributes, `<` becomes `&lt;` (less commonly needed but safe), and `&` becomes `&amp;` (always).
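A small Python sketch of attribute encoding, assuming double-quoted attributes. The helper `attr_value` and the sample strings are hypothetical; `html.escape(..., quote=True)` is the standard-library call doing the work.

```python
import html


def attr_value(value: str) -> str:
    """Encode a value for use inside a quoted HTML attribute."""
    return html.escape(value, quote=True)


title = 'He said "hi" & left'
tag = f'<span title="{attr_value(title)}">note</span>'
print(tag)  # <span title="He said &quot;hi&quot; &amp; left">note</span>

# An attempted attribute breakout stays inside the value.
payload = '" onclick="alert(1)'
safe_tag = f'<input value="{attr_value(payload)}">'
print(safe_tag)  # <input value="&quot; onclick=&quot;alert(1)">
```

This only holds for quoted attributes; unquoted attribute values can be terminated by whitespace and other characters and need far more aggressive encoding.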
Unicode characters can be: (1) Used directly if the page declares a UTF-8 charset with `<meta charset="UTF-8">`; no encoding is needed. (2) Written as HTML entities: `&eacute;` for é (named), `&#233;` for é (decimal), `&#xE9;` for é (hex). Modern approach: use UTF-8 directly; it is simpler and more readable. Use entities only for: HTML special characters (`<`, `>`, `&`, `"`), control characters, invisible characters, and compatibility with non-UTF-8 systems. Emoji: use directly in UTF-8 or as numeric entities, for example `&#128512;` or `&#x1F600;` for 😀. Right-to-left text: use proper HTML markup (`dir="rtl"`), not entities. This tool preserves Unicode by default and encodes only HTML special characters.
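The relationship between the three entity forms can be checked with Python's `html` module. The helper `numeric_entity` is a hypothetical name used only in this sketch.

```python
import html

# Named, decimal, and hex references all decode to the same character.
assert html.unescape("&eacute;") == html.unescape("&#233;") == html.unescape("&#xE9;") == "é"


def numeric_entity(ch: str, hex_form: bool = False) -> str:
    """Build a numeric character reference from a character's code point."""
    cp = ord(ch)
    return f"&#x{cp:X};" if hex_form else f"&#{cp};"


print(numeric_entity("😀"))        # &#128512;
print(numeric_entity("😀", True))  # &#x1F600;

# html.escape leaves non-ASCII text alone and encodes only the special characters.
print(html.escape("café <b>"))    # café &lt;b&gt;

# For an ASCII-only target, the xmlcharrefreplace error handler
# converts everything outside ASCII to numeric references.
print("café".encode("ascii", "xmlcharrefreplace").decode("ascii"))  # caf&#233;
```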
Encoding: converts special characters to entities, displays everything as text, no HTML tags work, safest for untrusted content, example: `<b>` → `&lt;b&gt;`. Sanitizing: allows some HTML, removes dangerous tags and attributes, permits formatting (`<b>`, `<i>`, `<p>`), blocks scripts (`<script>`, `onclick`), more complex. Use encoding when: displaying plain text, showing user input as-is, no formatting is needed, maximum security is required. Use sanitizing when: allowing rich text editors, blog comments with formatting, or markdown converted to HTML. Libraries for sanitizing: DOMPurify, sanitize-html. Never trust user HTML: always sanitize or encode. This tool does encoding (safest); use sanitizing libraries for rich text.
JavaScript: no built-in HTML encode function. Manual: `text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;")` (replace `&` first, and also encode `"` and `'` for attribute values). Libraries: he, lodash escape (DOMPurify sanitizes rather than encodes). DOM method: textContent treats the value as plain text (safe), innerHTML parses it as markup (unsafe). Python: html.escape(text) built-in, or markupsafe.escape(). PHP: htmlspecialchars($text) or htmlentities($text). Java: StringEscapeUtils.escapeHtml4() (Apache Commons). Ruby: ERB::Util.html_escape() or CGI.escapeHTML(). C#: HttpUtility.HtmlEncode() or WebUtility.HtmlEncode(). Server-side encoding is preferred: encode before sending to the browser. This tool is for quick manual encoding and testing; use language built-ins in production code.
Double encoding: encoding already-encoded text, so `&lt;` becomes `&amp;lt;` and displays as the literal text `&lt;` instead of `<`. Fix: decode first, then encode. Wrong context: HTML encoding inside JavaScript strings, which need JavaScript (JSON) string escaping. Incomplete encoding: missing quotes or ampersands, still vulnerable. Over-encoding: encoding Unicode unnecessarily, which makes the source unreadable. Not encoding: forgetting to encode user input, an XSS vulnerability. Encoding at the wrong time: too early (breaks processing) or too late (already executed). Inconsistent encoding: some fields encoded, others not. Testing: verify with XSS payloads such as `<script>alert(1)</script>` and `" onclick="...`. This tool helps test encoding correctness before deploying.
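Double encoding and the decode-then-encode fix can be reproduced in a few lines of Python. The helpers `encode_once` and `bad_manual` are hypothetical names for this sketch. One assumption to keep in mind: decoding first is only appropriate when the input is known to be possibly pre-encoded, since it would also alter text that is meant to contain a literal `&lt;`.

```python
import html

raw = "a < b"
once = html.escape(raw)    # "a &lt; b"
twice = html.escape(once)  # "a &amp;lt; b" (renders as the text "a &lt; b")


def encode_once(text: str) -> str:
    """Decode any existing entities, then encode exactly once."""
    return html.escape(html.unescape(text))


print(encode_once(raw))    # a &lt; b
print(encode_once(once))   # a &lt; b
print(encode_once(twice))  # a &amp;lt; b  (only one layer is removed per call)


def bad_manual(text: str) -> str:
    """Hand-rolled encoder that replaces '&' last, re-encoding its own output."""
    return text.replace("<", "&lt;").replace("&", "&amp;")


print(bad_manual("<"))     # &amp;lt;
```

The last function shows why a manual encoder has to handle `&` before any other character.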
⚠️ Security Notice
This tool is provided for educational and authorized security testing purposes only. Always ensure you have proper authorization before testing any systems or networks you do not own. Unauthorized access or security testing may be illegal in your jurisdiction. All processing happens client-side in your browser - no data is sent to our servers.