A URL (Uniform Resource Locator) identifies a resource on the web and tells a client how to reach it. Understanding URL structure is essential for web development, security analysis, and API integration.
Anatomy of a URL
https://user:pass@example.com:443/path/to/page?query=value&foo=bar#section
└─┬─┘   └───┬───┘ └────┬────┘ └┬┘└─────┬─────┘ └────────┬────────┘ └──┬──┘
scheme     auth     domain    port   path             query       fragment
- Scheme (protocol): http, https, ftp, mailto, etc.
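- Authentication (userinfo): Optional username:password. Embedding credentials this way is deprecated because URLs are routinely logged and displayed in clear text.
- Domain (host): The server address, given as a domain name (example.com), an IPv4 address (192.168.1.1), or a bracketed IPv6 address ([::1]).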
- Port: Optional service port (defaults: 80 for HTTP, 443 for HTTPS).
- Path: Hierarchical location of the resource (/api/users/123).
- Query string: Parameters passed to the resource (?search=test&page=2).
- Fragment: Client-side identifier within the resource (#section-3); browsers do not send it to the server.
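The components listed above can be pulled apart with Python's standard urllib.parse module. This is a minimal sketch, assuming the example URL from the diagram with the placeholder credentials user:pass; the variable names are illustrative only.

```python
from urllib.parse import urlsplit, parse_qs

# Example URL from the diagram; "user:pass" is a placeholder credential pair.
url = "https://user:pass@example.com:443/path/to/page?query=value&foo=bar#section"
parts = urlsplit(url)

components = {
    "scheme": parts.scheme,          # protocol
    "username": parts.username,      # first half of the userinfo
    "password": parts.password,      # second half of the userinfo
    "hostname": parts.hostname,      # domain or IP address, lowercased
    "port": parts.port,              # int, or None when no port is given
    "path": parts.path,
    "query": parse_qs(parts.query),  # dict mapping each name to a list of values
    "fragment": parts.fragment,
}

for name, value in components.items():
    print(f"{name}: {value!r}")
```

Note that parse_qs returns a list per parameter because a query string may repeat the same name more than once.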
Common URL schemes
- http/https: Web pages and APIs (https is encrypted).
- ftp/ftps: File transfer protocol.
- mailto: Email addresses (mailto:user@example.com).
- tel: Phone numbers (tel:+1-555-0100).
- file: Local file system access (file:///C:/path/to/file).
- data: Inline data (data:image/png;base64,iVBORw0K...).
- ws/wss: WebSocket connections (wss:// is encrypted).
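Not every scheme uses the //host form. As a rough illustration, the sketch below runs a few URLs modeled on the list above through urllib.parse.urlsplit and records the scheme, the authority (netloc), and the path. The mailto address and the wss path are hypothetical placeholders.

```python
from urllib.parse import urlsplit

# Sample URLs modeled on the scheme list; the mailto address and
# the wss path are hypothetical placeholders.
examples = [
    "https://example.com/page",
    "mailto:user@example.com",
    "tel:+1-555-0100",
    "file:///C:/path/to/file",
    "wss://example.com/socket",
]

# Map each URL to (scheme, authority, path).
parsed = {}
for url in examples:
    parts = urlsplit(url)
    parsed[url] = (parts.scheme, parts.netloc, parts.path)

for url, (scheme, netloc, path) in parsed.items():
    print(f"{scheme:7} netloc={netloc!r:15} path={path!r}")
```

Schemes such as mailto and tel carry no authority at all, and file URLs usually have an empty one, which is why a filter that only inspects the host can miss them.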
URL encoding (percent-encoding)
Special characters must be encoded as %XX hex values:
- Space: %20 (or + in form-encoded query strings, application/x-www-form-urlencoded)
- Special chars: ! = %21, # = %23, $ = %24, & = %26, etc.
- Unicode: Multi-byte UTF-8 sequences (é = %C3%A9)
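The encodings above can be reproduced with quote, quote_plus, and unquote from Python's urllib.parse. This is a small sketch; the sample strings are illustrative, not taken from any particular application.

```python
from urllib.parse import quote, quote_plus, unquote

# A space becomes %20 with quote(), or + with the form-style quote_plus().
encoded_space = quote("hello world")
form_space = quote_plus("hello world")

# safe="" tells quote() to encode every reserved character, including "/".
encoded_specials = quote("!#$&", safe="")

# Non-ASCII text is encoded as UTF-8 first, then each byte is percent-encoded.
encoded_unicode = quote("é")

# unquote() reverses the process, decoding the bytes as UTF-8.
decoded = unquote("caf%C3%A9")

print(encoded_space, form_space, encoded_specials, encoded_unicode, decoded)
```

Encode each component (a path segment, a query value) separately before assembling the URL; encoding a whole URL at once would also escape the delimiters that give it structure.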
Security considerations
- Open redirects: Validate redirect URLs to prevent phishing (use allowlists).
- URL injection: Sanitize user input before constructing URLs.
- Information disclosure: Avoid sensitive data in URLs, which are recorded in server logs and browser history.
- Homograph attacks: Visually similar characters can impersonate a trusted domain (examp1e.com or exampℓe.com in place of example.com).
- SSRF vulnerabilities: Validate URLs before server-side fetches.
- Protocol smuggling: Attackers can use data:, javascript:, or file: schemes to bypass filters.
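Several of these checks can be expressed as an allowlist on the parsed URL rather than pattern matching on the raw string. The sketch below is one possible approach, not a complete defense: ALLOWED_SCHEMES, ALLOWED_HOSTS, is_safe_redirect, and is_internal_ip_literal are hypothetical names, and the SSRF check covers only hosts written as IP literals. A hostname that resolves to an internal address would still need a check after DNS resolution.

```python
import ipaddress
from urllib.parse import urlsplit

# Hypothetical allowlists; a real application would load its own.
ALLOWED_SCHEMES = {"https"}
ALLOWED_HOSTS = {"example.com", "www.example.com"}


def is_safe_redirect(target: str) -> bool:
    """Accept only absolute https URLs whose host is on the allowlist."""
    try:
        parts = urlsplit(target)
    except ValueError:
        return False  # malformed URL
    if parts.scheme not in ALLOWED_SCHEMES:
        return False  # rejects javascript:, data:, file:, and scheme-less input
    if parts.username or parts.password:
        return False  # rejects tricks like https://example.com@evil.example.net/
    return parts.hostname in ALLOWED_HOSTS


def is_internal_ip_literal(url: str) -> bool:
    """Return True when the host is a private, loopback, or link-local IP literal."""
    host = urlsplit(url).hostname
    if host is None:
        return False
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return False  # a domain name, not an IP literal
    return ip.is_private or ip.is_loopback or ip.is_link_local


print(is_safe_redirect("https://example.com/account"))
print(is_safe_redirect("javascript:alert(1)"))
print(is_internal_ip_literal("http://192.168.1.1/admin"))
```

Comparing the parsed hostname, instead of searching the string for a trusted domain, is what defeats inputs that merely contain the trusted name somewhere in the URL.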
Best practices
- Always use HTTPS for sensitive data transmission.
- Keep URLs short and descriptive for better SEO and user experience.
- Use hyphens (-) instead of underscores (_) in paths.
- Avoid exposing session IDs or tokens in URLs (use cookies or headers).
- Implement proper URL validation and sanitization on both client and server.
- Use canonical URLs to prevent duplicate content issues.
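One way to reduce duplicate URLs is to normalize them before storing or comparing. The sketch below is an assumption about what such a normalization might include (lowercase scheme and host, drop the default port, drop the fragment and any embedded credentials). The function name canonicalize and the DEFAULT_PORTS table are hypothetical, and real sites often need further rules such as trailing-slash or query-parameter handling.

```python
from urllib.parse import urlsplit, urlunsplit

# Default ports from the anatomy section: 80 for HTTP, 443 for HTTPS.
DEFAULT_PORTS = {"http": 80, "https": 443}


def canonicalize(url: str) -> str:
    """Return a normalized form of an http(s) URL (illustrative rules only)."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()  # userinfo is deliberately dropped
    if ":" in host:
        host = f"[{host}]"  # restore the brackets around an IPv6 literal
    netloc = host
    if parts.port is not None and parts.port != DEFAULT_PORTS.get(scheme):
        netloc = f"{host}:{parts.port}"  # keep only non-default ports
    path = parts.path or "/"  # an empty path is equivalent to "/"
    # The fragment is client-side only, so it is removed.
    return urlunsplit((scheme, netloc, path, parts.query, ""))


print(canonicalize("HTTPS://Example.COM:443/Path?a=1#top"))
print(canonicalize("http://example.com:8080"))
```

The path and query are left untouched because they can be case-sensitive on the server.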
URL vs URI
- URI (Uniform Resource Identifier): Generic term for resource identifiers (includes URL and URN).
- URL: Specifies location and access method (https://example.com/page).
- URN (Uniform Resource Name): Name-based identifier (urn:isbn:0-486-27557-4).
All URLs are URIs, but not all URIs are URLs.