A URL uniquely identifies and locates resources on the web. Understanding URL structure is essential for web development, security analysis, and API integration.
Anatomy of a URL
https://user:[email protected]:443/path/to/page?query=value&foo=bar#section
└──┬─┘ └───┬───┘ └────┬─────┘└┬┘ └─────┬─────┘ └────────┬────────┘ └───┬──┘
scheme auth domain port path query fragment
- Scheme (protocol): http, https, ftp, mailto, etc.
- Authentication: Optional username:password (deprecated for security).
- Domain (hostname): The server address (example.com, 192.168.1.1, [::1]).
- Port: Optional service port (defaults: 80 for HTTP, 443 for HTTPS).
- Path: Hierarchical location of the resource (/api/users/123).
- Query string: Parameters passed to the resource (?search=test&page=2).
- Fragment: Client-side identifier within the resource (#section-3).
Common URL schemes
- http/https: Web pages and APIs (https is encrypted).
- ftp/ftps: File transfer protocol.
- mailto: Email addresses (mailto:[email protected]).
- tel: Phone numbers (tel:+1-555-0100).
- file: Local file system access (file:///C:/path/to/file).
- data: Inline data (data:image/png;base64,iVBORw0K...).
- ws/wss: WebSocket connections (wss:// is encrypted).
URL encoding (percent-encoding) Special characters must be encoded as %XX hex values:
- Space: %20 (or + in query strings)
- Special chars: ! = %21, # = %23, $ = %24, & = %26, etc.
- Unicode: Multi-byte UTF-8 sequences (é = %C3%A9)
Security considerations
- Open redirects: Validate redirect URLs to prevent phishing (use allowlists).
- URL injection: Sanitize user input before constructing URLs.
- Information disclosure: Avoid sensitive data in URLs (logged in server logs, browser history).
- Homograph attacks: Visually similar Unicode characters (examp1e.com vs exampℓe.com).
- SSRF vulnerabilities: Validate URLs before server-side fetches.
- Protocol smuggling: Attackers can use data:, javascript:, or file: schemes to bypass filters.
Best practices
- Always use HTTPS for sensitive data transmission.
- Keep URLs short and descriptive for better SEO and user experience.
- Use hyphens (-) instead of underscores (_) in paths.
- Avoid exposing session IDs or tokens in URLs (use cookies or headers).
- Implement proper URL validation and sanitization on both client and server.
- Use canonical URLs to prevent duplicate content issues.
URL vs URI
- URI (Uniform Resource Identifier): Generic term for resource identifiers (includes URL and URN).
- URL: Specifies location and access method (https://example.com/page).
- URN (Uniform Resource Name): Name-based identifier (urn:isbn:0-486-27557-4).
All URLs are URIs, but not all URIs are URLs.
Explore More Web Technologies
View all termsAPI Endpoint
A specific URL where an API can be accessed, representing a function or resource in a web service.
Read more →HTTP Status Codes
Three-digit codes returned by web servers to indicate the result of an HTTP request.
Read more →Link Rot
The phenomenon where hyperlinks become permanently unavailable as web pages are moved or deleted.
Read more →User Agent String
A text string sent by web browsers to identify the browser, operating system, and device to web servers.
Read more →