If you have ever stared at a wall of nginx access logs or syslog lines and wished you could turn them into clean, queryable fields, grok patterns are the tool for the job. Grok is the parsing language used by Logstash, Elasticsearch ingest pipelines, and many other log shippers to extract structured data from unstructured text. Instead of writing a raw regular expression for every log format, you compose named patterns like %{IP:client} and %{TIMESTAMP_ISO8601:timestamp} into a readable expression that both matches the line and labels each captured field.
This guide is a working reference of grok pattern examples for the log formats you are most likely to parse: Nginx, Apache, syslog, SSH auth logs, Cisco ASA, generic ISO 8601 application logs, Java application logs, AWS ELB, HAProxy, PostgreSQL, IIS, and Docker. Every pattern below has been tested against the sample log shown with it, so you can copy them into your own pipeline with confidence. If you want the deeper background on how grok relates to plain regular expressions and why you would choose one over the other, read Grok vs Regex: When to Use Each for Log Parsing first. Otherwise, jump straight to the format you need.
Each example includes a "Try it" link that opens the pattern and sample log directly in the Grok Pattern Builder & Debugger so you can see the extracted fields highlighted in real time.
Nginx Access Logs
Sample log line:
192.168.1.50 - frank [10/Oct/2026:13:55:36 -0700] "GET /api/users?page=2 HTTP/1.1" 200 2326 "https://example.com/dashboard" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"
Grok pattern:
%{IPORHOST:remote_addr} - %{HTTPDUSER:remote_user} \[%{HTTPDATE:timestamp}\] "%{WORD:method} %{NOTSPACE:request} HTTP/%{NUMBER:http_version}" %{NUMBER:status:int} %{NUMBER:bytes_sent:int} "%{DATA:referrer}" "%{DATA:user_agent}"
Fields extracted:
| Field | Pattern | Example value |
|---|---|---|
| remote_addr | %{IPORHOST} | 192.168.1.50 |
| remote_user | %{HTTPDUSER} | frank |
| timestamp | %{HTTPDATE} | 10/Oct/2026:13:55:36 -0700 |
| method | %{WORD} | GET |
| request | %{NOTSPACE} | /api/users?page=2 |
| http_version | %{NUMBER} | 1.1 |
| status | %{NUMBER} (cast :int) | 200 |
| bytes_sent | %{NUMBER} (cast :int) | 2326 |
| referrer | %{DATA} | https://example.com/dashboard |
| user_agent | %{DATA} | Mozilla/5.0 ... |
The default nginx combined log format is nearly identical to Apache's, which is why you can often reuse the same pattern across both. The two details that trip people up: the request field is a single token captured with %{NOTSPACE} because the query string can contain characters that %{URIPATH} will not match, and the trailing referrer and user_agent fields use %{DATA} rather than %{GREEDYDATA} because they are bounded by quotes. Casting status and bytes_sent to :int matters if you plan to do numeric aggregations in Elasticsearch — without the cast they are indexed as strings and you cannot compute averages or ranges. If your nginx config adds custom fields (request time, upstream address), append matching patterns to the end rather than trying to retrofit them in the middle.
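To make the composition and the :int cast concrete, here is a minimal Python sketch of how a grok expression can be expanded into an ordinary regular expression with named groups. The PATTERNS table is a deliberately simplified stand-in for the real pattern library (an assumption for illustration), and compile_grok and grok are hypothetical helper names, not part of any grok implementation.

```python
import re

# Simplified stand-ins for a few library patterns. These are assumptions made
# for illustration; the real grok definitions are stricter and more complete.
PATTERNS = {
    "IPORHOST": r"[A-Za-z0-9.:-]+",
    "HTTPDUSER": r"[A-Za-z0-9._@-]+",
    "HTTPDATE": r"\d{1,2}/[A-Za-z]{3}/\d{4}:\d{2}:\d{2}:\d{2} [+-]\d{4}",
    "WORD": r"\w+",
    "NOTSPACE": r"\S+",
    "NUMBER": r"[+-]?\d+(?:\.\d+)?",
    "DATA": r".*?",
}
CASTS = {"int": int, "float": float}


def compile_grok(expression):
    """Expand %{NAME:field:type} references into a regex with named groups."""
    casts = {}

    def expand(match):
        name, field, cast = match.group(1), match.group(2), match.group(3)
        if field is None:
            return f"(?:{PATTERNS[name]})"
        if cast:
            casts[field] = CASTS[cast]
        return f"(?P<{field}>{PATTERNS[name]})"

    regex = re.sub(r"%\{(\w+)(?::(\w+))?(?::(\w+))?\}", expand, expression)
    return re.compile(regex), casts


def grok(expression, line):
    """Return the extracted fields as a dict, or None if the line does not match."""
    regex, casts = compile_grok(expression)
    match = regex.search(line)
    if match is None:
        return None
    return {
        key: casts.get(key, str)(value)
        for key, value in match.groupdict().items()
        if value is not None
    }


NGINX = (
    r'%{IPORHOST:remote_addr} - %{HTTPDUSER:remote_user} \[%{HTTPDATE:timestamp}\] '
    r'"%{WORD:method} %{NOTSPACE:request} HTTP/%{NUMBER:http_version}" '
    r'%{NUMBER:status:int} %{NUMBER:bytes_sent:int} "%{DATA:referrer}" "%{DATA:user_agent}"'
)
SAMPLE = (
    '192.168.1.50 - frank [10/Oct/2026:13:55:36 -0700] '
    '"GET /api/users?page=2 HTTP/1.1" 200 2326 '
    '"https://example.com/dashboard" '
    '"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"'
)

fields = grok(NGINX, SAMPLE)
print(fields["status"], type(fields["status"]).__name__)
```

The cast is the only step that changes a value's type: every capture starts life as a string, and status and bytes_sent become integers only because the expression asked for it.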
Try it in the Grok Pattern Builder →
Apache Access Logs
Sample log line:
127.0.0.1 - peter [9/Feb/2026:10:34:12 -0700] "GET /sample-image.png HTTP/2" 200 1479 "https://example.com/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)"
Grok pattern:
%{COMBINEDAPACHELOG}
Fields extracted:
Using the built-in %{COMBINEDAPACHELOG} macro, you get: clientip, ident, auth, timestamp, verb, request, httpversion, response, bytes, referrer, and agent. The field names are fixed by the macro definition.
This is the single most useful shortcut in the entire grok library. %{COMBINEDAPACHELOG} is a composite pattern that expands to the full combined-log expression, so you do not have to write any of it by hand. There is also %{COMMONAPACHELOG} for the shorter common-log format that omits the referrer and user agent. The catch is that the field names are baked in (clientip, verb, response) and differ from the names you might choose yourself, so downstream dashboards must use those exact names. These are the legacy names; when ECS compatibility is enabled, the same macro emits Elastic Common Schema names (for example source.address rather than clientip), so check which mode your pipeline runs in. If your Apache LogFormat directive deviates even slightly from the standard combined format (a common cause is adding %D for request duration or %{X-Forwarded-For}i), the macro will fail to match and you will need to fall back to spelling out the pattern explicitly, much like the nginx example above.
Try it in the Grok Pattern Builder →
Syslog (RFC 3164)
Sample log line:
Jan 15 10:30:45 webserver01 sshd[12345]: Failed password for invalid user admin from 203.0.113.42 port 51234 ssh2
Grok pattern:
%{SYSLOGTIMESTAMP:timestamp} %{SYSLOGHOST:hostname} %{PROG:program}(?:\[%{POSINT:pid:int}\])?: %{GREEDYDATA:message}
Fields extracted:
| Field | Pattern | Example value |
|---|---|---|
| timestamp | %{SYSLOGTIMESTAMP} | Jan 15 10:30:45 |
| hostname | %{SYSLOGHOST} | webserver01 |
| program | %{PROG} | sshd |
| pid | %{POSINT} (cast :int) | 12345 |
| message | %{GREEDYDATA} | Failed password for ... |
Traditional BSD syslog (RFC 3164) is deceptively annoying to parse because the timestamp has no year and uses a space-padded day: "Jan  5" has two spaces before the day, "Jan 15" has one. %{SYSLOGTIMESTAMP} handles both cases. The PID is optional, since many programs log without one, which is why it sits inside the non-capturing optional group (?:\[%{POSINT:pid:int}\])?. Drop that optionality and any line without a PID will fail to match. Everything after the colon is swept up by %{GREEDYDATA:message}; in practice you then apply a second, message-specific grok pattern to extract the structured detail. The SSH example below shows such a message-specific pattern, written out against the full line. If you are on a modern stack emitting RFC 5424 structured syslog, this pattern will not fit: that format opens with a priority value and a version digit, followed by an ISO 8601 timestamp.
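The effect of the optional PID group is easy to check with plain regular expressions. The sketch below uses simplified Python approximations of the grok pieces (an assumption, not the library definitions), and the second sample line, a PID-less kernel message with a space-padded day, is made up for illustration.

```python
import re

# Simplified approximations of SYSLOGTIMESTAMP, SYSLOGHOST and PROG.
HEAD = (
    r"(?P<timestamp>[A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<hostname>\S+) (?P<program>[\w./-]+)"
)
PID = r"\[(?P<pid>\d+)\]"
TAIL = r": (?P<message>.*)"

optional_pid = re.compile(HEAD + f"(?:{PID})?" + TAIL)  # as in the pattern above
required_pid = re.compile(HEAD + PID + TAIL)            # optionality dropped

lines = [
    "Jan 15 10:30:45 webserver01 sshd[12345]: Failed password for invalid user "
    "admin from 203.0.113.42 port 51234 ssh2",
    "Jan  5 08:01:02 webserver01 kernel: eth0 link up",  # hypothetical, no PID
]

for line in lines:
    loose = optional_pid.match(line)
    strict = required_pid.match(line)
    print(loose["program"], loose["pid"], "strict matched:", strict is not None)
```

With the optional group the kernel line still parses and pid is simply absent; with the PID required, the same line yields no fields at all.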
Try it in the Grok Pattern Builder →
SSH Authentication Failures
Sample log line:
Jan 15 10:30:45 webserver01 sshd[12345]: Failed password for invalid user admin from 203.0.113.42 port 51234 ssh2
Grok pattern:
%{SYSLOGTIMESTAMP:timestamp} %{SYSLOGHOST:hostname} sshd\[%{POSINT:pid:int}\]: Failed password for (?:invalid user )?%{USERNAME:username} from %{IP:source_ip} port %{POSINT:source_port:int} ssh2
Fields extracted:
| Field | Pattern | Example value |
|---|---|---|
| timestamp | %{SYSLOGTIMESTAMP} | Jan 15 10:30:45 |
| hostname | %{SYSLOGHOST} | webserver01 |
| pid | %{POSINT} (cast :int) | 12345 |
| username | %{USERNAME} | admin |
| source_ip | %{IP} | 203.0.113.42 |
| source_port | %{POSINT} (cast :int) | 51234 |
This is the syslog pattern specialized for one specific message: failed SSH logins from /var/log/auth.log. It is a workhorse for security monitoring and brute-force detection because it pulls out the attacking source_ip and the targeted username as discrete fields you can aggregate, alert on, or feed to a blocklist. The key subtlety is the (?:invalid user )? non-capturing group: OpenSSH logs "Failed password for admin" when the account exists and "Failed password for invalid user admin" when it does not, and you want a single pattern to catch both. Once these events are structured, a simple count of failures per source_ip over a time window is enough to drive an automated alert. If you are wiring this into a detection pipeline, an incident alerting tool like Alert24 can turn a spike in failed-auth events into a real-time notification.
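As a rough sketch of the counting step, the Python below extracts source_ip from failed-login lines and flags addresses at or above a threshold. It assumes the lines passed in already belong to a single time window; the regex is a simplified approximation of the grok pattern, the threshold of 3 is arbitrary, and every sample line except the first is invented for the example.

```python
import re
from collections import Counter

FAILED_LOGIN = re.compile(
    r"sshd\[\d+\]: Failed password for (?:invalid user )?(?P<username>\S+) "
    r"from (?P<source_ip>\S+) port (?P<source_port>\d+) ssh2"
)


def failures_by_ip(lines):
    """Count failed-password events per source address."""
    counts = Counter()
    for line in lines:
        match = FAILED_LOGIN.search(line)
        if match:
            counts[match["source_ip"]] += 1
    return counts


def over_threshold(counts, threshold):
    """Return the addresses with at least `threshold` failures, sorted."""
    return sorted(ip for ip, n in counts.items() if n >= threshold)


window = [
    "Jan 15 10:30:45 webserver01 sshd[12345]: Failed password for invalid user admin from 203.0.113.42 port 51234 ssh2",
    "Jan 15 10:30:47 webserver01 sshd[12346]: Failed password for root from 203.0.113.42 port 51240 ssh2",
    "Jan 15 10:30:49 webserver01 sshd[12347]: Failed password for invalid user test from 203.0.113.42 port 51251 ssh2",
    "Jan 15 10:31:02 webserver01 sshd[12350]: Failed password for deploy from 198.51.100.7 port 40022 ssh2",
    "Jan 15 10:31:05 webserver01 sshd[12351]: Accepted password for deploy from 198.51.100.7 port 40030 ssh2",
]

counts = failures_by_ip(window)
suspects = over_threshold(counts, threshold=3)
print(suspects)
```

Both the "invalid user" and the existing-account variants land in the same counter, which is the point of the optional group.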
Try it in the Grok Pattern Builder →
Cisco ASA Firewall Logs
Sample log line:
Jan 15 10:30:45 fw01 %ASA-6-302013: Built outbound TCP connection 110577 for outside:198.51.100.10/443 (198.51.100.10/443) to inside:10.0.0.25/52837 (203.0.113.5/52837)
Grok pattern:
%{SYSLOGTIMESTAMP:timestamp} %{HOSTNAME:device} %ASA-%{INT:level:int}-%{INT:message_id}: Built %{WORD:direction} %{WORD:protocol} connection %{INT:connection_id} for %{DATA:src_interface}:%{IP:src_ip}/%{INT:src_port:int} \(%{IP:src_mapped_ip}/%{INT:src_mapped_port:int}\) to %{DATA:dst_interface}:%{IP:dst_ip}/%{INT:dst_port:int} \(%{IP:dst_mapped_ip}/%{INT:dst_mapped_port:int}\)
Fields extracted:
Key fields: timestamp, device, level, message_id (302013), direction (outbound), protocol (TCP), connection_id, plus source and destination interface, IP, and port pairs (src_ip/src_port, dst_ip/dst_port) and their NAT-mapped equivalents.
Cisco ASA syslog is one of the harder formats to grok because a single device emits hundreds of different message IDs, each with its own body structure. This pattern targets message 302013 ("Built ... connection"), which records new connections through the firewall. The literal %ASA- prefix looks risky because % is part of grok's reference syntax, but a reference only begins at the two-character sequence %{. Here % is followed by A, so it is matched as a literal character and needs no escaping. The real complexity is the NAT mapping: ASA logs both the real address and the translated address in parentheses, so you need paired captures like %{IP:src_ip}/%{INT:src_port:int} \(%{IP:src_mapped_ip}/%{INT:src_mapped_port:int}\). Because each message ID needs its own pattern, the standard approach is a grok filter with an array of patterns tried in order, or a conditional keyed on message_id extracted by a lightweight first-pass pattern.
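The two-pass idea can be sketched in Python as an envelope regex that extracts only the severity and message ID, followed by a lookup of a body pattern for that ID. The regexes are simplified approximations of the grok pattern (IPv4 only), parse_asa is a hypothetical helper, and the second sample line with ID 000000 is a made-up placeholder for an ID that has no body pattern yet.

```python
import re

# First pass: severity, message ID, and the raw body.
ENVELOPE = re.compile(r"%ASA-(?P<level>\d)-(?P<message_id>\d+): (?P<body>.*)")

# Second pass: one body pattern per message ID. Only 302013 is filled in.
BODY_PATTERNS = {
    "302013": re.compile(
        r"Built (?P<direction>\w+) (?P<protocol>\w+) connection (?P<connection_id>\d+) "
        r"for (?P<src_interface>[^:]+):(?P<src_ip>[\d.]+)/(?P<src_port>\d+) "
        r"\((?P<src_mapped_ip>[\d.]+)/(?P<src_mapped_port>\d+)\) "
        r"to (?P<dst_interface>[^:]+):(?P<dst_ip>[\d.]+)/(?P<dst_port>\d+) "
        r"\((?P<dst_mapped_ip>[\d.]+)/(?P<dst_mapped_port>\d+)\)"
    ),
}


def parse_asa(line):
    """Dispatch on message_id; keep the raw body when the ID is not handled."""
    envelope = ENVELOPE.search(line)
    if envelope is None:
        return None
    fields = {"level": int(envelope["level"]), "message_id": envelope["message_id"]}
    pattern = BODY_PATTERNS.get(envelope["message_id"])
    body = pattern.match(envelope["body"]) if pattern else None
    if body:
        fields.update(body.groupdict())
    else:
        fields["body"] = envelope["body"]
    return fields


SAMPLE = (
    "Jan 15 10:30:45 fw01 %ASA-6-302013: Built outbound TCP connection 110577 "
    "for outside:198.51.100.10/443 (198.51.100.10/443) "
    "to inside:10.0.0.25/52837 (203.0.113.5/52837)"
)
UNHANDLED = "Jan 15 10:30:46 fw01 %ASA-5-000000: placeholder body for an unhandled ID"

built = parse_asa(SAMPLE)
other = parse_asa(UNHANDLED)
print(built["direction"], built["dst_mapped_ip"], other["body"])
```

Keeping the raw body for unhandled IDs means new message types degrade to partially parsed events instead of disappearing.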
Try it in the Grok Pattern Builder →
Generic Application Logs (ISO 8601)
Sample log line:
2026-01-15T10:30:45.123Z [ERROR] com.example.PaymentService - Failed to process payment for order 12345: connection timeout
Grok pattern:
%{TIMESTAMP_ISO8601:timestamp} \[%{LOGLEVEL:level}\] %{NOTSPACE:logger} - %{GREEDYDATA:message}
Fields extracted:
| Field | Pattern | Example value |
|---|---|---|
| timestamp | %{TIMESTAMP_ISO8601} | 2026-01-15T10:30:45.123Z |
| level | %{LOGLEVEL} | ERROR |
| logger | %{NOTSPACE} | com.example.PaymentService |
| message | %{GREEDYDATA} | Failed to process payment ... |
This is the pattern you reach for first with any modern application whose log format you control. %{TIMESTAMP_ISO8601} matches the ISO 8601 timestamps that most logging frameworks emit by default, including the optional fractional seconds and trailing Z or numeric timezone offset. %{LOGLEVEL} is a built-in alternation that matches the standard level names (TRACE, DEBUG, INFO, WARN, ERROR, FATAL and a few variants) so you do not have to enumerate them yourself. The one thing this pattern does not handle is multi-line stack traces: when an exception spans several lines, only the first line matches, and the stack frames arrive as separate lines that grok evaluates on their own and fails to match. Solve that upstream with a multiline codec or Filebeat multiline setting that stitches the stack trace onto the original event before grok ever runs.
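The stitching step itself is simple. The following Python sketch is not the multiline codec or the Filebeat implementation; it only illustrates the rule they are typically configured with, under the assumption that a new event always starts with an ISO 8601 date and continuation lines never do. The stack-trace lines are invented for the example.

```python
import re

# A line that begins with a date starts a new event; anything else is
# treated as a continuation of the previous event.
EVENT_START = re.compile(r"^\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}")


def merge_multiline(lines):
    """Join continuation lines onto the event that precedes them."""
    events = []
    for line in lines:
        if EVENT_START.match(line) or not events:
            events.append(line)
        else:
            events[-1] += "\n" + line
    return events


raw = [
    "2026-01-15T10:30:45.123Z [ERROR] com.example.PaymentService - Failed to process payment for order 12345: connection timeout",
    "java.net.SocketTimeoutException: connect timed out",
    "\tat com.example.PaymentService.charge(PaymentService.java:42)",
    "2026-01-15T10:30:46.001Z [INFO] com.example.PaymentService - Retrying order 12345",
]

events = merge_multiline(raw)
print(len(events))
```

After merging, the message capture has to be allowed to span newlines, otherwise only the first line of the stitched event ends up in the field.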
Try it in the Grok Pattern Builder →
Java Logs (Log4j / Logback)
Sample log line:
2026-01-15 10:30:45,123 ERROR [http-nio-8080-exec-5] c.e.api.OrderController - Order validation failed: missing customer ID
Grok pattern:
%{TIMESTAMP_ISO8601:timestamp} +%{LOGLEVEL:level} +\[%{DATA:thread}\] +%{NOTSPACE:class} *- *%{GREEDYDATA:message}
Fields extracted:
| Field | Pattern | Example value |
|---|---|---|
| timestamp | %{TIMESTAMP_ISO8601} | 2026-01-15 10:30:45,123 |
| level | %{LOGLEVEL} | ERROR |
| thread | %{DATA} | http-nio-8080-exec-5 |
| class | %{NOTSPACE} | c.e.api.OrderController |
| message | %{GREEDYDATA} | Order validation failed ... |
Java logging frameworks have two quirks this pattern accounts for. First, the default Log4j/Logback date pattern uses a comma before the milliseconds (10:30:45,123) rather than a period; %{TIMESTAMP_ISO8601} accepts both, so it still matches. Second, Java log layouts are typically column-aligned with variable runs of spaces for readability, so the pattern uses " +" (a space followed by +, meaning one or more spaces) between columns and " *" (zero or more spaces) on each side of the dash separator rather than assuming a single space. If you skip that flexibility, the pattern works on one application's output and mysteriously fails on another whose layout pads differently. As with all application logs, multi-line exceptions need to be merged before parsing; a Java stack trace is the canonical multi-line log problem.
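Both quirks can be exercised in a few lines of Python. This is a sketch with a simplified regex standing in for the grok pattern; the padded sample line is invented, and class_name is used instead of class purely to keep the group name unambiguous in Python.

```python
import re
from datetime import datetime

JAVA_LINE = re.compile(
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}[,.]\d{3}) +"
    r"(?P<level>[A-Z]+) +\[(?P<thread>.*?)\] +(?P<class_name>\S+) *- *(?P<message>.*)"
)


def parse_java(line):
    """Parse one Log4j/Logback-style line; return None if it does not match."""
    match = JAVA_LINE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Normalise the comma separator so one strptime format covers both styles.
    fields["timestamp"] = datetime.strptime(
        fields["timestamp"].replace(",", "."), "%Y-%m-%d %H:%M:%S.%f"
    )
    return fields


COMPACT = (
    "2026-01-15 10:30:45,123 ERROR [http-nio-8080-exec-5] "
    "c.e.api.OrderController - Order validation failed: missing customer ID"
)
# Hypothetical column-aligned variant with extra padding and a period.
PADDED = "2026-01-15 10:30:45.500 INFO  [main]          c.e.api.OrderController   -   Started"

compact = parse_java(COMPACT)
padded = parse_java(PADDED)
print(compact["timestamp"].isoformat(), padded["level"])
```

A version of the regex with single literal spaces would match COMPACT and silently reject PADDED, which is the failure mode described above.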
Try it in the Grok Pattern Builder →
AWS ELB Access Logs
Sample log line:
2026-01-15T10:30:45.123456Z my-loadbalancer 192.168.131.39:2817 10.0.0.1:80 0.000073 0.001048 0.000057 200 200 0 29 "GET https://example.com:443/api/health HTTP/1.1"
Grok pattern:
%{TIMESTAMP_ISO8601:timestamp} %{NOTSPACE:elb_name} %{IP:client_ip}:%{INT:client_port:int} %{IP:backend_ip}:%{INT:backend_port:int} %{NUMBER:request_processing_time:float} %{NUMBER:backend_processing_time:float} %{NUMBER:response_processing_time:float} %{INT:elb_status_code:int} %{INT:backend_status_code:int} %{INT:received_bytes:int} %{INT:sent_bytes:int} "%{WORD:method} %{NOTSPACE:request} HTTP/%{NUMBER:http_version}"
Fields extracted:
Fields include timestamp, elb_name, client_ip/client_port, backend_ip/backend_port, three processing-time floats (request_processing_time, backend_processing_time, response_processing_time), elb_status_code, backend_status_code, received_bytes, sent_bytes, and the embedded method/request/http_version.
AWS Classic Load Balancer access logs pack a lot into one line, and the most valuable fields are the three timing measurements. Capturing them as %{NUMBER:...:float} (note the :float cast) lets you build latency percentiles and spot whether slowness is in the ELB, the backend, or the response phase. A subtle gotcha: AWS writes -1 for the timing fields when a request fails before a backend is selected. %{NUMBER} does match the negative value, but your downstream math needs to treat -1 as "not measured" rather than a real duration. In those same lines the backend address is logged as a bare -, which %{IP:backend_ip}:%{INT:backend_port:int} will not match, so allow a literal - as an alternative there if you want failed requests to parse at all. The distinction between elb_status_code and backend_status_code is also worth keeping: a 504 from the ELB with no backend status usually means the backend never responded. Note that the Application Load Balancer (ALB) log format is different and longer than this Classic ELB format, with additional fields for target groups and TLS, so do not reuse this pattern for ALB.
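To see why the -1 marker matters, compare a naive average with one that sets the marker aside. This is a small Python sketch; the list of backend_processing_time values is invented, and split_measured is a hypothetical helper name.

```python
def split_measured(values):
    """Separate real durations from the -1 'not measured' marker."""
    measured = [v for v in values if v >= 0]
    return measured, len(values) - len(measured)


# Hypothetical backend_processing_time values from five parsed events.
backend_times = [0.001048, 0.002310, -1.0, 0.000950, -1.0]

measured, skipped = split_measured(backend_times)
naive_mean = sum(backend_times) / len(backend_times)
clean_mean = sum(measured) / len(measured)

print(f"naive={naive_mean:.6f} clean={clean_mean:.6f} skipped={skipped}")
```

The naive mean comes out negative, which is obviously wrong; less obviously, a percentile computed over the raw values would be skewed low in the same way. Counting the skipped events separately also gives you a failure-rate signal for free.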
Try it in the Grok Pattern Builder →
HAProxy Logs
Sample log line:
Jan 15 10:30:45 lb01 haproxy[2120]: 10.0.1.2:33317 [15/Jan/2026:10:30:45.123] http-in web-backend/web01 0/0/1/94/95 200 1750
Grok pattern:
%{SYSLOGTIMESTAMP:timestamp} %{IPORHOST:syslog_host} haproxy\[%{INT:pid:int}\]: %{IP:client_ip}:%{INT:client_port:int} \[%{DATA:accept_date}\] %{NOTSPACE:frontend} %{NOTSPACE:backend}/%{NOTSPACE:server} %{INT:time_request:int}/%{INT:time_queue:int}/%{INT:time_connect:int}/%{INT:time_response:int}/%{INT:time_total:int} %{INT:status_code:int} %{INT:bytes_read:int}
Fields extracted:
Fields: timestamp, syslog_host, pid, client_ip/client_port, accept_date, frontend, backend/server, the five timers (time_request, time_queue, time_connect, time_response, time_total), status_code, and bytes_read.
HAProxy's HTTP-format log is dense, and its signature feature is the five-part timing string Tq/Tw/Tc/Tr/Tt (0/0/1/94/95 in the sample). Each segment is a separate latency stage (request read, queue wait, connect, response, and total), and splitting them into individual integer fields is the whole reason to parse HAProxy logs. Watch out for the -1 that HAProxy writes for a stage that never completed (for example a client that aborted before sending the full request): %{INT} accepts the sign, so the line still matches, but the value is a marker rather than a duration. The backend/server pair captured with %{NOTSPACE:backend}/%{NOTSPACE:server} tells you which specific backend server handled the request, invaluable when one node in a pool is misbehaving. HAProxy can be configured for either HTTP or TCP log format; this pattern is for the HTTP format, and the TCP format omits the request-related fields.
Try it in the Grok Pattern Builder →
PostgreSQL Logs
Sample log line:
2026-01-15 10:30:45 UTC [12345] ERROR:  duplicate key value violates unique constraint "users_email_key"
Grok pattern:
%{TIMESTAMP_ISO8601:timestamp} %{TZ:timezone} \[%{POSINT:pid:int}\] %{WORD:level}: +%{GREEDYDATA:message}
Fields extracted:
| Field | Pattern | Example value |
|---|---|---|
| timestamp | %{TIMESTAMP_ISO8601} | 2026-01-15 10:30:45 |
| timezone | %{TZ} | UTC |
| pid | %{POSINT} (cast :int) | 12345 |
| level | %{WORD} | ERROR |
| message | %{GREEDYDATA} | duplicate key value ... |
PostgreSQL's log_line_prefix is configurable, which means this pattern matches the common %t [%p] style but may need tweaking for your instance: if you have customized the prefix to include the database name, user, or application name, add the corresponding patterns. The %{TZ} pattern captures the timezone abbreviation that Postgres appends to its timestamp; the stock definition only recognizes UTC and a handful of US abbreviations (EST, PDT, and similar), so substitute %{WORD} if your servers log in another zone. Note the two spaces after the level ("ERROR:  duplicate"): Postgres pads its severity output, which is why the pattern follows %{WORD:level}: with " +" (a space and a plus sign) to absorb one or more spaces. The bigger challenge with Postgres logs is that detail lines (the DETAIL:, HINT:, and STATEMENT: continuation lines that follow an error) are emitted as separate log lines, so the same multiline-merge advice applies if you want them attached to the parent error event.
Try it in the Grok Pattern Builder →
IIS W3C Logs
Sample log line:
2026-01-15 10:30:45 10.0.0.10 GET /default.aspx - 443 - 203.0.113.42 Mozilla/5.0+(Windows+NT+10.0) https://example.com/ 200 0 0 312
Grok pattern:
%{TIMESTAMP_ISO8601:timestamp} %{IP:server_ip} %{WORD:method} %{URIPATH:uri_path} %{NOTSPACE:uri_query} %{INT:server_port:int} %{NOTSPACE:username} %{IP:client_ip} %{NOTSPACE:user_agent} %{NOTSPACE:referrer} %{INT:status:int} %{INT:substatus:int} %{INT:win32_status:int} %{INT:time_taken:int}
Fields extracted:
Fields: timestamp, server_ip, method, uri_path, uri_query, server_port, username, client_ip, user_agent, referrer, status, substatus, win32_status, and time_taken.
Microsoft IIS W3C extended logs have two properties that make them unusual. First, fields are space-delimited and a literal - is written wherever a value is absent (no authenticated user, no query string), which is why the fields that can be empty (uri_query, username, user_agent, referrer) use %{NOTSPACE}, so the dash itself is captured rather than causing a match failure. Second, IIS replaces spaces inside the user-agent and referrer with + signs (Mozilla/5.0+(Windows+NT+10.0)), so the entire user agent is a single space-free token. That is convenient for %{NOTSPACE}, but you will want to decode the + back to spaces downstream if you display it. Critically, the set and order of fields in an IIS log is controlled by the #Fields: directive in the log file's header, and administrators can add or remove columns. Always check the actual #Fields: line for the server you are parsing; if it differs from this example, reorder the pattern to match. Also remember to skip the #-prefixed comment lines.
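Because the column order is declared in the file itself, one alternative to a hard-coded grok pattern is to read the #Fields: line and split each record accordingly. The Python sketch below shows that technique, which is plain string splitting rather than grok. The header lines are an assumed example matching the sample's column order, and parse_iis is a hypothetical helper.

```python
def parse_iis(lines):
    """Yield one dict per request line, keyed by the latest #Fields: header."""
    fields = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#Fields:"):
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or not fields:
            continue  # comment line, blank line, or no header seen yet
        values = line.split(" ")
        if len(values) != len(fields):
            continue  # column count does not match the header; skip the line
        # A bare dash means "no value" in W3C logs.
        yield {name: (None if value == "-" else value)
               for name, value in zip(fields, values)}


LOG = [
    "#Software: Microsoft Internet Information Services 10.0",
    "#Version: 1.0",
    "#Fields: date time s-ip cs-method cs-uri-stem cs-uri-query s-port "
    "cs-username c-ip cs(User-Agent) cs(Referer) sc-status sc-substatus "
    "sc-win32-status time-taken",
    "2026-01-15 10:30:45 10.0.0.10 GET /default.aspx - 443 - 203.0.113.42 "
    "Mozilla/5.0+(Windows+NT+10.0) https://example.com/ 200 0 0 312",
]

records = list(parse_iis(LOG))
user_agent = records[0]["cs(User-Agent)"].replace("+", " ")
print(records[0]["cs-uri-stem"], user_agent)
```

If you stay with grok, the same header line tells you the order in which to arrange the pattern's fields.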
Try it in the Grok Pattern Builder →
Docker Container Logs
Sample log line:
2026-01-15T10:30:45.123456789Z stdout F Server listening on port 8080
Grok pattern:
%{TIMESTAMP_ISO8601:timestamp} %{WORD:stream} %{NOTSPACE:log_tag} %{GREEDYDATA:message}
Fields extracted:
| Field | Pattern | Example value |
|---|---|---|
| timestamp | %{TIMESTAMP_ISO8601} | 2026-01-15T10:30:45.123456789Z |
| stream | %{WORD} | stdout |
| log_tag | %{NOTSPACE} | F |
| message | %{GREEDYDATA} | Server listening on port 8080 |
This pattern targets the line prefix that CRI container runtimes such as containerd and CRI-O add when they write container output in the CRI text format: an RFC 3339 nanosecond timestamp, the stream name (stdout or stderr), a partial/full tag (P or F), and then the actual application output. Docker's own json-file logging driver does not use this prefix; it wraps each line in a JSON object with log, stream, and time keys, so those files call for a JSON parser rather than grok. The nanosecond-precision timestamp (nine fractional digits) is matched fine by %{TIMESTAMP_ISO8601}. The important realization is that %{GREEDYDATA:message} captures your container's raw log line, which is itself often structured: JSON from a Node app, or an ISO-timestamped line from a Java app. The standard approach is to parse the prefix with this pattern, then run a second filter (a JSON filter or one of the application patterns above) against the extracted message field. If your platform already strips the CRI prefix (many Kubernetes log shippers do), skip straight to parsing the inner format.
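The prefix-then-payload approach looks like this as a Python sketch. The prefix regex is a simplified stand-in for the grok pattern, parse_cri is a hypothetical helper, the app key is an arbitrary name chosen here for the decoded payload, and the JSON sample line is invented.

```python
import json
import re

CRI_PREFIX = re.compile(
    r"(?P<timestamp>\S+) (?P<stream>stdout|stderr) (?P<log_tag>[PF]) (?P<message>.*)"
)


def parse_cri(line):
    """Split off the runtime prefix, then try to decode the payload as JSON."""
    match = CRI_PREFIX.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    try:
        inner = json.loads(fields["message"])
    except json.JSONDecodeError:
        inner = None
    if isinstance(inner, dict):
        fields["app"] = inner  # structured payload from the application
    return fields


PLAIN = "2026-01-15T10:30:45.123456789Z stdout F Server listening on port 8080"
JSON_LINE = (
    '2026-01-15T10:30:46.000000001Z stdout F '
    '{"level":"info","msg":"ready","port":8080}'
)

plain = parse_cri(PLAIN)
structured = parse_cri(JSON_LINE)
print(plain["message"], structured["app"]["msg"])
```

Lines tagged P are partial writes; a fuller version would buffer them and concatenate the message parts until the closing F line arrives, which this sketch does not attempt.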
Try it in the Grok Pattern Builder →
The Grok Patterns You Will Use Most
Behind every example above is a small set of core building-block patterns. Learn these five and you can read or write most grok expressions without a reference open.
| Pattern | What it matches | Notes |
|---|---|---|
| %{TIMESTAMP_ISO8601} | ISO 8601 datetime: 2026-01-15T10:30:45.123Z, with optional fractional seconds and timezone | Accepts both comma and period millisecond separators; the default for most modern apps |
| %{IP} | An IPv4 or IPv6 address | Use %{IPV4} or %{IPV6} if you need to be strict; %{IPORHOST} also matches hostnames |
| %{NUMBER} | An integer or decimal, optionally signed: 200, 0.001048, -1 | Add a type cast, such as %{NUMBER:bytes:int} or %{NUMBER:latency:float}, so the field indexes as a number, not a string |
| %{LOGLEVEL} | Standard severity words: TRACE, DEBUG, INFO, WARN, WARNING, ERROR, FATAL, etc. | Saves you from spelling out an alternation; matches the lowercase, Capitalized, and UPPERCASE spellings, but not arbitrary mixed case |
| %{GREEDYDATA} | Everything remaining, including spaces: .* | Put it last; it is greedy and will swallow fields you meant to capture if placed mid-pattern |
Two more deserve a mention because they cause the most confusion. %{DATA} is the non-greedy cousin of %{GREEDYDATA} — it matches the minimum needed and is what you use between delimiters like quotes ("%{DATA:referrer}"). %{NOTSPACE} matches a run of non-whitespace characters and is the right choice for single tokens like a URL path or a class name. A huge share of "my grok pattern does not match" problems come down to using a greedy pattern where a lazy one belongs, or forgetting that %{WORD} matches only [A-Za-z0-9_] and breaks on a hyphen or dot.
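Both failure modes are easy to reproduce with ordinary regular expressions. In this Python sketch, .*? plays the role of %{DATA}, .* the role of %{GREEDYDATA}, \w+ the role of %{WORD}, and \S+ the role of %{NOTSPACE}; the sample strings are shortened, made-up fragments.

```python
import re

line = '"GET / HTTP/1.1" 200 "https://example.com/" "curl/8.0"'

# Lazy stops at the first closing quote; greedy runs on to the last one.
lazy = re.search(r'"(?P<request>.*?)"', line)
greedy = re.search(r'"(?P<request>.*)"', line)
print(lazy["request"])
print(greedy["request"])

# A hyphenated token followed by a space, as in an HAProxy frontend name.
token_line = "http-in web-backend/web01"
word = re.match(r"(?P<frontend>\w+) ", token_line)
notspace = re.match(r"(?P<frontend>\S+) ", token_line)
print(word, notspace["frontend"])
```

The greedy capture swallows the status code and both later quoted fields, while the word-only pattern fails outright because the hyphen ends the run of word characters before the expected space.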
Testing Your Patterns
Grok patterns are unforgiving: a single mismatched space, an unescaped bracket, or a greedy pattern in the wrong spot means the whole line fails to match and you get no fields at all. The fastest way to develop a pattern is iteratively, against a real sample line, with instant feedback on which fields matched.
That is exactly what the Grok Pattern Builder & Debugger is for. Paste a sample log line, build up your pattern piece by piece, and watch the extracted fields appear (or see exactly where the match breaks) as you type. Every example in this article links straight into it with the pattern and sample pre-loaded — click any "Try it" link above to start from a known-good baseline and adapt it to your own logs.
When a pattern that "looks right" still will not match, the problem is almost always one of a handful of recurring causes: literal characters like [, ], and ( that need escaping, greedy patterns consuming too much, or a type cast applied to a value that does not fit. For a systematic walkthrough of diagnosing and fixing a broken pattern, see How to Debug Grok Patterns. And if you are still deciding whether grok is even the right tool versus a hand-written regular expression — or you need a pattern that grok's library does not cover — our Regex Tester and the comparison in Grok vs Regex will help you choose.
Copy, Test, Ship
The twelve patterns above cover the formats behind the overwhelming majority of production log parsing work. Start from the example closest to your format, open it in the builder, paste in a line from your own logs, and adjust the field names and optional groups to fit. Because each pattern is composed from named building blocks rather than raw regex, the result stays readable for the next person who has to maintain your pipeline — which, more often than not, is you in six months.