This is the regex cheat sheet I keep open in a tab every single day. Not the academic one that explains the theory of finite automata. Not the one that lists every obscure flag nobody uses. This is the one with the patterns I actually reach for when I'm writing code, validating input, or parsing logs at 2 AM.
I've organized everything from basic building blocks to production-ready patterns you can copy and use immediately. Every example is tested. Every pattern is explained. And if you want to try any of them live, you can paste them straight into the regex tester on akousa.net to see matches highlighted in real time.
Before the cheat sheet makes sense, you need six concepts. That's it. Six.
Most characters match themselves. The pattern cat matches the string "cat" in "concatenate". Nothing fancy.
Pattern: cat
Matches: "The cat sat" → "cat"
"concatenate" → "cat"
These characters have special meaning and need to be escaped with a backslash if you want to match them literally:
. ^ $ * + ? { } [ ] ( ) | \
To match a literal dot, use \. instead of . which matches any character.
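When the text to match comes from user input, escaping by hand is not practical. Here is a minimal sketch of a helper that backslash-escapes every metacharacter in the list above. The name escapeRegExp is illustrative, not a built-in:

```javascript
// Hypothetical helper: escape every regex metacharacter so the
// input is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

const userInput = 'file.txt (v1.0)';
const literal = new RegExp(escapeRegExp(userInput));

console.log(literal.test('open file.txt (v1.0) now')); // true
console.log(literal.test('open fileXtxt (v1.0) now')); // false
```

Without the escaping step, the dots and parentheses in the input would be read as pattern syntax rather than as text.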
Square brackets define a set of characters to match:
[abc] → matches a, b, or c
[a-z] → matches any lowercase letter
[A-Z] → matches any uppercase letter
[0-9] → matches any digit
[a-zA-Z] → matches any letter
[^abc] → matches anything EXCEPT a, b, or c
These save you from writing out full character classes:
\d → any digit (same as [0-9])
\D → any non-digit (same as [^0-9])
\w → any word character (same as [a-zA-Z0-9_])
\W → any non-word char (same as [^a-zA-Z0-9_])
\s → any whitespace (space, tab, newline)
\S → any non-whitespace
. → any character except newline
Quantifiers control how many times a token repeats:
*      → 0 or more          a* matches "", "a", "aaa"
+      → 1 or more          a+ matches "a", "aaa" but NOT ""
?      → 0 or 1             a? matches "" or "a"
{3}    → exactly 3          a{3} matches "aaa"
{2,5}  → between 2 and 5    a{2,5} matches "aa" through "aaaaa"
{3,}   → 3 or more          a{3,} matches "aaa", "aaaa", etc.
Anchors match positions, not characters:
^   → start of string       ^Hello matches "Hello world"
$   → end of string         world$ matches "Hello world"
\b  → word boundary         \bcat\b matches "cat" but not "concatenate"
\B  → non-word boundary     \Bcat\B matches "concatenate" but not "cat"
Parentheses create groups. The pipe character creates alternatives.
(abc) → captures "abc" as group 1
(a)(b)(c) → three groups: group 1 = "a", group 2 = "b", group 3 = "c"
(ab|cd) → matches "ab" or "cd"
When you need grouping but don't need the captured value:
(?:abc) → groups "abc" without capturing
This matters for performance when you have many groups but only care about some of them.
Refer back to a previously captured group:
(ab)\1 → matches "abab" (group 1 captured "ab", \1 repeats it)
(\w+)\s+\1 → matches repeated words like "the the" or "is is"
Give your groups meaningful names instead of numbers:
(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})
In JavaScript, access via match.groups.year, match.groups.month, match.groups.day.
These assert what comes before or after the current position without consuming characters. They are incredibly powerful for complex validation.
Matches if the pattern ahead exists:
\d+(?= dollars) → matches "100" in "100 dollars" but not "100 euros"
Matches if the pattern ahead does NOT exist:
\d+\b(?! dollars) → matches "100" in "100 euros" but not in "100 dollars" (without the \b, the engine backtracks and matches "10")
Matches if the pattern behind exists:
(?<=\$)\d+ → matches "50" in "$50" but not in "50"
Matches if the pattern behind does NOT exist:
(?<![$\d])\d+ → matches "50" in "50 items" but not in "$50" (the \d in the lookbehind stops a match from starting at the "0")
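One subtlety worth checking whenever a negative lookaround follows a quantifier: the engine may backtrack to a shorter match that satisfies the assertion. A small sketch, with made-up input strings, showing the effect and one way to guard against it with \b:

```javascript
const input = '100 dollars';

// Without a boundary, \d+ backtracks from "100" to "10".
// "10" is followed by "0 dollars", not " dollars", so it matches.
const loose = input.match(/\d+(?! dollars)/);
console.log(loose[0]); // '10'

// Adding \b forces the number to end at a word boundary first.
const strict = /\d+\b(?! dollars)/;
console.log(strict.test('100 dollars')); // false
console.log('100 euros'.match(strict)[0]); // '100'
```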
This is where lookaheads shine. The classic password rule (at least 8 characters, one uppercase, one lowercase, one digit, one special character):
^(?=.*[A-Z])(?=.*[a-z])(?=.*\d)(?=.*[!@#$%^&*])[A-Za-z\d!@#$%^&*]{8,}$
Each (?=.*X) asserts that character class X exists somewhere in the string. The final part [...]{8,} enforces the minimum length and allowed characters.
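A single regex only tells you pass or fail. If you want to tell the user which rule failed, one option is to check each condition separately. A rough sketch, where the function name and rule labels are illustrative:

```javascript
// Each rule mirrors one piece of the combined pattern above.
const passwordRules = [
  { label: 'at least 8 characters', test: (s) => s.length >= 8 },
  { label: 'an uppercase letter', test: (s) => /[A-Z]/.test(s) },
  { label: 'a lowercase letter', test: (s) => /[a-z]/.test(s) },
  { label: 'a digit', test: (s) => /\d/.test(s) },
  { label: 'a special character', test: (s) => /[!@#$%^&*]/.test(s) },
  { label: 'only allowed characters', test: (s) => /^[A-Za-z\d!@#$%^&*]*$/.test(s) },
];

// Returns the labels of every rule the password fails.
function failedPasswordRules(password) {
  return passwordRules
    .filter((rule) => !rule.test(password))
    .map((rule) => rule.label);
}

console.log(failedPasswordRules('Passw0rd!')); // []
console.log(failedPasswordRules('password'));
// ['an uppercase letter', 'a digit', 'a special character']
```

The trade-off is a few more lines of code in exchange for error messages a user can act on.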
Flags change how the engine interprets your pattern:
g → global: find all matches, not just the first
i → case-insensitive: /hello/i matches "Hello", "HELLO", "hello"
m → multiline: ^ and $ match start/end of each line, not just the string
s → dotAll: . matches newline characters too
u → unicode: enables full Unicode matching
y → sticky: matches only at lastIndex position
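One practical consequence of the g and y flags in JavaScript is that the regex object becomes stateful: test() and exec() resume from lastIndex. A short sketch of the behavior, using a made-up pattern:

```javascript
const digits = /\d+/g;

console.log(digits.test('abc 123')); // true
console.log(digits.lastIndex); // 7
// The second call resumes from index 7, finds nothing, and resets lastIndex.
console.log(digits.test('abc 123')); // false
console.log(digits.lastIndex); // 0

// Without g, test() always starts from the beginning.
const stateless = /\d+/;
console.log(stateless.test('abc 123') && stateless.test('abc 123')); // true
```

If a shared validation regex gives alternating true/false results, a stray g flag is a likely cause.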
In JavaScript:
const regex = /pattern/gi;
// or
const regex = new RegExp('pattern', 'gi');

These are patterns you can use right now. I've tested each one against edge cases. Paste them into the akousa.net regex tester if you want to verify them against your own data.
The pragmatic version that covers the vast majority of real-world email addresses:
^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$
What it matches: user@example.com, first.last+tag@sub.domain.org, user123@company.co.uk
What it rejects: @missing-local.com, no-at-sign.com, spaces in@email.com
Note: A regex that tries to implement the full RFC 5322 address grammar runs to thousands of characters. Almost nobody uses one. The pattern above is much closer to what production applications actually use.
Matches HTTP and HTTPS URLs including paths, query strings, and fragments:
https?:\/\/(www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_\+.~#?&//=]*)
What it matches: https://example.com, http://www.test.co.uk/path?q=1&r=2, https://sub.domain.com/page#section
A flexible pattern for international phone numbers:
^\+?(\d{1,3})?[-.\s]?\(?\d{1,4}\)?[-.\s]?\d{1,4}[-.\s]?\d{1,9}$
What it matches: +1-234-567-8900, (234) 567-8900, +44 20 7946 0958, 1234567890
For strict US/Canada formatting:
^\(?(\d{3})\)?[-.\s]?(\d{3})[-.\s]?(\d{4})$
IPv4 with proper octet validation (0-255):
^((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)$
What it matches: 192.168.1.1, 10.0.0.255, 0.0.0.0
What it rejects: 256.1.1.1, 192.168.1.999, 1.2.3.4.5
IPv6 (simplified: full eight-group form only, no :: shorthand):
^([0-9a-fA-F]{1,4}:){7}[0-9a-fA-F]{1,4}$
ISO 8601 date (YYYY-MM-DD):
^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$
US date (MM/DD/YYYY):
^(0[1-9]|1[0-2])\/(0[1-9]|[12]\d|3[01])\/\d{4}$
European date (DD/MM/YYYY):
^(0[1-9]|[12]\d|3[01])\/(0[1-9]|1[0-2])\/\d{4}$
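These date patterns check the shape of a date, not the calendar: 2026-02-30 passes. A minimal sketch that pairs the ISO pattern with a round trip through Date to reject impossible dates. The helper name isValidIsoDate is hypothetical:

```javascript
const isoDate = /^(\d{4})-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$/;

function isValidIsoDate(text) {
  const match = isoDate.exec(text);
  if (!match) return false;
  const [year, month, day] = match.slice(1).map(Number);
  // Date.UTC rolls impossible dates over (Feb 30 becomes Mar 2),
  // so comparing the parts after a round trip reveals them.
  // Caveat: Date.UTC maps years 0-99 to 1900-1999, so this
  // sketch rejects those years.
  const date = new Date(Date.UTC(year, month - 1, day));
  return (
    date.getUTCFullYear() === year &&
    date.getUTCMonth() === month - 1 &&
    date.getUTCDate() === day
  );
}

console.log(isValidIsoDate('2026-03-27')); // true
console.log(isValidIsoDate('2026-02-30')); // false
console.log(isValidIsoDate('2024-02-29')); // true (leap year)
```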
24-hour time (HH:MM or HH:MM:SS):
^([01]\d|2[0-3]):([0-5]\d)(:[0-5]\d)?$
12-hour time with AM/PM:
^(0?[1-9]|1[0-2]):[0-5]\d\s?(AM|PM|am|pm)$
Visa:
^4\d{12}(\d{3})?$
Mastercard:
^(?:5[1-5]\d{2}|222[1-9]|22[3-9]\d|2[3-6]\d{2}|27[01]\d|2720)\d{12}$
Any major card (Visa, MC, Amex, Discover):
^(?:4\d{12}(?:\d{3})?|5[1-5]\d{14}|3[47]\d{13}|6(?:011|5\d{2})\d{12})$
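The card patterns here only check the prefix and the length. Card numbers also carry a Luhn checksum, which catches most typing mistakes. A minimal sketch, assuming spaces and dashes have already been removed from the input. The name passesLuhn is illustrative:

```javascript
function passesLuhn(cardNumber) {
  if (!/^\d+$/.test(cardNumber)) return false;
  let sum = 0;
  let doubleNext = false;
  // Walk right to left, doubling every second digit.
  for (let i = cardNumber.length - 1; i >= 0; i--) {
    let digit = Number(cardNumber[i]);
    if (doubleNext) {
      digit *= 2;
      if (digit > 9) digit -= 9;
    }
    sum += digit;
    doubleNext = !doubleNext;
  }
  return sum % 10 === 0;
}

console.log(passesLuhn('4111111111111111')); // true
console.log(passesLuhn('4111111111111112')); // false
```

A passing checksum means the number is well formed, not that the card exists or is active.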
Matches 3-digit and 6-digit hex colors with optional hash:
^#?([0-9a-fA-F]{3}|[0-9a-fA-F]{6})$
What it matches: #FFF, #ff5733, abc123, #000
^[a-z0-9]+(-[a-z0-9]+)*$
What it matches: my-blog-post, hello-world-123, about
What it rejects: My Blog Post, --double-dash, trailing-
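To produce strings that satisfy this pattern in the first place, a small slugify step helps. A rough sketch for ASCII titles only (non-ASCII letters are treated as separators). The helper name slugify is hypothetical:

```javascript
const slugPattern = /^[a-z0-9]+(-[a-z0-9]+)*$/;

function slugify(title) {
  return title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-') // collapse every other run into one hyphen
    .replace(/^-+|-+$/g, ''); // trim leading and trailing hyphens
}

console.log(slugify('  My Blog Post!  ')); // 'my-blog-post'
console.log(slugPattern.test(slugify('Hello, World 123'))); // true
```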
Alphanumeric, 3-20 characters, underscores and hyphens allowed but not at start or end:
^[a-zA-Z0-9][a-zA-Z0-9_-]{1,18}[a-zA-Z0-9]$
At least 8 characters, requires uppercase, lowercase, digit, and special character:
^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$
These patterns are for parsing and transforming text, not just validating it.
-?\d+\.?\d*
Handles integers, decimals, and negative numbers. Applied to "Price: $19.99, discount: -5.50, qty: 3", this captures 19.99, -5.50, and 3.
<[^>]+>
Replace matches with an empty string to strip HTML. Applied to "<p>Hello <b>world</b></p>", removing matches gives "Hello world".
Warning: Do not use regex to parse complex HTML documents. Use a proper HTML parser. This pattern is fine for stripping tags from simple, trusted text, but it is not a security sanitizer.
Double quotes:
"([^"]*)"
Single or double quotes:
(['"])([^'"]*)\1
^#{1,6}\s+(.+)$
With the m (multiline) flag, this matches # Title, ## Subtitle, through ###### Deepest heading.
\b(\w+)\s+\1\b
Catches "the the", "is is", "and and" -- common typos in writing.
(?:^|,)("(?:[^"]|"")*"|[^,]*)
Handles quoted fields containing commas and escaped quotes.
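A sketch of how this pattern might be applied to split one CSV line into fields, removing the surrounding quotes and unescaping doubled quotes. The name parseCsvLine is illustrative. One gap to be aware of: a line that starts with an empty field (a leading comma) is not split correctly by this pattern, so a real CSV parser is the safer choice for data you don't control.

```javascript
const csvField = /(?:^|,)("(?:[^"]|"")*"|[^,]*)/g;

function parseCsvLine(line) {
  return [...line.matchAll(csvField)].map((match) => {
    const field = match[1];
    const quoted = field.length >= 2 && field.startsWith('"') && field.endsWith('"');
    // Quoted field: drop the outer quotes and turn "" back into ".
    return quoted ? field.slice(1, -1).replace(/""/g, '"') : field;
  });
}

console.log(parseCsvLine('name,"Doe, Jane","said ""hi""",42'));
// ['name', 'Doe, Jane', 'said "hi"', '42']
```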
https?:\/\/(?:www\.)?([^\/\s]+)
Group 1 captures the domain. Applied to "https://www.example.com/path", group 1 is example.com.
^\s*$
With the m flag, matches lines that are empty or contain only spaces and tabs. Useful for cleaning up text files.
Here's how to actually use these patterns in JavaScript code.
const emailRegex = /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/;
emailRegex.test('user@example.com'); // true
emailRegex.test('not-an-email'); // false

const text = 'Call 555-1234 or 555-5678';
const phones = text.match(/\d{3}-\d{4}/g);
// ['555-1234', '555-5678']

const text = 'Date: 2026-03-27, Updated: 2026-04-01';
const dates = [...text.matchAll(/(\d{4})-(\d{2})-(\d{2})/g)];
for (const match of dates) {
  console.log(`Full: ${match[0]}, Year: ${match[1]}, Month: ${match[2]}, Day: ${match[3]}`);
}

// Censor credit card numbers, keep last 4 digits
const text = 'Card: 4111-1111-1111-1234';
const censored = text.replace(/\d{4}-\d{4}-\d{4}-(\d{4})/, '****-****-****-$1');
// 'Card: ****-****-****-1234'

const csv = 'one, two , three, four';
const values = csv.split(/\s*,\s*/);
// ['one', 'two', 'three', 'four']

const dateRegex = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;
const match = '2026-03-27'.match(dateRegex);
console.log(match.groups.year); // '2026'
console.log(match.groups.month); // '03'
console.log(match.groups.day); // '27'

Python's re module uses the same core syntax with slightly different API calls.
import re
# Search (first match)
match = re.search(r'\d+', 'abc 123 def')
if match:
    print(match.group())  # '123'
# Find all matches
numbers = re.findall(r'\d+', 'abc 12 def 345')
# ['12', '345']
# Substitution
clean = re.sub(r'<[^>]+>', '', '<p>Hello</p>')
# 'Hello'
# Compile for reuse
pattern = re.compile(r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$')
pattern.match('user@example.com')  # Match object

Key difference: Python raw strings (r'...') prevent backslash escaping issues. Always use raw strings for regex patterns in Python.
By default, quantifiers are greedy -- they match as much as possible.
Pattern: <.+>
Input: <b>bold</b>
Greedy: <b>bold</b> (matches everything)
Lazy: <b> and </b> (matches smallest possible)
Add ? after a quantifier to make it lazy:
<.+?> → lazy version of <.+>
\d+? → lazy version of \d+
.*? → lazy version of .*
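A quick sketch of the difference in JavaScript, using the same input as the example. The third line shows a negated character class, which is often an alternative to laziness:

```javascript
const html = '<b>bold</b>';

console.log(html.match(/<.+>/)[0]); // '<b>bold</b>'
console.log(html.match(/<.+?>/g)); // ['<b>', '</b>']
// A negated class gives the same result without relying on laziness.
console.log(html.match(/<[^>]+>/g)); // ['<b>', '</b>']
```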
These will not work as expected:
192.168.1.1 → the dots match ANY character
$100 → $ means end of string
file.txt → the dot matches any character
Correct versions:
192\.168\.1\.1
\$100
file\.txt
Certain patterns can cause the regex engine to take exponentially long on specific inputs. The classic example:
(a+)+b
On an input like aaaaaaaaaaaaaaaaac, the engine tries every possible way to divide the as between the inner and outer + before concluding there's no match. This can freeze your application.
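In this particular case the nesting adds nothing: (a+)+b accepts exactly the same strings as a+b, which leaves the engine no alternative splits to explore. A small sketch, using anchored versions of both patterns and made-up inputs, that checks they agree on short strings and then runs only the safe version on a long one:

```javascript
const risky = /^(a+)+b$/;
const safe = /^a+b$/;

// On short inputs both patterns give the same answer.
const samples = ['ab', 'aaab', 'b', 'aaac', ''];
const agree = samples.every((s) => risky.test(s) === safe.test(s));
console.log(agree); // true

// Only the safe pattern is run on a long non-matching input,
// where the nested version could take exponential time.
const longInput = 'a'.repeat(5000) + 'c';
console.log(safe.test(longInput)); // false
```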
Rules to avoid backtracking problems:
Avoid nested quantifiers such as (a+)+ or (a*)*
Prefer specific character classes: [a-z]+ is safer than .+

If you're validating input, always anchor both ends:
\d+ → matches "123" inside "abc123def" (probably not what you want)
^\d+$ → matches only if the ENTIRE string is digits
Forgetting anchors in validation patterns is one of the most common security bugs in web applications.
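A short sketch of what goes wrong, with a made-up input string that starts with digits but carries extra text:

```javascript
const unanchored = /\d+/;
const anchored = /^\d+$/;

const suspicious = '123; DROP TABLE users';
console.log(unanchored.test(suspicious)); // true: "123" is found inside the string
console.log(anchored.test(suspicious)); // false: the whole string must be digits
console.log(anchored.test('123')); // true
```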
Most regex syntax is shared across languages, but there are notable exceptions:
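Lookbehind support varies: some engines, such as Python's re module, accept only fixed-width lookbehind ((?<=ab) works, (?<=a+) does not).
JavaScript: (?<name>pattern) → match.groups.name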
Python: (?P<name>pattern) → match.group('name')
Java: (?<name>pattern) → matcher.group("name")
PHP: (?P<name>pattern) → $matches['name']
JavaScript: /pattern/gi
Python: re.compile(r'pattern', re.IGNORECASE | re.MULTILINE)
Java: Pattern.compile("pattern", Pattern.CASE_INSENSITIVE)
PHP: '/pattern/i' (PHP has no g modifier; use preg_match_all to find all matches)
Go: (?i)pattern (inline flag only)
Extract timestamp, level, and message from common log formats:
^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?Z?)\s+\[?(INFO|WARN|ERROR|DEBUG)\]?\s+(.+)$
Applied to 2026-03-27T14:30:00.123Z [ERROR] Connection timeout after 30s, this captures the timestamp, level, and message as separate groups.
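A sketch of the same pattern rewritten with named groups, so the parsed parts come back as an object. The group names and the parseLogLine helper are illustrative choices:

```javascript
const logLine =
  /^(?<timestamp>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?Z?)\s+\[?(?<level>INFO|WARN|ERROR|DEBUG)\]?\s+(?<message>.+)$/;

// Returns { timestamp, level, message }, or null if the line does not match.
function parseLogLine(line) {
  const match = logLine.exec(line);
  if (!match) return null;
  const { timestamp, level, message } = match.groups;
  return { timestamp, level, message };
}

const entry = parseLogLine('2026-03-27T14:30:00.123Z [ERROR] Connection timeout after 30s');
console.log(entry);
// { timestamp: '2026-03-27T14:30:00.123Z', level: 'ERROR', message: 'Connection timeout after 30s' }
```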
^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$
Matches 1.0.0, 2.1.3-beta.1, 1.0.0-alpha+build.123 according to the SemVer spec.
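A sketch of turning a match into a structured object. The parseSemver helper and the field names are illustrative, and the sketch only parses; it does not implement SemVer precedence comparison:

```javascript
const semver =
  /^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$/;

function parseSemver(version) {
  const match = semver.exec(version);
  if (!match) return null;
  // Unmatched optional groups are undefined, so default them to null.
  const [, major, minor, patch, prerelease = null, build = null] = match;
  return {
    major: Number(major),
    minor: Number(minor),
    patch: Number(patch),
    prerelease,
    build,
  };
}

console.log(parseSemver('1.0.0-alpha+build.123'));
// { major: 1, minor: 0, patch: 0, prerelease: 'alpha', build: 'build.123' }
console.log(parseSemver('01.2.3')); // null (leading zeros are not allowed)
```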
"(?:[^"\\]|\\.)*"
Properly handles escaped quotes inside JSON strings. Matches "hello", "say \"hi\"", "path\\to\\file".
\[([^\]]+)\]\(([^)]+)\)
Group 1 is the link text, group 2 is the URL. Applied to [Click here](https://example.com), captures Click here and https://example.com.
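A sketch of collecting every link in a piece of Markdown with matchAll. The extractLinks helper is illustrative. Note that the pattern also matches the bracketed part of image syntax (![alt](src)), so filter those out if that matters for your input:

```javascript
const markdownLink = /\[([^\]]+)\]\(([^)]+)\)/g;

function extractLinks(markdown) {
  return [...markdown.matchAll(markdownLink)].map(([, text, url]) => ({ text, url }));
}

const links = extractLinks(
  'See [the docs](https://example.com/docs) or [Click here](https://example.com).'
);
console.log(links);
// [ { text: 'the docs', url: 'https://example.com/docs' },
//   { text: 'Click here', url: 'https://example.com' } ]
```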
#(?:[0-9a-fA-F]{3}){1,2}\b|rgb\(\s*\d{1,3}\s*,\s*\d{1,3}\s*,\s*\d{1,3}\s*\)
Matches both #ff5733 and rgb(255, 87, 51).
Unix paths:
^(\/[a-zA-Z0-9._-]+)+\/?$
Windows paths:
^[a-zA-Z]:\\(?:[^\\/:*?"<>|\r\n]+\\)*[^\\/:*?"<>|\r\n]*$
Regex engines are fast, but poorly written patterns can be orders of magnitude slower than well-written ones.
Slow: .*foo.*
Fast: [^f]*foo.*
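The more specific your character classes, the fewer paths the engine needs to explore. Note that the two patterns stop being interchangeable once you anchor them: ^[^f]*foo rejects any string that has an f before the first foo.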
Slow: (https?):\/\/(www\.)?([^\/]+)
Fast: https?:\/\/(?:www\.)?[^\/]+
If you don't need the captured values, use non-capturing groups (?:...) or remove the parentheses entirely.
In any language, if you're using the same regex in a loop, compile it once outside the loop:
// Bad: creates a new regex object every iteration (and in many languages, recompiles the pattern)
for (const line of lines) {
  if (line.match(/^\d{4}-\d{2}-\d{2}/)) { /* ... */ }
}
// Good: compile once
const datePattern = /^\d{4}-\d{2}-\d{2}/;
for (const line of lines) {
  if (datePattern.test(line)) { /* ... */ }
}

Anchored patterns (^...$) let the engine fail fast on non-matching strings instead of trying every position in the string.
Writing a regex pattern without testing it is like writing code without running it. You need to see what matches, what doesn't, and what matches by accident.
The regex tester tool on akousa.net lets you type a pattern, paste in test strings, and see matches highlighted instantly. It supports all JavaScript regex flags, shows capture group contents, and provides a match breakdown so you can debug complex patterns step by step. No signup required, nothing to install, and your data stays in your browser.
Bookmark it. You'll use it more than you think.
Here's the complete cheat sheet in the most compact form possible. Print it, pin it, or keep it in a browser tab.
Characters: . any char, \d digit, \w word char, \s whitespace, \b word boundary
Quantifiers: * 0+, + 1+, ? 0-1, {n} exactly n, {n,m} n to m, {n,} n or more
Groups: (x) capture, (?:x) non-capture, (?<name>x) named, \1 backreference
Assertions: ^ start, $ end, (?=x) lookahead, (?!x) neg lookahead, (?<=x) lookbehind, (?<!x) neg lookbehind
Flags: g global, i case-insensitive, m multiline, s dotAll, u unicode
Escapes: \. literal dot, \\ literal backslash, \* literal asterisk -- escape any metacharacter with \
That covers every regex concept you'll encounter in day-to-day development. The patterns in this cheat sheet handle the vast majority of validation, parsing, and text processing tasks. For anything exotic, the fundamentals above give you enough understanding to construct or decode whatever pattern you come across.
Keep building. Keep testing. And when a regex makes no sense, break it apart one token at a time -- it always clicks eventually.