Regex Tester
Test and debug your regular expressions with real-time pattern matching and detailed results.
How to Use the Regex Tester
Enter your regular expression pattern and test string to see matches, groups, and positions.
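How the tester computes its results is not described here, but the same three pieces of information (matches, groups, and positions) can be reproduced in code. The sketch below uses Python's re module as a rough analogue; the pattern and test string are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical pattern and test string, for illustration only.
pattern = re.compile(r"(\w+)@(\w+)\.com")
text = "Contact alice@example.com or bob@test.com"

results = []
for m in pattern.finditer(text):
    # group(0) is the full match, groups() holds the captured groups,
    # span() gives the (start, end) offsets within the test string.
    results.append((m.group(0), m.groups(), m.span()))

for full_match, groups, span in results:
    print(full_match, groups, span)
```

Each tuple holds one match: the matched text, its captured groups, and where it starts and ends in the test string.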
What Are Regular Expressions?
Regular expressions (regex or regexp) are sequences of characters that define search patterns. They originated in formal language theory in the 1950s and reached everyday use through Unix tools such as ed, grep, sed, and awk in the 1970s. Today, regex is supported in virtually every programming language (JavaScript, Python, Java, PHP, C#, Go, Ruby, and more), which makes it one of the most broadly useful skills a developer can learn.
At their core, regular expressions let you describe a pattern rather than a literal string. Instead of searching for the exact word "cat", you can search for "any three-character word starting with c" using the pattern \bc\w{2}\b. Here \w matches a letter, digit, or underscore, and the \b anchors keep the pattern from matching inside longer words. This power makes regex essential for data validation, text parsing, search-and-replace operations, log analysis, and input sanitisation.
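To make the difference between a literal search and a pattern concrete, here is a small sketch in Python. The sample sentence is made up. It also shows that c\w{2} without word boundaries matches inside longer words, while \bc\w{2}\b matches whole three-character words only.

```python
import re

text = "The cat sat in a cab near the cathedral."

# Literal search: only the exact substring "cat" (also found inside "cathedral").
literal_hits = re.findall(r"cat", text)

# Pattern without boundaries: "c" plus two word characters, anywhere in the text.
loose_hits = re.findall(r"c\w{2}", text)

# Pattern with \b on both sides: whole three-character words starting with "c".
word_hits = re.findall(r"\bc\w{2}\b", text)

print(literal_hits)  # ['cat', 'cat']
print(loose_hits)    # ['cat', 'cab', 'cat']
print(word_hits)     # ['cat', 'cab']
```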
Regex Syntax Cheat Sheet
Character Classes
| Pattern | Meaning | Example Match |
|---|---|---|
| . | Any character except newline | a, Z, 5, @ |
| \d | Any digit (0–9) | 0, 7, 3 |
| \D | Any non-digit | a, !, space |
| \w | Word character (a-z, A-Z, 0-9, _) | a, Z, 5, _ |
| \W | Non-word character | !, @, space |
| \s | Whitespace (space, tab, newline) | space, \t, \n |
| [abc] | Any character in the set | a, b, or c |
| [^abc] | Any character NOT in the set | d, 1, ! |
| [a-z] | Any character in range | a through z |
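The classes in the table can be checked against a short sample string. This is a minimal sketch using Python's re module; the sample string is arbitrary. One assumption to be aware of: for Python str patterns, \d and \w are Unicode-aware by default, so the re.ASCII flag is used here to get the ASCII-only ranges listed in the table.

```python
import re

sample = "a1 B_2!"

def chars(pattern: str) -> list[str]:
    """Return every single-character match of pattern in the sample."""
    return re.findall(pattern, sample, re.ASCII)

print(chars(r"\d"))      # ['1', '2']
print(chars(r"\D"))      # ['a', ' ', 'B', '_', '!']
print(chars(r"\w"))      # ['a', '1', 'B', '_', '2']
print(chars(r"\W"))      # [' ', '!']
print(chars(r"\s"))      # [' ']
print(chars(r"[abc]"))   # ['a']
print(chars(r"[^abc]"))  # ['1', ' ', 'B', '_', '2', '!']
print(chars(r"[a-z]"))   # ['a']
```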
Quantifiers
| Pattern | Meaning | Example |
|---|---|---|
| * | 0 or more | ab*c matches ac, abc, abbc |
| + | 1 or more | ab+c matches abc, abbc (not ac) |
| ? | 0 or 1 (optional) | colou?r matches color, colour |
| {n} | Exactly n times | \d{4} matches 2024, 1999 |
| {n,m} | Between n and m times | \d{2,4} matches 12, 123, 1234 |
| *?, +? | Lazy (non-greedy) versions | Match as few characters as possible |
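The quantifier examples in the table can be verified with a short script. The sketch below uses Python's re.fullmatch, so a candidate counts only if the whole string matches; the helper name fully_matches and the extra non-matching candidates are illustrative additions.

```python
import re

def fully_matches(pattern: str, candidates: list[str]) -> list[str]:
    """Return the candidates that the pattern matches from start to end."""
    return [s for s in candidates if re.fullmatch(pattern, s)]

print(fully_matches(r"ab*c", ["ac", "abc", "abbc", "adc"]))        # ['ac', 'abc', 'abbc']
print(fully_matches(r"ab+c", ["ac", "abc", "abbc"]))               # ['abc', 'abbc']
print(fully_matches(r"colou?r", ["color", "colour", "colouur"]))   # ['color', 'colour']
print(fully_matches(r"\d{4}", ["2024", "1999", "123", "12345"]))   # ['2024', '1999']
print(fully_matches(r"\d{2,4}", ["1", "12", "123", "1234", "12345"]))  # ['12', '123', '1234']
```

With a search-style match instead of a full match, \d{4} would also find four digits inside a longer run such as 12345, which is why anchors or full-match functions matter for validation.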
Anchors and Groups
| Pattern | Meaning |
|---|---|
| ^ | Start of string (or line with m flag) |
| $ | End of string (or line with m flag) |
| \b | Word boundary |
| (abc) | Capturing group: matches "abc" and remembers it |
| (?:abc) | Non-capturing group: groups without capturing |
| a\|b | Alternation: matches "a" or "b" |
| (?=abc) | Positive lookahead: matches if followed by "abc" |
| (?<=abc) | Positive lookbehind: matches if preceded by "abc" |
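The following sketch exercises each row of the table with Python's re module. All sample strings are invented for the example. Note that lookbehind support varies by engine, and Python requires the lookbehind to have a fixed width.

```python
import re

lines = "first line\nsecond line"

# ^ matches only at the start of the string unless the multiline flag is set.
start_default = re.findall(r"^\w+", lines)
start_multiline = re.findall(r"^\w+", lines, re.MULTILINE)

# \b keeps "cat" from matching inside "concat".
whole_words = re.findall(r"\bcat\b", "cat concat cat.")

# The month is grouped with (?:...) and therefore not captured.
date_groups = re.search(r"(\d{4})-(?:\d{2})-(\d{2})", "Date: 2024-03-15").groups()

# Alternation picks either branch.
animals = re.findall(r"cat|dog", "cat, dog, bird")

# Lookahead: digits followed by "px"; the "px" is not part of the match.
pixel_values = re.findall(r"\d+(?=px)", "10px 20em 30px")

# Lookbehind: digits preceded by "$"; the "$" is not part of the match.
prices = re.findall(r"(?<=\$)\d+", "cost $15, 20 units")

print(start_default, start_multiline)  # ['first'] ['first', 'second']
print(whole_words)                     # ['cat', 'cat']
print(date_groups)                     # ('2024', '15')
print(animals, pixel_values, prices)   # ['cat', 'dog'] ['10', '30'] ['15']
```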
Common Regex Patterns
- Email: [a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}
  Covers most common email address formats. For strict RFC 5322 compliance, use your language's built-in email validator or a well-tested library.
- Phone (US): \(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}
  Matches (555) 123-4567, 555-123-4567, 555.123.4567, and 5551234567.
- Phone (India): [6-9]\d{9}
  Matches 10-digit Indian mobile numbers: a first digit of 6, 7, 8, or 9 followed by 9 more digits.
- URL: https?://[^\s/$.?#].[^\s]*
  Matches HTTP and HTTPS URLs with various path structures.
- Date (MM/DD/YYYY): (0[1-9]|1[0-2])/(0[1-9]|[12]\d|3[01])/\d{4}
  Validates the month (01-12) and day (01-31) ranges, but does not check the day against the month, so 02/31/2024 still passes.
- IP Address (IPv4): \b(?:\d{1,3}\.){3}\d{1,3}\b
  Matches the dot-separated decimal format. Add range validation for strict 0-255 checking.
- HTML Tag: <([a-z]+)([^<]*?)(?:>(.*?)</\1>|/>)
  Matches a lowercase element with its closing tag, or a self-closing tag. Note: regex is not recommended for full HTML parsing.
- Password (Strong): ^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$
  Requires a lowercase letter, an uppercase letter, a digit, a special character from @$!%*?&, and a minimum length of 8. Characters outside those sets are rejected.
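Two of the patterns above check only the shape of the input: the IPv4 pattern accepts 999.999.999.999, and the date pattern accepts 02/31/2024. One way to close those gaps, sketched here in Python, is to pair the regex with a plain numeric or calendar check. The function names is_valid_ipv4 and is_real_date are hypothetical, and the sketch uses full-string matching rather than the \b anchors from the list.

```python
import re
from datetime import datetime

# Shape checks taken from the patterns above; re.ASCII limits \d to 0-9.
IPV4_SHAPE = re.compile(r"(?:\d{1,3}\.){3}\d{1,3}", re.ASCII)
DATE_SHAPE = re.compile(r"(0[1-9]|1[0-2])/(0[1-9]|[12]\d|3[01])/\d{4}", re.ASCII)

def is_valid_ipv4(candidate: str) -> bool:
    """Regex checks the shape; the 0-255 range is checked numerically."""
    if not IPV4_SHAPE.fullmatch(candidate):
        return False
    return all(0 <= int(part) <= 255 for part in candidate.split("."))

def is_real_date(candidate: str) -> bool:
    """Regex checks the shape; strptime rejects impossible calendar dates."""
    if not DATE_SHAPE.fullmatch(candidate):
        return False
    try:
        datetime.strptime(candidate, "%m/%d/%Y")
    except ValueError:
        return False
    return True

print(is_valid_ipv4("192.168.1.1"), is_valid_ipv4("256.1.1.1"))  # True False
print(is_real_date("12/25/2024"), is_real_date("02/31/2024"))    # True False
```

This sketch still accepts octets with leading zeros such as 01.2.3.4; whether that is acceptable depends on the application.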
Greedy vs Lazy Matching
By default, quantifiers (*, +, {n,m}) are greedy — they match as many characters as possible. Adding ? after a quantifier makes it lazy (non-greedy), matching as few characters as possible.
This distinction matters when parsing structured text. Given the input <b>bold</b> and <i>italic</i>, the greedy pattern <.*> matches the entire string from the first < to the last >. The lazy pattern <.*?> matches only <b>, then </b>, then <i>, then </i> — each tag individually.
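The behaviour described above can be reproduced directly. This sketch uses Python's re module with the same input string; the third pattern, a negated character class, is an addition that is often used as an alternative to a lazy quantifier for this kind of input.

```python
import re

html = "<b>bold</b> and <i>italic</i>"

# Greedy: .* runs to the last ">" in the string.
greedy = re.findall(r"<.*>", html)

# Lazy: .*? stops at the first ">" it can reach.
lazy = re.findall(r"<.*?>", html)

# Alternative: match anything that is not ">" and avoid backtracking over it.
negated = re.findall(r"<[^>]*>", html)

print(greedy)   # ['<b>bold</b> and <i>italic</i>']
print(lazy)     # ['<b>', '</b>', '<i>', '</i>']
print(negated)  # ['<b>', '</b>', '<i>', '</i>']
```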
Regex Flags Explained
- i (Case Insensitive): Makes the pattern match regardless of letter case. /hello/i matches "Hello", "HELLO", and "hello".
- g (Global): Finds all matches in the string, not just the first one. Without g, the engine stops after the first match.
- m (Multiline): Changes the behaviour of ^ and $ to match the start and end of each line (separated by \n), rather than of the entire string.
- s (Dotall / Single-line): Makes . match newline characters too. By default, . matches everything except \n.
- x (Extended): Allows whitespace and comments inside the pattern for readability. The engine ignores unescaped spaces and treats # as a comment delimiter. This flag is not available in every engine; JavaScript, for example, does not support it.
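Flag names and availability differ between engines. As one illustration, the sketch below shows the Python equivalents: re.IGNORECASE, re.MULTILINE, re.DOTALL, and re.VERBOSE correspond to i, m, s, and x. Python has no g flag; the choice between re.search (first match) and re.findall (all matches) plays that role. The sample strings are invented.

```python
import re

# i: case-insensitive matching.
greetings = re.findall(r"hello", "Hello HELLO hello", re.IGNORECASE)

# g: no flag in Python; pick the function instead.
first_number = re.search(r"\d+", "1 22 333").group()
all_numbers = re.findall(r"\d+", "1 22 333")

# m: $ matches at the end of every line, not only at the end of the string.
text = "one two\nthree four"
last_word_default = re.findall(r"\w+$", text)
last_word_multiline = re.findall(r"\w+$", text, re.MULTILINE)

# s: the dot also matches a newline.
dot_default = re.search(r"a.b", "a\nb")
dot_all = re.search(r"a.b", "a\nb", re.DOTALL)

# x: whitespace is ignored and # starts a comment inside the pattern.
phone = re.compile(
    r"""
    \d{3}   # first three digits
    -       # literal hyphen
    \d{4}   # last four digits
    """,
    re.VERBOSE,
)

print(greetings)                               # ['Hello', 'HELLO', 'hello']
print(first_number, all_numbers)               # 1 ['1', '22', '333']
print(last_word_default, last_word_multiline)  # ['four'] ['two', 'four']
print(dot_default is None, dot_all is None)    # True False
print(bool(phone.fullmatch("555-1234")))       # True
```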
Common Mistakes and How to Avoid Them
- Forgetting to escape special characters: Characters like . * + ? ^ $ { } [ ] | ( ) \ have special meaning in regex. To match them literally, escape with a backslash: \. matches a literal period.
- Using regex for HTML parsing: Regular expressions cannot reliably parse nested or recursive structures like HTML. Use a proper DOM parser (DOMDocument in PHP, cheerio in Node.js, BeautifulSoup in Python) for HTML manipulation.
- Catastrophic backtracking: Nested quantifiers like (a+)+ can cause exponential processing time on certain inputs in backtracking engines. Attackers can exploit this as ReDoS (Regular Expression Denial of Service). Avoid nesting quantifiers and test performance with long inputs.
- Overly broad patterns: A pattern like .* matches everything and is rarely useful on its own. Be as specific as possible: \d{3}-\d{4} is better than .{3}-.{4} for matching phone number segments.
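Three of the mistakes above are easy to demonstrate. The sketch below uses Python's re module and deliberately short inputs: the nested pattern (a+)+b and its flat rewrite a+b accept the same strings, but on long runs of "a" with no trailing "b" the nested form becomes extremely slow in a backtracking engine such as Python's, so that case is described rather than executed. All sample strings are invented.

```python
import re

# 1. An unescaped "." matches any character; "\." matches only a period.
unescaped = re.fullmatch(r"3.14", "3x14")
escaped = re.fullmatch(r"3\.14", "3x14")

# re.escape builds a pattern that matches arbitrary text literally.
literal = re.compile(re.escape("1+1=2"))
unintended = re.fullmatch("1+1=2", "1+1=2")  # "+" acts as a quantifier here

# 2. Flattening a nested quantifier keeps the accepted strings the same.
nested = re.compile(r"(a+)+b")
flat = re.compile(r"a+b")
samples = ["ab", "aaab", "b", "aaa"]  # kept short on purpose
same_results = all(
    bool(nested.fullmatch(s)) == bool(flat.fullmatch(s)) for s in samples
)

# 3. A broad pattern accepts input that a specific one rejects.
broad = re.fullmatch(r".{3}-.{4}", "abc-defg")
specific = re.fullmatch(r"\d{3}-\d{4}", "abc-defg")

print(unescaped is not None, escaped is None)             # True True
print(bool(literal.fullmatch("1+1=2")), unintended is None)  # True True
print(same_results)                                       # True
print(broad is not None, specific is None)                # True True
```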
Frequently Asked Questions — Regex Tester
What is a regular expression?
A regular expression is a sequence of characters that defines a search pattern. It is used in programming to find, match, and manipulate text. Regex patterns can match literal strings ("hello"), character classes ([a-z]), repetitions (\d{3}), anchors (^start, end$), and more complex structures. Regex is supported in virtually every programming language, though syntax varies slightly between implementations.
What do the regex flags mean?
i (case-insensitive) matches regardless of letter case. g (global) finds all matches, not just the first. m (multiline) makes ^ and $ match line starts and ends instead of the string start and end. s (dotall/single-line) makes the dot (.) match newline characters too. x (extended) allows whitespace and comments in the pattern for readability. Not all flags are available in all languages.
What is the difference between greedy and lazy matching?
Greedy quantifiers (*, +, {n,}) match as many characters as possible. Lazy (non-greedy) quantifiers (*?, +?, {n,}?) match as few as possible. Example: given "<b>bold</b><i>italic</i>", the pattern <.*> greedily matches the entire string, while the first match of <.*?> is just "<b>". Greedy is the default; add ? to make any quantifier lazy.
How do I validate an email address with regex?
A simple email pattern is ^[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}$, which covers most real-world addresses. The official RFC 5322 email specification is far more complex than any short pattern can capture. For production use, prefer your language's built-in email validation or a well-tested library over a custom regex.
Which characters need to be escaped?
The regex special characters that must be escaped with a backslash (\) to match literally are . * + ? ^ $ { } [ ] | ( ) \. For example, to match a literal dot use \., and to match a literal parenthesis use \(. In most programming languages you also need to escape the backslash itself inside ordinary string literals, so the regex \. is written "\\." in a Java or C# string, but just /\./ as a JavaScript regex literal.
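The same string-literal issue appears in Python, which is used here only as an illustration. In an ordinary string the backslash must be doubled, while a raw string (the r prefix) passes backslashes through unchanged. The \b case shows why this matters: in an ordinary Python string, "\b" is a backspace character, not a regex word boundary.

```python
import re

# An escaped backslash in a normal string equals the raw-string form.
same_pattern = "\\." == r"\."

# Both spell the two-character regex \. which matches a literal dot.
dots = re.findall(r"\.", "a.b.c")

# "\b" in a normal string is a backspace character, so nothing matches.
without_raw = re.findall("\bcat\b", "a cat")

# r"\b" reaches the regex engine as a word boundary.
with_raw = re.findall(r"\bcat\b", "a cat")

print(same_pattern)  # True
print(dots)          # ['.', '.']
print(without_raw)   # []
print(with_raw)      # ['cat']
```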