Regular expressions (regex) are one of the most powerful — and most feared — tools in programming. They let you search, match, and manipulate text with surgical precision. The syntax looks cryptic at first, but the core concepts are surprisingly simple. This guide teaches you regex through practical, real-world examples.
What Regular Expressions Are
A regular expression is a pattern that describes a set of strings. Think of it as a search query with superpowers. Where a normal search finds exact matches ("hello" finds only "hello"), a regex can find patterns ("any word starting with h and ending with o").
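As a rough illustration of the "starts with h and ends with o" idea, here is a minimal Python sketch using the standard re module. The sample text and variable names are made up for this example, and the symbols in the pattern (\b, \w, *) are explained in the sections that follow.

```python
import re

# Hypothetical sample text, chosen only for illustration.
sample = "hello hero halo help hippo"

# \b = word boundary, h = literal "h", \w* = any run of word characters,
# o = literal "o". Together: whole words that start with h and end with o.
h_to_o_words = re.findall(r"\bh\w*o\b", sample)

print(h_to_o_words)  # ['hello', 'hero', 'halo', 'hippo']
```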
Nearly every mainstream programming language supports regex, either natively or through its standard library: JavaScript, Python, Java, Go, PHP, Ruby, C#. The core syntax is largely consistent across languages, with flavor differences that mostly affect advanced features. You can test patterns instantly with our Regex Tester.
Literal Characters
The simplest regex is just literal text. The pattern hello matches the string "hello" exactly. Most characters, including letters, numbers, and spaces, match themselves. The exceptions are metacharacters, which have a special meaning inside a pattern: . * + ? ^ $ { } [ ] ( ) | and the backslash (\).
To match a literal special character, escape it with a backslash: \. matches an actual period, \$ matches a dollar sign.
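To make the effect of escaping concrete, the following Python sketch compares an unescaped and an escaped period. The price text and variable names are invented for this example; re.escape is the standard-library helper that escapes metacharacters in a plain string.

```python
import re

# Hypothetical input text for illustration.
price_text = "Total: $4.99 (was $5.99)"

# Unescaped, "." is a wildcard, so the pattern 4.99 also matches "4x99".
unescaped_hit = re.search(r"4.99", "4x99") is not None

# Escaped, "\." matches only a literal period.
escaped_hit = re.search(r"4\.99", "4x99") is not None

# "\$" matches a literal dollar sign instead of acting as an anchor.
prices = re.findall(r"\$\d+\.\d+", price_text)

# re.escape() adds the backslashes for you when the text comes from elsewhere.
literal_pattern = re.escape("$4.99")
literal_hits = re.findall(literal_pattern, price_text)

print(unescaped_hit, escaped_hit)  # True False
print(prices)                      # ['$4.99', '$5.99']
print(literal_hits)                # ['$4.99']
```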
Character Classes
Character classes match one character from a set of options:
- [aeiou] matches any lowercase vowel
- [0-9] matches any digit
- [a-zA-Z] matches any ASCII letter
- [^0-9] matches anything except a digit (the ^ inside brackets negates)
Common shorthand classes save typing:
- \d matches any digit (same as [0-9] for ASCII text)
- \w matches any "word character" (letter, digit, or underscore)
- \s matches any whitespace (space, tab, newline)
- \D, \W, \S are the negated versions of the above
- . matches any character except newline (the wildcard)
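The classes above can be tried out with a short Python sketch. The sample strings and variable names are illustrative only. One flavor detail worth knowing: in Python 3, \d and \w are Unicode-aware for text patterns unless the re.ASCII flag is passed, so they can match more than [0-9] and [a-zA-Z0-9_].

```python
import re

# Bracketed classes: one character from the listed set.
vowels = re.findall(r"[aeiou]", "regex")
letters_only = re.findall(r"[a-zA-Z]+", "abc123def")

# A leading ^ inside brackets negates the class.
non_digits = re.findall(r"[^0-9]+", "ab12cd")

# Shorthand classes.
digits = re.findall(r"\d", "B-12, day 7")
words = re.findall(r"\w+", "snake_case and kebab-case")
fields = re.split(r"\s+", "tabs\tand   spaces\nnewlines")

print(vowels)        # ['e', 'e']
print(letters_only)  # ['abc', 'def']
print(non_digits)    # ['ab', 'cd']
print(digits)        # ['1', '2', '7']
print(words)         # ['snake_case', 'and', 'kebab', 'case']
print(fields)        # ['tabs', 'and', 'spaces', 'newlines']
```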
Quantifiers — How Many?
Quantifiers specify how many times the preceding element should repeat:
- * matches zero or more times
- + matches one or more times
- ? matches zero or one time (optional)
- {3} matches exactly 3 times
- {2,5} matches between 2 and 5 times
- {3,} matches 3 or more times
Examples:
- \d+ matches one or more digits ("42", "7", "12345")
- \w{3,8} matches a run of 3 to 8 word characters
- https? matches "http" or "https" (the "s" is optional)
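A small Python sketch of these quantifiers in action follows. The inputs are made up for illustration. It also shows a subtlety of bounded quantifiers: without word boundaries, \w{3,8} will happily match a piece of a longer word.

```python
import re

numbers = re.findall(r"\d+", "42 apples, 7 pears, 12345 grapes")
schemes = re.findall(r"https?", "http and https")

# {3,8} caps a single match at 8 characters, so a 12-letter word is
# split into two separate matches.
chunks = re.findall(r"\w{3,8}", "concatenates")

# Adding \b on both sides restricts matches to whole words of 3 to 8 characters.
whole_words = re.findall(r"\b\w{3,8}\b", "a cat concatenates")

# {3} demands exactly three repetitions.
exactly_three = re.fullmatch(r"\d{3}", "123") is not None
four_digits = re.fullmatch(r"\d{3}", "1234") is not None

print(numbers)      # ['42', '7', '12345']
print(schemes)      # ['http', 'https']
print(chunks)       # ['concaten', 'ates']
print(whole_words)  # ['cat']
print(exactly_three, four_digits)  # True False
```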
Anchors — Where to Match
- ^ matches the start of the string (or of a line, with the multiline flag)
- $ matches the end of the string (or of a line, with the multiline flag)
- \b matches a word boundary (the position between a word character and a non-word character)
Examples:
- ^Hello matches "Hello" only at the start of the string
- world$ matches "world" only at the end
- \bcat\b matches the word "cat" but not "category" or "concatenate"
Use ^ and $ to ensure your pattern matches the entire string, not just a substring.
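The difference between an anchored and an unanchored pattern can be sketched in Python as follows. The function name looks_like_zip and the sample inputs are hypothetical, chosen only to illustrate validation of a five-digit code.

```python
import re

ZIP_UNANCHORED = re.compile(r"\d{5}")
ZIP_ANCHORED = re.compile(r"^\d{5}$")


def looks_like_zip(value: str) -> bool:
    """Return True only if the whole value is five digits (hypothetical helper)."""
    return ZIP_ANCHORED.search(value) is not None


# Unanchored: finds five digits buried inside a longer string.
substring_hit = ZIP_UNANCHORED.search("id-1234567") is not None

# Anchored: the same input is rejected because it is not five digits overall.
whole_string_hit = looks_like_zip("id-1234567")

# \b keeps "cat" from matching inside longer words.
cat_hits = re.findall(r"\bcat\b", "cat category concatenate bobcat cat.")

print(substring_hit, whole_string_hit)  # True False
print(looks_like_zip("90210"))          # True
print(cat_hits)                         # ['cat', 'cat']
```

In Python specifically, $ also matches just before a trailing newline, so re.fullmatch(r"\d{5}", value) is a stricter alternative when that matters.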
Groups and Alternation
Parentheses () create groups. Groups serve two purposes: they apply quantifiers to multiple characters, and they capture the matched text for later use.
- (abc)+ matches "abc" repeated one or more times: "abc", "abcabc"
- (cat|dog) matches "cat" or "dog" (alternation with |)
- (\d{3})-(\d{4}) matches and captures phone number parts like "555-1234"
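Both uses of groups, repetition and capturing, can be seen in a brief Python sketch. The sample sentences and variable names are assumptions made for illustration.

```python
import re

# Grouping lets a quantifier apply to several characters at once.
repeated = re.fullmatch(r"(abc)+", "abcabc") is not None

# Alternation inside a group: either word matches.
pets = re.findall(r"(cat|dog)", "cat, dog, bird")

# Capturing groups expose the matched parts by number.
match = re.search(r"(\d{3})-(\d{4})", "Call 555-1234 today")
whole = match.group(0)
first_part = match.group(1)
second_part = match.group(2)

# Captured text can be reused in a replacement via \1, \2, and so on.
reformatted = re.sub(r"(\d{3})-(\d{4})", r"\1 \2", "Call 555-1234 today")

print(repeated)                        # True
print(pets)                            # ['cat', 'dog']
print(whole, first_part, second_part)  # 555-1234 555 1234
print(reformatted)                     # Call 555 1234 today
```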
Practical Regex Patterns
Email Address (Simplified)
[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}
This matches most common email formats. For production use, email validation is notoriously complex — the full RFC 5322 spec is impractical as a regex. Our Extract Emails tool handles this for you.
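For a quick sense of how the simplified pattern behaves, here is a minimal Python sketch that pulls addresses out of free text. The addresses and the name EMAIL_RE are invented for this example, and the sketch inherits every limitation of the simplified pattern.

```python
import re

# The simplified email pattern from this guide.
EMAIL_RE = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")

# Hypothetical text containing two addresses and one non-address.
text = "Contact alice@example.com or bob.smith+news@mail.example.org. Not an email: user@localhost"

emails = EMAIL_RE.findall(text)

print(emails)  # ['alice@example.com', 'bob.smith+news@mail.example.org']
```

The sentence-ending period after the second address is left out because the pattern requires at least two letters after the final dot.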
URL
https?://[\w.-]+(?:\.[a-zA-Z]{2,})(?:/[\w./?#&=-]*)?
Matches HTTP and HTTPS URLs. Extract all URLs from text with our URL Extractor.
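The URL pattern can be exercised with a short Python sketch like the one below. The URLs and the name URL_RE are placeholders for illustration.

```python
import re

# The URL pattern from this guide.
URL_RE = re.compile(r"https?://[\w.-]+(?:\.[a-zA-Z]{2,})(?:/[\w./?#&=-]*)?")

# Hypothetical text with one URL that has a path and one that does not.
text = "Docs at https://example.com/docs/intro?lang=en and http://sub.example.org."

urls = URL_RE.findall(text)

print(urls)  # ['https://example.com/docs/intro?lang=en', 'http://sub.example.org']
```

One caveat of this pattern: because the path part allows periods, a URL with a path that ends a sentence (for example "https://example.com/page.") will be captured together with the trailing period.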
IP Address (IPv4)
\b(?:\d{1,3}\.){3}\d{1,3}\b
Note: this matches the format but doesn't validate that each octet is 0-255. A proper validator needs additional logic.
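One way to add that extra logic is to let the regex check the shape and then check each octet numerically. The sketch below is a minimal Python version; the function name is_valid_ipv4 is hypothetical, and the check deliberately ignores finer points such as leading zeros.

```python
import re

# The format-only pattern from this guide.
IPV4_SHAPE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def is_valid_ipv4(candidate: str) -> bool:
    """Check the dotted shape with regex, then the 0-255 range with arithmetic."""
    if IPV4_SHAPE.fullmatch(candidate) is None:
        return False
    return all(0 <= int(octet) <= 255 for octet in candidate.split("."))


# The regex alone accepts an impossible address.
shape_only = IPV4_SHAPE.fullmatch("999.1.1.1") is not None

print(shape_only)                    # True
print(is_valid_ipv4("999.1.1.1"))    # False
print(is_valid_ipv4("192.168.1.1"))  # True
print(is_valid_ipv4("192.168.1"))    # False
```

In Python, the standard-library ipaddress module is another option when strict validation is needed.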
Date (YYYY-MM-DD)
\d{4}-(?:0[1-9]|1[0-2])-(?:0[1-9]|[12]\d|3[01])
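The date pattern restricts months to 01-12 and days to 01-31, but it still accepts impossible dates such as 2023-02-30. A possible way to close that gap, sketched in Python, is to let the regex check the shape and let datetime.date check the calendar. The function name parse_iso_date is hypothetical, and the groups are switched from non-capturing to capturing so the parts can be read back.

```python
import re
from datetime import date
from typing import Optional

# Same shape as the pattern in this guide, but with capturing groups (an
# assumption made here so year, month, and day can be extracted).
DATE_RE = re.compile(r"(\d{4})-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])")


def parse_iso_date(text: str) -> Optional[date]:
    """Return a date if text is a real YYYY-MM-DD calendar date, else None."""
    match = DATE_RE.fullmatch(text)
    if match is None:
        return None
    year, month, day = (int(part) for part in match.groups())
    try:
        return date(year, month, day)  # raises ValueError for impossible dates
    except ValueError:
        return None


print(parse_iso_date("2024-02-29"))  # 2024-02-29
print(parse_iso_date("2023-02-30"))  # None
print(parse_iso_date("2024-13-01"))  # None
```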
Hex Color
#(?:[0-9a-fA-F]{3}){1,2}\b
Matches both short (#FFF) and long (#FFFFFF) hex color codes.
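A quick Python sketch shows what the pattern accepts and rejects. The CSS-like string and the name HEX_COLOR are made up for illustration.

```python
import re

# The hex color pattern from this guide.
HEX_COLOR = re.compile(r"#(?:[0-9a-fA-F]{3}){1,2}\b")

# Hypothetical input: two valid codes, one with a bad character, one with 4 digits.
css = "color: #FFF; background: #1a2B3c; border: #12345g; outline: #FFFF"

colors = HEX_COLOR.findall(css)

print(colors)  # ['#FFF', '#1a2B3c']
```

The trailing \b is what rejects #12345g and #FFFF: after three hex digits the next character is still a word character, so there is no word boundary to match.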
Greedy vs Lazy Matching
By default, quantifiers are greedy — they match as much as possible. Adding ? after a quantifier makes it lazy — it matches as little as possible.
Consider matching HTML tags in <b>bold</b> and <i>italic</i>:
- <.+> (greedy) matches <b>bold</b> and <i>italic</i> as a single match: everything from the first < to the last >
- <.+?> (lazy) matches <b>, then </b>, then <i>, then </i>: each tag individually
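The greedy and lazy behavior described above can be reproduced with a few lines of Python. The variable names are illustrative; the third pattern, a negated character class, is a common alternative added here for comparison and is not part of the original example.

```python
import re

html = "<b>bold</b> and <i>italic</i>"

# Greedy: .+ grabs as much as it can, then backs off to the last ">".
greedy = re.findall(r"<.+>", html)

# Lazy: .+? stops at the first ">" it can reach.
lazy = re.findall(r"<.+?>", html)

# Alternative: "anything that is not >" gives the same result here.
negated = re.findall(r"<[^>]+>", html)

print(greedy)   # ['<b>bold</b> and <i>italic</i>']
print(lazy)     # ['<b>', '</b>', '<i>', '</i>']
print(negated)  # ['<b>', '</b>', '<i>', '</i>']
```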
Tips for Writing Better Regex
- Start simple, refine incrementally: Get a basic pattern working first, then add edge case handling
- Test with edge cases: Try empty strings, strings with only whitespace, very long inputs, and boundary conditions
- Use non-capturing groups when you don't need captures: (?:abc) groups without capturing, which is slightly more efficient
- Anchor when possible: ^\d{5}$ is faster and more precise than \d{5} when validating an entire string
- Comment complex patterns: Many regex engines support a verbose mode (the x flag) that allows whitespace and comments
- Don't regex everything: Some things are better handled with a parser. HTML, JSON, and email addresses are famously hard to regex correctly
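As one example of the verbose-mode tip, the date pattern from earlier in this guide could be written with comments in Python, where the x flag is spelled re.VERBOSE. The name DATE_VERBOSE is hypothetical, and other engines that support verbose mode may differ in details.

```python
import re

# In verbose mode, whitespace in the pattern is ignored and "#" starts a comment.
DATE_VERBOSE = re.compile(
    r"""
    \d{4}                     # four-digit year
    -                         # literal separator
    (?:0[1-9]|1[0-2])         # month, 01 through 12
    -                         # literal separator
    (?:0[1-9]|[12]\d|3[01])   # day, 01 through 31
    """,
    re.VERBOSE,
)

valid = DATE_VERBOSE.fullmatch("2024-03-15") is not None
missing_zero = DATE_VERBOSE.fullmatch("2024-3-15") is not None

print(valid, missing_zero)  # True False
```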
Practice with Real Tools
The best way to learn regex is by doing. Our Regex Tester lets you write patterns and see matches highlighted in real-time, with match groups and flags support. Try the patterns from this guide, then experiment with your own text. You can also use Find and Replace with regex support for text transformations.