Regular Expressions (Regex) - Complete Guide

Regular expressions (regex) are patterns used to match character combinations in strings. This guide covers the core syntax, JavaScript's regex methods, common validation patterns, and practical examples.

What are Regular Expressions? #

Regular expressions are a powerful tool for pattern matching and text manipulation. They’re supported in most programming languages.

Basic Syntax #

Literal Characters #

/hello/       // Matches "hello"
/123/         // Matches "123"

Special Characters #

These characters have a special meaning in a pattern. To match one literally, escape it with a backslash (\):

. ^ $ * + ? { } [ ] \ | ( )
/\./          // Matches "."
/\*/          // Matches "*"
/\\/          // Matches "\"
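
When a pattern is built from a string at runtime (user input, for example), every special character in that string needs the same escaping. A minimal sketch, assuming a helper named escapeRegExp (a hypothetical name, not part of the original guide):

```javascript
// Hypothetical helper: escape every regex metacharacter in a plain string
// so it can be embedded in a pattern and matched literally.
function escapeRegExp(text) {
  // "$&" in the replacement string stands for the whole matched substring.
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

const literal = "1+1=2 (really?)";
const escapedPattern = new RegExp(escapeRegExp(literal));
const rawPattern = new RegExp(literal); // metacharacters left active

console.log(escapeRegExp(literal));        // 1\+1=2 \(really\?\)
console.log(escapedPattern.test(literal)); // true
console.log(rawPattern.test(literal));     // false: "+", "(", ")" and "?" act as operators
```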

Character Classes #

Basic Classes #

/./ // Any character except newline
/\d/ // Digit [0-9]
/\D/ // Not a digit
/\w/ // Word character [a-zA-Z0-9_]
/\W/ // Not a word character
/\s/ // Whitespace (space, \t, \r, \n, \f, \v, and Unicode spaces)
/\S/ // Not whitespace

Custom Character Classes #

/[abc]/      // Match a, b, or c
/[a-z]/      // Any lowercase letter
/[A-Z]/      // Any uppercase letter
/[0-9]/      // Any digit
/[a-zA-Z]/   // Any letter
/[^abc]/     // NOT a, b, or c
/[a-z0-9]/   // Letter or digit

Quantifiers #

/a*/         // 0 or more 'a'
/a+/         // 1 or more 'a'
/a?/         // 0 or 1 'a'
/a{3}/       // Exactly 3 'a'
/a{2,4}/     // 2 to 4 'a'
/a{2,}/      // 2 or more 'a'
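
The quantifiers above can be checked directly in the console. A short sketch; the sample strings are illustrative and not from the original guide:

```javascript
// Anchored so the quantifier must account for the whole string.
const twoToFour = /^a{2,4}$/;
console.log(twoToFour.test("a"));      // false (too few)
console.log(twoToFour.test("aaa"));    // true
console.log(twoToFour.test("aaaaa"));  // false (too many)

// ? makes the preceding character optional.
const colour = /^colou?r$/;
console.log(colour.test("color"));   // true
console.log(colour.test("colour"));  // true

// + needs at least one character; * is satisfied by zero and can match "".
const plusMatch = "caaandy".match(/a+/)[0];
const starMatch = "xyz".match(/a*/)[0];
console.log(plusMatch);  // "aaa"
console.log(starMatch);  // ""
```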

Greedy vs Lazy #

// Greedy (default)
/<.*>/       // Matches as much as possible
             // In "<div>text</div><span>", result: "<div>text</div><span>"

// Lazy
/<.*?>/      // Matches shortest possible
             // Result: "<div>"
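
The difference is easiest to see by running both patterns against the same input. A small sketch using the sample string from this section (the variable names are illustrative):

```javascript
const html = "<div>text</div><span>";

// Greedy: .* runs to the end of the string, then backtracks to the last ">".
const greedy = html.match(/<.*>/)[0];

// Lazy: .*? stops at the first ">" that lets the pattern succeed.
const lazy = html.match(/<.*?>/)[0];

// With the g flag, the lazy pattern yields each tag separately.
const tags = html.match(/<.*?>/g);

console.log(greedy); // "<div>text</div><span>"
console.log(lazy);   // "<div>"
console.log(tags);   // ["<div>", "</div>", "<span>"]
```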

Anchors #

/^hello/     // Start of string
/world$/     // End of string
/^hello$/    // Exact match
/\bword\b/   // Word boundary
/\Bword\B/   // Not word boundary
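
A quick sketch of how the anchors behave, using "cat" in place of "word" (the sample strings are illustrative):

```javascript
// ^ and $ together force the pattern to cover the entire string.
const exact = /^hello$/;
console.log(exact.test("hello"));        // true
console.log(exact.test("hello world"));  // false

// \b matches between a word character and a non-word character.
const wholeWord = /\bcat\b/;
console.log(wholeWord.test("a cat sat"));    // true
console.log(wholeWord.test("concatenate"));  // false ("cat" is inside a word)

// \B is the opposite: both sides of "cat" must be inside a word.
const insideWord = /\Bcat\B/;
console.log(insideWord.test("concatenate")); // true
console.log(insideWord.test("a cat sat"));   // false
```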

Groups and Capturing #

Capturing Groups #

/(abc)/      // Capture "abc"
/(\d{3})-(\d{2})-(\d{4})/  // SSN: 123-45-6789

const regex = /(\d{3})-(\d{2})-(\d{4})/;
const match = "123-45-6789".match(regex);
console.log(match[1]);  // "123"
console.log(match[2]);  // "45"
console.log(match[3]);  // "6789"

Non-Capturing Groups #

/(?:abc)/    // Group but don't capture
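
The effect shows up in the match result: a capturing group adds an entry to the array, a non-capturing group does not. A short sketch with an illustrative input string:

```javascript
const capturing = "abcabc".match(/(abc)+/);
const nonCapturing = "abcabc".match(/(?:abc)+/);

// Index 0 is always the full match; capture groups follow from index 1.
console.log(capturing[0]);         // "abcabc"
console.log(capturing[1]);         // "abc" (the last repetition of the group)
console.log(capturing.length);     // 2

// Same full match, but nothing is stored for the group.
console.log(nonCapturing[0]);      // "abcabc"
console.log(nonCapturing.length);  // 1
```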

Named Groups #

/(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/

const regex = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;
const match = "2025-08-30".match(regex);
console.log(match.groups.year);   // "2025"
console.log(match.groups.month);  // "08"
console.log(match.groups.day);    // "30"

Alternation #

/cat|dog/    // Match "cat" or "dog"
/gr(e|a)y/   // Match "grey" or "gray"
/(jpg|png|gif)/  // Match image extensions
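
Alternation has the lowest precedence of any operator, so anchors bind to a single alternative unless the alternatives are grouped. A sketch of the difference (the pattern names are illustrative):

```javascript
// Reads as: "starts with cat" OR "ends with dog".
const ungrouped = /^cat|dog$/;

// Reads as: the whole string is "cat" or "dog".
const grouped = /^(?:cat|dog)$/;

console.log(ungrouped.test("catalog")); // true  (starts with "cat")
console.log(ungrouped.test("hotdog"));  // true  (ends with "dog")
console.log(grouped.test("catalog"));   // false
console.log(grouped.test("dog"));       // true
```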

Lookahead and Lookbehind #

Lookahead #

/foo(?=bar)/     // Positive: "foo" followed by "bar"
/foo(?!bar)/     // Negative: "foo" not followed by "bar"

const text = "foobar foobaz";
text.match(/foo(?=bar)/g);  // ["foo"] (first one)
text.match(/foo(?!bar)/g);  // ["foo"] (second one)

Lookbehind #

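/(?<=@)\w+/      // Positive: word characters right after "@"
/\b(?<!@)\w+/    // Negative: whole word not preceded by "@"

const text = "@user1 and user2";
text.match(/(?<=@)\w+/g);     // ["user1"]
text.match(/\b(?<!@)\w+/g);   // ["and", "user2"]
// Without \b, /(?<!@)\w+/g would also match "ser1", starting one character in.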

Flags #

/pattern/g   // Global: find all matches
/pattern/i   // Case insensitive
/pattern/m   // Multiline: ^ and $ match line boundaries
/pattern/s   // Dot matches newline
/pattern/u   // Unicode
/pattern/y   // Sticky: match from lastIndex

// Combine flags
/pattern/gi  // Global and case insensitive
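
A sketch showing what each of the most common flags changes, with and without the flag on the same input (the sample text is illustrative):

```javascript
const text = "Hello\nworld";

// i: ignore case
console.log(/hello/.test(text));   // false
console.log(/hello/i.test(text));  // true

// m: ^ and $ also match at line breaks
console.log(/^world$/.test(text));   // false
console.log(/^world$/m.test(text));  // true

// s: . also matches newline characters
console.log(/Hello.world/.test(text));   // false
console.log(/Hello.world/s.test(text));  // true

// g: act on every match instead of only the first
console.log("a-b-c".replace(/-/, "+"));   // "a+b-c"
console.log("a-b-c".replace(/-/g, "+"));  // "a+b+c"
```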

JavaScript Methods #

test() #

Check if pattern exists:

const regex = /hello/;
regex.test("hello world");  // true
regex.test("goodbye");      // false
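
One thing to watch for: a regex with the g (or y) flag remembers where its last match ended in lastIndex, so repeated test() calls on the same object can alternate between true and false. A short sketch of the behavior (variable names are illustrative):

```javascript
const globalRegex = /hello/g;
const input = "hello world";

const first = globalRegex.test(input);   // true, lastIndex is now 5
const second = globalRegex.test(input);  // false, search resumed at index 5
const third = globalRegex.test(input);   // true, lastIndex was reset to 0

console.log(first, second, third);  // true false true

// Without the g flag, test() always searches from the start.
const plainRegex = /hello/;
console.log(plainRegex.test(input), plainRegex.test(input));  // true true
```

For simple existence checks, leaving off the g flag avoids the problem.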

match() #

Find matches:

const text = "cat bat rat";
text.match(/cat/);      // ["cat"]
text.match(/[cbr]at/g); // ["cat", "bat", "rat"]

matchAll() #

Get all matches with groups:

const text = "test1@example.com test2@example.com";
const regex = /(\w+)@(\w+\.\w+)/g;

for (const match of text.matchAll(regex)) {
  console.log(match[1]);  // username
  console.log(match[2]);  // domain
}

replace() #

Replace matches:

const text = "hello world";
text.replace(/world/, "universe");  // "hello universe"

// With function
text.replace(/\w+/g, match => match.toUpperCase());
// "HELLO WORLD"

// With groups
"John Doe".replace(/(\w+) (\w+)/, "$2, $1");
// "Doe, John"

search() #

Find index of match:

"hello world".search(/world/);  // 6
"hello world".search(/foo/);    // -1

split() #

Split string:

"a,b;c:d".split(/[,;:]/);  // ["a", "b", "c", "d"]

Common Patterns #

Email Validation #

/^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/

function isValidEmail(email) {
  const regex = /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/;
  return regex.test(email);
}

console.log(isValidEmail("user@example.com"));  // true
console.log(isValidEmail("invalid.email"));     // false

Phone Number #

// US Format: (123) 456-7890
/^\(\d{3}\) \d{3}-\d{4}$/

// International: +1-123-456-7890
/^\+\d{1,3}-\d{3}-\d{3}-\d{4}$/

// Flexible
/^[\d\s\-\+\(\)]+$/

URL #

/^(https?:\/\/)?(www\.)?[\w\-]+\.\w{2,}(\/[\w\-\.]*)*$/

function isValidURL(url) {
  const regex = /^(https?:\/\/)?(www\.)?[\w\-]+\.\w{2,}(\/[\w\-\.]*)*$/;
  return regex.test(url);
}

Password Strength #

// At least 8 chars, 1 uppercase, 1 lowercase, 1 digit, 1 special
/^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$/

function isStrongPassword(password) {
  const regex = /^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$/;
  return regex.test(password);
}

Credit Card #

// Visa, MasterCard, Amex
/^(?:4\d{12}(?:\d{3})?|5[1-5]\d{14}|3[47]\d{13})$/
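
If you need to know which brand matched, the three alternatives can be split into separate patterns. A minimal sketch, assuming a hypothetical detectCardType helper and the well-known public test numbers; it checks format only, not whether the number is valid:

```javascript
// The same three alternatives as the combined pattern, one per brand.
const CARD_PATTERNS = {
  visa: /^4\d{12}(?:\d{3})?$/,   // 13 or 16 digits, starts with 4
  mastercard: /^5[1-5]\d{14}$/,  // 16 digits, starts with 51-55
  amex: /^3[47]\d{13}$/,         // 15 digits, starts with 34 or 37
};

// Hypothetical helper: returns the brand name, or null if nothing matches.
function detectCardType(number) {
  const cleaned = number.replace(/[\s-]/g, ""); // drop spaces and dashes
  for (const [type, pattern] of Object.entries(CARD_PATTERNS)) {
    if (pattern.test(cleaned)) return type;
  }
  return null;
}

console.log(detectCardType("4111 1111 1111 1111"));  // "visa"
console.log(detectCardType("5500-0000-0000-0004"));  // "mastercard"
console.log(detectCardType("378282246310005"));      // "amex"
console.log(detectCardType("1234"));                 // null
```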

IP Address #

// IPv4
/^(\d{1,3}\.){3}\d{1,3}$/

// More precise (0-255)
/^((25[0-5]|2[0-4]\d|1\d{2}|[1-9]?\d)\.){3}(25[0-5]|2[0-4]\d|1\d{2}|[1-9]?\d)$/
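
The two patterns differ on out-of-range octets and leading zeros. A sketch comparing them (the constant names are illustrative):

```javascript
const LOOSE_IPV4 = /^(\d{1,3}\.){3}\d{1,3}$/;
const STRICT_IPV4 =
  /^((25[0-5]|2[0-4]\d|1\d{2}|[1-9]?\d)\.){3}(25[0-5]|2[0-4]\d|1\d{2}|[1-9]?\d)$/;

// The loose pattern only counts digits, so 999 passes.
console.log(LOOSE_IPV4.test("999.999.999.999"));   // true
console.log(STRICT_IPV4.test("999.999.999.999"));  // false

console.log(STRICT_IPV4.test("192.168.1.1"));  // true
console.log(STRICT_IPV4.test("0.0.0.0"));      // true
console.log(STRICT_IPV4.test("256.1.1.1"));    // false (256 is out of range)
console.log(STRICT_IPV4.test("01.2.3.4"));     // false (leading zero rejected)
```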

Date Formats #

// YYYY-MM-DD
/^\d{4}-\d{2}-\d{2}$/

// MM/DD/YYYY
/^\d{2}\/\d{2}\/\d{4}$/

// DD-MM-YYYY
/^\d{2}-\d{2}-\d{4}$/
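
These patterns check only the shape of the string, so "2025-02-30" passes the first one. A minimal sketch that pairs the YYYY-MM-DD pattern with a calendar check; isValidISODate is a hypothetical name, and the round-trip through Date is one possible approach rather than the only one:

```javascript
// Hypothetical helper: true only for real calendar dates in YYYY-MM-DD form.
function isValidISODate(input) {
  const match = /^(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})$/.exec(input);
  if (!match) return false; // wrong shape

  const year = Number(match.groups.year);
  const month = Number(match.groups.month);
  const day = Number(match.groups.day);

  // Date rolls impossible dates over (Feb 30 becomes Mar 2), so build the
  // date and check that every component survived unchanged.
  const date = new Date(Date.UTC(year, month - 1, day));
  return (
    date.getUTCFullYear() === year &&
    date.getUTCMonth() === month - 1 &&
    date.getUTCDate() === day
  );
}

console.log(isValidISODate("2025-08-30"));  // true
console.log(isValidISODate("2024-02-29"));  // true  (leap year)
console.log(isValidISODate("2025-02-30"));  // false (no such day)
console.log(isValidISODate("2025-13-01"));  // false (no such month)
console.log(isValidISODate("2025-8-30"));   // false (wrong shape)
```

Note that this sketch rejects years below 0100, because Date.UTC maps years 0 to 99 onto 1900 to 1999.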

Hex Color #

/^#([A-Fa-f0-9]{6}|[A-Fa-f0-9]{3})$/

function isValidHexColor(color) {
  const regex = /^#([A-Fa-f0-9]{6}|[A-Fa-f0-9]{3})$/;
  return regex.test(color);
}

console.log(isValidHexColor("#fff"));     // true
console.log(isValidHexColor("#123456"));  // true
console.log(isValidHexColor("#12345"));   // false

Username #

// 3-16 characters, alphanumeric and underscore
/^[a-zA-Z0-9_]{3,16}$/

HTML Tag #

/<(\w+)(\s+[\w-]+="[^"]*")*\s*>/g

const html = '<div class="container"><span id="text">Hello</span></div>';
html.match(/<(\w+)/g);  // ["<div", "<span"]

Practical Examples #

Extract URLs #

function extractURLs(text) {
  const regex = /https?:\/\/[^\s]+/g;
  return text.match(regex) || [];
}

const text = "Visit https://example.com and http://test.com";
console.log(extractURLs(text));
// ["https://example.com", "http://test.com"]

Remove HTML Tags #

function stripHTML(html) {
  return html.replace(/<[^>]*>/g, '');
}

const html = "<p>Hello <strong>world</strong></p>";
console.log(stripHTML(html));  // "Hello world"

Extract Email Addresses #

function extractEmails(text) {
  const regex = /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/g;
  return text.match(regex) || [];
}

const text = "Contact us at support@example.com or sales@test.com";
console.log(extractEmails(text));
// ["support@example.com", "sales@test.com"]

Capitalize Words #

function capitalizeWords(text) {
  return text.replace(/\b\w/g, char => char.toUpperCase());
}

console.log(capitalizeWords("hello world"));  // "Hello World"

Format Phone Number #

function formatPhone(phone) {
  const cleaned = phone.replace(/\D/g, '');
  const match = cleaned.match(/^(\d{3})(\d{3})(\d{4})$/);

  if (match) {
    return `(${match[1]}) ${match[2]}-${match[3]}`;
  }

  return phone;
}

console.log(formatPhone("1234567890"));  // "(123) 456-7890"

Validate Credit Card #

function validateCreditCard(card) {
  // Remove spaces and dashes
  const cleaned = card.replace(/[\s-]/g, '');

  // Check format
  const regex = /^\d{13,19}$/;
  if (!regex.test(cleaned)) return false;

  // Luhn algorithm
  let sum = 0;
  let isEven = false;

  for (let i = cleaned.length - 1; i >= 0; i--) {
    let digit = parseInt(cleaned[i], 10);

    if (isEven) {
      digit *= 2;
      if (digit > 9) digit -= 9;
    }

    sum += digit;
    isEven = !isEven;
  }

  return sum % 10 === 0;
}

Count Word Frequency #

function wordFrequency(text) {
  const words = text.toLowerCase().match(/\b\w+\b/g) || [];
  const freq = {};

  for (const word of words) {
    freq[word] = (freq[word] || 0) + 1;
  }

  return freq;
}

const text = "hello world hello";
console.log(wordFrequency(text));
// { hello: 2, world: 1 }

Performance Tips #

  1. Avoid catastrophic backtracking - Be careful with nested quantifiers
  2. Use non-capturing groups when you don’t need captured values
  3. Anchor patterns when possible (^, $)
  4. Use character classes instead of alternation: [abc] not (a|b|c)
  5. Compile regex once for repeated use
  6. Keep patterns simple - Complex regex is hard to maintain
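
Tips 1 and 5 can be sketched as follows. The patterns and the countWords helper are illustrative assumptions, and the input is kept short on purpose so the risky pattern still finishes quickly:

```javascript
// Tip 1: a nested quantifier such as (a+)+ can take exponential time on
// input that almost matches. The flat form accepts exactly the same strings.
const risky = /^(a+)+$/;  // shown for comparison only; avoid in real code
const safe = /^a+$/;

// Ten "a"s are harmless; a much longer run before the "!" would make
// `risky` extremely slow, while `safe` stays linear.
const almostMatching = "aaaaaaaaaa!";
console.log(risky.test(almostMatching));  // false
console.log(safe.test(almostMatching));   // false

// Tip 5: create the regex once, outside the loop, and reuse it.
const WORD = /\b\w+\b/g;

function countWords(lines) {
  let total = 0;
  for (const line of lines) {
    // String.prototype.match resets lastIndex for g-flag regexes,
    // so sharing WORD across calls is safe here.
    total += (line.match(WORD) || []).length;
  }
  return total;
}

console.log(countWords(["one two", "three"]));  // 3
```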

Common Mistakes #

  1. Forgetting to escape special characters
  2. Not using anchors when exact match is needed
  3. Greedy quantifiers when lazy is needed
  4. Not testing edge cases
  5. Over-complicating patterns
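
The first two mistakes are easy to reproduce. A short sketch with illustrative inputs:

```javascript
// Mistake 1: an unescaped "." matches any character, not just a dot.
const unescaped = /example.com/;
const escaped = /example\.com/;
console.log(unescaped.test("exampleXcom"));  // true (probably not intended)
console.log(escaped.test("exampleXcom"));    // false
console.log(escaped.test("example.com"));    // true

// Mistake 2: without anchors, a match anywhere in the string passes.
const unanchored = /\d{4}/;
const anchored = /^\d{4}$/;
console.log(unanchored.test("abc12345"));  // true (found "1234" inside)
console.log(anchored.test("abc12345"));    // false
console.log(anchored.test("1234"));        // true
```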

Testing Regex #

Online tools:

  • regex101.com
  • regexr.com
  • regexpal.com

Or test in the console:

const regex = /pattern/;
const tests = ["test1", "test2", "test3"];

tests.forEach(test => {
  console.log(`${test}: ${regex.test(test)}`);
});

Regex Cheat Sheet #

Pattern   Description
.         Any character
\d        Digit
\w        Word character
\s        Whitespace
*         0 or more
+         1 or more
?         0 or 1
{n}       Exactly n
^         Start
$         End
|         Or
[]        Character class
()        Group
\b        Word boundary

Regular expressions are powerful but can be complex. Start simple, test thoroughly, and build up to more complex patterns as needed.