Introduction to Python RegEx

Python RegEx

Python's built-in re module provides support for working with regular expressions.

Regular expressions, often called regex or regexp, are patterns that describe the text you want to find in a string.

With the re module you can search a string for a pattern, check whether a string matches it, and replace the parts that match.

Here's an overview of how to use regular expressions in Python using the re module:

Importing the re Module

Before using the functions and methods from the re module, you need to import it.

import re

Pattern Matching

The most common operation with regular expressions is pattern matching, where you check if a string matches a specific pattern.

The re.match() function checks whether the pattern matches at the beginning of the string. It returns a match object on success and None otherwise.

As an example:

import re

pattern = r"abc"  # Raw string, so backslashes in a pattern are not treated as string escapes
text = "abcdef"
match = re.match(pattern, text)  # Match object if text starts with "abc", otherwise None
if match:
    print("Match found!")
else:
    print("No match found.")

In this example:

  • The pattern "abc" is matched against the start of the string "abcdef".
  • Because the string begins with "abc", re.match() returns a match object and the code prints "Match found!".
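
Because re.match() only looks at the start of the string, a pattern that occurs later in the text does not match. The following is a minimal sketch of that behavior, contrasted with re.fullmatch(), which requires the entire string to match. The variable names are chosen for illustration only.

```python
import re

text = "abcdef"

# re.match() succeeds only if the pattern matches at position 0.
at_start = re.match(r"abc", text)
later = re.match(r"def", text)

# re.fullmatch() requires the whole string to match the pattern.
whole = re.fullmatch(r"abc", text)

print(at_start is not None)  # True: "abcdef" starts with "abc"
print(later is None)         # True: "def" is not at the start
print(whole is None)         # True: "abc" does not cover the whole string
```

To find "def" in this string, use re.search() instead, which is covered in the next section.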

Search and Replacement

The re.search() function scans the string for the first location where the pattern matches. It returns a match object if a match is found and None otherwise.

You can also use the re.findall() function to find all non-overlapping occurrences of a pattern in a string. It returns them as a list of strings.

As an example:

import re

pattern = r"apple"
text = "I have an apple, and I love apples."
match = re.search(pattern, text)  # First occurrence only
if match:
    print("Pattern found:", match.group())
else:
    print("Pattern not found.")

matches = re.findall(pattern, text)  # Every non-overlapping occurrence
print("All occurrences of pattern:", matches)

In this example:

  • The pattern "apple" is searched for anywhere in the string.
  • If a match is found, the code prints the matched text using match.group().
  • The re.findall() function returns a list of all occurrences of the pattern. Here the list is ['apple', 'apple'], because "apples" also contains "apple".
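
For the replacement half of this section, the re module offers re.sub(). The following is a minimal sketch that reuses the sentence from the example above. The replacement word "orange" is an arbitrary choice for illustration.

```python
import re

text = "I have an apple, and I love apples."

# re.sub() replaces every non-overlapping match of the pattern.
replaced = re.sub(r"apple", "orange", text)
print(replaced)  # I have an orange, and I love oranges.

# re.subn() does the same and also reports how many replacements were made.
replaced_again, count = re.subn(r"apple", "orange", text)
print(count)  # 2

# The optional count argument limits the number of replacements.
first_only = re.sub(r"apple", "orange", text, count=1)
print(first_only)  # I have an orange, and I love apples.
```

Strings are immutable in Python, so these functions return a new string and leave text unchanged.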

Regular Expression Patterns

Regular expressions support a wide range of pattern matching syntax.

Here are some common symbols and constructs used in regular expressions:

  • .: Matches any character except a newline.
  • ^: Matches the beginning of a string.
  • $: Matches the end of a string, or just before a newline at the end of the string.
  • *: Matches zero or more occurrences of the preceding pattern.
  • +: Matches one or more occurrences of the preceding pattern.
  • ?: Matches zero or one occurrence of the preceding pattern.
  • []: Matches any single character within the brackets.
  • |: Matches either the pattern before or the pattern after the symbol.
  • (): Groups patterns together.
  • \: Escapes special characters so that they are matched literally.
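
The symbols listed above can be tried out one at a time. The following is a small sketch in which the sample strings and variable names are made up for illustration.

```python
import re

# .  matches any single character except a newline
dot = re.findall(r"c.t", "cat cot cut c\nt")

# ^ and $ anchor the pattern to the start and end of the string
starts = bool(re.search(r"^Hello", "Hello world"))
ends = bool(re.search(r"world$", "Hello world"))

# *, + and ? control how often the preceding item may repeat
star = re.findall(r"ab*", "a ab abb")
plus = re.findall(r"ab+", "a ab abb")
optional = re.findall(r"ab?", "a ab abb")

# [] matches one character from a set; | chooses between alternatives
vowels = re.findall(r"[aeiou]", "regex")
either = re.findall(r"cat|dog", "cat, dog, cow")

# () groups a sub-pattern so that a quantifier applies to all of it
grouped = re.search(r"(ab)+", "ababab").group()

# \ escapes a special character so that it is matched literally
dots = re.findall(r"\.", "a.b.c")

print(dot)             # ['cat', 'cot', 'cut']
print(starts, ends)    # True True
print(star)            # ['a', 'ab', 'abb']
print(plus)            # ['ab', 'abb']
print(optional)        # ['a', 'ab', 'ab']
print(vowels)          # ['e', 'e']
print(either)          # ['cat', 'dog']
print(grouped)         # ababab
print(dots)            # ['.', '.']
```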

Regular expressions are a powerful tool for string manipulation and pattern matching in Python, and the re module provides the functions and methods needed to work with them.