What is the primary Python re module function for replacing text using a regex pattern?
The re.sub(pattern, repl, string, count=0, flags=0) function.
What does re.sub() do?
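It returns a new string in which every non-overlapping occurrence of pattern in string, scanning left to right, is replaced with repl. The input string itself is not modified.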
What are the three mandatory arguments for re.sub()?
pattern (the regex to search for)
repl (the replacement string)
string (the input string to search within)
Example: Replace all occurrences of “old” with “new” in a string.
re.sub(r'old', 'new', 'this is old data, very old')
(Result: ‘this is new data, very new’)
What is the count argument in re.sub() for?
It specifies the maximum number of pattern occurrences to replace. If 0 (default), all occurrences are replaced.
Example: Replace only the first two occurrences of “apple”.
re.sub(r'apple', 'orange', 'apple pie, apple juice, apple tree', count=2)
(Result: ‘orange pie, orange juice, apple tree’)
How do you use capturing groups from the original match in the replacement string?
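Refer to them in repl as \1, \2, and so on for the first, second, and later captured groups, or as \g<name> for named groups. Write repl as a raw string (r'...') so that \1 is not read by Python as an octal character escape.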
Example: Reformat “YYYY-MM-DD” to “DD/MM/YYYY” using capturing groups.
re.sub(r'(\d{4})-(\d{2})-(\d{2})', r'\3/\2/\1', 'Date: 2023-10-26')
(Result: ‘Date: 26/10/2023’)
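The named-group form mentioned above can be sketched as follows. The group names year, month, and day are chosen here for illustration only. The second call shows \g<1>, the numbered equivalent that stays unambiguous when a digit follows the reference.

```python
import re

# Named groups (names are illustrative) make the replacement template
# easier to read than \1, \2, \3.
date_pattern = r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})'
reformatted = re.sub(date_pattern, r'\g<day>/\g<month>/\g<year>', 'Date: 2023-10-26')
print(reformatted)  # Date: 26/10/2023

# \g<1> is the unambiguous spelling of \1: writing r'\10' here would be
# read as a reference to group 10, not group 1 followed by "0".
padded = re.sub(r'(\d)', r'\g<1>0', 'a1b2')
print(padded)  # a10b20
```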
Can the repl argument be a function?
Yes. If repl is a function, it is called once for every non-overlapping match with the match object as its only argument, and the string it returns is used as the replacement.
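A minimal sketch of a function used as repl. The helper name double_number and the sample text are hypothetical and picked only to show a replacement that cannot be expressed as a fixed template.

```python
import re

def double_number(match):
    # match is a re.Match object; group(0) is the full matched text.
    # The function must return a string.
    return str(int(match.group(0)) * 2)

doubled = re.sub(r'\d+', double_number, '3 apples and 12 pears')
print(doubled)  # 6 apples and 24 pears
```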
When using re.compile() for efficiency, how do you perform a substitution?
Use the .sub() method on the compiled pattern object: compiled_pattern.sub(repl, string)
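One way this might look when the same pattern is applied to many strings. The variable names and sample lines below are illustrative assumptions. Note that flags are given to re.compile(), while count can still be passed to the compiled pattern's .sub() method.

```python
import re

# Compile once, reuse for every line (names and data are illustrative).
whitespace_run = re.compile(r'\s+')
lines = ['a   b', 'c\t\td', 'e \n f']
cleaned = [whitespace_run.sub(' ', line) for line in lines]
print(cleaned)  # ['a b', 'c d', 'e f']

# count works the same way as in re.sub(); flags belong in re.compile().
first_only = whitespace_run.sub('_', 'a b c', count=1)
print(first_only)  # a_b c
```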
What is re.subn()?
It’s similar to re.sub(), but it returns a tuple of (new_string, number_of_substitutions).
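A short sketch of re.subn(), reusing the fruit strings from the earlier example. The second call shows the no-match case, where the count is 0 and the text comes back unchanged.

```python
import re

# re.subn() returns (new_string, number_of_substitutions).
new_text, replaced = re.subn(r'apple', 'orange', 'apple pie, apple juice')
print(new_text, replaced)  # orange pie, orange juice 2

# When nothing matches, the string is unchanged and the count is 0.
same_text, none_replaced = re.subn(r'kiwi', 'orange', 'apple pie')
print(same_text, none_replaced)  # apple pie 0
```

The count makes it easy to check whether a substitution actually happened without comparing the two strings.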
What happens if the pattern is not found in the string?
The original string is returned unchanged.
How do you make re.sub() case-insensitive?
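Pass the re.IGNORECASE (or re.I) flag by keyword, because the fourth positional argument is count, not flags: re.sub(r'word', 'replaced', 'Word is here', flags=re.IGNORECASE)
(Result: ‘replaced is here’)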
Example: Collapse each run of whitespace into a single space. (\s+ also matches tabs and newlines, not only spaces.)
re.sub(r'\s+', ' ', 'Hello    world   !')
(Result: ‘Hello world !’)