
Regular Expressions in Python

Published Jan 19, 2019. Last updated Jun 05, 2019.

A regular expression is a sequence of characters that defines a search pattern. In simple terms, it is a tool for matching, finding, and replacing patterns in text.

Python supports regular expressions through the built-in "re" module, which we need to import before we can start:

import re

The main uses of regular expressions are:

  • Finding a string
  • Replacing part of a string
  • Searching a string
  • Splitting a string into substrings

Special Characters in Regular Expressions

\w --> Matches any alphanumeric character or underscore [a-zA-Z0-9_]
\W --> Matches any character that is not alphanumeric or underscore
\d --> Matches any digit [0-9]
\D --> Matches any non-digit character
\s --> Matches any single whitespace character (space, tab, newline, etc.)
\S --> Matches any single non-whitespace character
\t --> Matches a tab
\n --> Matches a newline
\r --> Matches a carriage return
. --> Matches any character except a newline (\n)
() --> Groups part of a pattern and captures the matched text
a|b --> Matches either a or b
^ --> Matches at the start of the string
$ --> Matches at the end of the string
{m} --> Matches exactly m repetitions of the preceding pattern
{m,} --> Matches m or more repetitions of the preceding pattern
{m,n} --> Matches between m and n repetitions of the preceding pattern
? --> Matches zero or one occurrence of the preceding pattern
* --> Matches zero or more occurrences of the preceding pattern
+ --> Matches one or more occurrences of the preceding pattern
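
To make the list above more concrete, here is a small sketch that tries a few of these patterns with re.findall(). The sample strings and variable names are invented for illustration only.

```python
import re

text = "Order 66 shipped to Room 7B on 2019-01-19"

# \d+ : one or more digits in a row
digit_runs = re.findall(r'\d+', text)

# \d{4} : exactly four digits in a row
four_digits = re.findall(r'\d{4}', text)

# ^\w+ : a run of word characters at the very start of the string
first_word = re.findall(r'^\w+', text)

# colou?r : the "u" may appear zero or one time
spellings = re.findall(r'colou?r', 'color and colour')

# cat|dog : either alternative
pets = re.findall(r'cat|dog', 'one cat, two dogs')

print(digit_runs)   # ['66', '7', '2019', '01', '19']
print(four_digits)  # ['2019']
print(first_word)   # ['Order']
print(spellings)    # ['color', 'colour']
print(pets)         # ['cat', 'dog']
```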


The re module and the match objects it returns provide the following functions and methods:

sub ----> Replaces every substring that the pattern matches with a different string
subn ----> Same as sub(), but returns both the new string and the number of replacements
start ----> Returns the starting position of a match
end ----> Returns the ending position of a match
span ----> Returns the starting and ending positions of a match as a tuple
search ----> Scans the entire string and returns the first match
match ----> Matches only at the beginning of the string
findall ----> Returns all non-overlapping matches in the string as a list
compile ----> Compiles a pattern into a reusable pattern object
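
subn and span do not get a dedicated example in this post, so here is a brief sketch of both. The sample sentence is made up for illustration.

```python
import re

text = "red apple, red car, red door"

# subn returns a tuple: (new string, number of replacements made)
new_text, count = re.subn(r'red', 'blue', text)
print(new_text)  # blue apple, blue car, blue door
print(count)     # 3

# span returns (start, end) for a match; the end index is exclusive,
# so it can be used directly as a slice
m = re.search(r'car', text)
start, end = m.span()
print(m.span())         # (15, 18)
print(text[start:end])  # car
```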

Here is how to use some of these methods.

re.search()

The search method scans the entire string and returns a match object for the first place where the pattern matches.
If there is more than one match, it returns only the first occurrence of the pattern. If there is no match, it returns None.

Example :

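import re
s = "Hi i am python and my no is 7867465789"
# Use a raw string (r'...') so that \d reaches the regex engine unchanged
z = re.search(r'\d{10}', s)
print(z)
print(z.group(0))  # the matched text
print(z.start())   # index where the match begins
print(z.end())     # index just past the end of the match
Output:
<re.Match object; span=(28, 38), match='7867465789'>
7867465789
28
38

import re
# Only the first 13 digits are matched
print(re.search(r'\d{13}', '9876543210999999999999').group())
Output: 9876543210999

import re
s = "hi Welcome to python course"
# re.I makes the match case-insensitive
g = re.search('welcome', s, re.I | re.M)
print(g)
Output: <re.Match object; span=(3, 10), match='Welcome'>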
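
Because re.search() returns None when nothing matches, calling .group() directly on the result can raise an AttributeError. A minimal sketch of one way to guard against that, using a made-up sentence with no phone number in it:

```python
import re

s = "Hi i am python and i have no number"
m = re.search(r'\d{10}', s)

# m is None here, so check it before calling match-object methods
if m:
    result = m.group(0)
else:
    result = None

print(result)  # None
```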

re.match()

re.match() checks for a match only at the beginning of the string. If the string starts with the pattern, it returns a match object; otherwise it returns None.

Example :

import re
s = "hi.hello Welcome, my name is python"
y = re.match('hi', s)
print(y)
print(y.group(0))
Output:
<re.Match object; span=(0, 2), match='hi'>
hi

d = re.match('hello', s)
print(d)
Output: None

This is because "hello" does not appear at the beginning of the string.
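
To see the difference between the two functions side by side, the sketch below runs re.match() and re.search() on the same string. The extra string 'high five' is an invented example showing that re.match() looks at the start of the string, not at whole words.

```python
import re

s = "hi.hello Welcome, my name is python"

at_start = re.match(r'hello', s)         # pattern is not at index 0
anywhere = re.search(r'hello', s)        # found later in the string
prefix = re.match(r'hi', 'high five')    # matches the first two characters

print(at_start)         # None
print(anywhere.span())  # (3, 8)
print(prefix.group())   # hi
```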

re.findall()

re.findall() returns all non-overlapping matches of a pattern in a string, as a list.

Example :

import re
s = "hey hi hello how are you?"
i = re.findall('h', s)
print(i)
Output: ['h', 'h', 'h', 'h']

import re
s = "hi i am sadhana and my email id is sadhana-2018@gmail.com, my another email id is sadhana.python@gmail.com"
# here re.findall() returns a list of all the found email strings
z = re.findall(r'[\w\.-]+@[\w\.-]+', s)
print(z)
Output:
['sadhana-2018@gmail.com', 'sadhana.python@gmail.com']

import re
print(re.findall(r'\w', 'i love python'))
Output: ['i', 'l', 'o', 'v', 'e', 'p', 'y', 't', 'h', 'o', 'n']
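
When the pattern contains groups written with (), re.findall() returns the captured text instead of the whole match. A small sketch, reusing the email addresses from the example above in a shortened, made-up sentence:

```python
import re

s = "contact: sadhana-2018@gmail.com, sadhana.python@gmail.com"

# One group: findall returns only the text captured by that group
users = re.findall(r'([\w.-]+)@[\w.-]+', s)

# Two groups: findall returns a list of tuples, one tuple per match
pairs = re.findall(r'([\w.-]+)@([\w.-]+)', s)

print(users)  # ['sadhana-2018', 'sadhana.python']
print(pairs)  # [('sadhana-2018', 'gmail.com'), ('sadhana.python', 'gmail.com')]
```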

re.sub()

re.sub() searches for a pattern and replaces every match with a new substring.

Example :

import re
result=re.sub(r'India','the World','Codementor is the great platform in India')
print(result)
Output:
Codementor is the great platform in the World
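
re.sub() can also reuse captured groups in the replacement and limit how many matches are replaced. The sketch below assumes a made-up string of dates in year-month-day form.

```python
import re

text = "2019-01-19 and 2019-06-05"

# \1, \2 and \3 in the replacement refer to the three captured groups
swapped = re.sub(r'(\d{4})-(\d{2})-(\d{2})', r'\3/\2/\1', text)

# count=1 replaces only the first match
first_only = re.sub(r'\d{4}', 'YYYY', text, count=1)

print(swapped)     # 19/01/2019 and 05/06/2019
print(first_only)  # YYYY-01-19 and 2019-06-05
```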

re.compile()

We can compile a regular expression pattern into a pattern object, which can then be used for matching. This lets us reuse the same pattern without rewriting it.

Example :

import re
pattern = re.compile('good')
a = pattern.findall('When you think positive good things happen')
print(a)
b = pattern.findall('Life is all about having a good time')
print(b)
Output:
['good']
['good']
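
Flags such as re.IGNORECASE can be passed once to re.compile() and then apply to every call on the pattern object. A minimal sketch with invented sample sentences:

```python
import re

# The flag is set once here and reused by every method call below
pattern = re.compile(r'good', re.IGNORECASE)

matches = pattern.findall('Good things happen to good people')
found = pattern.search('Life is GOOD') is not None

print(matches)  # ['Good', 'good']
print(found)    # True
```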

Exercises

1. Extract all characters from a given string

import re
a=re.findall(r'.','Sadhana loves Python')
print (a)
Output:
['S', 'a', 'd', 'h', 'a', 'n', 'a', ' ', 'l', 'o', 'v', 'e', 's', ' ', 'P', 'y', 't', 'h', 'o', 'n']

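2. Extract each word from a given string

import re
# \w+ needs at least one word character, so it never returns empty strings
a = re.findall(r'\w+', 'My name is sadhana')
print(a)
Output:
['My', 'name', 'is', 'sadhana']

Using \w* here would also match the empty string between the words, so the result would contain '' entries.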

3. Extract numbers from a given string

import re
a = re.findall(r'\d+', 'My name is sadhana, my number is 32456')
print(a)
Output: ['32456']

To extract each digit separately, use \d without a quantifier (\d* would also match empty strings):

import re
a = re.findall(r'\d', 'My name is sadhana, my number is 32456')
print(a)
Output: ['3', '2', '4', '5', '6']


These are the basics of regular expressions in Python. Keep practicing until they become second nature.

Discover and read more posts from Sadhana Reddy
GrendelPL
3 years ago

Hi,
I’m trying to use the regex in python and it’s nothing but frustration so far.

re.findall(r'MetricNamespace:(.*)', body) returns:
[' AWS/RDS\r']
Awesome, I have the text, but also spaces and return characters? I only want alphanumeric, so according to the documentation '\w' should do it? It returns empty.
