Python String


In this section we will learn how to create, access, modify, and delete strings in Python. We will also cover the operators, built-in functions, and methods that work with strings.

A string is a sequence of characters, and a character is simply a symbol: a letter, a digit, a punctuation mark, and so on. The 26 letters of the English alphabet are one example. Computers do not store symbols directly. Internally, every character is represented by a number that is stored in binary form as 0s and 1s, even though we see characters and symbols on the screen.

Converting a character into its numeric (binary) form is called encoding, and the reverse process is called decoding. ASCII and Unicode are the most commonly used character encoding standards.

In Python, a string is a sequence of Unicode characters.
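As a minimal sketch of encoding and decoding, the snippet below converts a string to bytes and back. The choice of UTF-8 and the variable names are assumptions made for illustration only.

```python
# Encode a Unicode string to bytes and decode it back again.
text = "Python"
encoded = text.encode("utf-8")     # str -> bytes
decoded = encoded.decode("utf-8")  # bytes -> str

print(encoded)                   # b'Python'
print(decoded)                   # Python
print(ord("P"))                  # 80, the Unicode code point of 'P'
print(chr(80))                   # P
print(format(ord("P"), "08b"))   # 01010000, the same number in binary
```

The built-in functions ord() and chr() convert between a single character and its code point, which is the number the computer actually stores.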

 

How to create a string?

In Python, a string is created by enclosing characters in single quotes or double quotes. Triple quotes can also be used, but they are usually reserved for multi-line strings and docstrings.

Consider the following example, in which we create strings in all three ways:

CODE

>>> myString = "Python"
>>> print(myString)
>>> myString = 'Python'
>>> print(myString)
>>> myString = """Hello
world of Python"""
>>> print(myString)

OUTPUT

Python

Python

Hello

world of Python


In the above example, a string is first defined using double quotes and printed. Then another string is defined using single quotes, and finally a multi-line string is defined using triple quotes.

 

How to access characters in a string?

A single character in a string can be accessed by indexing, and a range of characters can be accessed with the slicing operator. Indexing starts from 0.

If the index is out of range, that is, if it is greater than or equal to the number of characters in the string, Python raises an IndexError.

The index must be an integer. If a float or any other data type is used as the index, the Python interpreter raises a TypeError.

Negative indexing can also be used with strings, just as with other sequences such as lists and tuples. The index -1 refers to the last item of the sequence, the index -2 refers to the second-last item, and so on.

Consider the following example, in which we access a single character, use negative indexing, and use the slicing operator:

CODE

>>> myString = 'Python'
>>> print('myString = ', myString)
>>> print('myString [0] = ', myString[0])
>>> print('myString [-1] = ', myString[-1])
>>> print('myString [1:4] = ', myString[1:4])
>>> print('myString [4:-2] = ', myString[4:-2])
>>> print('myString [3:-1] = ', myString[3:-1])

OUTPUT

myString =  Python
myString [0] =  P
myString [-1] =  n
myString [1:4] =  yth
myString [4:-2] =
myString [3:-1] =  ho


In the above example, a string named myString is declared with the value "Python", and the whole string is printed in the first print statement. The second print statement prints the character at index 0, which is the first character.

The third print statement uses the negative index -1 to print the last character of the string. The fourth, fifth, and sixth statements use the slicing operator to print a range of characters. The slice [4:-2] is empty because it starts and stops at the same position.

Consider the following example, in which an out-of-range index raises an IndexError and a floating point index raises a TypeError:

CODE

>>> print (myString [14])

>>> print (myString [1.2])

OUTPUT

Traceback (most recent call last):
  File "<pyshell#7>", line 1, in <module>
    print (myString [14])
IndexError: string index out of range

Traceback (most recent call last):
  File "<pyshell#8>", line 1, in <module>
    print (myString [1.2])
TypeError: string indices must be integers


In the above example, we used the index 14, but our string has only 6 characters, which is why the Python interpreter raises an IndexError. In the second print statement we used a floating point number as the index and got a TypeError.

 

How to change or delete a string?

The characters in a string cannot be changed, because strings are immutable sequences. Once a string is created, its elements cannot be changed or deleted, but the same variable name can be reassigned to a different string.

Consider the following example, in which we try to change an element of the string and get a TypeError:

CODE

>>> myString = 'Python'
>>> myString[3] = 'o'

OUTPUT

Traceback (most recent call last):
  File "<pyshell#10>", line 1, in <module>
    myString[3] = 'o'
TypeError: 'str' object does not support item assignment


In the above example, we tried to change the character at index 3 (the fourth character), but a TypeError occurred.
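Because strings are immutable, the usual workaround is to build a new string from pieces of the old one. The following is a minimal sketch of that idea; the variable names are chosen for illustration only.

```python
my_string = "Python"

# Slicing and concatenation build a new string; the original is untouched.
changed = my_string[:3] + "o" + my_string[4:]
print(changed)    # Pytoon

# str.replace() also returns a new string instead of modifying in place.
replaced = my_string.replace("h", "o")
print(replaced)   # Pytoon
print(my_string)  # Python
```

In both cases the original string keeps its value, and the result has to be assigned to a name if it is needed later.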

Similarly, individual characters cannot be removed from a string. You can only delete the entire string, which is done with the del keyword.

Consider the following example, in which we first try to delete a single character and then delete the entire string by using the del keyword:

CODE

>>> del myString [1]

>>> del myString

>>> myString

OUTPUT

Traceback (most recent call last):
  File "<pyshell#11>", line 1, in <module>
    del myString [1]
TypeError: 'str' object doesn't support item deletion

Traceback (most recent call last):
  File "<pyshell#13>", line 1, in <module>
    myString
NameError: name 'myString' is not defined


In the above example, the del keyword is first used to try to delete the second character of the string (index 1), and the interpreter raises a TypeError. In the second statement, del is used to delete the entire string, which succeeds. When we then try to access myString, the interpreter raises a NameError, which shows that the name myString no longer refers to any string.

 

Python String Operators

In Python we can perform a lot of operations on strings. The following are some of the operators that are used with strings.

 

Concatenation of two or more strings

Concatenation means joining two or more strings into a single string. The + operator concatenates strings, and the * operator repeats a string a given number of times. Consider the following example, in which we use the + operator to concatenate two strings and the * operator to repeat a string:

CODE

>>> str1 = 'Python'
>>> str2 = 'Programming'
>>> print('str1 + str2 = ', str1 + str2)
>>> print('str1 * 4 = ', str1 * 4)

OUTPUT

str1 + str2 =  PythonProgramming

str1 * 4 =  PythonPythonPythonPython


In the above example, we declared two strings. The first print statement uses the + operator to concatenate them and prints the result. The next print statement uses the * operator to repeat the first string 4 times and prints the result.

Two string literals written next to each other are also concatenated automatically. This works only for literals, not for variables. The literals can also be placed inside parentheses, which is convenient when assigning the result to a variable.

Consider the following example, in which we join two string literals by writing them next to each other, with and without parentheses:

CODE

>>> 'Python' 'Programming'
>>> string = ('Python' 'Programming')
>>> string

OUTPUT

'PythonProgramming'
'PythonProgramming'


 

Iterating through String

In Python we can iterate through each character of a string by using the for statement. Consider the following example, in which we use a for loop to count how many times a character occurs in a string:

CODE

>>> count = 0
>>> for ch in 'Python Programming':
        if ch == 't':
            count += 1

>>> print(count, 'Character found')

OUTPUT

1 Character found


In the above example, a variable named count is initialized to 0. The loop assigns each character of the string to ch in turn, and count is incremented by one whenever ch equals 't'. Finally count is printed, which tells us how many times that character occurs in the string.
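The same kind of count can be cross-checked against the built-in count() method. The sketch below counts a different character, 'm', purely for illustration.

```python
text = "Python Programming"

# Manual loop: count how often 'm' occurs.
count = 0
for ch in text:
    if ch == "m":
        count += 1

print(count)            # 2
print(text.count("m"))  # 2, the built-in equivalent
```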

 

String Membership Test

In Python we can use the in keyword to test whether a character or substring exists in a string.

Consider the following example, in which we use the in keyword to check whether a character exists in the string 'Python':

CODE

>>> 'p' in 'Python'
>>> 'P' in 'Python'

OUTPUT

False

True


In the above example, the first statement tests for a lowercase p and returns False, because string comparison in Python is case sensitive and distinguishes between lowercase p and uppercase P. The second statement returns True because an uppercase P does occur in the string "Python".
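The in keyword also works with substrings, and not in gives the opposite test. The following is a small sketch; converting to lowercase first is one possible way to make the test case insensitive, and the variable names are illustrative.

```python
language = "Python"

has_substring = "th" in language          # substrings work too
has_lower_p = "p" in language             # case sensitive
ignore_case = "p" in language.lower()     # compare in lowercase
missing = "z" not in language             # negated test

print(has_substring)  # True
print(has_lower_p)    # False
print(ignore_case)    # True
print(missing)        # True
```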

 

Built-in functions to work with strings

The built-in functions that work with other sequences can also be used with strings. Two of the most commonly used are enumerate() and len(). The enumerate() function returns an enumerate object, which yields pairs containing the index and the value of each item in the string. It is mainly used in iteration.

The len() function returns the length of the string, that is, the number of characters in it.

Consider the following example, in which we use the enumerate() and len() functions:

CODE

>>> myString = 'Python'
>>> UsingEnumerate = list(enumerate(myString))
>>> print('list(enumerate(myString)) =', UsingEnumerate)
>>> print('len(myString) =', len(myString))

OUTPUT

list(enumerate(myString)) = [(0, 'P'), (1, 'y'), (2, 't'), (3, 'h'), (4, 'o'), (5, 'n')]
len(myString) = 6


In the above example, a string named myString is declared and initialized with the value 'Python'. In the next statement the enumerate() function is used and its result is converted to a list. When the list is printed, we can see that each item is a pair of an index and the character at that index. The last statement prints the length of the string. The string "Python" has 6 characters.
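Since enumerate() is mainly used in iteration, a short sketch of that usage may help. The start=1 argument is an optional parameter of enumerate(); the variable names are illustrative.

```python
my_string = "Python"

# Unpack each (index, character) pair directly in the for statement.
for index, ch in enumerate(my_string):
    print(index, ch)

# The optional start argument changes the first index.
pairs = list(enumerate(my_string, start=1))
print(pairs[0])   # (1, 'P')
print(pairs[-1])  # (6, 'n')
```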

 

Python String Formatting

In Python we can format strings, for example by controlling alignment and number formatting, to make the output easier to read and understand.

 

Escape Sequence

In Python an escape sequence starts with a backslash, and the interpreter gives the characters that follow it a special meaning. Suppose the user wants to print:

He says, "I'm not a lawyer".

We cannot print this text by simply wrapping it in single quotes or in double quotes, because the text itself contains both a single quote and double quotes. If we try, we get a SyntaxError. To print quotation marks inside a string we use escape sequences.

Consider the following example, which demonstrates what happens when we try to print the quotation marks without an escape sequence:

CODE

>>> print("He says, "I'm not a lawyer"")

OUTPUT

SyntaxError: invalid syntax

Now consider the following example in which different ways are used to print the above statement that included quotation marks:

CODE

>>> print('''He says, "I'm not a lawyer"''')
>>> print('He says, "I\'m not a lawyer"')
>>> print("He says, \"I'm not a lawyer\"")

OUTPUT

He says, "I'm not a lawyer"
He says, "I'm not a lawyer"
He says, "I'm not a lawyer"


In the above example, the first print statement uses triple quotes to print text that contains both single and double quotes. The second print statement encloses the text in single quotes and uses the escape sequence \' for the single quote inside it. The third print statement encloses the text in double quotes and uses the escape sequence \" for the double quotes inside it. All three statements print the text correctly.

The following table describes the escape sequences supported in Python:

Escape Sequence    Description
\newline           A backslash at the end of a line inside a string literal. The backslash and the line break are both ignored, so the string continues on the next line.
\\                 Prints a backslash.
\'                 Prints a single quote.
\"                 Prints a double quotation mark.
\a                 ASCII bell.
\b                 ASCII backspace.
\f                 ASCII form feed (new page).
\n                 ASCII line feed (new line).
\r                 ASCII carriage return.
\t                 ASCII horizontal tab.
\v                 ASCII vertical tab.
\ooo               The character with the octal value ooo.
\xHH               The character with the hexadecimal value HH.

Consider the following example in which we have used the escape sequences to format the output:

CODE

>>> print("Printing\n in two line")

OUTPUT

Printing

 in two line


In the above example the escape sequence \n is used to split the output across two lines.
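A few more escape sequences from the table can be tried in the same way. The snippet below is an illustrative sketch; the strings and variable names are made up for the example.

```python
tabbed = "Name\tLanguage"   # \t inserts a horizontal tab
path = "C:\\temp"           # \\ produces a single backslash
letters = "\x41 \101"       # hex 41 and octal 101 are both 65, i.e. 'A'
two_lines = "a\nb"          # \n is a single character

print(tabbed)
print(path)            # C:\temp
print(letters)         # A A
print(len(two_lines))  # 3
```

Note that an escape sequence counts as one character, which is why the length of the last string is 3 and not 4.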

 

Raw Strings to ignore Escape sequence

In some cases the user wants the escape sequences that appear in a string to be ignored. To do this, we place an r or R in front of the string. This tells the interpreter that the string is a raw string and that backslashes inside it should be treated as ordinary characters.

Consider the following example, in which we use the r prefix to ignore the escape sequence inside the string:

CODE

>>> print("Printing\n in two line")
>>> print(r"Printing\n in two lines")

OUTPUT

Printing

 in two line

Printing\n in two lines


In the above example we did not use r in the first print statement, so the escape sequence was interpreted and the output was split across two lines. In the second statement we placed r in front of the string, so the escape sequence inside it was ignored and printed as written.
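Raw strings are often convenient for text that contains many backslashes, such as Windows-style paths. The sketch below uses a made-up path to show the difference in length between a normal and a raw string.

```python
# In a normal string, \n and \t are interpreted as newline and tab.
normal = "C:\new\table"
# In a raw string, every backslash is kept as an ordinary character.
raw = r"C:\new\table"

print(len(normal))  # 10
print(len(raw))     # 12
print(raw)          # C:\new\table
print(raw == "C:\\new\\table")  # True
```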

 

The format () method for formatting strings

The format() method of strings is a versatile and powerful tool for formatting output. A format string contains curly braces {}, which act as placeholders that are replaced by the arguments passed to format().

Consider the following example, in which we use the format() method to format a string:

CODE

>>> printind = "{}, {} and {}".format('cake', 'Pizza', 'Trifle')
>>> print(printind)

OUTPUT

cake, Pizza and Trifle


In the above example we created a format string with three placeholders and used the format() method to insert values into them. The result is assigned to a variable, and printing the variable shows the values in place of the placeholders.

The format() method also accepts optional format specifications, which are separated from the field name by a colon. Inside the specification, '<' left-justifies the value within the given width, '>' right-justifies it, and '^' centres it.

Integers can be formatted as binary, hexadecimal, and other number bases. Floating point numbers can be rounded to a given precision and can also be displayed in exponent form.
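The format specifications described above can be sketched as follows. The field width of 10 and the sample values are arbitrary choices for illustration.

```python
left = "[{:<10}]".format("left")      # left-justify in 10 columns
right = "[{:>10}]".format("right")    # right-justify in 10 columns
centre = "[{:^10}]".format("mid")     # centre in 10 columns

binary = "{:b}".format(10)            # integer as binary
hexa = "{:x}".format(255)             # integer as hexadecimal
rounded = "{:.2f}".format(3.14159)    # two digits after the decimal point
exponent = "{:e}".format(12345.678)   # exponent notation

print(left)      # [left      ]
print(right)     # [     right]
print(centre)    # [   mid    ]
print(binary)    # 1010
print(hexa)      # ff
print(rounded)   # 3.14
print(exponent)  # 1.234568e+04
```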

 

Old Style formatting

Python can also format strings in the style of the printf function from the C programming language. In C, format specifiers that begin with % are used for this purpose, and in Python the % operator is applied to a string to format the output in the same way.

Consider the following example:

CODE

>>> a = 23.456
>>> print('The value of a is %3.2f' % a)
>>> print('The value of a is %3.4f' % a)

OUTPUT

The value of a is 23.46

The value of a is 23.4560


In the above example a variable named a is assigned a floating point number with three digits after the decimal point. The print statements then use the format specifiers %3.2f and %3.4f. In both, the 3 is the minimum total field width, and the number after the dot is the number of digits printed after the decimal point: 2 for %3.2f and 4 for %3.4f. Because the formatted number is already wider than 3 characters, the width has no visible effect here.
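To make the effect of the field width visible, the following sketch uses a width larger than the number itself. The square brackets and the width of 8 are only there to show the padding.

```python
a = 23.456

padded = '[%8.2f]' % a        # right-aligned in a field of 8 characters
left_padded = '[%-8.2f]' % a  # a minus sign left-aligns the value
message = '%s has %d characters' % ('Python', 6)  # several values go in a tuple

print(padded)       # [   23.46]
print(left_padded)  # [23.46   ]
print(message)      # Python has 6 characters
```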

 

Common Python String Methods

Python provides a number of string methods, and format() is one of the most commonly used. Other very common methods include lower(), upper(), join(), split(), find(), and replace().

The following table describes the string methods:

String Method     Description
capitalize()      Returns a copy of the string with the first character capitalized and the rest in lowercase.
casefold()        Returns a casefolded copy of the string. It is used for caseless matching and is more aggressive than lower().
center()          Returns the string centred in a field of the given width, padded with an optional fill character.
count()           Returns the number of non-overlapping occurrences of a substring in the string.
encode()          Returns the encoded form of the string as bytes.
endswith()        Returns True if the string ends with the supplied suffix.
expandtabs()      Returns a copy of the string in which all tab characters are replaced by spaces, using the given tab size.
find()            Returns the lowest index at which the supplied substring is found. Returns -1 if it is not found.
format()          Formats the string by substituting its arguments into the replacement fields.
format_map()      Formats the string like format(), but takes a single mapping as its argument.
index()           Returns the lowest index at which the supplied substring is found. Raises a ValueError if it is not found.
isalnum()         Returns True if the string is not empty and all its characters are alphanumeric.
isalpha()         Returns True if the string is not empty and all its characters are alphabetic.
isdecimal()       Returns True if the string is not empty and all its characters are decimal characters.
isdigit()         Returns True if the string is not empty and all its characters are digits.
isidentifier()    Returns True if the string is a valid identifier.
islower()         Returns True if all cased characters in the string are lowercase and there is at least one cased character.
isnumeric()       Returns True if the string is not empty and all its characters are numeric.
isprintable()     Returns True if the string is empty or all its characters are printable.
isspace()         Returns True if the string is not empty and all its characters are whitespace.
istitle()         Returns True if the string is not empty and is title cased.
isupper()         Returns True if all cased characters in the string are uppercase and there is at least one cased character.
join()            Concatenates the strings in the provided iterable, using the string it is called on as the separator.
ljust()           Returns the string left-justified in a field of the given width, padded with an optional fill character.
lower()           Returns a copy of the string converted to lowercase.
lstrip()          Returns a copy of the string with leading characters (whitespace by default) removed.
maketrans()       Returns a translation table that can be used with translate().
partition()       Splits the string at the first occurrence of the separator and returns a 3-tuple: the part before it, the separator, and the part after it.
replace()         Returns a copy of the string with all occurrences of a substring replaced by a new substring.
rfind()           Returns the highest index at which the supplied substring is found. Returns -1 if it is not found.
rindex()          Returns the highest index at which the supplied substring is found. Raises a ValueError if it is not found.
rjust()           Returns the string right-justified in a field of the given width, padded with an optional fill character.
rpartition()      Splits the string at the last occurrence of the separator and returns a 3-tuple.
rsplit()          Returns a list of the words in the string, using the supplied delimiter. If a maximum number of splits is given, splitting is done from the right.
rstrip()          Returns a copy of the string with trailing characters (whitespace by default) removed.
split()           Returns a list of the words in the string, using the supplied delimiter. If a maximum number of splits is given, splitting is done from the left.
splitlines()      Returns a list of the lines in the string.
startswith()      Returns True if the string starts with the supplied prefix.
strip()           Returns a copy of the string with leading and trailing characters (whitespace by default) removed.
swapcase()        Returns a copy of the string with lowercase characters converted to uppercase and uppercase characters converted to lowercase.
title()           Returns a title-cased copy of the string, in which each word starts with an uppercase letter and the remaining letters are lowercase.
translate()       Returns a copy of the string in which each character has been mapped through the given translation table.
upper()           Returns a copy of the string converted to uppercase.
zfill()           Returns a copy of the string padded on the left with zeros to the given width.
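The most common of these methods can be tried together in a short sketch. The sample sentence and variable names are made up for illustration.

```python
sentence = "Python strings are immutable"

lowered = sentence.lower()
uppered = sentence.upper()
words = sentence.split()                 # split on whitespace
joined = "-".join(words)                 # join the words with a hyphen
position = sentence.find("strings")      # index of the first match
not_found = sentence.find("list")        # -1 when there is no match
swapped = sentence.replace("strings", "tuples")

print(lowered)    # python strings are immutable
print(uppered)    # PYTHON STRINGS ARE IMMUTABLE
print(words)      # ['Python', 'strings', 'are', 'immutable']
print(joined)     # Python-strings-are-immutable
print(position)   # 7
print(not_found)  # -1
print(swapped)    # Python tuples are immutable
```

Each method returns a new value and leaves the original string unchanged.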