03 Strings
Learning Outcomes
● Python built-in functions
● Work with numeric data and string data
● Objects and methods
● Formatting numbers and strings
Built-in functions
● Python provides many useful functions for common
programming tasks.
● A function is a group of statements that performs a specific
task.
● You have already used the functions eval, input, print, and int, among others.
○ These are built-in functions and they are always available
in the Python interpreter.
○ You don’t have to import any modules to use these
functions.
Math module
● The math module provides mathematical functions and constants, e.g. math.pi .
# import math module to use the math functions
import math
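As a minimal sketch of what the import makes available: the radius value and the variable names below are chosen purely for illustration.

```python
import math

radius = 2.0  # hypothetical value for illustration

# math.pi is a constant; math.sqrt and math.floor are functions
area = math.pi * radius ** 2
root = math.sqrt(16)
rounded_down = math.floor(3.7)

print(area)
print(root)          # 4.0
print(rounded_down)  # 3
```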
String
∙ To represent text, we use the string type (str) in Python
− A string is a sequence of characters
− Python treats characters and strings the same way.
− A string is enclosed within double quotes (") or single quotes (').
− Example:
>>> message = "Hello World"
>>> print(message)
Hello World
∙ We can also use special characters to define text:
>>> message="Hello\nWorld"
>>> print(message)
Hello
World
Special characters
● \n  newline
● \t  tab
● \\  backslash
● \'  single quote
● \"  double quote
Objects of type String
∙ Objects of type String (str) are used to represent strings of characters.
− E.g. 'abc' or "abc"
− E.g. '123' denotes a string of three characters, not the number one
hundred twenty-three.
Objects and Methods
● In Python, all data—including numbers and strings—are
actually objects.
● In Python, a number is an object, a string is an object, …
● Objects of the same kind have the same type.
● You can use the type() function to get the class/type of an
object.
● You can perform operations on an object. The operations are
defined using functions.
● The functions for the objects are called methods in Python.
Methods can only be invoked from a specific object.
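A short sketch of these points, using illustrative variable names: type() reports the class of an object, and a method such as upper() is called from a particular string object with the dot syntax.

```python
number = 3
text = "Welcome"

# type() reports the class of each object
print(type(number))  # <class 'int'>
print(type(text))    # <class 'str'>

# a method is invoked from a specific object with the dot syntax
shouted = text.upper()
print(shouted)       # WELCOME
```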
Input String [1]
∙ A string value can be input using the input() function
>>> firstName = input("Please enter your name: ")
∙ All values input through the input function are strings.
∙ Strings containing digits can be converted to numbers using the eval() function, or more safely with int() or float().
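The conversion step can be sketched as follows. A literal stands in for the typed input so that the example runs without a keyboard, and int() and float() are used here as built-in alternatives to eval() that accept only numeric text; the variable names are illustrative.

```python
# stands in for: age_text = input("Please enter your age: ")
age_text = "21"

print(type(age_text))   # <class 'str'>

# int() and float() convert digit strings without evaluating arbitrary code
age = int(age_text)
height = float("1.75")

print(age + 1)          # 22
print(height)           # 1.75
```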
Storing strings in variables
∙ We can take a string and assign a name to it using an
equals sign – we call this a variable:
>>> my_name = "Something"
>>> print(my_name)
Something
Storing strings in variables (cont.)
∙ We can change the value of a variable as many times as
we like once we've created it:
>>> my_name = "Something"
>>> print(my_name)
Something
# change the value of my_name
>>> my_name = "Another Thing"
>>> print(my_name)
Another Thing
String Indexing
∙ Indexing can be used to extract individual characters from a
string.
∙ In Python, all indexing is zero-based.
− Typing 'abc'[0] into the interpreter will cause it to display
the string 'a' .
− Typing 'abc'[3] will produce the error message
IndexError: string index out of range .
− Since Python uses 0 to indicate the first element of a
string, the last element of a string of length 3 is accessed
using the index 2.
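The indexing rules above can be tried out with a small sketch. The try/except block is only there so the example keeps running after the out-of-range access; the variable names are illustrative.

```python
word = "abc"

first = word[0]               # 'a'
last = word[len(word) - 1]    # 'c', the last valid index is length - 1

print(first, last)            # a c

# an index equal to the length is out of range
try:
    word[3]
except IndexError as error:
    message = str(error)
    print("IndexError:", message)
```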
String Indexing (cont.)
# assumed definition of greet, consistent with the outputs below
>>> greet = "Hello Bob"
>>> greet[0]
'H'
>>> x = 8
>>> print(greet[x - 2])
B
Exercise 3.1
Tools for manipulating strings
∙ So far we have shown that we can store and print strings
∙ But Python also provides the facilities for
manipulating strings.
∙ Python has many built-in functions for carrying out
common operations, and in the following slides we'll
take a look at them one-by-one.
Concatenation
∙ We can concatenate (stick together) two strings using the +
symbol.
∙ This symbol will join together the string on the left with the
string on the right:
>>> my_name = "John" + "Smith"
>>> print(my_name)
JohnSmith
∙ We can also concatenate variables that point to strings:
>>> firstname = "John"
>>> my_name = firstname + "Smith"
# my_name is now "JohnSmith"
Concatenation (cont.)
∙ We can even join multiple strings together in one go:
>>> upstream = "AAA"
>>> downstream = "GGG"
>>> my_dna = upstream + "ATGC" + downstream
# my_dna is now "AAAATGCGGG"
∙ Note: the result of concatenating two strings is itself a string, so it's perfectly OK to use a concatenation inside a call to print:
>>> print("Hello" + " " + "world")
Hello world
Repetition
∙ The * operator repeats a string a given number of times.
∙ Example:
>>> 3 * "John"
'JohnJohnJohn'
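A small sketch that combines concatenation and repetition; the names are illustrative only. Note that + joins the strings exactly as given, so any space has to be added explicitly.

```python
first_name = "John"
last_name = "Smith"

# + joins strings exactly as given, so the space must be added explicitly
full_name = first_name + " " + last_name
print(full_name)        # John Smith

# * repeats a string; the integer can be on either side
cheer = "Go" * 3
print(cheer)            # GoGoGo
```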
Exercise 3.2
Finding the length of a string
∙ The len built-in function takes a single argument (a string).
∙ len returns a value (a number) that can be stored; we call
this the return value.
○ If we write a program that calls len but does nothing with the
result, the program will run but we won't see any output:
# in a script, this line doesn't produce any output
len("SampleText")
● If we want to actually use the return value, we need to store
it in a variable, and then do something useful with it (like
printing it):
text_length = len("SampleText")
print(text_length)
Finding the length of a string (cont.)
When we try to run the program we get the following error:
1 Traceback (most recent call last):
2 File "calcTextLength.py", line 6, in <module>
3 print("The length of the text is " + text_length)
4 TypeError: must be str, not int
● The error message (line 4) is short but informative: the + operator needs a str on both sides, and text_length is an int.
● Python is complaining that it doesn't know how to
concatenate a string (which it calls str for short) and a number
(which it calls int, short for integer). Newer Python versions word
the same error as: can only concatenate str (not "int") to str.
● But Python has a built-in solution: a function called str,
which turns a number into a string so that we can concatenate it.
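The fix can be sketched as below. The program is a hypothetical reconstruction based on the line quoted in the traceback, not the original file.

```python
# hypothetical reconstruction of the failing program
text_length = len("SampleText")

# str() turns the integer into a string so that + can join the two
message = "The length of the text is " + str(text_length)
print(message)   # The length of the text is 10
```

An alternative that avoids the conversion is to pass both values to print separated by a comma, as in print("The length of the text is", text_length).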
Changing case
∙ We can convert a string to lower case by using a new type of syntax – a
method that belongs to strings.
∙ A method is like a function, but instead of being built in to the Python
language, it belongs to a particular type.
∙ The method we are talking about here is called lower, and we say that it
belongs to the string type. Here's how we use it:
my_text = "SampleText"
# print my_text in lower case
print(my_text.lower())
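As a brief sketch, lower() and its counterpart upper() (a related string method not covered above) both return a new string and leave the original untouched.

```python
my_text = "SampleText"

lower_text = my_text.lower()
upper_text = my_text.upper()

print(lower_text)   # sampletext
print(upper_text)   # SAMPLETEXT

# the original string is unchanged; the methods return new strings
print(my_text)      # SampleText
```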
Replacement
∙ replace is another example of a useful method that
belongs to the string type
∙ It takes two arguments (both strings) and returns a copy
of the string in which all occurrences of the first argument
are replaced by the second.
Replacement (cont.)
∙ Example of replace :
str1 = "Java is a programming language"
# Call the replace method
str2 = str1.replace("Java","Python")
# Displaying result
print("Old String: \t",str1)
print("New String: \t",str2)
Output
Old String: Java is a programming language
New String: Python is a programming language
Slicing a string
∙ Slicing is used to extract substrings of arbitrary length.
∙ If s is a string, the expression s[start:end] denotes the
substring of s that starts at index start and ends at index
end-1 .
− For example, 'abc'[1:3] evaluates to 'bc' .
∙ If the value before the colon is omitted, it defaults to 0.
∙ If the value after the colon is omitted, it defaults to the
length of the string.
∙ Consequently, the expression 'abc'[:] is semantically
equivalent to the more verbose 'abc'[0:len('abc')]
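The slicing defaults can be sketched as follows, using an illustrative six-character string.

```python
s = "abcdef"

middle = s[1:3]    # indices 1 and 2
prefix = s[:3]     # start defaults to 0
suffix = s[3:]     # end defaults to len(s)
whole = s[:]       # same as s[0:len(s)]

print(middle)      # bc
print(prefix)      # abc
print(suffix)      # def
print(whole)       # abcdef
```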
Extracting part of a string - Slicing
∙ Note that in Python, the positions in a string start from zero(0)
up to the position (length_of_string-1)
∙ Syntax:
<string>[<start>:<end>]
− Note: Both start and end should be int-valued expressions
Extracting part of a string (cont.)
∙ Example of substring:
module = "Problem Solving Techniques"
# print positions three and four (the end position, five, is not included)
print(module[3:5])
# positions start at zero, not one
print(module[0:6])
# if we use a stop position beyond the end, it's the same as using the end
print(module[0:60])
Output:
bl
Proble
Problem Solving Techniques
Extracting part of a string (cont.)
∙ If we just give a single number in the square
brackets, we'll just get a single character:
food = "pizza"
first_char = food[0]
print(first_char)
Output:
p
Extracting part of a string (cont.)
∙ "Slices" can be taken with indices separated by a colon:
s = "Hello"
print(s[0])     # H
print(s[4])     # o
print(s[-1])    # o
print(s[1:3])   # el
print(s[::2])   # Hlo
print(s[::-1])  # olleH
print(len(s))   # 5
Exercise 3.3
∙ Write a program that allows the input of a
movie title, followed by 2 integer values x
and y and displays the substring between
positions x and y inclusive in the movie title.
The “in” and “not in” operators
∙ in : membership operator: True if the first string occurs inside
the second string
∙ not in : non-membership operator: True if the first string does not occur
in the second string
∙ Examples
>>> 'John' in 'Sir John Smith'
True
>>> 'x' in 'sample'
False
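A short sketch of both operators, with illustrative variable names. It also shows that the membership test is case-sensitive.

```python
name = "Sir John Smith"

has_john = "John" in name
has_lower_john = "john" in name      # the test is case-sensitive
missing_x = "x" not in "sample"

print(has_john)         # True
print(has_lower_john)   # False
print(missing_x)        # True
```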
Counting and finding substrings
∙ A very common job in text analysis is to count the number
of times some pattern occurs in a text.
∙ In computer programming terms, what that problem
translates to is counting the number of times a substring
occurs in a string.
∙ The method that does the job is called count.
− It takes a single argument whose type is string, and
returns the number of non-overlapping times that the argument
is found in the string it is called on.
− The return value is an integer.
Counting and finding substrings (cont.)
count = string.count(substring)
print("The count is:", count)
Output:
The count is: 2
Counting and finding substrings (cont.)
Count number of occurrences of a given substring
using start and end
count = string.count(substring, 8, 25)
print("The count is:", count)
Output:
The count is: 1
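Both forms of count can be sketched with a complete example. The text and substrings below are assumptions chosen for illustration, since the string used on the original slide was not preserved.

```python
# assumed example text; the original slide's string was not preserved
string = "Python is awesome, isn't it?"

# count every non-overlapping occurrence of "is"
count_all = string.count("is")
print("The count is:", count_all)       # The count is: 2

# count "i" only within indices 8 up to (not including) 25
count_range = string.count("i", 8, 25)
print("The count is:", count_range)     # The count is: 1
```

The optional start and end arguments follow the same convention as slicing: the start index is included and the end index is excluded.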
Exercise 3.4
∙ Write a program that allows the input of a
sentence and displays the count of ‘a’ and
‘s’ in the sentence.
Exercise 3.5
∙ Write a program that allows a user to input a
sentence and displays five (5) integers
(separated by spaces) counting the
respective number of times that each vowel
occurs in the sentence.
Exercise 3.6
∙ Write a program that allows a user to input a DNA sequence
(that can be made up of the letters ‘A’, ‘C’, ‘G’ and ‘T’ in
upper or lowercase).
∙ The program will then calculate and display the GC content
(total percentage of G and C) of that sequence.
∙ To calculate the GC content of a DNA sequence (which is
simply a string):
− we must count the occurrences of “G” and “C” and add the two counts
− divide that sum by the length of the string
− then multiply by 100
[Hint: you can use normal mathematical symbols like add (+), subtract (-),
multiply (*), divide (/) and parentheses to carry out calculations on numbers in
Python.]
Counting and finding substrings (cont.)
∙ A closely-related problem to counting substrings is
finding their location.
∙ What if instead of counting the number of ‘a’ in
our text we want to know where they are?
∙ The find method will give us the answer, at least
for simple cases.
− find takes a single string argument, just like count, and
returns a number which is the position at which that
substring first appears in the string (in computing, we
call that the index of the substring).
Counting and finding substrings (cont.)
∙ Remember that in Python we start counting from
zero rather than one, so position 0 is the first
character, position 4 is the fifth character, etc.
∙ Examples:
word = "problem"
print(word.find('p'))
print(word.find('ob'))
print(word.find('w'))
Output
0
2
-1
Counting and finding substrings (cont.)
>>> dna = "aagtccgcgcgctttttaaggagccttttgacggc"
# search from position 0
>>> dna.find('ag')
1
# search from position 17, after the first occurrence
>>> dna.find('ag', 17)
18
>>> dna.find('ag', 19)
21
# same as find, but searches backwards from the end
>>> dna.rfind('ag')
21
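Because find returns -1 rather than raising an error when the substring is absent, a program should usually test the result before using it as an index. A minimal sketch, with illustrative strings:

```python
word = "problem"

position = word.find("ob")
print(position)            # 2

# find returns -1 when the substring is absent, so test before using it
missing = word.find("w")
if missing == -1:
    print("'w' does not occur in", word)

# rfind reports the last occurrence instead of the first
last_a = "banana".rfind("a")
print(last_a)              # 5
```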
Output Formatting
>>> print("The DNA sequence’s GC content is", gc_perc,"%")
The DNA sequence’s GC content is 53.06122448979592 %
∙ The value of the gc_perc variable has many digits after the
decimal point, most of which are not significant. You can limit
the number of digits displayed by applying a format to the
printed string
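>>> print("The DNA sequence’s GC content is %5.3f %%" % gc_perc)
The DNA sequence’s GC content is 53.061 %
∙ The text in quotes is the formatting string; %5.3f is the placeholder for the value that is formatted.
∙ The percent operator (%) separates the formatting string from the value that replaces the placeholder.
∙ Note the double % (%%), which prints a single % symbol.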
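A runnable sketch of the percent operator. The gc_perc value is copied from the unformatted output shown earlier; the second example, including the name "my_dna" and the count 49, is hypothetical.

```python
gc_perc = 53.06122448979592   # value copied from the unformatted output

# %5.3f: minimum width 5, 3 digits after the decimal point; %% prints one %
line = "The DNA sequence's GC content is %5.3f %%" % gc_perc
print(line)      # The DNA sequence's GC content is 53.061 %

# several values are supplied as a tuple, in placeholder order
summary = "%s has %d bases" % ("my_dna", 49)
print(summary)   # my_dna has 49 bases
```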
Display Values Formatting
Formatting numbers
Formatting - Placeholders
∙ >>> print("Hello %s %s, you may have won $%d!" % ("Mr.", "Smith", 10000))
Hello Mr. Smith, you may have won $10000!
∙ >>> print('This int, %10d, was placed in a field of width 10' % (7))
This int,          7, was placed in a field of width 10
Yet another Example
∙ >>> print('This float, %10.5f, has width 10 and precision 5.' % (3.1415926))
This float,    3.14159, has width 10 and precision 5.
∙ >>> import math
∙ >>> print("Compare %f and %0.20f" % (math.pi, math.pi))
Compare 3.141593 and 3.14159265358979311600
Formatting strings (s)
Print
● print() automatically prints a newline ( \n ) after its output, so that
the next output starts on a new line.
● If you don’t want this to happen, pass the keyword argument end to
print to say what should be printed instead, e.g. end=" " .
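A minimal sketch of the end argument. The function name and the printed strings are illustrative; wrapping the calls in a function is only a convenience for reuse.

```python
def print_on_one_line():
    # end=" " replaces the default newline with a space
    print("AAA", end=" ")
    print("GGG", end=" ")
    # a plain print() call still ends the line
    print("TTT")

print_on_one_line()   # AAA GGG TTT
```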
Exercise 3.7
∙ Calculating AT content
Here's a short DNA sequence:
ACTGATCGATTACGTATAGTATTTGCTATCATACATATATATCGATGCGTTCAT
Write a program that will print out the AT content of this
DNA sequence.
[Hint: you can use normal mathematical symbols like add
(+), subtract (-), multiply (*), divide (/) and parentheses to
carry out calculations on numbers in Python.]
The split() method
∙ This method is used to split a string into a list of
substrings
∙ By default, it splits the string wherever whitespace occurs
>>> S="Hello String Library"
>>> S.split()
['Hello', 'String', 'Library']
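split also accepts a separator argument, which is sketched below with an illustrative date string.

```python
sentence = "Hello String Library"
words = sentence.split()
print(words)        # ['Hello', 'String', 'Library']

# passing a separator splits on that text instead of whitespace
record = "2024-01-15"
parts = record.split("-")
print(parts)        # ['2024', '01', '15']
```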
Exercise 3.8
An important process in Computational Biology consists of breaking
a sequence on a particular pattern.
Write a program that allows the input of a DNA sequence and splits
it on the pattern “ATG” into a number of subsequences. The program
should then display the list of subsequences.
Note: Ensure that your sequence contains a number of
occurrences of “ATG”
More String Operations
Function                        Description
s.capitalize()                  Copy of s with the first character capitalised and the rest in lower case
s.rsplit(separator, maxsplit)   Splits s into a list, starting from the right. maxsplit specifies how many splits to do; the default value is -1, which means "all occurrences"
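The two methods listed here can be sketched as follows; the strings are illustrative.

```python
title = "problem SOLVING"
capitalised = title.capitalize()
print(capitalised)             # Problem solving

path = "a/b/c"
# maxsplit=1 performs a single split, starting from the right
head_tail = path.rsplit("/", 1)
print(head_tail)               # ['a/b', 'c']

# with the default maxsplit every separator is used
all_parts = path.rsplit("/")
print(all_parts)               # ['a', 'b', 'c']
```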
Exercise 3.9
Write a program to input a string and output a new string where all
occurrences of the first char of the original string have been changed
to '$', except the first char itself.
Sample output
Input String : 'restart'
New String : 'resta$t'
Exercise 3.10
Write a Python program to input two strings and create a single
string using the two given strings, separated by a space and
swapping the first two characters of each string.
Sample output
Input first String: abc
Input second String: xyz
New String : xyc abz
Exercise 3.11
Write a Python program to input a string and return a new string
made of 4 copies of the last two characters of the original string
(length must be at least 2).
Sample output
Input first String: Python
New String : onononon
Exercise 3.12
Write a Python program to input a string and output the part of the
string before the last occurrence of a specified character.
Sample output
Input a String: https://www.w3resource.com/python-exercises/string
Input char: /
Output string: https://www.w3resource.com/python-exercises
Exercise 3.13
Write a Python program to input a floating point number and display
the number with no decimal places. [Hint: Use str.format]
Sample outputs
Input a floating point number: 3.1415926
Formatted Number with no decimal places: 3
Exercise 3.14
Write a program to print an integer padded with zeros on the left
to a specified width:
Original Number: 3
Formatted Number(left padding, width 2): 03
Acknowledgments
● DGT1039Y lecture notes by Dr. Shakun Baichoo, FoICDT