Decomposition and Abstraction
As well as this lesson, I also highly recommend you read the following - https://cs.stanford.edu/people/nick/py/python-style-decomposition.html
The following definitions are given in [1] (see References).
Decompostion creates struture. It allows us to break a problem into parts that are reasonably self-contained and that may be reused in different settings.
An analogy for this is a car. There are a number of different parts to a car that can be maintained separately. If the car breaks, we probably only have to fix one small part. The car is decomposed into many smaller parts.
Abstraction hides detail. It allows us to use a piece of code as if it were a black-box - that is, something whose interior details we cannot see, don't need to see and shouldn't even want to see.
An analogy for this is also a car, this time though we don't care how the car works. We just need to know how to drive it, e.g. use a steering wheel, clutch, accelerator and brake.
1. Decomposition with Functions
We will see other ways to decompose a program with modules and classes, but for now we will consider functions.
Take the following program.
# program to format a list of strings
name_list = ["sam", "JOE", "Emily", "richard"]
new_name_list = []
# first change the name to lower case, then change the first character to upper
for name in name_list:
lower_name = name.lower()
formatted_name = f"{lower_name[0].upper()}{lower_name[1:]}"
new_name_list.append(formatted_name)
# print names
for name, new_name in zip(name_list, new_name_list):
print(f"The formatted name for {name} is {new_name}.")
# prints...
# The formatted name for sam is Sam.
# The formatted name for JOE is Joe.
# The formatted name for Emily is Emily.
# The formatted name for richard is Richard.
This is actually a series of jobs that get done.
- convert the list of strings to a list of lower case strings
- convert the list of lower case strings to a list with the inital capitalised
- print the formatted strings
We can write this as a decomposed program using functions.
def string_list_to_lower(items):
lower_items = []
for s in items:
lower_items.append(s.lower())
return lower_items
def string_list_capitalise_initial(items):
lower_items = []
for s in items:
lower_items.append(f"{s[0].upper()}{s[1:]}")
return lower_items
def print_formatted(items, formatted_items):
for s, format_s in zip(items, formatted_items):
print(f"The formatted name for {s} is {format_s}.")
# main program
name_list = ["sam", "JOE", "Emily", "richard"]
# convert the list of strings to a list of lower case strings
lower_name_list = string_list_to_lower(name_list)
# convert the list of lower case strings to a list with the inital capitalised
formatted_list = string_list_capitalise_initial(lower_name_list)
# print the formatted strings
print_formatted(name_list, formatted_list)
Now each of the functions is a smaller self-contained part, that could be reused in a different setting.
Note that this is actually a larger program, decomposition doesn't necessarily mean that the program is smaller.
It does though help with maintainability and seperation of concerns.
Here is a version where the formatting is done in one function, but the function now does 2 things.
def format_list(items):
formatted_items = []
for s in items:
formatted_items.append(f"{s[0].upper()}{s[1:].lower()}")
return formatted_items
def print_formatted(items, formatted_items):
for s, format_s in zip(items, formatted_items):
print(f"The formatted name for {s} is {format_s}.")
# main program
name_list = ["sam", "JOE", "Emily", "richard"]
# format_strings
formatted_list = format_list(name_list)
# print the formatted strings
print_formatted(name_list, formatted_list)
And just for fun, here it is using list comprehension...
def format_list(items):
return [f"{s[0].upper()}{s[1:].lower()}" for s in items]
def print_formatted(items, formatted_items):
for s, format_s in zip(items, formatted_items):
print(f"The formatted name for {s} is {format_s}.")
# main program
name_list = ["sam", "JOE", "Emily", "richard"]
# format_strings
formatted_list = format_list(name_list)
# print the formatted strings
print_formatted(name_list, formatted_list)
If you inspect all the main programs that use functions, you will see they are about 3-4 lines long and that it is easy to see which bit does what.
Easier to test and easier to debug!
2. Abstraction with Functions
Again, we will see other ways to abstract a program with modules and classes, but for now we will consider functions.
The essence of abstraction is that you only need know what a function does, but not how it works.
Take for example the following function.
def string_list_capitalise_initial(items):
formatted_items = []
for s in items:
formatted_items.append(f"{s[0].upper()}{s[1:]}")
return formatted_items
I don't really care how it works, I care that I understand what it does and how to use it. Here it is written with a docstring.
def string_list_capitalise_initial(items):
""" Takes a list of strings and Capitalises the first character.
e.g. ["test", "case"] -> ["Test" "Case"]
["heLLo", "WorLD"] -> ["HeLLo", "WorLD"]
Parameters
----------
items: list
list of strings.
Returns
-------
list:
list of transformed strings.
"""
formatted_items = []
for s in items:
formatted_items.append(f"{s[0].upper()}{s[1:]}")
return formatted_items
In many cases this sort of docstring might be left out depending on the programmer; however it is clear what it does and how to use it. It is also important that the function name and parameters makes sense!
The real point it I can use this function without looking at it's internal workings, it is a black-box!
Ideally a function should do one thing only. This makes them easier to test and reuse. It’s hard to define a length restriction but going over 15 lines should make you pause and reflect.
This is an example of something called the single-responsibility principle (SRP).
=== TASK ===
Rewrite the code below so that it uses the functions specified in the remainder of the TASK.
# main program
# list of strings
str_list = ["ABBA", "snake", "Peep", "this", "kayak"]
pal_list = []
for s in str_list:
lower_s = s.lower()
if len(lower_s) > 1:
is_pal = True
start_index = 0
end_index = len(lower_s) - 1
while start_index <= end_index and is_pal:
if not lower_s[start_index] == lower_s[end_index]:
is_pal = False
start_index += 1
end_index -= 1
if is_pal:
pal_list.append(s)
print(pal_list) # prints ['ABBA', 'Peep', 'kayak']
You will note that there are specifications written for each of the functions.
When decomposed properly, your main program should just be 3 lines long.
Copy the following into a new Python file to get started. You can always delete the docstrings to make it easier to read your code.
def is_palindrome(s):
""" Tests if a string is a palindrome, ignore upper/lower case
e.g. ABBA returns True
AbBa returns True
snake returns False
Parameters
----------
s: str
String to test as palindrome
Returns
-------
bool:
Return True if s in a palindrome; False otherwise.
"""
pass
def filter_palindromes(items):
""" Filters a list for palindromes.
e.g. ["ABBA", "snake", "Peep", "this"] returns ["ABBA", "Peep"]
Parameters
----------
items: list
List of strings.
Returns
-------
list:
Filtered list of palindromes
"""
pass
if __name__ == "__main__":
# main program when decomposed
str_list = ["ABBA", "snake", "Peep", "this"]
pal_list = filter_palindromes(str_list)
print(pal_list) # prints ['ABBA', 'Peep']
HINT: You should be calling the is_palindrome()
function from inside the filter_palindromes()
function.
References
[1] Guttag, J.V. (2021) Introduction to Computation and Programming Using Python, third edition: With Application to Computational Modeling. The MIT Press.
[2] https://cs.stanford.edu/people/nick/py/python-style-decomposition.html