Effective Python

Chapter 3: Classes and Inheritance

Speaker: 毛毛 (2016/12/03)

Outline

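  • 22 Prefer Helper Classes Over Bookkeeping with Dictionaries and Tuples
  • 23 Accept Functions for Simple Interfaces Instead of Classes
  • 24 Use @classmethod Polymorphism to Construct Objects Generically
  • 25 Initialize Parent Classes with super
  • 26 Use Multiple Inheritance Only for Mix-in Utility Classes
  • 27 Prefer Public Attributes Over Private Ones
  • 28 Inherit from collections.abc for Custom Container Types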

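22 Prefer Helper Classes Over Bookkeeping with Dictionaries and Tuples

Dictionaries are so easy to use that it is tempting to keep extending them, which leads to code that is hard to read

  • Avoid nesting dictionaries more than one level deep; in other words, avoid dictionaries that contain dictionaries

Let's look at an example!

Recording student grades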

In [60]:
# Record grades for a single subject and compute the average
class SimpleGradebook(object):
    def __init__(self):
        self._grades = {}
    
    def add_student(self, name):
        self._grades[name] = []
    
    def report_grade(self, name, score):
        self._grades[name].append(score)
    
    def average_grade(self, name):
        grades = self._grades[name]
        return sum(grades)/len(grades)

book = SimpleGradebook()
book.add_student("Maomao")
book.report_grade("Maomao", 100)
book.report_grade("Maomao",90)
print(book.average_grade("Maomao"))
95.0
In [63]:
# Record grades for multiple subjects and compute the average
class BySubjectGradebook(object):
    def __init__(self):
        self._grades = {}
    
    def add_student(self, name):
        self._grades[name] = {}
    
    def report_grade(self, name, subject, score):
        by_subject = self._grades[name]
        grade_list = by_subject.setdefault(subject, [])
        grade_list.append(score)
    
    def average_grade(self, name):
        by_subject = self._grades[name]
        total, count = 0, 0
        for grades in by_subject.values():
            total += sum(grades)
            count += len(grades)
        return total/count

book = BySubjectGradebook()
book.add_student("Maomao")
book.report_grade("Maomao", "Math", 100)
book.report_grade("Maomao", "Math", 90)
book.report_grade("Maomao", "Cooking", 96)
book.report_grade("Maomao", "Cooking", 97)
print(book.average_grade("Maomao"))
95.75
In [67]:
# Record weighted grades for multiple subjects and compute the weighted average
class WeightedGradebook(object):
    def __init__(self):
        self._grades = {}
    
    def add_student(self, name):
        self._grades[name] = {}
    
    def report_grade(self, name, subject, score, weight):
        by_subject = self._grades[name]
        grade_list = by_subject.setdefault(subject, [])
        grade_list.append((score, weight))
    
    def average_grade(self, name):
        by_subject = self._grades[name]
        total, count = 0, 0
        for grades in by_subject.values():
            for score, weight in grades:
                total += score*weight
                count += weight
        return total/count

book = WeightedGradebook()
book.add_student("Maomao")
book.report_grade("Maomao", "Math", 100, 1)
book.report_grade("Maomao", "Math", 90, 2)
book.report_grade("Maomao", "Cooking", 96, 3)
book.report_grade("Maomao", "Cooking", 97, 1)
print(book.average_grade("Maomao"))
95.0

When the bookkeeping gets complicated, break it up into multiple classes that encapsulate the data behind well-defined interfaces

namedtuple

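  • Creates tuple-like data classes with named fields
  • Well suited to defining small, immutable data classes
  • Fields can be read by attribute name as well as by index
    • This makes the code more readable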
In [79]:
import collections

Human = collections.namedtuple("Human", ("name", "gender", "age"))
maomao = Human(name="Maomao", gender="girl", age="18")
Ivan = Human("Ivan", "boy", "25")
print(maomao.name,"is a", maomao.age, "years-old", maomao[1])
print(Ivan[0],"is a", Ivan[2], "years-old", Ivan.gender)

maomao.age = "20"
Maomao is a 18 years-old girl
Ivan is a 25 years-old boy
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-79-0f38e149ab14> in <module>()
      7 print(Ivan[0],"is a", Ivan[2], "years-old", Ivan.gender)
      8 
----> 9 maomao.age = "20"

AttributeError: can't set attribute
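
Because a namedtuple is immutable, the usual way to "change" a field is to build a new tuple. The following is a minimal sketch using the standard `_replace` and `_asdict` methods; the `Human` type mirrors the one above, and the variable name `older` is only illustrative.

```python
import collections

Human = collections.namedtuple("Human", ("name", "gender", "age"))
maomao = Human(name="Maomao", gender="girl", age="18")

# _replace returns a new tuple; the original is left untouched
older = maomao._replace(age="20")
print(older.age, maomao.age)

# _asdict converts the fields to a mapping, which is handy for serialization
print(dict(older._asdict()))
```

If fields need to change often, that is usually a sign that a full class fits better than a namedtuple.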

(Rewritten) Recording student grades

In [70]:
import collections

Grade = collections.namedtuple("Grade", ("score", "weight"))

class Subject(object):
    def __init__(self):
        self._grades = []
    
    def report_grade(self, score, weight):
        self._grades.append(Grade(score, weight))
    
    def average_grade(self):
        total, count = 0, 0
        for grade in self._grades:
            total += grade.score*grade.weight
            count += grade.weight
        return total/count
In [71]:
class Student(object):
    def __init__(self):
        self._subjects = {}
        
    def subject(self, name):
        if name not in self._subjects:
            self._subjects[name] = Subject()
        return self._subjects[name]
    
    def average_grade(self):
        total, count = 0, 0
        for subject in self._subjects.values():
            total += subject.average_grade()
            count += 1
        return total/count
In [72]:
class Gradebook(object):
    def __init__(self):
        self._students = {}

    def student(self, name):
        if name not in self._students:
            self._students[name] = Student()
        return self._students[name]

book = Gradebook()
maomao = book.student("Maomao")
math = maomao.subject("Math")
math.report_grade(100, 2)
math.report_grade(90, 1)
print(math.average_grade())
cook = maomao.subject("Cooking")
cook.report_grade(85, 1)
cook.report_grade(95, 3)
print(cook.average_grade())
print(maomao.average_grade())
96.66666666666667
92.5
94.58333333333334

Summary

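  • When a bookkeeping dictionary gets too complicated, refactor it into multiple helper classes
  • When you don't need the flexibility of a full class, use namedtuple for a lightweight, immutable data container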

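23 Accept Functions for Simple Interfaces Instead of Classes

Python functions are first-class objects

  • They can be assigned to variables
  • They can be passed to other functions as arguments
  • They can be returned from functions as return values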
In [144]:
# Assign a function to a variable

def hello(name):
    print("Hello", name, "!")

a = hello
a("maomao")
Hello maomao !
In [146]:
# Pass a function as an argument

def compute(func, number1, number2):
    return func(number1, number2)

def add(number1, number2):
    return number1+number2

def sub(number1, number2):
    return number1-number2

n1 = 10
n2 = 5
print(n1, "+", n2, "=", compute(add, n1, n2))
print(n1, "-", n2, "=", compute(sub, n1, n2))
10 + 5 = 15
10 - 5 = 5
In [148]:
# Return a function as a return value

def pow_func(base_num):
    def inner_func(num):
        return num**base_num
    return inner_func

pow_2 = pow_func(2)
print(pow_2(2))
print(pow_2(3))
4
9

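Many of Python's built-in APIs let you pass in a function (a hook) to customize their behavior

  • While the API runs, it calls back your hook

The sort method of list accepts an optional key argument that determines the value used to sort each item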

In [153]:
# Sort by string length

def sort_func(x):
    return len(x)

names = ["Socrates", "Archimedes", "Plato", "Aristotle"]
names.sort(key=sort_func)
print(names)

names2 = ["Socrates", "Archimedes", "Plato", "Aristotle"]
names2.sort(key=lambda x: len(x), reverse=True)
print(names2)
['Plato', 'Socrates', 'Aristotle', 'Archimedes']
['Archimedes', 'Aristotle', 'Socrates', 'Plato']

The defaultdict class lets you supply a function that is called whenever a missing key is accessed

  • The function must return the default value for the missing key
In [251]:
from collections import defaultdict

def key_missing():
    print("Assign default value")
    return 1

scores = {"Maomao": 100, "Abby": 90}

scores2 = defaultdict(key_missing, scores)
print(scores2["Tom"])

print(scores)            # Note: "Tom" was not added to scores; it exists only in scores2
print(scores2)
Assign default value
1
{'Maomao': 100, 'Abby': 90}
defaultdict(<function key_missing at 0x7f6bb95d68c8>, {'Tom': 1, 'Maomao': 100, 'Abby': 90})
In [186]:
scores2["Hopper"] += 5
for key, value in scores2.items():
    print (key, value)
Assign default value
Tom 1
Maomao 100
Hopper 6
Abby 90

Back to the main topic: accept functions for simple interfaces instead of classes. Here is the book's example!

Counting the number of missing keys

In [252]:
current = {"green": 12, "blue": 3}
increments = [("red", 5), ("blue", 17), ("orange", 9)]

# Case 1: a stateful closure
def increment_with_report(current, increments):
    added_count = 0
    
    def missing():
        nonlocal added_count
        added_count += 1
        return 0
    
    result = defaultdict(missing, current)
    for key, amount in increments:
        result[key] += amount
        
    return result, added_count

result, added_count = increment_with_report(current, increments)
print(result)
print(added_count)
defaultdict(<function increment_with_report.<locals>.missing at 0x7f6bb9ef4b70>, {'red': 5, 'orange': 9, 'blue': 20, 'green': 12})
2
In [253]:
# Case 2: a helper class whose bound method is the hook
class CountMissing(object):
    def __init__(self):
        self.added = 0
        
    def missing(self):
        self.added += 1
        return 0
    
counter = CountMissing()
result = defaultdict(counter.missing, current)
for key, amount in increments:
    result[key] += amount

print(result)
print(counter.added)
defaultdict(<bound method CountMissing.missing of <__main__.CountMissing object at 0x7f6bb960b8d0>>, {'red': 5, 'orange': 9, 'blue': 20, 'green': 12})
2
In [254]:
# Case 3: a class whose instances are callable via __call__
class BetterCountMissing(object):
    def __init__(self):
        self.added = 0
        
    def __call__(self):
        self.added += 1
        return 0
    
counter = BetterCountMissing()
result = defaultdict(counter, current)
for key, amount in increments:
    result[key] += amount

print(result)
print(counter.added)
defaultdict(<__main__.BetterCountMissing object at 0x7f6bb95c5588>, {'red': 5, 'orange': 9, 'blue': 20, 'green': 12})
2

Summary

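  • Functions are first-class in Python, meaning they can be used in expressions like any other value
  • The __call__ special method lets instances of a class be called like functions
  • When you need a function that maintains state, consider defining a class that provides __call__ instead of using a stateful closure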
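
The same idea carries over to other hooks, such as the key argument of sort. The following is a minimal sketch; `CountingKey` is a hypothetical class made up for illustration, not something from the book.

```python
class CountingKey(object):
    """Hypothetical sort-key hook that also counts how often it is called."""

    def __init__(self):
        self.calls = 0

    def __call__(self, item):
        # Called by list.sort once for each item
        self.calls += 1
        return len(item)

key = CountingKey()
names = ["Socrates", "Archimedes", "Plato", "Aristotle"]
names.sort(key=key)

print(callable(key))   # instances that define __call__ pass the callable() check
print(names)
print(key.calls)
```

The state lives in an ordinary attribute (`calls`), so it can be inspected after the API has finished calling the hook.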

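24 Use @classmethod Polymorphism to Construct Objects Generically

Polymorphism: multiple classes in the same hierarchy each implement their own version of a method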

In [2]:
class Animal(object):
    def __init__(self, name):
        self.name = name
        
    def say_hello(self):
        raise NotImplementedError

class Human(Animal):
    def say_hello(self):
        print("Hi~ I'm human {}".format(self.name))

class Cat(Animal):
    def say_hello(self):
        print("Meow~ I'm cat {}".format(self.name))

maomao = Human("maomao")
maomao.say_hello()

kitty = Cat("kitty")
kitty.say_hello()
Hi~ I'm human maomao
Meow~ I'm cat kitty

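Kinds of methods in a class

  • Regular (instance) methods
  • Class methods
    • Methods decorated with @classmethod
  • Static methods
    • Methods decorated with @staticmethod

Regular (instance) methods

  • Called on an instance, so an object must be created first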
In [4]:
class Test(object):
    def say_hello(self):
        print("Hi~")

object1 = Test()
object1.say_hello()    # works

Test.say_hello()       # fails: no instance is bound to self
Hi~
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-4-0d9f59356d68> in <module>()
      6 object1.say_hello()
      7 
----> 8 Test.say_hello()

TypeError: say_hello() missing 1 required positional argument: 'self'

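Class methods

  • Can be called without an instance (calling through an instance also works)
  • Operate on the class rather than on an instance
  • Typically used for operations on class attributes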
In [255]:
# Track how many instances of the class have been created

class Test(object):
    # Class attribute
    instance_number = 0
    
    def __init__(self, name):
        # Update the class attribute
        Test.instance_number += 1
        
        # Instance attribute
        self.name = name
    
    def say_hello(self):
        print("Hi {}".format(self.name))
    
    @classmethod
    def count(cls):
        print(cls.instance_number)
In [256]:
Test.count()        # works

object1 = Test("object1")
Test.count()
object1.count()     # also works

object2 = Test("object2")
Test.count()
object1.count()
object2.count()
0
1
1
2
2
2

Note!

  • Class attributes are shared by all instances
In [257]:
class Test(object):
    hello_msg = "Hi"
    
    def __init__(self, name):
        self.name = name
    
    def say_hello(self):
        print(Test.hello_msg, self.name)
    
    @classmethod
    def modify_hello_msg(cls, new_msg):
        cls.hello_msg = new_msg

object1 = Test("object1")
object1.say_hello()

object2 = Test("object2")
object2.say_hello()
Hi object1
Hi object2
In [258]:
Test.modify_hello_msg("Hey")
object1.say_hello()
object2.say_hello()

object3 = Test("object3")
object3.say_hello()

Test.modify_hello_msg("Welcome")
object1.say_hello()
object2.say_hello()
object3.say_hello()
Hey object1
Hey object2
Hey object3
Welcome object1
Welcome object2
Welcome object3

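Regular (instance) methods vs. class methods

First parameter

  • A regular (instance) method receives the instance itself, conventionally named self
  • A class method receives the class, conventionally named cls
    • Even when called through an instance, it still receives the class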
In [37]:
class Test(object): 
    def instance_method(self):
        print(self)
    
    @classmethod
    def class_method(cls):
        print(cls)

object1 = Test()

Test.class_method()
object1.class_method()

object1.instance_method()
<class '__main__.Test'>
<class '__main__.Test'>
<__main__.Test object at 0x7f6bbaffd160>

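Static methods

  • Can be called without an instance (calling through an instance also works)
  • Typically used for operations that need neither class attributes nor instance attributes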
In [259]:
import platform
    
class Test(object):
    @staticmethod
    def preparation():
        sysstr = platform.system()
        if sysstr == "Windows":
            print("preparation for Windows platform")
        elif sysstr == "Linux":
            print("preparation for Linux platform")
        elif sysstr == "Darwin":
            print("preparation for Mac OS platform")
        else:
            print("preparation for Other OS platform")
    
    def do_something(self):
        self.preparation()
        print("do something ......")

Test.preparation()        # works

object1 = Test()
object1.preparation()     # also works

print()
object1.do_something()    # also works
preparation for Linux platform
preparation for Linux platform

preparation for Linux platform
do something ......

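Regular (instance) methods vs. class methods vs. static methods

How they are called

  • Regular (instance) methods
    • Can be called only with an instance (normally through the instance itself)
  • Class methods
    • Can be called through the class or through an instance
  • Static methods
    • Can be called through the class or through an instance

Parameters

  • Regular (instance) methods
    • Receive the instance itself, conventionally named self
  • Class methods
    • Receive the class, conventionally named cls
  • Static methods
    • Receive no implicit first argument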
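
The comparison above can be checked side by side in a single class. This is a minimal sketch; `Demo` and its method names are hypothetical and chosen only for illustration.

```python
class Demo(object):
    def instance_method(self):
        return self            # receives the instance

    @classmethod
    def class_method(cls):
        return cls             # receives the class, even when called via an instance

    @staticmethod
    def static_method():
        return "no implicit argument"

obj = Demo()
print(obj.instance_method() is obj)
print(Demo.class_method() is Demo, obj.class_method() is Demo)
print(Demo.static_method(), obj.static_method())

# Calling an instance method through the class works only if the
# instance is passed explicitly as the first argument
print(Demo.instance_method(obj) is obj)
```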

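Back to the main topic: use @classmethod polymorphism to construct objects generically. Here is the book's example!

Implementing MapReduce: processing large amounts of data in parallel

  • Map
    • Split the input data into small pieces and distribute them to worker nodes for computation
  • Reduce
    • Collect the results from the worker nodes and merge them into a single output

Preparing the input data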

In [39]:
import os

class InputData(object):
    def read(self):
        raise NotImplementedError

class PathInputData(InputData):
    def __init__(self, path):
        super().__init__()
        self.path = path
        
    def read(self):
        with open(self.path) as f:    # make sure the file gets closed
            return f.read()

def generate_inputs(data_dir):
    for name in os.listdir(data_dir):
        yield PathInputData(os.path.join(data_dir, name))

Preparing the workers

In [40]:
class Worker(object):
    def __init__(self, input_data):
        self.input_data = input_data
        self.result = None
    
    def map(self):
        raise NotImplementedError
        
    def reduce(self, other):
        raise NotImplementedError

# Counts newline characters
class LineCountWorker(Worker):
    def map(self):
        data = self.input_data.read()
        self.result = data.count("\n")
        
    def reduce(self, other):
        self.result += other.result

def create_workers(input_list):
    workers = []
    for input_data in input_list:
        workers.append(LineCountWorker(input_data))
    return workers

Processing the input data in parallel

In [45]:
from threading import Thread

def execute(workers):
    threads = [Thread(target=w.map) for w in workers]
    
    for thread in threads:    # start the threads
        thread.start()
    
    for thread in threads:    # wait for all threads to finish
        thread.join()
    
    first, rest = workers[0], workers[1:]
    for worker in rest:       # collect the results
        first.reduce(worker)
    
    return first.result

def mapreduce(data_dir):
    inputs = generate_inputs(data_dir)
    workers = create_workers(inputs)
    return execute(workers)

print("There are {} lines".format(mapreduce("/home/maomao/tmp")))
There are 77 lines

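To use a different InputData or Worker subclass, generate_inputs and create_workers would both have to be rewritten!

That is because the objects are not constructed generically!

(Rewritten) Preparing the input data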

In [46]:
import os

class GenericInputData(object):
    def read(self):
        raise NotImplementedError
    
    @classmethod
    def generate_inputs(cls, config):
        raise NotImplementedError

class PathInputData(GenericInputData):
    def __init__(self, path):
        super().__init__()
        self.path = path
        
    def read(self):
        with open(self.path) as f:    # make sure the file gets closed
            return f.read()
    
    @classmethod
    def generate_inputs(cls, config):
        data_dir = config["data_dir"]
        for name in os.listdir(data_dir):
            yield cls(os.path.join(data_dir, name))

(Rewritten) Preparing the workers

In [48]:
class GenericWorker(object):
    def __init__(self, input_data):
        self.input_data = input_data
        self.result = None
    
    def map(self):
        raise NotImplementedError
        
    def reduce(self, other):
        raise NotImplementedError
    
    @classmethod
    def create_workers(cls, input_class, config):
        workers = []
        for input_data in input_class.generate_inputs(config):
            workers.append(cls(input_data))
        return workers

# Counts newline characters
class LineCountWorker(GenericWorker):
    def map(self):
        data = self.input_data.read()
        self.result = data.count("\n")
        
    def reduce(self, other):
        self.result += other.result

(Rewritten) Processing the input data in parallel

In [49]:
from threading import Thread

def execute(workers):
    threads = [Thread(target=w.map) for w in workers]
    
    for thread in threads:    # start the threads
        thread.start()
    
    for thread in threads:    # wait for all threads to finish
        thread.join()
    
    first, rest = workers[0], workers[1:]
    for worker in rest:       # collect the results
        first.reduce(worker)
    
    return first.result

def mapreduce(worker_class, input_class, config):
    workers = worker_class.create_workers(input_class, config)
    return execute(workers)

config = {"data_dir": "/home/maomao/tmp"}
print("There are {} lines".format(mapreduce(LineCountWorker, PathInputData, config)))
There are 77 lines
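
The benefit of the generic version is that a new input source needs no changes to the worker or to mapreduce. The following is a self-contained sketch; `StringInputData` and the `"texts"` config key are assumptions made up for illustration, and the generic classes are repeated in condensed form only so that the block runs on its own.

```python
from threading import Thread

class GenericInputData(object):
    def read(self):
        raise NotImplementedError

    @classmethod
    def generate_inputs(cls, config):
        raise NotImplementedError

class StringInputData(GenericInputData):
    """Hypothetical subclass: reads from in-memory strings instead of files."""

    def __init__(self, text):
        super().__init__()
        self.text = text

    def read(self):
        return self.text

    @classmethod
    def generate_inputs(cls, config):
        for text in config["texts"]:    # "texts" is an assumed config key
            yield cls(text)

class GenericWorker(object):
    def __init__(self, input_data):
        self.input_data = input_data
        self.result = None

    def map(self):
        raise NotImplementedError

    def reduce(self, other):
        raise NotImplementedError

    @classmethod
    def create_workers(cls, input_class, config):
        # cls and input_class are chosen by the caller, not hard-coded
        return [cls(data) for data in input_class.generate_inputs(config)]

class LineCountWorker(GenericWorker):
    def map(self):
        self.result = self.input_data.read().count("\n")

    def reduce(self, other):
        self.result += other.result

def mapreduce(worker_class, input_class, config):
    workers = worker_class.create_workers(input_class, config)
    threads = [Thread(target=w.map) for w in workers]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    first, rest = workers[0], workers[1:]
    for worker in rest:
        first.reduce(worker)
    return first.result

config = {"texts": ["a\nb\n", "c\n", "no newline"]}
print(mapreduce(LineCountWorker, StringInputData, config))
```

Only the new subclass and the config differ from the file-based run; mapreduce and LineCountWorker are reused unchanged.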

25 Initialize Parent Classes with super

Calling the parent class's __init__ method

[Works in both Python 2 and Python 3]

  • FatherClass.__init__(self, value)
  • super(ChildClass, self).__init__(value)

[Python 3 only]

  • super(__class__, self).__init__(value)
  • super().__init__(value)

Single inheritance

In [50]:
class FatherClass(object):
    def __init__(self, value):
        self.value = value

class ChildClass(FatherClass):
    def __init__(self, value):
        FatherClass.__init__(self, value)

object1 = ChildClass(5)
print(object1.value)
5

Multiple inheritance

In [51]:
class BaseClass(object):
    def __init__(self, value):
        self.value = value

class TimesTwo(object):
    def __init__(self):
        self.value *= 2

class PlusFive(object):
    def __init__(self):
        self.value += 5

class Order1(BaseClass, TimesTwo, PlusFive):
    def __init__(self, value):
        BaseClass.__init__(self, value)
        TimesTwo.__init__(self)
        PlusFive.__init__(self)

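class Order2(BaseClass, PlusFive, TimesTwo):
    def __init__(self, value):
        BaseClass.__init__(self, value)
        # Same call order as Order1 even though the base classes are listed
        # differently: explicit calls ignore the order in the class statement
        TimesTwo.__init__(self)
        PlusFive.__init__(self)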

object1 = Order1(2)
print(object1.value)

object2 = Order2(2)
print(object2.value)
9
9

Diamond inheritance

  • A form of multiple inheritance
  • ChildClass inherits from FatherClass1 and FatherClass2, and both FatherClass1 and FatherClass2 inherit from the same GrandFatherClass
In [53]:
class BaseClass(object):
    def __init__(self, value):
        self.value = value

class TimesTwo(BaseClass):
    def __init__(self, value):
        BaseClass.__init__(self, value)
        self.value *= 2

class PlusFive(BaseClass):
    def __init__(self, value):
        BaseClass.__init__(self, value)
        self.value += 5

class Order1(TimesTwo, PlusFive):
    def __init__(self, value):
        TimesTwo.__init__(self, value)
        PlusFive.__init__(self, value)

class Order2(PlusFive, TimesTwo):
    def __init__(self, value):
        PlusFive.__init__(self, value)
        TimesTwo.__init__(self, value)

object1 = Order1(2)
print(object1.value)        # expected (2 * 2) + 5 = 9, but got 7

object2 = Order2(2)
print(object2.value)        # expected (2 + 5) * 2 = 14, but got 4
7
4

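Initializing with FatherClass.__init__(self, value) makes it easy for a value to be reset: in the diamond above, BaseClass.__init__ runs twice, and the second call overwrites the result of the first branch!

=> Use the super built-in function for initialization to avoid this problem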

In [54]:
class BaseClass(object):
    def __init__(self, value):
        self.value = value

class TimesTwo(BaseClass):
    def __init__(self, value):
        super(TimesTwo, self).__init__(value)
        self.value *= 2

class PlusFive(BaseClass):
    def __init__(self, value):
        super(PlusFive, self).__init__(value)
        self.value += 5

class Order1(TimesTwo, PlusFive):
    def __init__(self, value):
        super(Order1, self).__init__(value)

class Order2(PlusFive, TimesTwo):
    def __init__(self, value):
        super(Order2, self).__init__(value)

object1 = Order1(2)
print(object1.value)        # one might expect (2 * 2) + 5 = 9, but got 14

object2 = Order2(2)
print(object2.value)        # one might expect (2 + 5) * 2 = 14, but got 9
14
9

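The order in which super calls methods is determined by the MRO (method resolution order)!

When you call super(MyClass, self).method(), you are actually calling method() on the class that follows MyClass in the MRO

Let's look at an example!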

In [260]:
class Root(object):
    def __init__(self):
        print("Enter Root")

class A(Root):
    def __init__(self):
        print("Enter A")
        super(A, self).__init__()
        print("Leave A")

class B(Root):
    def __init__(self):
        print("Enter B")
        super(B, self).__init__()
        print("Leave B")

class C(Root):
    def __init__(self):
        print("Enter C")
        super(C, self).__init__()
        print("Leave C")

class Order1(A, B, C):
    def __init__(self):
        super(Order1, self).__init__()
In [261]:
print(Order1.mro())
object1 = Order1()
[<class '__main__.Order1'>, <class '__main__.A'>, <class '__main__.B'>, <class '__main__.C'>, <class '__main__.Root'>, <class 'object'>]
Enter A
Enter B
Enter C
Enter Root
Leave C
Leave B
Leave A
In [262]:
class Order2(B, C, A):
    def __init__(self):
        super(Order2, self).__init__()

class Order3(C, A, B):
    def __init__(self):
        super(Order3, self).__init__()

print (Order2.mro())
object2 = Order2()
print()
print (Order3.mro())
object3 = Order3()
[<class '__main__.Order2'>, <class '__main__.B'>, <class '__main__.C'>, <class '__main__.A'>, <class '__main__.Root'>, <class 'object'>]
Enter B
Enter C
Enter A
Enter Root
Leave A
Leave C
Leave B

[<class '__main__.Order3'>, <class '__main__.C'>, <class '__main__.A'>, <class '__main__.B'>, <class '__main__.Root'>, <class 'object'>]
Enter C
Enter A
Enter B
Enter Root
Leave B
Leave A
Leave C

Back to the TimesTwo and PlusFive example from earlier

In [265]:
class BaseClass(object):
    def __init__(self, value):
        self.value = value

class TimesTwo(BaseClass):
    def __init__(self, value):
        super(TimesTwo, self).__init__(value)
        self.value *= 2

class PlusFive(BaseClass):
    def __init__(self, value):
        super(PlusFive, self).__init__(value)
        self.value += 5

class Order1(TimesTwo, PlusFive):
    def __init__(self, value):
        super(Order1, self).__init__(value)

print(Order1.mro())
object1 = Order1(2)
print(object1.value)
[<class '__main__.Order1'>, <class '__main__.TimesTwo'>, <class '__main__.PlusFive'>, <class '__main__.BaseClass'>, <class 'object'>]
14
In [266]:
class Order2(PlusFive, TimesTwo):
    def __init__(self, value):
        super(Order2, self).__init__(value)

print(Order2.mro())
object2 = Order2(2)
print(object2.value)
[<class '__main__.Order2'>, <class '__main__.PlusFive'>, <class '__main__.TimesTwo'>, <class '__main__.BaseClass'>, <class 'object'>]
9

Python 3 only

  • super(__class__, self).__init__(value)
  • super().__init__(value)
In [58]:
class BaseClass(object):
    def __init__(self, value):
        self.value = value

class TimesTwo(BaseClass):
    def __init__(self, value):
        super(__class__, self).__init__(value)
        self.value *= 2

class PlusFive(BaseClass):
    def __init__(self, value):
        super().__init__(value)
        self.value += 5

class Order1(TimesTwo, PlusFive):
    def __init__(self, value):
        super().__init__(value)

object1 = Order1(2)
print(object1.value)
14

Summary

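  • The MRO, which super follows, solves the repeated-reset problem of diamond inheritance: the common base class is initialized only once
  • super actually calls the next class in the MRO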
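
One consequence worth seeing directly: the class that super reaches depends on the MRO of the instance's type, not only on the class where super is written. This is a minimal sketch; the class names `Base`, `Left`, `Right`, and `Both` are hypothetical.

```python
class Base(object):
    def who(self):
        return ["Base"]

class Left(Base):
    def who(self):
        # The "next class" is looked up in the MRO of type(self)
        return ["Left"] + super().who()

class Right(Base):
    def who(self):
        return ["Right"] + super().who()

class Both(Left, Right):
    def who(self):
        return ["Both"] + super().who()

print([c.__name__ for c in Both.__mro__])
print(Left().who())    # for a Left instance, super() in Left reaches Base
print(Both().who())    # for a Both instance, super() in Left reaches Right
```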

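26 Use Multiple Inheritance Only for Mix-in Utility Classes

A mix-in is a small class that only defines a set of additional methods that a class should provide

  • Does not define its own instance attributes
  • Does not define its own __init__
  • Provides generic functionality that can be applied to many other classes
  • Minimizes repeated code and maximizes reuse

Let's go straight to an example!

Implementing fruits (indicating whether each is suitable as a gift and how to eat it)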

In [267]:
class Fruit(object):
    def __init__(self, cost):
        self.cost = cost

class GiftMixin(object):
    def is_suitable_gift(self):
        return True

class NotGiftMixin(object):
    def is_suitable_gift(self):
        return False

class PareMixin(object):
    def eat_method(self):
        return "Pare"
    
class HuskMixin(object):
    def eat_method(self):
        return "Husk"

class Apple(Fruit, GiftMixin, HuskMixin):
    def __init__(self, cost):
        super().__init__(cost)

class Banana(Fruit, NotGiftMixin, PareMixin):
    def __init__(self, cost):
        super().__init__(cost)
In [268]:
apple = Apple(100)
banana = Banana(50)
print("Is apple a suitable gift?", apple.is_suitable_gift())
print("Is banana a suitable gift?", banana.is_suitable_gift())
print("How to eat apple?", apple.eat_method())
print("How to eat banana?", banana.eat_method())
Is apple a suitable gift? True
Is banana a suitable gift? False
How to eat apple? Husk
How to eat banana? Pare

Let's look at another example!

Implementing a custom container

In [271]:
class ValueMixin(object):
    def __getitem__(self, key):
        return self.data[key]

    def __setitem__(self, key, value):
        self.data[key] = value

    def __delitem__(self, key):
        del self.data[key]

class CompareMixin(object):
    def __eq__(self, other):
        return (isinstance(other, self.__class__) and 
                self.__dict__ == other.__dict__)

    def __ne__(self, other):
        return not self.__eq__(other)
    
class SimpleItemContainer(ValueMixin, CompareMixin):
    def __init__(self):
        self.data = {}
In [272]:
object1 = SimpleItemContainer()
object1["aa"] = 111
object1["b"] = 2
print(object1["aa"], object1["b"])

print(object1.__dict__)
del object1["aa"]
print(object1.__dict__)
print()

object2 = SimpleItemContainer()
print("Is object1 equal to object2?", object1 == object2)
print("Is object1 not equal to object2?", object1 != object2)
object2["b"] = 2
print("Is object1 equal to object2?", object1 == object2)
111 2
{'data': {'b': 2, 'aa': 111}}
{'data': {'b': 2}}

Is object1 equal to object2? False
Is object1 not equal to object2? True
Is object1 equal to object2? True

Summary

  • Compose complex functionality from mix-ins instead of designing deep, complicated inheritance hierarchies
    • Mix-ins can be found in both Django and Tkinter
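
To illustrate the "generic functionality applied to many classes" point, here is a minimal sketch of a serialization mix-in. `JsonMixin`, `Point`, and this variant of `Fruit` are hypothetical names for illustration; the book's own mix-in examples are more elaborate.

```python
import json

class JsonMixin(object):
    """Hypothetical mix-in: no __init__ and no instance attributes of its own."""

    def to_json(self):
        # Relies only on the instance attributes of the host class
        return json.dumps(self.__dict__, sort_keys=True)

class Fruit(object):
    def __init__(self, name, cost):
        self.name = name
        self.cost = cost

class Apple(Fruit, JsonMixin):
    pass

class Point(JsonMixin):
    def __init__(self, x, y):
        self.x = x
        self.y = y

# Two unrelated classes gain the same behavior from one small class
print(Apple("apple", 100).to_json())
print(Point(1, 2).to_json())
```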

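27 Prefer Public Attributes Over Private Ones

In Python, class attributes have two kinds of visibility

  • Public
    • Accessible to anyone
  • Private
    • Accessible only within the class body

Let's first see how private attributes behave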

In [84]:
# A private attribute cannot be accessed directly through the object (only indirectly, through a method)

class Myobject(object):
    def __init__(self):
        self.public_field = 10
        self.__private_field = 20
    
    def get_private_field(self):
        return self.__private_field

object1 = Myobject()
print(object1.public_field)

print(object1.get_private_field())

print(object1.__private_field)
10
20
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-84-45fb60191ae6> in <module>()
     12 print(object1.get_private_field())
     13 
---> 14 print(object1.__private_field)

AttributeError: 'Myobject' object has no attribute '__private_field'
In [88]:
# A subclass cannot access its parent class's private attributes either

class Father(object):
    def __init__(self):
        self.public_field = 10
        self.__private_field = 20

class Child(Father):
    def get_private_field(self):
        return self.__private_field

object1 = Child()
print(object1.public_field)

print(object1.get_private_field())
10
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-88-0692c11e4b9f> in <module>()
     11 print(object1.public_field)
     12 
---> 13 print(object1.get_private_field())

<ipython-input-88-0692c11e4b9f> in get_private_field(self)
      6 class Child(Father):
      7     def get_private_field(self):
----> 8         return self.__private_field
      9 
     10 object1 = Child()

AttributeError: 'Child' object has no attribute '_Child__private_field'
In [273]:
# A class method can access private attributes because it is defined inside the class body

class Myobject(object):
    def __init__(self):
        self.public_field = 10
        self.__private_field = 20
    
    @classmethod
    def get_private_field(cls, instance):
        return instance.__private_field

object1 = Myobject()
print(object1.public_field)

print(Myobject.get_private_field(object1))

print(object1.__private_field)
10
20
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-273-6921a5bb26ca> in <module>()
     15 print(Myobject.get_private_field(object1))
     16 
---> 17 print(object1.__private_field)

AttributeError: 'Myobject' object has no attribute '__private_field'
In [104]:
# A private class attribute cannot be accessed directly from outside the class (only indirectly, through a method)

class Myobject(object):
    public_field = 10
    __private_field = 20
    
    def get_private_field(self):
        return self.__private_field

print(Myobject.public_field)
print(Myobject.__private_field)
10
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-104-94c55fa985c7> in <module>()
      7 
      8 print(Myobject.public_field)
----> 9 print(Myobject.__private_field)

AttributeError: type object 'Myobject' has no attribute '__private_field'
In [105]:
object1 = Myobject()
print(object1.public_field)

print(object1.get_private_field())
10
20
In [116]:
# [Bonus] Class attributes and instance attributes

class Myobject(object):
    public_field = 10
    
    def set_public_field(self, number):
        self.public_field = number

object1 = Myobject()
print("Class:", Myobject.public_field)
print("Instance:", object1.public_field)

object1.set_public_field(20)
print("#####################")

print("Class:", Myobject.public_field)
print("Instance:", object1.public_field)

Myobject.public_field = 30
print("#####################")

print("Class:", Myobject.public_field)
print("Instance:", object1.public_field)

object1.public_field = 40
print("#####################")

print("Class:", Myobject.public_field)
print("Instance:", object1.public_field)
Class: 10
Instance: 10
#####################
Class: 10
Instance: 20
#####################
Class: 30
Instance: 20
#####################
Class: 30
Instance: 40

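[Note] Python does not really have private attributes; the name is simply transformed (mangled) so that it cannot be found under its original name!

How do you find the transformed name? Look in __dict__!

__dict__

  • Stores the attribute data of classes and instances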
In [131]:
class Myclass(object):
    class_value = 5
    
    def __init__(self):
        self.instance_value = 10
        
    def hello(self):
        print("hello")

print(Myclass.__dict__)
print()

object1 = Myclass()
print(object1.__dict__)
{'__weakref__': <attribute '__weakref__' of 'Myclass' objects>, 'hello': <function Myclass.hello at 0x7f6bb9f74d08>, '__init__': <function Myclass.__init__ at 0x7f6bb9f74e18>, '__dict__': <attribute '__dict__' of 'Myclass' objects>, '__module__': '__main__', '__doc__': None, 'class_value': 5}

{'instance_value': 10}
In [274]:
class Myclass(object):
    """docstring for Myclass"""
    
    class_value = 5
    
    def __init__(self):
        self.instance_value = 10
        
    def hello(self):
        print("hello")

print(Myclass.__dict__)
{'__weakref__': <attribute '__weakref__' of 'Myclass' objects>, 'hello': <function Myclass.hello at 0x7f6bb96246a8>, '__init__': <function Myclass.__init__ at 0x7f6bb9624730>, '__dict__': <attribute '__dict__' of 'Myclass' objects>, 'class_value': 5, '__module__': '__main__', '__doc__': 'docstring for Myclass'}

Let's see what the private attribute names were changed to

In [125]:
class Myobject(object):
    def __init__(self):
        self.public_field = 10
        self.__private_field = 20
    
    def get_private_field(self):
        return self.__private_field

object1 = Myobject()
print(object1.__dict__)
print()
print(object1._Myobject__private_field)
{'_Myobject__private_field': 20, 'public_field': 10}

20
In [127]:
class Father(object):
    def __init__(self):
        self.public_field = 10
        self.__private_field = 20

class Child(Father):
    def get_private_field(self):
        return self._Father__private_field

object1 = Child()
print(object1.__dict__)
print()
print(object1.get_private_field())
print(object1._Father__private_field)
{'_Father__private_field': 20, 'public_field': 10}

20
20
In [129]:
class Myobject(object):
    public_field = 10
    __private_field = 20
    
    def get_private_field(self):
        return self.__private_field

print(Myobject.__dict__)
print()
print(Myobject._Myobject__private_field)
{'__weakref__': <attribute '__weakref__' of 'Myobject' objects>, '__dict__': <attribute '__dict__' of 'Myobject' objects>, '__module__': '__main__', 'public_field': 10, 'get_private_field': <function Myobject.get_private_field at 0x7f6bb9f14400>, '__doc__': None, '_Myobject__private_field': 20}

20

Private attributes make overriding and extending in subclasses cumbersome and error-prone
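
One way this goes wrong, as a minimal sketch: a subclass that hard-codes a mangled name breaks as soon as the attribute is defined by a different class in the hierarchy. The `GrandFather` class and the `__value` attribute here are hypothetical.

```python
class GrandFather(object):
    def __init__(self):
        self.__value = 5    # actually stored as _GrandFather__value

class Father(GrandFather):
    pass

class Child(Father):
    def get_value(self):
        # Hard-coded mangled name wrongly assumes Father defines the attribute
        return self._Father__value

child = Child()
print(child.__dict__)
try:
    child.get_value()
except AttributeError as error:
    print("AttributeError:", error)
```

The mangled name embeds the name of the defining class, so moving an attribute up or down the hierarchy silently invalidates every such reference.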

The recommended time to use them is when you need to avoid attribute name conflicts between a parent class and its subclasses

In [199]:
# Same name, but no conflict because the attributes are private

class Father(object):
    def __init__(self):
        self.__myown_value = 10
    
    def get_father_value(self):
        return self.__myown_value

class Child(Father):
    def __init__(self):
        super().__init__()
        self.__myown_value = 20
        
    def get_child_value(self):
        return self.__myown_value
        
object1 = Child()
print(object1.get_father_value())
print(object1.get_child_value())
print(object1.__dict__)
10
20
{'_Child__myown_value': 20, '_Father__myown_value': 10}
In [200]:
# Same name, but not private, so they collide: the child's assignment overwrites the parent's value

class Father(object):
    def __init__(self):
        self.myown_value = 10
    
    def get_father_value(self):
        return self.myown_value

class Child(Father):
    def __init__(self):
        super().__init__()
        self.myown_value = 20
        
    def get_child_value(self):
        return self.myown_value
        
object1 = Child()
print(object1.get_father_value())
print(object1.get_child_value())
print(object1.__dict__)
20
20
{'myown_value': 20}
In [201]:
# Same name, but not private, so they collide: here the parent's __init__ runs last and overwrites the child's value

class Father(object):
    def __init__(self):
        self.myown_value = 10
    
    def get_father_value(self):
        return self.myown_value

class Child(Father):
    def __init__(self):
        self.myown_value = 20
        super().__init__()
        
    def get_child_value(self):
        return self.myown_value
        
object1 = Child()
print(object1.get_father_value())
print(object1.get_child_value())
print(object1.__dict__)
10
10
{'myown_value': 10}

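A Python naming convention: a field prefixed with a single underscore is protected, meaning external users of the class should handle it with care

  • It has no actual enforcement
    • Equivalent to a public attribute
  • Document each protected field
    • Explain which ones subclasses may use and which ones should be left alone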
In [204]:
class Father(object):
    def __init__(self, value):
        # Stores the value supplied by the user
        # Treat it as immutable once assigned
        self._value = value

class Child(Father):
    def __init__(self, value):
        super().__init__(value)
        
    def get_value(self):
        return self._value
    
object1 = Child(100)
print(object1.get_value())
print(object1.__dict__)
100
{'_value': 100}
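To illustrate the "no actual enforcement" point, here is a minimal sketch that reuses the Father class from the cell above. External code can still read and rebind _value, so the comment documenting the field is the only safeguard. The variable name obj is illustrative.

```python
class Father(object):
    def __init__(self, value):
        # Stores the value supplied by the user.
        # Callers and subclasses should treat it as immutable.
        self._value = value

obj = Father(100)
print(obj._value)   # 100

# Nothing stops outside code from rebinding the protected field.
obj._value = 5
print(obj._value)   # 5
```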

Summary

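  • Python does not enforce access restrictions on private attributes
  • Plan from the start to let subclasses do more with your internal APIs and attributes, rather than locking them out by default
  • Use documentation of protected fields to guide subclasses, instead of forcing access control with private attributes
  • Consider private attributes only to avoid naming conflicts with subclasses that are out of your control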

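28 Inherit from collections.abc to Create Custom Container Types

Python's built-in container types: list, set, tuple, dict

For simple use cases, you can inherit directly from Python's built-in container types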

In [275]:
class SimpleList(list):
    def __init__(self):
        super().__init__()

object1 = SimpleList()
object1.append(1)
object1.append(2)
object1.append(3)
print(object1)
print(object1[0])
del object1[1]
print(object1)
[1, 2, 3]
1
[1, 3]
In [222]:
class FrequencyList(list):
    def __init__(self, values):
        super().__init__(values)
    
    def frequency(self):
        counts = {}
        for i in self:
            counts.setdefault(i, 0)
            counts[i] += 1
        return counts

object1 = FrequencyList(["a", "c", "b", "c", "e", "a"])
print("object1's length is", len(object1))
print("object1's frequency info is", object1.frequency())
object1.pop()
print("object1's length is", len(object1))
print("object1's frequency info is", object1.frequency())
object1's length is 6
object1's frequency info is {'b': 1, 'c': 2, 'a': 2, 'e': 1}
object1's length is 5
object1's frequency info is {'b': 1, 'c': 2, 'a': 1, 'e': 1}

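For a custom container type, turn to the collections.abc module for help

collections.abc

  • Defines a set of abstract base classes that declare all of the typical methods for each container type
  • If you subclass one of these abstract classes and forget to implement a required method, instantiating the subclass raises a TypeError that names the missing methods
  • Once the required methods are implemented, all of the additional methods are provided automatically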
In [227]:
from collections.abc import Sequence

class BadType(Sequence):
    pass

object1 = BadType()
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-227-d27245bd6296> in <module>()
      4     pass
      5 
----> 6 object1 = BadType()

TypeError: Can't instantiate abstract class BadType with abstract methods __getitem__, __len__
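By contrast, a subclass that implements the two required methods can be instantiated and picks up the remaining Sequence methods for free. The sketch below uses a hypothetical GoodType class (not from the original talk) that simply wraps a list.

```python
from collections.abc import Sequence

class GoodType(Sequence):
    """Hypothetical read-only sequence that wraps a list."""

    def __init__(self, values):
        self._values = list(values)

    # The two abstract methods required by Sequence.
    def __getitem__(self, index):
        return self._values[index]

    def __len__(self):
        return len(self._values)

seq = GoodType(["a", "c", "b", "c"])

# index, count, __contains__, __iter__ and __reversed__ come from Sequence.
print(seq.index("b"))        # 2
print(seq.count("c"))        # 2
print("e" in seq)            # False
print(list(reversed(seq)))   # ['c', 'b', 'c', 'a']
```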

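Implement a binary tree and traverse its nodes with BFS

  • A node's value in this binary tree can be retrieved by index
    • The index follows the BFS traversal order
  • The length of this binary tree is defined as its number of nodes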
In [278]:
class BinaryNode(object):
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

class SequenceNode(BinaryNode, Sequence):
    def __init__(self, value, left=None, right=None):
        super().__init__(value, left, right)
        # Node values in BFS order, computed once at construction time
        self._bfs_result = []
        self._parse_tree()
    
    # BFS: record node values level by level, left to right
    def _parse_tree(self):
        self._bfs_result.append(self.value)
        search_list = [self]
        while search_list:
            current_node = search_list.pop(0)
            if current_node.left is not None:
                search_list.append(current_node.left)
                self._bfs_result.append(current_node.left.value)
            if current_node.right is not None:
                search_list.append(current_node.right)
                self._bfs_result.append(current_node.right.value)
    
    def __getitem__(self, index):
        return self._bfs_result[index]
        
    def __len__(self):
        return len(self._bfs_result)
In [279]:
tree = SequenceNode(10, SequenceNode(5, SequenceNode(2)), SequenceNode(23, SequenceNode(9), SequenceNode(16)))
print(len(tree))
print(tree[4])
print([i for i in tree])
6
9
[10, 5, 23, 2, 9, 16]
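One consequence of the design above is that the traversal runs once in __init__, so children attached after construction are not reflected in the sequence. As an alternative sketch (not the approach used in the talk), the hypothetical LazySequenceNode below runs the BFS on every access, trading an O(n) traversal per lookup for always-current results. It uses collections.deque as the queue.

```python
from collections import deque
from collections.abc import Sequence

class BinaryNode(object):
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

class LazySequenceNode(BinaryNode, Sequence):
    """Hypothetical variant: traverses on demand instead of caching."""

    def _bfs_values(self):
        # Collect node values level by level, left to right.
        values = []
        queue = deque([self])
        while queue:
            node = queue.popleft()
            values.append(node.value)
            for child in (node.left, node.right):
                if child is not None:
                    queue.append(child)
        return values

    def __getitem__(self, index):
        return self._bfs_values()[index]

    def __len__(self):
        return len(self._bfs_values())

tree = LazySequenceNode(10, LazySequenceNode(5), LazySequenceNode(23))
print(list(tree))       # [10, 5, 23]

# A node attached after construction shows up in the next traversal.
tree.left.left = LazySequenceNode(2)
print(list(tree))       # [10, 5, 23, 2]
print(len(tree))        # 4
print(tree.index(23))   # 2
```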