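1. Declaring a Python String Object
1.1 Single quotes: '
Single quotes declare an ordinary string object and are convenient when the string itself contains double quotes ("). They are otherwise equivalent to double quotes, and a single-quoted literal cannot span multiple lines: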
my_string = 'My name is Jimmy'
print(my_string)
# My name is Jimmy
my_string = 'Jimmy is "really" handsome '
print(my_string)
# Jimmy is "really" handsome
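1.2 Double quotes: "
Double quotes declare an ordinary string object and are convenient when the string itself contains single quotes ('). They are otherwise equivalent to single quotes, and a double-quoted literal cannot span multiple lines: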
my_string = "My name is Jimmy"
print(my_string)
# My name is Jimmy
my_string = "Jimmy is 'really' handsome "
print(my_string)
# Jimmy is 'really' handsome
1.3 Triple quotes: ''', """
The main difference from the two forms above is that triple quotes can declare a multi-line string object:
my_string = """My name is Jimmy"""
print(my_string)
# My name is Jimmy
my_string = '''Jimmy is 'really' handsome'''
print(my_string)
# Jimmy is 'really' handsome
my_string = """My name is Jimmy
Jimmy is "really" handsome"""
print(my_string)
# My name is Jimmy
# Jimmy is "really" handsome
my_string = '''My name is Jimmy
Jimmy is 'really' handsome'''
print(my_string)
# My name is Jimmy
# Jimmy is 'really' handsome
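To make the difference concrete, here is a minimal sketch comparing a triple-quoted literal with a single-quoted one that spells the line break as \n. The variable names are made up for this illustration.

```python
# A triple-quoted literal stores the line break as a real newline character.
multi_line = """My name is Jimmy
Jimmy is handsome"""

# A single-quoted literal needs an explicit \n to produce the same string.
single_line = 'My name is Jimmy\nJimmy is handsome'

print(multi_line == single_line)
# True
print(multi_line.count('\n'))
# 1
```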
1.4 str class
You can also create a string object by calling the str class, which accepts the following parameters:
str(object, encoding='utf-8', errors='strict')
1.4.1 parameter – object
The first parameter is any object that can be converted to a string:
my_string = str('Hello world')
print(my_string)
# Hello world
my_string = str(True)
print(my_string)
# True
my_string = str(None)
print(my_string)
# None
my_string = str(range(2))
print(my_string)
# range(0, 2)
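For your own classes, str() builds its result from the object's __str__ method. The sketch below assumes a hypothetical Point class that exists only for this illustration.

```python
class Point:
    """Hypothetical class used only for this illustration."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __str__(self):
        # str() calls this method to build the string.
        return f'({self.x}, {self.y})'


point_text = str(Point(1, 2))
print(point_text)
# (1, 2)
```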
1.4.2 parameter – encoding
When the first argument is a bytes object, encoding specifies which codec is used to decode it into a string; when decoding, the default is UTF-8:
# Declare a bytes object
b = bytes('Café', encoding='utf-8')
# Try to decode the UTF-8 bytes as ASCII
print(str(b, encoding='ascii'))
# UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 3: ordinal not in range(128)
If the bytes cannot be decoded, a UnicodeDecodeError is raised. The third argument controls how such errors are handled.
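For contrast, a small sketch of the two non-failing cases: decoding with the matching codec, and calling str() on bytes with no encoding at all. In the second case str() does not decode; it returns the printable representation of the bytes object. The variable names are chosen for this example.

```python
b = bytes('Café', encoding='utf-8')

# With the matching encoding, the bytes decode back to the original text.
decoded = str(b, encoding='utf-8')
print(decoded)
# Café

# Without encoding or errors, str() returns the repr of the bytes object.
not_decoded = str(b)
print(not_decoded)
# b'Caf\xc3\xa9'
```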
1.4.3 parameter – errors
The third parameter, errors, determines how characters that cannot be decoded are handled. The options are:
- strict – the default; raises a UnicodeDecodeError exception on failure
- ignore – drops the undecodable bytes from the result
- replace – replaces each undecodable byte with the replacement character U+FFFD (�); when encoding, a question mark ? is used instead
- xmlcharrefreplace – inserts an XML character reference; only supported when encoding
- backslashreplace – inserts a backslash escape sequence (\xNN when decoding)
- namereplace – inserts a \N{…} escape sequence; only supported when encoding
Source: Python String encode() – Programiz
b = bytes('Café', encoding='utf-8')
print(str(b, encoding='ascii', errors='ignore'))
# Caf
print(str(b, encoding='ascii', errors='replace'))
# Caf��
print(str(b, encoding='ascii', errors='backslashreplace'))
# Caf\xc3\xa9
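One way to combine these options is to decode strictly first and fall back to a lossy mode only on failure. The helper below, decode_with_fallback, is a hypothetical name for this sketch and not part of the standard library.

```python
def decode_with_fallback(data, encoding='ascii'):
    """Hypothetical helper: decode strictly, fall back to errors='replace'."""
    try:
        return str(data, encoding=encoding)
    except UnicodeDecodeError:
        # Each undecodable byte becomes U+FFFD instead of raising.
        return str(data, encoding=encoding, errors='replace')


b = bytes('Café', encoding='utf-8')
fallback_text = decode_with_fallback(b)
print(len(fallback_text))
# 5
print(decode_with_fallback(b'abc'))
# abc
```

The length is 5 because the two bytes of é are each replaced by one U+FFFD character.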
1.5 F-string, f-string (Python 3.6+)
F-strings are usually used for string formatting (the next section explains why formatting is needed), but they can also declare a string object. Put F or f in front of any of the quote styles above:
my_string = F'Hello World'
print(my_string)
# Hello World
my_string = f'Hello World'
print(my_string)
# Hello World
my_string = f"""Hello
World"""
print(my_string)
# Hello
# World
2. Python String Formatting
String formatting serves two main purposes:
- letting a string take in variables
- converting variables of other types to strings directly
Python offers the following ways to format strings:
2.1 % Operator
The % operator replaces conversion specifiers inside a string with the arguments that follow it. The conversion specifiers are:
- %c – single character
- %s – string
- %d, %i – signed decimal integer
- %f – floating-point number in decimal format
- %o – octal integer
- %x – hexadecimal integer using lowercase letters (a-f)
- %X – hexadecimal integer using uppercase letters (A-F)
- %u – obsolete; identical to %d
- %e – exponential notation with a lowercase 'e'
- %E – exponential notation with an uppercase 'E'
- %g – uses %e when the exponent is less than -4 or not less than the precision, and %f otherwise
- %G – like %g, but uses %E instead of %e
The value after % can be a literal or a variable. If its type does not match the conversion specifier, an error such as TypeError may be raised:
my_string = '%d is an integer' % 10
print(my_string)
# 10 is an integer
my_integer = 10
my_string = '%d is an integer' % my_integer
print(my_string)
# 10 is an integer
my_string = '%d is an integer' % 'a'
print(my_string)
# TypeError: %d format: a real number is required, not str
Examples:
my_char = 'a'
my_string = '%c is a character' % my_char
print(my_string)
# a is a character
my_int = 10
my_string = '%i is an integer' % my_int
print(my_string)
# 10 is an integer
my_float = 10.5
my_string = '%f is a float' % my_float
print(my_string)
# 10.500000 is a float
my_int = 15
my_string = "%i in octal is %o" % (my_int, my_int)
print(my_string)
# 15 in octal is 17
my_int = 15
my_string = "%i in hex is %x" % (my_int, my_int)
print(my_string)
# 15 in hex is f
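Adding a number after % sets the minimum number of characters used for the value. A positive width right-aligns the value, and a leading - left-aligns it: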
my_city = "Taipei"
my_string = "%s is my home." % my_city
print(my_string)
# Taipei is my home.
my_city = "Taipei"
my_string = "%10s is my home." % my_city
print(my_string)
#     Taipei is my home.
my_city = "Taipei"
my_string = "%-10s is my home." % my_city
print(my_string)
# Taipei     is my home.
With %f, adding .N after % sets the number of digits after the decimal point (6 by default):
my_float = 10.6354687
my_string = "There is a float: %f" % my_float
print(my_string)
# There is a float: 10.635469
my_float = 10.6354687
my_string = "There is a float: %.2f" % my_float
print(my_string)
# There is a float: 10.64
With %x, adding 0N after % sets the minimum number of characters for the hexadecimal number and pads it with leading zeros:
my_int = 15
my_string = "%i in hex is %x" % (my_int, my_int)
print(my_string)
# 15 in hex is f
my_int = 15
my_string = "%i in hex is %04x" % (my_int, my_int)
print(my_string)
# 15 in hex is 000f
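Width and precision can be combined in one specifier and applied to several values through a tuple. A minimal sketch, assuming a made-up price list:

```python
# Hypothetical price list used only for this illustration.
items = [('apple', 1.5), ('banana', 12.25)]

# %-8s left-aligns the name in 8 characters; %6.2f right-aligns the price
# in 6 characters with 2 digits after the decimal point.
rows = ['%-8s|%6.2f' % (name, price) for name, price in items]
for row in rows:
    print(row)
# apple   |  1.50
# banana  | 12.25
```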
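2.2 str.format() (Python 2.6+)
str.format() is available from Python 2.6 onward (the empty {} placeholders used below require 2.7+). The string uses {} as placeholders, and the variables or objects passed as arguments to format() replace them: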
my_string = 'My name is {}, my age is {}'.format('Jimmy', 10)
print(my_string)
# My name is Jimmy, my age is 10
name = 'Jimmy'
age = 10
my_string = 'My name is {}, my age is {}'.format(name, age)
print(my_string)
# My name is Jimmy, my age is 10
A placeholder can contain the positional index of the argument to insert:
my_string = 'My name is {1}, my age is {0}'.format(10, 'Jimmy')
print(my_string)
# My name is Jimmy, my age is 10
A placeholder can also contain a keyword argument name:
my_string = 'My name is {name}, my age is {age}'.format(age = 10, name = 'Jimmy')
print(my_string)
# My name is Jimmy, my age is 10
2.2.1 Formatting numbers
The , option adds thousands separators to a number:
'{:,}'.format(1234567890)
# '1,234,567,890'
2.2.2 Formatting datetime
import datetime
d = datetime.datetime(2010, 7, 4, 12, 15, 58)
'{:%Y-%m-%d %H:%M:%S}'.format(d)
# '2010-07-04 12:15:58'
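The part after the colon in {:,} and {:%Y-%m-%d %H:%M:%S} is the format specification. The same slot also accepts alignment, width, and precision. A short sketch with illustrative variable names:

```python
# < left-aligns, > right-aligns, ^ centers within the given width.
left = '[{:<8}]'.format('Jimmy')
right = '[{:>8}]'.format('Jimmy')
center = '[{:^9}]'.format('Jimmy')
# .2f keeps two digits after the decimal point.
rounded = '{:.2f}'.format(10.6354687)

print(left)
# [Jimmy   ]
print(right)
# [   Jimmy]
print(center)
# [  Jimmy  ]
print(rounded)
# 10.64
```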
2.3 F-string / f-string ( Python 3.6+ )
F-strings are supported from Python 3.6 onward and make code more readable.
When a string takes in many variables, str.format() becomes rather long:
my_name = 'Jimmy'
my_age = 10
my_city = 'Taipei city'
my_string = 'My name is {my_name}, my age is {my_age}, I\'m living in {my_city}'.format(
my_name = my_name,
my_age = my_age,
my_city= my_city
)
print(my_string)
# My name is Jimmy, my age is 10, I'm living in Taipei city
With an f-string, the variables go directly inside the string:
my_name = 'Jimmy'
my_age = 10
my_city = 'Taipei city'
my_string = f'My name is {my_name}, my age is {my_age}, I\'m living in {my_city}'
print(my_string)
# My name is Jimmy, my age is 10, I'm living in Taipei city
You can also evaluate expressions directly inside {}:
my_age = 10
years = 15
my_string = f'My age after {years} years is {my_age + years}'
print(my_string)
# My age after 15 years is 25
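F-strings accept the same format specifications as str.format(), written after a colon inside the braces, as well as method calls and the !r conversion. A brief sketch with made-up values:

```python
my_float = 10.6354687
big_number = 1234567890
name = 'Jimmy'

rounded = f'{my_float:.2f}'          # precision
grouped = f'{big_number:,}'          # thousands separators
quoted = f'{name!r}'                 # repr() of the value
summary = f'{name.upper()} has {len(name)} letters'

print(rounded)
# 10.64
print(grouped)
# 1,234,567,890
print(quoted)
# 'Jimmy'
print(summary)
# JIMMY has 5 letters
```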
2.4 Template Strings (Standard Library)
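F-strings and str.format() are convenient, but if the format string itself comes from user input, the user may be able to read sensitive information. The example below uses str.format():
# A sensitive variable
SECRET = 'this-is-a-secret'
class Error:
    def __init__(self):
        pass
# The user reaches the global variables through the object passed to format()
user_input = '{error.__init__.__globals__[SECRET]}'
err = Error()
print(user_input.format(error=err))
# this-is-a-secret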
Template strings avoid this problem:
from string import Template
user_input = '${error.__init__.__globals__[SECRET]}'
Template(user_input).substitute(error=err)
# ValueError: Invalid placeholder in string: line 1, col 1
To use Template strings, first import the Template class from the string module. Prefix each variable to substitute with $:
from string import Template
s = Template('Hey, $name!')
print(s.substitute(name = 'Jimmy'))
# Hey, Jimmy!
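When a placeholder has no matching value, substitute() raises KeyError, while safe_substitute() leaves the placeholder in place. A minimal sketch, with a template text invented for this example:

```python
from string import Template

greeting = Template('Hey, $name! You live in $city.')

# substitute() raises KeyError when a placeholder has no value.
try:
    greeting.substitute(name='Jimmy')
except KeyError as exc:
    missing = exc.args[0]
    print(f'missing placeholder: {missing}')
# missing placeholder: city

# safe_substitute() leaves unknown placeholders untouched instead.
partial = greeting.safe_substitute(name='Jimmy')
print(partial)
# Hey, Jimmy! You live in $city.
```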
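2.5 Conclusion – which string formatting method to use
- The string (including its placeholders) comes directly from user input: use Template strings
- The string does not come from user input, and you are on Python 3.6+: use f-strings
- The string does not come from user input, and you are on a version older than Python 3.6: use str.format()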
3. String Operator
3.1 + (Concatenate Operator)
string1 = 'Hello '
string2 = 'World'
print(string1 + string2)
# Hello World
print(string1)
# Hello
print(string2)
# World
3.2 * (Repetition Operator)
string = 'Hi'
print(string * 3)
# HiHiHi
3.3 [] (Slicing Operator)
string = 'Jimmy'
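# Get the character at a given index
print(string[0])
# J
# Get the last character
print(string[-1])
# y
# Get the substring from index 1 up to index 4 (start included, end excluded)
print(string[1:4])
# imm
# Get the substring from index 1 onward (index 1 included)
print(string[1:])
# immy
# Get the substring before index 4 (index 4 excluded)
print(string[:4])
# Jimm
# Reverse the string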
print(string[::-1])
# ymmiJ
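Two details of slicing are easy to miss: the third slot is a step, and out-of-range slice bounds are clipped while an out-of-range index raises an error. A short sketch, reusing the same string:

```python
string = 'Jimmy'

# A slice with a step of 2 takes every second character.
every_second = string[::2]
print(every_second)
# Jmy

# Out-of-range slice bounds are clipped instead of raising an error.
clipped = string[2:100]
print(clipped)
# mmy

# An out-of-range index, however, raises IndexError.
try:
    string[100]
except IndexError as exc:
    index_error = str(exc)
    print(index_error)
# string index out of range
```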
3.4 in, not in (Membership Operator)
Checks whether a string contains another string:
string = 'Jimmy'
print('J' in string)
# True
print('Ji' in string)
# True
print('j' in string)
# False
print('k' in string)
# False
print('k' not in string)
# True
3.5 \ (Escape Sequence Operator)
Used to represent special characters inside a string:
- \' – single quote (')
- \" – double quote (")
- \\ – backslash (\)
- \n – newline
- \r – carriage return (moves the cursor to the start of the line)
- \t – tab
- \b – backspace
- \f – form feed
- \xhh – character with hex value hh
- \ooo – character with octal value ooo
string = 'Jimmy is handsome \nbut poor'
print(string)
# Jimmy is handsome
# but poor
string = 'Jimmy is \n\tgreat \n\ttall \n\tpoor'
print(string)
# Jimmy is
#     great
#     tall
#     poor
4. String Attribute
Ways to inspect the properties of a string object
4.1 len()
Returns the length of a string object
greet = 'Hello'
len(greet)
# 5
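4.2 r""
Declares a raw string, in which backslashes are kept as literal characters instead of starting escape sequences
string = 'Jimmy is handsome \nbut poor'
print(string)
# Jimmy is handsome
# but poor
string = r'Jimmy is handsome \nbut poor'
print(string)
# Jimmy is handsome \nbut poor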
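Raw strings are commonly used for Windows-style paths and regular expressions, where backslashes are frequent. The sketch below uses a made-up path to show how the two literal forms differ in length.

```python
# In a normal literal, \n and \t are turned into newline and tab.
normal = 'C:\new\table'
# In a raw literal, every backslash stays a backslash.
raw = r'C:\new\table'

print(len(normal))
# 10
print(len(raw))
# 12
print(raw == 'C:\\new\\table')
# True
```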
4.3 find
str.find(sub[, start[, end]])
Searches for sub within the slice str[start:end] and returns the index of the first match, or -1 if there is none
string = 'Hello World'
print(string.find('W'))
# 6
print(string.find('W', 0, 5))
# -1
print(string.find('or'))
# 7
print(string.find('z'))
# -1
4.4 index
str.index(sub[, start[, end]])
Like find, but raises a ValueError when sub is not found
string = 'Hello World'
print(string.index('W'))
# 6
print(string.index('or'))
# 7
print(string.index('z'))
# ValueError: substring not found
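The practical difference between find and index is how the "not found" case is handled: a return value to check versus an exception to catch. A minimal sketch of both patterns, with result variables named for this example:

```python
string = 'Hello World'

# find() signals "not found" with -1, so check the return value.
position = string.find('z')
if position == -1:
    find_result = 'not found'
else:
    find_result = f'found at {position}'
print(find_result)
# not found

# index() signals "not found" with an exception, so catch ValueError.
try:
    index_result = f'found at {string.index("World")}'
except ValueError:
    index_result = 'not found'
print(index_result)
# found at 6
```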
4.5 rfind
str.rfind(sub[, start[, end]])
Like find, but returns the index of the last match, or -1 if there is none
string = 'Hello World'
print(string.rfind('o'))
# 7
print(string.rfind('o', 0, 5))
# 4
print(string.rfind('z'))
# -1
4.6 rindex
str.rindex(sub[, start[, end]])
Like index, but returns the index of the last match, and raises a ValueError if there is none
string = 'Hello World'
print(string.rindex('o'))
# 7
print(string.rindex('or'))
# 7
print(string.rindex('z'))
# ValueError: substring not found
4.7 count
str.count(sub[, start[, end]])
Counts the non-overlapping occurrences of sub in str
string = 'Hello World'
print(string.count('l'))
# 3
print(string.count('l', 0, 5))
# 2
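Because count only counts non-overlapping matches, the result can be smaller than expected for repeating patterns. The overlapping count below is one possible workaround built on startswith, not a built-in feature.

```python
text = 'aaaa'

# count() finds 'aa' at index 0 and index 2 only.
non_overlapping = text.count('aa')
print(non_overlapping)
# 2

# One way to count overlapping matches: test every start position.
overlapping = sum(1 for i in range(len(text)) if text.startswith('aa', i))
print(overlapping)
# 3
```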
5. String Methods – Operations (the original object is unchanged)
This section introduces commonly used built-in string methods
5.1 upper
Converts all characters to uppercase
string = 'hello world'
print(string.upper())
# HELLO WORLD
print(string)
# hello world
5.2 lower
Converts all characters to lowercase
string = 'Hello WORLD'
print(string.lower())
# hello world
print(string)
# Hello WORLD
5.3 capitalize
Capitalizes the first character and lowercases the rest
string = 'hello world'
print(string.capitalize())
# Hello world
print(string)
# hello world
5.4 swapcase
Swaps uppercase and lowercase
string = 'Hello World'
print(string.swapcase())
# hELLO wORLD
print(string)
# Hello World
5.5 replace
str.replace(old, new[, count])
Replaces occurrences of old in the string with new; the optional count limits the number of replacements
string = 'Hello World'
print(string.replace('H', ''))
# ello World
print(string.replace('Hello', 'Wow'))
# Wow World
print(string)
# Hello World
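The optional count argument limits how many occurrences are replaced, starting from the left. A quick sketch with illustrative variable names:

```python
string = 'Hello World'

# Without count, every occurrence is replaced.
all_replaced = string.replace('l', 'L')
print(all_replaced)
# HeLLo WorLd

# With count=1, only the first occurrence is replaced.
first_only = string.replace('l', 'L', 1)
print(first_only)
# HeLlo World
```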
5.6 strip
str.strip([chars])
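The chars argument is the set of characters to remove from both ends of the string object; it is treated as individual characters, not as a prefix or suffix. If omitted, whitespace is removed by default
' spacious '.strip()
# 'spacious'
'www.example.com'.strip('cmowz.')
# 'example'
comment_string = '#....... Section 3.2.1 Issue #32 .......'
comment_string.strip('.#! ')
# 'Section 3.2.1 Issue #32'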
5.7 removeprefix (3.9+)
Removes a prefix string if the string starts with it
string = 'HelloWorld'
print(string.removeprefix('Hello'))
# World
print(string.removeprefix('World'))
# HelloWorld
print(string)
# HelloWorld
string = 'tw-text-bold'
print(string.removeprefix('tw-'))
# text-bold
print(string)
# tw-text-bold
5.8 removesuffix (3.9+)
Removes a suffix string if the string ends with it
string = 'HelloWorld'
print(string.removesuffix('Hello'))
# HelloWorld
print(string.removesuffix('World'))
# Hello
print(string)
# HelloWorld
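A common mistake is to use lstrip or rstrip to drop a prefix or suffix. Those methods remove any of the given characters, not the exact substring. The sketch below uses a made-up class name to show the difference (removeprefix requires Python 3.9+).

```python
string = 'tw-twist'

# lstrip treats 'tw-' as the set {'t', 'w', '-'} and keeps stripping.
stripped = string.lstrip('tw-')
print(stripped)
# ist

# removeprefix removes the exact prefix once.
without_prefix = string.removeprefix('tw-')
print(without_prefix)
# twist
```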
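6. String Methods – Converting to other objects
6.1 split
str.split(sep=None, maxsplit=-1)
Converts a string to a list. The sep argument selects the delimiter to split on; if omitted, the string is split on runs of whitespace: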
string = 'good great bad'
print(string.split())
# ['good', 'great', 'bad']
string = 'apple,banana,orange'
print(string.split(','))
# ['apple', 'banana', 'orange']
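The maxsplit argument caps the number of splits, and str.join is the usual way to go back from a list to a string. A short sketch with illustrative variable names:

```python
string = 'apple,banana,orange'

# maxsplit=1 splits only at the first delimiter.
limited = string.split(',', maxsplit=1)
print(limited)
# ['apple', 'banana,orange']

# join() reverses split(): it glues list items with a separator.
joined = ' | '.join(string.split(','))
print(joined)
# apple | banana | orange

# With an explicit sep, consecutive delimiters produce empty strings.
with_empty = 'a,,b'.split(',')
print(with_empty)
# ['a', '', 'b']
```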
6.2 partition
str.partition(sep)
Splits the string at the first occurrence of sep into three parts and returns them as a tuple:
string = 'I\'m very handsome!'
print(string.partition('very'))
# ("I'm ", 'very', ' handsome!')
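partition always returns a 3-tuple, even when sep is missing, and rpartition splits at the last occurrence instead. The key=value string below is invented for this sketch.

```python
pair = 'key=value=x'

# partition splits at the first '='.
first_split = pair.partition('=')
print(first_split)
# ('key', '=', 'value=x')

# rpartition splits at the last '='.
last_split = pair.rpartition('=')
print(last_split)
# ('key=value', '=', 'x')

# If sep is not found, the whole string comes first, followed by two empty strings.
no_sep = 'novalue'.partition('=')
print(no_sep)
# ('novalue', '', '')
```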
6.3 encode
str.encode(encoding='utf-8', errors='strict')
Encodes str into bytes using the specified encoding
my_bytes = 'hello'.encode()
print(my_bytes)
# b'hello'
print(my_bytes.__class__)
# <class 'bytes'>
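With non-ASCII text, the encoded bytes can be longer than the string, and bytes.decode reverses the operation. A minimal round-trip sketch, reusing the 'Café' example:

```python
text = 'Café'

# é takes two bytes in UTF-8, so 4 characters become 5 bytes.
encoded = text.encode('utf-8')
print(encoded)
# b'Caf\xc3\xa9'
print(len(text), len(encoded))
# 4 5

# decode() with the same encoding restores the original string.
decoded = encoded.decode('utf-8')
print(decoded == text)
# True

# When encoding, errors='replace' substitutes a question mark.
ascii_replaced = text.encode('ascii', errors='replace')
print(ascii_replaced)
# b'Caf?'
```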
7. Iterating over a string
7.1 for … in
string = 'Hello'
for s in string:
    print(s)
# H
# e
# l
# l
# o
7.2 for … in with enumerate
Besides the value, you can also get the index:
string = 'Hello'
for i, s in enumerate(string):
    print(f'index: {i}, char: {s}')
# index: 0, char: H
# index: 1, char: e
# index: 2, char: l
# index: 3, char: l
# index: 4, char: o
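enumerate also accepts a start argument, which is handy when positions should be numbered from 1. A brief sketch that collects the lines in a list:

```python
string = 'Hello'

# start=1 makes the counter begin at 1 instead of 0.
numbered = [f'{i}: {s}' for i, s in enumerate(string, start=1)]
for line in numbered:
    print(line)
# 1: H
# 2: e
# 3: l
# 4: l
# 5: o
```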
References
Python str() function – GeeksforGeeks
Built-in Types — Python 3.11.4 documentation
Python String encode() – Programiz
[Python] 字串格式化. 前言| by Tsung-Yu | Tom’s blog – Medium
Python String Formatting Best Practices
Python String Interpolation with the Percent (%) Operator
string — Common string operations — Python 3.11.4
Examples of String Operators in Python – EDUCBA
Iterating each character in a string using Python – Stack Overflow