WWW New Member Training (third session)
Transaction, Migration, Worker
黃瑞安 2017
Transaction
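In our venue reservation flow, the first step looks like this:
After a user submits an application form, we have to check whether the requested venue is free for the requested time slot.
How do we write that logic?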
Transaction
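Following our earlier design, we have a reservations table. Whenever an application is submitted successfully, we add a row to this table with the reservation information for that application, where:
state is the reservation state. Per our earlier state transition diagram, it takes one of three values: 使用者佔用 (occupied by user), 可借用 (available), 管理員佔用 (occupied by admin)
field_id refers to the venue id
application_id refers to the application id
date, start_at, end_at are the reservation date, start time, and end time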
Transaction
So our first step is to query whether the venue is already taken by another application:
SELECT count(1)
FROM reservations
WHERE field_id = ${my_field_id}
AND state != '可借用'
AND date = ${my_date}
AND ${my_start_at} < end_at
AND ${my_end_at} > start_at;
If the returned count is 0, the venue is not occupied during that time slot, so we can insert a row:
INSERT INTO reservations (field_id, state, date, start_at, end_at, ...) VALUES
(${my_field_id}, '使用者佔用', ${my_date}, ${my_start_at}, ${my_end_at});
That locks the time slot. We can then create the application and continue with the rest of the flow:
INSERT INTO applications (...) VALUES (...);
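The two time comparisons in the query are the standard interval-overlap test. A minimal Python sketch of the same predicate, assuming half-open slots (a reservation ending at 12:00 does not block one starting at 12:00); the function names `overlaps` and `is_slot_free` are illustrative and not part of the original system:

```python
from datetime import time


def overlaps(start_a: time, end_a: time, start_b: time, end_b: time) -> bool:
    """Return True when the slots [start_a, end_a) and [start_b, end_b) share any time.

    This mirrors the SQL condition:
        my_start_at < end_at AND my_end_at > start_at
    """
    return start_a < end_b and end_a > start_b


def is_slot_free(existing, start: time, end: time) -> bool:
    """Return True when no (start_at, end_at) pair in `existing` overlaps the request."""
    return not any(overlaps(start, end, s, e) for s, e in existing)
```

For example, `is_slot_free([(time(9), time(11))], time(10), time(12))` is False because 10:00 to 11:00 is shared, while a request for 11:00 to 12:00 would be accepted.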
Transaction
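So our procedure has three steps:
1. Check whether the venue is occupied during the time slot
2. If not, create the venue occupation record
3. Then create the application
But this procedure has two potential problems:
1. What if the program throws an error or crashes after creating the occupation record, so the application is never created?
2. What if two people submit applications for the same venue and time slot at the same time?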
Transaction
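Most RDBMSs provide the ACID properties through transactions. These four properties matter a great deal for business logic and make development much easier:
Atomicity: A transaction executes as a single unit. Either all of its database operations are applied or none of them are.
Consistency: A transaction must take the database from one consistent state to another consistent state. A consistent state means that the data in the database satisfies its integrity constraints.
Isolation: When multiple transactions run concurrently, the execution of one should not affect the execution of the others.
Durability: Changes made by a committed transaction are stored permanently in the database.
Our first problem is solved by the atomicity of transactions: if an error occurs midway and the transaction never reaches COMMIT, the database performs a ROLLBACK for us.
The second problem is solved by setting the transaction's isolation level.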
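To make the atomicity point concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for the real database. The two-column tables, the `submit` helper, and the simulated crash are assumptions made for the demonstration only:

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with simplified versions of the two tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE reservations (field_id INTEGER, state TEXT)")
    conn.execute("CREATE TABLE applications (field_id INTEGER)")
    return conn


def submit(conn: sqlite3.Connection, field_id: int, crash_midway: bool = False) -> bool:
    """Insert the reservation and the application as one atomic unit."""
    try:
        # `with conn` commits if the block succeeds and rolls back if it raises.
        with conn:
            conn.execute(
                "INSERT INTO reservations (field_id, state) VALUES (?, ?)",
                (field_id, "使用者佔用"),
            )
            if crash_midway:
                raise RuntimeError("simulated crash before the application is created")
            conn.execute("INSERT INTO applications (field_id) VALUES (?)", (field_id,))
        return True
    except RuntimeError:
        return False


def count(conn: sqlite3.Connection, table: str) -> int:
    """Return the number of rows in one of the demo tables."""
    return conn.execute(f"SELECT count(*) FROM {table}").fetchone()[0]
```

When `submit` is called with `crash_midway=True`, the reservation row inserted before the failure is rolled back, so no occupation record is left behind without its application.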
Transaction Isolation Level
The isolation level is the setting for transaction concurrency control. It governs how one transaction may affect other concurrent transactions. First we need to understand four phenomena that can occur between concurrent transactions; each isolation level guarantees that some of them cannot happen.
The four phenomena:
Dirty Read: A transaction reads data written by a concurrent uncommitted transaction.
Nonrepeatable Read: A transaction re-reads data it has previously read and finds that data has been modified by another transaction.
Phantom Read: A transaction re-executes a query returning a set of rows that satisfy a search condition and finds that the set of rows satisfying the condition has changed due to another recently-committed transaction.
Serialization Anomaly: The result of successfully committing a group of transactions is inconsistent with all possible orderings of running those transactions one at a time.
Transaction Isolation Level
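The four isolation levels guarantee the following:
Isolation level    Dirty Read     Nonrepeatable Read   Phantom Read   Serialization Anomaly
Read Uncommitted   Possible       Possible             Possible       Possible
Read Committed     Not Possible   Possible             Possible       Possible
Repeatable Read    Not Possible   Not Possible         Possible       Possible
Serializable       Not Possible   Not Possible         Not Possible   Not Possible
In our case, the check for whether the venue is occupied must not be subject to phantom reads; otherwise two concurrent requests could both see a count of 0 and both insert a reservation. So we choose a Serializable transaction for this piece of business logic.
Note that with the Repeatable Read and Serializable levels, the database aborts a transaction and returns an error as soon as it detects that the transaction might violate the level's guarantees, so our business logic must be prepared for transactions to fail.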
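Being prepared for a failed transaction usually means retrying it. The sketch below shows one way to do that in Python. `SerializationFailure` is a hypothetical placeholder for whatever exception the database driver raises (PostgreSQL reports these failures with SQLSTATE 40001), and `run_with_retry` is an illustrative helper name, not an existing API:

```python
import time


class SerializationFailure(Exception):
    """Hypothetical stand-in for the driver's serialization-failure error."""


def run_with_retry(transaction, max_attempts: int = 3, base_delay: float = 0.0):
    """Run `transaction` (a callable that does BEGIN ... COMMIT) and retry on conflict.

    The whole transaction is re-run from the start, because the data it read
    may have changed. The last failure is re-raised once attempts run out.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except SerializationFailure:
            if attempt == max_attempts:
                raise
            # Wait a little longer after each failed attempt.
            time.sleep(base_delay * attempt)
```

The callable must contain the availability check as well as the inserts, so that each retry re-checks the slot against the latest committed data.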
Transaction
So our complete SQL sequence is:
BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
SELECT count(1)
FROM reservations
WHERE field_id = ${my_field_id}
AND state != '可借用'
AND date = ${my_date}
AND ${my_start_at} < end_at
AND ${my_end_at} > start_at;
-- if count == 0
INSERT INTO reservations (field_id, state, date, start_at, end_at, ...) VALUES
(${my_field_id}, '使用者佔用', ${my_date}, ${my_start_at}, ${my_end_at});
INSERT INTO applications (...) VALUES (...);
COMMIT;
Migration
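We usually write each change to the tables as a script and run it; this practice is called migration.
Operations such as creating tables, adding columns, adding indexes, and removing columns are classified as schema migrations.
A schema migration may also need an accompanying data migration, for example moving data from an old column to a new column.
When writing a migration script, make sure it can be rolled back if something goes wrong midway, especially when the migration touches data. The usual safe approach is to avoid destructive (delete-type) operations.
If the service must stay up during the migration, there is even more to consider.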
Zero Downtime Migration
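Migration is a very dangerous operation for a live system. One slip can lock a table or crash the running program and interrupt the service. A few points need special attention:
1. When adding a column, make sure its default value is null. If the default is not null, the database may update every existing row to set the default (this is the case, for example, in PostgreSQL before version 11), and the live program cannot access the table until the migration finishes.
2. Renaming a column has to be split into three migrations: add the new column, migrate the data to the new column, then drop the old column. The second step is very hard to do with zero downtime if it is done purely as a data migration; zero downtime requires help from the application layer. Take special care that no migration in the whole process crashes the live program.
3. When adding an index, use concurrent mode; otherwise the table is locked as well.
4. Some database objects, such as views and materialized views, may need a different update procedure. A view can be updated with replace, while for a materialized view you create a new one, drop the old one, and then rename the new one to the old name.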
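The three-step rename in point 2 can be sketched as follows, with Python's sqlite3 module standing in for the production database. The `fields` table, the `name` to `title` rename, and the batch size are assumptions for illustration. The backfill runs in small batches so that each transaction holds its locks only briefly:

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create a hypothetical `fields` table whose `name` column will be renamed."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE fields (id INTEGER PRIMARY KEY, name TEXT)")
    with conn:
        conn.executemany(
            "INSERT INTO fields (name) VALUES (?)",
            [(f"room-{i}",) for i in range(5)],
        )
    return conn


def add_new_column(conn: sqlite3.Connection) -> None:
    """Migration 1: add the new column as nullable, with no default."""
    conn.execute("ALTER TABLE fields ADD COLUMN title TEXT")
    conn.commit()


def backfill(conn: sqlite3.Connection, batch_size: int = 1000) -> int:
    """Migration 2: copy old values into the new column, one short batch at a time."""
    total = 0
    while True:
        with conn:  # one short transaction per batch
            cur = conn.execute(
                "UPDATE fields SET title = name WHERE rowid IN ("
                "  SELECT rowid FROM fields"
                "  WHERE title IS NULL AND name IS NOT NULL LIMIT ?)",
                (batch_size,),
            )
        if cur.rowcount == 0:
            return total
        total += cur.rowcount


# Migration 3 (dropping the old column) would run only after the application
# has stopped reading and writing `name`; it is left out of this sketch.
```

While the backfill is running, the application layer would need to write to both columns, which is the application-level help the rename requires.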
Worker
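Our requirements include emailing reviewers and applicants to notify them of various events. Sending email takes too long to finish inside a single HTTP request, so we turn it into a job and let a separate worker complete it. The worker can be a cronjob or a while loop that fetches pending jobs from the DB, or we can put jobs into a queue and let the worker consume them.
A job implementation has several things to consider:
1. Job state management. For our email example, we care about:
a. Is it waiting to be sent?
b. Is a worker already processing it?
c. Was it sent successfully?
d. Why did sending fail?
e. Does it need to be resent?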
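The questions above map naturally onto a small state machine. A minimal Python sketch, in which the state names, the `MailJob` class, and the retry limit are all illustrative assumptions rather than the original design:

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class JobState(Enum):
    PENDING = "pending"        # a. waiting to be sent
    PROCESSING = "processing"  # b. picked up by a worker
    SENT = "sent"              # c. sent successfully
    FAILED = "failed"          # d. failed; see last_error for the reason


# Which states each state may move to.
ALLOWED_TRANSITIONS = {
    JobState.PENDING: {JobState.PROCESSING},
    JobState.PROCESSING: {JobState.SENT, JobState.FAILED},
    JobState.FAILED: {JobState.PENDING},  # e. re-queue for another attempt
    JobState.SENT: set(),
}


@dataclass
class MailJob:
    recipient: str
    state: JobState = JobState.PENDING
    attempts: int = 0
    last_error: Optional[str] = None

    def move_to(self, new_state: JobState, error: Optional[str] = None) -> None:
        """Change state, rejecting transitions the table above does not allow."""
        if new_state not in ALLOWED_TRANSITIONS[self.state]:
            raise ValueError(f"illegal transition {self.state.name} -> {new_state.name}")
        if new_state is JobState.PROCESSING:
            self.attempts += 1
        if new_state is JobState.FAILED:
            self.last_error = error
        elif new_state is JobState.SENT:
            self.last_error = None
        self.state = new_state

    def should_retry(self, max_attempts: int = 3) -> bool:
        """A failed job is retried until it has used up its attempts."""
        return self.state is JobState.FAILED and self.attempts < max_attempts
```

Rejecting illegal transitions keeps two workers from both marking the same job as processing without anyone noticing.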
Worker
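When implementing a while-loop worker or a consumer worker, pay special attention to error handling and exception handling for external connections; otherwise the worker may die or sit idle.
These two kinds of worker should usually have a structure like this:
while (true) {
  try {
    // Connect inside the loop so a broken connection is re-established on the next pass.
    db_connection = db.connect();
    // fetch jobs from db or mq and change the state of jobs and then do jobs.
  } catch (err) {
    log(err);
  }
}
As shown above, the external connection should be established inside the while loop. That way, if the connection breaks, the next pass through the loop sets it up again. Otherwise the worker may well keep looping after the connection drops without doing any work.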
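The same loop structure in runnable Python might look like the sketch below. `connect` and `process_jobs` are hypothetical callables supplied by the caller, and `max_iterations` exists only so the loop can be stopped in a demonstration; a production worker would pass None and loop forever:

```python
import logging
import time


def run_worker(connect, process_jobs, max_iterations=None, retry_delay=1.0):
    """Poll for jobs forever, reconnecting on every pass through the loop.

    connect: callable returning a connection object that has a close() method.
    process_jobs: callable that fetches jobs over the connection and runs them.
    """
    iterations = 0
    while max_iterations is None or iterations < max_iterations:
        iterations += 1
        conn = None
        try:
            # The connection is created inside the loop, so a failure here
            # is retried on the next pass instead of killing the worker.
            conn = connect()
            process_jobs(conn)
        except Exception:
            logging.exception("worker pass failed; retrying")
            # Pause briefly so a dead dependency is not hammered in a tight loop.
            time.sleep(retry_delay)
        finally:
            if conn is not None:
                conn.close()
```

The short pause after a failure is an addition to the structure in the slides; without it, an unreachable database would make the loop spin at full speed.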
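That wraps up the new member training.
Please give us some feedback:
https://goo.gl/forms/bH3VuBhsItERwETf1
Next week we will send everyone an assessment assignment, and the week after that we will collect and review it. We hope everyone works through it carefully on their own!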

