A gene string can be represented by an 8-character long string, with choices from "A" , "C" , "G" , "T" .
Suppose we need to investigate about a mutation (mutation from "start" to "end"), where ONE mutation is defined as ONE single character changed in the gene string.
Suppose we need to investigate about a mutation (mutation from "start" to "end"), where ONE mutation is defined as ONE single character changed in the gene string.
For example, "AACCGGTT" -> "AACCGGTA" is 1 mutation.
Also, there is a given gene "bank", which records all the valid gene mutations. A gene must be in the bank to make it a valid gene string.
Now, given 3 things - start, end, bank, your task is to determine what is the minimum number of mutations needed to mutate from "start" to "end". If there is no such a mutation, return -1.
Now, given 3 things - start, end, bank, your task is to determine what is the minimum number of mutations needed to mutate from "start" to "end". If there is no such a mutation, return -1.
Note:
- Starting point is assumed to be valid, so it might not be included in the bank.
- If multiple mutations are needed, all mutations during in the sequence must be valid.
- You may assume start and end string is not the same.
Example 1:
start: "AACCGGTT" end: "AACCGGTA" bank: ["AACCGGTA"] return: 1
Example 2:
start: "AACCGGTT" end: "AAACGGTA" bank: ["AACCGGTA", "AACCGCTA", "AAACGGTA"] return: 2
Example 3:
start: "AAAAACCC" end: "AACCCCCC" bank: ["AAAACCCC", "AAACCCCC", "AACCCCCC"] return: 3
<Solution>
這題要使用 BFS (Breadth-First Search) 演算法來解題
BFS 的搜尋順序是「剛開始的狀態 -> 至少一次遷移可抵達的所有狀態
-> 至少兩次遷移可抵達的所有狀態 -> 至少三次遷移可抵達的所有狀態 -> ... 」
在 BFS 會使用 queue 來儲存每個要檢查的狀態
而因為 FIFO 的特性,離初始狀態比較近的狀態,都會被先檢查到
所以可以用來求出最短路線或最少步驟
這題來說,就從初始狀態出發
檢查 bank 中每一個字串,是否只要改變一個字元就可以符合
如果只要改變一個字元就可以符合,那就放到 queue,並標記使用過
最後和目標字串相等,就可以結束
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
class Solution { | |
public: | |
int minMutation(string start, string end, vector<string>& bank) { | |
//>> pair<string, int> : | |
//>> string: the gene string | |
//>> int: mutation number start from gene start string | |
queue<pair<string,int>> geneQue; | |
geneQue.push({start, 0}); | |
const int len = bank.size(); | |
vector<bool> isUsed(len, false); | |
//> BFS | |
while(!geneQue.empty()) { | |
const string gene = geneQue.front().first; | |
const int step = geneQue.front().second; | |
//> find ans | |
if(end.compare(gene) == 0) { | |
return step; | |
} | |
//>> put all posibile string into queue | |
for(int i = 0; i < len; i++) { | |
if(!isUsed[i] && diffCnt(gene, bank[i]) == 1) { | |
geneQue.push({bank[i], step+1}); | |
isUsed[i] = true; | |
} | |
} | |
geneQue.pop(); | |
} | |
return -1; | |
} | |
//>> check how many characters are different between two strings | |
int diffCnt(const string &str1, const string &str2) { | |
int cnt = 0; | |
for(int i = 0; i < str1.length(); i++) { | |
if(str1[i] != str2[i]) { | |
++cnt; | |
if(cnt > 1) { | |
return cnt; | |
} | |
} | |
} | |
return cnt; | |
} | |
}; |
思路如下
- 準備兩個 set,以及兩個指標,分別指到 size 比較小的 set 和 size 比較大的 set
- 要區分大小 set 的原因也是用來加速,每次都只歷遍比較小的 set 去找答案
- 如果在比較小的 set 中的 word,可以變一個字就變成在比較大的 set 中的 word,這樣就代表找到答案了。因為兩個 set 分別從兩端來找,當一個 set 裡面的 word 可以變化成另外一個 set 裡面的 word 的時候,就是相遇點,代表可以把兩端給串起來
- 如果不在另一個 set 裡面,那就檢查變化後的 word 是否可用,可用的話就放到 nextReachableSet,最後在和 smallerSet 做 swap,用來進行下次的檢查
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
class Solution { | |
public: | |
int minMutation(string start, string end, vector<string>& bank) { | |
//>> transform bank to hash map | |
unordered_set<string> bankSet; | |
for(const auto &g : bank) { | |
bankSet.insert(g); | |
} | |
//>> prepare two sets. one is from head and the other is from end | |
unordered_set<string> headSet, tailSet, *smallerSetPtr, *biggerSetPtr; | |
headSet.insert(start); | |
//>> check end is reachable or not | |
if(bankSet.find(end) != bankSet.end()) { | |
tailSet.insert(end); | |
} | |
int step = 0; | |
const int len = start.length(); | |
while(!headSet.empty() && !tailSet.empty()) { | |
++step; | |
smallerSetPtr = (headSet.size() < tailSet.size()) ? &headSet : &tailSet; | |
biggerSetPtr = (headSet.size() < tailSet.size()) ? &tailSet : &headSet; | |
unordered_set<string> nextReachableSet; | |
for(const auto &gene : *smallerSetPtr) { | |
for(int i = 0; i < len; i++) { | |
string newGene = gene; | |
for(int j = 0; j < 4; j++) { | |
char ch; | |
switch(j) { | |
case 0: | |
ch = 'A'; | |
break; | |
case 1: | |
ch = 'C'; | |
break; | |
case 2: | |
ch = 'G'; | |
break; | |
case 3: | |
ch = 'T'; | |
break; | |
} | |
newGene[i] = ch; | |
//>> find ans | |
if(biggerSetPtr->find(newGene) != biggerSetPtr->end()) { | |
return step; | |
} | |
//>> find valid next step | |
if(bankSet.find(newGene) != bankSet.end()) { | |
nextReachableSet.insert(newGene); | |
bankSet.erase(newGene); | |
} | |
} | |
} | |
} | |
//>> change smallerSet to nextReachableSet | |
swap(*smallerSetPtr, nextReachableSet); | |
} | |
return -1; | |
} | |
}; |
沒有留言:
張貼留言