About Software: [LeetCode] 187. Repeated DNA Sequences

2017年5月1日星期一

[LeetCode] 187. Repeated DNA Sequences

All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.

Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.

For example,

Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT",

Return:
["AAAAACCCCC", "CCCCCAAAAA"].

<Solution>
這題是要在一個 DNA 字串中，找出重複的 DNA 組成

想法如下

總共有四個字元，分別編碼 A=0、C=1、G=2、T=3，也就是用 2 bits 來表示四個不同的值

先將前面 9 個字元組成一個基底 DNA 字串

從第10個字元開始，先從上一次的組成取出後 18位元 (和 0x3ffff 做 and)，然後和目前的字元組成一個 10 字元的 DNA 字串

檢查目前最新組成的 DNA 字串，是否已經存在 hash map 裡面了，如果有的話，就放到答案裡。這邊要注意一點，不論同一個 DNA 字串重複幾次，都只放一次到答案裡面

code如下(參考資料)

About Software

2017年5月1日星期一

[LeetCode] 187. Repeated DNA Sequences

沒有留言:

張貼留言

關於我自己

2017年5月1日 星期一

[LeetCode] 187. Repeated DNA Sequences

沒有留言:

張貼留言

2017年5月1日星期一