Repeated DNA Sequences

Posted: 2016-02-04 06:42:21

All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.

Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.

For example,

Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT",

Return:
["AAAAACCCCC", "CCCCCAAAAA"].

 

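Analysis: Use a hash map to count how many times each 10-letter sequence occurs. This takes two passes: the first slides a 10-character window over the string and increments the count for each substring; the second walks the map and puts every sequence that appears more than once into the result vector.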

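#include <string>
#include <unordered_map>
#include <vector>
using namespace std;

class Solution {
public:
    vector<string> findRepeatedDnaSequences(string s) {
        vector<string> result;
        // With fewer than 11 characters there is at most one 10-letter window, so nothing can repeat.
        if (s.length() < 11) return result;

        // First pass: count every 10-letter window.
        unordered_map<string, int> um;
        for (size_t i = 0; i + 10 <= s.size(); i++) {
            um[s.substr(i, 10)]++;
        }
        // Second pass: collect the windows that were seen more than once.
        for (unordered_map<string, int>::iterator ite = um.begin(); ite != um.end(); ite++) {
            if (ite->second > 1)
                result.push_back(ite->first);
        }
        return result;
    }
};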
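
To try the counting idea outside the judge, here is a minimal standalone sketch of the same two-pass approach. The function name findRepeatedSorted and the final sort are assumptions added for illustration and are not part of the original solution; the sort is there only because unordered_map iterates in an unspecified order, which would otherwise make the output hard to compare.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper (not from the original post): counts every 10-letter
// window, then returns the ones seen more than once in sorted order.
std::vector<std::string> findRepeatedSorted(const std::string& s) {
    const std::size_t kLen = 10;  // window length fixed by the problem statement
    std::vector<std::string> result;
    if (s.size() <= kLen) return result;  // at most one window, nothing can repeat

    // First pass: count each window of kLen characters.
    std::unordered_map<std::string, int> counts;
    for (std::size_t i = 0; i + kLen <= s.size(); ++i) {
        ++counts[s.substr(i, kLen)];
    }

    // Second pass: keep the windows that occurred more than once.
    for (const auto& entry : counts) {
        if (entry.second > 1) result.push_back(entry.first);
    }

    // Sorting is an addition for deterministic output; the problem does not require it.
    std::sort(result.begin(), result.end());
    return result;
}
```

Called on the example string "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT", this returns "AAAAACCCCC" and "CCCCCAAAAA". Each window is copied into the map as a 10-character key, so time and memory are both proportional to the string length.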

Original article: http://www.cnblogs.com/amazingzoe/p/5180884.html