692. Top K Frequent Words
程序员文章站
2022-04-25 19:06:48
...
692. Top K Frequent Words
Given a non-empty list of words, return the k most frequent elements.
Your answer should be sorted by frequency from highest to lowest. If two words have the same frequency, then the word with the lower alphabetical order comes first.
Example 1:
Input: ["i", "love", "leetcode", "i", "love", "coding"], k = 2
Output: ["i", "love"]
Explanation: "i" and "love" are the two most frequent words.
Note that "i" comes before "love" due to a lower alphabetical order.
Example 2:
Input: ["the", "day", "is", "sunny", "the", "the", "the", "sunny", "is", "is"], k = 4
Output: ["the", "is", "sunny", "day"]
Explanation: "the", "is", "sunny" and "day" are the four most frequent words,
with the number of occurrence being 4, 3, 2 and 1 respectively.
Note:
You may assume k is always valid, 1 ≤ k ≤ number of unique elements.
Input words contain only lowercase letters.
Follow up:
Try to solve it in O(n log k) time and O(n) extra space.
方法1: priority_queue
思路:
如果按照常规的HashMap + PriorityQueue 需要O(nlogn),不符合题目要求。优化成k个元素的PriorityQueue, 在放入O(n)个元素的过程中不断更替掉低频词而达到O(nlogk)
易错点:
- 要会写c++中的comparator。注意上文中的最后一段。因此priority_queue默认的comparator是std::less,从小到大排序但是先pop最大值。模仿这个写法,就应该是1. a.freq < b.freq, 2. a.word > b.word. 这样会优先词频,并在词频相等时先输出字母序靠前的单词。
- 但是这道题由于我们想在推堆的过程住不断替换掉最小值,所以应该用的是完全相反来实现让词频小且字母序靠后的在最前pop。也就是comparator定义为 1. a.freq > b.freq, 2. a.word < b.word。
- cmp的function定义之后需要有;
- priority的定义语法
- 最后还需要reverse一次
class Solution {
public:
vector<string> topKFrequent(vector<string>& words, int k) {
vector<string> result;
unordered_map<string, int> freqMap;
for (string word: words) {
++freqMap[word];
}
// 这里first是word, second是词频
// true的结果是a排在前面
auto cmp = [](pair<string, int> & a, pair<string, int> & b){
if (a.second == b.second){
return a.first < b.first;
}
return a.second > b.second;
};
priority_queue<pair<string, int>, vector<pair<string, int>>, decltype(cmp)> freqHeap(cmp);
for (auto item: freqMap) {
freqHeap.push(item);
if (freqHeap.size() > k){
freqHeap.pop();
}
}
for (int i = 0; i < k; i++){
result.push_back(freqHeap.top().first);
freqHeap.pop();
}
//先pop出来的小词频现在在前面,需要reverse
std::reverse(result.begin(), result.end());
return result;
}
};
方法2: bucket-sort
用map<int, set<string>>的结构来建立bucket,{freq: {word1, word2…}}。如果使用ordered_map, which is default type of map (in contrast to unordered_map), 就可以直接从map中取最大的词直至取满k个。map中的相同词频的用set<string>来储存,就可以自动排字母序,每次从后面开始取。
易错点:
- 从bucket中取,和从bucket中的second,也就是vector<\string> 当中取都应该是reverse。用到rbegin(), rend()
- while (!tmp2.empty() && k > 0 ) 这里内部也需要限制k。可以有一种整体判断的方式避免遗忘这个条件:每次先计算当前取出bucket的大小,如果累计不超过k全体推入,如果超过有限制的筛选(grandyang)
class Solution {
public:
vector<string> topKFrequent(vector<string>& words, int k) {
vector<string> result;
unordered_map<string, int> freqMap;
map<int, set<string>> bucket;
for (string word: words) {
++freqMap[word];
}
for (auto item: freqMap){
bucket[item.second].insert(item.first);
}
for (auto it = bucket.rbegin(); it != bucket.rend(); it++){
if (k <= 0 ) break;
set<string> tmp = it->second;
vector<string> tmp2(tmp.rbegin(), tmp.rend());
while (!tmp2.empty() && k > 0 ){
string a = tmp2.back() ;
tmp2.pop_back();
result.push_back(a);
k--;
}
}
return result;
}
};
grandyang
class Solution {
public:
vector<string> topKFrequent(vector<string>& words, int k) {
vector<string> res;
unordered_map<string, int> freq;
vector<set<string>> v(words.size() + 1, set<string>());
for (string word : words) ++freq[word];
for (auto a : freq) {
v[a.second].insert(a.first);
}
for (int i = v.size() - 1; i >= 0; --i) {
if (k <= 0) break;
vector<string> t(v[i].begin(), v[i].end());
if (k >= t.size()) {
res.insert(res.end(), t.begin(), t.end());
} else {
res.insert(res.end(), t.begin(), t.begin() + k);
}
k -= t.size();
}
return res;
}
};
推荐阅读
-
LeetCode刷题笔记(Top K Frequent Elements)
-
TOP K frequent-elements
-
LeetCode 692. Top K Frequent Words
-
Leetcode 692. Top K Frequent Words
-
692. Top K Frequent Words
-
LeetCode 347: 前 K 个高频元素 Top K Frequent Elements
-
Top K Frequent Elements
-
(Java)leetcode-347 Top K Frequent Elements
-
[Leetcode] Top K Frequent Elements
-
LeetCode 347. Top K Frequent Elements