hiho#1107 : Shortest Proper Prefix (统计非指定前缀在字符串集中出现次数)
1107 : Shortest Proper Prefix
时间限制:10000ms
单点时限:1000ms
内存限制:512MB
描述
Query auto-completion(QAC) is widely used in many search applications. The basic idea is that when you type some string s in the search box several high-frequency queries which have s as a prefix are suggested. We say string s1 has string s2 as a prefix if and only if the first |s2| characters of s1 are the same as s2 where |s2| is the length of s2.
These days Little Hi has been working on a way to improve the QAC performance. He collected N high-frequency queries. We say a string s is a proper prefix if there are no more than 5 collected queries have s as a prefix. A string s is a shortest proper prefix if s is a proper prefix and all the prefixes of s(except for s itself) are not proper prefixes. Little Hi wants to know the number of shortest proper prefixes given N collected queries.
Hint: the 4 shortest proper prefixes for Sample Input are “ab”, “bb”, “bc” and “be”. Empty string “” is not counted as a proper prefix even if N <= 5.
输入
The first line contains N(N <= 10000), the number of collected queries.
The following N lines each contain a query.
Each query contains only lowercase letters ‘a’-‘z’.
The total length of all queries are no more than 2000000.
Input may contain identical queries. Count them separately when you calculate the number of queries that have some string as a prefix.
输出
Output the number of shortest proper prefixes.
样例输入
12
a
ab
abc
abcde
abcde
abcba
bcd
bcde
bcbbd
bcac
bee
bbb
样例输出
4
题意:这道题的大意就是给定你N个高频的查询字符串。然后题目定义如果一个字符串s满足,有不大于5个高频字符串是以s为前缀的,那么我们就称s是“合适的前缀”。同时如果一个“合适的前缀”s,删掉s的最后一个字符之后就不是“合适的前缀”了,那我们就称s是“最短的合适前缀”。最后题目问你对于给定N个高频字符串,一共有几个“最短的合适前缀”。
AC code
#define _CRT_SBCURE_NO_DEPRECATE
#include <set>
#include <cmath>
#include <queue>
#include <stack>
#include <vector>
#include <string>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <iostream>
#include <algorithm>
#include <functional>
using namespace std;
#define maxn 1000005
#define maxm 26
#define lson l, mid, rt << 1
#define rson mid + 1, r, rt << 1 | 1
int trie[maxn][maxm] = {0};
int cnt[maxn] = {0};
int t, ans, k=1;
char s[2000010];
void insert(char *s){
int len = strlen(s);
int p = 0;
cnt[0]++; // 这个很重要,可能father为0,而且cnt[father]>5 && cnt[son] <=5
for(int i=0; i<len; i++){
int c = s[i] - 'a';
if(!trie[p][c]){
trie[p][c] = k;
k++;
}
p = trie[p][c];
cnt[p]++;
}
}
void dfs(int x, int father){
if(x!=0 && cnt[x]<=5 && cnt[father]>5) ans++; // 本节点前缀小于等于5,父节点大于5
for(int i=0; i<26; i++){
if(trie[x][i]) dfs(trie[x][i], x); // 如果有下一个节点则继续搜索, 当x==0的时候,为根节点开始的26路搜索
}
}
int main(){
scanf("%d",&t);
while(t--){
scanf("%s", s);
insert(s);
}
dfs(0,-1);
printf("%d\n", ans);
return 0;
}