[POJ 1200] Crazy Search [String Hash]

  • 2018-01-07


Time Limit: 1000MS Memory Limit: 65536K


Many people like to solve hard puzzles some of which may lead them to madness. One such puzzle could be finding a hidden prime number in a given text. Such number could be the number of different substrings of a given size that exist in the text. As you soon will discover, you really need the help of a computer and a good algorithm to solve such a puzzle.
Your task is to write a program that, given the size, N, of the substring, the number of different characters that may occur in the text, NC, and the text itself, determines the number of different substrings of size N that appear in the text. As an example, consider N=3, NC=4 and the text "daababac". The different substrings of size 3 that can be found in this text are: "daa"; "aab"; "aba"; "bab"; "bac". Therefore, the answer should be 5.


The first line of input consists of two numbers, N and NC, separated by exactly one space. This is followed by the text where the search takes place. You may assume that the maximum number of substrings formed by the possible set of characters does not exceed 16 million.


The program should output just an integer corresponding to the number of different substrings of size N found in the given text.

Sample Input

3 4
daababac

Sample Output

5

Huge input; scanf is recommended.


Southwestern Europe 2002



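First, assign each of the NC distinct characters a value id[]. Then treat every substring of length N as an N-digit number in base NC and convert it to an integer; two substrings are equal exactly when these integers are equal. Since the statement guarantees at most 16 million possible substrings, a boolean array indexed by that integer is enough to detect duplicates.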
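
The encoding step can be sketched in isolation as follows. This is only an illustration, assuming ids are handed out in order of first appearance; the helper names `buildIds` and `encode` are hypothetical and do not appear in the solution below.

```cpp
#include <array>
#include <cstddef>
#include <string>

// Hypothetical helper: give each distinct character an id 0, 1, 2, ...
// in order of first appearance. Characters absent from `text` keep -1.
std::array<int, 256> buildIds(const std::string& text) {
    std::array<int, 256> id;
    id.fill(-1);
    int next = 0;
    for (unsigned char c : text)
        if (id[c] < 0) id[c] = next++;
    return id;
}

// Hypothetical helper: read s[pos .. pos + n) as an n-digit base-nc number.
// Assumes every character in that range has an id in [0, nc).
long long encode(const std::string& s, std::size_t pos, int n, int nc,
                 const std::array<int, 256>& id) {
    long long value = 0;
    for (int k = 0; k < n; ++k)
        value = value * nc + id[static_cast<unsigned char>(s[pos + k])];
    return value;
}
```

For the sample text "daababac" with N = 3 and NC = 4, this scheme gives d = 0, a = 1, b = 2, c = 3, so "daa" maps to 0 * 16 + 1 * 4 + 1 = 5, and the two occurrences of "aba" map to the same integer.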

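An effective optimization is the Rabin-Karp algorithm, also known as a rolling hash. The main idea is to record the place value of the most significant digit of an N-digit base-NC number, T = NC^(N-1). Instead of redoing the base conversion for every window, subtract T * id[leading character] from the previous hash to obtain the value of its lower N - 1 digits, then multiply by NC and add the id of the incoming character to get the hash of the current window.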
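
As a minimal sketch of that rolling update, separate from the submitted solution, the function below counts distinct windows with a `std::vector<bool>` table. The name `countDistinct` is hypothetical, and the sketch assumes the text contains at most `nc` distinct characters and that `nc^n` is small enough to allocate.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: count distinct length-n substrings of `text`
// using a rolling base-nc hash. Assumes at most nc distinct characters.
std::size_t countDistinct(const std::string& text, int n, int nc) {
    if (n <= 0 || text.size() < static_cast<std::size_t>(n)) return 0;

    // Ids 0, 1, 2, ... in order of first appearance.
    int id[256];
    for (int& v : id) v = -1;
    int next = 0;
    for (unsigned char c : text)
        if (id[c] < 0) id[c] = next++;

    // top = nc^(n-1), the place value of the leading digit.
    long long top = 1;
    for (int k = 0; k < n - 1; ++k) top *= nc;

    std::vector<bool> seen(static_cast<std::size_t>(top * nc), false);

    // Hash of the first window, computed digit by digit.
    long long h = 0;
    for (int i = 0; i < n; ++i)
        h = h * nc + id[static_cast<unsigned char>(text[i])];
    seen[h] = true;
    std::size_t count = 1;

    // Slide the window: drop the leading digit, shift, append the new digit.
    for (std::size_t i = n; i < text.size(); ++i) {
        h = (h - top * id[static_cast<unsigned char>(text[i - n])]) * nc
            + id[static_cast<unsigned char>(text[i])];
        if (!seen[h]) {
            seen[h] = true;
            ++count;
        }
    }
    return count;
}
```

Each slide costs O(1), so the whole scan is linear in the text length; calling `countDistinct("daababac", 3, 4)` on the sample from the statement yields 5.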

Code: O(L), where L is the length of the string [7884K, 32MS]

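#include <cstdio>
#include <cstring>
using namespace std;

int N, NC, ans;
char str[16000005];
int id[256], topid = 0;
bool exi[16000005];
// The statement bounds only the number of possible substrings (NC^N <= 16 million),
// not the text length, so the array sizes here are chosen from that bound

// Fast exponentiation: returns bas^ex (the result is assumed to fit in an int)
inline int fastpow(long long bas, int ex){
	int res = 1;
	while(ex){
		if(ex & 1) res *= bas;
		ex >>= 1, bas *= bas;
	}
	return res;
}

int main(){
	if(scanf("%d%d%s", &N, &NC, str) != 3) return 0;
	int len = strlen(str), topHash = fastpow(NC, N - 1);
	// Length of str, and the weight NC^(N-1) of the top digit of an N-digit number
	if(len < N){ puts("0"); return 0; }  // No substring of length N exists
	memset(id, -1, sizeof(id));  // -1 marks "no id yet", so id 0 is never reassigned
	for(int i = 0; i < len; i++){
		unsigned char c = str[i];
		if(id[c] < 0) id[c] = topid++;  // Assign ids 0..NC-1 in order of first appearance
	}
	int curHash = 0;
	for(int i = 0; i < N; i++) curHash = curHash * NC + id[(unsigned char)str[i]];
	exi[curHash] = 1, ans = 1;
	for(int i = N; i < len; i++){
		// Drop the top digit, shift by one digit, append the new character
		curHash = (curHash - topHash * id[(unsigned char)str[i - N]]) * NC + id[(unsigned char)str[i]];
		if(!exi[curHash]) exi[curHash] = 1, ans++;
	}
	printf("%d\n", ans);
	return 0;
}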




