[Luogu 2852][USACO06DEC] Milk Patterns【Suffix Array + Binary Search on the Answer】

  • 2018-02-24

Problem:

Problem Description

Farmer John has noticed that the quality of milk given by his cows varies from day to day. On further investigation, he discovered that although he can't predict the quality of milk from one day to the next, there are some regular patterns in the daily milk quality.

To perform a rigorous study, he has invented a complex classification scheme by which each milk sample is recorded as an integer between 0 and 1,000,000 inclusive, and has recorded data from a single cow over N (1 ≤ N ≤ 20,000) days. He wishes to find the longest pattern of samples which repeats identically at least K (2 ≤ K ≤ N) times. This may include overlapping patterns -- 1 2 3 2 3 2 3 1 repeats 2 3 2 3 twice, for example.

Help Farmer John by finding the longest repeating subsequence in the sequence of samples. It is guaranteed that at least one subsequence is repeated at least K times.

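In other words: each day's milk quality is an integer between 0 and 1,000,000, and John has recorded it for N (1 ≤ N ≤ 20,000) days. A run of consecutive days is called a "pattern". Find the length of the longest pattern that occurs at least K (2 ≤ K ≤ N) times. For example, in 1 2 3 2 3 2 3 1 the pattern 2 3 2 3 occurs twice, so for K = 2 the answer is 4.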

Input/Output Format

Input format:

Line 1: Two space-separated integers: N and K

Lines 2..N+1: N integers, one per line, the quality of the milk on day i appears on the ith line.

Output format:

Line 1: One integer, the length of the longest pattern which occurs at least K times

Sample Input/Output

Sample input #1:

8 2
1
2
3
2
3
2
3
1
Sample output #1:

4

Solution:

The problem asks for the longest substring (contiguous block) that occurs at least K times; occurrences may overlap.
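For small inputs, this goal can be cross-checked with a direct brute force. The sketch below is not part of the original solution; the function name `longestRepeatedBrute` is hypothetical, and the approach (counting every block of every length in a map) is far too slow for N = 20,000. It is only meant as a reference to test a faster solution against.

```cpp
#include <algorithm>
#include <map>
#include <vector>

// Hypothetical helper, not from the original post: brute-force reference.
// Returns the length of the longest contiguous block of s that occurs at
// least k times (overlaps allowed), or 0 if no block does.
int longestRepeatedBrute(const std::vector<int>& s, int k) {
    const int n = static_cast<int>(s.size());
    int best = 0;
    for (int len = 1; len <= n; ++len) {
        // Count how often each block of this length appears.
        std::map<std::vector<int>, int> occurrences;
        for (int start = 0; start + len <= n; ++start) {
            std::vector<int> block(s.begin() + start, s.begin() + start + len);
            if (++occurrences[block] >= k) best = std::max(best, len);
        }
    }
    return best;
}
```

On the sample sequence 1 2 3 2 3 2 3 1 with K = 2 this returns 4, matching the expected output.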

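Observe that all suffixes beginning with the same substring occupy consecutive positions in the lexicographically sorted list of suffixes.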

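So we can sort the suffixes with a suffix array and compute height[], where height[i] is the length of the longest common prefix of the suffixes ranked i − 1 and i.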
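To see what height[] looks like on the sample, here is a minimal sketch. It sorts the suffixes directly instead of using prefix doubling, so it is a simplification for illustration and not the method used in the solution code. The names `naiveSuffixArray` and `heightArray` are hypothetical. No sentinel is appended here, so the bounds are checked explicitly.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Hypothetical helper: suffix array by sorting suffixes directly.
// Worst case O(n^2 log n); intended for small illustrative inputs only.
std::vector<int> naiveSuffixArray(const std::vector<int>& s) {
    std::vector<int> sa(s.size());
    std::iota(sa.begin(), sa.end(), 0);
    std::sort(sa.begin(), sa.end(), [&](int i, int j) {
        return std::lexicographical_compare(s.begin() + i, s.end(),
                                            s.begin() + j, s.end());
    });
    return sa;
}

// height[r] = LCP length of the suffixes ranked r-1 and r; height[0] = 0.
// Linear-time computation: when moving from suffix i to suffix i+1 the LCP
// drops by at most one, so h is reused instead of restarting from zero.
std::vector<int> heightArray(const std::vector<int>& s,
                             const std::vector<int>& sa) {
    const int n = static_cast<int>(s.size());
    std::vector<int> rank(n), height(n, 0);
    for (int r = 0; r < n; ++r) rank[sa[r]] = r;
    int h = 0;
    for (int i = 0; i < n; ++i) {
        if (rank[i] == 0) { h = 0; continue; }  // no predecessor in sorted order
        const int j = sa[rank[i] - 1];
        while (i + h < n && j + h < n && s[i + h] == s[j + h]) ++h;
        height[rank[i]] = h;
        if (h > 0) --h;
    }
    return height;
}
```

For 1 2 3 2 3 2 3 1 (0-indexed) this gives SA = 7 0 5 3 1 6 4 2 and height = 0 1 0 2 4 0 1 3. The entry 4 comes from the adjacent suffixes starting at positions 3 and 1, which share the prefix 2 3 2 3.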

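Then binary search on the answer ans. In check, scan height[]: whenever height[i] ≥ ans, increment a counter; otherwise reset it to zero. A run of K − 1 consecutive entries ≥ ans means that K adjacent suffixes share a common prefix of length ans, so once the counter reaches K − 1 a substring of length ans that satisfies the requirement exists. If the counter never gets there, no such substring exists.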
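The check and the binary search can be sketched in isolation, given a height array. This is a minimal sketch under stated assumptions: `occursAtLeast` and `longestRepeated` are hypothetical names, height[0] is ignored, no sentinel is included, and k ≥ 2 as the problem guarantees.

```cpp
#include <vector>

// Hypothetical helper: is there a block of length len occurring >= k times?
// height[r] is the LCP of the suffixes ranked r-1 and r. Assumes k >= 2.
bool occursAtLeast(const std::vector<int>& height, int len, int k) {
    int run = 0;  // number of consecutive entries >= len seen so far
    for (int r = 1; r < static_cast<int>(height.size()); ++r) {
        if (height[r] < len) run = 0;
        else if (++run >= k - 1) return true;  // k suffixes share the prefix
    }
    return false;
}

// Hypothetical helper: largest len for which occursAtLeast holds (0 if none).
int longestRepeated(const std::vector<int>& height, int k) {
    const int n = static_cast<int>(height.size());
    // k occurrences need k distinct start positions, so len + k - 1 <= n.
    int lo = 0, hi = n - k + 1;
    while (lo < hi) {
        const int mid = (lo + hi + 1) / 2;  // round up so lo = mid makes progress
        if (occursAtLeast(height, mid, k)) lo = mid;
        else hi = mid - 1;
    }
    return lo;
}
```

The binary search is valid because the predicate is monotone: if a block of length len occurs k times, so does its prefix of length len − 1. The upper bound n − k + 1 reflects that occurrences may overlap; for the sequence 1 1 1 1 with K = 2 the answer is 3.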

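Make sure you understand the idea behind the suffix array, and be careful not to mistype the template. For a detailed walkthrough of the code, see: Suffix Array Template & Detailed Code Comments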

Code: O(nlogn) [2605K, 4MS]

#include<cstdio>
#include<algorithm>
using namespace std;

int N, K, a[20005], disc[20005];
int SA[20005], rnk[20005], ht[20005];

// Builds SA[], rnk[] and ht[] for a[0..N-1] by prefix doubling.
// a[N-1] must be a unique smallest sentinel (0); m is the alphabet size.
inline void buildSA(int m){
	int *x = new int[20005], *y = new int[20005], *cnt = new int[20005];
	// Counting sort by the first character
	for(int i = 0; i < m; i++) cnt[i] = 0;
	for(int i = 0; i < N; i++) cnt[x[i] = a[i]]++;
	for(int i = 1; i < m; i++) cnt[i] += cnt[i - 1];
	for(int i = N - 1; i >= 0; i--) SA[--cnt[x[i]]] = i;
	for(int k = 1; k <= N; k <<= 1){
		int p = 0;
		// y[] lists suffixes ordered by second key (rank of the part starting k later)
		for(int i = N - k; i < N; i++) y[p++] = i;  // empty second key sorts first
		for(int i = 0; i < N; i++) if(SA[i] >= k) y[p++] = SA[i] - k;
		// Stable counting sort by first key x[]
		for(int i = 0; i < m; i++) cnt[i] = 0;
		for(int i = 0; i < N; i++) cnt[x[y[i]]]++;
		for(int i = 1; i < m; i++) cnt[i] += cnt[i - 1];
		for(int i = N - 1; i >= 0; i--) SA[--cnt[x[y[i]]]] = y[i];
		// Recompute ranks; y[] now holds the old ranks.
		// The unique sentinel keeps SA[i] + k in range whenever the first keys match.
		swap(x, y), p = 1, x[SA[0]] = 0;
		for(int i = 1; i < N; i++) x[SA[i]] = (y[SA[i]] == y[SA[i - 1]] && y[SA[i] + k] == y[SA[i - 1] + k]) ? p - 1 : p++;
		if(p == N) break;  // all ranks distinct: sorting is finished
		m = p;
	}
	for(int i = 0; i < N; i++) rnk[SA[i]] = i;
	int k = 0, j;
	// Height array. Stop at N - 2: the sentinel suffix has rank 0 and no
	// predecessor, so SA[rnk[i] - 1] would read SA[-1] for i = N - 1.
	for(int i = 0; i < N - 1; ht[rnk[i++]] = k)
		for(k ? k-- : 0, j = SA[rnk[i] - 1]; a[i + k] == a[j + k]; k++);
	delete[] x;
	delete[] y;
	delete[] cnt;
}

// True if some substring of length k occurs at least K times,
// i.e. K - 1 consecutive ht[] entries are >= k.
inline bool check(int k){
	for(int i = 1, cnt = 0; i < N; i++){
		if(ht[i] < k) cnt = 0;
		else if(++cnt == K - 1) return 1;
	}
	return 0;
}

int main(){
	scanf("%d%d", &N, &K);
	for(int i = 0; i < N; i++) scanf("%d", a + i), disc[i + 1] = a[i];
	sort(disc + 1, disc + N + 1);
	int k = unique(disc + 1, disc + N + 1) - disc - 1;
	// Discretization to ranks 1..k. Searching disc[1..k] (not disc[0..k-1])
	// keeps an input value of 0 from colliding with the sentinel 0.
	for(int i = 0; i < N; i++) a[i] = lower_bound(disc + 1, disc + k + 1, a[i]) - disc;
	a[N++] = 0, buildSA(k + 1);  // append the sentinel; N now counts it
	// Occurrences may overlap, so the answer can be as large as
	// (N - 1) - K + 1 = N - K, not N / K.
	int l = 0, r = N - K;
	while(l < r){
		int mid = (l + r + 1) >> 1;
		if(check(mid)) l = mid;
		else r = mid - 1;
	}
	printf("%d\n", l);
	return 0;
}
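
The discretization step in main can also be written as a standalone routine. The following is a minimal sketch using std::vector rather than the global arrays above; the name `compress` is hypothetical. It maps each value to its rank among the distinct values, starting from 1 so that 0 stays free for the sentinel appended before building the suffix array.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: order-preserving compression to ranks 1..m,
// where m is the number of distinct values. Rank 0 is left unused so it
// can serve as a unique smallest sentinel.
std::vector<int> compress(const std::vector<int>& values) {
    std::vector<int> sorted(values);
    std::sort(sorted.begin(), sorted.end());
    sorted.erase(std::unique(sorted.begin(), sorted.end()), sorted.end());
    std::vector<int> ranks(values.size());
    for (int i = 0; i < static_cast<int>(values.size()); ++i) {
        ranks[i] = static_cast<int>(
            std::lower_bound(sorted.begin(), sorted.end(), values[i]) -
            sorted.begin()) + 1;
    }
    return ranks;
}
```

Compression matters here because the counting sort inside the suffix array construction allocates one bucket per symbol: raw values go up to 1,000,000, while at most N distinct ranks can occur.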

GitHub (updated irregularly):
https://github.com/Darkleafin

