## 10234 - Frequent Substrings

Moderator: Board moderators

ftomi
Learning poster
Posts: 64
Joined: Sun Jan 06, 2002 2:00 am
Location: Hungary
Contact:
"Capitalization should be ignored" - what does it mean?

if it's true, how is it possibile that this program haven't special judge program?

for example, what is the output for this input:
AabaaAajaAk
1
1

now, i'm use tolower(), and my output is:
8 a

Thanks!

New poster
Posts: 4
Joined: Tue Feb 05, 2002 2:00 am
Hi

Arun Kishore
NTU

Stefan Pochmann
A great helper
Posts: 284
Joined: Thu Feb 28, 2002 2:00 am
Location: Germany
Contact:
Are you sure? As far as I understand it, during the search for n-grams you have to ignore case, but at the end you have to output the lexicographically smallest one. So I think the output should be "8 A".

Well, I hope you're right, cause then I think I can find *all* n-grams of a string T in time O(length(T)). Have to check it tomorrow, currently I don't get through to the judge (seems to be down again)...

Guru
Posts: 724
Joined: Wed Dec 19, 2001 2:00 am
Location: Germany
You must consider this line:
If there are several such N-Grams, output the least lexicographical one, compared in terms of ASCII values.

Stefan Pochmann
A great helper
Posts: 284
Joined: Thu Feb 28, 2002 2:00 am
Location: Germany
Contact:

pochmann
New poster
Posts: 28
Joined: Sat Jan 26, 2002 2:00 am
Contact:
I just checked it, and it's true: Output it as lowercase. That even corrected my "Presentation Error".

uzioriluzan
New poster
Posts: 27
Joined: Thu Feb 14, 2002 2:00 am

### Efficient Implementation?

How could we solve it in O(length(s))? What's is the time complexity of your algorithm?
Thanx!

Rossi
New poster
Posts: 20
Joined: Thu Mar 21, 2002 2:00 am
whats the output for

[[[AAA
1
3
and

[[[aaa
1
3

stcheung
Experienced poster
Posts: 114
Joined: Mon Nov 18, 2002 6:48 am
Contact:
Hmm...I probably have missed some corner cases, or haven't yet understood completely what they mean by ignore capatalization. Anyways, can anyone run my program and feed it some tricky input? THANKS.

[cpp]#include <iostream.h>
#include <stdlib.h>
#include <string>
#include <map>

int main()
{
string s="";
int T, N, max;
getline(cin, s);
cin >> T;
map<string, int> NGrams;
for(int i=0; i<T; i++)
{
cin >> N;
string sub1="", sub2="", nocapS="", result="";
max=1;
result = s.substr(0, N);
for(int j=0; j<s.length(); j++)
{
nocapS += (isalpha(s[j]) ? tolower(s[j]) : s[j]);
}
for(int j=0; j<s.length()-N; j++)
{
// sub1 = s.substr(j, N);
sub2 = nocapS.substr(j, N);
// cout << "sub1: " << sub1 << "\t" << "sub2: " << sub2 << "\n";
if(NGrams.find(sub2) != NGrams.end())
{
NGrams[sub2]++;
}
else NGrams[sub2] = 1;
if(NGrams[sub2] > max || (NGrams[sub2]==max && sub2 < result))
{
result = sub2;
max = NGrams[sub2];
}
}

cout << max << " " << result << "\n";
}

return 0;
}[/cpp]

Dominik Michniewski
Guru
Posts: 834
Joined: Wed May 29, 2002 4:11 pm
Location: Wroclaw, Poland
Contact:
I've got Accepted this problem but with time 4.7 sec. Could anyone tell me how can I improve this result ?

I use following algorithm:
for each query (length of n-gram) if I didn't search it I do: copy every substring of length len and store it in tree with it's frequency and remember best result. This step has time complexity length(s)*time used to insert data in binary tree.
I can't imagine better algorithm. Could anyone explain me it ?

Best regards
DM
If you really want to get Accepted, try to think about possible, and after that - about impossible ... and you'll get, what you want ....
Born from ashes - restarting counter of problems (800+ solved problems)

PdR
New poster
Posts: 24
Joined: Mon Dec 30, 2002 4:27 am
My algorithm is O(n log n)
Basicly I use a pointer to each location until len(line)-n,
then i sort them attending the lenght asked for...
and see the best duplicate in O (len(line) - n)

1.2 secs

Dominik Michniewski
Guru
Posts: 834
Joined: Wed May 29, 2002 4:11 pm
Location: Wroclaw, Poland
Contact:
Maybe in my solution most time is given for allocate memory and copy strings ... It's possible I think - so I thank you for hint

Best regards
DM
If you really want to get Accepted, try to think about possible, and after that - about impossible ... and you'll get, what you want ....
Born from ashes - restarting counter of problems (800+ solved problems)

Ivan Silva
New poster
Posts: 7
Joined: Wed Jun 05, 2002 3:27 pm
Location: Portugal
Contact:
Any one could post some input and output for this problem, i'm always getting WA!!! and i really have no idea what is the problem

-Ivan

Ivan Silva
New poster
Posts: 7
Joined: Wed Jun 05, 2002 3:27 pm
Location: Portugal
Contact:
This post is just to get one more post... i'm joking of course, please just a little help here, i tryed all inputs in the board and i still get WA!!!

chuzpa
New poster
Posts: 33
Joined: Fri Jun 18, 2004 8:39 am
Location: Mexico
Contact:

### 10234 Frequent substrings please Inputs/ Outputs !!

Hi, if somebody has any input for this problem I'll apreciate,
I haven't understood well if I should print the most frequent substring as lowercase substring, or just the most frequent substring found in it's least lexicographically way ...

for example ... for the input ...

Code: Select all

``````AbCabcABC
1
3
``````
The output should be ?

Code: Select all

``````3 ABC
``````
or
3 abc ?