10234 - Frequent Substrings

All about problems in Volume 102. If there is a thread about your problem, please use it. If not, create one with its number in the subject.

Moderator: Board moderators

ftomi
Learning poster
Posts: 64
Joined: Sun Jan 06, 2002 2:00 am
Location: Hungary
Contact:

Post by ftomi » Mon Mar 11, 2002 9:47 am

"Capitalization should be ignored" - what does it mean?

if it's true, how is it possibile that this program haven't special judge program?

for example, what is the output for this input:
AabaaAajaAk
1
1

now, i'm use tolower(), and my output is:
8 a

Thanks!

Ustaad
New poster
Posts: 4
Joined: Tue Feb 05, 2002 2:00 am

Post by Ustaad » Mon Mar 11, 2002 9:49 am

Hi

Your Output is correct !

Arun Kishore
NTU

Stefan Pochmann
A great helper
Posts: 284
Joined: Thu Feb 28, 2002 2:00 am
Location: Germany
Contact:

Post by Stefan Pochmann » Mon Mar 11, 2002 10:47 am

Are you sure? As far as I understand it, during the search for n-grams you have to ignore case, but at the end you have to output the lexicographically smallest one. So I think the output should be "8 A".

Well, I hope you're right, cause then I think I can find *all* n-grams of a string T in time O(length(T)). Have to check it tomorrow, currently I don't get through to the judge (seems to be down again)...

Adrian Kuegel
Guru
Posts: 724
Joined: Wed Dec 19, 2001 2:00 am
Location: Germany

Post by Adrian Kuegel » Mon Mar 11, 2002 10:50 am

You must consider this line:
If there are several such N-Grams, output the least lexicographical one, compared in terms of ASCII values.

Stefan Pochmann
A great helper
Posts: 284
Joined: Thu Feb 28, 2002 2:00 am
Location: Germany
Contact:

Post by Stefan Pochmann » Mon Mar 11, 2002 11:24 am

Adrian: Who do you mean? What is your answer for the above input?

pochmann
New poster
Posts: 28
Joined: Sat Jan 26, 2002 2:00 am
Contact:

Post by pochmann » Mon Mar 11, 2002 4:04 pm

I just checked it, and it's true: Output it as lowercase. That even corrected my "Presentation Error".

uzioriluzan
New poster
Posts: 27
Joined: Thu Feb 14, 2002 2:00 am

Efficient Implementation?

Post by uzioriluzan » Mon Oct 28, 2002 7:10 pm

How could we solve it in O(length(s))? What's is the time complexity of your algorithm?
Thanx!

Rossi
New poster
Posts: 20
Joined: Thu Mar 21, 2002 2:00 am
Location: Bangladesh

Post by Rossi » Tue Dec 03, 2002 12:51 pm

whats the output for

[[[AAA
1
3
and

[[[aaa
1
3

stcheung
Experienced poster
Posts: 114
Joined: Mon Nov 18, 2002 6:48 am
Contact:

Post by stcheung » Mon Feb 24, 2003 11:35 am

Hmm...I probably have missed some corner cases, or haven't yet understood completely what they mean by ignore capatalization. Anyways, can anyone run my program and feed it some tricky input? THANKS.


[cpp]#include <iostream.h>
#include <stdlib.h>
#include <string>
#include <map>

int main()
{
string s="";
int T, N, max;
getline(cin, s);
cin >> T;
map<string, int> NGrams;
for(int i=0; i<T; i++)
{
cin >> N;
string sub1="", sub2="", nocapS="", result="";
max=1;
result = s.substr(0, N);
for(int j=0; j<s.length(); j++)
{
nocapS += (isalpha(s[j]) ? tolower(s[j]) : s[j]);
}
for(int j=0; j<s.length()-N; j++)
{
// sub1 = s.substr(j, N);
sub2 = nocapS.substr(j, N);
// cout << "sub1: " << sub1 << "\t" << "sub2: " << sub2 << "\n";
if(NGrams.find(sub2) != NGrams.end())
{
NGrams[sub2]++;
}
else NGrams[sub2] = 1;
if(NGrams[sub2] > max || (NGrams[sub2]==max && sub2 < result))
{
result = sub2;
max = NGrams[sub2];
}
}

cout << max << " " << result << "\n";
}

return 0;
}[/cpp]

Dominik Michniewski
Guru
Posts: 834
Joined: Wed May 29, 2002 4:11 pm
Location: Wroclaw, Poland
Contact:

Post by Dominik Michniewski » Wed May 21, 2003 1:38 pm

I've got Accepted this problem but with time 4.7 sec. Could anyone tell me how can I improve this result ?

I use following algorithm:
for each query (length of n-gram) if I didn't search it I do: copy every substring of length len and store it in tree with it's frequency and remember best result. This step has time complexity length(s)*time used to insert data in binary tree.
I can't imagine better algorithm. Could anyone explain me it ?

Best regards
DM
If you really want to get Accepted, try to think about possible, and after that - about impossible ... and you'll get, what you want ....
Born from ashes - restarting counter of problems (800+ solved problems)

PdR
New poster
Posts: 24
Joined: Mon Dec 30, 2002 4:27 am

Post by PdR » Thu May 22, 2003 12:04 am

My algorithm is O(n log n)
Basicly I use a pointer to each location until len(line)-n,
then i sort them attending the lenght asked for...
and see the best duplicate in O (len(line) - n)

1.2 secs ;-)

Dominik Michniewski
Guru
Posts: 834
Joined: Wed May 29, 2002 4:11 pm
Location: Wroclaw, Poland
Contact:

Post by Dominik Michniewski » Thu May 22, 2003 7:38 am

Maybe in my solution most time is given for allocate memory and copy strings ... It's possible I think - so I thank you for hint :)

Best regards
DM
If you really want to get Accepted, try to think about possible, and after that - about impossible ... and you'll get, what you want ....
Born from ashes - restarting counter of problems (800+ solved problems)

Ivan Silva
New poster
Posts: 7
Joined: Wed Jun 05, 2002 3:27 pm
Location: Portugal
Contact:

Post by Ivan Silva » Fri May 30, 2003 12:40 am

Any one could post some input and output for this problem, i'm always getting WA!!! and i really have no idea what is the problem :)

-Ivan

Ivan Silva
New poster
Posts: 7
Joined: Wed Jun 05, 2002 3:27 pm
Location: Portugal
Contact:

Post by Ivan Silva » Sun Jun 15, 2003 10:31 pm

This post is just to get one more post... i'm joking of course, please just a little help here, i tryed all inputs in the board and i still get WA!!!

chuzpa
New poster
Posts: 33
Joined: Fri Jun 18, 2004 8:39 am
Location: Mexico
Contact:

10234 Frequent substrings please Inputs/ Outputs !!

Post by chuzpa » Mon Aug 08, 2005 3:02 pm

Hi, if somebody has any input for this problem I'll apreciate,
I haven't understood well if I should print the most frequent substring as lowercase substring, or just the most frequent substring found in it's least lexicographically way ...

for example ... for the input ...

Code: Select all

AbCabcABC
1
3
The output should be ?

Code: Select all

3 ABC
or
3 abc ?

Post Reply

Return to “Volume 102 (10200-10299)”