10234 - Frequent Substrings
Moderator: Board moderators
-
- A great helper
- Posts: 284
- Joined: Thu Feb 28, 2002 2:00 am
- Location: Germany
- Contact:
Are you sure? As far as I understand it, during the search for n-grams you have to ignore case, but at the end you have to output the lexicographically smallest one. So I think the output should be "8 A".
Well, I hope you're right, cause then I think I can find *all* n-grams of a string T in time O(length(T)). Have to check it tomorrow, currently I don't get through to the judge (seems to be down again)...
Well, I hope you're right, cause then I think I can find *all* n-grams of a string T in time O(length(T)). Have to check it tomorrow, currently I don't get through to the judge (seems to be down again)...
-
- Guru
- Posts: 724
- Joined: Wed Dec 19, 2001 2:00 am
- Location: Germany
-
- A great helper
- Posts: 284
- Joined: Thu Feb 28, 2002 2:00 am
- Location: Germany
- Contact:
-
- New poster
- Posts: 27
- Joined: Thu Feb 14, 2002 2:00 am
Efficient Implementation?
How could we solve it in O(length(s))? What's is the time complexity of your algorithm?
Thanx!
Thanx!
Hmm...I probably have missed some corner cases, or haven't yet understood completely what they mean by ignore capatalization. Anyways, can anyone run my program and feed it some tricky input? THANKS.
[cpp]#include <iostream.h>
#include <stdlib.h>
#include <string>
#include <map>
int main()
{
string s="";
int T, N, max;
getline(cin, s);
cin >> T;
map<string, int> NGrams;
for(int i=0; i<T; i++)
{
cin >> N;
string sub1="", sub2="", nocapS="", result="";
max=1;
result = s.substr(0, N);
for(int j=0; j<s.length(); j++)
{
nocapS += (isalpha(s[j]) ? tolower(s[j]) : s[j]);
}
for(int j=0; j<s.length()-N; j++)
{
// sub1 = s.substr(j, N);
sub2 = nocapS.substr(j, N);
// cout << "sub1: " << sub1 << "\t" << "sub2: " << sub2 << "\n";
if(NGrams.find(sub2) != NGrams.end())
{
NGrams[sub2]++;
}
else NGrams[sub2] = 1;
if(NGrams[sub2] > max || (NGrams[sub2]==max && sub2 < result))
{
result = sub2;
max = NGrams[sub2];
}
}
cout << max << " " << result << "\n";
}
return 0;
}[/cpp]
[cpp]#include <iostream.h>
#include <stdlib.h>
#include <string>
#include <map>
int main()
{
string s="";
int T, N, max;
getline(cin, s);
cin >> T;
map<string, int> NGrams;
for(int i=0; i<T; i++)
{
cin >> N;
string sub1="", sub2="", nocapS="", result="";
max=1;
result = s.substr(0, N);
for(int j=0; j<s.length(); j++)
{
nocapS += (isalpha(s[j]) ? tolower(s[j]) : s[j]);
}
for(int j=0; j<s.length()-N; j++)
{
// sub1 = s.substr(j, N);
sub2 = nocapS.substr(j, N);
// cout << "sub1: " << sub1 << "\t" << "sub2: " << sub2 << "\n";
if(NGrams.find(sub2) != NGrams.end())
{
NGrams[sub2]++;
}
else NGrams[sub2] = 1;
if(NGrams[sub2] > max || (NGrams[sub2]==max && sub2 < result))
{
result = sub2;
max = NGrams[sub2];
}
}
cout << max << " " << result << "\n";
}
return 0;
}[/cpp]
-
- Guru
- Posts: 834
- Joined: Wed May 29, 2002 4:11 pm
- Location: Wroclaw, Poland
- Contact:
I've got Accepted this problem but with time 4.7 sec. Could anyone tell me how can I improve this result ?
I use following algorithm:
for each query (length of n-gram) if I didn't search it I do: copy every substring of length len and store it in tree with it's frequency and remember best result. This step has time complexity length(s)*time used to insert data in binary tree.
I can't imagine better algorithm. Could anyone explain me it ?
Best regards
DM
I use following algorithm:
for each query (length of n-gram) if I didn't search it I do: copy every substring of length len and store it in tree with it's frequency and remember best result. This step has time complexity length(s)*time used to insert data in binary tree.
I can't imagine better algorithm. Could anyone explain me it ?
Best regards
DM
If you really want to get Accepted, try to think about possible, and after that - about impossible ... and you'll get, what you want ....
Born from ashes - restarting counter of problems (800+ solved problems)
Born from ashes - restarting counter of problems (800+ solved problems)
-
- Guru
- Posts: 834
- Joined: Wed May 29, 2002 4:11 pm
- Location: Wroclaw, Poland
- Contact:
Maybe in my solution most time is given for allocate memory and copy strings ... It's possible I think - so I thank you for hint 
Best regards
DM

Best regards
DM
If you really want to get Accepted, try to think about possible, and after that - about impossible ... and you'll get, what you want ....
Born from ashes - restarting counter of problems (800+ solved problems)
Born from ashes - restarting counter of problems (800+ solved problems)
-
- New poster
- Posts: 7
- Joined: Wed Jun 05, 2002 3:27 pm
- Location: Portugal
- Contact:
-
- New poster
- Posts: 7
- Joined: Wed Jun 05, 2002 3:27 pm
- Location: Portugal
- Contact:
10234 Frequent substrings please Inputs/ Outputs !!
Hi, if somebody has any input for this problem I'll apreciate,
I haven't understood well if I should print the most frequent substring as lowercase substring, or just the most frequent substring found in it's least lexicographically way ...
for example ... for the input ...
The output should be ?
or
3 abc ?
I haven't understood well if I should print the most frequent substring as lowercase substring, or just the most frequent substring found in it's least lexicographically way ...
for example ... for the input ...
Code: Select all
AbCabcABC
1
3
Code: Select all
3 ABC
3 abc ?