10745  Dominant Strings
Moderator: Board moderators
10745  Dominant Strings
Hello,
We tried solving the problem by reading the strings one after the other and maintaining a list of the currently nondominated strings. When a new string is read, the list is scanned and every member of the list that is dominated by the new string is removed from the list. If some of the strings in the list dominates the new one the scan stops and we read the next string. At the end of the scan, if none of the strings in the list dominates the new string, it is added to the list. However, this solution timed out. Could some of you post an idea for a faster solution and running time analysis if possible?
Greetings
We tried solving the problem by reading the strings one after the other and maintaining a list of the currently nondominated strings. When a new string is read, the list is scanned and every member of the list that is dominated by the new string is removed from the list. If some of the strings in the list dominates the new one the scan stops and we read the next string. At the end of the scan, if none of the strings in the list dominates the new string, it is added to the list. However, this solution timed out. Could some of you post an idea for a faster solution and running time analysis if possible?
Greetings
I solve it in this way:
Read all the string.
Sort the strings in some way such that if string B < string A, string B never dominates string A.
Then check for all pair (string B, string A) if string A dominates string B.
Read all the string.
Sort the strings in some way such that if string B < string A, string B never dominates string A.
Then check for all pair (string B, string A) if string A dominates string B.
My signature:
 Please make discussion about the algorithm BRFORE posting source code.
We can learn much more in discussion than reading source code.  I HATE testing account.
 Don't send me source code for debug.
..: Aren't you checking O(n^2) pairs in the worst case (ie. no dominating strings)? Unless I didn't get your algorithm.
BiK: I was just saying that one string of length K can dominate at most 2^K2 strings. Since K<=10, one string can dominate at most 1022 strings. So it's cheaper to generate all possible strings, given string can dominate and check if it is one of the given strings. If so, flag that string as nondominating.
I used trie to represent input strings, and used recursive procedure to generate all dominated strings. So the complexity is O(N*2^K), which is better that O(N^2), because 2^K < N, in the worst case.
BiK: I was just saying that one string of length K can dominate at most 2^K2 strings. Since K<=10, one string can dominate at most 1022 strings. So it's cheaper to generate all possible strings, given string can dominate and check if it is one of the given strings. If so, flag that string as nondominating.
I used trie to represent input strings, and used recursive procedure to generate all dominated strings. So the complexity is O(N*2^K), which is better that O(N^2), because 2^K < N, in the worst case.
Yes, but enough to get ACmarian wrote:..: Aren't you checking O(n^2) pairs in the worst case (ie. no dominating strings)? Unless I didn't get your algorithm.
Your alogrithm is much more interesting than mine.marian wrote: I used trie to represent input strings, and used recursive procedure to generate all dominated strings. So the complexity is O(N*2^K), which is better that O(N^2), because 2^K < N, in the worst case.
But how can you check if the generated string is in the input quickly?
There are totally (N*2^K) generated string, are you generate all these strings and then sort them for checking?~
My signature:
 Please make discussion about the algorithm BRFORE posting source code.
We can learn much more in discussion than reading source code.  I HATE testing account.
 Don't send me source code for debug.
marian: Thanks for your explanation. As far as I understand, you look at the strings as multisets of characters. And in your trie you do not distinguish between strings that give rise to the same multiset. Is that correct? For example you can sort the string lexicographically. And for a given multiset you keep the indeces of the input strings that produce the given multiset. So I assume that given an input string you generate the multisets dominated by that input string and mark them in the trie as nondominating. And at the end you traverse your trie to find the multisets that are dominating. This is how I imagine your algorithm working. Am I correct?
..: I tried your approach and it timed out. If you got accepted, I think that it is exceptional and this is not the idea of the solution. Do you use pure C? I'm currios how you managed to get under the time limit with an O(N^2) solution.
..: I tried your approach and it timed out. If you got accepted, I think that it is exceptional and this is not the idea of the solution. Do you use pure C? I'm currios how you managed to get under the time limit with an O(N^2) solution.
I used pure C. I tell more about how I optimize the algorithm.BiK wrote: ..: I tried your approach and it timed out. If you got accepted, I think that it is exceptional and this is not the idea of the solution. Do you use pure C? I'm currios how you managed to get under the time limit with an O(N^2) solution.
For each string, I will store how many different kind of chars and the occurance of each kind in the string. So string A dominates string B only if number of kind in A >= number of kind in B.
Also, for string A, B, C, if I know string C is dominated by B, I will mark the string C dominated and won't check again if any other string A dominates C.
The above rules will not decrease the complexity, but it decreases the practical number of opertaions. If I remove these 2 rules of my AC code, the runtime will grow from 4s to 8s.
My signature:
 Please make discussion about the algorithm BRFORE posting source code.
We can learn much more in discussion than reading source code.  I HATE testing account.
 Don't send me source code for debug.
Exactly.BiK wrote:marian: Thanks for your explanation. As far as I understand, you look at the strings as multisets of characters. And in your trie you do not distinguish between strings that give rise to the same multiset. Is that correct?
In fact, I'm keeping the opposite  for every string I keep the index of apropriate multiset. In the trie, I have indices of multisets.BiK wrote:For example you can sort the string lexicographically. And for a given multiset you keep the indeces of the input strings that produce the given multiset.
Almost, only I'm not traversing trie at the end, but flagging nondominating multiset indices to some array.BiK wrote:So I assume that given an input string you generate the multisets dominated by that input string and mark them in the trie as nondominating. And at the end you traverse your trie to find the multisets that are dominating.
Also, to optimize this solution a bit, I'm traversing a trie as I generate dominated strings (I have node pointer as a parameter of my recursive function). This way, I can reduce the number of generated strings, because I can stop generating, when the needed edge is not in the trie.
Marian

 New poster
 Posts: 30
 Joined: Sat Nov 30, 2002 1:04 pm

 New poster
 Posts: 14
 Joined: Tue Oct 19, 2004 2:01 am
Bah
I, on the other hand, don't think that tries are the best data structure for this problem. Any data structure used for this problem has a construction penalty, especially once you break the 1 s barrier. I think an exhaustive search with good short circuiting heuristics goes best here combined with cheap to construct data structures.
What kept me inspired was the fact that the former first place implementation was written in PASCAL. Surely, I could not let this stand, to the embarassment of C++ programmers, everywhere .
Michael Goldshteyn
( 10745  0.129 s )
P.S. for those that still do not understand the problem statement, here is a simplified version:
Given strings A and B. B dominates A if BOTH conditions below are satisfied:
 Every distinct character in A also appears in B
 Any character that appears multiple times in A must appear at least that many times in B
So, a simple way of saying this is that a string B dominates a string A if A can be constructed using some or all of the characters from B.
Here are examples where the first string is dominated by the second:
abc bac (in fact, each dominates the other, so neither is a dominant string)
abc dcba
aabcc aaabccd
abccc aaabccddcc
Examples where the first string is NOT dominated by the second:
abc abdefg
abbc abcadef
aabbbcccc abbabcddcbce
What kept me inspired was the fact that the former first place implementation was written in PASCAL. Surely, I could not let this stand, to the embarassment of C++ programmers, everywhere .
Michael Goldshteyn
( 10745  0.129 s )
P.S. for those that still do not understand the problem statement, here is a simplified version:
Given strings A and B. B dominates A if BOTH conditions below are satisfied:
 Every distinct character in A also appears in B
 Any character that appears multiple times in A must appear at least that many times in B
So, a simple way of saying this is that a string B dominates a string A if A can be constructed using some or all of the characters from B.
Here are examples where the first string is dominated by the second:
abc bac (in fact, each dominates the other, so neither is a dominant string)
abc dcba
aabcc aaabccd
abccc aaabccddcc
Examples where the first string is NOT dominated by the second:
abc abdefg
abbc abcadef
aabbbcccc abbabcddcbce
Last edited by Michael Goldshteyn on Wed Oct 20, 2004 4:43 am, edited 1 time in total.

 Experienced poster
 Posts: 151
 Joined: Wed Aug 21, 2002 12:07 am
 Location: Seoul, Korea
 Contact:
Okay, it seems I'm the second C++ programmer who broke the pascal implementation.. Haha (With a very close shave of 0.002 second )
What I did was inserting all vectors of alphabet frequencies into a trie. By building the trie with the frequencies (instead of the letter itself) I think you can reduce the amount of exhaustive searches you will make.
What I did was inserting all vectors of alphabet frequencies into a trie. By building the trie with the frequencies (instead of the letter itself) I think you can reduce the amount of exhaustive searches you will make.
JongMan @ Yonsei
I don't know how to implement the trie, But get AC with divide and conquer algorithm. Divide the string set to two parts, then gets the dominant strings in the two parts respectively with recursion, i call the results as candidate dominant strings of part1 and candidate dominant strings of part2. Then tests each candidate string of part1 by enumerate all the candidate strings of part2, if no candidate strings of part2 can dominate it, it is a dominant string of this set; and the same to candidate strings of part2. This method gets AC, but slow.
my goal: master algorithms