## 10745 - Dominant Strings

All about problems in Volume 107. If there is a thread about your problem, please use it. If not, create one with its number in the subject.

Moderator: Board moderators

BiK
Experienced poster
Posts: 104
Joined: Tue Sep 23, 2003 5:49 pm

### 10745 - Dominant Strings

Hello,

We tried solving the problem by reading the strings one after the other and maintaining a list of the currently nondominated strings. When a new string is read, the list is scanned and every member of the list that is dominated by the new string is removed from the list. If some of the strings in the list dominates the new one the scan stops and we read the next string. At the end of the scan, if none of the strings in the list dominates the new string, it is added to the list. However, this solution timed out. Could some of you post an idea for a faster solution and running time analysis if possible?

Greetings

marian
New poster
Posts: 30
Joined: Sun Oct 27, 2002 4:01 pm
Contact:
Hint: There are fewer strings one given string can dominate than total number of strings.

..
A great helper
Posts: 454
Joined: Thu Oct 18, 2001 2:00 am
Location: Hong Kong
I solve it in this way:
Read all the string.
Sort the strings in some way such that if string B < string A, string B never dominates string A.
Then check for all pair (string B, string A) if string A dominates string B.
My signature:
• Please make discussion about the algorithm BRFORE posting source code.
We can learn much more in discussion than reading source code.
• I HATE testing account.
• Don't send me source code for debug.

BiK
Experienced poster
Posts: 104
Joined: Tue Sep 23, 2003 5:49 pm
Hi,

.. : 10x, this is beutiful. And of course you have to look at all adjacent pairs, otherwsie you'll time out.

marian: I still don't get your hint.

marian
New poster
Posts: 30
Joined: Sun Oct 27, 2002 4:01 pm
Contact:
..: Aren't you checking O(n^2) pairs in the worst case (ie. no dominating strings)? Unless I didn't get your algorithm.

BiK: I was just saying that one string of length K can dominate at most 2^K-2 strings. Since K<=10, one string can dominate at most 1022 strings. So it's cheaper to generate all possible strings, given string can dominate and check if it is one of the given strings. If so, flag that string as non-dominating.

I used trie to represent input strings, and used recursive procedure to generate all dominated strings. So the complexity is O(N*2^K), which is better that O(N^2), because 2^K < N, in the worst case.

..
A great helper
Posts: 454
Joined: Thu Oct 18, 2001 2:00 am
Location: Hong Kong
marian wrote:..: Aren't you checking O(n^2) pairs in the worst case (ie. no dominating strings)? Unless I didn't get your algorithm.
Yes, but enough to get AC
marian wrote: I used trie to represent input strings, and used recursive procedure to generate all dominated strings. So the complexity is O(N*2^K), which is better that O(N^2), because 2^K < N, in the worst case.
Your alogrithm is much more interesting than mine.
But how can you check if the generated string is in the input quickly?
There are totally (N*2^K) generated string, are you generate all these strings and then sort them for checking?~
My signature:
• Please make discussion about the algorithm BRFORE posting source code.
We can learn much more in discussion than reading source code.
• I HATE testing account.
• Don't send me source code for debug.

BiK
Experienced poster
Posts: 104
Joined: Tue Sep 23, 2003 5:49 pm
marian: Thanks for your explanation. As far as I understand, you look at the strings as multisets of characters. And in your trie you do not distinguish between strings that give rise to the same multiset. Is that correct? For example you can sort the string lexicographically. And for a given multiset you keep the indeces of the input strings that produce the given multiset. So I assume that given an input string you generate the multisets dominated by that input string and mark them in the trie as non-dominating. And at the end you traverse your trie to find the multisets that are dominating. This is how I imagine your algorithm working. Am I correct?

..: I tried your approach and it timed out. If you got accepted, I think that it is exceptional and this is not the idea of the solution. Do you use pure C? I'm currios how you managed to get under the time limit with an O(N^2) solution.

..
A great helper
Posts: 454
Joined: Thu Oct 18, 2001 2:00 am
Location: Hong Kong
BiK wrote: ..: I tried your approach and it timed out. If you got accepted, I think that it is exceptional and this is not the idea of the solution. Do you use pure C? I'm currios how you managed to get under the time limit with an O(N^2) solution.
I used pure C. I tell more about how I optimize the algorithm.
For each string, I will store how many different kind of chars and the occurance of each kind in the string. So string A dominates string B only if number of kind in A >= number of kind in B.
Also, for string A, B, C, if I know string C is dominated by B, I will mark the string C dominated and won't check again if any other string A dominates C.

The above rules will not decrease the complexity, but it decreases the practical number of opertaions. If I remove these 2 rules of my AC code, the runtime will grow from 4s to 8s.
My signature:
• Please make discussion about the algorithm BRFORE posting source code.
We can learn much more in discussion than reading source code.
• I HATE testing account.
• Don't send me source code for debug.

marian
New poster
Posts: 30
Joined: Sun Oct 27, 2002 4:01 pm
Contact:
BiK wrote:marian: Thanks for your explanation. As far as I understand, you look at the strings as multisets of characters. And in your trie you do not distinguish between strings that give rise to the same multiset. Is that correct?
Exactly.
BiK wrote:For example you can sort the string lexicographically. And for a given multiset you keep the indeces of the input strings that produce the given multiset.
In fact, I'm keeping the opposite - for every string I keep the index of apropriate multiset. In the trie, I have indices of multisets.
BiK wrote:So I assume that given an input string you generate the multisets dominated by that input string and mark them in the trie as non-dominating. And at the end you traverse your trie to find the multisets that are dominating.
Almost, only I'm not traversing trie at the end, but flagging non-dominating multiset indices to some array.

Also, to optimize this solution a bit, I'm traversing a trie as I generate dominated strings (I have node pointer as a parameter of my recursive function). This way, I can reduce the number of generated strings, because I can stop generating, when the needed edge is not in the trie.

Marian

tat tvam asi
New poster
Posts: 30
Joined: Sat Nov 30, 2002 1:04 pm
to BiK
i think that the brute force
O(nn) approach mentioned by
Lawrence (aka ..) can easily get
accepted with the current input
and time limit. I used "pure" c
(no asm,register etc.) and O(nn).
csaba noszaly

BiK
Experienced poster
Posts: 104
Joined: Tue Sep 23, 2003 5:49 pm
think that the brute force O(nn) approach mentioned by Lawrence (aka ..) ...
Who is Lawrence?

..
A great helper
Posts: 454
Joined: Thu Oct 18, 2001 2:00 am
Location: Hong Kong
Me
My signature:
• Please make discussion about the algorithm BRFORE posting source code.
We can learn much more in discussion than reading source code.
• I HATE testing account.
• Don't send me source code for debug.

Michael Goldshteyn
New poster
Posts: 14
Joined: Tue Oct 19, 2004 2:01 am

### Bah

I, on the other hand, don't think that tries are the best data structure for this problem. Any data structure used for this problem has a construction penalty, especially once you break the 1 s barrier. I think an exhaustive search with good short circuiting heuristics goes best here combined with cheap to construct data structures.

What kept me inspired was the fact that the former first place implementation was written in PASCAL. Surely, I could not let this stand, to the embarassment of C++ programmers, everywhere .

Michael Goldshteyn
( 10745 - 0.129 s )

P.S. for those that still do not understand the problem statement, here is a simplified version:

Given strings A and B. B dominates A if BOTH conditions below are satisfied:

- Every distinct character in A also appears in B
- Any character that appears multiple times in A must appear at least that many times in B

So, a simple way of saying this is that a string B dominates a string A if A can be constructed using some or all of the characters from B.

Here are examples where the first string is dominated by the second:

abc bac (in fact, each dominates the other, so neither is a dominant string)
abc dcba
aabcc aaabccd
abccc aaabccddcc

Examples where the first string is NOT dominated by the second:

abc abdefg
aabbbcccc abbabcddcbce
Last edited by Michael Goldshteyn on Wed Oct 20, 2004 4:43 am, edited 1 time in total.

Whinii F.
Experienced poster
Posts: 151
Joined: Wed Aug 21, 2002 12:07 am
Location: Seoul, Korea
Contact:
Okay, it seems I'm the second C++ programmer who broke the pascal implementation.. Haha (With a very close shave of 0.002 second )

What I did was inserting all vectors of alphabet frequencies into a trie. By building the trie with the frequencies (instead of the letter itself) I think you can reduce the amount of exhaustive searches you will make.
JongMan @ Yonsei

wiltord
New poster
Posts: 16
Joined: Mon Oct 27, 2003 10:10 am
Location: SiChuan, China
I don't know how to implement the trie, But get AC with divide and conquer algorithm. Divide the string set to two parts, then gets the dominant strings in the two parts respectively with recursion, i call the results as candidate dominant strings of part1 and candidate dominant strings of part2. Then tests each candidate string of part1 by enumerate all the candidate strings of part2, if no candidate strings of part2 can dominate it, it is a dominant string of this set; and the same to candidate strings of part2. This method gets AC, but slow.
my goal: master algorithms