It indeed is possible to determine the k-th smallest value of a set S in O(|S|) time without using counting sort:
Code: Select all
select(S, k)
{
if |S| <= LIMIT
{
sort S
return k-th element of S
}
split S into groups of 5 (possibly creating one group with less than 5)
B = medians of these groups (each in constant time)
Pivot = select(B, |B| / 2);
partition S into S1 (< Pivot), S2 (= Pivot), and S3 (> Pivot)
if (k <= |S1|)
return select(S1, k);
if (k > |S1| + |S2|)
return select(S3, k - |S1| - |S2|);
return Pivot;
}
The value of LIMIT depens on the sorting algorithm used, usually it is something around 15. The algorithm itself is very similar to quicksort, but only one partition needs to be sorted in each recursion step. It can easily be transformed into an iterative version.
The special choice of the pivot element ensures the linear time need. The proof is based on how many elements are definitely larger or smaller than the pivot, limiting the size of S1 and S3. Choosing an arbitrary element as pivot would allow |S3| = |S| - 1, resulting in quadratic worst-case behaviour.
I got TLE when trying to implement the algorithm, probably due to the required overhead.
My current solution uses something similar to counting sort, which is a lot faster. Does anyone know how the three sub-second solutions work?