I solved it with multinomial coefficients -- i.e. without any DP except for a set to store some duplicating bitmasks.
Got AC in 1.560. I now wonder how those fellows solve it in 0.000?
We just enumerate all the bitmasks, and if the corresponding sum of digits is divisible by 5, we form all permutations with repetitions... But here two different masks may represent one and the same subset, and here sets come into play.