Page 1 of 1
Problem hardness index
Posted: Wed Jul 18, 2007 12:23 am
by Carlos
We're implementing in the new system a mod to give a problem an index (based in statistics like submissions, %AC, etc.) indicating how difficult it is to solve. I know there had already been some external tries about this and we would like to know how you did them.
We would also divide problems in 4 categories: easy, medium, hard, crazy depending on the value of this index.
All contributions and suggestions are highly appreciated.
Posted: Wed Jul 18, 2007 1:04 am
by misof
There's more to difficulty than a linear scale can show. For example, the average number of submits a user makes before getting AC somehow measures how "tricky" the problem is.
Posted: Wed Jul 18, 2007 7:17 am
by Darko
I would agree with misof. There are problems that are hard to solve but once you figure out a solution, it is easy to implement. And then there are problems that are easy to solve, but implementation is tricky. I would think that the latter ones are harder. But not always, see A vs J during Worlds - A was easy to solve, but you had to pay attention to details. J, OTOH, had a solution that was hard to come up with, but, once you got it (or someone showed you what it was, like in my case), implementation was easy.
Of course, there are the problems that are combination of the two and then there are the ones that contain several subproblems of various difficulty.
I think it is really a subjective call. Maybe you can have a panel of experinced setters/solvers and they could agree on a difficulty. And I mean people that both set and solve problems a lot, like misof. Of course, that would mean that they had some time to volunteer for that project, which is kinda hard. And - they should weed out the bad problems in the process, which would make their job even harder.
If you go by submitions, or ACs, like Felix does, sometimes it really does not reflect their difficulty. People have local contests, so some problems get solved more. Or some people try to improve their times and submit a lot (with a lot of AC/WA in the process). There is a difference between problems posted 9 years ago and last month, the problems that are in the Programming Challenges book are probably solved more, etc.
Posted: Wed Jul 18, 2007 2:25 pm
by Carlos
Ehm.....I think I didn't express the idea clearly. I don't (still) want an expert rating on how difficult the problem is. I want some auto-generated numerical index that show in some way how popular, hard to solve once you know how, etc, a problem is.
Posted: Wed Jul 18, 2007 4:49 pm
by little joey
Some time ago I calculated a difficulty scale based on how many people from the top 100 of the rank list solved a particular problem. The ranges are somewhat arbitrary, but it looked like this:
0-10 solvers: very difficult
11-30 solvers: difficult
31-70 solvers: intermediate
71-90 solvers: easy
91-100 solvers: very easy
It was generated automatically by reading the appropriate HTML pages, but can also be calculated from the database directly, of course. The choice to use the top 100 only was based on the idea that they form a relatively stable group of 'regular' users who at least try to solve the majority of the available problems and in their selection, on average, solve the easiest problems first. This system over-rates newly added problems, because it doesn't discriminate between 'active' and 'non-active' users. But over time it gives a reasonable estimate of the 'true' difficulty of a problem, I think.
Posted: Fri Jul 20, 2007 7:01 am
by yiuyuho
I use <Accepted Submission>*<Accepted Percent>^2.
This way problems that have high accept counts just because they've been around forever, but are actually difficult (low percentage) will be brought down. And if a problem has low accept percent, chances are I too have to do it over a few times, which makes the problem hard for me (since I have to find bug without knowing the test case when it fails).
I found that the problems with 95%+ Accept Percentage are almost guaranteed to be easy, however those are hard to come by nowadays.
Anyhow, here's a few issues I find about getting statistics base on submission results:
(1) Some problems have high success rate not because the problem itself is easy, but rather the solution is already posted online (either this forum, or TopCoder, etc.) So, if I go read the forum I'll know how to do the problem, in that sense you can say the problem is easy. But as far as practice is concerned, I really use the forum as a second to last resort, because you can't get your problem solving ability up if you just read algorithms and code them (instead of coming up with them).
(2) Of course, there is the issue of when and where the problem is set. Problems in Vol 1 are more likely to be solved even if they are hard. Problems that are made recently will have less AC submissions. Problems in Vol 9 are less lately to have visitors. But I believe those can be factored into a more complicated formula.
Posted: Sun Jul 29, 2007 9:31 pm
by A1
there could be some voting or rating system!! I mean every solver will rate a problem on its difficulty (maybe from 1 to 10) and offcourse the average will show real difficulty level!
Posted: Wed Aug 01, 2007 9:16 pm
by mpi
To my mind, the best solution must combine the two approaches discussed so far. That is, the final rating o "hardness index" of a problem should merge two complementary units of measurement:
1) Statistics. It's quite obvious that the percentage of people who have succesfully solved the problem is a critical factor to determine how hard a problem is. This part can be implemented in several ways: a good algorithm must take into account, e.g., how many users from the top 100 solved the problem (as somebody said above), how many submissions in average took the users to solve the problem, and things like that. Nonetheless, statistics alone might not capture subtleties that only users can find by themselves, like a tricky I/O, a hard-to-understand but easy-to-implement algorithm (or vice versa), a poor description of the problem, and so on.
2) Users' ratings. Opinion from users is also valuable. However, any newcome user/programmer can rate an easy problem as a hard problem just because of his short problem solving background.
Both points of view must be implemented carefully, but both are necessary. The point is, which proportion should be the fairest, 50%-50%, 60%-40%?