We're implementing in the new system a mod to give a problem an index (based in statistics like submissions, %AC, etc.) indicating how difficult it is to solve. I know there had already been some external tries about this and we would like to know how you did them.
We would also divide problems in 4 categories: easy, medium, hard, crazy depending on the value of this index.
All contributions and suggestions are highly appreciated.
Problem hardness index
Moderator: Board moderators

 System administrator
 Posts: 1286
 Joined: Sat Oct 13, 2001 2:00 am
 Location: Valladolid, Spain
 Contact:
Problem hardness index
DON'T PM ME > For any doubt, suggestion or error reporting, please use the "Contact us" form in the web.
I would agree with misof. There are problems that are hard to solve but once you figure out a solution, it is easy to implement. And then there are problems that are easy to solve, but implementation is tricky. I would think that the latter ones are harder. But not always, see A vs J during Worlds  A was easy to solve, but you had to pay attention to details. J, OTOH, had a solution that was hard to come up with, but, once you got it (or someone showed you what it was, like in my case), implementation was easy.
Of course, there are the problems that are combination of the two and then there are the ones that contain several subproblems of various difficulty.
I think it is really a subjective call. Maybe you can have a panel of experinced setters/solvers and they could agree on a difficulty. And I mean people that both set and solve problems a lot, like misof. Of course, that would mean that they had some time to volunteer for that project, which is kinda hard. And  they should weed out the bad problems in the process, which would make their job even harder.
If you go by submitions, or ACs, like Felix does, sometimes it really does not reflect their difficulty. People have local contests, so some problems get solved more. Or some people try to improve their times and submit a lot (with a lot of AC/WA in the process). There is a difference between problems posted 9 years ago and last month, the problems that are in the Programming Challenges book are probably solved more, etc.
Of course, there are the problems that are combination of the two and then there are the ones that contain several subproblems of various difficulty.
I think it is really a subjective call. Maybe you can have a panel of experinced setters/solvers and they could agree on a difficulty. And I mean people that both set and solve problems a lot, like misof. Of course, that would mean that they had some time to volunteer for that project, which is kinda hard. And  they should weed out the bad problems in the process, which would make their job even harder.
If you go by submitions, or ACs, like Felix does, sometimes it really does not reflect their difficulty. People have local contests, so some problems get solved more. Or some people try to improve their times and submit a lot (with a lot of AC/WA in the process). There is a difference between problems posted 9 years ago and last month, the problems that are in the Programming Challenges book are probably solved more, etc.

 System administrator
 Posts: 1286
 Joined: Sat Oct 13, 2001 2:00 am
 Location: Valladolid, Spain
 Contact:
Ehm.....I think I didn't express the idea clearly. I don't (still) want an expert rating on how difficult the problem is. I want some autogenerated numerical index that show in some way how popular, hard to solve once you know how, etc, a problem is.
DON'T PM ME > For any doubt, suggestion or error reporting, please use the "Contact us" form in the web.

 Guru
 Posts: 1080
 Joined: Thu Dec 19, 2002 7:37 pm
Some time ago I calculated a difficulty scale based on how many people from the top 100 of the rank list solved a particular problem. The ranges are somewhat arbitrary, but it looked like this:
010 solvers: very difficult
1130 solvers: difficult
3170 solvers: intermediate
7190 solvers: easy
91100 solvers: very easy
It was generated automatically by reading the appropriate HTML pages, but can also be calculated from the database directly, of course. The choice to use the top 100 only was based on the idea that they form a relatively stable group of 'regular' users who at least try to solve the majority of the available problems and in their selection, on average, solve the easiest problems first. This system overrates newly added problems, because it doesn't discriminate between 'active' and 'nonactive' users. But over time it gives a reasonable estimate of the 'true' difficulty of a problem, I think.
010 solvers: very difficult
1130 solvers: difficult
3170 solvers: intermediate
7190 solvers: easy
91100 solvers: very easy
It was generated automatically by reading the appropriate HTML pages, but can also be calculated from the database directly, of course. The choice to use the top 100 only was based on the idea that they form a relatively stable group of 'regular' users who at least try to solve the majority of the available problems and in their selection, on average, solve the easiest problems first. This system overrates newly added problems, because it doesn't discriminate between 'active' and 'nonactive' users. But over time it gives a reasonable estimate of the 'true' difficulty of a problem, I think.
The biggest problem with most problems is not how to solve the problem, but how to not solve what is not the problem.
I use <Accepted Submission>*<Accepted Percent>^2.
This way problems that have high accept counts just because they've been around forever, but are actually difficult (low percentage) will be brought down. And if a problem has low accept percent, chances are I too have to do it over a few times, which makes the problem hard for me (since I have to find bug without knowing the test case when it fails).
I found that the problems with 95%+ Accept Percentage are almost guaranteed to be easy, however those are hard to come by nowadays.
Anyhow, here's a few issues I find about getting statistics base on submission results:
(1) Some problems have high success rate not because the problem itself is easy, but rather the solution is already posted online (either this forum, or TopCoder, etc.) So, if I go read the forum I'll know how to do the problem, in that sense you can say the problem is easy. But as far as practice is concerned, I really use the forum as a second to last resort, because you can't get your problem solving ability up if you just read algorithms and code them (instead of coming up with them).
(2) Of course, there is the issue of when and where the problem is set. Problems in Vol 1 are more likely to be solved even if they are hard. Problems that are made recently will have less AC submissions. Problems in Vol 9 are less lately to have visitors. But I believe those can be factored into a more complicated formula.
This way problems that have high accept counts just because they've been around forever, but are actually difficult (low percentage) will be brought down. And if a problem has low accept percent, chances are I too have to do it over a few times, which makes the problem hard for me (since I have to find bug without knowing the test case when it fails).
I found that the problems with 95%+ Accept Percentage are almost guaranteed to be easy, however those are hard to come by nowadays.
Anyhow, here's a few issues I find about getting statistics base on submission results:
(1) Some problems have high success rate not because the problem itself is easy, but rather the solution is already posted online (either this forum, or TopCoder, etc.) So, if I go read the forum I'll know how to do the problem, in that sense you can say the problem is easy. But as far as practice is concerned, I really use the forum as a second to last resort, because you can't get your problem solving ability up if you just read algorithms and code them (instead of coming up with them).
(2) Of course, there is the issue of when and where the problem is set. Problems in Vol 1 are more likely to be solved even if they are hard. Problems that are made recently will have less AC submissions. Problems in Vol 9 are less lately to have visitors. But I believe those can be factored into a more complicated formula.
To my mind, the best solution must combine the two approaches discussed so far. That is, the final rating o "hardness index" of a problem should merge two complementary units of measurement:
1) Statistics. It's quite obvious that the percentage of people who have succesfully solved the problem is a critical factor to determine how hard a problem is. This part can be implemented in several ways: a good algorithm must take into account, e.g., how many users from the top 100 solved the problem (as somebody said above), how many submissions in average took the users to solve the problem, and things like that. Nonetheless, statistics alone might not capture subtleties that only users can find by themselves, like a tricky I/O, a hardtounderstand but easytoimplement algorithm (or vice versa), a poor description of the problem, and so on.
2) Users' ratings. Opinion from users is also valuable. However, any newcome user/programmer can rate an easy problem as a hard problem just because of his short problem solving background.
Both points of view must be implemented carefully, but both are necessary. The point is, which proportion should be the fairest, 50%50%, 60%40%?
1) Statistics. It's quite obvious that the percentage of people who have succesfully solved the problem is a critical factor to determine how hard a problem is. This part can be implemented in several ways: a good algorithm must take into account, e.g., how many users from the top 100 solved the problem (as somebody said above), how many submissions in average took the users to solve the problem, and things like that. Nonetheless, statistics alone might not capture subtleties that only users can find by themselves, like a tricky I/O, a hardtounderstand but easytoimplement algorithm (or vice versa), a poor description of the problem, and so on.
2) Users' ratings. Opinion from users is also valuable. However, any newcome user/programmer can rate an easy problem as a hard problem just because of his short problem solving background.
Both points of view must be implemented carefully, but both are necessary. The point is, which proportion should be the fairest, 50%50%, 60%40%?