more on user statistics

xenon
Learning poster
Posts: 100
Joined: Fri May 24, 2002 10:35 am
Location: Scheveningen, Holland

Post by xenon »

Inspired by an earlier thread on this site by Caesum, I thought about how nice it would be to have more detailed user statistics. I've been enjoying solving contest problems for more than 2 months now, and the thing that I miss most is the possibility to link directly to problems I haven't solved yet. I'd also like to know how good my solutions are compared to others.
So I wrote my own program for it, and I must admit, it got a little out of hand. It directly reads the problemstat pages from the acm-host, and compiles detailed statistics for every problem it finds. The output is a set of HTML-pages.
An example of what it produces can be found here http://joachim.wulff.net/valladolid/userstat.html.
The program executable and the source files are contained in this zip-file http://joachim.wulff.net/valladolid/getstats.zip.
It runs in a DOS box under WIN98 and needs access to the internet. See the readme.txt for details.

The bad thing is that it has to read all problemstat data from the site, and that is a bulky 45 Megs, currently. Also it strongly depends on the current layout of the website.

The good thing is that it shows detailed information (at least the information I like to see), and it has options to do quicker partial scans per volume; you don't have to update all the data every time you use the program.

Well, I hope to get your comments. Btw: I don't supply support! And: I'm not responsible for anything that can go wrong, nor for heavy traffic on the ACM-host.

Caesum
Experienced poster
Posts: 225
Joined: Fri May 03, 2002 12:14 am
Location: UK

Post by Caesum »

Wow xenon, just what I need to add to my armory ;)

When I run it though I get as far as:

UHTMPAGE: connecting to acm.uva.es
UHTMPAGE: host connected
UHTMPAGE: accessing page /cgi-bin/OnlineJudge?ProblemsList
UHTMPAGE: bytes recieved 0

and that's it. Not sure why this happens. I'm on cable and my ISP implements a transparent proxy which is not always so transparent; my firewall is ZoneAlarm, and when it pops up I allow the program to access the site... can't think of any other possible reasons at the moment...

Caesum
Experienced poster
Posts: 225
Joined: Fri May 03, 2002 12:14 am
Location: UK

Post by Caesum »

The only thing that looks funny to me is the page request code:
[pascal]
WebRequest := 'GET '+pagename+' '+char(10)+char(13)+chr(0);
[/pascal]
Whenever I have done a page request in code before, I have always used
HTTP/1.0 at the end of the GET, and added a few more lines, like this:
GET /cgi-bin/OnlineJudge?ProblemsList HTTP/1.0
Accept: */*
Referer: http://acm.uva.es/
Accept-Language: en-gb
User-Agent: you_wish [en] (SomeO/S; Blah; Sowhat)
Host: acm.uva.es
Anyone else using this program?
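
A minimal sketch of the fuller request Caesum describes, written in Python rather than the Pascal of the original program. The helper name `build_request` and the header values are illustrative assumptions and not part of getstats; the sketch only shows that every line ends in CR LF and that one empty line closes the request.

```python
def build_request(path, host, headers=None):
    """Build an HTTP/1.0 GET request as bytes.

    Hypothetical helper for illustration; not taken from getstats.
    """
    lines = ["GET %s HTTP/1.0" % path, "Host: %s" % host]
    for name, value in (headers or {}).items():
        lines.append("%s: %s" % (name, value))
    # Every line ends with CR LF; one extra CR LF (an empty line)
    # tells the server or proxy that the request is complete.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


request = build_request(
    "/cgi-bin/OnlineJudge?ProblemsList",
    "acm.uva.es",
    {"Accept": "*/*"},
)
```

The resulting bytes can be handed to a socket's send call as they are.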

wyvmak
Experienced poster
Posts: 110
Joined: Thu Dec 13, 2001 2:00 am

Great

Post by wyvmak »

It's really good that you have this stats page, but I'm afraid I couldn't get a connection, with a proxy and on my Win2k. I don't know Pascal (I've forgotten it already), but it looks like it stopped after "UHTMPAGE: connecting to acm.uva.es"; maybe my internet connection is problematic by itself. It's really inspiring, truly. I have the following thoughts:
1. If the goal is to measure the difficulty of a problem, your use of the number of solvers seems a more accurate way.
2. If a user has the same name (or same display name?) as another user in the ranklist, there would be a problem (at least in determining the rank), wouldn't there?
3. You've filtered out judge-not-available problems; that's a good feature.
4. Using average running time for comparison seems a bit something to me (I cannot find the word), though I cannot think of an alternative.
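
On point 2, a small Python sketch of why duplicate display names are awkward: a lookup table keyed by name keeps only one entry per name. The rank-list rows and user ids below are made up for illustration and do not come from the real ranklist.

```python
from collections import defaultdict

# Hypothetical rank-list rows: (rank, user id, display name).
rows = [(1, 101, "alice"), (2, 202, "bob"), (3, 303, "alice")]

# Keyed by name, a later row silently overwrites an earlier one.
rank_by_name = {name: rank for rank, _uid, name in rows}

# Grouping user ids per name makes the ambiguous names detectable.
ids_by_name = defaultdict(list)
for _rank, uid, name in rows:
    ids_by_name[name].append(uid)
ambiguous = sorted(name for name, ids in ids_by_name.items() if len(ids) > 1)
```

Keying the statistics on the numeric user id instead of the name would avoid the ambiguity, assuming the pages expose that id.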

xenon
Learning poster
Posts: 100
Joined: Fri May 24, 2002 10:35 am
Location: Scheveningen, Holland

Hmmm

Post by xenon »

Caesum:
That's the problem with using code you don't thoroughly understand...
As far as I can see, based on the output you supply, the program issues a receive request, and waits forever. Recv() is really a call to 'recv' in WINSOCK.DLL, and it should time out after some seconds if the server can't deliver the requested page in time (afaik). My program should give an error message if this happens, so since it doesn't, I conclude that recv doesn't time out. Strange.
I don't think the format of the GET command is wrong. At least the acm host knows how to handle it. I can use Telnet to connect to acm.uva.es, port 80, type in 'get /cgi-bin/OnlineJudge?ProblemsList' and then get the page data all right. The remainder (HTTP/1.0 & data fields) is optional and is normally supplied by the browser, I think.
So maybe there's an intervening proxy along the route? I also use ZoneAlarm, but have a static IP. I'll try my program using a modem, and see what happens.
I'm clearly in the dark here :-? If I use my program with your userid, it works OK.

wyvmak:
2. I didn't think of that before, but you're right. I guess it'll use the last occurrence in every list for its statistics. I'm not sure ACM accepts duplicate names, but it might. Bad luck.
4. Well, it's just something, and that's all it is. Love it or leave it. I think the median value, or the average of the middle 50%, would be better values to compare with (some problems have improbable extremes, like 0 secs for #333, and times above 30 secs), but I don't care too much.
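
To make the comparison in point 4 concrete, here is a Python sketch of the three candidate values: plain average, median, and average of the middle 50%. The running times are made-up numbers with one improbable extreme at each end, and `middle_half_mean` is a hypothetical helper name.

```python
from statistics import mean, median


def middle_half_mean(times):
    """Mean of the middle 50%: drop the lowest and highest quarter."""
    ordered = sorted(times)
    cut = len(ordered) // 4
    return mean(ordered[cut:len(ordered) - cut])


# Made-up running times in seconds, including 0 s and 30 s extremes.
times = [0.0, 1.0, 1.2, 1.4, 1.5, 1.6, 1.8, 30.0]

plain_average = mean(times)            # pulled upward by the 30 s outlier
middle_value = median(times)           # unaffected by the extremes
trimmed_average = middle_half_mean(times)
```

With data like this, the plain average lands well above every typical time, while the median and the middle-50% average stay close to each other.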

If anybody can help me find some code to read webpages more reliably (C, C++, Pascal, Assembler), then I would be much obliged.

-xenon
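
One possible direction for the "more reliable" page reading asked for above, sketched in Python rather than the languages listed. A blocking receive normally waits indefinitely unless a timeout is set explicitly, so the sketch sets one before reading. The function name `read_until_closed` and the default values are assumptions for illustration.

```python
import socket


def read_until_closed(sock, timeout=10.0, chunk_size=4096):
    """Read from a connected socket until the peer closes it.

    Raises socket.timeout if no data arrives within `timeout` seconds,
    instead of blocking forever.
    """
    sock.settimeout(timeout)
    chunks = []
    while True:
        data = sock.recv(chunk_size)
        if not data:  # an empty read means the peer closed the connection
            break
        chunks.append(data)
    return b"".join(chunks)
```

The caller can catch `socket.timeout` and print an error message, which is the behaviour the program was expected to show when the page never arrives.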

Adrian Kuegel
Guru
Posts: 724
Joined: Wed Dec 19, 2001 2:00 am
Location: Germany

Post by Adrian Kuegel »

I have used your program, and it did work. But my first try was not successful (it was at 22.00 judge time). I think the best time to use this program is after 0.00 judge time.

AlexandreN
New poster
Posts: 27
Joined: Sun Jul 07, 2002 6:46 pm
Location: Campina Grande - Brazil

Post by AlexandreN »

I have used your program and got the output below: :(

UHTMPAGE: connecting to acm.uva.es
UHTMPAGE: host connected
UHTMPAGE: accessing page /cgi-bin/OnlineJudge?ProblemsList
UHTMPAGE: bytes recieved 0
UHTMPAGE: page read complete
UHTMPAGE: closing host... done

SERIOUS ERROR: Problems list not found on host
Program halting.

xenon
Learning poster
Posts: 100
Joined: Fri May 24, 2002 10:35 am
Location: Scheveningen, Holland

possible bug, new version

Post by xenon »

Thanks Adrian, at least it works sometimes...

Caesum:
I think I found a possible bug. As your quote indicates, an HTTP request can be a multi-line package, so it needs a way to indicate the end of the request. Most probably this is done by adding an empty line. (I checked the junkbuster source code (a great source for wannabe sockets programmers) and they always end their requests with an extra CRLF.)
So I changed my code:
[pascal]
WebRequest := 'GET '+pagename+' '+char(10)+char(13)+char(10)+char(13)+chr(0);
[/pascal]
I recompiled and put the new version at the link given above. Would you be so kind as to download it and test it?
The reason the previous version worked here and not from your PC could be the 'transparent' proxy. The ACM host just times out waiting for the extra empty line and sends the requested data anyway. Your proxy, however, waits forever for the end-of-request signal before sending it through to the Judge host. Sounds plausible?
Anyway, I'm anxiously awaiting your results.
Re the 'HTTP/1.0' addition to the request: my wild guess is that it makes the server send a reply header (with date, checksum, server version, etc.) prepended to the page data. We don't need them, so we don't ask for them. I don't get them anyway without the 'HTTP/1.0' addition.
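
A side note on byte order, sketched in Python: character 13 is CR and character 10 is LF, so `char(10)+char(13)` as posted produces LF CR rather than CR LF. The check below shows that the doubled terminator still contains an LF CR LF run, which a parser that accepts a bare LF as a line end could read as an empty line. Whether the proxy or the ACM host actually parses requests this way is an assumption; the thread does not confirm it. The trailing `chr(0)` of the Pascal string is left out here.

```python
CR, LF = chr(13), chr(10)

path = "/cgi-bin/OnlineJudge?ProblemsList"

# Terminator as in the changed code above: LF CR LF CR.
posted = "GET " + path + " " + LF + CR + LF + CR

# CR LF pairs, as the HTTP specification writes line ends.
standard = "GET " + path + " HTTP/1.0" + CR + LF + CR + LF


def has_blank_line_lenient(request):
    """True if an empty line appears when CR is ignored and LF ends a line."""
    return "\n\n" in request.replace("\r", "")
```

Under this lenient reading both forms end in an empty line, while the earlier single `char(10)+char(13)` terminator does not.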

AlexandreN

AlexandreN
New poster
Posts: 27
Joined: Sun Jul 07, 2002 6:46 pm
Location: Campina Grande - Brazil

Post by AlexandreN »

Yes, my id is 3590, it seems like the system cannot read /cgi-bin/OnlineJudge?ProblemsList

D:\acm\util>getstats 3590 0
UHTMPAGE: connecting to acm.uva.es
UHTMPAGE: host connected
UHTMPAGE: accessing page /cgi-bin/OnlineJudge?ProblemsList
UHTMPAGE: bytes recieved 0
UHTMPAGE: page read complete
UHTMPAGE: closing host... done

SERIOUS ERROR: Problems list not found on host
Program halting.

Ivor
Experienced poster
Posts: 150
Joined: Wed Dec 26, 2001 2:00 am
Location: Tallinn, Estonia

Post by Ivor »

I don't know what's wrong for other people, but I just got my general info without any problems. Works fine, looks fine. Thanks.

Ivor

Caesum
Experienced poster
Posts: 225
Joined: Fri May 03, 2002 12:14 am
Location: UK

Post by Caesum »

Xenon,

Yes! Working now. As you can see, my ISP's transparent proxy is not very transparent (and this isn't the only instance where its lack of transparency shows up :( )

and completed in between 10 and 15 minutes :)

wyvmak
Experienced poster
Posts: 110
Joined: Thu Dec 13, 2001 2:00 am

Post by wyvmak »

I don't know why, but I still cannot use it. Therefore I wrote my own version, with fewer features. It's slow, plus I put some sleep() calls in the code.

source code at:
http://www.zdtech.net/~vincent/acm_stat.cpp
I'm not sure how long I'll keep it up, sorry. The code is a bit naive; it works under Linux, and I think it won't work behind a proxy or firewall (though I haven't tested that). Also, any comments on my code would be welcome.
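
The idea of putting sleep() between requests can be sketched as follows, in Python. This is not taken from acm_stat.cpp; `fetch_all`, its parameters, and the one-second default delay are hypothetical, and `fetch` stands for whatever function downloads a single page.

```python
import time


def fetch_all(pages, fetch, delay=1.0, sleep=time.sleep):
    """Call fetch(page) for each page, pausing `delay` seconds between calls."""
    results = {}
    for index, page in enumerate(pages):
        if index > 0:
            sleep(delay)  # pause between requests, not before the first one
        results[page] = fetch(page)
    return results
```

Passing `sleep` as a parameter keeps the pause easy to replace in a test, and the delay trades scan time for a lighter load on the host.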

my stats at:
http://www.zdtech.net/~vincent/acm.txt

It looks like most of my rankings aren't high (which is quite disappointing to me).

xenon
Learning poster
Posts: 100
Joined: Fri May 24, 2002 10:35 am
Location: Scheveningen, Holland

Post by xenon »

Well Caesum, I'm glad it works now. 't Was the extra CRLF which, I guess, is part of the official HTTP standard :)
Your proxy might not be that transparent, but at least your connection is twice as fast as mine: it takes me 30 mins for a full scan.

wyvmak: My code will never work under Linux, since it uses WINSOCK.DLL and SOCKETS.DLL. I guess a port is possible, in principle, because Free Pascal also comes for Linux. Looking at your code, there is not much difference in the way Internet is accessed, so the adjustments will be small. I currently have no Linux installed, so I won't be able to do it myself.

Enjoy,
-xenon

wyvmak
Experienced poster
Posts: 110
Joined: Thu Dec 13, 2001 2:00 am

Post by wyvmak »

actually, i tried your code on my win2k, not on Linux.

>getstats 5656
UHTMPAGE: connecting to acm.uva.es
UHTMPAGE: host connected
UHTMPAGE: accessing page /cgi-bin/OnlineJudge?ProblemsList
UHTMPAGE: bytes recieved 0

Then I waited for a minute, but it still stayed there. Do you know why? Or am I just too impatient? I write on Linux, as I'm not good at sockets programming, and comparatively Linux seems an easier platform for me to write on.

xenon
Learning poster
Posts: 100
Joined: Fri May 24, 2002 10:35 am
Location: Scheveningen, Holland

Post by xenon »

AFAIK you shouldn't have to wait more than a few seconds; otherwise the program is stuck waiting for the page. Are you sure you have the latest version of my program (as uploaded yesterday)? I fixed a small bug to cope with some proxies, and as a side effect it performs twice as fast (if your internet connection is fast enough).
I just tested the program under win2k, and it works fine.

I'm off on holiday now,
-xenon
