Accessing information in a webpage

Write here if you have problems with your C++ source code

Moderator: Board moderators

shamim
A great helper
Posts: 498
Joined: Mon Dec 30, 2002 10:10 am
Location: Bozeman, Montana, USA

Post by shamim » Thu Feb 10, 2005 9:33 am

Does anyone know how to get information from a certain page using Visual C++ code or a similar script?

For example, I want to write code that will get the AC number of some UserID by accessing the required pages.
Thanks.

little joey
Guru
Posts: 1080
Joined: Thu Dec 19, 2002 7:37 pm

Post by little joey » Thu Feb 10, 2005 2:51 pm

Well, this kind of stuff strongly depends on:
- the operating system you use,
- the language you code in,
- the compiler you use (and its version).

A C program that works under Linux and compiles with gcc (version 3.x, I think) is:

Code:

#include <stdio.h>
#include <ctype.h>
#include <string.h>

#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>
#include <unistd.h>   /* declares close() */

int get_htmlpage(char *buffer, char *server, char *page){
    struct sockaddr_in server_info;
    struct hostent *host_info;
    unsigned long addr;
    int sock;
    static char request[1024],readbuffer[8192];
    int count,result,i;

    /* create socket */
    sock=socket(PF_INET,SOCK_STREAM,0);
    if(sock<0){
       sprintf(buffer,"*** SOCKET CREATION FAILED ***");
       return -1;
       }
       
    /* set up connection to server */
    memset(&server_info,0,sizeof(server_info));
    host_info=gethostbyname(server);
    if(host_info==NULL){
       sprintf(buffer,"*** UNKNOWN SERVER \"%s\" ***",server);
       close(sock);   /* don't leak the socket on failure */
       return -1;
       }
    memcpy((char *)&server_info.sin_addr,host_info->h_addr,host_info->h_length);
    server_info.sin_family=AF_INET;
    server_info.sin_port=htons(80);
    
    /* connect to server */
    if(connect(sock,(struct sockaddr*)&server_info,sizeof(server_info))<0){
       sprintf(buffer,"*** CANNOT CONNECT TO SERVER \"%s\" ***",server);
       close(sock);   /* don't leak the socket on failure */
       return -1;
       }
       
    /* send the request; HTTP header lines end in CRLF */
    snprintf(request,sizeof(request),"GET %s HTTP/1.0\r\nHost: %s\r\n\r\n",page,server);
    send(sock,request,strlen(request),0);
    
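    /* read the server response into buffer (caller must make it big enough) */
    result=0;
    while((count=recv(sock,readbuffer,sizeof(readbuffer),0))>0){
       for(i=0;i<count;i++) *(buffer++)=readbuffer[i];
       result+=count;   /* only count bytes actually received, never -1 */
       }
    *buffer='\0';       /* terminate so the caller can use strstr() */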
    
    /* close the socket and return bytes read */
    close(sock);
    return result;
    }
    
int main(){
   static char buffer[128*1024];
   int bytes;
   char pagename[256],hostname[256];
   int userid=26795;
   char *bufptr;
   int solved;

   strcpy(hostname,"acm.uva.es");
   sprintf(pagename,"/cgi-bin/OnlineJudge?AuthorInfo:%d",userid);
   
   bytes=get_htmlpage(buffer,hostname,pagename);
   if(bytes<0){ /* error */
      printf("%s\n",buffer);
      return 1;
      }
   
   bufptr=strstr(buffer,"solved problems:");
   if(bufptr==NULL){ /* error */
      printf("*** UNABLE TO DECODE AUTHORINFO PAGE ***\n");
      return 1;
      }
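   /* step back to the number in front of the marker, staying inside the buffer */
   while(bufptr>buffer && !isdigit((unsigned char)*bufptr)) bufptr--;
   while(bufptr>buffer && isdigit((unsigned char)*bufptr)) bufptr--;
   if(!isdigit((unsigned char)*bufptr)) bufptr++;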
   solved=0;
   while(isdigit(*bufptr)){
      solved=10*solved+*bufptr-'0';
      bufptr++;
      }
      
   printf("USER %d SOLVED %d PROBLEMS\n",userid,solved);
   return 0;
   }

It's very quick-and-dirty. To make it work in another environment you will need different includes at the very least, and the functions and structures may differ as well. Google is your best friend.
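
The number extraction in main() can also be pulled out into a function that is testable without any network access. What follows is only a sketch, assuming (as the program above does) that the count appears somewhere before the text "solved problems:" on the page. The name number_before is made up for this example.

```c
#include <ctype.h>
#include <string.h>

/* Returns the non-negative integer that precedes `marker` in `page`,
   or -1 if the marker is missing or no number comes before it.
   (Hypothetical helper, not part of the original program.) */
int number_before(const char *page, const char *marker){
    const char *p=strstr(page,marker);
    const char *end;
    int value=0;

    if(p==NULL) return -1;

    /* walk back to just after the last digit, never past the start */
    while(p>page && !isdigit((unsigned char)p[-1])) p--;
    if(p==page) return -1;      /* no digit anywhere before the marker */
    end=p;

    /* walk back over the digits; p ends up on the first one */
    while(p>page && isdigit((unsigned char)p[-1])) p--;

    /* convert the digit run to an int */
    while(p<end){
        value=10*value+(*p-'0');
        p++;
    }
    return value;
}
```

With this, the tail of main() would reduce to a call like number_before(buffer,"solved problems:") followed by a check for -1.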

One very important request:

PLEASE DON'T FLOOD UVA WITH HTTP REQUESTS CAUSED BY BUGGY PROGRAMS!!!

We're all dependent on the UVA host having sufficient bandwidth. One noob with a stupid program can spoil the fun for all of us!
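
One simple guard against accidental flooding is to enforce a minimum pause between requests. This is a minimal sketch for a POSIX system. The 5-second interval is an arbitrary assumption, and the names seconds_to_wait and wait_for_turn are made up for this example.

```c
#include <time.h>
#include <unistd.h>

#define MIN_INTERVAL 5   /* seconds between requests; an arbitrary assumption */

/* How many seconds must still pass before the next request is allowed. */
int seconds_to_wait(time_t last_request, time_t now){
    double elapsed=difftime(now,last_request);
    if(elapsed>=MIN_INTERVAL) return 0;
    if(elapsed<0) return MIN_INTERVAL;   /* clock went backwards: wait in full */
    return MIN_INTERVAL-(int)elapsed;
}

/* Sleep if needed, then record the time of the request about to be made. */
void wait_for_turn(time_t *last_request){
    int wait=seconds_to_wait(*last_request,time(NULL));
    if(wait>0) sleep(wait);
    *last_request=time(NULL);
}
```

A program that fetches several pages would keep one time_t variable, initialised to 0, and call wait_for_turn(&last) before each get_htmlpage() call. A loop that goes wrong then sends at most one request every few seconds.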

Post by shamim » Fri Feb 11, 2005 8:39 am

Thanks a lot, it does work...

Although there was a compile error indicating that the close function is not defined, I just commented that line out and it worked fine.
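
On POSIX systems close() is declared in <unistd.h>. Winsock on Windows has no close() for sockets and uses closesocket() instead. With the call commented out, the socket stays open until the program exits. That is harmless in a short program like this one, but not in one that fetches many pages. Here is a minimal sketch of a wrapper that picks the right call. The names close_socket and socket_t are made up for this example, and the Windows branch assumes Winsock has already been initialised elsewhere.

```c
#ifdef _WIN32
#include <winsock2.h>
typedef SOCKET socket_t;
#else
#include <sys/socket.h>
#include <unistd.h>
typedef int socket_t;
#endif

/* Close a socket with whichever call the platform provides.
   Returns 0 on success, non-zero on failure. */
int close_socket(socket_t sock){
#ifdef _WIN32
    return closesocket(sock);
#else
    return close(sock);
#endif
}
```

In get_htmlpage() the line close(sock); would then become close_socket(sock);.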
