368 - Indexing Web Pages


ajeet.singh82
New poster
Posts: 6
Joined: Wed Oct 10, 2012 7:31 am

368 - Indexing Web Pages

Post by ajeet.singh82 »

Hi,

The problem description says we have to read from files rather than standard input, starting with index.htm:
The only HTML command you need to worry about is the HREF command, and you can assume that it will always be in the form <A HREF="filename">, with no additional spaces or other characters; that the name of the file is legal and in the same directory as the file you are already reading; and that the name of the file will not exceed twelve characters in length. Filenames will always end with ".htm".
In contrast, the sample input section says the files are given on standard input, one after another, separated by blank lines:
The initial HTML file you should start indexing will be named index.htm. Next the other files, including webpage.in, with a single blank line separating each listing. The words in webpage.in will be placed one word per line, with no additional spaces.
(Understanding the above is NP-hard :))

Now, if I go with the second interpretation, I am confused by the sample input, because there is no way to figure out the file name of each listing. See the second listing, which is somehow supposed to be layout.htm.

What is the correct input and output format?
Please clarify.
-A
brianfry713
Guru
Posts: 5947
Joined: Thu Sep 01, 2011 9:09 am
Location: San Jose, CA, USA

Re: 368 IO specification confusing

Post by brianfry713 »

You should only read from stdin and write to stdout, follow the sample I/O. The problem description was probably copied from a contest that required you to read from files, but on this website for this problem (and most if not all others) you need to read from stdin instead.

You always start by reading the contents of "index.htm" from stdin, while keeping a queue of all filenames that are referenced but have not yet been read. Once you read a blank line, move on to the next filename in the queue, and repeat until no files are left. Then you read in the words until EOF.
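
For anyone who wants to see the queue idea spelled out, here is a rough Python sketch of just the input-splitting step, not a full solution. The function name split_listing and the variable names are made up for illustration. It assumes that every link has exactly the form <A HREF="filename">, that the listings arrive in the order the filenames were first referenced, and that a blank line follows the last page before the words begin.

```python
import re
from collections import deque

# Links are assumed to appear exactly as <A HREF="filename">.
HREF = re.compile(r'<A HREF="([^"]+)">')

def split_listing(lines):
    """Split the combined input into pages (by filename) and query words.

    Hypothetical helper: returns (pages, words), where pages maps each
    filename to its list of lines and words is the trailing word list.
    """
    current = "index.htm"      # the first listing is always index.htm
    pages = {current: []}
    seen = {current}           # filenames already queued or read
    pending = deque()          # referenced but not yet read, in order
    it = iter(lines)
    for raw in it:
        line = raw.rstrip("\n")
        if line == "":
            if not pending:
                break          # no files left: the rest is the word list
            current = pending.popleft()
            pages[current] = []
            continue
        pages[current].append(line)
        for name in HREF.findall(line):
            if name not in seen:
                seen.add(name)
                pending.append(name)
    # Whatever remains on the iterator is one word per line, until EOF.
    words = [w.strip() for w in it if w.strip()]
    return pages, words
```

Passing sys.stdin as lines should work, since the function only iterates over it once. The seen set keeps a file that is linked from several pages from being queued twice.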
RandyWaterhouse
New poster
Posts: 4
Joined: Tue Dec 13, 2016 1:41 pm

Re: 368 - Indexing Web Pages

Post by RandyWaterhouse »

Thanks Brian, that was really helpful. The description had me confused; I actually tried to open the files "index.htm" et cetera (even though that would be highly unusual for UVA).