
368 - Indexing Web Pages

Posted: Fri Nov 23, 2012 2:56 pm
by ajeet.singh82

The problem description says we have to read from files rather than standard input, starting with index.htm:
The only HTML command you need to worry about is the HREF command, and you can assume that it will always be in the form <A HREF="filename">, with no additional spaces or other characters; that the name of the file is legal and in the same directory as the file you are already reading; and that the name of the file will not exceed twelve characters in length. Filenames will always end with ".htm".
In contrast, the sample input section says the files are given on standard input, each listing separated by a blank line:
The initial HTML file you should start indexing will be named index.htm. Next the other files, including, with a single blank line separating each listing. The words in will be placed one word per line, with no additional spaces.
(Understanding the above is NP-hard :) )

Now, if I go with the second interpretation, the sample input confuses me, because there is no way to figure out the file name of each listing. See the second listing, which is somehow supposed to be layout.htm.

What is the correct input and output format?
Please clarify.

Re: 368 IO specification confusing

Posted: Fri Nov 30, 2012 10:26 am
by brianfry713
You should only read from stdin and write to stdout, follow the sample I/O. The problem description was probably copied from a contest that required you to read from files, but on this website for this problem (and most if not all others) you need to read from stdin instead.

You always start by reading the contents of "index.htm" from stdin, and you keep a queue of all filenames that are referenced but have not yet been read. Once you read a blank line, move on to the next filename in the queue, until no files are left. After that, you read in the words until EOF.
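
The queue-based reading described above could be sketched roughly as follows in Python. This is only an illustration of the input parsing, not a full solution: the function name split_input and the returned structures are made up for the example, and it assumes every listing (including the last one) ends with a blank line and that listings appear in the order their filenames were first referenced.

```python
import re
from collections import deque

# The problem statement guarantees the exact form <A HREF="filename">.
HREF = re.compile(r'<A HREF="([^"]+)">')

def split_input(lines):
    """Split the input into per-file listings and the trailing word list.

    Hypothetical helper: returns (pages, words), where pages maps each
    filename to its lines and words is the list of words after the files.
    """
    lines = iter(lines)              # one shared iterator, consumed in order
    queue = deque(["index.htm"])     # referenced files not yet read
    seen = {"index.htm"}             # every filename ever queued
    pages = {}
    while queue:
        name = queue.popleft()
        body = []
        for line in lines:
            line = line.rstrip("\r\n")
            if line.strip() == "":   # a blank line ends the current listing
                break
            body.append(line)
            for target in HREF.findall(line):
                if target not in seen:
                    seen.add(target)
                    queue.append(target)
        pages[name] = body
    # Whatever remains after the last listing is the word list, one per line.
    words = [w.strip() for w in lines if w.strip()]
    return pages, words
```

To run it on the judge input, something like pages, words = split_input(sys.stdin) (after import sys) should work. Nothing in the input names the listings; the filenames come only from the order in which the HREFs are discovered, which is why the second listing in the sample ends up being layout.htm.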

Re: 368 - Indexing Web Pages

Posted: Thu Dec 29, 2016 11:03 am
by RandyWaterhouse
Thanks Brian, that was really helpful. The description had me confused; I actually tried to open the files "index.htm" et cetera (even though that would be highly unusual for UVA).