|Crawl a website, or a portion of one, with 'wget', collecting 1,000-20,000 documents on the class server. Index that collection, including HTML, PDF, and text files. Build a UI that lets a user search the collection, and display a message on the UI stating which collection is being searched. Display the results as a ranked list that shows the title and a snippet for each result, with the title linking to either the crawled local file or the live file on the web. |
Email the instructor, with "INLS 490: Assignment-11" in the subject field, (1) your wget command [3 points], (2) your parameter file for indexing [2 points], and (3) a link to your working site [5 points].
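
For the result display described in the task, the following is a minimal sketch of one way to pull a title and a query-centered snippet out of a crawled HTML file, using only the Python standard library. The names `TitleAndTextParser` and `make_result`, the default snippet width, and the "(untitled)" fallback are illustrative assumptions, not part of the assignment. A search engine may already generate snippets for you, and PDF files would need a separate text extractor.

```python
from html.parser import HTMLParser


class TitleAndTextParser(HTMLParser):
    """Collects the <title> text and the visible text of one HTML page."""

    def __init__(self):
        super().__init__()
        self.title_parts = []
        self.text_parts = []
        self._in_title = False
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._in_title:
            self.title_parts.append(data)
        elif self._skip_depth == 0:
            self.text_parts.append(data)


def make_result(html, query, width=80):
    """Return (title, snippet) for one page (hypothetical helper).

    The snippet is a window of about `width` characters placed around
    the earliest occurrence of any query term, or the start of the
    page text if no term is found.
    """
    parser = TitleAndTextParser()
    parser.feed(html)
    parser.close()

    # Collapse runs of whitespace so the output fits on one line.
    title = " ".join("".join(parser.title_parts).split()) or "(untitled)"
    text = " ".join(" ".join(parser.text_parts).split())

    # Case-insensitive search for each whitespace-separated query term.
    lowered = text.lower()
    hits = [lowered.find(term) for term in query.lower().split()]
    hits = [h for h in hits if h >= 0]

    start = max(0, min(hits) - width // 2) if hits else 0
    return title, text[start:start + width]
```

In a results page, the returned title would become the link text pointing at the crawled local file or the live URL, with the snippet shown beneath it.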