CSCI 4141 -- FINAL MILESTONE

Given: September 24, 2006

Due: November 27, 2006 (week of)

The purpose of this milestone is to build the retrieval system, with a graphical user interface, on top of the files generated in milestone 2. You may use any programming language and interface package you wish, including a web browser with CGI.

The retrieval system should have three major functions: search based on a query, display, and user feedback. It should have a web-like look and feel (I suggest you use a browser).

Project Evaluation

SEARCH FUNCTION BASED ON QUERY

1. The search function should permit the user to enter a string of keywords.

2. The system should then produce a ranked list of newsitems based on the vector space model and the cosine similarity measure.

3. The ranked list should display the headline of each newsitem. I suggest you show the user only the top 20 headlines.
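As a rough illustration of steps 1 to 3, here is a minimal sketch in Python. It assumes the newsitems are held in memory as a dictionary from an item id to its text, and it uses a simple tf-idf weighting; your milestone 2 files will have their own format and weights, so treat the names `build_index` and `search` and the weighting scheme as hypothetical, not as the required design.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokens; substitute your milestone 2 tokenizer.
    return text.lower().split()


def build_index(docs):
    """docs maps item id -> text. Returns (tf-idf vectors per item, idf table)."""
    n = len(docs)
    term_freqs = {doc_id: Counter(tokenize(text)) for doc_id, text in docs.items()}
    doc_freq = Counter()
    for tf in term_freqs.values():
        doc_freq.update(tf.keys())
    idf = {term: math.log(n / count) for term, count in doc_freq.items()}
    vectors = {
        doc_id: {term: freq * idf[term] for term, freq in tf.items()}
        for doc_id, tf in term_freqs.items()
    }
    return vectors, idf


def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, vectors, idf, top_k=20):
    """Return up to top_k (item id, score) pairs, best match first."""
    query_tf = Counter(tokenize(query))
    # Query terms that never occur in the collection are ignored.
    q = {term: freq * idf[term] for term, freq in query_tf.items() if term in idf}
    scored = [(cosine(q, vec), doc_id) for doc_id, vec in vectors.items()]
    scored = [(score, doc_id) for score, doc_id in scored if score > 0.0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(doc_id, score) for score, doc_id in scored[:top_k]]
```

Calling `search("oil prices", vectors, idf)` would give the ids to look up headlines for; items with a zero score are dropped so that unrelated newsitems never appear in the list.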

DISPLAY FUNCTION

1. The user should be able to display an item in the database, but with all the XML tags removed. Make sure that you use different fonts for the different parts of the item (the headline, the body text, etc.).

2. The user should be able to select the item for display by clicking on the headline in the list returned by the query function.
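One possible way to produce the display page is sketched below in Python. It assumes each newsitem is well-formed XML with a `headline` element and body paragraphs in `p` elements; those tag names are an assumption, so adjust them to whatever your files actually contain. The function name `render_newsitem` is hypothetical.

```python
import xml.etree.ElementTree as ET
from html import escape


def render_newsitem(xml_text):
    """Turn one newsitem's XML into a small HTML fragment for the browser."""
    root = ET.fromstring(xml_text)
    # Assumed tag name: <headline>. findtext returns None if it is missing.
    headline = (root.findtext(".//headline") or "").strip()
    # itertext() gathers the text inside each <p>, dropping any nested tags.
    paragraphs = [" ".join("".join(p.itertext()).split()) for p in root.iter("p")]
    body = "\n".join(f"<p>{escape(text)}</p>" for text in paragraphs if text)
    # The headline gets its own element so a stylesheet can give it a different font.
    return f"<h1>{escape(headline)}</h1>\n{body}"
```

The original XML tags are discarded and only the text is kept; the HTML tags in the output exist purely so the browser can style the headline and the body differently. Each headline in the ranked list can then be a link that carries the item id to this display page.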

USER FEEDBACK FUNCTION (ROCCHIO FEEDBACK METHOD)

Once an initial search has been run on the query, the user should be able to mark each of the top 20 items as relevant or non-relevant. The system should then incorporate these judgements, as per the Rocchio Feedback Method, to modify and rerun the query. The user should be able to repeat this any number of times.
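The query modification step might look like the following Python sketch. It assumes the query and the judged items are sparse term-weight vectors stored as dictionaries. The weights `alpha=1.0`, `beta=0.75` and `gamma=0.15` are commonly cited textbook values, not values required by this assignment, and dropping terms whose weight becomes non-positive is a common convention rather than part of the basic formula. The function name `rocchio` is hypothetical.

```python
def rocchio(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Return a modified query vector.

    new_q = alpha * query
            + beta  * (mean of the relevant item vectors)
            - gamma * (mean of the non-relevant item vectors)
    """
    new_query = {term: alpha * weight for term, weight in query.items()}
    for judged, coeff in ((relevant, beta), (nonrelevant, -gamma)):
        if not judged:
            continue  # no judgements of this kind, so nothing to add or subtract
        for vector in judged:
            for term, weight in vector.items():
                new_query[term] = new_query.get(term, 0.0) + coeff * weight / len(judged)
    # Assumed convention: keep only terms with a positive weight.
    return {term: weight for term, weight in new_query.items() if weight > 0.0}
```

To support any number of feedback rounds, feed the returned vector back in as `query` for the next round, and rank the items against it with the same cosine measure used for the initial search.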

HANDINS

NOTES

If you want, I will look at your retrieval system at any time to give you advice on the functionality and look-and-feel of the system.