A Comparison of Keyword- and Keyterm-based Methods for Automatic Web Site Summarization

Authors: 

Yongzheng Zhang
Evangelos Milios
Nur Zincir-Heywood

Author Addresses: 

Faculty of Computer Science
Dalhousie University
6050 University Ave.
PO Box 15000
Halifax, Nova Scotia, Canada
B3H 4R2

Abstract: 

Automatic Web site summarization, which is based on keyword and key sentence extraction from narrative text, is an effective means of making the content of a Web site easily accessible to Web users. This work is directed towards summary generation based on multi-word terms extracted by the C-value/NC-value method. Keyterm-based summaries are compared with keyword-based summaries for a list of test Web sites. The evaluation indicates that keyterm-based summaries are significantly better than keyword-based summaries, which have previously been shown to be as informative as human-authored summaries.

Tech Report Number: 
CS-2004-11
Report Date: 
October 2, 2004
Attachment: 
CS-2004-11.pdf (1.23 MB)