J.-P. Norguet, B. Tshibasu-Kabeya, G. Bontempi, and E. Zimányi. Category-Based Audience Metrics for Web Site Content Improvement using Ontologies and Page Classification. In Proceedings of the 11th International Conference on Applications of Natural Language to Information Systems, NLDB, number 3999 in Lecture Notes in Computer Science, pages 216-220. Springer-Verlag, Klagenfurt, Austria, May-June 2006.
© Springer-Verlag 2006 – http://dx.doi.org/10.1007/11765448_21

Abstract

With the emergence of the World Wide Web, analyzing and improving Web communication has become essential to adapt the Web content to the visitors' expectations. Web communication analysis is traditionally performed by Web analytics software, which produces long lists of page-based audience metrics. These results suffer from page synonymy, page polysemy, page temporality, and page volatility. In addition, the metrics contain little semantics and are too detailed to be exploited by organization managers and chief editors, who need summarized and conceptual information to make high-level decisions. To obtain such metrics, we propose to classify the Web site pages into categories representing the Web site topics and to aggregate the page hits accordingly. In this paper, we show how to compute and visualize these metrics using OLAP tools. To solve the page-temporality issue, we propose to classify the versions of the pages using support vector machines. To validate our approach, we perform experiments on real data with SQL Server OLAP Analysis Service, the R statistical tool, and our prototype WASA-PC. Finally, we compare our results against directory-based metrics and concept-based metrics.
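
The aggregation step described in the abstract, rolling page hits up into topic categories, can be illustrated with a minimal Python sketch. This is not the authors' WASA-PC implementation or their OLAP setup: the page URLs, category names, hit counts, and the `aggregate_hits` helper are all hypothetical, and the page-to-category mapping is hard-coded here, whereas the paper obtains it by classifying pages.

```python
from collections import Counter

# Hypothetical page-to-category mapping. In the paper this would come from
# a page classifier; here it is hard-coded for illustration only.
page_category = {
    "/courses/databases.html": "Teaching",
    "/courses/db-2006.html": "Teaching",
    "/research/olap.html": "Research",
}

# Hypothetical page-based audience metrics: (page, number of hits).
page_hits = [
    ("/courses/databases.html", 120),
    ("/courses/db-2006.html", 30),
    ("/research/olap.html", 75),
    ("/misc/unknown.html", 5),
]


def aggregate_hits(page_hits, page_category, default="Uncategorized"):
    """Sum page hits per category; unmapped pages go to `default`."""
    totals = Counter()
    for page, hits in page_hits:
        totals[page_category.get(page, default)] += hits
    return dict(totals)


category_hits = aggregate_hits(page_hits, page_category)
print(category_hits)
```

The result is one number per topic rather than one per page, which is the kind of summarized metric the abstract argues managers and chief editors need.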


Updated: 2017-03-27