University of Twente Student Theses

Information Retrieval models based on electrical circuit analysis using Ohm’s law

Donselaar, V.L. van (2012) Information Retrieval models based on electrical circuit analysis using Ohm’s law.

Abstract: The TFxIDF weighting function is a well-known and proven model in modern information retrieval systems. The model allows documents to be ranked by relevance based on the frequency of the terms in a user's query. Although it yields good results, the explanation for its success is not obvious. Attempts have been made to explain the model in terms of statistics or common sense. This paper looks for similarities and differences with the theory of network analysis. A simplified network model based on the principle of an electrical circuit acts as a guide to understanding the model's operation. The correctness of this model is tested by implementing it as a function of the Terrier Information Retrieval System, after which it is evaluated against Terrier's predefined TFxIDF model. Results show that the precision of the network model is not as high as a TFxIDF model would typically achieve. Nonetheless, the network model shows a new approach to calculating the document score for multi-term queries, which improves the precision of the top 10 results.
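For readers unfamiliar with the baseline, the following is a minimal sketch of a textbook TFxIDF ranking. It assumes whitespace tokenization, raw term counts, and a plain logarithmic IDF; it is neither Terrier's predefined TFxIDF model nor the network model evaluated in the thesis. The function names `tf_idf_scores` and `rank` are hypothetical and used only for illustration.

```python
import math
from collections import Counter


def tf_idf_scores(query, docs):
    """Score each document as the sum over query terms of tf * idf.

    Assumption: textbook variant with raw term frequency and
    idf = log(N / df); real systems such as Terrier use refined formulas.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term occur?
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if df[term] == 0:
                continue  # term absent from the collection contributes nothing
            idf = math.log(n_docs / df[term])
            score += tf[term] * idf
        scores.append(score)
    return scores


def rank(query, docs):
    """Return document indices ordered from highest to lowest score."""
    scores = tf_idf_scores(query, docs)
    return sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
```

In this sketch the per-term contributions of a multi-term query are simply added; according to the abstract, the way these contributions are combined is where the network model takes a different approach.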
Item Type: Essay (Bachelor)
Faculty: TNW: Science and Technology
Subject: 54 computer science
Programme: Advanced Technology BSc (50002)
Link to this item: https://purl.utwente.nl/essays/74667