Introduction to Information Retrieval
Keith van Rijsbergen
Computing Science
IR: Retrieval of unstructured data (text documents, images, videos, etc.)
Some general terms used in IR:
Term Frequency
The frequency with which a word occurs in a document is a useful measure of that word's significance.
Inverse document frequency
The value of a keyword varies inversely with the log of the number of documents in which it occurs.
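The two definitions above can be sketched in a few lines of Python. This is only an illustration: the function names, the toy documents, and the specific formula log(N / n) (N documents in the collection, n of them containing the term) are assumptions, chosen as one common way of making "varies inversely with the log of the number of documents" concrete.

```python
import math
from collections import Counter


def term_frequencies(tokens):
    """Raw count of each term in a single tokenised document."""
    return Counter(tokens)


def inverse_document_frequency(term, documents):
    """log(N / n): N = collection size, n = documents containing the term.

    Assumed formulation; returns 0.0 for a term that occurs nowhere.
    """
    n = sum(1 for doc in documents if term in doc)
    if n == 0:
        return 0.0
    return math.log(len(documents) / n)


# Toy collection (hypothetical), already split into words.
docs = [
    "information retrieval finds relevant documents".split(),
    "retrieval models rank documents".split(),
    "boolean retrieval uses set operations".split(),
]

tf_first = term_frequencies(docs[0])
idf_retrieval = inverse_document_frequency("retrieval", docs)
idf_boolean = inverse_document_frequency("boolean", docs)
```

In this toy collection "retrieval" occurs in every document, so its inverse document frequency is zero, while "boolean" occurs in only one document and gets the highest weight.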
IR Model
Explains the structure and processes of IR systems
Clarifies the general characteristics of IR systems
There exist various models
Boolean, Vector space, Probabilistic, Language models, Cognitive etc.
Cranfield Paradigm
• Document collection
• Relevance judgements in advance
• Run strategy A and B
• Evaluate A and B in terms of Precision & Recall
• Compare A with B statistically
• State whether A is comparable to B, A is better than B, or A is worse than B
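The evaluation step of the paradigm can be sketched as follows, for a single query. The document identifiers, the relevance judgements, and the two result lists are made up for illustration, and the function name is hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Set-based precision and recall for one query.

    precision = relevant retrieved / all retrieved
    recall    = relevant retrieved / all relevant
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Relevance judgements made in advance (hypothetical).
relevant = {"d1", "d3", "d5"}

# Documents returned by strategy A and strategy B (hypothetical).
run_a = ["d1", "d2", "d3", "d4"]
run_b = ["d1", "d3", "d5", "d6", "d7", "d8"]

p_a, r_a = precision_recall(run_a, relevant)
p_b, r_b = precision_recall(run_b, relevant)
```

Here both strategies have the same precision, but B finds all three relevant documents and so has higher recall. A statistical comparison of A and B would need these scores for many queries, not just one; that step is not shown here.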


