<P> In information retrieval, tf-idf (also written TFIDF), short for term frequency-inverse document frequency, is a numerical statistic intended to reflect how important a word is to a document in a collection or corpus. It is often used as a weighting factor in information retrieval, text mining, and user modeling. The tf-idf value increases proportionally to the number of times a word appears in the document and is offset by how common the word is across the corpus, which adjusts for the fact that some words appear more frequently in general. tf-idf is one of the most popular term-weighting schemes today. For instance, 83% of text-based recommender systems in the domain of digital libraries use tf-idf. </P>
<P> Variations of the tf-idf weighting scheme are often used by search engines as a central tool in scoring and ranking a document's relevance given a user query. tf-idf can also be used successfully for stop-word filtering in various subject fields, including text summarization and classification. </P>
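<P> The weighting described above can be illustrated with a minimal sketch. It assumes one common variant that the text does not specify: term frequency is the raw count divided by document length, and inverse document frequency is the natural logarithm of the number of documents divided by the number of documents containing the term. The function name <code>tf_idf</code>, the whitespace tokenizer, and the toy corpus are hypothetical and chosen only for illustration. </P>

```python
import math
from collections import Counter


def tf_idf(corpus):
    """Return one {term: weight} dict per document.

    Assumed variant (not taken from the article):
    tf = count / document length, idf = ln(N / df).
    """
    # Naive tokenization: lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in corpus]
    n_docs = len(tokenized)

    # Document frequency: number of documents that contain each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "the cat chased the dog",
]
scores = tf_idf(docs)
```

<P> In this toy corpus, "the" occurs in every document, so its idf is ln(1) = 0 and its weight vanishes even though it is the most frequent word. In the first document, "mat" occurs in no other document and therefore receives the highest weight. </P>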

The tf-idf score of a term increases with