Kinds of Semantic Similarity

The wide use of semantic similarity is reflected in the large number of methods proposed for computing it. The most relevant measures to date fall into three groups: those based on taxonomies or other hierarchical structures, those based on text corpora, and those based on search engine results. Some measures combine several of these sources.

One of the most widely used taxonomies is WordNet, a lexical database of English terms organized hierarchically. WordNet gives good results when computing the semantic similarity between general-purpose terms. In other cases, taxonomies specific to a particular field are used instead, and the resulting measures are therefore tied to that field.
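As a rough illustration of a taxonomy-based measure, the sketch below scores two terms by the length of the shortest path between them in a hierarchy. The `PARENT` table is a hypothetical toy taxonomy made up for this example, not WordNet data, and the formula 1 / (1 + path length) is only one simple choice among many.

```python
# Hypothetical toy taxonomy (child -> parent); illustrative only, not WordNet data.
PARENT = {
    "dog": "canine",
    "wolf": "canine",
    "canine": "mammal",
    "cat": "feline",
    "feline": "mammal",
    "mammal": "animal",
    "animal": None,  # root of the hierarchy
}


def ancestors(term):
    """Return the chain from `term` up to the root, starting with `term` itself."""
    chain = []
    while term is not None:
        chain.append(term)
        term = PARENT[term]
    return chain


def path_similarity(a, b):
    """Return 1 / (1 + number of edges on the shortest path between a and b)."""
    # Map every ancestor of b to its distance (in edges) from b.
    distance_from_b = {node: i for i, node in enumerate(ancestors(b))}
    # Walk up from a; the first shared node is the lowest common ancestor.
    for distance_from_a, node in enumerate(ancestors(a)):
        if node in distance_from_b:
            return 1.0 / (1 + distance_from_a + distance_from_b[node])
    return 0.0  # no common ancestor (cannot happen in a single-rooted tree)


print(path_similarity("dog", "wolf"))  # siblings under "canine"
print(path_similarity("dog", "cat"))   # related only through "mammal"
```

In this toy hierarchy, "dog" and "wolf" score higher than "dog" and "cat" because their common ancestor is closer to both terms.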

Measures based on dictionaries and other corpora require processing a large collection of texts, and if those texts are not suitable the results will be poor. They are also usually domain-dependent. Web search engines have also been used in recent years, since the web can cover all the concepts of the real world. However, their results do not surpass those of traditional taxonomy-based measures.
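To make the search engine approach more concrete, here is a minimal sketch of one possible measure: a Jaccard coefficient computed from result counts. The function name `web_jaccard` and the hit counts are hypothetical. In practice the counts would come from querying a search engine for each term alone and for both terms together, which this example does not do.

```python
def web_jaccard(hits_a, hits_b, hits_both):
    """Jaccard coefficient over result counts.

    hits_a, hits_b: number of results for each term alone.
    hits_both: number of results for the query containing both terms.
    """
    if hits_both == 0:
        return 0.0  # the terms never appear together
    # Pages mentioning both terms divided by pages mentioning at least one.
    return hits_both / (hits_a + hits_b - hits_both)


# Hypothetical counts, chosen only to show the arithmetic.
score = web_jaccard(hits_a=1000, hits_b=500, hits_both=300)
print(score)  # 300 / (1000 + 500 - 300) = 0.25
```

A score near 1 means the two terms almost always appear on the same pages, while a score near 0 means they rarely co-occur.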
