A study in explaining unseen words in Indonesian using analogical clusters

Fam, Rashel; Lepage, Yves; Gojali, Susanti; Purwarianti, Ayu

Conference: Fifteenth International Conference on Computer Applications (ICCA 2017)

URI: http://onlineresource.ucsy.edu.mm/handle/123456789/890

Date: 2017-02-16

Abstract:

We propose a pipeline that uses analogical clusters to explain, on the level of form, the unseen words contained in an Indonesian test set. Analogical clusters are extracted from a training set by relying on formal relations between words. The unseen words that can be explained on the level of form are then verified on two other levels of representation: morphological and semantic. In our experiments on the BPPT corpus, 98% of unseen words were explained on the level of form, of which 58% could also be explained on both the morphological and semantic levels.
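
To make the form-level step concrete, the following is a minimal sketch and not the authors' implementation. It assumes a much narrower notion of formal relation than the paper's: only prefix or suffix substitution between two words that share a stem of at least three characters. The function names (`affix_rules`, `build_clusters`, `explain`), the thresholds, and the toy word list are all hypothetical. Word pairs that share the same substitution rule are grouped into a cluster, and an unseen word counts as explained on the level of form if undoing some cluster's rule yields a word from the training vocabulary.

```python
from collections import defaultdict
from itertools import permutations


def affix_rules(a, b, min_stem=3):
    """Yield (kind, x, y) rules that rewrite word a into word b (assumed rule format)."""
    n = min(len(a), len(b))
    p = 0  # length of the longest common prefix
    while p < n and a[p] == b[p]:
        p += 1
    if p >= min_stem:
        yield ("suffix", a[p:], b[p:])
    s = 0  # length of the longest common suffix
    while s < n and a[len(a) - 1 - s] == b[len(b) - 1 - s]:
        s += 1
    if s >= min_stem:
        yield ("prefix", a[:len(a) - s], b[:len(b) - s])


def build_clusters(words, min_pairs=2):
    """Group ordered word pairs by shared rule; keep rules seen in at least min_pairs pairs."""
    clusters = defaultdict(list)
    for a, b in permutations(sorted(set(words)), 2):
        for rule in affix_rules(a, b):
            clusters[rule].append((a, b))
    return {rule: pairs for rule, pairs in clusters.items() if len(pairs) >= min_pairs}


def explain(word, vocabulary, clusters):
    """Return (base, rule) if undoing a cluster rule on word gives a known word, else None."""
    for kind, x, y in clusters:
        if kind == "suffix" and word.endswith(y):
            base = word[:len(word) - len(y)] + x
        elif kind == "prefix" and word.startswith(y):
            base = x + word[len(y):]
        else:
            continue
        if base in vocabulary:
            return base, (kind, x, y)
    return None


# Toy training vocabulary (illustrative only, not taken from the BPPT corpus).
train = ["makan", "makanan", "minum", "minuman",
         "main", "bermain", "jalan", "berjalan", "lari"]
clusters = build_clusters(train)

# "berlari" is unseen, but main : bermain :: jalan : berjalan :: lari : berlari.
print(explain("berlari", set(train), clusters))
print(explain("komputer", set(train), clusters))
```

Requiring at least two pairs per rule is a design choice of this sketch: a substitution seen in a single pair is weak evidence of a productive pattern. The verification on the morphological and semantic levels described in the abstract is not covered here.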

Description:

This work was supported by a grant from the Japan Society for the Promotion of Science (JSPS): grant number 15K00317 entitled ‘Language productivity: fast extraction of productive analogical clusters and their evaluation using statistical machine translation.’
