SASA Dictionary as the Gold Standard for Good Dictionary Examples for Serbian

Stanković, Ranka; Šandrih, Branislava; Stijović, Rada; Krstev, Cvetana; Vitas, Duško; Marković, Aleksandra

dc.creator	Stanković, Ranka
dc.creator	Šandrih, Branislava
dc.creator	Stijović, Rada
dc.creator	Krstev, Cvetana
dc.creator	Vitas, Duško
dc.creator	Marković, Aleksandra
dc.date.accessioned	2020-02-07T11:55:51Z
dc.date.available	2020-02-07T11:55:51Z
dc.date.issued	2019
dc.identifier.issn	2533-5626
dc.identifier.uri	https://dais.sanu.ac.rs/123456789/7162
dc.description.abstract	In this paper we present a model for selection of good dictionary examples for Serbian and the development of initial model components. The method used is based on a thorough analysis of various lexical and syntactic features in a corpus compiled from examples in the five digitized volumes of the Serbian Academy of Sciences and Arts (SASA) dictionary. The initial set of features was inspired by a similar approach for other languages. The feature distribution of examples from this corpus is compared with the feature distribution of sentence samples extracted from corpora comprising various texts. The analysis showed that there is a group of features which are strong indicators that a sentence should not be used as an example. The remaining features, including detection of non-standard and other marked lexis from the SASA dictionary, are used for ranking. The selected candidate examples, represented as feature vectors, are used with the GDEX ranking tool for Serbian candidate examples and a supervised machine learning model for classification of Serbian sentences into standard and non-standard, for further integration into a solution for present and future dictionary production projects.	en
dc.language.iso	en	sr
dc.publisher	Brno : Lexical Computing CZ s.r.o.	sr
dc.relation	info:eu-repo/grantAgreement/MESTD/Basic Research (BR or ON)/178003/RS//
dc.relation	info:eu-repo/grantAgreement/MESTD/Integrated and Interdisciplinary Research (IIR or III)/47003/RS//
dc.relation	info:eu-repo/grantAgreement/MESTD/Basic Research (BR or ON)/178009/RS//
dc.rights	openAccess	sr
dc.rights.uri	https://creativecommons.org/licenses/by-nc-nd/4.0/
dc.source	Electronic lexicography in the 21st century : Smart lexicography	sr
dc.subject	Serbian	sr
dc.subject	good dictionary examples	sr
dc.subject	automatization of dictionary-making	sr
dc.subject	feature extraction	sr
dc.subject	machine learning	sr
dc.title	SASA Dictionary as the Gold Standard for Good Dictionary Examples for Serbian	en
dc.type	article	sr
dc.rights.license	BY-NC-ND	sr
dcterms.abstract	Крстев, Цветана; Витас, Душко; Станковић, Ранка; Марковић, Александра; Шандрих, Бранислава; Стијовић, Рада;
dc.citation.spage	248
dc.citation.epage	269
dc.type.version	publishedVersion	sr
dc.identifier.fulltext	https://dais.sanu.ac.rs/bitstream/id/28286/stankovic.et.al.sasa.2019.pdf
dc.identifier.rcub	https://hdl.handle.net/21.15107/rcub_dais_7162

Documents

Name: stankovic.et.al.sasa.2019.pdf
Size: 981.1Kb
Format: PDF

This document appears in the following collections

ИСЈ САНУ - General collection
