Show simple item record

dc.creator: Stanković, Ranka
dc.creator: Šandrih, Branislava
dc.creator: Stijović, Rada
dc.creator: Krstev, Cvetana
dc.creator: Vitas, Duško
dc.creator: Marković, Aleksandra
dc.date.accessioned: 2020-02-07T11:55:51Z
dc.date.available: 2020-02-07T11:55:51Z
dc.date.issued: 2019
dc.identifier.issn: 2533-5626
dc.identifier.uri: https://dais.sanu.ac.rs/123456789/7162
dc.description.abstract: In this paper we present a model for selection of good dictionary examples for Serbian and the development of initial model components. The method used is based on a thorough analysis of various lexical and syntactic features in a corpus compiled of examples from the five digitized volumes of the Serbian Academy of Sciences and Arts (SASA) dictionary. The initial set of features was inspired by a similar approach for other languages. The feature distribution of examples from this corpus is compared with the feature distribution of sentence samples extracted from corpora comprising various texts. The analysis showed that there is a group of features which are strong indicators that a sentence should not be used as an example. The remaining features, including detection of non-standard and other marked lexis from the SASA dictionary, are used for ranking. The selected candidate examples, represented as feature vectors, are used with the GDEX ranking tool for Serbian candidate examples and a supervised machine learning model for classification on standard and non-standard Serbian sentences, for further integration into a solution for present and future dictionary production projects. [en]
dc.language.iso: en [sr]
dc.publisher: Brno : Lexical Computing CZ s.r.o. [sr]
dc.relation: info:eu-repo/grantAgreement/MESTD/Basic Research (BR or ON)/178003/RS//
dc.relation: info:eu-repo/grantAgreement/MESTD/Integrated and Interdisciplinary Research (IIR or III)/47003/RS//
dc.relation: info:eu-repo/grantAgreement/MESTD/Basic Research (BR or ON)/178009/RS//
dc.rights: openAccess [sr]
dc.rights.uri: https://creativecommons.org/licenses/by-nc-nd/4.0/
dc.source: Electronic lexicography in the 21st century : Smart lexicography [sr]
dc.subject: Serbian [sr]
dc.subject: good dictionary examples [sr]
dc.subject: automatization of dictionary-making [sr]
dc.subject: feature extraction [sr]
dc.subject: machine learning [sr]
dc.title: SASA Dictionary as the Gold Standard for Good Dictionary Examples for Serbian [en]
dc.type: article [sr]
dc.rights.license: BY-NC-ND [sr]
dcterms.abstract: Крстев, Цветана; Витас, Душко; Станковић, Ранка; Марковић, Александра; Шандрих, Бранислава; Стијовић, Рада;
dc.citation.spage: 248
dc.citation.epage: 269
dc.type.version: publishedVersion [sr]
dc.identifier.fulltext: https://dais.sanu.ac.rs/bitstream/id/28286/stankovic.et.al.sasa.2019.pdf
dc.identifier.rcub: https://hdl.handle.net/21.15107/rcub_dais_7162


