A member of our project team, Renato Soic, participated in the SoftCOM 2021 conference, which took place in Hvar, Croatia. SoftCOM is an international conference that gathers researchers and professionals from academia and industry to share experiences and new ideas in the dynamic field of information and communication technology. The HR-SYNTH team presented its paper titled "N-gram Based Croatian Language Network", which describes how a language network was constructed from a large n-gram collection and how it can be used in natural language processing tasks.
Here is the abstract of the paper:
In scope of natural language processing, language networks represent a method which enables observation of linguistic units and their interactions in different linguistic contexts. Here, we describe a language network constructed from an N-gram system collected by Croatian online academic spellchecking service Hascheck. Additionally, we present a service providing a set of functionalities which enable analysis of word transitions. The described approach can provide deeper insight into morphological, semantic, and pragmatic features of Croatian language. A use-case scenario related to application from the domain of natural language generation is provided.
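To give a rough idea of what a language network built from n-grams can look like, here is a minimal Python sketch. It is not the implementation described in the paper: the function names, the use of bigrams only, and the toy counts are all assumptions made for illustration. Words become nodes, each bigram becomes a directed edge weighted by its count, and word transitions are analysed by normalising the outgoing edge weights of a word.

```python
from collections import defaultdict


def build_network(bigram_counts):
    """Build a directed, weighted word network from {(w1, w2): count}."""
    network = defaultdict(dict)
    for (first, second), count in bigram_counts.items():
        # Edge first -> second, weighted by how often the bigram was seen.
        network[first][second] = network[first].get(second, 0) + count
    return network


def transition_probabilities(network, word):
    """Relative frequency of each word that follows `word` in the network."""
    successors = network.get(word, {})
    total = sum(successors.values())
    if total == 0:
        return {}
    return {nxt: count / total for nxt, count in successors.items()}


def most_likely_successors(network, word, k=3):
    """Top-k successors of `word`, sorted by probability, then alphabetically."""
    probs = transition_probabilities(network, word)
    return sorted(probs.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


# Toy bigram counts, invented for this example (not Hascheck data).
toy_bigrams = {
    ("dobar", "dan"): 6,
    ("dobar", "tek"): 3,
    ("dobar", "posao"): 1,
    ("dan", "je"): 4,
}

network = build_network(toy_bigrams)
top = most_likely_successors(network, "dobar", k=2)
print(top)
```

In a natural language generation setting, a lookup of this kind could be used to rank candidate next words, although the actual service presented in the paper may work quite differently.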