A Semantic Search Engine for the Retrieve of Similar Patterns in Luxembourgish Texts

Funding: University of Luxembourg External Organisation Funding
Start date: 15 January 2018
End date: 14 January 2021

Description

The aim of STRIPS is to develop a toolbox of semantic search algorithms for Luxembourgish. We want to implement search algorithms that retrieve and monitor patterns in Luxembourgish texts, for example temporal patterns of named entities. Here, the term ‘semantic’ refers not only to the use of keywords or Bag-of-Words representations (for example names and geographic identifiers), but also to more complex structures such as concepts (e.g., topics or themes) and a document’s sentiment (e.g., its positive or negative polarity). STRIPS focuses on three areas: the linguistic processing of texts written in Luxembourgish (in particular stemming, the use of phonetic dictionaries and tagged word lists for Luxembourgish, and a part-of-speech-tagged text corpus); similarity learning, to allow fuzziness in search queries; and the identification of temporal cross-dependencies within the Luxembourgish text corpus. To validate the project, RTL has provided us with heterogeneous text sources (official news items and user-contributed comments).
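As a rough illustration of what fuzziness in a search query can mean, the following minimal sketch matches a query against tokens using character-level similarity from Python's standard library. It is not the STRIPS method: the function names `similarity` and `fuzzy_search`, the 0.8 threshold, and the two toy strings are all assumptions made for this example, and a learned similarity measure would take the place of the simple string ratio used here.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1], ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def fuzzy_search(query, documents, threshold=0.8):
    """Return (document index, token, score) for tokens that resemble the query.

    The threshold is an illustrative assumption, not a tuned value.
    """
    hits = []
    for idx, text in enumerate(documents):
        for token in text.split():
            word = token.strip(".,;:!?\"'()")  # drop surrounding punctuation
            score = similarity(query, word)
            if score >= threshold:
                hits.append((idx, word, score))
    # Best matches first; ties keep document order because the sort is stable.
    return sorted(hits, key=lambda hit: hit[2], reverse=True)


# Toy strings invented for this sketch (not taken from the RTL sources).
docs = [
    "Neiegkeeten aus Lëtzebuerg",
    "Wieder zu Letzebuerg",
]

for doc_index, token, score in fuzzy_search("Lëtzebuerg", docs):
    print(doc_index, token, round(score, 2))
```

With this kind of matching, a query spelled with the diacritic still finds the spelling without it, which is the sort of tolerance that user-contributed comments tend to require.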

 

Members