Language Technology (in general)

Identification and translation of verb+noun multiword expressions: a Spanish-Basque study

This is a summary of the PhD thesis written by Uxoa Iñurrieta under the supervision of Dr. Gorka Labaka and Dr. Itziar Aduriz. Full title of the PhD thesis in Basque: "Izena+aditza Unitate Fraseologikoak gaztelaniatik euskarara: azterketa eta tratamendu konputazionala". The defense was held in San Sebastian on November 29, 2019. The doctoral committee was integrated by Ricardo Etxepare (Centre National de la Recherche Scientifique), Margarita Alonso (Universidad de Coruña) and Miren Azkarate (University of the Basque Country).

Detection of Reading Absorption in User-Generated Book Reviews: Resources Creation and Evaluation

To detect how and when readers are experiencing engagement with a literary work, we bring together empirical literary studies and language technology via focusing on the affective state of absorption. The goal of our resource development is to enable the detection of different levels of reading absorption in millions of user-generated reviews hosted on social reading platforms. We present a corpus of social book reviews in English that we annotated with reading absorption categories.

Evaluating Multimodal Representations on Visual Semantic Textual Similarity

The combination of visual and textual representations has produced excellent results in tasks such as image captioning and visual question answering, but the inference capabilities of multimodal representations are largely untested. In the case of textual representations, inference tasks such as Textual Entailment and Semantic Textual Similarity have been often used to benchmark the quality of textual representations. The long term goal of our research is to devise multimodal representation techniques that improve current inference capabilities.

Aditza+izena Unitate Fraseologikoak gaztelaniatik euskarara: azterketa eta tratamendu konputazionala // Verb+Noun Multiword Expressions: A linguistic analysis for identification and translation

Unitate Fraseologikoak (UFak) hizkuntzek bere-bereak dituzten hitz-konbinazio idiomatikoak dira. Hizkuntzaren Prozesamenduko (HPko) tresnek kalitatezko emaitzak izan ditzaten, beharrezkoa da halakoak ondo tratatzea, baina lan horrek hainbat zailtasun ditu; besteak beste, hitzez hitzeko itzulgarritasun eza. Tesi-lan honetan, aditza+izena motako UFen azterketa linguistiko bat egin dugu, halakoek HPren alorrean sortzen dituzten bi arazo garrantzitsuri aurre egiten laguntzeko: batetik, corpusetan UFak automatikoki identifikatzeari, eta bestetik, UF horiek gaztelaniaren eta euskararen

LINGUATEC: Desarrollo de recursos lingüı́sticos para avanzar en la digitalización de las lenguas de los Pirineos

El objetivo del proyecto es desarrollar, probar y difundir nuevos recursos, nuevas herramientas y aplicaciones lingüı́sticas innovadoras para mejorar el nivel de digitalización del aragonés, vasco y occitano.

Literal occurrences of Multiword Expressions: rare birds that cause a stir

Multiword expressions can have both idiomatic and literal occurrences. For instance pulling strings can be understood either as making use of one’s influence, or literally. Distinguishing these two cases has been addressed in linguistics and psycholinguistics studies, and is also considered one of the major challenges in MWE processing. We suggest that literal occurrences should be considered in both semantic and syntactic terms, which motivates their study in a treebank.

Pages

Subscribe to RSS - Language Technology (in general)