Language processing (general)

Identification and translation of verb+noun multiword expressions: a Spanish-Basque study

This is a summary of the PhD thesis written by Uxoa Iñurrieta under the supervision of Dr. Gorka Labaka and Dr. Itziar Aduriz. Full title of the PhD thesis in Basque: "Izena+aditza Unitate Fraseologikoak gaztelaniatik euskarara: azterketa eta tratamendu konputazionala". The defense was held in San Sebastian on November 29, 2019. The doctoral committee consisted of Ricardo Etxepare (Centre National de la Recherche Scientifique), Margarita Alonso (Universidad de Coruña) and Miren Azkarate (University of the Basque Country).

Detection of Reading Absorption in User-Generated Book Reviews: Resources Creation and Evaluation

To detect how and when readers experience engagement with a literary work, we bring together empirical literary studies and language technology by focusing on the affective state of absorption. The goal of our resource development is to enable the detection of different levels of reading absorption in millions of user-generated reviews hosted on social reading platforms. We present a corpus of social book reviews in English that we annotated with reading absorption categories. Based on these data, we performed supervised,

Evaluating Multimodal Representations on Visual Semantic Textual Similarity

The combination of visual and textual representations has produced excellent results in tasks such as image captioning and visual question answering, but the inference capabilities of multimodal representations are largely untested. In the case of textual representations, inference tasks such as Textual Entailment and Semantic Textual Similarity have often been used to benchmark their quality.

Verb+Noun Multiword Expressions: A linguistic analysis for identification and translation

Multiword Expressions (MWEs) are combinations of words which exhibit some kind of idiosyncrasy. Due to their idiosyncratic nature, they pose several problems for Natural Language Processing (NLP). This PhD thesis addresses two of the most challenging tasks in MWE processing: the automatic identification of MWE occurrences in corpora and their translation in Machine Translation (MT).

Aditza+izena Unitate Fraseologikoak gaztelaniatik euskarara: azterketa eta tratamendu konputazionala [Verb+noun Multiword Expressions from Spanish to Basque: analysis and computational treatment]

Multiword Expressions (MWEs) are idiomatic word combinations that are particular to each language. For Natural Language Processing (NLP) tools to produce quality results, MWEs must be handled properly, but this task poses several difficulties, among them the lack of word-for-word translatability. In this thesis, we carried out a linguistic analysis of verb+noun MWEs to help address two important problems they raise in the field of NLP: on the one hand, the automatic identification of MWEs in corpora, and on the other, those MWEs between Spanish and Basque

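LINGUATEC: Desarrollo de recursos lingüísticos para avanzar en la digitalización de las lenguas de los Pirineos [LINGUATEC: Development of linguistic resources to advance the digitalization of the languages of the Pyrenees]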

The aim of the project is to develop, test and disseminate new linguistic resources, tools and innovative applications to improve the level of digitalization of Aragonese, Basque and Occitan.

Literal occurrences of Multiword Expressions: rare birds that cause a stir

Multiword expressions can have both idiomatic and literal occurrences. For instance, "pulling strings" can be understood either as making use of one's influence, or literally. Distinguishing these two cases has been addressed in linguistic and psycholinguistic studies, and is also considered one of the major challenges in MWE processing. We suggest that literal occurrences should be considered in both semantic and syntactic terms, which motivates their study in a treebank.

Unitate Fraseologikoen agerpen literalak, urre baina urri [Literal occurrences of Multiword Expressions: golden but scarce]

Many multiword expressions can be understood both idiomatically and literally. For example, the Basque expression "ziria sartu" can have two meanings depending on the context: deceiving someone, or literally inserting a peg somewhere. In this paper, we report on a corpus-based multilingual study and show that, on the one hand, such word combinations are very rarely used literally in practice, and on the other, the idiomatic-literal distinction

Zer i(ra)kas dezakegu geure corpusekin "jolastuz"? [What can we learn and teach by "playing" with our own corpora?]

A wide range of methodologies have been used for language learning: the direct method, the translation method, the audiolingual method, the communicative method, the lexical approach, the exercise-based method, the method based on learner errors, and eclectic methods. In recent years, however, we have come to believe that "playing" with corpora offers opportunities to learn and teach languages in a meaningful way.
