Thesis title | FrameNet Annotation for Multimodal Corpora: devising a methodology for the semantic representation of text-image interactions in audiovisual productions |
Thesis abstract |
Multimodal analyses have been growing in importance within several approaches to Cognitive Linguistics and in applied fields such as Natural Language Understanding. Nonetheless, fine-grained semantic representations of multimodal objects are still lacking, especially in terms of integrating areas such as Natural Language Processing and Computer Vision, which are key for the implementation of multimodality in Computational Linguistics. In this Dissertation, we propose a methodology for extending FrameNet annotation to the multimodal domain, since FrameNet can provide fine-grained semantic representations, particularly with a database enriched by Qualia and other interframal and intraframal relations, as is the case with FrameNet Brasil. To enable FrameNet Brasil to conduct multimodal analysis, we outlined the following hypothesis: just as words in a sentence evoke frames and organize their elements in the syntactic locality accompanying them, visual elements in video shots may also evoke frames and organize their elements on the screen, or work complementarily with the frame evocation patterns of the sentences narrated simultaneously with their appearance on screen, providing different profiling and perspective options for meaning construction. The corpus annotated to test the hypothesis is composed of episodes of a Brazilian TV travel series critically acclaimed as an exemplar of good practices in audiovisual composition. The TV genre chosen also constitutes a novel experimental setting for research on integrated image and text comprehension, since, in this corpus, the text is not a direct description of the image sequence but correlates with it indirectly in a myriad of ways. The Dissertation also reports on an eye-tracker experiment conducted to validate the proposed text-oriented annotation approach.
The experiment showed that it is not possible to determine that text directly impacts gaze, a result taken as support for an approach that values the combination of modes. Finally, we present the Frame2 dataset, the product of the annotation task carried out on the corpus following both the methodology and the guidelines proposed. The results demonstrate that, at least for this TV genre but possibly also for others, a fine-grained semantic annotation tackling the diverse correlations that take place in a multimodal setting provides a new perspective on multimodal comprehension modeling. Moreover, multimodal annotation also enriches the development of FrameNets, to the extent that correlations found between modalities can attest to the modeling choices made by those building frame-based resources. |
Date and time | June 27, 2023, at 2:00 p.m., in the FrameNet laboratory. |
EXAMINATION COMMITTEE:
No. | Professor's name | Degree and institution | Institutional affiliation | Role on the committee |
01 | Tiago Timponi Torrent | PhD/UFRJ | UFJF | Advisor and Chair |
02 | Mark Turner | PhD/University of California | Case Western Reserve University | Co-advisor |
03 | Janina Wildfeuer | PhD/Universität Bremen | University of Groningen | External Full Member |
04 | André Vinícius Lopes Coneglian | PhD/Universidade Presbiteriana Mackenzie | UFMG | External Full Member |
05 | Aline Alves Fonseca | PhD/UFMG | UFJF | Internal Full Member |
06 | Ely Edison da Silva Matos | PhD/UFJF | UFJF | Internal Full Member |
07 | Patrícia Nora de Souza Ribeiro | PhD/Unicamp | UFJF | External Alternate |
08 | Adriana Silvina Pagano | PhD/UFMG | UFMG | External Alternate |