Theoretical and Computational Neuroscience
Author: María Belén Paez | Email: bpaez2@gmail.com
María Belén Paez1°, Facundo Totaro1°, Julieta Laurino3°, Laura Kaczer3°, Juan Kamienkowski1°2°4°, Bruno Bianchi1°2°
1° Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Departamento de Computación. Buenos Aires, Argentina.
2° CONICET-Universidad de Buenos Aires. Instituto de Ciencias de la Computación (ICC). Buenos Aires, Argentina.
3° Universidad de Buenos Aires. Departamento de Fisiología, Biología Molecular y Celular. Buenos Aires, Argentina.
4° Facultad de Ciencias Exactas y Naturales, Maestría en Explotación de Datos y Descubrimiento del Conocimiento, Universidad de Buenos Aires, Buenos Aires, Argentina.
Assigning meaning to a word is a key process in language comprehension. Because many words have more than one meaning (i.e., homonymy and polysemy), this process involves additional complexity. Nevertheless, both the human brain and state-of-the-art Language Models (LMs) can assign a meaning to each word as they process it, taking into account the context in which it appears. The aim of the present study is to understand whether the mechanisms used by the brain and by LMs to solve this task are analogous. Current LMs process text by applying a series of transformations sequentially. To do so, words are converted to embeddings (i.e., vectorized representations), which are modified as they pass through the layers, incorporating information about the surrounding context. Our goal is to compare human meaning assignment (measured in an online experiment) with the model’s neural representations across layers, focusing on how the embeddings of ambiguous words vary under several biasing contexts. We measured the model’s bias as the cosine distance between the embedding of the meaning and the contextualized embedding of the ambiguous word at a given layer. Our results show a non-linear progression of the bias across layers. This is in line with previous work and suggests that middle and final layers process different aspects of the input text.
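The layer-wise bias measure described above can be illustrated with a minimal sketch. It is not the study's implementation: the function names (`cosine_distance`, `layerwise_bias`), the toy vectors, and the choice of one meaning embedding per layer are assumptions made only for illustration, and a real analysis would take the vectors from the hidden states of a language model.

```python
import numpy as np


def cosine_distance(u, v):
    """Return 1 - cosine similarity between two vectors."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return 1.0 - float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


def layerwise_bias(word_by_layer, meaning_by_layer):
    """Cosine distance between the ambiguous word's contextualized
    embedding and a meaning embedding, computed once per layer.

    Both arguments are sequences with one vector per layer
    (an assumption of this sketch)."""
    return [cosine_distance(w, m)
            for w, m in zip(word_by_layer, meaning_by_layer)]


# Toy data (hypothetical): 3 layers, 2-dimensional embeddings.
meaning_by_layer = np.array([[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]])
word_by_layer = np.array([[0.0, 1.0],   # orthogonal to the meaning
                          [1.0, 1.0],   # partly aligned
                          [2.0, 0.0]])  # fully aligned (norm is ignored)

bias = layerwise_bias(word_by_layer, meaning_by_layer)
for layer, distance in enumerate(bias):
    print(f"layer {layer}: cosine distance = {distance:.3f}")
```

A smaller distance means the word's representation lies closer to that meaning at that layer, so plotting the values against layer index is one way to inspect how the bias progresses through the model.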