Seed Funding Projects from CCLS

“The student felt unwell, so she asked the teacher to go to the nurse’s office.”
Most people would assume that the student in the sentence above is asking for permission to leave, because that is what the context indicates. However, a large language model (LLM) such as ChatGPT often defaults to a biased structural interpretation in which the student asks the teacher to leave, even though this reading makes little sense given the context and our world knowledge. Unlike humans, LLMs struggle to leverage context clues when parsing an ambiguous sentence, and it is this shortcoming that our research addresses. Thanks to funding from CCLS, we will be able to use the remote GPUs and cloud services necessary to train our LLMs (BART and GPT-4o) to overtly generate a sentence’s syntactic and semantic structures using a graph architecture. We furthermore intend to pair this with a novel dataset of structurally ambiguous but contextually clear sentences (like the one above), on which we will fine-tune our models. Because the graph task forces an overt realization of the sentence’s structure, we hypothesize that this architecture will provide the scaffolding models need to form structural attachments and interpretations that leverage contextual clues, instead of falling back on their biased pretrained interpretations. Our work aims to improve language models’ performance and to highlight the need to evaluate models on the difficult and niche aspects of language that, while less common, are an inherent and unavoidable feature of human language and communication.

Mary “Katie” Kennedy, 4th year PhD student in the Linguistics Department, Dornsife, USC

Nelly Marutyan, 4th year PhD student in the Linguistics Department, Dornsife, USC

Latest News

USC CCLS Launch!

We are very excited to launch the new USC Center for Computational Language Sciences! We are a multidisciplinary group of USC-affiliated faculty dedicated to revolutionizing our understanding of language through groundbreaking computational research.