This talk summarizes our research on how LLMs, when prompted in real-world information-seeking setups, generate narratives and recurring tropes.
Talk outline:
- Knowledge collapse and epistemic diversity: What they mean and why they matter for real-world information access (5 mins).
- Framework overview: How we measure epistemic diversity across LLM outputs (5 mins).
- Experimental design and results: Curating a dataset for comparisons across model families, search results, and Wikipedia pages (7 mins).
- Implications for designing LLM-powered systems that preserve information diversity (10 mins).
Key takeaways for AI practitioners:
- When can retrieval-augmented generation (RAG) increase diversity?
- Can expanding Wikipedia via translation improve epistemic diversity or reinforce existing tropes?
- What are some open challenges in measuring cultural and contextual diversity in LLM outputs?
- Where are we headed in terms of model sizes, fluency, and breadth of knowledge?
Useful links:
Sarah Masud
Sarah Masud is currently a postdoc at the University of Copenhagen, exploring stereotypes and narratives. During her PhD at the Indraprastha Institute of Information Technology, New Delhi, she explored the role of different context cues in improving computational hate speech-related tasks.