Shaping the Conversation: Enriching Collection Access and Use with Generative AI

Northwestern University Libraries will use generative artificial intelligence (GenAI) to change the way libraries provide access to knowledge. First, the project will build an installable, open-source, semantic discovery product that allows users to chat with library and archival collections. Second, the project will test and validate a toolkit to augment metadata for digitized collections. Both the semantic chat-based discovery tool and the automated metadata tool will be implemented in Northwestern’s Digital Collections using existing GenAI models. To expand impact nationally, the team will develop toolkits comprising research methods, examples, detailed documentation, and open-source software packages so library professionals and researchers across the country can implement the tools in their own contexts. By expanding access to, and providing interpretation of, information from primary resources, this project will increase libraries' knowledge of AI tools, allowing libraries across the country to adapt and meet rapidly changing expectations.

Supported by the Institute of Museum and Library Services. National Leadership Grant LG-256703-OLS-24, Awarded 2024.

Upcoming workshops and demos

Continuing the Conversation: Building Open-Source Semantic Discovery Tools: code4lib 2025, Day 2

March 11, 2025

11:30am

David Schober and Brendan Quinn

We’ll discuss our work with Retrieval Augmented Generation (RAG) architectures, generating embedding vectors at scale in production, and the challenges of grounding AI responses in data managed by experts at the library. We will also discuss developing a toolkit that will allow other institutions to explore this impactful technology with their own data. Learn about opportunities, challenges, and discoveries we’ve made so far.

  • Learn the fundamentals of semantic search and retrieval and how they integrate with GenAI tools
  • Learn about the semantic discovery tool developed as part of the IMLS National Leadership Grant
  • Hear about the complexities of testing and evaluating generative AI pipelines and semantic retrieval systems
  • Understand how Northwestern is leveraging AI in Digital Collections and expanding to other contexts

How does it work?

The search tool is currently implemented on Northwestern Libraries Digital Collections for users logged in with a Northwestern NetID. Enhanced by generative AI, it parses a plain-language question and responds in prose, using the metadata of the digitized collections to ground its responses.
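The general pattern of retrieving metadata and using it to ground a response can be illustrated with a minimal sketch. This is not the project's implementation: the records, the function names (`embed`, `retrieve`, `build_prompt`), and the prompt wording are all hypothetical, and a simple word-count vector stands in for the embedding model a production system would use. The sketch stops at building the grounded prompt and does not call any language model.

```python
import math
import re
from collections import Counter

# Hypothetical metadata records, invented for illustration only.
RECORDS = [
    {"id": "rec-1", "title": "Football game program, 1949",
     "description": "Printed program for a university football game"},
    {"id": "rec-2", "title": "Poster for a jazz concert",
     "description": "Color poster advertising a jazz concert on campus"},
    {"id": "rec-3", "title": "Map of a lakefront campus",
     "description": "Hand-drawn map showing campus buildings along a lake"},
]


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, records, k=2):
    """Return up to k records most similar to the question, dropping zero scores."""
    query = embed(question)
    scored = [
        (cosine(query, embed(r["title"] + " " + r["description"])), r)
        for r in records
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [record for score, record in scored[:k] if score > 0]


def build_prompt(question, hits):
    """Assemble a prompt that restricts the answer to the retrieved metadata."""
    context = "\n".join(
        f"[{r['id']}] {r['title']}: {r['description']}" for r in hits
    )
    return (
        "Answer using only the records below and cite record ids.\n"
        f"Records:\n{context}\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    question = "Do you have any jazz concert posters?"
    print(build_prompt(question, retrieve(question, RECORDS)))
```

In this sketch the retrieval step decides which records the model is allowed to draw on, which is what keeps the prose answer tied to curated metadata rather than to the model's general knowledge.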

Deliverables

Open-source discovery product

Installable open-source discovery product designed as a starting point for institutions seeking to implement AI-driven semantic search for end users alongside or in lieu of existing search tooling

Metadata generation toolkit

Testing and validation of a metadata toolkit to augment and suggest metadata for digitized collections

Related workshops and education

Ongoing education opportunities to engage with the wider library and museum community on how to use this transformative technology

Project members

  • James Lee (Principal Investigator)
  • David Schober (Team Lead and Projects Manager)
  • Brendan Quinn (Senior Developer)
  • Charles Loder (Senior Developer)
  • Jamie Carlstone (Metadata and Subject Matter Expert)
  • Jen B. Young (Metadata and Subject Matter Expert)
  • Frank Sweis (User Experience Professional)
  • Cory Slowik (Communication Professional)