AI Archives and Narrative Systems

REMIT

Exploring how 120,000 hours of broadcast material can be structured into source-linked narratives.

Client eM+ Laboratory / RTS
Year 2026
Role LLM workflows, retrieval architecture, graph indexing, and research prototyping
Medium LLMs, agents, RAG, semantic embeddings, knowledge graphs, audiovisual archive pipelines

Context

REMIT was developed at eM+, EPFL, using more than 120,000 hours of RTS video. The work explores how spoken archives could be turned into audiovisual arguments without losing the texture of the original material.

The source material includes interviews, broadcasts, and testimonies. The goal is not to collapse them into a single summary, but to keep fragments, voices, and recurring motifs available for montage and interpretation.

System

The prototype starts from a curatorial question and retrieves source-linked material from a corpus of conversations, broadcasts, and testimonies. Passages remain tied to speakers, timestamps, topics, and contexts, so different interpretive frames can be tested against the same archive.
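The idea of passages that stay bound to their provenance can be sketched as a small data structure. This is a minimal illustration, not the project's actual schema; the class and field names (`Passage`, `source_id`, `start_s`) are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Passage:
    """A transcript excerpt that never loses its provenance."""
    text: str
    speaker: str
    source_id: str   # hypothetical archive identifier of the broadcast
    start_s: float   # offset into the recording, in seconds
    end_s: float
    topics: tuple = ()

    def citation(self) -> str:
        """Render a human-readable provenance string for montage notes."""
        mm, ss = divmod(int(self.start_s), 60)
        return f"{self.source_id} @ {mm:02d}:{ss:02d} ({self.speaker})"

p = Passage("We kept broadcasting.", "Interviewee A",
            "RTS-1987-0042", 754.0, 761.5, ("memory",))
print(p.citation())  # → RTS-1987-0042 @ 12:34 (Interviewee A)
```

Because every excerpt carries speaker, source, and timestamps, different interpretive frames can quote the same archive without detaching fragments from their origin.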

The research explores how source-linked collages, subtitles, media trims, and AI-assisted voice sequences could be generated while keeping provenance visible.
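One concrete way provenance stays visible in generated outputs is that subtitle cues inherit the source timestamps directly. The sketch below formats a SubRip (.srt) cue from second offsets; it is an illustrative helper, not code from the prototype.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a second offset as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = int(round(seconds * 1000))
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def srt_cue(index: int, start_s: float, end_s: float, text: str) -> str:
    """Render one subtitle cue; timestamps come straight from the archive excerpt."""
    return f"{index}\n{srt_timestamp(start_s)} --> {srt_timestamp(end_s)}\n{text}\n"

print(srt_cue(1, 754.0, 761.5, "We kept broadcasting."))
```

A media trim could reuse the same start/end pair, so the subtitle, the clip, and the citation all point at one span of the original recording.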

Technical Notes

The pipeline combines large language models, agents, retrieval-augmented generation, semantic embeddings, and graph-based indexing. Transcripts are segmented into topics and enriched with timestamps, entities, metadata, and vector representations.
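The segmentation-and-enrichment step can be outlined as follows. This is a deliberately naive sketch: length-based windowing stands in for topic segmentation, and a bag-of-words counter stands in for a learned semantic embedding; all function names are hypothetical.

```python
import re
from collections import Counter

def segment(transcript, max_words=80):
    """Split a transcript into passages, carrying timestamps forward.
    Input items are (start_seconds, sentence) pairs. Length-based cuts
    stand in here for real topic segmentation."""
    segments, current, start = [], [], None
    for t, sentence in transcript:
        if start is None:
            start = t
        current.append(sentence)
        if sum(len(s.split()) for s in current) >= max_words:
            segments.append({"start": start, "text": " ".join(current)})
            current, start = [], None
    if current:
        segments.append({"start": start, "text": " ".join(current)})
    return segments

def embed(text):
    """Toy bag-of-words vector standing in for a semantic embedding."""
    return Counter(re.findall(r"\w+", text.lower()))
```

In a production pipeline, each segment would additionally be enriched with named entities and broadcast metadata before indexing.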

The intended retrieval stack uses hybrid retrieval, query expansion, and cross-encoder reranking to identify relevant excerpts for a given question. A knowledge-graph layer maps relationships between topics, people, places, events, and concepts.
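The fusion step of a hybrid retrieval stack can be sketched in a few lines. Here a lexical-overlap score is blended with a bag-of-words cosine that stands in for dense-embedding similarity; a real system would expand the query first and rerank the fused top-k with a cross-encoder. The function names and the `alpha` weighting are assumptions for illustration.

```python
import math
import re
from collections import Counter

def bow(text):
    """Bag-of-words counts, standing in for a dense embedding."""
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def hybrid_search(query, passages, alpha=0.5, k=3):
    """Fuse a lexical score with a vector score and return the top-k texts."""
    q = bow(query)
    scored = []
    for p in passages:
        d = bow(p)
        lexical = len(set(q) & set(d)) / max(len(q), 1)
        scored.append((alpha * lexical + (1 - alpha) * cosine(q, d), p))
    return [p for _, p in sorted(scored, reverse=True)[:k]]
```

The knowledge-graph layer would then sit on top of such results, linking retrieved passages to the people, places, and events they mention so that a curatorial question can be answered by traversal as well as by similarity.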

Gallery