Multi‑Speaker Conversational LLM
Adi Tsach
Supervised by Noam Rotstein
When this project was launched, most production voice interfaces for large language models (LLMs) collapsed audio into text and then reasoned as if the input had been typed. Transcription discards information that is crucial in natural multi‑party conversation: who is speaking, how they sound, when turns begin and end, and how emphasis and emotion evolve over time. This project designs and implements a speaker‑aware dialog system that treats audio as a first‑class signal. The central thesis is that speaker‑ and time‑aware interfaces are necessary to move beyond the single‑speaker voice‑assistant paradigm and accommodate complex multi‑speaker conversations. We operationalize this thesis through a system that listens, segments, aligns, attributes, retrieves, and then reasons. The implementation runs fully locally (important for privacy in meetings) yet is designed to be swappable: every stage (ASR, diarization, retrieval, LLM, TTS) can be replaced as better components appear.
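
As a rough illustration of this swappable design, the sketch below composes the pipeline stages behind minimal interfaces. All class, method, and helper names here (Turn, ASR, Diarizer, align, SpeakerAwareDialog, ...) are assumptions made for illustration only and do not reflect the project's actual code; they simply show how independent ASR, diarization, retrieval, LLM, and TTS components could be wired and later swapped.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class Turn:
    speaker: str  # diarization label, e.g. "SPEAKER_01"
    start: float  # seconds
    end: float    # seconds
    text: str     # words attributed to this speaker over this span


# Each stage is a minimal interface, so any concrete implementation can be swapped in.
class ASR(Protocol):
    def transcribe(self, audio: bytes) -> list[tuple[float, float, str]]: ...

class Diarizer(Protocol):
    def diarize(self, audio: bytes) -> list[tuple[float, float, str]]: ...

class Retriever(Protocol):
    def retrieve(self, query: str, history: list[Turn]) -> list[Turn]: ...

class LLM(Protocol):
    def respond(self, query: str, context: list[Turn]) -> str: ...

class TTS(Protocol):
    def synthesize(self, text: str) -> bytes: ...


def align(words: list[tuple[float, float, str]],
          speakers: list[tuple[float, float, str]]) -> list[Turn]:
    """Attribute each ASR word to the speaker active at its midpoint,
    merging consecutive words from the same speaker into one turn."""
    turns: list[Turn] = []
    for w_start, w_end, word in words:
        mid = (w_start + w_end) / 2
        who = next((spk for s, e, spk in speakers if s <= mid <= e), "UNKNOWN")
        if turns and turns[-1].speaker == who:
            turns[-1].end, turns[-1].text = w_end, turns[-1].text + " " + word
        else:
            turns.append(Turn(who, w_start, w_end, word))
    return turns


class SpeakerAwareDialog:
    """Listen -> segment -> align -> attribute -> retrieve -> reason -> speak."""

    def __init__(self, asr: ASR, diarizer: Diarizer,
                 retriever: Retriever, llm: LLM, tts: TTS) -> None:
        # Stages are injected, so each can be replaced as better components appear.
        self.asr, self.diarizer = asr, diarizer
        self.retriever, self.llm, self.tts = retriever, llm, tts
        self.history: list[Turn] = []

    def handle(self, audio: bytes, query: str) -> bytes:
        words = self.asr.transcribe(audio)       # (start, end, word)
        speakers = self.diarizer.diarize(audio)  # (start, end, speaker label)
        self.history.extend(align(words, speakers))
        context = self.retriever.retrieve(query, self.history)
        answer = self.llm.respond(query, context)
        return self.tts.synthesize(answer)
```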

Please see the project report.
Please see the final presentation.
Please see the demo clip.









