Name: A long, short history of realtime AI agents
Start: 2025-02-01T17:10:00
End: 2025-02-01T17:10:00
Location: K.3.601

Abstract

Until a few months ago, the only working approach for connecting realtime AI agents to WebRTC streams and phone calls was to use lengthy pipelines of speech to text, agent orchestration, and text to speech, often using multiple machine learning models from commercial vendors. That has changed with new realtime speech to speech models, most famously the (closed) OpenAI advanced voice, but what are the open source ways to build these kind of systems? This talk walks through my experience with using 4 different projects to build functional systems which can use open source (open weights) models at their core. We will talk about how we have integrated Jambonz, Livekit, and Ultravox (Fixie.AI) within our Aplisay framework and what this allows us to do.

Attachments

Talk pdf

Links

External Links

Notice: The placeholder video image is licensed under CC BY-SA 4.0. The original image can be found hereChanges made to the image are: Cropped the image to a new ratio, part of the image was cut off.

A long, short history of realtime AI agents

Notes

Abstract

Attachments

Speakers

Links

External Links