B2 Bachelor VoiSum Use AI to transcribe / summarize voice messages

Team

  • Jasmin Furche
  • Tiphannie Byakuleka
  • Veronika Zhaboklitskaya
  • Akrem Cheniour
  • Jannis Elsner
  • Jenny Phuong Anh Nguyen
  • Yusuf Ünlen

Supervision

Prof. Dr. Tobias Lenz

Voice In, Clarity Out

Ever get those super long voice messages from friends or family and think "ugh, not another one"? Sure, you can blast through them at 2x speed, but let's be real - who wants to sit through all those "umm"s and random stories? That's where our project comes in handy: we're building this cool product that takes those rambling voice messages, uses AI to cut out all the fluff, and gives you back a clean, short summary. No more wasting time on voice messages that could've been much shorter!

Our Goal

We aim to change how people consume voice messages with an AI-powered tool that cuts through the noise. It takes those rambling voice notes and delivers just the essential content: no fluff, no filler words, only the key points you need. By combining speech-to-text, smart summarization, and voice synthesis, we save you time and make voice communication more efficient while keeping it personal.

Features

So, let’s break down what our software can actually do. First up, it provides an integrated transcription-and-summarization service via a single endpoint, delivering both a full transcript of your audio and a concise summary. This all-in-one approach removes the need to handle transcription and summarization separately. However, “transcribe” and “summarize” can still be called separately if need be.
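To picture how the combined call relates to the two separate ones, here is a minimal Python sketch. All names in it (`VoiceResult`, `transcribe_and_summarize`, `fake_transcribe`, `fake_summarize`) are hypothetical, and the two step functions are simple placeholders for the speech-to-text and summarization models, not our actual API.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoiceResult:
    """Hypothetical response shape: full transcript plus short summary."""
    transcript: str
    summary: str


def transcribe_and_summarize(
    audio: bytes,
    transcribe: Callable[[bytes], str],
    summarize: Callable[[str], str],
) -> VoiceResult:
    """Run both steps in one call; each step stays usable on its own."""
    transcript = transcribe(audio)
    return VoiceResult(transcript=transcript, summary=summarize(transcript))


# Placeholder steps standing in for real models (assumptions for this sketch).
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_summarize(text: str) -> str:
    # Keep only the first sentence as a stand-in "summary".
    return text.split(". ")[0].rstrip(".") + "."


result = transcribe_and_summarize(
    b"Dinner is at eight. Umm, bring dessert if you can.",
    fake_transcribe,
    fake_summarize,
)
print(result.summary)
```

Passing the two steps in as functions keeps the combined call a thin wrapper, so either step can also be used alone.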

For a more focused overview, the keywords feature extracts and highlights the key points of your content. When a specific level of detail is required, the custom-length summary option lets you determine the length of the summary.
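As a rough illustration of what these two options do, the sketch below uses deliberately naive stand-ins: word-frequency counting for keywords and sentence truncation for the length limit. This is not how our AI models work internally; the function names, the `top_n` and `max_sentences` parameters, and the tiny stopword list are all assumptions made for the example.

```python
import re
from collections import Counter
from typing import List

# Tiny illustrative stopword list (assumption; a real system would use more).
STOPWORDS = {"the", "a", "an", "and", "to", "at", "is", "we", "for", "of", "in", "on"}


def extract_keywords(text: str, top_n: int = 3) -> List[str]:
    """Return the top_n most frequent non-stopword words (naive stand-in)."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return [word for word, _ in counts.most_common(top_n)]


def limit_summary(summary: str, max_sentences: int) -> str:
    """Trim a summary to at most max_sentences sentences."""
    sentences = re.split(r"(?<=[.!?])\s+", summary.strip())
    return " ".join(sentences[:max_sentences])


text = "The meeting moved to Friday. Bring the slides to the meeting. Friday at noon works."
print(extract_keywords(text, top_n=2))
print(limit_summary(text, max_sentences=1))
```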

Getting started is straightforward: simply record a new audio clip or upload an existing file, then let our API handle the rest. Think of it as an AI-driven personal assistant that efficiently condenses your audio content, minus the hassle of manual editing.

Process

Our project began when Professor Tobias Lenz introduced the initial concept, prompting us to establish a GitHub repository and organize our tasks using a Kanban board for effective workflow management. We conducted an extensive review of existing AI models to ensure we were not duplicating efforts. After gaining insight into the landscape, we formulated our project plan and selected a technology stack optimized for our goals.

Adhering to an agile methodology, we held regular weekly meetings to review progress, test new iterations, and refine our solution. This iterative approach allowed us to rapidly incorporate feedback and make adjustments as necessary, ensuring the final product was both robust and user-friendly.

Team

We divided our team into two specialized groups: one to build the API, the backbone of our software, and the other to explore and build different frontend solutions.