Operational Service

AI Simultaneous Translation for Events

How the real-time automatic translation service works, integrated into Converso's MRSI approach.

What It Is

Converso provides an AI-based simultaneous translation service for live events. The system captures the speaker's audio in real time, transcribes it, translates it into the target language, and distributes it as synthesized audio and/or text to listeners.

This is not a software product sold to third parties. It is a managed service: Converso configures, activates, monitors, and controls the entire chain during the event.

The service is delivered remotely (cloud) or with an on-site technician, depending on event complexity. The operational base is in Milan, but the service is available globally.

Technical Flow

The signal path from speaker to listener follows these stages:

1. Audio Input

The speaker's audio is captured from the venue mixer, a dedicated audio feed, or a direct microphone. Analog (XLR, jack) and digital (Dante, AES67, NDI) inputs are supported. The optional RSI Bridge bundles AV input, internet connectivity, and power redundancy into a single device. It is not mandatory: the audio stream can also be sent directly from our web broadcaster interface, without additional hardware.

2. Converso AI Engine

The audio signal enters the Converso AI Engine, the proprietary system that manages the entire processing chain. The engine transcribes speech to text in real time, translates the text into the target language, and generates synthesized audio output. A hybrid approach is also possible: main languages are handled by professional interpreters, secondary ones by AI. The Converso technician monitors and controls the entire process.

Speech-to-Text, AI Translation, Live monitoring
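
The three stages described above (transcription, translation, speech synthesis) can be sketched as a simple chain. This is only an illustration of the order of operations, not the Converso AI Engine itself: the names `run_pipeline` and `TranslatedSegment` and the stand-in stage functions are assumptions made for this example, and the real engine's interfaces are not documented here.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator


@dataclass
class TranslatedSegment:
    source_text: str      # transcription of the speaker's audio
    translated_text: str  # text in the target language
    audio: bytes          # synthesized speech for the translated text


def run_pipeline(
    audio_chunks: Iterable[bytes],
    transcribe: Callable[[bytes], str],
    translate: Callable[[str, str], str],
    synthesize: Callable[[str, str], bytes],
    target_language: str,
) -> Iterator[TranslatedSegment]:
    """Pass each audio chunk through the three stages, in order."""
    for chunk in audio_chunks:
        text = transcribe(chunk)
        if not text:
            continue  # silence or unintelligible audio: nothing to emit
        translated = translate(text, target_language)
        audio = synthesize(translated, target_language)
        yield TranslatedSegment(text, translated, audio)


# Stand-in stages (hypothetical) so the sketch runs without a real engine.
def fake_transcribe(chunk: bytes) -> str:
    return chunk.decode("utf-8")


def fake_translate(text: str, language: str) -> str:
    return f"[{language}] {text}"


def fake_synthesize(text: str, language: str) -> bytes:
    return text.encode("utf-8")


segments = list(
    run_pipeline(
        [b"Buongiorno", b"", b"Benvenuti"],
        fake_transcribe,
        fake_translate,
        fake_synthesize,
        "en",
    )
)
```

Passing the stages in as functions keeps the chain independent of any particular speech or translation provider, which is convenient when a stage has to be swapped or tested in isolation.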

3. Output & Distribution

The translation is produced in multiple formats (synthesized audio, translated text, subtitles) and distributed through two parallel channels:

Control Room

Real-time text on teleprompters, translated audio in the control room mix, and dedicated feeds for operators and speakers.

Converso App

Users receive the translation directly on their smartphone via QR code or direct link, with no dedicated devices or proprietary headphones required.

Real-Time Subtitles

The webapp and control room can display real-time subtitles in addition to audio. Subtitles are useful for attendees who cannot use headphones, for deaf and hard-of-hearing attendees, and for those who prefer to read a foreign language rather than listen to it.

- Synchronized real-time subtitles
- Available in all event languages
- Accessibility for deaf attendees
- Ideal for those who prefer reading

Operating Modes

The service supports three configurations, selectable per language and per event:

Full AI

Fully automatic transcription and translation. Suitable for events with many target languages, limited budget, or low-criticality content (e.g., parallel sessions, workshops, internal training).

- Scalable to dozens of simultaneous languages
- No interpreter required
- Audio and/or text output
- Low latency, real-time output

Human interpretation

The interpreter works in a booth (physical or virtual) and the translation is distributed via the same app. The flow for the end user is identical to AI mode.

- Certified professional quality
- Handles nuances, irony, technical jargon
- Same app for the end user
- Can be combined with AI on other languages

Hybrid mode

The event's main languages are covered by professional interpreters. Secondary or low-demand languages are handled by AI. The decision is per language, not per event.

- Interpreter for main language (e.g., English)
- AI for secondary languages (e.g., Portuguese, Japanese)
- Independent per-language configuration
- Transparent switching for the user
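
Because the decision is made per language, an event's configuration can be pictured as a mapping from language to mode. The sketch below is a hypothetical data model written for illustration: `Mode`, `channel_modes`, `event_mode`, and `languages_for` are assumed names, not part of any Converso interface.

```python
from enum import Enum


class Mode(Enum):
    AI = "ai"
    HUMAN = "human"


# Hypothetical event configuration: one mode per target language.
channel_modes = {
    "en": Mode.HUMAN,  # main language, covered by an interpreter
    "pt": Mode.AI,     # secondary language
    "ja": Mode.AI,     # secondary language
}


def event_mode(modes: dict[str, Mode]) -> str:
    """Name the overall configuration implied by the per-language choices."""
    if not modes:
        raise ValueError("at least one language channel is required")
    used = set(modes.values())
    if used == {Mode.AI}:
        return "Full AI"
    if used == {Mode.HUMAN}:
        return "Human interpretation"
    return "Hybrid mode"


def languages_for(modes: dict[str, Mode], mode: Mode) -> list[str]:
    """List the languages served in the given mode, sorted by code."""
    return sorted(lang for lang, m in modes.items() if m is mode)
```

With this model, "Hybrid mode" is not a separate setting: it is simply what results when some channels use an interpreter and others use AI.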

MRSI Approach

MRSI stands for Managed Remote Simultaneous Interpretation. It is the operational model through which Converso delivers interpretation and AI translation services.

Why "Managed"

The service is not delivered to the client as a self-service platform. Every event has a dedicated Converso technician who configures the system, manages startup, monitors the session in real time, and intervenes in case of anomalies.

Technician's role

- Language channel configuration, audio routing, transcription parameters
- Pre-event testing with end-to-end flow verification
- Real-time monitoring of transcription and translation quality
- Fallback management: manual switch from AI to interpreter if needed
- Coordination with audio control room and on-site AV service

Quality control and fallback

The technician verifies transcription quality in real time. If quality drops below the threshold (for example because of ambient noise, a strong accent, or connection issues), the technician can intervene manually by adjusting parameters, switching to a human interpreter, or activating a backup channel.
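
One way to picture a quality threshold is a rolling average over recent segments that raises an alert for the technician. The sketch below rests on assumptions: the text does not say how quality is measured, so the per-segment confidence score (0.0 to 1.0), the threshold value, the window size, and the `QualityMonitor` class are all hypothetical. The code only flags the drop; the decision to adjust parameters or switch channels stays with the technician.

```python
from collections import deque


class QualityMonitor:
    """Rolling average of per-segment confidence scores (assumed 0.0 to 1.0)."""

    def __init__(self, threshold: float = 0.8, window: int = 5) -> None:
        self.threshold = threshold
        self.scores: deque[float] = deque(maxlen=window)

    def add(self, confidence: float) -> bool:
        """Record a score; return True when the technician should be alerted."""
        self.scores.append(confidence)
        # Wait for a full window so a single poor segment does not trigger an alert.
        window_full = len(self.scores) == self.scores.maxlen
        average = sum(self.scores) / len(self.scores)
        return window_full and average < self.threshold


monitor = QualityMonitor(threshold=0.8, window=3)
alerts = [monitor.add(score) for score in [0.95, 0.9, 0.85, 0.6, 0.5]]
# alerts is [False, False, False, True, True]
```

Averaging over a window trades a little reaction time for fewer false alarms, which matters when the response is a manual intervention.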

Context Training: the system speaks your event's language

Every organization has its own language. Converso®'s context training primes the system with event-specific vocabulary before the session begins.

Glossaries and terminology

Product names, acronyms, registered trademarks, industry terminology: the system acquires them and treats them as primary reference in translation.
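
As an illustration of what treating a glossary as primary reference can mean in practice, the sketch below checks a translated segment against a glossary and reports terms whose approved rendering is missing. It is not a description of how Converso's context training works internally: the glossary entries (including the made-up product name "AcmeFlow") and the `missing_terms` function are assumptions for this example.

```python
import re

# Hypothetical glossary: source term -> rendering expected in the translation.
glossary = {
    "AcmeFlow": "AcmeFlow",   # made-up product name, kept untranslated
    "regia": "control room",  # illustrative venue terminology
}


def missing_terms(source: str, translation: str, glossary: dict[str, str]) -> list[str]:
    """Return source terms whose approved rendering is absent from the translation."""
    missing = []
    for term, approved in glossary.items():
        # Whole-word, case-insensitive match of the glossary term in the source.
        in_source = re.search(rf"\b{re.escape(term)}\b", source, re.IGNORECASE)
        if in_source and approved.lower() not in translation.lower():
            missing.append(term)
    return missing


source = "AcmeFlow è collegato alla regia."
good = missing_terms(source, "AcmeFlow is connected to the control room.", glossary)
bad = missing_terms(source, "Acme Flow is connected to the director's booth.", glossary)
# good is [], bad is ["AcmeFlow", "regia"]
```

A check like this is most useful during pre-event testing, when sample material can be run through the chain and any missed term corrected before the session starts.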

Documents and materials

Presentations, scripts, agendas, press releases: any text that anticipates the event content is automatically integrated into the active model.

Precision where it matters

In professional settings, getting a product name or an institutional guest's name wrong is a real problem. Context training addresses exactly these critical points.

A profile that grows

For recurring clients, the terminology profile is preserved and updated with every new project. The more Converso® works with you, the more it speaks like you.

User Experience

For the end user (the event listener), nothing changes regardless of the active mode:

- Access via Converso app from smartphone (QR code or link)
- Select the desired language
- Receive translated audio, translated text, or both
- When the channel is AI-powered, the user is informed transparently
- No installation needed: the app works from the browser

The interface, access flow, and available options are identical across all modes. The Converso app is the same for AI, human interpretation, and hybrid mode.

For AV Service Providers and Organizers

The AI translation service introduces no changes to the audiovisual service or event organizer workflow:

- Audio input remains the same: mixer feed, Dante, dedicated line
- No additional interpreter booths needed for AI-covered languages
- No infrared receivers or dedicated devices needed for the audience
- Distribution happens via data network (event WiFi or users' mobile network)
- The Converso technician interfaces directly with the audio control room
- Pre-event testing follows the same logic as standard soundchecks

For AV service providers, the setup is equivalent to a standard remote interpretation event. The AI component is transparent from the audio infrastructure perspective.

Technical Information and Configuration

For technical specifications, connectivity requirements, compatibility testing, or configuration for a specific event, contact the technical team.