Operational Service

AI Simultaneous Translation for Events

How the real-time automatic translation service works and how it fits into Converso's MRSI approach.

What It Is

Converso provides an AI-based simultaneous translation service for live events. The system captures the speaker's audio in real time, transcribes it, translates it into the target language, and distributes it as synthesized audio and/or text to listeners.

This is not a software product sold to third parties. It is a managed service: Converso configures, activates, monitors, and controls the entire chain during the event.

The service is delivered remotely (cloud) or with an on-site technician, depending on event complexity. The operational base is in Milan, but the service is available globally.

Technical Flow

The signal path from speaker to listener follows these stages:

1. Audio Input

The speaker's audio is captured from the venue mixer, a dedicated audio feed, or a direct microphone. Analog (XLR, jack) and digital (Dante, AES67, NDI) inputs are supported. The optional RSI Bridge bundles AV input, internet connectivity, and power redundancy into a single device. It is not mandatory: the audio stream can also be sent directly from our web broadcaster interface, without additional hardware.

2. Converso AI Engine

The audio signal enters the Converso AI Engine, the proprietary system that manages the entire processing chain. The engine performs real-time speech-to-text transcription, applies automatic translation to the target language, and generates synthesized audio output. The approach can also be hybrid: main languages handled by professional interpreters, secondary ones managed by AI. The entire process is monitored and controllable by the Converso technician.

Speech-to-Text · AI Translation · Live monitoring
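
The three processing stages described above (transcription, translation, synthesis) can be pictured as a simple chain. The following is a minimal sketch only, not Converso's implementation: `Segment`, `run_pipeline`, and the `fake_*` stand-in stages are hypothetical names invented for illustration, and a real engine would stream audio and call actual speech and translation services.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator


@dataclass
class Segment:
    """One translated unit flowing out of the pipeline."""
    source_text: str       # transcription of the speaker's audio
    translated_text: str   # text in the target language
    audio: bytes           # synthesized speech for the translated text


def run_pipeline(
    audio_chunks: Iterable[bytes],
    transcribe: Callable[[bytes], str],
    translate: Callable[[str, str], str],
    synthesize: Callable[[str, str], bytes],
    target_lang: str,
) -> Iterator[Segment]:
    """Apply the three stages to each audio chunk, in order."""
    for chunk in audio_chunks:
        text = transcribe(chunk)
        if not text:
            continue  # skip chunks with no recognized speech
        translated = translate(text, target_lang)
        yield Segment(text, translated, synthesize(translated, target_lang))


# Stand-in stages (assumptions) so the sketch runs without any real service.
def fake_transcribe(chunk: bytes) -> str:
    return chunk.decode("utf-8")


def fake_translate(text: str, lang: str) -> str:
    return f"[{lang}] {text}"


def fake_synthesize(text: str, lang: str) -> bytes:
    return text.encode("utf-8")
```

Passing the stages in as callables keeps the chain independent of any particular provider, so a stage could be swapped out without touching the others.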

3. Output & Distribution

The translation is produced in multiple formats (synthesized audio, translated text, subtitles) and distributed through two parallel channels:

Control Room

Real-time text on teleprompters, translated audio in the control room mix, and dedicated feeds for operators and speakers.

Converso App

Users receive the translation directly on their smartphone via QR code or direct link, with no dedicated devices or proprietary headphones required.

Real-Time Subtitles

The web app can display real-time subtitles in addition to audio. This is useful for attendees who can't use headphones, for deaf and hard-of-hearing attendees, and for those who prefer reading in a foreign language.

  • Synchronized real-time subtitles
  • Available in all event languages
  • Accessibility for deaf attendees
  • Ideal for those who prefer reading

Operating Modes

The service supports three configurations, selectable per language and per event:

Full AI

Fully automatic transcription and translation. Suitable for events with many target languages, limited budget, or low-criticality content (e.g., parallel sessions, workshops, internal training).

  • Scalable to dozens of simultaneous languages
  • No interpreter required
  • Audio and/or text output
  • Low latency, real-time output

Human interpretation

The interpreter works in a booth (physical or virtual) and the translation is distributed via the same app. The flow for the end user is identical to AI mode.

  • Certified professional quality
  • Handles nuances, irony, technical jargon
  • Same app for the end user
  • Can be combined with AI on other languages

Hybrid mode

The event's main languages are covered by professional interpreters. Secondary or low-demand languages are handled by AI. The decision is per language, not per event.

  • Interpreter for main language (e.g., English)
  • AI for secondary languages (e.g., Portuguese, Japanese)
  • Independent per-language configuration
  • Transparent switching for the user
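
Because the choice between AI and interpreter is made per language, the three operating modes can be seen as the result of a per-language mapping. The sketch below illustrates that idea under stated assumptions: `Mode`, `event_mode`, and the language codes are hypothetical and do not reflect Converso's actual configuration format.

```python
from enum import Enum


class Mode(Enum):
    AI = "ai"
    HUMAN = "human"


def event_mode(channels: dict[str, Mode]) -> str:
    """Derive the overall event configuration from per-language choices."""
    if not channels:
        raise ValueError("at least one language channel is required")
    modes = set(channels.values())
    if modes == {Mode.AI}:
        return "Full AI"
    if modes == {Mode.HUMAN}:
        return "Human interpretation"
    return "Hybrid mode"  # a mix of AI and interpreter channels


# Hypothetical event: interpreter for English, AI for the secondary languages.
channels = {"en": Mode.HUMAN, "pt": Mode.AI, "ja": Mode.AI}
print(event_mode(channels))
```

Changing a single entry (for example, moving `"pt"` to `Mode.HUMAN`) affects only that language channel, which mirrors the independent per-language configuration listed above.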

MRSI Approach

MRSI stands for Managed Remote Simultaneous Interpretation. It is the operational model through which Converso delivers interpretation and AI translation services.

Why "Managed"

The service is not delivered to the client as a self-service platform. Every event has a dedicated Converso technician who configures the system, manages startup, monitors the session in real time, and intervenes in case of anomalies.

Technician's role

  • Language channel configuration, audio routing, transcription parameters
  • Pre-event testing with end-to-end flow verification
  • Real-time monitoring of transcription and translation quality
  • Fallback management: manual switch from AI to interpreter if needed
  • Coordination with audio control room and on-site AV service

Quality control and fallback

The technician verifies transcription quality in real time. If quality drops below the threshold (for example because of ambient noise, a strong accent, or connection issues), the technician can intervene manually by adjusting parameters, switching to a human interpreter, or activating a backup channel.
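
One way to picture threshold-based monitoring is a rolling average over recent quality scores that flags a channel for the technician's attention. This is an illustrative sketch only: the source does not say how quality is measured, so the score scale, the `0.8` threshold, the window size, and the `QualityMonitor` class are all assumptions. The decision to intervene stays with the technician.

```python
from collections import deque


class QualityMonitor:
    """Flags a channel for technician review when recent quality is low."""

    def __init__(self, threshold: float = 0.8, window: int = 5) -> None:
        self.threshold = threshold           # assumed value, not from Converso
        self.scores = deque(maxlen=window)   # keeps only the latest scores

    def add(self, score: float) -> bool:
        """Record a score in [0, 1]; return True if review is suggested."""
        self.scores.append(score)
        window_full = len(self.scores) == self.scores.maxlen
        average = sum(self.scores) / len(self.scores)
        return window_full and average < self.threshold


monitor = QualityMonitor(threshold=0.8, window=3)
for score in (0.9, 0.5, 0.6):
    needs_review = monitor.add(score)
print(needs_review)
```

Averaging over a window instead of reacting to a single low score avoids flagging a channel for one momentary dip, such as a cough or a brief network hiccup.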

User Experience

For the end user (the event listener), nothing changes regardless of the active mode:

  • Access via Converso app from smartphone (QR code or link)
  • Select the desired language
  • Receive translated audio, translated text, or both
  • When the channel is AI-powered, the user is informed transparently
  • No installation needed: the app works from the browser

The interface, access flow, and available options are identical across all modes. The Converso app is the same for AI, human interpretation, and hybrid mode.

For AV Service Providers and Organizers

The AI translation service introduces no changes to the audiovisual service or event organizer workflow:

  • Audio input remains the same: mixer feed, Dante, dedicated line
  • No additional interpreter booths needed for AI-covered languages
  • No infrared receivers or dedicated devices needed for the audience
  • Distribution happens via data network (event WiFi or users' mobile network)
  • The Converso technician interfaces directly with the audio control room
  • Pre-event testing follows the same logic as standard soundchecks

For AV service providers, the setup is equivalent to a standard remote interpretation event. The AI component is transparent from the audio infrastructure perspective.

Technical Information and Configuration

For technical specifications, connectivity requirements, compatibility testing, or configuration for a specific event.

Contact the technical team