In design · Pipeline highlight
SAGE Voice Sync — the projector follows the leader.
A small dedicated listener box near the AV booth captures audio from the soundboard, transcribes the prayer leader in real time using on-device Whisper AI (multilingual, Hebrew + English + transliteration), matches what it hears against the live SAGE deck, and posts an advance through SAGE's existing operator API. The projector tracks the leader automatically — no operator at a laptop, no wireless clicker fumbling, no “wait, what slide are we on.”
Manual operator input always wins. A wireless clicker, the operator screen, or a touch on the projector instantly overrides voice sync for a few seconds — so the AI is a co-pilot, not an autopilot. Designed to fail soft: if the listener box is offline, the operator drives manually exactly as today, with no service-day surprise.
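The "manual input always wins" rule can be sketched as a small gate that the listener checks before posting any voice-driven advance. This is a minimal illustration, not SAGE's implementation: the class name `OverrideGate`, its methods, and the 5-second default are assumptions (the text above says only "a few seconds").

```python
import time


class OverrideGate:
    """Suppresses voice-driven advances for a short hold after manual input.

    Hypothetical helper: the name, API, and default hold are illustrative only.
    """

    def __init__(self, hold_seconds=5.0, clock=time.monotonic):
        self.hold_seconds = hold_seconds  # assumed value for "a few seconds"
        self._clock = clock               # injectable so the gate can be tested
        self._last_manual = None

    def note_manual_input(self):
        """Call whenever a clicker, operator screen, or touch event arrives."""
        self._last_manual = self._clock()

    def voice_allowed(self):
        """True once the hold window since the last manual input has elapsed."""
        if self._last_manual is None:
            return True
        return self._clock() - self._last_manual >= self.hold_seconds
```

If the listener box is offline, nothing ever consults the gate, so manual operation is unaffected.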
Why it's simpler than you'd expect
Existing PowerPoint-replacement systems require operators to embed metadata into every slide note, refactor the deck for machine-readability, and maintain a separate trigger map. SAGE already has all that structure (siddur page text, Hebrew, transliteration, English, section transitions, honor codes) as first-class data in the catalog. The listener just queries a small “match candidates” endpoint and posts an HTTP advance call. No deck refactoring. No metadata wrangling.
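The listener's flow (fetch candidates, match, advance) can be sketched as one function with its three dependencies passed in. Everything here is hypothetical: `listener_step` and the callables `fetch_candidates`, `match`, and `post_advance` stand in for the real HTTP calls and matcher, whose shapes are not specified on this page.

```python
def listener_step(transcript, fetch_candidates, match, post_advance):
    """One pass of the listener loop (illustrative sketch only).

    fetch_candidates() -> list of candidate slides from the match endpoint
    match(transcript, candidates) -> slide id, or None when nothing matches
    post_advance(slide_id) -> asks SAGE to advance to that slide
    """
    candidates = fetch_candidates()
    slide_id = match(transcript, candidates)
    if slide_id is None:
        return None          # nothing confident enough: leave the deck alone
    post_advance(slide_id)
    return slide_id
```

Passing the callables in keeps the flow testable without a network or a microphone.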
Smarter than dual-trigger logic
Sliding-window N-gram matching against the next 4-5 sequential slides PLUS the first slide of every upcoming section — so section-skipping (“let's jump ahead to Musaf”) works automatically. Transliteration normalization handles sh / š / kh / ḥ variants. Confidence threshold tunable per service. Telemetry feeds the calibration analytics pipeline so per-prayer timing data accumulates automatically.
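A rough sketch of the matching idea, under stated assumptions: slides are dicts with hypothetical `id`, `section`, and `text` keys; š is folded to sh and ḥ to kh; the score is the share of the transcript's recent word trigrams found in a slide. The window size, N, and the 0.5 threshold are placeholder values, not SAGE's tuning.

```python
import re
import unicodedata

# Assumed folding table: map diacritic spellings onto their digraph variants.
_FOLD = {"š": "sh", "ḥ": "kh"}


def normalize(text):
    """Lowercase, fold transliteration variants, and split into word tokens."""
    text = unicodedata.normalize("NFC", text).lower()
    for src, dst in _FOLD.items():
        text = text.replace(src, dst)
    text = re.sub(r"['’ʼ]", "", text)       # "sh'ma" and "shma" compare equal
    return re.findall(r"[^\W_]+", text)     # letters and digits in any script


def ngrams(tokens, n):
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def score(heard_tokens, slide_tokens, n=3):
    """Fraction of the heard n-grams that also occur in the slide text."""
    heard = ngrams(heard_tokens, n)
    if not heard:
        return 0.0
    return len(heard & ngrams(slide_tokens, n)) / len(heard)


def candidate_slides(slides, current_index, lookahead=5):
    """Next `lookahead` slides plus the first slide of every later section."""
    picked = []
    prev_section = slides[current_index]["section"]
    for offset, slide in enumerate(slides[current_index + 1:]):
        starts_section = slide["section"] != prev_section
        if offset < lookahead or starts_section:
            picked.append(slide)
        prev_section = slide["section"]
    return picked


def best_match(transcript, candidates, window=12, n=3, threshold=0.5):
    """Return (slide_id, score); slide_id is None when below the threshold."""
    heard = normalize(transcript)[-window:]  # sliding window over recent words
    best_id, best_score = None, 0.0
    for slide in candidates:
        s = score(heard, normalize(slide["text"]), n)
        if s > best_score:
            best_id, best_score = slide["id"], s
    if best_score < threshold:
        return None, best_score
    return best_id, best_score
```

Because section-opening slides are always candidates, a jump ahead to a later section can match even when it lies well beyond the sequential lookahead.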
Auto Quiet Mode for sermons
When the current slide is in a section flagged as a quiet zone (sermon, silent amidah, meditation), the listener narrows matching to section-transition cues only — so a 15-minute d'var Torah doesn't risk advancing the projector mid-thought. Resumes full matching automatically when the leader transitions back into liturgy. No wake-word toggling, no manual sleep mode.
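Quiet-zone narrowing can be pictured as a filter over the candidate list. In this sketch the `quiet_zone` and `starts_section` keys are hypothetical field names, assumed to be present on each slide record.

```python
def narrow_for_quiet_zone(current_slide, candidates):
    """Keep only section-transition cues while the current slide is quiet.

    Assumes each slide dict carries two hypothetical boolean flags:
    'quiet_zone' (sermon, silent amidah, meditation) and 'starts_section'.
    """
    if not current_slide.get("quiet_zone", False):
        return list(candidates)   # normal liturgy: match against everything
    return [c for c in candidates if c.get("starts_section", False)]
```

Full matching resumes on its own: once the deck advances to a slide outside the quiet zone, the flag on the new current slide is false and the filter passes every candidate through.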
Hardware: small, silent, ~$700–$1,600
A Jetson Orin Nano dev kit ($499) plus a Focusrite Scarlett 2i2 ($199) is the compact path: fanless, lives in an AV closet, and runs Whisper-large-v3-turbo in real time. A Mac Mini M4 Pro ($1,400) is the easier deployment if your team prefers macOS. 100% local processing: no cloud calls, no internet dependency during services.
Architecture preview
SAGE gets a new /api/sage/[serviceId]/match-targets endpoint returning a trimmed payload of upcoming slide text, optimized for matching. A new tenant.services.run_voice_sync permission gates a service-account API token used by the listener box. New voice_sync_events and voice_sync_listener_sessions tables capture telemetry for post-service review and feed the calibration analytics pipeline. The reference Python listener ships open-source as a Docker container + a Jetson image, so synagogues can self-host with confidence.
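As a rough illustration of how the listener box might call the new endpoint, the sketch below builds (but does not send) the request using only the standard library. The path comes from the description above; the `Bearer` authorization scheme, the JSON `Accept` header, and the function name are assumptions, since the token format and response shape are not specified here.

```python
from urllib.parse import quote
from urllib.request import Request


def build_match_targets_request(base_url, service_id, token):
    """Build a GET request for /api/sage/[serviceId]/match-targets.

    Hypothetical helper: header names and auth scheme are assumed.
    """
    url = f"{base_url.rstrip('/')}/api/sage/{quote(service_id, safe='')}/match-targets"
    headers = {
        "Authorization": f"Bearer {token}",  # assumed scheme for the service-account token
        "Accept": "application/json",        # assumed response format
    }
    return Request(url, headers=headers, method="GET")
```

The returned object can be passed to `urllib.request.urlopen` once a real base URL and a token carrying the tenant.services.run_voice_sync permission are available.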