EventQuote AI
Retrieval-augmented quoting for enterprise event vendors

- Role: ML Engineer & Backend Lead
- Stack: FastAPI, PostgreSQL, Redis, Docker, Cloudflare, OpenAI, pgvector
- Metrics: 24k requests/day, p99 180ms, 99.8% uptime over 90 days
Event vendors spend hours assembling quotes manually — pulling specs from PDFs, checking vendor availability, formatting line items. The process is slow and inconsistent enough to lose deals to competitors who respond faster. EventQuote AI replaces that workflow with a retrieval-augmented pipeline that reads a client brief and generates a structured, catalog-grounded quote in under two minutes.
What I built
The core is a three-stage RAG pipeline.

- Ingestion: vendor rate sheets and product catalogs are chunked, embedded, and stored in PostgreSQL with pgvector; a background worker keeps embeddings current as vendors update pricing.
- Retrieval: at query time the client brief is embedded, top-k chunks are fetched by cosine similarity, and a reranking pass filters noise before generation.
- Generation: an LLM with a structured output schema assembles the quote; line items that don’t trace back to retrieved catalog entries are flagged for human review rather than silently hallucinated.
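The ingestion and retrieval stages can be sketched as follows. The chunker, table name, and column names are illustrative assumptions, not the production code; the `<=>` operator is pgvector's cosine-distance operator, so ordering ascending returns the most similar chunks first.

```python
def chunk_rate_sheet(text: str, max_tokens: int = 200, overlap: int = 40) -> list[str]:
    """Split a vendor rate sheet into overlapping word-window chunks
    (a simple stand-in for whatever tokenizer-aware chunking is used)."""
    words = text.split()
    if not words:
        return []
    step = max_tokens - overlap
    return [" ".join(words[i:i + max_tokens]) for i in range(0, len(words), step)]

# Top-k retrieval by cosine similarity with pgvector. `catalog_chunks`
# and the parameter names are hypothetical.
TOP_K_SQL = """
SELECT id, content, 1 - (embedding <=> %(query_vec)s) AS cosine_sim
FROM catalog_chunks
ORDER BY embedding <=> %(query_vec)s
LIMIT %(k)s;
"""
```

The retrieved rows would then go through the reranking pass before being handed to the generation stage.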
FastAPI serves the API layer. Redis caches embedding lookups for repeat queries. The services run in Docker behind Cloudflare, backed by a managed Postgres instance.
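The embedding cache can be sketched like this. The key scheme, TTL, and model name are assumptions for illustration; `embed_fn` stands in for the actual OpenAI embedding call.

```python
import hashlib
import json

def embedding_cache_key(text: str, model: str = "text-embedding-3-small") -> str:
    # Key on a hash of the normalized text plus the model name, so a
    # model upgrade can never serve stale vectors. Model name is assumed.
    digest = hashlib.sha256(text.strip().lower().encode()).hexdigest()
    return f"emb:{model}:{digest}"

def get_embedding_cached(redis_client, embed_fn, text: str) -> list[float]:
    """Return a cached embedding if present, else compute and cache it."""
    key = embedding_cache_key(text)
    cached = redis_client.get(key)
    if cached is not None:
        return json.loads(cached)
    vec = embed_fn(text)
    redis_client.setex(key, 3600, json.dumps(vec))  # 1h TTL, illustrative
    return vec
```

Any client exposing Redis's `GET`/`SETEX` works here, which also makes the cache easy to fake in tests.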
Hard parts
Catalog freshness vs. query latency. Vendors update pricing frequently, so stale embeddings produce wrong quotes. The naive fix — re-embed on every update — blocked reads. Decoupling ingestion into an async worker with a write-ahead log solved this: the query path always reads from a consistent snapshot while the worker catches up.
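The snapshot/worker split above can be sketched in memory. This is a minimal illustration of the idea, assuming a single ingestion worker; the production version lives in Postgres, and `SnapshotStore` and its method names are hypothetical.

```python
import threading

class SnapshotStore:
    """Readers always see a complete, consistent embedding snapshot.
    The ingestion worker builds the next version off to the side
    (replaying pending vendor updates from its log) and swaps it in
    atomically, so the query path is never blocked by re-embedding."""

    def __init__(self):
        self._lock = threading.Lock()
        self._active: dict[str, list[float]] = {}

    def read(self) -> dict[str, list[float]]:
        # A single reference read; readers keep whatever snapshot
        # they grabbed even if a new one is published mid-query.
        return self._active

    def publish(self, new_snapshot: dict[str, list[float]]) -> None:
        # Single writer: the async ingestion worker.
        with self._lock:
            self._active = new_snapshot
```

The same shape maps onto SQL as a version column plus a pointer to the active version, with the worker publishing by advancing the pointer.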
Hallucination containment. LLMs will confidently invent plausible-sounding line items. A post-generation validation pass checks every quoted item against the retrieval context; anything ungrounded gets quarantined. This added ~30ms to p99 latency but eliminated the category of error that most eroded vendor trust.
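The validation pass can be sketched as a token-overlap check between each generated line item and the retrieval context. The field names, threshold, and matching rule are illustrative; the real check may well be stricter (e.g., exact SKU lookup).

```python
def validate_line_items(
    quote_items: list[dict],
    retrieved_chunks: list[str],
    min_overlap: float = 0.6,
) -> tuple[list[dict], list[dict]]:
    """Split generated items into grounded vs. quarantined.

    An item is grounded when enough of its name tokens appear in the
    retrieval context; anything else is quarantined for human review
    instead of being shipped. Substring matching is deliberately crude
    here -- a sketch, not the production matcher.
    """
    context = " ".join(retrieved_chunks).lower()
    grounded, quarantined = [], []
    for item in quote_items:
        tokens = item["name"].lower().split()
        hits = sum(1 for t in tokens if t in context)
        if tokens and hits / len(tokens) >= min_overlap:
            grounded.append(item)
        else:
            quarantined.append(item)
    return grounded, quarantined
```

Because the check runs after generation, it adds a fixed cost per quote rather than slowing the LLM call itself.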
Retrieval quality floor. Dense retrieval alone sometimes surfaced semantically similar but wrong catalog entries when a brief named a specific product. Adding a lexical search leg alongside the dense one (hybrid search) put a floor under retrieval quality; the A/B comparison against dense-only retrieval is in the results below.
Result
Quote turnaround dropped from roughly 45 minutes of manual work to under 2 minutes.
24k requests/day at p99 180ms end-to-end. 99.8% uptime over 90 days in production. The retrieval-quality floor from hybrid search reduced quote revision requests by roughly a third compared to dense-only retrieval in A/B testing.