ChiPy __Main__ Meeting


When: July 9, 2026, 6 p.m.

Where: mHUB

1623 W Fulton St, Chicago, IL 60612

RSVPs

Registration for this event will close on Thursday, July 9, at 12:00 p.m.

Attendance:
In Person Pythonistas: 5

Topics


  • Inference at Scale: Transcribing Millions of Insurance Calls with Whisper and Azure ML
    By: Jimmy Scray
    Experience Level: Intermediate
    Length: 25 Minutes
    Description:

    Transcribing a few audio files with Whisper is easy. Transcribing millions of recordings efficiently, reliably, and cost-effectively is a very different problem.

    In this talk, I'll dive into the Python code and infrastructure behind a large-scale speech transcription platform built for the insurance industry. Starting from a notebook prototype, we'll explore how the system evolved into a distributed inference pipeline running across thousands of GPU workers.

    Rather than machine learning theory, we'll focus on inference engineering: benchmarking CPU and GPU workloads, maximizing throughput, orchestrating jobs with Azure Machine Learning, handling spot-instance interruptions, and writing resilient Python code that can recover from failures and resume processing automatically.

    Along the way, I'll share benchmark results, architecture decisions, code examples, and the lessons learned while processing millions of real-world recordings.

    If you're interested in Python, distributed systems, performance optimization, or production machine learning infrastructure, this talk will show what happens after the model is trained.
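
    As a rough illustration of the "recover from failures and resume processing" idea in the description, here is a minimal sketch using only the Python standard library. It is not the speaker's code. The names `load_done`, `process_all`, and `transcribe`, and the JSON Lines checkpoint layout, are assumptions made for this example, and it assumes the checkpoint file lives on storage that survives a worker being interrupted.

```python
import json
import os
from pathlib import Path


def load_done(checkpoint: Path) -> set[str]:
    """Return the IDs already recorded in the checkpoint file."""
    done: set[str] = set()
    if not checkpoint.exists():
        return done
    with checkpoint.open() as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            try:
                done.add(json.loads(line)["id"])
            except (json.JSONDecodeError, KeyError):
                # A record cut off mid-write is ignored, so that
                # item is simply processed again on the next run.
                continue
    return done


def _needs_newline(checkpoint: Path) -> bool:
    """True if the file ends in a partial record (no trailing newline)."""
    if not checkpoint.exists() or checkpoint.stat().st_size == 0:
        return False
    with checkpoint.open("rb") as fh:
        fh.seek(-1, os.SEEK_END)
        return fh.read(1) != b"\n"


def process_all(items, transcribe, checkpoint: Path) -> int:
    """Run `transcribe` on every item not yet checkpointed.

    `transcribe` is a hypothetical callable taking an item ID and
    returning text. Returns the number of items processed in this run.
    """
    done = load_done(checkpoint)
    repair = _needs_newline(checkpoint)
    processed = 0
    with checkpoint.open("a") as fh:
        if repair:
            fh.write("\n")  # keep a partial record from merging with the next one
        for item_id in items:
            if item_id in done:
                continue  # finished in an earlier run
            text = transcribe(item_id)
            fh.write(json.dumps({"id": item_id, "text": text}) + "\n")
            fh.flush()
            os.fsync(fh.fileno())  # make the record durable before moving on
            processed += 1
    return processed
```

    Calling `process_all` again with the same item list after a crash or interruption skips everything already recorded, so the only repeated work is the item that was in flight when the worker stopped.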