GitHub - altalt-org/Lightning-SimulWhisper: An MLX/CoreML implementation of SimulStreaming. ~15x increase in performance

altalt-org
2025.11.09
GitHub · by Anonymous
#MLX #CoreML #Whisper #Speech Recognition #Apple Silicon

Key Points

  1. Lightning-SimulWhisper is an MLX/CoreML implementation of SimulStreaming for real-time local transcription on Apple Silicon, offering substantial performance and power-efficiency improvements.
  2. It uses a hybrid architecture: a CoreML encoder for up to 18x faster encoding on the Neural Engine and an MLX decoder for up to 15x faster decoding, employing the AlignAtt policy for simultaneous speech recognition.
  3. This optimized design enables real-time execution of larger Whisper models (e.g., medium, large-v3-turbo) on Apple Silicon devices with significantly lower power consumption than MLX-only solutions.

Lightning-SimulWhisper is a high-performance, real-time local transcription system optimized for Apple Silicon devices, leveraging Apple's MLX machine learning framework and CoreML. It specifically implements the Whisper model for simultaneous speech recognition, adopting the AlignAtt policy for efficient streaming. The project distinguishes itself by eliminating PyTorch dependencies and achieving substantial speed and power efficiency improvements.
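The core idea of the AlignAtt policy is to use the decoder's cross-attention as a stopping signal: a candidate token is only emitted if the encoder frame it attends to most is far enough from the end of the audio received so far; otherwise the decoder waits for more input. The sketch below illustrates that halting test with NumPy; the function name, the averaged attention vector, and the frame threshold are illustrative assumptions, not the repository's actual API.

```python
import numpy as np

def alignatt_should_wait(cross_attention, num_frames, frame_threshold=2):
    """AlignAtt halting test (sketch): a candidate token is emitted only if
    the encoder frame it attends to most is NOT within the last
    `frame_threshold` frames of the audio available so far."""
    # cross_attention: per-frame attention weights for the candidate token,
    # assumed averaged over heads/layers for this sketch.
    most_attended = int(np.argmax(cross_attention))
    return most_attended >= num_frames - frame_threshold

# Token attending mostly to an early frame -> safe to emit now.
attn = np.array([0.1, 0.6, 0.2, 0.05, 0.05])
print(alignatt_should_wait(attn, num_frames=5))   # False -> emit

# Token attending mostly to the newest frames -> wait for more audio.
attn = np.array([0.05, 0.05, 0.1, 0.2, 0.6])
print(alignatt_should_wait(attn, num_frames=5))   # True -> wait
```

Because the check needs only the argmax of one attention vector per decoding step, it adds negligible cost on top of the decoder forward pass.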

The core methodology employs a hybrid architectural approach to optimize different stages of the Whisper pipeline:

  1. Audio Input Processing: The initial audio input (16kHz mono) is first processed by MLX to generate the Mel Spectrogram.
  2. CoreML Encoder Acceleration: This is the pivotal optimization. The encoder component of the Whisper model is offloaded to CoreML, utilizing the whisper.cpp integration to harness Apple's Neural Engine (ANE). This dramatically accelerates the most computationally intensive part of the model, yielding up to an 18x speedup for encoding compared to traditional implementations. The output encoder features are then converted from CoreML's format back into MLX tensors.
  3. MLX Decoder Implementation: The decoding phase runs entirely on MLX. This allows for flexible and fine-grained control over the decoding process, including the implementation of the AlignAtt policy, which is a state-of-the-art strategy for simultaneous decoding in real-time. The MLX decoder demonstrates up to a 15x speedup compared to PyTorch-based implementations, further contributing to overall performance.
  4. Transcription Output: The decoded tokens are then assembled into the final transcription.
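The four-stage data flow above can be sketched as a short pipeline. Everything here is a stand-in: the function names, shapes, and stubbed outputs are illustrative assumptions (the real project calls into MLX and a CoreML-compiled encoder), but the handoff order matches the steps described.

```python
import numpy as np

# Hypothetical stand-ins for the real components; names and shapes are
# illustrative, not the repository's actual API.
N_MELS, D_MODEL = 80, 512

def mlx_mel_spectrogram(audio_16k_mono):
    # Step 1: MLX computes the log-Mel spectrogram (stubbed with zeros here).
    n_frames = len(audio_16k_mono) // 160          # 10 ms hop at 16 kHz
    return np.zeros((N_MELS, n_frames), dtype=np.float32)

def coreml_encode(mel):
    # Step 2: the encoder runs on the Neural Engine via CoreML; its output
    # comes back as a plain array (Whisper downsamples frames by 2)...
    return np.zeros((mel.shape[1] // 2, D_MODEL), dtype=np.float32)

def to_mlx(array):
    # ...and is converted into an MLX tensor (identity stand-in here).
    return array

def mlx_decode_alignatt(encoder_features):
    # Step 3: MLX decoder guided by the AlignAtt policy (stubbed token ids).
    return [50258, 50259, 50359]

audio = np.zeros(16000, dtype=np.float32)          # 1 s of silence
mel = mlx_mel_spectrogram(audio)
features = to_mlx(coreml_encode(mel))
tokens = mlx_decode_alignatt(features)             # Step 4: tokens -> text
print(mel.shape, features.shape, len(tokens))
```

The split mirrors the design trade-off: the fixed, compute-heavy encoder is a natural fit for the ahead-of-time-compiled CoreML path, while the decoder stays in MLX where per-step control (beam search, AlignAtt halting) is easy to express.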

This hybrid architecture, where the compute-heavy encoder utilizes the Neural Engine via CoreML for unparalleled speed and power efficiency, and the flexible decoder leverages MLX's native Apple Silicon optimizations, enables real-time transcription even with larger Whisper models (e.g., medium, large-v3-turbo) on devices like the M2 MacBook Pro. The use of CoreML for encoding significantly reduces power consumption compared to MLX-only implementations.

Key features include:

  • Native Apple Silicon optimization with MLX and CoreML.
  • Up to 18x encoder speedup via CoreML and Apple Neural Engine.
  • Up to 15x decoder speedup via MLX.
  • Implementation of the AlignAtt policy for simultaneous decoding.
  • Support for various Whisper model sizes (tiny, base, small, medium, large-v1/v2/v3).
  • Configurable beam search decoding (e.g., --beams 3).
  • Real-time streaming capabilities from both audio files (simulation) and live microphone input.
  • Integration with CIF (Continuous Integrate and Fire) models for improved word boundary detection (e.g., --cif_ckpt_path cif_model/medium.npz).
  • Optional Voice Activity Detection (VAD) using Silero (requires torchaudio).
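The CIF models mentioned above follow a simple mechanism that can be sketched independently of the project: each encoder frame carries a weight, the weights are accumulated, and a boundary "fires" whenever the running sum crosses a threshold. The function below is a minimal illustration of that idea with made-up weights, not the repository's implementation.

```python
import numpy as np

def cif_boundaries(alphas, threshold=1.0):
    """Continuous Integrate-and-Fire (sketch): accumulate per-frame weights
    and fire a word/token boundary each time the sum crosses the threshold."""
    boundaries, acc = [], 0.0
    for i, a in enumerate(alphas):
        acc += a
        if acc >= threshold:
            boundaries.append(i)       # frame index where a boundary fires
            acc -= threshold           # carry the remainder forward
    return boundaries

# Made-up per-frame weights; boundaries fire at frames 2, 4, and 6.
alphas = np.array([0.3, 0.4, 0.5, 0.2, 0.7, 0.1, 0.9])
print(cif_boundaries(alphas))          # [2, 4, 6]
```

Sharper boundary estimates like these let a streaming transcriber commit words earlier without cutting them mid-syllable.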