Zyglio
01A voice-based training assistant powered by local LLM inference.
- Engineered a voice-based training assistant using vLLM and a quantized Qwen2.5 model for efficient local inference.
- Designed a real-time voice pipeline with LiveKit, Faster-Whisper, and Chatterbox, achieving 250-300 ms latency.
- Fine-tuned the model with LoRA, TRL, and PEFT to improve accuracy for domain-specific learning scenarios.
vLLMQwen2.5LiveKitFaster-WhisperLoRA