Parakeet TDT - Ultra-Fast ASR
Powered by NVIDIA Parakeet TDT 0.6B on ZeroGPU.
Performance: roughly 3000x faster than real time on GPU, so 1 hour of audio transcribes in about 1 second.
Capabilities:
- Word-level timestamps
- Long audio support (auto-chunking)
- High accuracy (state-of-the-art on LibriSpeech)
API Endpoints for EagleEye:
- POST /call/api_transcribe - Full transcription
- POST /call/api_transcribe_segment - Short segment transcription
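The `/call/...` paths follow Gradio's two-step HTTP pattern: the POST returns an `event_id`, and a follow-up GET on `/call/<api_name>/<event_id>` streams the result as server-sent events. The sketch below parses such a stream. It assumes the standard Gradio event names (`complete`, `error`, `heartbeat`); the helper name `parse_gradio_sse` and the sample stream text are illustrative and were not captured from this Space.

```python
import json


def parse_gradio_sse(stream_text):
    """Return the decoded `data` payload of the first `complete` event.

    Raises RuntimeError if the stream reports an `error` event and
    returns None if the stream ends without a `complete` event.
    """
    event = None
    for line in stream_text.splitlines():
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            payload = line[len("data:"):].strip()
            if event == "error":
                raise RuntimeError(f"Gradio call failed: {payload}")
            if event == "complete":
                return json.loads(payload)
    return None


# Hypothetical stream text for illustration only.
sample = (
    "event: heartbeat\n"
    "data: null\n"
    "\n"
    "event: complete\n"
    'data: [{"success": true, "text": "Hello", "words": []}]\n'
)
outputs = parse_gradio_sse(sample)
print(outputs[0]["text"])
```

Gradio wraps return values in a list, one entry per output component, which is why the sketch reads `outputs[0]`. The `gradio_client` examples in the next section handle both HTTP steps for you.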
API Usage for EagleEye Integration
Full Transcription
from gradio_client import Client
client = Client("Cadayn/parakeet-zerogpu")
result = client.predict(
    audio_url="https://example.com/audio.mp3",
    api_name="/api_transcribe"
)
print(result)
# {"success": True, "text": "...", "words": [{"start_s": 0.0, "end_s": 0.5, "text": "Hello"}, ...]}
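One use for the word-level timestamps is subtitle generation. The sketch below groups the `words` list into SRT cues. It assumes the result shape shown in the comment above (`start_s`, `end_s`, `text` per word); the function names and the `max_words` and `max_gap_s` thresholds are hypothetical choices, not part of the API.

```python
def format_srt_time(seconds):
    """Format seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = int(round(seconds * 1000))
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words, max_words=7, max_gap_s=0.8):
    """Group timestamped words into SRT cues.

    A new cue starts when the current one holds `max_words` words or
    when the silence before the next word exceeds `max_gap_s`.
    """
    cues, current = [], []
    for word in words:
        gap = word["start_s"] - current[-1]["end_s"] if current else 0.0
        if current and (len(current) >= max_words or gap > max_gap_s):
            cues.append(current)
            current = []
        current.append(word)
    if current:
        cues.append(current)

    blocks = []
    for index, cue in enumerate(cues, start=1):
        start = format_srt_time(cue[0]["start_s"])
        end = format_srt_time(cue[-1]["end_s"])
        text = " ".join(w["text"] for w in cue)
        blocks.append(f"{index}\n{start} --> {end}\n{text}\n")
    return "\n".join(blocks)


# Illustrative input in the documented shape.
example_words = [
    {"start_s": 0.0, "end_s": 0.5, "text": "Hello"},
    {"start_s": 0.6, "end_s": 1.0, "text": "world"},
    {"start_s": 3.0, "end_s": 3.4, "text": "Again"},
]
print(words_to_srt(example_words))
```

With a real response, pass `result["words"]` after checking `result["success"]`.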
Segment Transcription (for streaming)
import base64

# Reuses the `client` created in the Full Transcription example.
with open("segment.wav", "rb") as f:
    audio_b64 = base64.b64encode(f.read()).decode()

result = client.predict(
    audio_base64=audio_b64,
    start_s=30.0,  # Offset for timestamp alignment
    api_name="/api_transcribe_segment"
)
print(result)
# {"success": True, "words": [{"start_s": 30.5, "end_s": 31.0, "text": "word"}, ...]}
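For streaming-style use, each segment needs a `start_s` that matches its position in the full recording. The sketch below cuts a WAV source into consecutive segments with the standard-library `wave` module and yields the offset alongside the base64 payload. The helper name `iter_wav_segments` and the 10-second default are assumptions for illustration; whether the endpoint accepts every WAV sample rate and width is not stated in this README.

```python
import base64
import io
import wave


def iter_wav_segments(path_or_file, segment_s=10.0):
    """Yield (start_s, audio_base64) pairs for consecutive WAV segments."""
    with wave.open(path_or_file, "rb") as src:
        params = src.getparams()
        frames_per_segment = int(segment_s * params.framerate)
        bytes_per_frame = params.sampwidth * params.nchannels
        position = 0  # frames consumed so far
        while True:
            frames = src.readframes(frames_per_segment)
            if not frames:
                break
            # Re-wrap the raw frames in a WAV header so each segment
            # is a valid standalone file.
            buffer = io.BytesIO()
            with wave.open(buffer, "wb") as dst:
                dst.setnchannels(params.nchannels)
                dst.setsampwidth(params.sampwidth)
                dst.setframerate(params.framerate)
                dst.writeframes(frames)
            start_s = position / params.framerate
            yield start_s, base64.b64encode(buffer.getvalue()).decode()
            position += len(frames) // bytes_per_frame


# Usage with the client from the examples above (not executed here):
# for start_s, audio_b64 in iter_wav_segments("long_recording.wav"):
#     result = client.predict(
#         audio_base64=audio_b64,
#         start_s=start_s,
#         api_name="/api_transcribe_segment",
#     )
```

Because `start_s` is derived from the frame count and not from wall-clock time, the returned word timestamps stay aligned with the original recording even if requests are delayed.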