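openai-gpt4o-transcribe wins on accuracy, with the lower CER on every clip. deepgram-nova-3 is cheaper per minute ($0.0052 vs $0.0060) and has the lower average latency (1.97s vs 4.01s), although it was slower on one clip (en-a-noise).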
Rows tagged "noisy" are the noisy clips. Lat = API latency in seconds.
| Sample | openai-gpt4o-transcribe CER | openai-gpt4o-transcribe WER | openai-gpt4o-transcribe Lat | deepgram-nova-3 CER | deepgram-nova-3 WER | deepgram-nova-3 Lat |
|---|---|---|---|---|---|---|
| en-a-clean | 0.0401 | 0.0473 | 4.12s | 0.0518 | 0.0676 | 1.5s |
| en-a-noise (noisy) | 0.0518 | 0.0878 | 3.33s | 0.0968 | 0.1622 | 4.45s |
| en-b-clean | 0.0472 | 0.0818 | 4.04s | 0.0730 | 0.1321 | 2.91s |
| en-b-noise (noisy) | 0.0572 | 0.1006 | 4.24s | 0.0730 | 0.1195 | 0.99s |
| ko-a-clean | 0.0548 | 0.2079 | 4.04s | 0.0685 | 0.2376 | 1.99s |
| ko-a-noise (noisy) | 0.0651 | 0.1980 | 4.03s | 0.0753 | 0.3168 | 0.92s |
| ko-b-clean | 0.0337 | 0.0690 | 4.1s | 0.0599 | 0.1638 | 2.23s |
| ko-b-noise (noisy) | 0.0899 | 0.1810 | 4.2s | 0.1423 | 0.3276 | 0.76s |
Average CER on clean vs noisy clips. Lower Δ = more noise-robust.
| Model | Clean avg CER | Noisy avg CER | Degradation Δ |
|---|---|---|---|
| openai-gpt4o-transcribe | 0.0440 | 0.0660 | +0.0221 |
| deepgram-nova-3 | 0.0633 | 0.0969 | +0.0336 |
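The degradation figures above can be reproduced from the per-clip CER values. The following is a minimal sketch, not the benchmark's own code: the dictionary layout and the `degradation` helper are illustrative names, and the inputs are the already-rounded values from the results table, so the last digit may differ slightly from the table (for example +0.0220 instead of +0.0221).

```python
from statistics import mean

# Per-clip CER values copied from the results table (already rounded to 4 places).
CER_BY_MODEL = {
    "openai-gpt4o-transcribe": {
        "clean": [0.0401, 0.0472, 0.0548, 0.0337],
        "noisy": [0.0518, 0.0572, 0.0651, 0.0899],
    },
    "deepgram-nova-3": {
        "clean": [0.0518, 0.0730, 0.0685, 0.0599],
        "noisy": [0.0968, 0.0730, 0.0753, 0.1423],
    },
}


def degradation(scores):
    """Return (clean average, noisy average, noisy minus clean) for one model."""
    clean_avg = mean(scores["clean"])
    noisy_avg = mean(scores["noisy"])
    return clean_avg, noisy_avg, noisy_avg - clean_avg


for model, scores in CER_BY_MODEL.items():
    clean_avg, noisy_avg, delta = degradation(scores)
    print(f"{model}: clean={clean_avg:.4f} noisy={noisy_avg:.4f} delta={delta:+.4f}")
```

A smaller delta means the model loses less accuracy when noise is added, which is the sense in which the table calls a model more noise-robust.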
Cost = audio duration × price/min. Latency = API response time only — rate-limit pauses excluded.
| Model | $/min | Audio | Est. cost | Avg latency | Total latency |
|---|---|---|---|---|---|
| openai-gpt4o-transcribe | $0.0060 | 7.88 min | $0.047279 | 4.01s | 32.10s |
| deepgram-nova-3 | $0.0052 | 7.88 min | $0.040975 | 1.97s | 15.75s |
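As a rough check of the cost and latency columns, the sketch below applies the stated formula (audio duration × price/min) and sums the per-clip latencies from the results table. The names `PRICE_PER_MIN`, `AUDIO_MINUTES`, `LATENCIES`, and `summarize` are illustrative, not part of the benchmark. Because the audio duration is given only to two decimals (7.88 min), the computed cost can differ from the table in the last digit.

```python
# Prices and audio duration as listed in the cost table.
PRICE_PER_MIN = {
    "openai-gpt4o-transcribe": 0.0060,
    "deepgram-nova-3": 0.0052,
}
AUDIO_MINUTES = 7.88  # rounded value from the table

# Per-clip API latencies in seconds, copied from the results table.
LATENCIES = {
    "openai-gpt4o-transcribe": [4.12, 3.33, 4.04, 4.24, 4.04, 4.03, 4.10, 4.20],
    "deepgram-nova-3": [1.50, 4.45, 2.91, 0.99, 1.99, 0.92, 2.23, 0.76],
}


def summarize(model):
    """Estimated cost plus average and total API latency for one model."""
    latencies = LATENCIES[model]
    total = sum(latencies)
    return {
        "est_cost": AUDIO_MINUTES * PRICE_PER_MIN[model],
        "avg_latency": total / len(latencies),
        "total_latency": total,
    }


for model in PRICE_PER_MIN:
    s = summarize(model)
    print(f"{model}: ${s['est_cost']:.6f}, "
          f"avg {s['avg_latency']:.2f}s, total {s['total_latency']:.2f}s")
```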
CER — Character Error Rate
Primary metric for Korean. Spaces are stripped before comparison because Korean spacing is inconsistent across models. Follows the KsponSpeech evaluation standard.
WER — Word Error Rate
Secondary metric. Less reliable for Korean due to ambiguous word boundaries. Use CER as primary for Korean content.
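The two metrics above can be sketched with a plain Levenshtein distance: over characters (spaces removed) for CER and over whitespace-separated tokens for WER. This is a minimal illustration under those assumptions, not the benchmark's actual scoring code; any text normalization beyond space stripping (punctuation, casing, numerals) is not described here and is left out. The function names are illustrative.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or token lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion from the reference
                curr[j - 1] + 1,     # insertion into the hypothesis
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate with all whitespace stripped before comparison."""
    ref = "".join(reference.split())
    hyp = "".join(hypothesis.split())
    if not ref:
        raise ValueError("reference must contain at least one character")
    return edit_distance(ref, hyp) / len(ref)


def wer(reference, hypothesis):
    """Word error rate over whitespace-separated tokens."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref, hyp) / len(ref)


# A spacing-only difference: the characters are identical, the word split is not.
reference = "오븐을 예열하세요"
hypothesis = "오븐 을 예열하세요"
print(cer(reference, hypothesis))  # 0.0
print(wer(reference, hypothesis))  # 1.0
```

The example shows why WER is treated as secondary for Korean: a transcript that differs from the reference only in spacing scores a perfect CER but a WER of 1.0.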
Loanword Accuracy
Accuracy on English loanwords and code-switched terms (오븐, 레시피, 간 맞추기). Critical for kitchen use case.
Composite Score
Weighted: CER 55% + WER 30% + Loanword 15%. Scores are relative between models. Speed is excluded because it reflects API latency, not model quality.
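One possible reading of the weighting is sketched below. Only the three weights come from this report; how the error rates are turned into scores is an assumption. Here each error rate is converted to `1 - error` so that higher is better, and loanword accuracy is used as-is. The function name `composite_score` and the example inputs are hypothetical, and no loanword accuracy values are reported in this section, so real values must be supplied by the caller.

```python
# Weights as stated in the report.
WEIGHTS = {"cer": 0.55, "wer": 0.30, "loanword": 0.15}


def composite_score(cer, wer, loanword_accuracy):
    """Weighted score in [0, 1]; higher is better.

    Assumption: error rates are converted to scores as (1 - error), clamped
    at 0 because an error rate can exceed 1 when there are many insertions.
    """
    cer_score = max(0.0, 1.0 - cer)
    wer_score = max(0.0, 1.0 - wer)
    return (WEIGHTS["cer"] * cer_score
            + WEIGHTS["wer"] * wer_score
            + WEIGHTS["loanword"] * loanword_accuracy)


# Hypothetical inputs for illustration only, not benchmark results.
print(round(composite_score(cer=0.05, wer=0.10, loanword_accuracy=0.90), 4))
```

Because the score is only meaningful relative to other models scored the same way, the absolute value matters less than the ordering it produces.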