Question 1

What is AI lyric transcription and how does DropCue do it?

Accepted Answer

AI lyric transcription uses a speech-to-text model to convert sung vocals into written text. DropCue runs OpenAI Whisper on any uploaded vocal track in one click, returns time-aligned lyrics, and auto-populates the lyrics field on the track detail page. Whisper was trained on 680,000 hours of multilingual audio and handles vocals over instrumentation far better than general dictation engines used by Otter or Rev.
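As a rough illustration of what "time-aligned lyrics" means in practice, the sketch below turns Whisper-style segments (dicts with `start`, `end`, and `text` keys, which is the shape the open-source `openai-whisper` package returns) into timestamped lyric lines. The helper names `format_timestamp`, `segments_to_lyrics`, and `transcribe_track` are hypothetical, and the `[mm:ss.xx]` layout is an assumption made for illustration. This is not DropCue's actual pipeline.

```python
from typing import Iterable, Mapping


def format_timestamp(seconds: float) -> str:
    """Render a time in seconds as an LRC-style [mm:ss.xx] tag."""
    total_centis = int(round(seconds * 100))
    minutes, centis = divmod(total_centis, 6000)
    secs, hundredths = divmod(centis, 100)
    return f"[{minutes:02d}:{secs:02d}.{hundredths:02d}]"


def segments_to_lyrics(segments: Iterable[Mapping]) -> str:
    """Turn Whisper-style segments into one timestamped lyric line each."""
    lines = []
    for seg in segments:
        text = seg["text"].strip()
        if text:  # skip empty segments such as instrumental gaps
            lines.append(f"{format_timestamp(seg['start'])} {text}")
    return "\n".join(lines)


def transcribe_track(path: str, model_name: str = "base") -> str:
    """Run the open-source Whisper model on a file and return timestamped lyrics.

    Assumes `pip install openai-whisper` and ffmpeg are available. The import
    is lazy so the formatting helpers above work without the package.
    """
    import whisper

    model = whisper.load_model(model_name)
    result = model.transcribe(path)
    return segments_to_lyrics(result["segments"])
```

Calling `segments_to_lyrics` on two segments starting at 0.0 and 62.5 seconds yields lines tagged `[00:00.00]` and `[01:02.50]`, which is the kind of output that can be dropped into a lyrics field and edited by hand.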

Question 2

How does DropCue compare to Sonix.ai for lyric transcription?

Accepted Answer

Sonix.ai charges $10 per audio hour on pay-as-you-go or $22 per month for a 5-hour cap. Sonix was built for podcasts, interviews, and meeting transcription, not sung vocals over a mix. DropCue includes Whisper-powered lyric transcription in the Pro plan at $12 per month on annual billing, charges no per-minute fees, and stores the transcribed lyrics directly on the track record alongside BPM, key, writers, and publisher splits.

Question 3

How does DropCue compare to Otter.ai for music lyrics?

Accepted Answer

Otter.ai is $16.99 per month for Pro and is purpose-built for meeting and conversation transcription. Otter's model struggles with sung vocals because it expects single-speaker speech without backing instrumentation. DropCue uses OpenAI Whisper specifically tuned for music vocal workflows and is $4.99 cheaper per month, with no separate transcription account or per-minute caps.

Question 4

How does DropCue compare to Rev for lyric transcription?

Accepted Answer

Rev offers human transcription at $1.50 per audio minute (roughly $4.50 for a 3-minute song) and AI transcription at $0.25 per minute. Rev is accurate but slow on human jobs (12 to 24 hour turnaround) and has no music catalog integration. DropCue's Whisper transcription typically finishes in 20 to 40 seconds for a 3-minute song, auto-fills the lyrics field on the track, and costs nothing extra per song once you have Pro.

Question 5

How accurate is DropCue's Whisper-powered transcription?

Accepted Answer

DropCue uses OpenAI Whisper, the same model that powers many professional transcription services. On clean vocal recordings most output requires only light cleanup. Heavy effects, dense vocal layering, and obscure proper nouns may need manual edits. The lyrics field stays fully editable in DropCue after transcription so you can polish the result before saving and pitching.

Question 6

Do music supervisors and PROs actually need accurate lyrics?

Accepted Answer

Yes. Music supervisors search libraries by lyric content for briefs like "songs about home" or "tracks that mention summer." ASCAP, BMI, and SESAC all require lyrics when registering cues placed in film, TV, or advertising. Publishers screen for explicit language before placement. A catalog without lyrics is invisible to lyric-aware search and slower to register with PROs.
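To make the two uses above concrete, here is a minimal sketch of lyric-aware search and explicit-language screening over a small in-memory catalog. The `Track` class, the function names, and the sample blocklist are all hypothetical. It uses plain whole-word matching and does not describe how DropCue's search works internally.

```python
import re
from dataclasses import dataclass


@dataclass
class Track:
    title: str
    lyrics: str


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of words."""
    return set(re.findall(r"[a-z']+", text.lower()))


def search_lyrics(tracks: list[Track], query: str) -> list[str]:
    """Return titles of tracks whose lyrics contain every word in the query."""
    wanted = tokenize(query)
    if not wanted:  # an empty query matches nothing
        return []
    return [t.title for t in tracks if wanted <= tokenize(t.lyrics)]


def flag_explicit(tracks: list[Track], blocklist: list[str]) -> list[str]:
    """Return titles of tracks whose lyrics contain any blocklisted word."""
    blocked = {w.lower() for w in blocklist}
    return [t.title for t in tracks if blocked & tokenize(t.lyrics)]
```

A track with an empty lyrics field never matches a query, which is the sense in which a catalog without lyrics is invisible to this kind of search.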

Question 7

What audio formats does lyric transcription support?

Accepted Answer

Lyric transcription works on every audio format DropCue accepts: WAV, MP3, AIFF, FLAC, and M4A. You do not need to convert your masters before transcribing. If the track plays in DropCue, Whisper will run on it.

Question 8

Are there per-song or per-minute fees on DropCue?

Accepted Answer

No. Lyric transcription is included with the Pro plan at $12 per month on annual billing. There are no per-song fees, no per-minute caps, and no usage tiers. Transcribe as many tracks in your catalog as you need. Compare that to Sonix at $10 per audio hour or Rev at $0.25 per audio minute, where heavy users pay hundreds per month for the same work.
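The comparison can be worked through with a short calculation. The sketch below uses only the prices quoted on this page and assumes a 3-minute average song. The function names are hypothetical, and real invoices may differ with rounding rules, minimums, or plan changes.

```python
import math

# Prices as quoted on this page (USD).
SONIX_PER_HOUR = 10.00
REV_AI_PER_MINUTE = 0.25
REV_HUMAN_PER_MINUTE = 1.50
DROPCUE_PRO_MONTHLY = 12.00


def monthly_costs(songs: int, minutes_per_song: float = 3.0) -> dict[str, float]:
    """Estimate one month of transcription cost for a given song count."""
    minutes = songs * minutes_per_song
    return {
        "Sonix pay-as-you-go": round(minutes / 60 * SONIX_PER_HOUR, 2),
        "Rev AI": round(minutes * REV_AI_PER_MINUTE, 2),
        "Rev human": round(minutes * REV_HUMAN_PER_MINUTE, 2),
        "DropCue Pro": DROPCUE_PRO_MONTHLY,  # flat fee, independent of volume
    }


def break_even_songs(per_minute_rate: float, minutes_per_song: float = 3.0,
                     flat_fee: float = DROPCUE_PRO_MONTHLY) -> int:
    """Smallest monthly song count at which metered cost meets or exceeds the flat fee."""
    return math.ceil(flat_fee / (per_minute_rate * minutes_per_song))
```

Under these assumptions, 100 three-minute songs in a month come to $50 on Sonix pay-as-you-go, $75 on Rev AI, and $450 on Rev human transcription, against a flat $12. Rev AI reaches the $12 mark at 16 songs a month.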

Question 9

Does it work on non-English vocals?

Accepted Answer

OpenAI Whisper supports 99 languages with varying accuracy. English is the strongest, followed by Spanish, French, German, Italian, and Portuguese. Less-resourced languages may have higher error rates. DropCue's editable lyrics field lets you correct language-specific misreads after transcription.

Transcribe lyrics from any vocal track in one click

What AI lyric transcription is

Why music supervisors and PROs need accurate lyrics

DropCue vs Sonix vs Otter vs manual transcription

Where transcribed lyrics actually earn their keep

Pricing: $12/month, no per-minute fees

How it pairs with the rest of DropCue

Lyric transcription, built into your pitch workflow

Feature	DropCue	Sonix.ai	Otter.ai	Manual
Transcription cost	Included with Pro	$10/audio hour	$16.99/mo flat	$30 to $100 per song
Monthly plan	$12/mo annual	$22/mo (5 hours)	$16.99/mo	N/A
Built for music vocals	Yes (Whisper)	No (podcast/meeting)	No (meetings)	Yes
One-click from track page	Yes	No	No	No
Auto-populates lyrics field	Yes	No	No	No
Time-aligned timestamps	Yes	Yes	Yes	No
Editable after transcription	Yes (in-platform)	Export-based	Export-based	Yes
Integrated sync metadata	Yes (BPM, key, writers)	No	No	No
No per-minute fees	Yes	No	No	No
Searchable lyric library	Yes	No	No	No