How to Mix Sound for Video (Easier Way)

How to Mix Sound for Video — and Why Your Music Choice Decides How Hard It Is

Mixing sound for video means setting each track to a level where the audience hears everything clearly and never notices the mix. The working targets, as read on your editor's audio meters: dialogue averaging around -12 dB, music ducked to roughly -18 to -25 dB under voice, and sound effects between -12 and -18 dB, never louder than the dialogue. Get those relationships right and a muddy clip turns clean.
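To make those numbers concrete, here is a minimal sketch of what a decibel gap means in plain amplitude. It assumes the meter readings are levels relative to digital full scale and uses the usual 20·log10 amplitude convention; the specific -20 dB music value is just one point picked from the -18 to -25 dB range, not a figure from the video.

```python
import math


def db_to_gain(db: float) -> float:
    """Convert a level in decibels to a linear amplitude multiplier."""
    return 10 ** (db / 20)


def gain_to_db(gain: float) -> float:
    """Convert a linear amplitude multiplier back to decibels."""
    return 20 * math.log10(gain)


# Targets quoted in the article (the exact value inside each range is an assumption).
dialogue_db = -12.0
music_under_voice_db = -20.0

# How far the music sits below the voice, and what that means in amplitude.
gap_db = music_under_voice_db - dialogue_db   # -8.0 dB
music_vs_voice = db_to_gain(gap_db)           # about 0.4x the voice's amplitude

print(f"Music sits {abs(gap_db):.0f} dB under dialogue, "
      f"about {music_vs_voice:.2f}x its amplitude")
```

The takeaway is that decibels are ratios: an 8 dB gap leaves the music at well under half the amplitude of the voice, which is why a few dB of ducking is clearly audible.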

We watched a genuinely useful breakdown of exactly this and wanted to credit it, summarise what it teaches, and add our own take on the part nobody mentions: the music you start with decides how hard the mix is.

The Breakdown We’re Building On

The video is How to Mix Sound for Video in Adobe Premiere Pro by Kelsey at Premiere Gal. It’s a clear, no-nonsense walkthrough of mixing a real timeline — dialogue, music, and sound effects — using Premiere’s Audio workspace and Essential Sound panel. If you mix in Premiere, watch the whole thing; she shows the meters, the keyframes, and a before/after that makes the point better than any paragraph can.

She doesn’t mention us, and the mixing craft she teaches is universal — it works no matter where your music comes from. What we’re adding below is the music-side perspective: how the kind of music you drop on the timeline makes her technique either a fight or a five-minute job.

Original breakdown by Premiere Gal — credit where it’s due.

Dialogue First, at Around -12 dB

The video’s first rule: mix one track at a time, and start with the voice. Tag your clips as Dialogue in the Essential Sound panel, run Auto-Match to pull loudness toward broadcast standard, and aim for an average near -12 dB — it will drift above and below, and that’s normal. Mute everything else while you work.
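If you want to sanity-check an "average near -12 dB" outside the editor, the sketch below measures the RMS level of a clip and works out the clip gain needed to reach the target. It is an illustration under assumptions, not what Premiere's meters or Auto-Match do internally: the samples are assumed to be floats between -1.0 and 1.0, the test tone stands in for a real voice recording, and an RMS figure will not match a peak meter exactly.

```python
import math


def rms_dbfs(samples):
    """RMS level of float samples in [-1.0, 1.0], in dB relative to full scale."""
    if not samples:
        raise ValueError("need at least one sample")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # digital silence
    return 10 * math.log10(mean_square)


def gain_to_reach(target_db, measured_db):
    """Clip gain (in dB) that moves the measured average onto the target."""
    return target_db - measured_db


# Hypothetical "voice" clip: a 220 Hz tone at 10% of full scale, one second at 48 kHz.
rate = 48_000
voice = [0.1 * math.sin(2 * math.pi * 220 * n / rate) for n in range(rate)]

measured = rms_dbfs(voice)                # roughly -23 dB
boost = gain_to_reach(-12.0, measured)    # roughly +11 dB

print(f"measured {measured:.1f} dB, apply {boost:+.1f} dB of clip gain")
```

Real speech swings well above and below its average, so treat the result as a starting point and let the meters confirm it.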

Our commentary: this order matters because dialogue is your anchor. Every other level is set relative to the voice, not in isolation. She also flags a real trap: pushing Reduce Noise too hard hollows out the voice. The same restraint applies to music: the busier the music, the less room it leaves for the voice. A clean dialogue track plus crowded music is still a bad mix. Set the voice, protect it, then build everything around it.

Ducking: The Technique That Makes or Breaks the Mix

Ducking automatically dips the music whenever the voice speaks, then lets it rise back up in the gaps. In Premiere you tag the music clips as Music, enable Ducking against Dialogue, and generate keyframes. The music drops to roughly -18 to -25 dB under speech and comes back up on a logo reveal or an intro sting. You can hand-edit the keyframes or drop an Exponential Fade on the tail for a clean fade-out.
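The logic behind generated ducking keyframes can be sketched in a few lines. This is a simplified illustration, not Premiere's actual algorithm: the function name, the 0.5 second fade, the -12 dB "full" music level, and the rule for merging nearby speech are all assumptions chosen for the example.

```python
def ducking_keyframes(speech, full_db=-12.0, ducked_db=-20.0, fade=0.5):
    """Return (time_seconds, level_db) keyframes that duck music under speech.

    speech is a list of (start, end) times in seconds. Speech regions closer
    together than two fade lengths are merged, so the music does not pump up
    and down in short pauses.
    """
    merged = []
    for start, end in sorted(speech):
        if merged and start - merged[-1][1] < 2 * fade:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])

    keys = []
    for start, end in merged:
        if start > 0:
            keys.append((max(start - fade, 0.0), full_db))  # begin fading down
        keys.append((start, ducked_db))                      # ducked as speech starts
        keys.append((end, ducked_db))                        # hold until speech ends
        keys.append((end + fade, full_db))                   # fade back up
    return keys


# Hypothetical timeline: two lines of narration with a short pause, then a tag line.
narration = [(2.0, 5.0), (5.4, 8.0), (12.0, 14.0)]
for time_s, level_db in ducking_keyframes(narration):
    print(f"{time_s:5.1f}s  {level_db:6.1f} dB")
```

Merging the first two lines of narration is the kind of cleanup you would otherwise do by hand-editing keyframes: the 0.4 second pause is too short for the music to rise and fall without sounding jumpy.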

This is the heart of the video, and it’s solid. Here’s the part the tool can’t fix, though: ducking only changes the volume of the music. It can’t change what the music is doing. If your track has a busy melody and a bright vocal-range hook playing right where your narrator talks, turning it down 10 dB just makes a muddy clash quieter. The frequencies still collide. You end up fighting a finished song instead of mixing it.

Why Layered, One-Key Music Ducks Cleaner

Layered music means your cue arrives as separate building blocks instead of one bounced file: a low foundation (drones, pads, textures), a mid layer (rhythmic loops, chord progressions), a top layer (melodies and signature phrases), and bonus elements like risers and impacts. Because the layers are built in one key, you can remove or add any of them without the cue clashing.

That changes everything about ducking. Under dialogue, you don’t just lower volume — you thin the arrangement. Pull the melody and keep the pad and a soft pulse. The voice now sits in clear space instead of fighting a hook for the same frequencies. When the narration stops and the logo lands, bring the top layer back and let the cue bloom. That’s a tonal mix, not a volume war. This is what we mean by tonal sounds for video editing — music designed to be reshaped moment to moment.
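Thinning the arrangement can be written down as a simple rule per timeline section. The sketch below is only a way of thinking about it: the section names, the `Section` type, and the choice to keep the foundation and mid layers under voice are assumptions based on the description above, not a feature of any particular app.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Section:
    name: str
    has_voice: bool


def layers_for(section: Section) -> list[str]:
    """Pick which layers of a one-key cue play in a given section."""
    if section.has_voice:
        # Under dialogue: keep the pad and a soft pulse, pull the melody.
        return ["foundation", "mid"]
    # In the gaps: bring the top layer back and let the cue open up.
    return ["foundation", "mid", "top"]


# Hypothetical timeline for a short promo.
timeline = [
    Section("intro sting", has_voice=False),
    Section("narration", has_voice=True),
    Section("logo reveal", has_voice=False),
]

for section in timeline:
    print(f"{section.name}: {' + '.join(layers_for(section))}")
```

Because every layer is in the same key, switching between these two states is a mute or unmute, not a re-edit of the music.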

Every layer you’re mixing here is pulled straight from our Moods and Emotional Ambiances series — finished cues you can also break down into stems. Press play, then hit New mix to reshuffle; it all stays in one key, so the layers never clash.

Grab the packs the mixer is pulling from:

Moods Vol. 3

Emotional Ambiances Vol. 2

Moods Vol. 2

A Duende Soundtrack Kit ships exactly these layers, every sound tagged by key and tempo so they stack without clashing. You audition and combine them in the free desktop app, then drag plain WAV or MP3 into Premiere, Resolve, or Final Cut. The ducking step Kelsey teaches still applies — it just has far less to do, because the arrangement is already getting out of the voice’s way.

Sound Effects: Mix Each One, Keep Them Invisible

The video’s SFX advice is precise: aim for about -12 to -18 dB, never louder than the dialogue, and mix every effect individually — a whoosh, a pop, and a logo hit all land at different levels, so go clip by clip. The guiding principle she repeats is the best line in the whole video: good sound editing is invisible. If a level jump makes the viewer think “what was that?”, it pulled them out of the story.
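The clip-by-clip pass amounts to one small calculation per effect: how much gain moves this clip onto its target, with the dialogue level as a hard ceiling. The sketch below assumes you have already read each clip's level off the meters; the clip names, the measured values, and the -15 dB default target are made up for illustration.

```python
def sfx_gain_db(measured_db, target_db=-15.0, dialogue_db=-12.0):
    """Gain (in dB) that lands an effect on its target, never above the dialogue."""
    landing_db = min(target_db, dialogue_db)
    return landing_db - measured_db


# Hypothetical meter readings for three effects as imported.
measured_levels = {"whoosh": -6.0, "pop": -20.0, "logo_hit": -3.0}

for name, level_db in measured_levels.items():
    print(f"{name}: adjust by {sfx_gain_db(level_db):+.1f} dB")
```

Each effect needs a different adjustment, which is the reason to go clip by clip instead of setting one gain for the whole effects track.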

Our extension: SFX are the punctuation of your mix, and clean, production-ready ones get to invisible faster. Cinematic Whooshes for transitions and Trailer Impacts for the logo hit arrive pre-shaped, so your per-clip level pass is a quick nudge rather than a rescue. And when an impact is in the same key as your kit, it doesn’t just punctuate — it lands musically. That’s a hit you feel but never notice.

A Mixing Workflow That Starts Before the Mix

Put it together and the workflow is simple. Record a clean voice close to the mic. Set dialogue to an average around -12 dB and protect it. Then, instead of dropping a finished song and clawing it back down with keyframes, build your cue from layers — thin under voice, full in the gaps. Add SFX clip by clip, never above the dialogue. Finish with fades so nothing cuts hard.
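As a final check before export, the level relationships in this workflow can be expressed as a short checklist function. It is a sketch under stated assumptions: the function name and the 2 dB tolerance around the dialogue target are ours, and the levels are whatever you read from your meters.

```python
def check_mix(dialogue_db, music_under_voice_db, sfx_levels_db):
    """Return a list of problems found against the article's level targets."""
    problems = []
    if abs(dialogue_db - (-12.0)) > 2.0:  # tolerance is an assumption
        problems.append(f"dialogue averages {dialogue_db} dB, target is around -12 dB")
    if not -25.0 <= music_under_voice_db <= -18.0:
        problems.append(
            f"music under voice is {music_under_voice_db} dB, expected -25 to -18 dB"
        )
    for name, level_db in sfx_levels_db.items():
        if level_db > dialogue_db:
            problems.append(f"{name} is louder than the dialogue")
        elif not -18.0 <= level_db <= -12.0:
            problems.append(f"{name} is at {level_db} dB, expected -18 to -12 dB")
    return problems


# Hypothetical readings from a finished timeline.
issues = check_mix(-12.0, -10.0, {"whoosh": -15.0, "logo_hit": -6.0})
for issue in issues:
    print("fix:", issue)
```

An empty list does not mean the mix sounds good, only that the numbers are in range. The listening test still decides.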

The before/after at the end of Kelsey’s video proves the levels matter. Our point is that the source material matters just as much: a mix is only as easy as the music will allow. Layered, one-key building blocks hand you the dynamics before you touch a single keyframe — which means the invisible mix she’s chasing is the default, not a battle.

Free download

Free Layer Starter: 4 cinematic cues + every layer

Want to feel how layered scoring works before you buy anything? Grab our free Layer Starter — four finished cues from our Moods and Emotional Ambiances series, each broken out into all its individual layers, so you can stack, mute, and reshape them in your own edit. Everything’s in one key. Pop in your email and it’s yours.

Frequently asked questions

What level should dialogue sit at when mixing video?

Aim for an average of around -12 dB for dialogue. It will move above and below that as the speaker varies, which is normal — watch the meter rather than chasing a fixed number. Mix the voice first with everything else muted, since every other track’s level is set relative to your dialogue, not in isolation.

What is audio ducking and why does it matter?

Ducking automatically lowers the music whenever the voice speaks, then lets it rise in the gaps. In Premiere Pro you enable it in the Essential Sound panel and generate keyframes. Under dialogue, music typically sits around -18 to -25 dB. It keeps narration intelligible while still letting the score breathe during intros and reveals.

Why does layered music make ducking easier?

Ducking only changes volume, not what the music is doing. A busy finished song still clashes with the voice when you turn it down. Layered, one-key music lets you remove the melody under dialogue and keep just a pad or pulse, so the voice has clear frequency space — then add layers back in the gaps.

What level should sound effects be in a video mix?

Aim for roughly -12 to -18 dB, and never louder than the dialogue. Mix each effect individually — a whoosh, a pop, and a logo hit all land at different levels, so go clip by clip. The goal is an invisible mix the audience feels but never consciously notices.

What is a Duende Soundtrack Kit?

A Soundtrack Kit is cinematic music delivered as building blocks: a low foundation of drones and pads, a mid layer of loops and chord progressions, a top layer of melodies, plus impacts, risers and transitions. Everything is built in one key so layers stack and swap without clashing — ideal for thinning music under dialogue.

Can I use Duende sounds in Premiere Pro and DaVinci Resolve?

Yes. Every sound is plain WAV or MP3 that drags straight into Premiere Pro, DaVinci Resolve, or Final Cut. The free desktop app tags each sound by key and tempo so you can audition and combine layers before you commit, but it’s a helper — the files work in any editor on their own.

Watch the full mixing breakdown on Premiere Gal, nail your dialogue and ducking levels, and when you want music that gets out of the voice’s way on its own, browse the kits and start building.
