Every producer opening a DAW for the first time encounters two types of tracks: MIDI and audio. The visual distinction is obvious (MIDI tracks show colored blocks of notes, audio tracks show waveforms), but understanding what is actually different between them is foundational to everything else in music production. MIDI and audio are not two ways of doing the same thing. They are fundamentally different kinds of data, with different strengths, different editing possibilities, and different roles in a production. This guide explains both clearly and completely.
MIDI is a set of performance instructions (note, timing, velocity) with no actual sound, while audio is a recorded waveform of actual sound. MIDI gives you flexibility to edit individual notes and change instruments after recording, whereas audio is fixed but represents real performances. They're complementary: MIDI controls virtual instruments, and audio captures the final sound or live recordings.
What Is MIDI?
MIDI stands for Musical Instrument Digital Interface. It is a communication protocol developed in 1983 that allows musical instruments, computers, and software to send performance instructions to each other. The critical thing to understand immediately: MIDI contains no audio. It is not a recording of sound. It is a set of instructions describing a musical performance.
Think of MIDI as sheet music. Sheet music tells a musician which notes to play, when to play them, how long to hold them, and how hard to strike them. Sheet music itself makes no sound; it is instructions. MIDI works the same way. A MIDI file or MIDI track tells an instrument (hardware or software) what notes to play. The instrument interprets those instructions and produces the actual sound. Change the instrument, and the same MIDI data produces a completely different sound.
A MIDI note contains several pieces of information: the pitch (which note on the musical scale), the velocity (how hard the note was played, 0–127), the timing (when the note starts, expressed in musical time, in beats and bars rather than seconds), and the duration (how long the note is held). Additional MIDI data types include controller data (knob and fader movements, like modulation wheel position or filter cutoff), pitch bend, aftertouch, and program change messages that tell an instrument to switch to a different preset.
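The four note fields above map naturally onto a tiny data structure. Here is a minimal sketch in Python; the `MidiNote` class and `note_to_frequency` helper are illustrative, not part of any DAW's API, and the pitch-to-frequency formula is the standard equal-temperament mapping with A4 = MIDI note 69 = 440 Hz:

```python
from dataclasses import dataclass

@dataclass
class MidiNote:
    """Hypothetical minimal model of one piano-roll note."""
    pitch: int       # MIDI note number, 0-127 (60 = middle C)
    velocity: int    # how hard the note was struck, 0-127
    start: float     # start time in beats (musical time, not seconds)
    duration: float  # length in beats

def note_to_frequency(pitch: int) -> float:
    """Equal-temperament frequency of a MIDI note number (A4 = 69 = 440 Hz)."""
    return 440.0 * 2 ** ((pitch - 69) / 12)

middle_c = MidiNote(pitch=60, velocity=100, start=0.0, duration=1.0)
print(round(note_to_frequency(middle_c.pitch), 2))  # 261.63 (middle C)
```

The instrument, not the note, decides what that frequency ultimately sounds like; the note is pure data.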
In a DAW, MIDI data is displayed and edited in the piano roll: a grid where the vertical axis represents pitch (matching the keys of a piano shown on the left side) and the horizontal axis represents time. Each note appears as a horizontal rectangle: its left edge marks where the note starts, its right edge where it ends, and its vertical color or height represents velocity. You can click any note and drag it to a different pitch or a different position in time. You can make a note shorter or longer by dragging its right edge. You can delete notes, add new ones, and change velocities, all without re-recording anything. This is the fundamental power of MIDI: complete, non-destructive editability after the fact.
What Is Audio?
Audio, in the context of music production, is recorded sound: actual waveform data captured from a microphone, a direct instrument input, or rendered from a software instrument. Unlike MIDI, audio contains the sound itself. When you record a vocalist into a DAW, the fluctuations in air pressure caused by the voice are converted into electrical signals by the microphone, those signals are converted to digital data by the audio interface, and that data is stored as a waveform file (WAV, AIFF, or similar). When you play that file back, the process reverses: digital data becomes electrical signal becomes speaker movement becomes sound.
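The "stored as a waveform file" step can be demonstrated with nothing but Python's standard library. This sketch synthesizes one second of a 440 Hz sine tone and writes it as a 16-bit mono WAV; the filename `tone.wav` is arbitrary:

```python
import math
import struct
import wave

SAMPLE_RATE = 44100  # samples per second (CD-quality rate)
FREQ = 440.0         # A4

# Audio data really is just a long list of amplitude values, one per sample.
samples = [
    int(32767 * 0.5 * math.sin(2 * math.pi * FREQ * n / SAMPLE_RATE))
    for n in range(SAMPLE_RATE)  # one second of samples
]

with wave.open("tone.wav", "wb") as wf:
    wf.setnchannels(1)           # mono
    wf.setsampwidth(2)           # 16-bit samples
    wf.setframerate(SAMPLE_RATE)
    wf.writeframes(struct.pack(f"<{len(samples)}h", *samples))
```

The resulting file contains the sound itself; any player on any system reproduces the same tone, with no instrument needed.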
Audio in a DAW appears as a waveform, the visual representation of the sound's amplitude over time. Louder sounds create taller waveforms; quieter sounds create shorter ones. Silence appears as a flat line. You can see the rhythmic structure of a drum recording in its waveform: each kick drum hit creates a sharp spike, each snare creates a different spike shape, the hi-hats create dense clusters of smaller movement.
Audio is fixed at the moment of recording. The specific tone of the guitar, the exact timbre of the vocalist's voice, the room acoustics, the microphone character: all of these are captured permanently in the waveform file. Unlike MIDI, you cannot change the pitch of a note by clicking on it and dragging it to a new position (without pitch-shifting tools). You cannot change a chord voicing by selecting the wrong note and moving it. What you recorded is what the waveform contains.
Audio files are also significantly larger than MIDI files. A MIDI file stores only performance instructions, perhaps a few kilobytes for a full arrangement. An audio file stores sample-by-sample waveform data: a single minute of uncompressed stereo audio at 24-bit/48kHz is approximately 17MB. A full production with multiple audio tracks consumes gigabytes of storage; the equivalent MIDI arrangement takes a fraction of a megabyte.
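The 17 MB figure follows directly from the PCM arithmetic: sample rate × bytes per sample × channels × duration. A quick check (the helper name is ours, not a library call):

```python
def uncompressed_size_bytes(seconds: int, sample_rate: int,
                            bit_depth: int, channels: int) -> int:
    """Size of raw PCM audio: every sample of every channel is stored."""
    return seconds * sample_rate * (bit_depth // 8) * channels

# One minute of 24-bit / 48 kHz stereo, as quoted above:
size = uncompressed_size_bytes(60, 48_000, 24, 2)
print(size / 1_000_000)  # 17.28 -> about 17 MB per minute
```

A MIDI file describing the same minute of music would typically be a few kilobytes, because it stores only note events.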
The Key Differences Side by Side
MIDI
- Contains instructions, not sound
- Fully editable after recording
- Instrument sound changeable at any time
- Tiny file size (kilobytes)
- Can be transposed instantly without quality loss
- Requires a software or hardware instrument to produce sound
- Used for: virtual instruments, synthesizers, drum programming, sequencing
- Tempo changes affect MIDI automatically, since it is stored in musical time
Audio
- Contains actual recorded sound
- Fixed at the moment of recording
- Sound character determined at recording time (microphone, room, instrument)
- Large file size (megabytes per minute)
- Pitch shifting degrades quality at extremes
- Plays back directly; no instrument needed
- Used for: vocals, live instruments, hardware synths, samples
- Tempo changes require time-stretching (affects sound quality)
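The last row of each list is worth making concrete: MIDI events are stored in beats, so their clock time is derived from the tempo at playback, while an audio clip has a fixed length in seconds. A small sketch (the function name is illustrative):

```python
def beats_to_seconds(beats: float, bpm: float) -> float:
    """MIDI events are stored in beats; their clock time depends on the tempo."""
    return beats * 60.0 / bpm

# The same 8-beat MIDI phrase simply follows the project tempo:
print(beats_to_seconds(8, 120))  # 4.0 seconds at 120 BPM
print(beats_to_seconds(8, 140))  # about 3.43 seconds at 140 BPM
```

An audio recording of that phrase would stay 4.0 seconds long regardless of the project BPM, which is why it needs time-stretching to follow a tempo change.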
When to Use MIDI
MIDI is the right choice whenever you are triggering a software instrument: a virtual synthesizer, a software sampler, a drum plugin, a sample library. When you play a MIDI controller (keyboard, drum pads) and that performance drives a plugin in your DAW, you are generating MIDI data. The MIDI data tells the plugin which notes to play; the plugin generates the audio.
Programming a beat with a drum machine plugin
Your DAW records MIDI note data when you tap the pads. The drum plugin reads the MIDI and plays the kick, snare, and hat samples. You can go into the piano roll afterward and move any hit to the exact grid position, change the velocity to make a hi-hat quieter, or delete any note entirely. The beat never needs to be re-recorded.
Recording a synthesizer part with a MIDI keyboard
You play the synth part on a keyboard controller. The DAW records the MIDI performance (the notes, timing, and velocity). The software synthesizer generates the audio in real time. If you played a wrong note, open the piano roll and move it. If the part is slightly out of time, quantize it. If you decide the sound should be a pad instead of a lead, switch the plugin preset; the same MIDI data triggers the new sound.
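Quantizing, mentioned above, is just snapping note start times to the nearest grid line. A minimal sketch, assuming a sixteenth-note grid where one beat is a quarter note:

```python
def quantize(start: float, grid: float = 0.25) -> float:
    """Snap a note start time (in beats) to the nearest grid line.
    grid=0.25 beats is a sixteenth-note grid in 4/4."""
    return round(start / grid) * grid

# A slightly rushed note at beat 1.93 snaps onto beat 2.0:
print(quantize(1.93))  # 2.0
print(quantize(2.31))  # 2.25
```

Because the note data is instructions rather than sound, this correction is lossless; quantizing audio, by contrast, means cutting and moving waveform slices.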
Composing for a virtual orchestra
Film and TV composers use MIDI to write for large virtual ensembles (strings, brass, woodwinds, percussion), where each section is a different software instrument triggered by MIDI. The entire score can be transposed, its tempo changed, individual instruments swapped, and articulations edited in the piano roll without any re-recording.
MIDI is also the right choice when you want to iterate quickly. Because instrument sounds are interchangeable, the production decision about what sound to use can be deferred until later. Many producers sketch arrangements entirely in MIDI using placeholder sounds, then spend time at the end of the production refining the sound design: finding the right synthesizer patch, the right sample, the right instrument character for each part.
When to Use Audio
Audio is the right choice whenever you are capturing a real-world sound that cannot be replicated by a software instrument: a vocalist, an acoustic guitar, a live drum kit, a bass guitar played through a specific amplifier. The natural acoustic character of these instruments (the way a guitar string resonates, the room sound of a vocal recording, the mechanical noise of a drum kit) is what makes them sound real and distinctive, and none of that is transferable through MIDI.
Recording a vocal performance
The singer performs into a microphone. The audio interface captures the waveform. The resulting audio file contains the singer's exact voice, the microphone's tonal character, and the room acoustics, all permanently embedded. This cannot be "programmed" in a piano roll. The only way to change a note is to re-sing it or use pitch correction software to shift the captured audio.
Recording a hardware synthesizer
Even though a hardware synthesizer is triggered by MIDI, its audio output (the specific analog or digital voice circuit, the filter character, the subtle imperfections of the hardware) can only be captured by recording it as audio. A producer who wants the exact sound of a specific vintage synthesizer must record that synthesizer's audio output, because no software plugin perfectly replicates the hardware original.
Using a sample loop
A drum break sample is audio: a pre-recorded performance that is dropped into the arrangement as a waveform. It cannot be edited note-by-note in a piano roll because it was never MIDI. It can be chopped, time-stretched, pitch-shifted, and processed, but fundamentally it is a fixed audio recording.
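Time-stretching a loop to sit on the project grid comes down to one ratio. A sketch, assuming the loop's original tempo is known (the 96 BPM break is a made-up example):

```python
def stretch_factor(loop_bpm: float, project_bpm: float) -> float:
    """Length multiplier that makes an audio loop match the project tempo.
    Values below 1 shorten (speed up) the loop; above 1 lengthen (slow) it."""
    return loop_bpm / project_bpm

# A 96 BPM drum break dropped into a 120 BPM project must play back
# at 0.8x its original length to land on the grid:
print(stretch_factor(96, 120))  # 0.8
```

This is exactly the operation MIDI never needs: a MIDI pattern simply plays at the new tempo, while stretching audio this far can audibly change its character.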
How MIDI and Audio Work Together
In a complete modern music production, MIDI and audio coexist throughout the session. A typical pop production might contain: audio tracks for lead vocal, backing vocals, and recorded acoustic guitar; MIDI tracks for the synthesizer pads (triggering a software synth), bass line (triggering a bass synthesizer plugin), and programmed drums (triggering a drum plugin); and additional audio tracks created by bouncing (rendering) finished MIDI parts to audio for more efficient CPU processing.
The standard production workflow moves from MIDI toward audio as a session develops. Arrangement and composition happen in MIDI because of its flexibility: wrong notes are corrected, instrument choices are changed, parts are transposed. As sections are finalized, MIDI tracks are often bounced to audio to reduce CPU load from running multiple software instruments simultaneously and to lock in the final sound before mixing begins.
This bounce process (converting MIDI to audio) is straightforward in every DAW. The DAW plays back the MIDI track through the software instrument in real time and simultaneously records the audio output to a new file. The resulting audio file captures the sound of the MIDI performance exactly as it played through that software instrument with those settings. The MIDI original remains unchanged in the project; only a new audio file has been created.
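A toy version of this render makes the MIDI-to-audio direction concrete: note instructions go in, a buffer of samples comes out, and the note list itself is untouched. This sketch stands a bare sine oscillator in for the software instrument; all names are illustrative:

```python
import math

SAMPLE_RATE = 44100

def render_notes(notes, bpm=120.0, seconds=2.0):
    """Toy 'bounce': render (pitch, start_beat, duration_beats) tuples
    through a sine 'instrument' into one mono float buffer."""
    buf = [0.0] * int(SAMPLE_RATE * seconds)
    for pitch, start, dur in notes:
        freq = 440.0 * 2 ** ((pitch - 69) / 12)        # note number -> Hz
        first = int(start * 60 / bpm * SAMPLE_RATE)    # beats -> samples
        last = min(len(buf), first + int(dur * 60 / bpm * SAMPLE_RATE))
        for n in range(first, last):
            buf[n] += 0.2 * math.sin(2 * math.pi * freq * (n - first) / SAMPLE_RATE)
    return buf

# Two quarter notes (C4 then E4) become plain audio samples; the note
# list survives the bounce unchanged, just like a real MIDI track.
audio = render_notes([(60, 0.0, 1.0), (64, 1.0, 1.0)])
```

A real DAW does the same thing with the actual plugin in place of the sine, writing the buffer out as a WAV file.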
MIDI Files and Audio Files
A MIDI file (.mid) stores MIDI data in a standardized format readable by any DAW, notation software, or MIDI-compatible hardware. Importantly, a MIDI file has no inherent sound: open it in a different DAW or on a different computer, and it will play through whatever instruments that system assigns to it. A MIDI piano melody might sound like a realistic grand piano on one system and a cheap synthesizer on another, depending on how the receiving software maps the MIDI data to instrument sounds.
Audio files (WAV, AIFF, FLAC, MP3) store the actual waveform data. A WAV file of a piano recording sounds the same on every system that plays it; the sound is embedded in the file. File size reflects the captured sound data: a 3-minute stereo WAV at 24-bit/44.1kHz is approximately 45MB. The same 3-minute arrangement as a MIDI file might be 50KB. The audio file is roughly a thousand times larger because it stores actual sound rather than instructions.
Practical Tips for Producers
Keep MIDI arrangements editable for as long as possible in a production. The ability to change notes, timing, velocity, and instrument sounds late in the process is a significant creative advantage. Only bounce to audio when the part is final or when CPU resources demand it.
When recording live instruments (guitar, bass, vocals), always record as audio. Do not try to replicate in MIDI a live performance that would be faster and more authentic to record directly. Save MIDI for programmed parts and software instruments.
Learn the piano roll thoroughly. Most DAW users spend more time in the piano roll than anywhere else in the software. Understanding how to navigate, select, move, resize, quantize, and humanize MIDI notes efficiently is the single skill that most accelerates production speed for any producer who uses software instruments.
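Humanizing, mentioned above alongside quantizing, is its counterpart: nudging grid-perfect timing and velocity by small random amounts so a programmed part feels played. A sketch (the jitter sizes are arbitrary defaults, not any DAW's settings):

```python
import random

def humanize(notes, timing_jitter=0.02, velocity_jitter=8, seed=None):
    """Offset each (start_beats, velocity) pair by a small random amount,
    keeping starts non-negative and velocities in the MIDI 1-127 range."""
    rng = random.Random(seed)
    out = []
    for start, vel in notes:
        start = max(0.0, start + rng.uniform(-timing_jitter, timing_jitter))
        vel = max(1, min(127, vel + rng.randint(-velocity_jitter, velocity_jitter)))
        out.append((round(start, 4), vel))
    return out

# Four machine-perfect sixteenth notes, loosened slightly:
robotic = [(0.0, 100), (0.25, 100), (0.5, 100), (0.75, 100)]
loose = humanize(robotic, seed=1)
```

Like quantizing, this is possible only because MIDI is editable data; the equivalent operation on audio would mean slicing and moving the waveform itself.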
When collaborating with other producers, be clear about whether you are sharing MIDI files or audio stems. A MIDI file requires the recipient to have the same or compatible software instrument to reproduce your intended sound. An audio stem (bounced audio of a MIDI track) plays identically on any system but cannot be edited note-by-note. Both have their place in collaboration workflows; the choice depends on whether you want the recipient to hear your exact sound or to be able to edit and change the performance.
Practical Exercises
Create and Edit Your First MIDI Note
Open your DAW and create a new MIDI track. Draw or record a simple 4-note melody (any notes, any tempo). Now edit each note individually: click the second note and drag it up or down to change its pitch. Double-click the fourth note and delete it. Play back your edited melody. Notice how you changed the performance without touching any sound files. This is MIDI flexibility. Now create an audio track, record yourself humming or singing the same melody, and try to edit one note's pitch. You'll immediately feel the difference: audio is locked; MIDI is flexible.
MIDI Control and Instrument Swapping
Create a MIDI track and draw 8 bars of a simple chord progression (use quarter notes, any key). Assign it to your DAW's default synth. Play it back and record the output as audio on a separate track. Now swap the MIDI track's instrument to a completely different sound (piano, strings, pad, or bass). Compare the two sounds side by side. Decide which instrument works better for your progression. Delete the old audio recording and re-record with your chosen instrument. This demonstrates MIDI's core strength: changing the sound of a performance without re-playing or re-recording it. Audio would require you to sing or play it again.
Hybrid Production: MIDI Foundation with Audio Layers
Build a 16-bar track using both MIDI and audio strategically. Start with a MIDI drum pattern (kick, snare, hi-hat) using your DAW's stock plugin. Add a MIDI bass line and MIDI chord progression (synth pad or piano). Record live audio on top: either sing a melody, play a guitar loop, or layer a vocal phrase. Now edit: adjust MIDI note timing and velocities to tighten the drums, transpose the MIDI chords up a half-step, and use time-stretch to fit your audio vocal to the grid. Export stems separately (drums, bass, chords, vocal). Reflect on what was better served by MIDI (flexibility, editing) versus audio (human feel, uniqueness). This mirrors real productions.
Frequently Asked Questions
What does MIDI stand for, and does it contain sound?
MIDI stands for Musical Instrument Digital Interface. It's crucial to understand that MIDI contains no actual sound; it's purely a set of instructions describing a musical performance, like sheet music. This means the same MIDI data can produce completely different sounds depending on which instrument (hardware or software) plays it back.
What information does a MIDI note contain?
A MIDI note contains: pitch (which note on the musical scale), velocity (how hard the note was played, ranging from 0–127), timing (when the note starts, expressed in beats and bars), and duration (how long the note is held). These four elements define a complete MIDI note in your DAW.
How do MIDI and audio tracks look different in a DAW?
MIDI tracks display colored blocks representing notes on a piano roll grid, while audio tracks show a visual waveform of the actual sound recording. This visual difference reflects their fundamental distinction: MIDI is instructions for an instrument, while audio is the recorded sound itself.
What other data does MIDI carry besides notes?
Beyond note data, MIDI includes controller data (knob and fader movements like modulation wheel or filter cutoff), pitch bend messages, aftertouch information, and program change messages that tell an instrument to switch to a different preset. These expanded data types give producers fine control over instrument parameters.
How is MIDI timing expressed?
MIDI timing is expressed in musical time using beats and bars rather than seconds, making it naturally synchronized with the project's tempo. This allows MIDI notes to adapt automatically if you change the BPM, unlike audio, which stays locked to specific time positions.
What happens if you change the instrument on a MIDI track?
Since MIDI contains only performance instructions rather than sound, changing the instrument that plays back the MIDI will produce a completely different sound while keeping the exact same notes, timing, and performance characteristics. This is one of MIDI's greatest strengths in music production.
Why is MIDI compared to sheet music?
MIDI is compared to sheet music because both tell a musician which notes to play, when to play them, how long to hold them, and how hard to strike them. Just as sheet music itself makes no sound, MIDI is only instructions until an instrument interprets and plays it.
Where is MIDI data displayed and edited?
MIDI data is displayed and edited in the piano roll, a grid interface where the vertical axis represents pitch (matching piano keys shown on the left side) and the horizontal axis represents time. This visual layout makes it intuitive to see and modify notes and their properties.
What is the core difference between MIDI and audio?
MIDI is performance instructions (notes, timing, velocity) with no sound. Audio is recorded sound: actual waveform data. MIDI is sheet music; audio is the recording.
Does MIDI make sound on its own?
No. MIDI contains no sound. The audio you hear when MIDI plays comes from the synthesizer or software instrument interpreting the MIDI data.
Can MIDI be edited after recording?
Yes, fully. Every note's pitch, timing, duration, and velocity can be changed non-destructively in the piano roll. This is MIDI's primary advantage over audio.
When should you use MIDI?
When triggering software instruments (synths, samplers, drum plugins), programming beats, composing virtual arrangements, or whenever you want to edit notes after the fact.
When should you use audio?
When recording real instruments (vocals, guitar, acoustic drums), capturing hardware synthesizers, or using pre-recorded samples and loops.
What is a MIDI file?
A .mid file storing MIDI note and controller data, with no audio. It plays through whatever instrument the receiving software assigns. Tiny file size.
What does "bouncing to audio" mean?
Converting a MIDI arrangement with its software instruments into a fixed audio file (WAV, AIFF). The MIDI renders to actual waveform data.
Can MIDI control hardware instruments?
Yes; MIDI was designed for hardware. Connect via a 5-pin DIN MIDI cable or USB MIDI. The hardware generates audio; record that output separately into your DAW.
What is velocity?
A value from 0–127 indicating how hard a note was played. It usually controls volume but can also control filter settings, layer switching, and other parameters.
Is MIDI or audio better?
Neither; they serve different purposes. MIDI is for flexibility and programming; audio is for capturing real performances and final delivery. Professional production uses both.
How do you convert MIDI to audio?
Bounce or export in your DAW. Logic Pro: File → Bounce. Ableton: File → Export Audio/Video. FL Studio: File → Export. The DAW renders the MIDI through the software instrument to a WAV file.