To record a podcast, you need a dynamic microphone (Rode PodMic at ~$99 or Shure SM7B at ~$399), an audio interface (Focusrite Scarlett Solo at ~$120), free DAW software like GarageBand or Audacity, and headphones. Your recording environment matters more than your gear: a $100 mic in a treated closet will outperform a $400 mic in an echoey room.
Updated May 2026 by MusicProductionWiki Staff
Podcasting is one of the few creative formats where audio quality directly determines whether listeners stay past the first two minutes. Audiences tolerate imperfect content delivered in clear, clean audio. They will not tolerate great content buried under echo, room hiss, and background noise. Getting your recording setup right before you publish your first episode is more important than the content itself, because nobody will hear your content if they've already closed the tab.
This guide covers the complete podcast recording process from the ground up: microphone selection, audio interfaces, acoustic treatment, DAW setup and workflow, recording technique, and the post-production steps that transform a raw recording into a professional-sounding episode. Whether you're starting with a $79 USB microphone or building a dedicated studio around a $399 Shure SM7B, every section here applies.
Step 1: Choose Your Microphone
Dynamic vs. Condenser for Podcasting
The first and most consequential decision in podcast recording is microphone type. Dynamic microphones and condenser microphones capture audio through fundamentally different transducer mechanisms, and that difference has serious practical implications for podcasters recording in typical home environments.
Dynamic microphones use a moving-coil transducer that requires the sound source to be relatively close (typically 6–10 inches) but rewards that proximity with aggressive off-axis rejection. HVAC hum, traffic rumble from outside, keyboard clicks, and the natural reverb of an untreated room are all significantly attenuated. Dynamic mics do not require 48V phantom power, are physically robust, and have a characteristically warm, mid-forward frequency response that suits spoken-word recording. For podcasters who aren't recording in dedicated, acoustically treated spaces, the dynamic microphone's room rejection is a genuine production advantage.
Condenser microphones use a thin diaphragm suspended near a backplate, producing higher sensitivity and more extended frequency response at both ends. That sensitivity captures more detail, which is excellent for music recording and for voices in professional studios. In an untreated home office, however, that same sensitivity picks up every echo, air conditioning hum, and ambient noise that a dynamic mic would reject. Condensers require 48V phantom power from an interface or mixer. They produce cleaner, more detailed vocal recordings only when the recording environment supports them.
Recommendation for most podcasters: Start with a dynamic microphone. The room rejection makes recordings more consistently usable without significant post-production. Move to a condenser only after investing in proper acoustic treatment or if you already have an unusually quiet, well-treated recording space. For a deeper look at this tradeoff, see our condenser vs. dynamic microphone guide.
Recommended Podcast Microphones by Budget
| Microphone | Price | Type | Connection | Notes |
|---|---|---|---|---|
| Audio-Technica ATR2100x | $79 | Dynamic | USB + XLR | Dual connectivity; excellent beginner mic with upgrade path to XLR later |
| Rode PodMic | $99 | Dynamic | XLR | Best entry-level broadcast mic; optimized for speech; tight cardioid pattern |
| Shure SM58 | $99 | Dynamic | XLR | Industry workhorse; extremely durable; cardioid; proven on stage and in studios |
| Rode PodMic USB | $149 | Dynamic | USB + XLR | PodMic broadcast capsule with USB added; ideal no-interface setup |
| Rode NT1 5th Gen | $245 | Condenser | XLR + USB | 4dB self-noise; best condenser for treated spaces; USB mode requires no interface |
| Shure SM7B | $399 | Dynamic | XLR | Industry standard for broadcast; used by top podcasters globally; needs high-gain interface |
The Shure SM7B is the industry standard for podcast vocal recording, used by prominent podcasters, broadcast journalists, and streamers globally for its warm, present, broadcast-quality tone. One important caveat: the SM7B requires significantly more gain than most budget interfaces can cleanly provide. If your audio interface offers less than 60dB of gain, a CloudLifter CL-1 ($149) or a Triton FetHead ($69) inline preamp booster is strongly recommended. Many podcasters start with the Rode PodMic and upgrade to the SM7B once they commit to the format long-term. Both are cardioid dynamic microphones optimized for voice, and the workflow is identical.
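The gain caveat above comes down to adding decibel figures. The sketch below checks whether an interface alone reaches the roughly 60dB mentioned in the text and what an inline booster would add on top. The function names and the example figures (a 56dB interface, a 25dB booster) are illustrative assumptions, not published specifications; check the data sheets for your own hardware.

```python
REQUIRED_GAIN_DB = 60.0  # figure quoted in this guide for the SM7B


def needs_booster(interface_gain_db: float,
                  required_db: float = REQUIRED_GAIN_DB) -> bool:
    """Return True when the interface alone falls short of the required gain."""
    return interface_gain_db < required_db


def total_gain_db(interface_gain_db: float, booster_gain_db: float = 0.0) -> float:
    """Gain stages in series add when expressed in decibels."""
    return interface_gain_db + booster_gain_db


interface = 56.0  # hypothetical maximum preamp gain of an interface
booster = 25.0    # hypothetical gain of an inline booster

print("Booster recommended:", needs_booster(interface))
print("Total available gain (dB):", total_gain_db(interface, booster))
```

Running the preamp well below its maximum is usually quieter than running it flat out, which is the practical reason a booster helps even when the numbers look borderline.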
Step 2: Audio Interface or USB Microphone?
XLR microphones, the professional standard across broadcast, music recording, and podcasting, require an audio interface to connect to your computer. The interface converts the microphone's analog signal to digital audio, provides the necessary gain via a microphone preamp, and supplies 48V phantom power for condenser microphones. The interface connects to your computer via USB-C (on current-generation interfaces) and appears as an audio input device in your DAW and operating system.
USB microphones have the interface circuitry built directly into the microphone body. They plug directly into a computer's USB port with no additional hardware. This makes them simpler and cheaper to get started, with no separate interface purchase required. The tradeoff is less flexibility: you can't swap capsules, upgrade the preamp independently, or easily add a second microphone to a separate track.
Recommended Audio Interfaces for Podcasting
The Focusrite Scarlett Solo ($120) handles one XLR microphone and is the most widely recommended single-host podcast interface. It provides clean preamp gain, 48V phantom power, and direct monitoring. For a full breakdown of its features, our Focusrite Scarlett Solo review covers the 4th-generation model in detail.
The Focusrite Scarlett 2i2 ($199) handles two XLR inputs simultaneously, which is essential for two-host in-person podcast setups. Both hosts record to separate tracks in the DAW, allowing independent level control and processing in post. See our Scarlett Solo vs. 2i2 comparison to determine which is right for your setup.
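The Focusrite Scarlett 4i4 ($219) extends to four inputs, but only two of them have microphone preamps; the other two are line-level. It suits setups that combine two microphones with line-level or instrument sources. A panel podcast with three or more in-person microphones needs an interface with more preamps or an external preamp feeding the line inputs.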
If you're recording solo and want to avoid the interface entirely, the Rode PodMic USB and Rode NT1 5th Gen both offer USB modes that produce excellent quality without a separate interface. The NT1 5th Gen's USB implementation is particularly clean: it uses a high-quality onboard converter that rivals many entry-level dedicated interfaces. For more on selecting the right interface for your home studio, see our comprehensive best audio interfaces under $200 guide.
Step 3: Acoustic Treatment for Podcasting
Acoustic treatment is the single highest-leverage improvement most podcasters can make. A professionally treated recording space transforms any microphone, including a budget dynamic, into something that sounds broadcast-quality. An untreated space undermines even the most expensive gear.
The two acoustic problems to solve are echo/reverb (sound bouncing off hard surfaces and arriving at the microphone after a delay) and low-frequency room modes (bass frequencies that build up in corners and create a boomy, uneven frequency response). For spoken-word podcast recording, echo is the primary concern. Most home podcasters don't need to address room modes unless they're also mixing music in the same space.
Acoustic Treatment Options by Budget
Free/Immediate (No Budget): Record in a closet full of hanging clothes. The clothing acts as broadband absorption: it absorbs mid and high frequencies very effectively and reduces echo dramatically. Many professional-sounding independent podcasts are recorded in clothing closets. Alternatively, record in a small room with carpet, upholstered furniture, bookshelves full of books, and heavy curtains. These are all effective natural absorbers.
Low Budget ($50–$150): Moving blankets ($30–$60 for a pack) hung around your recording position are highly effective broadband absorbers. A furniture moving blanket draped over a microphone stand boom arm or hung on a curtain rod creates a significant improvement in room sound. Acoustic foam panels in specific corner and wall placements also work at this price point, though they are less effective at low frequencies than marketed.
Dedicated Treatment ($200+): Proper acoustic panels with 2–4 inch thick rigid fiberglass or rockwool insulation absorb across a wider frequency range than foam. For full treatment guidance, see our dedicated home studio acoustic treatment guide.
Practical Setup for Podcast Recording
You don't need to treat an entire room. Focus treatment on the reflection points closest to the microphone. A simple reflection filter (a curved panel of acoustic foam that mounts behind the microphone) reduces early reflections from the wall behind it. Products such as the Kaotica Eyeball and the reflection filters from sE Electronics are widely used by podcasters and voice actors. Combined with a closet or soft-furnished room, this is often sufficient for podcast-quality recordings.
Position your microphone away from hard, parallel walls. Recording in the center of a room is better than recording in a corner. If your room has concrete or plaster walls and no soft furnishings, even the best microphone will sound problematic until treatment is added.
Step 4: DAW Selection and Session Setup
Your DAW (Digital Audio Workstation) is where you record, edit, and process your podcast audio. For podcast production, DAW choice matters less than it does for music production: the basic operations (record, cut, export) are available in every option from free to professional. What matters is picking software you'll actually use consistently.
DAW Options for Podcasters
GarageBand (Mac, Free): The most commonly recommended starting point for podcasters on Mac. Straightforward interface, solid built-in EQ and compression, and zero cost. The audio quality is identical to Logic Pro; GarageBand is Logic Pro with a simplified interface. Exports to high-quality AAC and WAV without issue.
Audacity (Free, Mac/Windows/Linux): The most popular free cross-platform option with solid editing tools. Noise reduction built in, non-destructive editing with undo history, and export to MP3, WAV, and FLAC. The interface is less polished than GarageBand but the feature set is powerful for the price. The Label Track feature in Audacity is excellent for marking edit points during recording.
Reaper ($60 discounted license for personal use): A professional-grade DAW at an accessible price point. Highly customizable, available for Mac, Windows, and Linux, and supports an extensive plugin ecosystem including podcast-specific tools. The learning curve is steeper than GarageBand but the control over routing, monitoring, and processing is significantly greater. Strongly recommended for podcasters who want to grow into more complex production.
Adobe Audition ($20.99/month as part of Creative Cloud): The professional standard in broadcast and podcast production. Advanced noise reduction via the Adaptive Noise Reduction and Noise Print tools, spectral frequency display for identifying and removing specific noise events, batch processing for consistent episode processing, and tight integration with Adobe Premiere for video podcasters. Worth the subscription cost for serious producers.
Logic Pro ($199.99, Mac, one-time purchase): Excellent for podcasters who also produce music. The built-in plugin suite is comprehensive, Smart Tempo and Flex Time editing are useful for cleaning up pacing, and the interface is polished. For beginners deciding between DAWs, our best DAW for beginners guide compares all major options in depth.
Setting Up a Podcast Session
Once you've selected your DAW, the session setup for podcast recording is consistent across all platforms:
- Sample rate: 44.1kHz or 48kHz. Both are standard for podcast delivery. 48kHz is preferred if you may also use the audio with video content.
- Bit depth: 24-bit for recording. This gives you significantly more dynamic range headroom than 16-bit during capture and processing, even though the final export will be a lossy MP3 or AAC file.
- Track count: One track per microphone/participant. Recording each host or guest to a separate track is essential: it allows independent noise gating, EQ, compression, and level adjustment in post.
- Input monitoring: Use your interface's direct monitoring (zero-latency hardware monitoring) rather than DAW monitoring. This eliminates the processing latency that causes an unpleasant echo in your headphones while recording.
- File format for recording: Always record to uncompressed WAV or AIFF. Never record directly to MP3. Compressing audio at the recording stage discards data you cannot recover in post.
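The session settings above determine how much disk space an uncompressed recording needs. As a rough sketch, the size of the raw audio data is sample rate × bytes per sample × channels × duration. The function name is illustrative, and the calculation ignores the small WAV header.

```python
def wav_data_bytes(seconds: int, sample_rate: int = 48_000,
                   bit_depth: int = 24, channels: int = 1) -> int:
    """Size in bytes of uncompressed PCM audio (file header not included)."""
    bytes_per_sample = bit_depth // 8
    return seconds * sample_rate * bytes_per_sample * channels


one_hour = 60 * 60

# One mono track per participant, as recommended in the list above.
for rate in (44_100, 48_000):
    size_mb = wav_data_bytes(one_hour, sample_rate=rate) / 1_000_000
    print(f"1 hour, 24-bit mono at {rate} Hz: {size_mb:.1f} MB")
```

Multiply by the number of participants to estimate a full session. A one-hour, two-host recording at 48kHz and 24-bit comes to roughly 1 GB of raw audio.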
Step 5: Recording Technique and Levels
Gain Staging and Target Levels
Setting correct gain before recording is one of the most important, and most overlooked, steps in podcast production. The goal is to capture a healthy signal without clipping, while leaving sufficient headroom for processing in post-production.
Target recording level: Aim for an average RMS level of -18dBFS to -12dBFS during normal speech, with peaks not exceeding -6dBFS. This headroom is essential. If your peaks are hitting -3dBFS or higher during recording, you have almost no room for compression and limiting in post without distortion. Speak into the microphone at a comfortable conversational distance (6–12 inches with a pop filter) and raise the interface gain until your loudest speech peaks hit approximately -12dBFS on the DAW's input meter. This is a reliable starting point.
What to avoid: Never let the input clip (hit 0dBFS). Digital clipping produces harsh, irrecoverable distortion. If you see the input meter hitting red, reduce the gain at the interface before recording, not in the DAW. Recording a signal that's too hot and then pulling it down in the DAW does not fix clipping; the distortion is already in the captured audio.
Gain vs. Volume: Gain is set at the interface (how hard you're pushing the signal before it enters the computer). Volume is adjusted in the DAW after capture. Always optimize gain at the source.
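To make the dBFS figures above concrete, here is a minimal sketch of how peak and RMS levels can be computed from floating-point samples in the range -1.0 to 1.0. The function names and the test tone are illustrative assumptions. The RMS convention used here reads a full-scale sine wave as about -3dBFS; some meters add 3dB so that a full-scale sine reads 0dBFS.

```python
import math


def peak_dbfs(samples):
    """Highest absolute sample value, in dB relative to full scale."""
    peak = max(abs(s) for s in samples)
    return 20 * math.log10(peak) if peak > 0 else float("-inf")


def rms_dbfs(samples):
    """Root-mean-square level, in dB relative to full scale."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_square) if mean_square > 0 else float("-inf")


def exceeds_peak_target(samples, max_peak_dbfs=-6.0):
    """True when the loudest sample is above the peak ceiling from the text."""
    return peak_dbfs(samples) > max_peak_dbfs


def sine_wave(amplitude, freq_hz, sample_rate, seconds):
    """Generate a test tone; stands in for a real recording here."""
    count = int(sample_rate * seconds)
    return [amplitude * math.sin(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(count)]


tone = sine_wave(amplitude=0.25, freq_hz=1000, sample_rate=48_000, seconds=1)
print(f"peak: {peak_dbfs(tone):.2f} dBFS, rms: {rms_dbfs(tone):.2f} dBFS")
print("over the -6 dBFS ceiling:", exceeds_peak_target(tone))
```

Speech has a much larger gap between peak and RMS than a sine wave, so use the DAW's meters on real speech rather than a tone when setting gain.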
Microphone Technique
Distance: Dynamic microphones are typically used at 6–10 inches from the mouth. Closer positioning (3–6 inches) takes advantage of the proximity effect (a bass boost that occurs when directional microphones are used very close to the source), which can add warmth and presence to thin or bright voices. Condenser microphones are typically used at 6–12 inches.
Pop filter: Essential for any vocal recording. Plosive sounds (the bursts of air on P, B, and T) create low-frequency thumps that saturate the microphone capsule and are difficult to fix in post. A pop filter (foam windscreen or a double-layer mesh filter on a gooseneck arm) placed 1–2 inches in front of the microphone eliminates most plosive problems. The SM7B ships with both a foam windscreen and a close-talk windscreen; the PodMic has an integrated pop filter.
Off-axis rejection: Dynamic cardioid microphones are most sensitive directly on-axis (facing the capsule directly) and reject sound from the rear and sides. Position yourself directly in front of the microphone's capsule, not to the side. If you're recording two hosts in the same room with one microphone between them, each host will be partially off-axis, a compromise that significantly reduces audio quality. Always use one microphone per person.
Headphones during recording: Always monitor through closed-back headphones while recording. This lets you hear issues such as clicks, handling noise, and room echo in real time, before they accumulate across an entire episode. Monitoring through speakers while recording risks microphone bleed from the speakers into the recording.
Managing Background Noise
Six approaches in order of impact:
- Choose a dynamic microphone: they reject room noise better than condensers at comparable price points.
- Treat your recording space: even moving to a closet full of clothes dramatically reduces echo and high-frequency reflections.
- Record closer to the microphone: 6–10 inches with a pop filter. Closer proximity increases the direct signal-to-room ratio.
- Use a noise gate in your DAW: a gate plugin mutes the microphone when you're not speaking, preventing room noise from accumulating during pauses.
- Apply noise reduction in post: iZotope RX, Adobe Audition's Noise Reduction, or Auphonic can remove consistent background noise (HVAC hum, fan noise) using a noise profile captured from a room tone recording.
- Choose your recording window carefully: minimize HVAC cycling, traffic, and household noise. Record early morning or late evening if your environment is noisy during the day.
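For readers curious about what a noise gate actually does, the following is a deliberately simplified sketch: it passes samples that exceed a threshold, keeps the gate open for a short hold time afterward, and mutes everything else. Real gate plugins also ramp the gain with attack and release times, which this version omits. The function name and the default threshold and hold values are assumptions chosen for illustration.

```python
def noise_gate(samples, sample_rate, threshold_dbfs=-35.0, hold_ms=75.0):
    """Hard gate with hold: mute samples once the signal stays below threshold.

    samples are floats in the range -1.0 to 1.0.
    """
    threshold = 10 ** (threshold_dbfs / 20)      # convert dBFS to linear amplitude
    hold_samples = int(sample_rate * hold_ms / 1000)

    gated = []
    remaining_hold = 0
    for s in samples:
        if abs(s) >= threshold:
            remaining_hold = hold_samples        # signal present: (re)open the gate
            gated.append(s)
        elif remaining_hold > 0:
            remaining_hold -= 1                  # still inside the hold window
            gated.append(s)
        else:
            gated.append(0.0)                    # gate closed: mute room noise
    return gated


# Tiny example at a 1 kHz sample rate so the hold window is easy to follow.
noisy = [0.001, 0.5, 0.001, 0.001, 0.001, 0.001]
print(noise_gate(noisy, sample_rate=1000, hold_ms=2.0))
```

The hold time matters because a speech waveform crosses zero constantly; without it, the gate would chop the signal on every cycle instead of closing only during real pauses.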
Step 6: Recording Remote Guests and Co-Hosts
Remote podcast recording, where participants are in different locations, introduces a new set of technical challenges. The core problem is that traditional video conferencing tools (Zoom, Google Meet, Skype) transmit compressed audio optimized for voice intelligibility, not audio quality. The codec compression and packet loss recovery applied to streaming calls produce audio artifacts that are immediately apparent in a professional podcast context.
Dedicated Podcast Recording Platforms
Riverside.fm and Squadcast are the professional standards for remote podcast recording. Both platforms solve the streaming quality problem through the same approach: each participant records audio locally on their own device at full quality (uncompressed WAV), and the platform uploads and syncs those local recordings after the session. What you hear during the call may be compressed streaming audio, but what you receive for editing is each participant's locally recorded, full-quality audio, completely independent of internet connection quality during the session.
Riverside.fm ($15/month for Standard plan) also records video locally at up to 4K, making it the standard for video podcast production. Squadcast (now part of Descript) integrates directly with Descript's transcription-based editing workflow. Both are significantly better than Zoom for professional podcast production.
Zoom: Free and ubiquitous, but produces compressed audio that sounds noticeably worse than dedicated podcast recording platforms. Acceptable for casual content or guest pre-interviews, but not recommended for final episode recording if audio quality is a priority. If Zoom is unavoidable, ask participants to record their own audio locally using Audacity or a voice memo app simultaneously, and sync in post using the Zoom audio as a reference track.
Double-Ender Recording
A double-ender is a remote recording approach where each participant records their own audio locally and shares the files afterward. This produces the highest possible quality for remote recording: each person's audio is captured directly by their own interface and microphone with no streaming compression whatsoever. The files are synced in post using a clap or countdown at the start of the session as a sync point. Riverside.fm and Squadcast automate this process; doing it manually requires each participant to use Audacity, GarageBand, or their DAW of choice to record their own track.
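The manual sync step can be illustrated with a small sketch: locate the clap in each track as the first sample above a loudness threshold, then trim the leading audio so both claps land on the same sample index. The function names and the 0.5 threshold are illustrative assumptions, and both tracks are assumed to share one sample rate. In practice you would do this by eye in the DAW timeline.

```python
def find_clap(samples, threshold=0.5):
    """Index of the first sample whose absolute value reaches the threshold."""
    for index, s in enumerate(samples):
        if abs(s) >= threshold:
            return index
    raise ValueError("no transient above threshold; lower the threshold")


def align_by_clap(track_a, track_b, threshold=0.5):
    """Trim the start of each track so the clap sits at the same index in both."""
    clap_a = find_clap(track_a, threshold)
    clap_b = find_clap(track_b, threshold)
    keep_before = min(clap_a, clap_b)   # lead-in that both tracks can retain
    return track_a[clap_a - keep_before:], track_b[clap_b - keep_before:]


# Host pressed record earlier than the guest, so the host's clap comes later.
host = [0.0, 0.0, 0.0, 0.9, 0.1, 0.2]
guest = [0.0, 0.8, 0.3, 0.4]

aligned_host, aligned_guest = align_by_clap(host, guest)
print(find_clap(aligned_host), find_clap(aligned_guest))
```

This aligns only the start of the session. Two devices rarely run at exactly the same clock speed, so long recordings can drift apart and may need a second sync point near the end.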
For podcasters who are also building out their home studio infrastructure, understanding the full signal chain is valuable context. Our guide to home recording studio setup covers signal flow, monitoring, and room setup in broader detail.
Step 7: Editing and Post-Production
Raw podcast recordings require editing and processing before publication. The extent of editing varies by format (a tightly scripted solo show may need minimal cuts, while a long-form interview may require significant restructuring), but the processing chain is consistent across all podcast types.
Standard Podcast Processing Chain
Apply processing in this order on each voice track:
- Noise gate: Set the threshold just above the noise floor. The gate opens when you speak and closes during silence, preventing background noise from accumulating. Typical settings: threshold -40 to -30dBFS, attack 5–10ms, release 100–200ms, hold 50–100ms.
- High-pass filter (EQ): Cut all frequencies below 80–100Hz. This removes low-frequency rumble, HVAC vibration, and proximity effect buildup without affecting the intelligibility of the voice. Most podcast voices have minimal useful content below 100Hz.
- EQ (tone shaping): Moderate cuts in the 200–400Hz range reduce muddiness and boxiness. A gentle presence boost in the 2–5kHz range increases vocal intelligibility and definition. Air boost (10–16kHz) adds brightness if the microphone sounds dull. For detailed vocal EQ technique, see our how to EQ vocals guide.
- Noise reduction: If background noise is present, apply iZotope RX or Adobe Audition's Noise Reduction using a noise profile captured from a 1–2 second room tone recording at the beginning of the session. Keep reduction modest (6–10dB). Aggressive noise reduction introduces digital artifacts and makes voices sound unnatural.
- Compression: Compression reduces dynamic range (the difference between the quietest and loudest parts of speech), making the voice feel more consistent and present. For podcasting, a ratio of 3:1 to 4:1, moderate attack (20–40ms to preserve transients), fast release (100–200ms), and 3–6dB of gain reduction is a common starting point. The goal is consistent loudness without squashing the natural dynamics of speech. Our guide on using compression on vocals covers ratio, attack, and release in depth.
- Limiting: A limiter (a compressor with a ratio of ∞:1 or a very high ratio) at -1 to -2dBFS catches any remaining peaks before the output stage and prevents inter-sample clipping during export and encoding.
- Loudness normalization: Export to the target loudness for podcast platforms. Spotify, Apple Podcasts, and most streaming platforms normalize podcast audio to -16 LUFS (integrated). Targeting -16 LUFS integrated with a true peak of -1dBTP in your export is the standard specification.
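The ratio and ceiling figures in the chain above can be made concrete with a short sketch of the static gain math behind a compressor and a limiter. This is a hard-knee model that ignores attack and release timing, so it shows only how much gain reduction a given input level produces. The function names and the -18dBFS threshold are illustrative assumptions; the guide itself does not specify a compressor threshold.

```python
def compressor_gain_db(level_dbfs, threshold_dbfs=-18.0, ratio=3.0):
    """Gain change in dB (zero or negative) for a hard-knee compressor."""
    if level_dbfs <= threshold_dbfs:
        return 0.0                        # below threshold: signal untouched
    overshoot = level_dbfs - threshold_dbfs
    return overshoot / ratio - overshoot  # only 1/ratio of the overshoot survives


def limiter_gain_db(level_dbfs, ceiling_dbfs=-1.0):
    """Gain change in dB needed to keep a peak at or below the ceiling."""
    return min(0.0, ceiling_dbfs - level_dbfs)


for level in (-30.0, -12.0, -9.0):
    reduction = compressor_gain_db(level, ratio=3.0)
    print(f"input {level:6.1f} dBFS -> gain change {reduction:5.1f} dB")

print("limiter on a 0 dBFS peak:", limiter_gain_db(0.0), "dB")
```

With these assumed settings, a phrase peaking 9dB over the threshold is pulled down by 6dB at 3:1, which sits at the top of the 3–6dB gain reduction range suggested above.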
Editing Workflow
For content editing (removing mistakes, long pauses, filler words, and off-topic tangents), there are two main approaches:
Timeline editing in the DAW: Listen through the recording, mark edit points, and cut directly in the DAW timeline. Most experienced podcast editors work this way because it gives full control over precise edits and crossfades. The Edit label marker system in Audacity and the Markers panel in Reaper and Adobe Audition facilitate this workflow efficiently.
Transcript-based editing (Descript, Riverside, Adobe Audition speech): Descript transcribes the audio and allows editing the recording by editing the text transcript: deleting a word in the transcript removes it from the audio. This is significantly faster for podcasters who aren't comfortable with traditional DAW timeline editing. The AI-powered filler word removal in Descript (removing every "um," "uh," and "you know" automatically) can save substantial editing time on interview content.
Export and Publication Specifications
- Format: MP3 (most widely supported) or AAC (slightly better quality at equivalent file size). WAV is not appropriate for distribution because file sizes are prohibitively large for streaming.
- MP3 bitrate: 128kbps mono for voice-only content; 192kbps stereo for music-heavy content or shows with production music. 64kbps mono is acceptable for very basic voice content with no music.
- Metadata: Embed episode title, show name, episode number, and cover art as ID3 tags. Most DAWs and dedicated podcast export tools handle this.
- Loudness target: -16 LUFS integrated, -1dBTP true peak for podcast platform delivery.
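Two bits of arithmetic follow from the export specifications above: how large an MP3 will be at a given constant bitrate, and how much gain a mix needs to reach the loudness target once you have measured it. The sketch below shows both. The function names and example values are illustrative assumptions, and the size estimate ignores metadata and cover art.

```python
def mp3_megabytes(minutes: float, bitrate_kbps: float) -> float:
    """Approximate size of a constant-bitrate MP3 (audio data only)."""
    total_bits = minutes * 60 * bitrate_kbps * 1000
    return total_bits / 8 / 1_000_000


def gain_to_target_db(measured_lufs: float, target_lufs: float = -16.0) -> float:
    """Gain in dB to apply so a mix measured at measured_lufs hits the target."""
    return target_lufs - measured_lufs


print(f"45 min at 128 kbps: {mp3_megabytes(45, 128):.1f} MB")
print(f"45 min at 64 kbps:  {mp3_megabytes(45, 64):.1f} MB")

# A mix that measures -19.5 LUFS integrated needs to come up to reach -16 LUFS.
print(f"gain needed: {gain_to_target_db(-19.5):+.1f} dB")
```

Raising the gain also raises the peaks by the same amount, so recheck the true peak against the -1dBTP ceiling after applying the offset.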
Step 8: Episode Length and Publishing Workflow
How Long Should a Podcast Episode Be?
Episode length should match your content, not a predetermined target. Most successful podcasts run 20–60 minutes: substantive enough to justify the listener's time, concise enough to complete during a commute or workout. Interview podcasts commonly run 45–90 minutes. Solo commentary episodes tend to be shorter: 15–30 minutes. Panel discussions with multiple guests often run 60–90 minutes.
The consistent finding from podcast audience research is that episodes that feel padded (long intros, excessive recapping, tangents that don't pay off) lose listeners faster than dense, tightly edited episodes that end at the natural conclusion of the content. When the content is done, the episode is done. Artificially extending episodes to hit a round number does not improve listener metrics.
Publishing Workflow
Once your episode is edited, processed, and exported, the publishing workflow is straightforward:
- Podcast host: Upload your MP3 to a podcast hosting platform (Buzzsprout, Anchor/Spotify for Podcasters, Libsyn, Transistor, or Podbean). These platforms store your audio files and generate the RSS feed that podcast directories use to distribute your episodes.
- Show notes and metadata: Write episode show notes, embed timestamps, and add guest links. Strong show notes improve discoverability in podcast search.
- Distribution: Submit your RSS feed to Apple Podcasts, Spotify, Amazon Music, and other directories. (Google Podcasts was shut down in 2024 and no longer accepts feeds.) Most directories require a one-time submission; subsequent episodes are automatically pulled from the RSS feed.
- Episode artwork: Each episode can have unique artwork, or you can use your standard show artwork. Minimum specification is 1400×1400 pixels; 3000×3000 pixels is the current recommendation for all major platforms.
For podcasters who are also musicians or producers looking to build an audience beyond the podcast, understanding broader music promotion strategies is valuable context. Our guide to promoting music independently covers audience-building principles that translate directly to podcast growth.
Practical Exercises
Record a 5-Minute Test Episode
Set up your microphone, interface, and DAW using the session settings in this guide (24-bit, 44.1kHz, input peaks at -12dBFS). Record a 5-minute test monologue speaking at normal conversational distance and volume, then listen back through headphones to identify any echo, hiss, or level issues before your first real episode. This diagnostic recording reveals your environment's weaknesses before they ruin a real recording.
Build a Podcast Processing Chain
Import a raw podcast recording into your DAW and build the full processing chain described in Step 7 from scratch: noise gate, high-pass filter at 100Hz, EQ shaping, compression at 3:1 to 4:1, and a limiter at -1dBFS. Export and run the file through an integrated loudness meter to confirm you're hitting -16 LUFS before uploading to a podcast host. Adjust gain and limiter threshold until the export matches the target specification.
Remote Double-Ender with Manual Sync
Arrange a remote recording session with one other person where both participants record their own audio locally using Audacity or their preferred DAW. Begin with a three-count clap visible in both waveforms, then sync the two tracks manually in your DAW using the clap transient as a reference point. Apply independent processing chains to each track, balance the levels, and export a properly mixed stereo episode, without using Riverside or Squadcast automation.