/diːˈtjuːn/
Detune is a synthesis parameter that shifts one or more oscillators slightly above or below a reference pitch, measured in cents. When multiple detuned oscillators are layered, their interference patterns create warmth, width, and natural movement.
Before the chorus hits, before the filter opens, before anything else moves the listener — it's the almost-imperceptible drift between two oscillators that turns a sine wave into something that breathes.
Detune is a synthesis parameter that displaces the pitch of one or more oscillators by a precisely controlled interval — almost always measured in cents, where 100 cents equals one semitone. Rather than raising or lowering pitch to a musically distinct interval, detune operates in the sub-semitone range — typically ±50 cents maximum in most synthesizers, though some instruments allow up to ±100 cents per oscillator. The result is not a recognizable harmony but a tight cluster of nearly identical pitches whose interaction produces acoustic beating, spectral enrichment, and the perception of physical space inside a sound.
The psychoacoustic mechanism at work is amplitude modulation through phase interference. When two sinusoidal signals at nearly identical frequencies are summed, they periodically reinforce and cancel each other at a rate equal to the absolute difference in their frequencies. A detuning of 5 cents between two oscillators tuned to A4 (440 Hz) produces a beating frequency of approximately 1.27 Hz — slow enough to register as warmth and movement rather than a distinct tremolo effect. Increase the detune spread to 20 cents and the beating rate reaches roughly 5 Hz, entering a zone that feels more animated and aggressive. This relationship between cents offset, fundamental frequency, and beating rate is not constant: the same cent value produces a faster beat on higher notes than on lower ones, which is why analog synthesizers with natural tuning drift sound more alive in the upper registers.
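The beating figures above follow directly from the cents-to-ratio conversion, and a few lines of Python are enough to check them. This is a minimal sketch; the function names `detuned_frequency` and `beat_rate_hz` are illustrative and not taken from any synthesizer's API.

```python
def detuned_frequency(base_hz: float, cents: float) -> float:
    """Frequency after shifting base_hz by the given number of cents."""
    return base_hz * 2.0 ** (cents / 1200.0)


def beat_rate_hz(base_hz: float, cents: float) -> float:
    """Beat rate between a reference oscillator and its detuned copy."""
    return abs(detuned_frequency(base_hz, cents) - base_hz)


if __name__ == "__main__":
    # Same 5-cent offset at three octaves of A: the beat rate scales with pitch.
    for note_hz in (110.0, 440.0, 1760.0):
        rate = beat_rate_hz(note_hz, 5.0)
        print(f"{note_hz:7.1f} Hz with 5 cents of detune beats at {rate:.2f} Hz")
```

Because the offset is a ratio rather than a fixed number of hertz, doubling the note frequency doubles the beat rate for the same cent value.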
In a multioscillator or unison context, detune becomes a spatial tool as much as a timbral one. Most modern synthesizers that offer a unison mode — stacking multiple virtual oscillator voices on a single key press — accompany this with a detune parameter that spreads those voices across a user-defined pitch range. When stereo panning is applied to these voices simultaneously (a technique often called unison width or unison spread), the pitch differences between left-panned and right-panned oscillators create inter-aural comb filtering that the brain interprets as acoustic width. This is the foundational mechanism behind the supersaw waveform popularized by Roland's JP-8000 in 1996 and replicated in virtually every trance, EDM, and cinematic synthesis context since.
Detune occupies a distinct conceptual space from related modulation sources. An LFO modulating oscillator pitch is vibrato — a time-varying pitch deviation that moves in and out of tune cyclically. Detune, in its most basic form, is a static offset — a fixed pitch displacement that remains constant unless itself modulated. However, most contemporary synthesizers allow detune to be modulated by LFOs, envelopes, or random sources, blurring this distinction and enabling effects that range from slow chorus-like movement to chaotic pitch randomization. Understanding detune as a static foundation that can be animated through modulation is the key to using it intentionally rather than decoratively.
At the mixing and production level, detune also appears as a parameter in several effects processors — most notably pitch shifters, chorus units, and dedicated doubler plug-ins — where it governs the pitch offset of a processed signal relative to the dry source. In these contexts the term is functionally equivalent to its synthesis counterpart but the signal architecture differs: rather than mixing oscillators at the synthesis stage, the effect creates detuned copies of an already-rendered audio signal. The distinction matters for phase behavior, latency compensation, and the quality of the resulting beating, which is why producers who understand the underlying principle make more informed choices about where in the signal chain to introduce pitch dispersion.
At its most fundamental level, detune functions by altering the phase accumulation rate of a digital oscillator or the control-voltage-to-frequency relationship of an analog VCO. In a digital synthesizer, an oscillator generates a waveform by incrementing a phase pointer through a lookup table at a rate determined by a tuning register. Increasing or decreasing that rate by a factor corresponding to a cent offset (mathematically, multiplying the base frequency by 2^(cents/1200)) shifts the output pitch without altering waveform shape. A +7 cent offset on a 440 Hz oscillator raises the output to approximately 441.8 Hz. When this signal is summed with an unmodified 440 Hz oscillator, the two signals drift in and out of phase alignment at approximately 1.8 Hz, producing the audible beating that defines detuned synthesis character.
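The phase-accumulator idea can be sketched in plain Python. This is an illustrative toy, not any particular instrument's oscillator: the sample rate, table size, non-interpolating lookup, and the names `phase_increment` and `render` are all assumptions made for the example.

```python
import math

SAMPLE_RATE = 48_000   # assumed sample rate for this sketch
TABLE_SIZE = 2048      # assumed lookup-table length
SINE_TABLE = [math.sin(2.0 * math.pi * i / TABLE_SIZE) for i in range(TABLE_SIZE)]


def phase_increment(base_hz: float, cents: float) -> float:
    """Table positions to advance per sample for a pitch offset in cents."""
    detuned_hz = base_hz * 2.0 ** (cents / 1200.0)
    return detuned_hz * TABLE_SIZE / SAMPLE_RATE


def render(base_hz: float, cents: float, num_samples: int) -> list[float]:
    """Step a phase pointer through the table (truncating lookup, no interpolation)."""
    phase = 0.0
    increment = phase_increment(base_hz, cents)
    samples = []
    for _ in range(num_samples):
        samples.append(SINE_TABLE[int(phase)])
        phase = (phase + increment) % TABLE_SIZE
    return samples


# One second of a 440 Hz reference summed with a copy detuned by +7 cents.
reference = render(440.0, 0.0, SAMPLE_RATE)
detuned = render(440.0, 7.0, SAMPLE_RATE)
mix = [a + b for a, b in zip(reference, detuned)]
```

Only the increment changes between the two oscillators; the table, and therefore the waveform shape, is shared. Plotting `mix` shows the summed amplitude swelling and collapsing at the beat rate.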
In unison mode, the synthesizer instantiates multiple copies of the oscillator stack — commonly 2, 4, 6, 8, or 16 voices — and applies a detune spread parameter that distributes these copies evenly across a pitch range centered on the played note. If the spread is set to ±10 cents with 8 voices, each voice is assigned a unique offset: approximately −10, −7.1, −4.3, −1.4, +1.4, +4.3, +7.1, and +10 cents. The resulting composite signal is a dense cluster of nearly identical frequencies that, when summed, produces a rich comb-filtered spectrum. Adding stereo panning — alternating voices left and right — means that the left channel contains a different set of beating relationships than the right channel, creating true stereo width that cannot be replicated by simply widening a mono signal after the fact.
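The even distribution described above is simple to compute. The sketch below assumes a linear spread and strictly alternating left/right panning; the names `unison_offsets` and `alternate_pans` are hypothetical, and real instruments may order or pan their voices differently.

```python
def unison_offsets(voices: int, spread_cents: float) -> list[float]:
    """Evenly spaced cent offsets from -spread_cents to +spread_cents."""
    if voices < 2:
        return [0.0]
    step = 2.0 * spread_cents / (voices - 1)
    return [-spread_cents + i * step for i in range(voices)]


def alternate_pans(voices: int, width: float = 1.0) -> list[float]:
    """Pan positions in [-1, 1]: even-indexed voices left, odd-indexed right."""
    return [-width if i % 2 == 0 else width for i in range(voices)]


if __name__ == "__main__":
    offsets = unison_offsets(8, 10.0)
    pans = alternate_pans(8)
    for cents, pan in zip(offsets, pans):
        side = "L" if pan < 0 else "R"
        print(f"{cents:+6.1f} cents -> {side}")
```

With alternating panning, each channel receives every other offset, so the left and right sums contain different sets of beating pairs.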
The waveform type interacts critically with detune behavior. A pure sine wave contains only its fundamental frequency, so beating between two detuned sine oscillators produces a single, clean amplitude modulation. A sawtooth wave is harmonically rich — it contains all integer overtones — so each harmonic pair between two detuned sawtooth oscillators beats at a rate proportional to its harmonic number. The third harmonic of a 5-cent-detuned oscillator beats three times as fast as the fundamental. This means sawtooth and square waves produce more complex, more spectrally distributed beating than sine waves, which is why supersaw pads feel so thick and alive: there are dozens of simultaneous beating relationships occurring across the frequency spectrum at different rates.
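To make the harmonic scaling concrete, the sketch below lists the beat rate for each harmonic pair of two detuned oscillators. It assumes ideal harmonics at exact integer multiples of each fundamental; `harmonic_beat_rates` is an illustrative name.

```python
def harmonic_beat_rates(base_hz: float, cents: float, harmonics: int) -> list[float]:
    """Beat rate of each harmonic pair (1..harmonics) between two detuned oscillators."""
    fundamental_beat = abs(base_hz * (2.0 ** (cents / 1200.0) - 1.0))
    # Harmonic n sits at n times each fundamental, so its pair differs by n times as much.
    return [n * fundamental_beat for n in range(1, harmonics + 1)]


if __name__ == "__main__":
    for n, rate in enumerate(harmonic_beat_rates(440.0, 5.0, 8), start=1):
        print(f"harmonic {n}: {rate:.2f} Hz")
```

A sine pair contributes only the first entry of this list, while a sawtooth pair contributes all of them at once.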
Digital implementations introduce an important consideration: aliasing at extreme detune settings. When a wavetable or bandlimited oscillator is transposed upward by a significant number of cents, its highest harmonics may be pushed above the Nyquist frequency and fold back into the audible spectrum as inharmonic artifacts. Quality synthesizers use oversampled oscillators or dynamic harmonic limiting to prevent this, but budget instruments may introduce audible aliasing at detune values above ±30–50 cents. This is one reason why classic hardware synthesizers — which use analog VCOs or carefully implemented DCOs — often sound smoother at wide detune settings than their digital counterparts.
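A rough way to see the Nyquist problem is to count how many harmonics fit below half the sample rate before and after the pitch is raised. The sketch assumes a naive oscillator that keeps playing a fixed set of harmonics without re-bandlimiting; `highest_safe_harmonic` and `folded_frequency` are hypothetical helper names.

```python
import math


def highest_safe_harmonic(base_hz: float, cents: float, sample_rate: float) -> int:
    """Highest harmonic number that stays at or below Nyquist after detuning."""
    pitch_hz = base_hz * 2.0 ** (cents / 1200.0)
    return math.floor((sample_rate / 2.0) / pitch_hz)


def folded_frequency(freq_hz: float, sample_rate: float) -> float:
    """Frequency at which a component appears after folding around Nyquist."""
    folded = freq_hz % sample_rate
    return sample_rate - folded if folded > sample_rate / 2.0 else folded


if __name__ == "__main__":
    sample_rate = 44_100.0
    for cents in (0.0, 50.0):
        limit = highest_safe_harmonic(440.0, cents, sample_rate)
        print(f"{cents:+.0f} cents: harmonics 1..{limit} stay below Nyquist")
```

Any harmonic above the reported limit lands at the folded frequency, which is generally not a multiple of the fundamental and is therefore heard as an inharmonic artifact.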
The practical upshot for producers is that detune is not a single effect but a continuum of behavior governed by the interaction of offset amount, voice count, waveform complexity, stereo distribution, and modulation. Treating it as a simple thickness knob misses the depth available when these variables are understood and controlled independently. Setting detune intentionally — choosing a beating rate that complements the tempo, selecting a voice count that fills the stereo field without masking the low end, modulating the spread with a slow LFO to create organic movement — is what separates polished synthesis from accidental texture.
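Choosing a beating rate that complements the tempo means inverting the cents formula: start from the desired beat rate and solve for the offset. The sketch below is one way to do that; `bar_rate_hz` and `cents_for_beat_rate` are illustrative names, and one beat cycle per 4/4 bar is only an example target.

```python
import math


def bar_rate_hz(bpm: float, beats_per_bar: int = 4) -> float:
    """Bars per second at the given tempo."""
    return bpm / 60.0 / beats_per_bar


def cents_for_beat_rate(base_hz: float, beat_hz: float) -> float:
    """Upward cent offset that makes a copy of base_hz beat at beat_hz."""
    return 1200.0 * math.log2(1.0 + beat_hz / base_hz)


if __name__ == "__main__":
    target = bar_rate_hz(120.0)  # one beat cycle per bar at 120 BPM
    cents = cents_for_beat_rate(440.0, target)
    print(f"target {target:.2f} Hz at A4 needs about {cents:.2f} cents")
```

The result holds only for the note it was computed for, since the same cent offset beats faster on higher notes.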
Diagram — Detune: Frequency spectrum diagram showing two detuned oscillators at 440 Hz and 445 Hz, their individual spectra, combined waveform beating pattern, and resulting amplitude modulation rate.
Every detune — hardware or plugin — operates on the same core parameters. Know these and you can work with any implementation.
Sets how far each oscillator is displaced from the reference pitch, measured in cents (1/100 of a semitone). Typical musical ranges: 0–5 cents for subtle warmth, 5–15 cents for classic analog character, 15–50 cents for aggressive EDM supersaw textures. Values above 50 cents begin to imply a distinct pitch interval and lose the perception of a single unified note.
Determines how many oscillator instances are summed together, with the detune spread distributed evenly across them. More voices (4–16) create denser beating networks and fuller stereo images but consume proportionally more CPU and can cause low-end muddiness. Two-voice unison (one slightly above, one slightly below) is the most phase-coherent configuration and sits cleanly in a mix.
Controls the panning assignment of each unison voice, ranging from fully mono (all voices center) to fully wide (outer voices hard left/right). A width of 50–75% retains center mono compatibility while delivering perceived stereo width. Full width can cause phase cancellation issues on mono playback systems and should be checked before export.
Routes an LFO, envelope, or random source to modulate the detune amount, creating time-varying pitch drift. A slow LFO (0.1–0.5 Hz) at low modulation depth (1–3 cents) simulates analog VCO drift and is one of the most effective ways to make a digital synthesizer feel organic. Faster rates (4–8 Hz) at small depths produce vibrato; chaotic random modulation emulates cassette tape instability.
Distinct from a global detune parameter, fine tune sets a fixed offset per oscillator independently. This allows asymmetric configurations — for instance, OSC1 at 0 cents, OSC2 at +7 cents, OSC3 at −3 cents — that produce character impossible to achieve with a symmetric spread parameter. Many producers replicate specific vintage synth voicings by manually setting per-oscillator fine tune values documented in hardware service manuals.
Some synthesizers (Serum, Vital) offer a 'spread' control that shapes how voices are distributed across the pitch range — linear, exponential, or stacked toward the edges. A linear spread places voices at equal intervals; an edge-weighted distribution concentrates voices at the extremes for a broader perceived width with cleaner center. This parameter is rarely documented but significantly affects the density and phasiness of the result.
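The difference between a linear and an edge-weighted distribution can be illustrated with a simple power curve. This is a hypothetical shaping function chosen for the example; it is not the algorithm used by Serum, Vital, or any other named instrument, and `shaped_offsets` and its `curve` parameter are invented here.

```python
import math


def shaped_offsets(voices: int, spread_cents: float, curve: float = 1.0) -> list[float]:
    """Cent offsets across +/-spread_cents, bent by a power curve.

    curve == 1.0 gives equal spacing; curve < 1.0 pushes voices toward
    the edges; curve > 1.0 clusters them toward the center.
    """
    if voices < 2:
        return [0.0]
    offsets = []
    for i in range(voices):
        position = -1.0 + 2.0 * i / (voices - 1)  # linear position in [-1, 1]
        shaped = math.copysign(abs(position) ** curve, position)
        offsets.append(spread_cents * shaped)
    return offsets


if __name__ == "__main__":
    print("linear:       ", [round(c, 2) for c in shaped_offsets(5, 10.0, 1.0)])
    print("edge-weighted:", [round(c, 2) for c in shaped_offsets(5, 10.0, 0.5)])
```

The outermost voices stay at the full spread in both cases; only the inner voices move, which changes how dense the cluster is around the center pitch.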
Session-ready starting points. These values assume a two-to-four oscillator patch at moderate mix levels; increase unison voice count in dense arrangements only when CPU headroom allows.
| Parameter | General | Drums | Vocals | Bass / Keys | Bus / Master |
|---|---|---|---|---|---|
| Detune amount (cents) | 2–8c subtle | 0–3c (minimal) | 5–12c doubler | 3–7c warmth | 0c (avoid) |
| Unison voices | 2–4 voices | 1–2 voices | 2 voices | 2–4 voices | 1 (no unison) |
| Stereo width | 40–70% | 0–30% | 50–80% | 30–60% | Check mono |
| Beating rate target | 0.5–3 Hz | Avoid beating | 1–4 Hz | 0.5–2 Hz | N/A |
| LFO modulation depth | 0–3c slow LFO | 0c | 0.5–2c | 1–3c | 0c |
| Low-cut after detune | 80–120 Hz | 60–80 Hz | 100–180 Hz | 40–60 Hz | HPF pre-master |
The origins of deliberate oscillator detuning trace to the earliest days of analog synthesis, before polyphonic instruments were widely available. Robert Moog's Minimoog Model D (1970), though monophonic, offered three independently tunable oscillators whose fine-tune controls allowed producers to introduce controlled pitch dispersion. Engineers at recording studios quickly discovered that spreading OSC2 and OSC3 by small amounts, typically 3–7 cents, transformed the instrument's single-voice output into something with the perceived richness of an ensemble. Keith Emerson, Rick Wakeman, and other early adopters made this practice central to their sound design vocabulary, though it remained a manual, imprecise art dictated by the mechanical stability of analog VCOs.
The development of dedicated unison modes with programmable detune parameters arrived in the late 1970s as manufacturers began implementing microprocessor-controlled tuning systems. The Oberheim OB-X (1979) and its successor the OB-Xa (1980) allowed multiple voices to be assigned to a single key, with voice-to-voice detuning providing the thick, organically drifting texture associated with artists like Vangelis and with early Human League records. The Sequential Circuits Prophet-5 (1978) could be tuned to drift or snap to precision, and producers learned to exploit the imprecision of the instrument's voltage-controlled oscillators (factory calibration tolerances of ±5 cents per voice) as a feature rather than a flaw. Bob Clearmountain's mixing work on 1980s pop records frequently preserved these imprecisions rather than correcting them.
The digital era introduced both a solution and a problem. The Roland D-50 (1987), Korg M1 (1988), and subsequent workstation synthesizers offered perfectly stable digital oscillators — and sounded sterile to ears accustomed to analog drift. Manufacturers responded by adding detune and chorus parameters to their digital instruments specifically to reintroduce the movement that analog instability had provided organically. The Roland JP-8000 (1996) represented the apotheosis of this approach: its SuperSAW oscillator mode stacked seven sawtooth waveforms with an asymmetric detune spread, creating a sound so immediately recognizable and sonically overwhelming that it defined the trance and eurodance aesthetic of the late 1990s and early 2000s. Producers like Ferry Corsten, Tiësto, and ATB built entire arrangements around the JP-8000's supersaw character.
Software synthesis democratized access to dense unison detune in the 2000s. Native Instruments Massive (2007) brought the supersaw concept to a new generation of producers, with a detune parameter that could be modulated, automated, and processed in ways impossible on hardware. The subsequent decade saw detune become a central parameter in virtually every major software synthesizer. Serum (Xfer Records, 2014) in particular refined the concept with its Unison detune and Stack modes, allowing producers to stack up to 16 voices with sophisticated spread curves. The resulting aesthetic of dense, wide, aggressively detuned supersaws became the sonic signature of future bass, melodic dubstep, and hybrid trap, heard in the work of artists like Flume, Madeon, and Illenium throughout the 2010s.
Leads and supersaws represent the most aggressive application of detune in contemporary production. A supersaw lead patch typically uses 7–16 unison voices detuned between ±10 and ±30 cents with maximum or near-maximum stereo spread. The key production consideration is low-end phase cancellation: with many detuned voices summed at wide stereo spread, the sub-80 Hz region becomes acoustically unstable and will collapse unpredictably in mono. Standard practice is to apply a high-pass filter at 100–150 Hz to the detuned lead and layer a separate, monophonic bass element underneath. In Serum, the Stack mode allows an additional set of detuned copies to be tuned down an octave, which must be carefully level-balanced to avoid bass masking.
Pads and atmospheric textures use detune more conservatively — typically 2–12 cents with 2–4 voices — to achieve warmth and movement without the sharp, aggressive beating of supersaw leads. The goal here is a beating rate in the 0.3–2 Hz range that registers as organic life rather than deliberate effect. Many producers supplement light static detune with slow LFO modulation of the detune amount (0.05–0.2 Hz, 1–3 cents depth) to simulate the natural drift of vintage analog hardware. Hans Zimmer's synthesis team, for example, uses manually detuned oscillator clusters combined with custom LFO patterns to create the unstable, alive quality of pads in his cinematic scores.
Bass synthesis requires particular care with detune because low frequencies are especially sensitive to phase relationships. Even 3–5 cents of detune between two bass oscillators can create significant comb filtering in the 60–120 Hz range, causing the low end to thin out or swell unpredictably at different pitches. Many producers keep bass oscillators precisely in tune and introduce subtle detune only in the upper harmonics by using different detune values per oscillator with a crossover-style filter setup, or by applying chorus-style detuning only to a parallel high-pass signal above 200 Hz. Keeping at least one oscillator at 0 cents provides a stable fundamental anchor.
Vocals and acoustic doubling involve a closely related application: using pitch-shifting algorithms in effects processors (iZotope Nectar, Waves Doubler, SoundToys MicroShift) to create detuned copies of a vocal signal, typically ±8–20 cents, panned left and right. This is functionally identical to oscillator detune but applied to recorded audio. The critical difference is artifact character — a good doubler uses high-quality pitch-shifting with minimal formant distortion, while a poor one introduces metallic artifacts at high detune values. Producer Noah Goldstein has discussed using this technique on Frank Ocean's Blonde sessions to create width on lead vocals without recourse to traditional chorus.
Abstract knowledge becomes practical when you can hear it in music you know. These tracks demonstrate detune used intentionally, at specific moments, for specific purposes.
The opening lead of this record is one of the most recognized supersaw patches in trance history, built on the Roland JP-8000's SuperSAW oscillator with its characteristic 7-voice detune spread. Listen to how the detuning creates an almost vocal formant-like shimmer that sits above the bass without requiring additional harmonic processing. The patch remains clearly audible at full mix due to its spectral density — the wide detune fills the 800 Hz–4 kHz range with beating content that cuts through pads and kick. This record became a reference for trance lead synthesis and is still used to calibrate supersaw patches two decades later.
Flume's signature sound during the Skin era relied heavily on tightly controlled unison detune — approximately 4–8 cents across 4–6 voices — rather than the aggressive supersaw approach. The chorus lead here demonstrates how a moderate detune value on a brighter waveform creates perceived warmth without aggressive beating. Compare the chorus lead to the verse arpeggiated elements: the verse parts are noticeably tighter in pitch, emphasizing the emotional lift when the detuned chorus elements appear. The stereo width is set conservatively (approximately 50–60%) to maintain mono compatibility across festival PA systems.
This track illustrates detune used as a tool for instability and nostalgia rather than power or width. The synthesizer elements throughout Roygbiv exhibit slow, irregular pitch drift that suggests tape-machine instability or deliberately de-calibrated analog VCOs. The beating rates are inconsistent — some oscillator pairs drift at 0.2 Hz, others at nearly 2 Hz — creating the queasy, memory-like quality characteristic of BoC's aesthetic. The technique here is relevant because it demonstrates detune modulated by slow random LFOs rather than static values: no two bars of the pad have identical beating characteristics, preventing the ear from locking onto a repeating artifact.
Icarus features a lead synthesizer built on dense unison detune with approximately 8 voices and a wide spread — an explicit reference to the JP-8000 supersaw tradition but implemented in software (likely Sylenth1 or Massive based on contemporary interviews). The key production decision visible here is the application of sidechain compression to the lead against the kick, which periodically reduces the detuned lead's volume and creates a pumping interaction. This demonstrates how detune and sidechain work together: the brief volume reductions allow the mix's transients to punch through the spectrally dense detuned layer without requiring EQ cuts in the midrange.
The simplest form: a fixed pitch offset set manually on individual oscillators within a patch, held constant unless changed by the performer. Produces consistent beating at a rate determined by the offset and the played note's frequency. Used in vintage analog synthesis for warmth and ensemble thickness, and on digital-analog hybrid instruments to mask the clinical precision of digital oscillators.
Multiple oscillator copies (2–16 voices) distributed symmetrically above and below the fundamental pitch, typically with simultaneous stereo panning assignment. Produces the most spectrally dense and spatially wide result. The defining sound of trance, EDM, and cinematic synthesis from the mid-1990s onward. CPU or voice-count limitations in hardware implementations often forced creative compromises that became characteristic artifacts.
Detune amount modulated by a slow random or sample-and-hold LFO, simulating the thermal and mechanical instability of vintage VCOs. Produces an organic, unpredictable pitch movement that feels human and imperfect rather than mechanical. Central to the aesthetic of Berlin-school synthesis, lo-fi electronic music, and any production attempting to evoke pre-digital analog character. The Moog Subsequent 37's drift controls are a hardware implementation of this concept.
Detune applied within a wavetable or spectral synthesis context, where individual partials or wavetable frames can be detuned independently of the fundamental. Produces inharmonic, bell-like, or metallic textures depending on which partials are offset and by how much. Used in sound design for hybrid organic-synthetic textures — pad layers in film scoring, designed sound effects, and the evolving timbral character of progressive electronic music.
Applied at the effects-processing stage rather than in synthesis, creating detuned copies of an audio signal through short pitch-shifted delays. Produces a similar timbral thickening to oscillator detune but with different phase behavior and artifact character. Widely used on guitars, vocals, keyboards, and full mixes to add stereo width without changing the underlying synthesis architecture. The Roland Dimension D (1979) is considered a definitive implementation of this approach.