Thursday, February 16, 2012

Headphones . . . from an Engineer’s Perspective – Jeremy R. Kipnis

For better or worse, “Cans” (a colloquialism audio engineers use for headphones) are a part of almost everyone’s life each and every day. In particular, they are involved in the production of sound as heard on Albums, Movies, TV, Live Sports, and Video Games (to mention but a few). Before consumers ever hear any of their familiar program sources, a sound engineer has listened to them at length, made adjustments, equalized frequencies, compressed dynamics, added reverberation, sequenced songs, and mastered the audio, using headphones as one of the most critical parts of the entire production process. So is it any wonder that recordings sound so different from each other, and different again when heard on loudspeakers? What’s going on?

When “Cans” were first invented (at the beginning of the 20th Century), they were initially used by the U.S. Navy, and later as part of the telephone and radio industries. You probably have seen the old telephone earpieces that accompanied hand-crank telephones: basically an acoustic lens made of wood (or Bakelite) that amplified a tiny diaphragm moving in response to an incoming electrical signal. As long as voices were clear (a big if), the design was satisfactory. In fact, the basic sound quality of “Cans” remained largely unimproved, even through the early “Stereo” era of the late 1950s, when headphones came to be used to make high fidelity recordings.

* * * * * * * * * * * * * * * *

Over the years, I have listened to a lot of headphones, as well as a lot of speakers. In my pursuit of making extremely transparent “You Are There” holographic recordings, which usually feature just a single stereo microphone, I always refer to earlier great recordings in my collection. I use these as a reference against which to evaluate the new sounds I am capturing. A sound check involves many changes in the musicians’ positions as well as the height and distance of the microphone within the acoustic space, not to mention isolating and eliminating any stray source of sound or noise before the first take.

In my many years as a Tonmeister, audiophile recording producer, and engineer, I have found it necessary to utilize headphones as an important part of an album’s creative process. In fact, when making professional recordings of any kind, headphones must be an engineer’s primary tool; that is . . . right after the microphone(s), of course. Whether creating music albums simply using two-channel stereo, or capturing multichannel soundscapes for use in multimedia, the captured sound has, at some point, been evaluated and mixed using headphones as the primary monitoring tool. That is a large responsibility for what in most cases is either the cheapest or the most colored part of the recording process.

When I first began as a recording engineer in 1990, I was lucky enough to work with Bob Katz, Steve Guttenberg, and David & Norman Chesky of Chesky Records in New York City. The audiophile label’s goal was to create the illusion of live musicians within a real three-dimensional space: the impression of reality, basically. And I was systematically introduced to and taught how to utilize the most advanced recording technologies available at the time, along with careful microphone placement to achieve aural holography. I became part of a recording team that paid strict attention to every minute detail of the production. The result: a listening experience where each album is tangible, pleasurable, exciting, and realistic to experience. Amazingly, we would never have achieved this without the use of headphones (as well as loudspeakers, of course), and here is why.

The first part of any recording session involves choosing a location with an appropriate acoustic (room sound) that will suit the music and the musicians appearing on the album. Typically in our industry, a specially built recording studio with controllable walls and reverberation character is chosen, making the production process far more malleable, quieter, and more comfortable than on location, somewhere out in the world. Yet this can reduce the actual acoustic contribution of the room, since the ambience must largely be created artificially: a synthetic backdrop of sound. Any “Room Sound” is essentially lost in the mix of microphones (sometimes more than one for each instrument) and effects tracks. But … not for any of the Chesky Productions!

Accurate sound presentation was the gold standard, and I learned quickly how and why a single stereo Blumlein microphone (a coincident pair of figure-8 capsules) can capture exactly what the company’s charter specifies. With the microphone judiciously placed to hear both the musicians and the acoustic space in perfect proportion, Bob Katz would listen to its feed using his STAX headphones and make minute adjustments that produced uncanny sonic illusions. Even though the sound of the space and the timbre of the instruments are different on “Cans” than they are on Loudspeakers, Bob’s keen hearing could extract and localize all the specifics that need to be carefully balanced when making a recording. The rest of the team listened on both headphones and loudspeakers, drawn from a large collection of such audio transducers (each with its own colorations) that evolved through the four years that I worked there.

Several types of headphones were always on hand in order to closely monitor any variations in sound quality, from imaging and sound-staging issues to extraneous noises or even wrong notes. These “Cans” included the following (some of which are familiar brands):

1) STAX Lambda Signature, with either the stock Solid State or the Tube Amplifier combination

2) My personal set of SONY CD-999, with the Diamond Amorphous Heads – Woot Woot -

3) The RCA/BMG Studios ubiquitous SONY CD-6 (for musicians out in the studio)

4) The RCA/BMG AKG Acoustics K-240 Semi Open Studio Headphones (for engineers in the control room)

5) Original GRADO HP-1S with the Signature Ultra-Wide Bandwidth Reference Cable (owned by our RCA/BMG liaison engineer, Bill Allen).

6) Etymotic Research ER-4B (Part of my portable recording ensemble)

7) BeyerDynamic DT-990 (on loan from BeyerDynamic for testing by our engineering team)

Not surprisingly, none of these 1990 – 1994 era “Cans” sounds like the others! They are totally different in character from each other, from reality, and from speakers, in so many important ways. But they each offered a specialized opportunity to observe the sound that we were creating on multiple transducer platforms, as the various Chesky projects unfolded. Because the coloration in most headphones is so great, it is fair to say their choice is a matter of personal taste. But certain characteristics remain consistent from Headphone to Headphone, and (based on the requirements of making an album) these are my observations on those candidates, in order of listening preference:

A) STAX Lambda Signature Electrostatic Ear Speakers were the least colored through the mid-range of any of the candidates, revealing minute differences in microphone and musician placement as well as adjustments to the room’s size and the disposition of reflective versus absorptive surfaces. On the downside, dynamics were compressed, and the ability to play them really loudly was rarely there, except with Bob’s tweaked STAX Solid State amplifier – piping HOT!!!

B) GRADO HP-1S had an extremely well-balanced sound, with heart-throbbing palpability and gobs of Bass dynamics; they also threw a very large, wide, and deep sound-stage. This kind of Bass dynamics was almost totally absent from the STAX Earspeakers.

C) BEYERDYNAMIC DT-990 were a clear winner for their well-balanced, three-dimensional presentation. While not offering the delineation of musical information found in either the STAX or the GRADO, their overall clear sonic view of our work made them a necessity to listen to, particularly for resolving noise-related issues in the studio environment.

D) AKG K-240 do very little that is wrong, but they also sound veiled in both the treble and the mid-range, to the degree that instrumental timbres and percussion lose definition as the sound builds in intensity. What the AKG lacks in finesse, it makes up for by being relatively inexpensive and also very comfortable to wear for hours at a time, an important issue for a professional working well into the night.

E) The two SONY headphones sounded much more similar to each other than you might expect. A bit like listening down a cardboard paper towel roll, the sound from the more expensive CD-999 was full frequency (20 Hz – 20 kHz) and detailed, with a moderately sized sound-stage capable of delineating depth, but no height and little rear. While the less expensive CD-6 ended up squashing the sound-stage flat as a pancake, along with crushing the dynamics and lopping off both the top and bottom octaves, I nevertheless found them to sound better than any of the SONY Walkman headsets sold from the late 1970’s to the present.

F) Etymotic Research claimed flat in-ear response for their “B” (for Binaural) version of the ER-4 Ear Canal Speaker. And although certain details of mid-range and treble were indeed amazingly true sounding, there was a lack of true bass, causing a generally bright, thin sound that was not very audiophile. But I used them primarily with a SONY PCM Portable DAT system, and in that role they were better suited to fieldwork, where noise isolation and repeatability of high-quality playback under harsh environmental conditions are paramount.

Naturally, speakers produced yet another variation in the reproduced sound (and we had many, many speaker types on hand), but rarely did they create the intimacy that headphones achieved. Yet the palpable impression of musical objects imaging outside of your head is only a real possibility with a well set-up pair of speakers, and a keen ear will clearly detect nuances in height, width, and depth of the sound-stage from well-engineered recordings that are almost totally lost while listening with headphones. Unfortunately, no amount of processing has yet produced a complete loudspeaker listening experience from a pair of headphones, and so a great distinction in the presentation continues to exist between these two popular transducer types. Still, there will always be a place for headphones alongside a pair (or more) of speakers, and if you listen closely to the recommended recordings I’ve listed (below) while auditioning your next pair, you won’t be able to help picking the best ones, for you!


BIO: Jeremy R. Kipnis has been a producer and engineer of audiophile recordings since 1988. His label, Epiphany Recording Ltd., pioneered the first high-resolution Single Stereo Microphone recordings in the world at 192 kHz / 24-bits, early in 1994. This evolved into an internationally awarded A/V playback and editing room design, known as the Kipnis Studio Standard (KSS). He reviews audio, video, and home theater technology for many periodicals throughout the world, and is a frequent guest commentator on television and the internet regarding Ultimate Home Theater and Cinema Technologies. He has been awarded the Guinness World Records Award for "Most Technically Complex Home Theater System in the World" from 2009 - 2012.

SUGGESTED LISTENING: Chesky Records / Produced between 1990 and 1994


CD75 – The Virtuoso Scarlatti w/ Igor Kipnis

CD78 – Vivaldi: The Four Seasons w/ Igor Kipnis Conducting The Connecticut Early Festival Ensemble


JD49 – Clark Terry Live at the Village Gate

JD68 – Live at The Village Gate: The Second Set


JD55 – Because of You w/ Kenny Rankin




JD63 – Amazonia w/ Ana Caram


JD68 - Best Of Chesky Jazz And More Audiophile Tests Volume 2

Copyright - Kipnis Studios - 2012 ©



  1. I use both the AKG and Etymotic headphones used in this review...excellent write-up.

    I have found the Etymotic bass response to be completely dependent upon the seating and seal of the insert, and got the most consistent response out of the custom molds I had made. But even then, pressing on the edge of the jawbone in front of the ear always made the low octaves even better. That might be an anomaly of my physique; I've never asked any other Etymotic users about it.

    I still rely upon them for any and all sound capture accuracy confirmation, but do not trust the bass response until I put both index fingers to work pressing those magic spots. Which makes them hands-on headphones. I have flirted with making a tension-wire with those rubber nubs from old reflex hammers to let me be hands-free again!

    John Gannon

    1. I agree, John.

      My experience closely duplicates your own in that the Etymotic Research Ear Canal Speakers have a VERY unnatural bass presentation, even as the midrange and treble are extremely clean and free from coloration. I too use them for all my field work, and even 20 years later, my first set is still going strong and sounding substantially the same as it did when new (minus burn-in, of course)!

      And in spite of how VERY DIFFERENT Headphones, Canal Speakers, On-Ear Head Sets, and Speakers actually sound from each other, they all agree on what makes a GREAT sounding recording and what is simply NOT worth listening to; sonically or otherwise.

      Thanks for commenting :-D

      Cheers -


      Kipnis Studios - Ultimate Media Rooms
