
Not far from Copenhagen!
Note that I attended only a small fraction of the events out there, and I mostly attended the events that I thought would be personally interesting. If I missed something you thought was critical, don't be surprised: I missed some things that I thought were critical too, but I couldn't be in two places at once. So, here's what I personally found of interest:
Distortion
Steve Temme from Listen gave a tutorial called "Distortion Measurements: Can we Measure What We Hear?" He has given this same tutorial a few times before at conferences, but nowhere near often enough, because everyone needs to see it. There is such a split in the audiophile community between people who are obsessed with objective measurements and those who are obsessed with listening tests, and that split is harmful to everyone. Temme talks about how measurements can be correlated with listening so we wind up with objective measurements that mean something useful. Too few people seem to understand that this is possible, let alone how absolutely critical it is. I can only hope he gives the same presentation at the next AES show in Nashville.
Nonlinearity in loudspeakers comes from a number of places, both magnetic and mechanical. The nonlinearities in the magnetic circuit arise from both eddy currents in the conductive steel magnet assembly and from the nonlinear permeability of that steel. In _Reduction of Mid-to-High-Frequency Distortion in Loudspeakers through Structural Magnetic Circuit Modification_, He Xao and others from the Dynaudio Lab in Copenhagen create a COMSOL software model of the loudspeaker system and investigate opportunities to reduce both problems by making structural changes while using the same material. Convention Express Paper 10273.
Franz Heuchel and Finn Agerkvist looked at distortion from the opposite direction, in "Input-output linearization of loudspeaker dynamics via automatic differentiation." There's a method called "input-output linearization" which can mathematically reverse some kinds of nonlinear transforms. Back in the late 1990s Wolfgang Klippel described using it for loudspeaker distortion reduction. It can compensate for some things that are time-invariant, although it can't compensate for anything where information is lost, such as dead bands. This method requires setting up a system of differential equations to emulate the nonlinearity, and this paper shows a faster and easier automated method of doing so. Hopefully this will cause more speaker manufacturers to look at these methods. Convention Paper 10307.
Nonlinearity in headphones is basically the same kind of problem, but it's harder to subjectively estimate headphone distortion. In "The Perception and Measurement of Nonlinear Distortion" by Sean Olive, listeners were given a number of tasks. They listened to various short musical pieces on six different headphones and identified which of the two examples in a pair had been artificially distorted, in order to find the lowest detectable threshold. They compared different headphones playing back the same samples to determine which were most similar. And they ran a more complex comparison test in which only the highest-rated and poorest-rated headphones from the similarity comparison were compared more aggressively.
Not only were levels very precisely-controlled but the headphones were carefully equalized so they all followed the same frequency response. We know that headphones with identical response curves can sound completely different, and that's a sign that nonlinearity is an issue.
They then compared the results of these tests with various signal measurements made on the headphones themselves. The detection threshold test actually showed the best correlation with good old-fashioned THD measurements, while non-coherent distortion tests with musical signals gave much better correlation with the similarity test. Various other distortion measurements including classic IMD and DFD were also used and surprisingly correlated more poorly with listening.
The author concludes that nonlinear distortion is not important at normal playback levels, while I'd claim that his results show exactly the opposite. But you can read the paper and see the data and decide for yourself in Convention Paper 10263.
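For readers who have never computed one by hand, the "good old-fashioned THD" figure mentioned above is just the ratio of the energy in the harmonics to the level of the fundamental. The following is a minimal sketch and not the measurement procedure used in the paper: the record length, the test tone, and the small squared-term nonlinearity are all assumptions chosen purely for illustration.

```python
import math

def harmonic_amplitude(signal, cycles):
    """Amplitude of the component that completes `cycles` whole cycles in the record."""
    n = len(signal)
    re = sum(x * math.cos(2 * math.pi * cycles * i / n) for i, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * cycles * i / n) for i, x in enumerate(signal))
    return 2.0 * math.hypot(re, im) / n

def thd(signal, fundamental_cycles, max_harmonic=5):
    """Total harmonic distortion: RMS sum of harmonics 2..max_harmonic over the fundamental."""
    fundamental = harmonic_amplitude(signal, fundamental_cycles)
    harmonics = [harmonic_amplitude(signal, h * fundamental_cycles)
                 for h in range(2, max_harmonic + 1)]
    return math.sqrt(sum(a * a for a in harmonics)) / fundamental

# Hypothetical test: a pure tone with a whole number of cycles in the record,
# passed through a made-up nonlinearity y = x + 0.01 * x^2.
N = 4096
CYCLES = 64
sine = [math.sin(2 * math.pi * CYCLES * i / N) for i in range(N)]
distorted = [x + 0.01 * x * x for x in sine]

print("THD of distorted tone: %.3f%%" % (100.0 * thd(distorted, CYCLES)))
```

Because the squared term turns a unit sine into a second harmonic at half the coefficient, this toy device measures about 0.5% THD. A number like that says nothing by itself about audibility, which is exactly why correlating it with listening tests matters.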
John Vanderkooy, emeritus from the University of Waterloo, had a very interesting poster on "Modulation Noise in Tape Recording." Note that people use the phrase "modulation noise" to mean a lot of different things, including bias rocks (which are 1/f uncorrelated noise). In this case, he is looking at all of the correlated noise and distortion products that appear when a pure sine wave is recorded. Since so much of this consists of sidebands near that sine wave, his first thought was that it was mostly flutter (and mostly scrape flutter from the tape going across the head), but careful FFT analysis showed the spectrum wasn't right for that. He suggested that the correlated noise we see is the sum of several different mechanisms but that much of it is the result of inhomogeneity in the tape itself, and he did some tests that support that. I find it ironic that people now enjoy tape recording effects that we tried for years to eliminate, but I am pleased that because of this people are bringing modern high-resolution instrumentation to bear on these effects in ways that could not have been done before. This paper does not seem to be available on the AES website for some reason.
Noise Removal
Automatic transcription algorithms for medical use are increasingly important because doctors have illegible handwriting and manual transcription is slow and requires skilled transcriptionists. Accuracy is critical; you don't want to go in for a nose job and get a lobotomy because the software mistook "nose" for "lobe." In "The Ambisonic Denoising Paradox: U-Net Processing Degrades ASR Transcription Quality for Medical Speech," Szymon Zaporowski and Bartlomej Mroz from Gdansk University of Technology took ambisonic recordings made in medical facilities, converted them to mono, and then ran them through Polish voice-to-text transcription. In an attempt to improve performance in noisy environments, they then applied U-Net noise reduction processing to the ambisonic recordings before converting to mono and transcribing. The idea here is that a noise reduction system able to take spatial effects into account could remove localized noise sources without affecting the speech... but in fact the actual results were the opposite of what was expected. The noise reduction degraded the transcription quality. Why? The authors propose some methods to find out. Convention Express Paper 451.
Microphones
One of the technical tours this year was up to the DPA microphone factory, and it was a somewhat limited but still excellent facility tour. I only wish that Eddy Brixen from DPA could have been there, but he was out helping set up the convention for the AES. I'll talk about this tour more in a future article, so stand by. DPA did have a booth at the show demonstrating their microphones, as did Bruel and Kjaer (now HBK), maker of fine audio instrumentation including measurement mikes.
I did see Martins Saulspurens, one of the grand old men of microphone design and one of the two founders of Blue microphones from back in the days when they made some fine products. He's running a recording studio in Riga now and is always great to talk to because he's always doing something different and fascinating.

Student Design Competition Judges
Every year the Saul Walker Student Design Competition brings in students from all around the world to show their personal designs, both hardware and software. I'd like to point out one of the entries here. It wasn't one of the winners, but it definitely should be mentioned. Marcelo Wilberger from the Instituto Terciero Tamaba in Argentina built a ribbon microphone. Not the best ribbon microphone, but he designed the magnet assembly using off-the-shelf magnets, hammered the ribbon, and wound the transformer himself. His machine shop skills were clearly not the best but I bet they got a lot better in the process of doing this. This wasn't the best project but I bet it was the most educational one and it should be called out as such. The student does deserve a prize and probably it should be a copy of Harry Olson's book.

Marcelo Wilberger
Speakers
Juha Backman from Finland gave a tutorial called "The Roaring Twenties-- The First Decade of Consumer Loudspeakers." All three of the most popular speaker driver designs: the cone speaker, the horn with compression driver, and the electrostatic speaker were designed in a very short period of time. There was an economic boom, mains electricity was becoming available all over the US and Europe and this spurred development of speakers in all directions. It's interesting to see how many of these designs get some important things right while getting things that seem obvious today completely wrong. This was an enjoyable talk and I wish he would write it up.
He also mentioned a reference to a speech PA system being used at the 1920 Olympics in Stockholm but could find no detailed information about it. If anyone has any knowledge about this it would be appreciated.
In a more up to date discussion, distributed mode loudspeakers were a big subject of research at the AES conferences back in the nineties but interest fell off in part due to difficulties of getting flat response. In _Sound Diffusion Properties of a Bending-Wave Loudspeaker Compared with a Conventional Speaker_, Rina Mizukami and Kazuhiko Kawahara from Kyushu University look at the radiation pattern and how the DML acts much less like a point source and much more like a distant and randomized sound source, and how this affects imaging. So much work still needs to be done to make the DML practical for hi-fi applications but this is a step toward that. Convention Express Paper 10270.

Rina Mizukami with her poster
In the old days, horn-loaded bass speakers were a big deal, and by stacking horn speakers together (like the Altec A1 and A2) the low frequency corner and the speaker efficiency can be improved over that of a single bass horn. In "Mutual coupling investigation of bass horn loaded speakers," Aurelian Botau from Resound replicates Harry F. Olson's 1930s experiment on multiple bass horn arrays. He gets less of a benefit than the theoretically calculated value, possibly due to losses from not having sealed the cabinets together with gaff tape, which is the industry standard method for setting up improvised linked cabinets. It's interesting to see how he has done this and how he has looked at electrical impedance. Convention Express Paper 408.
Acoustic lenses have been used for ultrasonic frequencies for a long time; they employ some medium that slows down sound, the way glass slows light, in order to direct it. You can focus it down or spread it out, but at lower frequencies lenses become difficult... and by lower I mean audible frequencies. In _Design and Optimization of Acoustic Lenses for Audible Frequency_, Jadwiga Hyla and Jaroslaw Rubacha investigate two different methods for making acoustic lenses for focusing 1700 Hz narrowband signals, showing the various pattern effects. Unfortunately they don't discuss what happens at adjacent frequencies or look at the pattern with respect to frequency, but it's still a good look at a technique. AES Convention Express Paper 431.
L-Acoustics didn't invent line array speakers for sound reinforcement applications but they certainly took them into the new century with digital controls and steering. These systems have some inherent issues such as comb filtering but with proper steering most of the problems can be moved to locations higher or lower than where people are listening. This, however, requires very precise pattern control and therefore timing.
In "Acoustic and Perceptual Consequences of Time Misalignments in Line Array Speakers," Nicolas Epain and Etienne Corteel from L-Acoustics look at what happens when there are very small timing differences between the DACs driving each one of the speaker cabinets, and how very small clock issues on a single cabinet can alter the pattern enough to cause real problems. How much? It's described in the paper which as of this writing is not available on the AES website but hopefully will be soon.
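To get a rough feel for why timing matters so much, here is a minimal sketch of the far-field response of a vertical line of sources when one of them is driven slightly late. It is not the model used in the paper: the cabinets are idealized as point sources, and the element count, spacing, frequency, and the 100 microsecond offset are all hypothetical values picked for illustration.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value

def array_response(freq_hz, angle_deg, spacing_m, delays_s):
    """Far-field magnitude of a line of ideal point sources.

    delays_s[n] is the extra electrical delay applied to element n.
    An aligned array of N elements returns N on axis.
    """
    omega = 2.0 * math.pi * freq_hz
    sin_theta = math.sin(math.radians(angle_deg))
    total = 0j
    for n, delay in enumerate(delays_s):
        # Path-length advance toward the listening angle, minus the electrical delay.
        acoustic_advance = n * spacing_m * sin_theta / SPEED_OF_SOUND
        total += cmath.exp(1j * omega * (acoustic_advance - delay))
    return abs(total)

def level_db(magnitude, n_elements):
    """Level relative to a perfectly summed array, floored to avoid log(0) in nulls."""
    return 20.0 * math.log10(max(magnitude, 1e-12) / n_elements)

N_ELEMENTS = 8
SPACING = 0.35                      # metres between acoustic centres (hypothetical)
aligned = [0.0] * N_ELEMENTS
one_late = list(aligned)
one_late[3] = 100e-6                # one cabinet 100 microseconds late (hypothetical)

for angle in (-10, -5, 0, 5, 10):
    ref = level_db(array_response(2000.0, angle, SPACING, aligned), N_ELEMENTS)
    err = level_db(array_response(2000.0, angle, SPACING, one_late), N_ELEMENTS)
    print("%4d deg  aligned %7.2f dB  one late %7.2f dB" % (angle, ref, err))
```

Even in this toy model the off-axis lobes move around when a single element is late, and the effect grows with frequency because a fixed time error is a larger fraction of a cycle.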
Headphones
One problem with headphones is that they all sound different to different people, because people are used to listening to live sound with their own pinnae and shoulders in place and no two people have them in exactly the same positions. Another is that people can't really decide what headphones are supposed to sound like, because people use them for different applications. In "A New Reference Target Curve for Studio Headphones," Jonas Foerster and Lukas Keppler evaluate studio control rooms with a binaural dummy head and get a frequency response that is characteristic of both the rooms and the dummy head. They found a few of the control rooms tested to be outliers but most to be within a fairly small range, and then they set up a headphone curve that is intended to model that. Now, there are still likely to be differences because your body is not the same as that of the dummy head, but the basic goal of trying to emulate the character of a control room over a pair of headphones is a reasonable one. If only we could emulate the stereo imaging as well! This paper does not appear to be available on the AES website.
We did have a headphone vendor too! Audeze had a booth and the show floor was small and quiet enough that you could actually evaluate their headphones and talk to headphone engineers! This is the kind of thing that I love about this show and even though I'm not likely to buy a pair of their high end LCD-X headphones, listening to them made me understand why people do.
Acoustics
As I said before there were very few vendors, but one of the ones that we did have was the Narrowband Absorber Company, Ltd. This is a British company that makes a portable box with a Helmholtz resonator that provides narrowband absorption in the 63 Hz range, combined with some amount of diffusion, for people who might be working in improvised studios or working in different studios with problems they want to mitigate on their own. It may be a big help for people setting control rooms up in dressing rooms and backstage bathrooms, and it might be a useful tool for small home listening rooms as well.
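For a sense of the scale involved, the textbook lumped-element formula for a Helmholtz resonator can be sketched in a few lines. This says nothing about how the commercial product is actually built: the neck dimensions below are hypothetical, and the end correction of 1.7 times the neck radius is a common approximation for a neck flanged at both ends, used here as an assumption.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value

def effective_neck_length(neck_radius_m, neck_length_m, end_correction=1.7):
    """Physical neck length plus an assumed end correction proportional to the radius."""
    return neck_length_m + end_correction * neck_radius_m

def helmholtz_frequency(volume_m3, neck_radius_m, neck_length_m, end_correction=1.7):
    """Resonant frequency f = (c / 2 pi) * sqrt(A / (V * L_eff)) of an ideal resonator."""
    area = math.pi * neck_radius_m ** 2
    l_eff = effective_neck_length(neck_radius_m, neck_length_m, end_correction)
    return (SPEED_OF_SOUND / (2.0 * math.pi)) * math.sqrt(area / (volume_m3 * l_eff))

def volume_for_frequency(target_hz, neck_radius_m, neck_length_m, end_correction=1.7):
    """Cavity volume needed to tune the same neck to target_hz (the formula inverted)."""
    area = math.pi * neck_radius_m ** 2
    l_eff = effective_neck_length(neck_radius_m, neck_length_m, end_correction)
    return area / (l_eff * (2.0 * math.pi * target_hz / SPEED_OF_SOUND) ** 2)

# Hypothetical neck: 5 cm radius, 10 cm long, tuned to 63 Hz.
volume = volume_for_frequency(63.0, 0.05, 0.10)
print("Cavity volume for 63 Hz: %.1f litres" % (volume * 1000.0))
```

The point of the exercise is only that a resonator tuned this low needs a fair amount of enclosed air, which is why a portable box for the job is an interesting product.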
Massform was also showing some beautiful sculptural acoustic panels, including absorbers and diffusers and combined devices. With only a handful of vendors, fully 20% of them were selling acoustical control devices, which just goes to show how important room treatment is and how much interest there is in it.
A company called treble.tech was showing an acoustical simulation program which they were promoting for use in teaching machine learning and AI systems about acoustical spaces as well as for sound reinforcement design engineers to help predict the sound expected before a system is set up in a room.
There were a few papers on acoustics but there was also an interesting tutorial given by Marcel Kok called "Drone-Based Class 1 Sound Level Measurements for Three-Dimensional Characterization of Outdoor PA Systems." He was flying drones in circles around a stage to determine the radiation pattern from the stage at different frequencies and altitudes. The goal of a sound system at an outdoor festival is to get sound to the audience and to avoid leaking sound into other stages or nearby neighborhoods, and without measurement how can you be sure this is working? Ground measurements are good but don't give you the full picture of the radiated sound field. The author was using a relatively noisy drone and so suspended the microphone 30 feet below the craft itself on a long cable. This gave him sufficient isolation to get good measurements above 315 Hz, but low frequency measurements weren't possible. Even so, the long cable meant that the drone had to move slowly so that the cable would hang completely vertical and the actual location of the microphone could be set precisely. This would have made a good paper, and although I don't think this is a very practical method yet, it's a good first step toward one. I'd love to see more work being done here.
Perception
In "The efficacy of phantom image perception: an active listener perspective," Wesley Bulla and Song Hui Chon reproduced the classic experiment where audio is played back alternately through an unidentifiable speaker somewhere in front of the listener and then through a pair of stereo speakers, and the listener is asked to adjust a panning control to match the position of the phantom image to that of the hidden speaker. In addition to this, though, they tested height imaging as well.
They did this over the course of many years with students, as part of a class in listening and perception, so in the process they got an extensive dataset of measurements.
They got the conventional results with the horizontal panning, where the phantom images were blurred. This may have been the result of using only intensity stereo with no phase differences between channels. But in the vertical trials the phantom placements seemed almost random, showing that our mechanism for height detection, which involves pinna shading and the major reflection from the shoulders, cannot be fooled in the way that stereo playback fools horizontal localization. Another interesting thing is that it became very clear that training dramatically improved performance on this task. Convention Express Paper 453.

Wesley Bulla talks about Phantom Imaging
Electronics
Once again the AES has put together an Audio Design Roundtable with a group of designers (this time Jamie Angus-Whiteoak, George Massenburg, and Christoph Thompson) to talk about audio design and answer questions from the audience. The discussion was interesting and spirited, with a lot of debate about what is important and what isn't. This is always a fun session at every AES show.
There were a few papers out there about electronics as well, but not many. Today we live in the 1-bit converter world where PCM audio is translated into a high frequency stream of single bits that can be integrated in time to form analogue audio. In _Systematization of Multiplier-Less Convolution for 1-bit Audio Signal_, Yuti Goma and others from Waseda University demonstrate a simplified version of that process. The good news is that the translation from PCM data to an SBM bitstream is done with just shifts and adds and no multiplication is needed, which greatly simplifies the digital electronics. The bad news is that it's not an SBM bitstream of optimal length, so the converter needs to run much faster than it would with a more optimized digital filter. This is trading off complexity for speed, and for many DAC applications that is a good tradeoff. Convention Paper 10299.
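For anyone who hasn't seen the 1-bit idea spelled out, here is a minimal sketch of a generic first-order delta-sigma modulator followed by a crude running average. It is not the scheme from the paper: the modulator order, the test signal, and the averaging window are assumptions for illustration. It does show the general flavour, though: once the signal is a stream of +1 and -1 values, filtering it needs only additions and subtractions.

```python
import math

def delta_sigma_1bit(samples):
    """Generic first-order delta-sigma modulator: samples in [-1, 1] to a +1/-1 stream."""
    integrator = 0.0
    previous_bit = 0.0
    bits = []
    for x in samples:
        # Accumulate the error between the input and the last output bit.
        integrator += x - previous_bit
        bit = 1 if integrator >= 0.0 else -1
        bits.append(bit)
        previous_bit = bit
    return bits

def moving_average(bits, window):
    """Crude reconstruction: a running sum built from adds and subtracts only.

    The final division is by a constant; with a power-of-two window it is a shift.
    """
    out = []
    running = 0
    for i, b in enumerate(bits):
        running += b
        if i >= window:
            running -= bits[i - window]
        out.append(running / window)
    return out

# Hypothetical heavily oversampled test signal: one slow cycle of a half-scale sine.
N = 4096
pcm = [0.5 * math.sin(2.0 * math.pi * i / N) for i in range(N)]
bitstream = delta_sigma_1bit(pcm)
recovered = moving_average(bitstream, 64)
print("peak of recovered signal: %.2f" % max(recovered))
```

A real converter would use a higher-order modulator and a much better reconstruction filter, but the tradeoff between filter complexity and bit rate is visible even here: a longer window gives a smoother result and needs more bits per output sample.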

Yoshitaka Hamasaki, immersive sound guru
Immersive Audio
Immersive audio, to oversimplify, is surround with height cues added, and there was an excellent demonstration system provided by the people at Genelec in one of the rooms. This system had 11 channels of audio playback (plus LFE) and was used for a number of things including a series of 3D masterclasses in which various producers who have been working in immersive formats gave talks about their working methods and how they build immersive mixes.
I attended a number of the 3D masterclasses and I am sorry to say that I felt almost all of the examples were too ping-pong sounding. I didn't feel the sense of a hall around me in any of the classical recordings and sadly some of the non-classical stuff had such an unrealistic perspective that it made my head hurt. One set of examples had everything sounding close and in your face but it also sounded like it was all around you. Bing Lin said that I had been damaged too badly by terrible quadrophonic mixes in the seventies and haven't recovered, which may be true.
However, I will put a word in for the masterclass given by Florian Camerer, who made outdoor nature recordings using various minimalist microphone arrays, all with one microphone per channel and with some steering between speakers but never panning between channels that were not adjacent. His recordings did give a good sense of being outdoors and were delightfully realistic. Is it because it's hard to reproduce the room itself and outdoor recordings don't have those issues? Is it just because of the method or the material? I have no idea. I can't help but like the idea of immersive recordings because I like the whole idea of sitting down to a recording as an experience, but I so seldom hear anything that makes me feel like I am in a real place.
And, "Altering the Immersive Potential: The Case of the Heilung Concert at Roskilde Festival" showed that immersive audio systems have uses outside of sound reproduction. The band Heilung set up an immersive PA system for the band itself, so that height cues and location cues around the room could be used to envelop the audience in sound. Now, I am usually skeptical about this sort of thing because I think it too often detracts from the music and from the concert experience itself. What I found interesting here is that the author very specifically made the point that the immersive format was there to serve the music and increase engagement with the music, and this is what too many people miss. I'm still not sure if I like the idea but I've not actually heard the concert so who can tell? Unfortunately this paper is not currently available on the AES website but hopefully they will get the website fixed so that you can read it for yourself.
On the small show floor, Areal was showing their Upmix Engine, a device that simulates an immersive environment given a stereo signal. It makes no attempt to extract stems but analyzes the spectral content of a mix and pans different ranges across speakers. I didn't have a good chance to try the system and I remain a bit skeptical as I am still reeling from the "Electronically Rechanneled for Stereo" processes of old, but I'd be very interested in hearing this in a better environment and hoping that maybe I could do so at the next AES show in Nashville.
Odd Things
Museums show us art, science, and history in their own isolated context and some of that context is sound. It might be music, it might be HVAC noise, and it might just be echoing footfalls, but all of these affect the way we look at what is presented. In "The Cognition of Sound in Museums: Toward a Spectrum of Meanings," Alcina Cortez from the NOVA University in Lisbon tried to systematize this and describe it. This is a philosophical discussion of sound, and not something you normally would expect at the AES show, so it was an interesting change. Convention Express Paper 462.

Labros Vasileiou with his poster
Another very cool thing that I can't place in any of the normal categories was "Audio data augmentation techniques for frame drum stroke recognition" by Labros Vasileiou and others from the Aristotle University of Thessaloniki and Pagonis Percussions. Automatic music transcription is a big deal today because it's so valuable to be able to take a recording and turn it into a score or into a MIDI file. A lot of work has gone into automated transcription of tonal instruments like pianos and fiddles, but doing it for percussion is more difficult because it involves separating sounds not so much by tone as by envelope. How can you tell if a sound is a direct strike on the drum or a tap near the outer rim? What if it's a scrape? And how can you do this when there's a guitar on the same track? The authors train some convolutional neural networks on various signals and then get reasonable results when trying to take apart new recordings. Convention Express Paper 421.
Among the exhibitors, Comsol was showing off their software for combined mechanical and electromagnetic design simulation. This software has become very valuable for designers of loudspeaker drivers and microphones of varying sorts.
And, on the other end of things, Listen was showing off audio measurement systems that let those designers determine what their physical hardware is actually doing so they can both validate models and so they can determine if changes are really improvements or not.

After spending a few days in Sweden I was starting to get used to the sound of Swedish and the odd mix of words from different roots, but then I took the train down to Copenhagen for the conference and everything was different. The sounds were all different and the written language was just different enough to be confusing.
I learned that CO2 is called "Kuldioxid" in Denmark, probably from the same root as the English word "coal."
"Low Floor Coach" trains are called "Lavgulsvogn" which sounds like it should be a Scotch whiskey but isn't.
And the streets at the university were named after the departments on those streets. The EE department was on "Elektrovej" which seems to me to be a great name for a prog rock band.
Most disturbing quote overheard at the show: "(Harry) Olson's book is hard. Nobody learns that kind of math anymore."
The castle at Elsinore was only a few stops away from the conference site, so I went up and walked in what would have been the footsteps of Hamlet if he had not been a fictional character. The whole area north of the city was beautiful although I don't think I'd want to visit in the winter.
I came back severely jet-lagged to discover that in my absence boards had been failing quality assurance tests because Texas Instruments had replaced their NE5532 opamps with a completely different part with different characteristics, sold under the same name. This was done without informing customers, without any warning at all, and without any opportunity to get a last-time buy of the correct and working part. I wish I had known a bit earlier or I would have brought it up at the design roundtable.
But on the whole this was an excellent show and it is good to see the folks at the AES doing most of it in-house again.























