“Performing a simple experiment where we have 5 separate components:
- 1000 Hz sine probe 57 dB SPL
- 750 Hz sine masker A at 71 dB SPL
- 800 Hz sine masker B at 71 dB SPL
- 850 Hz sine masker C at 67 dB SPL
- 900 Hz sine masker D at 65 dB SPL
I record the following data:
When playing probe + masker A through D individually I experience the probe approximately as intensely as a 1000 Hz tone at 53 dB SPL. When playing probe + all maskers I experience the probe approximately as intensely as a 1000 Hz tone at 48 dB SPL.”
I would be very interested in understanding more about their testing methodology and hardware setup especially.
Is the perceiver a trained listener? Are they using headphones or speakers or some other transducer method?
It's awfully difficult to say that tones at different frequencies have an equivalent perceived level, even as a trained listener, especially given that every listening setup has a different frequency response.
The average user has no chance; hence my curiosity about their specific credentials, considering they’re building an entirely new perceptual model on that basis.
It would be nice to see ELI5 explanations for items like this, akin to Monty's 'A Digital Media Primer for Geeks' ( https://people.xiph.org/~xiphmont/demo/#:~:text=Xiph )
Is this a proposal without experimental verification?
- My understanding is that a gammachirp is the established filter to use for an auditory filter bank--any reason you chose an elliptic filter instead?
- I didn't look too closely, but it seems like you are analyzing the output of the filter bank as real numbers. I highly recommend you convolve with a complex representation of the filter and keep all of the math in the complex domain until you collapse to loudness.
- I'd not bucket to discrete 100 Hz time slices; instead, just convolve the temporal masking function with the full time resolution of the filter bank output.
- You may want a volume normalization step, so that the final score is the minimum Zimtohrli distance between A and B*x, where x is a free variable for volume. Otherwise, a perceptual codec that just tends to make things a bit quieter might get a bad score.
- For Fletcher-Munson, I assume you are just using a curve at a high-ish volume? If so, good :)
- Not sure how you are spacing filter bank center frequencies relative to ERB size, but I'd recommend oversampling by a factor of 2-3. (That is, a few filters per ERB).
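
To make the volume normalization and ERB oversampling suggestions a bit more concrete, here is a rough Python sketch. It is not taken from Zimtohrli. The function names, the +/-12 dB search range, and the `distance` callable are all hypothetical placeholders. The ERB-number formula is the Glasberg and Moore (1990) approximation, and the gain search assumes the metric is unimodal in gain, which may not hold for a real perceptual metric.

```python
import math
from typing import Callable, List, Sequence, Tuple


def erb_number(freq_hz: float) -> float:
    """ERB-number scale (Glasberg & Moore 1990 approximation)."""
    return 21.4 * math.log10(4.37 * freq_hz / 1000.0 + 1.0)


def erb_number_to_hz(erbs: float) -> float:
    """Inverse of erb_number()."""
    return (10.0 ** (erbs / 21.4) - 1.0) * 1000.0 / 4.37


def center_frequencies(low_hz: float, high_hz: float,
                       filters_per_erb: float = 2.0) -> List[float]:
    """Center frequencies spaced 1/filters_per_erb apart on the ERB scale."""
    lo, hi = erb_number(low_hz), erb_number(high_hz)
    count = int(math.floor((hi - lo) * filters_per_erb)) + 1
    return [erb_number_to_hz(lo + i / filters_per_erb) for i in range(count)]


def gain_normalized_distance(
    distance: Callable[[Sequence[float], Sequence[float]], float],
    a: Sequence[float],
    b: Sequence[float],
    min_db: float = -12.0,   # assumed search range, not from the project
    max_db: float = 12.0,
    tol_db: float = 0.01,
) -> Tuple[float, float]:
    """Minimize distance(a, b * x) over the gain x.

    Golden-section search over the gain in dB. This assumes the metric
    has a single minimum inside the search range.
    Returns (minimum distance, best gain in dB).
    """
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0

    def cost(gain_db: float) -> float:
        gain = 10.0 ** (gain_db / 20.0)
        return distance(a, [gain * s for s in b])

    lo, hi = min_db, max_db
    while hi - lo > tol_db:
        mid1 = hi - inv_phi * (hi - lo)
        mid2 = lo + inv_phi * (hi - lo)
        if cost(mid1) < cost(mid2):
            hi = mid2   # minimum lies in [lo, mid2]
        else:
            lo = mid1   # minimum lies in [mid1, hi]
    best_db = (lo + hi) / 2.0
    return cost(best_db), best_db


def squared_error(x: Sequence[float], y: Sequence[float]) -> float:
    """Toy stand-in for a perceptual metric, for illustration only."""
    return sum((p - q) ** 2 for p, q in zip(x, y))
```

With the toy `squared_error` metric, a signal compared against a copy of itself at half amplitude should come out with a distance near zero and a best gain near +6 dB, which is the behavior you'd want so that a slightly quieter codec isn't penalized for level alone.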
Apologies if any of these are off base--I just took a quick look.