Author Topic: Waveform display and zoom rendering  (Read 6877 times)

Sergex

Waveform display and zoom rendering
« on: October 12, 2011, 13:08:00 »
Hello,

I am very interested in learning how to properly program a waveform display for audio editing software, much like the one here in SV. I am already somewhat familiar with the Qt GUI framework. So far I have been able to load a file and display its waveform. However, when I zoom, my function basically just scales the image in width, so it stretches the image and the resolution becomes very poor.
I would like to know how to make it so that when I zoom in, the level of detail stays good, as it does in SV. What is the concept behind this kind of implementation? Does it have something to do with mip-mapping perhaps?
If someone could please help me get on track with this, or point me to the relevant sections of the SV source code, I would really appreciate it, because going into the code is rather difficult when I don't understand the concept behind it.
Thanks for your help in advance!

cannam

Re: Waveform display and zoom rendering
« Reply #1 on: October 17, 2011, 12:59:59 »
Quote from: Sergex
I would like to know how to make it so that when I zoom in, the level of detail stays good, as it does in SV. What is the concept behind this kind of implementation?

The main idea is that each pixel column on the horizontal axis should show the peak sample value for the range of audio samples that pixel covers.  In most cases, peak values are more useful and recognisable to the user than averages, and since the peak of a range of peak values is the same as the peak of the underlying samples covered by that range, you can generate a peak cache at an intermediate resolution and refer to it whenever the view is zoomed out to that resolution or further.
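
For concreteness, here is a minimal sketch of what "the peak for a range" means -- not SV's own code, and assuming mono float samples with the peak taken as the largest absolute value in the range:

    #include <algorithm>
    #include <cmath>
    #include <cstddef>
    #include <vector>

    // Peak (largest absolute sample value) over the half-open range [start, end).
    float peakOver(const std::vector<float> &samples, std::size_t start, std::size_t end)
    {
        float peak = 0.f;
        for (std::size_t i = start; i < end && i < samples.size(); ++i) {
            peak = std::max(peak, std::fabs(samples[i]));
        }
        return peak;
    }

In practice you may prefer to keep separate minimum and maximum values per range so the waveform can be drawn above and below the zero line; the same "peak of peaks" property holds for those too (the min of mins is the min, the max of maxes is the max).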

Many audio editors restrict the permissible zoom levels to power-of-two samples-per-pixel.  This means you can scan the audio file when it is first loaded and generate a peak cache at (say) 64 samples per cache element (meaning that each value in the cache represents the peak value found in a contiguous 64-sample range of the file).  Then, in order to display a waveform at (say) 256 samples per pixel, it is necessary only to scan the cache and take the peak of each consecutive set of 4 cache values to generate each pixel level.  (If the user zooms in closer than 64 samples per pixel, you just read the relevant section of the audio file directly instead.)
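
As a rough sketch of that scheme (illustrative names only, not taken from the SV source; assumes mono float samples, a 64-sample cache block, and peak = largest absolute value):

    #include <algorithm>
    #include <cmath>
    #include <cstddef>
    #include <vector>

    // One pass at load time: one peak value per 64-sample block of the file.
    std::vector<float> buildPeakCache(const std::vector<float> &samples,
                                      std::size_t blockSize = 64)
    {
        std::vector<float> cache;
        for (std::size_t i = 0; i < samples.size(); i += blockSize) {
            std::size_t end = std::min(samples.size(), i + blockSize);
            float peak = 0.f;
            for (std::size_t j = i; j < end; ++j) {
                peak = std::max(peak, std::fabs(samples[j]));
            }
            cache.push_back(peak);
        }
        return cache;
    }

    // Drawing at, say, 256 samples per pixel: each pixel is the peak of
    // 256 / 64 = 4 consecutive cache values.
    float pixelValue(const std::vector<float> &cache, std::size_t pixel,
                     std::size_t samplesPerPixel, std::size_t blockSize = 64)
    {
        std::size_t perPixel = samplesPerPixel / blockSize;
        std::size_t start = pixel * perPixel;
        float peak = 0.f;
        for (std::size_t k = start; k < start + perPixel && k < cache.size(); ++k) {
            peak = std::max(peak, cache[k]);
        }
        return peak;
    }

Building the cache is a single pass over the file at load time; after that, any zoomed-out view only touches the much smaller cache rather than the audio data itself.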

Sonic Visualiser varies this slightly in permitting zoom levels at power-of-sqrt-two samples per pixel (i.e. twice as many zoom resolutions as the most naive implementation) but the basic principle is as described.  The cache code is in svcore/data/model/WaveFileModel.cpp.
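
Just to illustrate the numbers (this is not how SV enumerates its levels), a power-of-sqrt-two series of samples-per-pixel values could be generated like this, skipping duplicates after rounding:

    #include <cmath>

    // n = 0, 1, 2, ... gives roughly 1, 1, 2, 3, 4, 6, 8, 11, 16, 23, 32, ...
    long zoomLevel(int n)
    {
        return std::lround(std::pow(std::sqrt(2.0), n));
    }

Every second level lands on the familiar power-of-two values.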

Incidentally, ensuring that the preview remains consistent and no peaks are missed when scrolling (as well as when zooming) also takes some care.  The main thing is to remember that each pixel represents a range of the underlying audio file and to ensure that the range is preserved -- i.e. work from pixel resolution back to file resolution ("what is the file range for this pixel? which samples contribute to it?") rather than the other way around.
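
The corresponding mapping is simple but worth writing down explicitly (again just a sketch with made-up names):

    // Each pixel column owns a half-open range of sample frames, derived
    // from the view origin and the zoom level.
    struct FrameRange { long start; long end; };   // [start, end)

    FrameRange frameRangeForPixel(long viewStartFrame,
                                  long samplesPerPixel, int x)
    {
        FrameRange r;
        r.start = viewStartFrame + static_cast<long>(x) * samplesPerPixel;
        r.end   = r.start + samplesPerPixel;
        return r;
    }

Because adjacent pixels get adjacent, non-overlapping ranges, no sample is skipped or counted twice, and scrolling by a whole number of pixels leaves each column's value unchanged.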


Chris

Sergex

Re: Waveform display and zoom rendering
« Reply #2 on: November 16, 2011, 12:08:58 »
Thanks Chris, that is really helpful!