I would like to know how to make it so that when I zoom in, the level of detail stays good, like in Sonic Visualiser (SV). What is the concept behind this kind of implementation?
The main idea is that each pixel width on the horizontal axis should correspond to the peak sample value for the range of audio samples covered by that pixel. In most cases, peak values are more useful and recognisable to the user than averages, and since the peak of a range of peak values is the same as the peak of the underlying samples covered by that range, you can generate a peak cache at an intermediate resolution and refer to it whenever the view is zoomed out to that resolution or further.
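That peaks-of-peaks property is what makes the cache safe to use. A minimal sketch in Python (illustrative only, not from the SV source), using maximum absolute value as the "peak" the way waveform displays do:

```python
# Illustration: the peak of a range of peaks equals the peak of the
# underlying samples, so a coarse cache loses no information at or
# beyond its own resolution.
samples = [0.1, -0.9, 0.3, 0.2, 0.5, -0.4, 0.8, -0.1]

def peak(xs):
    """Peak amplitude (maximum absolute value) over a sample range."""
    return max(abs(x) for x in xs)

# Cache at 4 samples per element: one peak per contiguous 4-sample range.
cache = [peak(samples[i:i+4]) for i in range(0, len(samples), 4)]

# The peak over the cache equals the peak over all the samples.
assert max(cache) == peak(samples)
```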
Many audio editors restrict the permissible zoom levels to power-of-two samples-per-pixel. This means you can scan the audio file when it is first loaded and generate a peak cache at (say) 64 samples per cache element (meaning that each value in the cache represents the peak value found in a contiguous 64-sample range of the file). Then, in order to display a waveform at (say) 256 samples per pixel, it is necessary only to scan the cache and take the peak of each consecutive set of 4 cache values to generate each pixel level. (If the user zooms in closer than 64 samples per pixel, you just read the relevant section of the audio file directly instead.)
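As a sketch of that scheme (hypothetical function names, not SV's actual code): build the cache once at 64 samples per element, then render any coarser power-of-two zoom by taking peaks over groups of cache values.

```python
import random

def build_peak_cache(samples, block=64):
    """One peak value per contiguous `block`-sample range of the file."""
    return [max(abs(s) for s in samples[i:i+block])
            for i in range(0, len(samples), block)]

def pixel_levels(cache, cache_block, samples_per_pixel):
    """Render one peak per pixel from the cache, for any zoom level
    that is a whole multiple of the cache block size."""
    assert samples_per_pixel % cache_block == 0
    group = samples_per_pixel // cache_block   # cache values per pixel
    return [max(cache[i:i+group]) for i in range(0, len(cache), group)]

random.seed(0)
samples = [random.uniform(-1, 1) for _ in range(1024)]
cache = build_peak_cache(samples, 64)     # 16 cache values
pixels = pixel_levels(cache, 64, 256)     # 4 pixel columns, 4 cache values each

# Each pixel equals the true peak of its 256-sample range of the file.
assert pixels == [max(abs(s) for s in samples[i:i+256])
                  for i in range(0, 1024, 256)]
```

Below 64 samples per pixel you would bypass the cache and read the file directly, as described above.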
Sonic Visualiser varies this slightly by permitting zoom levels at power-of-sqrt-two samples per pixel (i.e. twice as many zoom resolutions as the most naive implementation) but the basic principle is as described. The cache code is in svcore/data/model/WaveFileModel.cpp.
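One way to generate such a zoom series is to round successive powers of sqrt(2) to integers; the exact rounding here is an assumption for illustration, not necessarily what Sonic Visualiser does.

```python
import math

def zoom_levels(max_spp):
    """Samples-per-pixel values at powers of sqrt(2), rounded to
    integers with duplicates removed. (Rounding scheme is assumed,
    not taken from the SV source.)"""
    levels, k = [], 0
    while True:
        spp = round(math.sqrt(2) ** k)
        if spp > max_spp:
            break
        if not levels or spp != levels[-1]:
            levels.append(spp)
        k += 1
    return levels

print(zoom_levels(64))  # [1, 2, 3, 4, 6, 8, 11, 16, 23, 32, 45, 64]
```

Note the series interleaves an extra step between each power of two, which is what gives the smoother zoom feel.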
Incidentally, ensuring that the preview remains consistent and no peaks are missed when scrolling (as well as when zooming) also takes some care. The main thing is to remember that each pixel represents a range of the underlying audio file and to ensure that the range is preserved -- i.e. work from pixel resolution back to file resolution ("what is the file range for this pixel? which samples contribute to it?") rather than the other way around.
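Concretely, that pixel-to-file mapping can be as simple as the following sketch (hypothetical helper, shown only to make the direction of the mapping explicit):

```python
def file_range_for_pixel(pixel, samples_per_pixel, start_frame=0):
    """Map a pixel column back to the half-open sample range [first, last)
    it covers -- working from pixel resolution to file resolution."""
    first = start_frame + pixel * samples_per_pixel
    return first, first + samples_per_pixel

# Consecutive pixels tile the file exactly, with no gaps or overlaps,
# so no peak can be dropped while scrolling.
r0 = file_range_for_pixel(0, 256)
r1 = file_range_for_pixel(1, 256)
assert r0 == (0, 256) and r1 == (256, 512)
assert r0[1] == r1[0]
```

Going the other way (from sample position to pixel, with rounding at each step) is where gaps and dropped peaks tend to creep in.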