Sampling rates, input domain and sonification

justin:
Hello there,

I'm new to the Vamp world and about to write my first plug-in, and I have a few questions I was hoping someone could help me with:

1) Sampling rate:

Imagine my algorithm requires a fixed sampling rate (e.g. fs = 44100), block size (e.g. 2048 @ fs = 44.1k) and hop size (e.g. 1024 @ fs=44.1kHz). I understand that I can specify a preferred block and hop size, and even return false in the initialization function if the host specifies something else. But, what about the sampling rate? Specifying a required block/hop size (in samples) is not really useful if the sampling rate is not known.

I realise I can save the value of inputSampleRate to a parameter, then check it in the initialization function and return false if it's not 44100, but that would be quite annoying for a user analysing audio with different sampling rates. Is there no way to re-sample the audio before it is chopped into blocks? Re-sampling the audio after it is already chopped into blocks means I have no control over the block/hop size (in terms of their actual duration in seconds).

2) Time-domain filtering:

Is there any way to apply a time domain filter first, and then get the input in the frequency domain? Just hoping to avoid having to compute the DFT inside the plug-in itself.

3) Sonification (sonic visualiser):

In sonic visualiser I see that some output types can be sonified (e.g. clicks at detected onsets). If the output of my plug-in is a continuous per-frame frequency value (in Hz), is there any way to sonify the output in sonic visualiser, e.g. with a sinusoid that follows the frequency of the output?

Thanks!

Justin

cannam:

Hello!


--- Quote from: justin on July 11, 2012, 15:59:22 ---1) Sampling rate:

Imagine my algorithm requires a fixed sampling rate (e.g. fs = 44100), block size (e.g. 2048 @ fs = 44.1k) and hop size (e.g. 1024 @ fs=44.1kHz). I understand that I can specify a preferred block and hop size, and even return false in the initialization function if the host specifies something else. But, what about the sampling rate? Specifying a required block/hop size (in samples) is not really useful if the sampling rate is not known.

--- End quote ---

I'm afraid the samplerate is one thing your plugin has no control over. It must accept whatever's supplied in the constructor, and can only then return false on initialise if the rate is unsatisfactory.

Note that the plugin's block and hop size can depend on the samplerate, they don't have to be hardcoded -- if the reason you want to fix the samplerate is in order to have known block and hop size in physical units (i.e. seconds), you may be able to do it the other way around -- calculate the block and hop size on request based on the samplerate.
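A minimal sketch of that "other way around" calculation, assuming the plugin wants the same block and hop duration that 2048 and 1024 samples give at 44.1 kHz. The helper names `samplesForDuration` and `nextPowerOfTwo` are hypothetical and not part of the SDK; in a real plugin the results would be returned from `getPreferredBlockSize()` and `getPreferredStepSize()`.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Number of samples closest to a duration in seconds at a given rate.
// (Hypothetical helper, not part of the Vamp SDK.)
std::size_t samplesForDuration(float sampleRate, double seconds)
{
    assert(sampleRate > 0.0f && seconds > 0.0);
    return static_cast<std::size_t>(std::lround(sampleRate * seconds));
}

// Smallest power of two that is >= n, in case the plugin's own FFT
// only handles power-of-two block sizes.
std::size_t nextPowerOfTwo(std::size_t n)
{
    std::size_t p = 1;
    while (p < n) p *= 2;
    return p;
}
```

For example, `samplesForDuration(48000.0f, 2048.0 / 44100.0)` gives 2229 samples, which covers the same duration as 2048 samples at 44.1 kHz. The host may still pass different sizes to `initialise`, so the plugin has to check them there.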


--- Quote ---I realise I can save the value of inputSampleRate to a parameter

--- End quote ---

You don't actually need to save it, it's stored for you in the Plugin base class. (This is probably bad form in terms of software practice, but still)
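A rough sketch of the pattern being discussed, written without the SDK headers so that it compiles on its own. `PluginBaseSketch` is a stand-in for `Vamp::Plugin` that only mimics the stored `m_inputSampleRate` member, and `FixedRatePluginSketch` is a hypothetical plugin that insists on 44.1 kHz, 2048/1024 and one channel. Writing the reason to `std::cerr` is only visible to users of a command-line host.

```cpp
#include <cstddef>
#include <iostream>

// Stand-in for Vamp::Plugin: only the part that remembers the sample rate.
class PluginBaseSketch
{
public:
    virtual ~PluginBaseSketch() { }
protected:
    explicit PluginBaseSketch(float inputSampleRate) :
        m_inputSampleRate(inputSampleRate) { }
    float m_inputSampleRate; // set once in the constructor, read later
};

// Hypothetical plugin that only supports 44.1 kHz, block 2048, hop 1024.
class FixedRatePluginSketch : public PluginBaseSketch
{
public:
    explicit FixedRatePluginSketch(float inputSampleRate) :
        PluginBaseSketch(inputSampleRate) { }

    bool initialise(std::size_t channels, std::size_t stepSize,
                    std::size_t blockSize)
    {
        if (m_inputSampleRate != 44100.0f) {
            // The host only sees "false"; this message is a courtesy.
            std::cerr << "FixedRatePluginSketch: unsupported sample rate "
                      << m_inputSampleRate << " (need 44100)" << std::endl;
            return false;
        }
        if (channels != 1 || stepSize != 1024 || blockSize != 2048) {
            return false;
        }
        return true;
    }
};
```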


--- Quote ---Is there no way to re-sample the audio before it is chopped into blocks?

--- End quote ---

No.


--- Quote ---2) Time-domain filtering:

Is there any way to apply a time domain filter first, and then get the input in the frequency domain? Just hoping to avoid having to compute the DFT inside the plug-in itself.

--- End quote ---

No, the frequency-domain input is essentially a convenience option for plugins simple enough to be happy to work from STFT data without having to have too much control over it (the host also controls the window shape, for example). More sophisticated plugins will need to work from time-domain data.
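To illustrate what working from time-domain data involves, here is a small sketch that filters a block and then transforms it inside the plugin. Both functions are hypothetical helpers: the pre-emphasis filter is just an example of a time-domain filter, and the naive O(N^2) DFT stands in for whatever FFT routine a plugin would really use. It is not the SDK's or any particular plugin's implementation.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// First-order pre-emphasis filter: y[n] = x[n] - a * x[n-1].
// The filter state is reset on every call; a real plugin would carry
// the last input sample over from one block to the next.
std::vector<double> preEmphasis(const std::vector<double> &x, double a)
{
    std::vector<double> y(x.size());
    double prev = 0.0;
    for (std::size_t n = 0; n < x.size(); ++n) {
        y[n] = x[n] - a * prev;
        prev = x[n];
    }
    return y;
}

// Naive DFT magnitudes for bins 0..N/2 of a real-valued block.
std::vector<double> dftMagnitudes(const std::vector<double> &x)
{
    const std::size_t N = x.size();
    const double pi = std::acos(-1.0);
    std::vector<double> mags(N / 2 + 1);
    for (std::size_t k = 0; k < mags.size(); ++k) {
        double re = 0.0, im = 0.0;
        for (std::size_t n = 0; n < N; ++n) {
            const double phase = 2.0 * pi * double(k) * double(n) / double(N);
            re += x[n] * std::cos(phase);
            im -= x[n] * std::sin(phase);
        }
        mags[k] = std::sqrt(re * re + im * im);
    }
    return mags;
}
```

A call such as `dftMagnitudes(preEmphasis(block, 0.97))` would then give the filtered spectrum of one block; the coefficient 0.97 is only an illustrative value.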

In hindsight I wish we had put a generally accessible FFT implementation in the SDK on the plugin side, as well as in PluginInputDomainAdapter on the host side -- there are now many, many duplicates of FFT functions in Vamp plugins out there! Perhaps it's not too late to add it, even.


--- Quote ---3) Sonification (sonic visualiser):

In sonic visualiser I see that some output types can be sonified (e.g. clicks at detected onsets). If the output of my plug-in is a continuous per-frame frequency value (in Hz), is there any way to sonify the output in sonic visualiser, e.g. with a sinusoid that follows the frequency of the output?

--- End quote ---

No, Sonic Visualiser only contains a MIDI-note-based sound generator that uses sampled sounds. Again though, I know quite a lot of people would find this useful. Maybe I should look at it...
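For anyone wanting to hear a frequency track outside Sonic Visualiser in the meantime, here is a sketch of the sinusoid idea from the question: a phase accumulator that holds each frame's frequency for one hop. `sonifyFrequencies` is a hypothetical function, and treating values of 0 Hz or below as "no estimate" is an assumption of this sketch.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Render one sinusoid sample stream from per-frame frequencies in Hz.
// Each frame's frequency is held for hopSize samples; the phase is kept
// continuous across frames to avoid clicks at frame boundaries.
std::vector<float> sonifyFrequencies(const std::vector<float> &hzPerFrame,
                                     std::size_t hopSize, float sampleRate)
{
    const double twoPi = 2.0 * std::acos(-1.0);
    std::vector<float> out;
    out.reserve(hzPerFrame.size() * hopSize);
    double phase = 0.0;
    for (float hz : hzPerFrame) {
        const double increment = twoPi * hz / sampleRate;
        for (std::size_t i = 0; i < hopSize; ++i) {
            if (hz > 0.0f) {
                out.push_back(static_cast<float>(std::sin(phase)));
                phase = std::fmod(phase + increment, twoPi);
            } else {
                // Frames with no estimate are rendered as silence.
                out.push_back(0.0f);
            }
        }
    }
    return out;
}
```

The returned samples could be written to an audio file with any library and played back alongside the original recording.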

Sorry to have such a negative list of responses for you. The positive side is that it looks as if you've understood the SDK and its limitations pretty well...


Chris


justin:
Hi Chris,

Thanks for the speedy reply!


--- Quote from: cannam on July 12, 2012, 09:21:36 ---
--- Quote from: justin on July 11, 2012, 15:59:22 ---1) Sampling rate:

Imagine my algorithm requires a fixed sampling rate (e.g. fs = 44100), block size (e.g. 2048 @ fs = 44.1k) and hop size (e.g. 1024 @ fs=44.1kHz). I understand that I can specify a preferred block and hop size, and even return false in the initialization function if the host specifies something else. But, what about the sampling rate? Specifying a required block/hop size (in samples) is not really useful if the sampling rate is not known.

--- End quote ---

I'm afraid the samplerate is one thing your plugin has no control over. It must accept whatever's supplied in the constructor, and can only then return false on initialise if the rate is unsatisfactory.

Note that the plugin's block and hop size can depend on the samplerate, they don't have to be hardcoded -- if the reason you want to fix the samplerate is in order to have known block and hop size in physical units (i.e. seconds), you may be able to do it the other way around -- calculate the block and hop size on request based on the samplerate.

--- End quote ---

Yes, I guess I'll have to look into this option. In theory it should be possible, though some algorithmic steps might make this somewhat complicated in my case. Worst case, the first version of the plugin will only support 44.1 kHz :) It would be an awesome future feature, though, to have the host resample the audio at the plugin's request before passing it the audio blocks.

Extra question 1: from the programmer's guide I take it the first block is not centred on time zero but rather starts at the first sample of the audio right? (double checking, as this could cause alignment issues when checking against ground-truths centred on time 0).

Extra question 2: imagine I want initialisation to fail because I'm not happy with something (e.g. sampling rate). Is there any way of communicating the specific reason for the failure to the user? On a command-line host I could write to cerr, but for sonic visualiser?


--- Quote ---
--- Quote ---I realise I can save the value of inputSampleRate to a parameter

--- End quote ---

You don't actually need to save it, it's stored for you in the Plugin base class. (This is probably bad form in terms of software practice, but still)

--- End quote ---

Noted, cheers.


--- Quote ---
--- Quote ---2) Time-domain filtering:

Is there any way to apply a time domain filter first, and then get the input in the frequency domain? Just hoping to avoid having to compute the DFT inside the plug-in itself.

--- End quote ---

No, the frequency-domain input is essentially a convenience option for plugins simple enough to be happy to work from STFT data without having to have too much control over it (the host also controls the window shape, for example). More sophisticated plugins will need to work from time-domain data.

In hindsight I wish we had put a generally accessible FFT implementation in the SDK on the plugin side, as well as in PluginInputDomainAdapter on the host side -- there are now many, many duplicates of FFT functions in Vamp plugins out there! Perhaps it's not too late to add it, even.

--- End quote ---

Yes that would definitely speed up the development process for us MIR folk rewriting our code as vamp-plugins.


--- Quote ---
--- Quote ---3) Sonification (sonic visualiser):

In sonic visualiser I see that some output types can be sonified (e.g. clicks at detected onsets). If the output of my plug-in is a continuous per-frame frequency value (in Hz), is there any way to sonify the output in sonic visualiser, e.g. with a sinusoid that follows the frequency of the output?

--- End quote ---

No, Sonic Visualiser only contains a MIDI-note-based sound generator that uses sampled sounds. Again though, I know quite a lot of people would find this useful. Maybe I should look at it...

--- End quote ---

That would be great. Hmm, I've only just started and I seem to be making quite a lot of feature requests... sorry! But like you said, there are many plugins (especially pitch-related ones) that would be upgraded from "cool" to "awesome" if such a sonification were available.


--- Quote ---Sorry to have such a negative list of responses for you. The positive side is that it looks as if you've understood the SDK and its limitations pretty well...

--- End quote ---

No worries, I appreciate the prompt reply. And yes, the combination of the programmer's guide and the "From Method to Plugin" tutorial + skeleton code makes it very easy to get started!

Thanks,

Justin

cannam:

--- Quote from: justin on July 12, 2012, 14:10:55 ---Extra question 1: from the programmer's guide I take it the first block is not centred on time zero but rather starts at the first sample of the audio right? (double checking, as this could cause alignment issues when checking against ground-truths centred on time 0).

--- End quote ---

Depends on the host, but you can tell from the timestamp provided.

The docs (http://code.soundsoftware.ac.uk/embedded/vamp-plugin-sdk/classVamp_1_1Plugin.html#ae4aed3bebfe80a2e2fccd3d37af26996) say that "[t]he timestamp will be the real time in seconds of the centre of the FFT input window". Therefore, if the first timestamp is zero, that should mean you are being passed a window centred on the start of the audio rather than starting with the first sample.
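The sentence quoted from the docs applies to frequency-domain input; for time-domain input the same documentation gives the timestamp as the start of the block. As a sketch of the alignment arithmetic, assuming the timestamp has already been converted to seconds, the following hypothetical helpers convert between the two conventions:

```cpp
#include <cmath>
#include <cstddef>

// Start time (s) of a block whose timestamp marks the window centre.
double blockStartFromCentre(double centreSeconds, std::size_t blockSize,
                            float sampleRate)
{
    return centreSeconds - (blockSize / 2.0) / sampleRate;
}

// Centre time (s) of a block whose timestamp marks its first sample.
double blockCentreFromStart(double startSeconds, std::size_t blockSize,
                            float sampleRate)
{
    return startSeconds + (blockSize / 2.0) / sampleRate;
}
```

With a block of 2048 samples at 44.1 kHz the two conventions differ by 1024 / 44100 seconds, roughly 23 ms, which is the offset to account for when comparing against ground truth centred on time zero.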


--- Quote ---Extra question 2: imagine I want initialisation to fail because I'm not happy with something (e.g. sampling rate). Is there any way of communicating the specific reason for the failure to the user?

--- End quote ---

Sadly not.


--- Quote ---
--- Quote ---In hindsight I wish we had put a generally accessible FFT implementation in the SDK on the plugin side

--- End quote ---

Yes that would definitely speed up the development process for us MIR folk rewriting our code as vamp-plugins.

--- End quote ---

Well, I was working on the SDK today anyway so I've added it for the 2.4 release. Not everything is updated yet, but the source is at http://code.soundsoftware.ac.uk/projects/vamp-plugin-sdk/files now.


Chris

justin:

--- Quote from: cannam on July 12, 2012, 14:27:41 ---Depends on the host, but you can tell from the timestamp provided.

The docs (http://code.soundsoftware.ac.uk/embedded/vamp-plugin-sdk/classVamp_1_1Plugin.html#ae4aed3bebfe80a2e2fccd3d37af26996) say that "[t]he timestamp will be the real time in seconds of the centre of the FFT input window". Therefore, if the first timestamp is zero, that should mean you are being passed a window centred on the start of the audio rather than starting with the first sample.

--- End quote ---

Aha, perfect.


--- Quote ---Sadly not.

--- End quote ---

OK, I'll try to make it as clear as possible in the accompanying documentation.


--- Quote ---Well, I was working on the SDK today anyway so I've added it for the 2.4 release. Not everything is updated yet, but the source is at http://code.soundsoftware.ac.uk/projects/vamp-plugin-sdk/files now.

--- End quote ---

Nice. I've had a look - any specific reason for using Cross's implementation? I've been researching free FFT libraries (non-GPL, so that plugin authors are not obliged to publish their source code). After snooping around http://www.fftw.org/benchfft/ I thought perhaps the code by Ooura (http://www.kurims.kyoto-u.ac.jp/~ooura/fft.html) could do the trick (unless you need non-power-of-two blocks). Anyway, just curious.

Thanks again for all the useful feedback,
Justin
