Author Topic: Spectrograms  (Read 3298 times)

Novata

  • Newbie
  • *
  • Posts: 1
Spectrograms
« on: February 07, 2009, 21:20:43 »
Can anyone help me with a bibliography or references to web pages on how to read and understand spectrograms, please?

I've been surfing the internet, but can't find anything suitable.

Thanks a lot!

Novata

davidf

  • Newbie
  • *
  • Posts: 10
Re: Spectrograms
« Reply #1 on: May 01, 2009, 00:53:22 »
At the risk of oversimplifying things, a spectrogram is essentially a visual spreadsheet of the pitch data from your audio file. Its read-out includes not only the numbers you would expect in a spreadsheet, but also a visual representation of the pitches. The visual representation of the pitches can be changed at your pleasure for whatever purpose you need.

In Sonic Visualiser, letting your cursor hover over a pitch area should give you a text read-out of the exact pitch as a pitch name plus or minus some number of cents. If I'm not mistaken, one cent is a quarter of a semitone.

Aside from the exact pitches in Hz or as pitch names (which are empirically correct so far as is possible), the visual representation is there to let you interpret the data in whatever way makes the analysis of the sound file clearer to you or your audience.

It LOOKS far more complicated than it really is. Much of the spectrographic data is made clearer by the way you configure the colour scheme. Sonic Visualiser has a number of them built in (which you have probably already discovered).

Cheers,
davidf

cannam

  • Administrator
  • Full Member
  • *****
  • Posts: 221
Re: Spectrograms
« Reply #2 on: May 01, 2009, 12:40:45 »
Correction: a cent is a hundredth of a semitone.

As David says, a spectrogram is a time/frequency representation with (in this case) time on the x axis, frequency on the y axis, and the strength of a particular frequency component expressed as colour or intensity.

It's computed as a series of short-time Fourier transforms, each one covering a single column (one horizontal unit on the time scale), which calculate the frequencies at which sinusoidal component waves would have to be present in order to sum to the original signal.  The vertical (frequency) resolution depends on the frame size for the Fourier transform (a longer frame gives better frequency resolution), and the horizontal resolution depends on the frame size and amount by which analysis frames overlap (a longer frame gives worse time resolution).
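To make the frame-size trade-off concrete, here is a minimal sketch of a short-time Fourier transform in plain Python. It is not how Sonic Visualiser computes its spectrograms (that will use a fast FFT); the function name stft_magnitudes, the Hann window, and the sample rate, frame size and hop values are all assumptions picked for illustration.

```python
import cmath
import math


def stft_magnitudes(samples, frame_size, hop):
    """Return one column per analysis frame; each column holds the
    magnitudes of the frame_size // 2 + 1 non-negative frequency bins.
    (Hypothetical helper: a naive DFT, far slower than a real FFT.)"""
    # Hann window (an assumed choice) to reduce leakage between bins
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_size)
              for n in range(frame_size)]
    columns = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_size)]
        column = []
        for k in range(frame_size // 2 + 1):
            # Correlate the frame with a complex sinusoid at bin k
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                      for n in range(frame_size))
            column.append(abs(acc))
        columns.append(column)
    return columns


# Assumed example values: an 8 kHz signal containing a 1 kHz sine tone
sample_rate = 8000
frame_size = 256
hop = 128  # frames overlap by 50%
tone_hz = 1000.0
samples = [math.sin(2 * math.pi * tone_hz * n / sample_rate)
           for n in range(1024)]

columns = stft_magnitudes(samples, frame_size, hop)

# Vertical resolution: spacing between frequency bins, in Hz
bin_width_hz = sample_rate / frame_size
# Horizontal resolution: time step between successive columns, in seconds
hop_seconds = hop / sample_rate

# The strongest bin in the first column sits at the tone's frequency
peak_bin = max(range(len(columns[0])), key=lambda k: columns[0][k])
peak_hz = peak_bin * bin_width_hz
```

Doubling frame_size halves bin_width_hz (finer frequency detail) but makes each column summarise twice as much audio, which is the trade-off described above.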

So, you can read directly from the spectrogram (with some very substantial caveats to do with resolution limitations) the frequency of a component of the signal.  If your music consisted of simple sine tones, there would be (roughly, with the same caveats) a single horizontal line for each note whose height told you directly what the frequency of the note was, and therefore what musical pitch it was at (there is a relationship between musical-note and frequency such that ascending by one octave results in a doubling of the frequency).
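The relationship between frequency and musical pitch can be sketched in a few lines of Python. This assumes twelve-tone equal temperament with A4 tuned to 440 Hz and the MIDI convention that note 69 is A4; the function names are hypothetical, and this is not necessarily how Sonic Visualiser derives the read-out it shows.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F",
              "F#", "G", "G#", "A", "A#", "B"]


def cents_between(f_low_hz, f_high_hz):
    """Interval in cents: 100 cents per semitone, 1200 per octave."""
    return 1200.0 * math.log2(f_high_hz / f_low_hz)


def nearest_note(freq_hz, a4_hz=440.0):
    """Return (note name, offset in cents) for a frequency.
    Assumes equal temperament and MIDI numbering (69 = A4)."""
    midi_exact = 69 + 12 * math.log2(freq_hz / a4_hz)
    midi = round(midi_exact)
    cents = 100.0 * (midi_exact - midi)
    name = NOTE_NAMES[midi % 12] + str(midi // 12 - 1)
    return name, cents


octave = cents_between(440.0, 880.0)   # doubling the frequency = one octave
name, offset = nearest_note(445.0)     # slightly sharp of A4
```

A frequency read from the spectrogram at 445 Hz comes out as A4 plus roughly 20 cents, the same form as the pitch-name-plus-cents read-out mentioned earlier in the thread.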

Musical tones have more than one "partial" frequency -- they consist of more than one frequency component -- usually at closely related frequency intervals such as integer multiples of the lowest, "fundamental" frequency of the note.  These will show up in your spectrogram as a stack of horizontal lines, waving about in correspondence with any pitch variation or vibrato, with the lowest line usually giving the fundamental frequency and usually corresponding to the note's perceived pitch, and the structure of higher partials having some relationship to the timbral quality of the instrument.  Because these structures can be quite complex, it's not always straightforward (indeed not always possible) to read off the actual performed pitch, especially in polyphonic music.
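As a rough illustration of the "stack of horizontal lines", the sketch below lists where the partials of an idealised harmonic tone would fall. It assumes exact integer multiples of the fundamental, which real instruments only approximate; the function name and the 220 Hz example are made up for illustration.

```python
def harmonic_partials(fundamental_hz, count, max_hz=None):
    """Frequencies of the first `count` partials of an ideal harmonic tone.
    Partials above max_hz (e.g. the top of the display) are dropped."""
    partials = [fundamental_hz * n for n in range(1, count + 1)]
    if max_hz is not None:
        partials = [f for f in partials if f <= max_hz]
    return partials


# Assumed example: a note with a 220 Hz fundamental, shown up to 1 kHz
lines = harmonic_partials(220.0, 6, max_hz=1000.0)
# Spacing between neighbouring lines on a linear frequency axis
spacing = [b - a for a, b in zip(lines, lines[1:])]
```

On a linear frequency axis the lines are evenly spaced by the fundamental, so the spacing of the stack is another clue to the pitch when the lowest line is weak or hard to see.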

Percussive sounds are often also visible in the spectrogram, usually as fuzzy columns -- they generally consist of noise that is dispersed across broad frequency ranges without a regular structure.


Chris