Implemented

Spectrogram seekbar

Let the user select a spectrogram seekbar instead of the waveform seekbar.


A spectrogram looks like this:


A spectrogram shows the use of frequencies for each point of a track (x-axis = time, y-axis = frequency, coloring = loudness).


This way you can recognize low quality audio files, since they have a large black border at the top, because high frequencies have been cut.
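To make the "black border" idea concrete, here is a minimal sketch that estimates where the spectrum goes silent. The function name, the threshold and the sample spectrum are made up for illustration; it assumes you already have averaged FFT magnitudes covering 0 Hz up to half the sample rate.

```python
def estimate_cutoff_hz(magnitudes, sample_rate, threshold=1e-3):
    """Return the upper edge (in Hz) of the highest bin above `threshold`.

    `magnitudes` is assumed to hold averaged FFT magnitudes whose bins
    span 0 Hz up to sample_rate / 2.
    """
    bin_width = (sample_rate / 2) / len(magnitudes)
    # Walk down from the top of the spectrum until something is audible.
    for k in range(len(magnitudes) - 1, -1, -1):
        if magnitudes[k] > threshold:
            return (k + 1) * bin_width
    return 0.0


# Hypothetical spectrum: energy in the lower 16 of 22 bins only.
spectrum = [0.5] * 16 + [0.0] * 6
cutoff = estimate_cutoff_hz(spectrum, 44100)
```

A cutoff well below half the sample rate corresponds to the black band at the top of the spectrogram.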


OK, here are some samples:


Calculated colors:


Blue to pink:


Heat-cam:



Heat-cam dark:


Fire:


Standard spectrogram:



These screenshots show the spectrogram at 1904*160 pixels. Updating every 500 ms takes about 9.7-10.6% CPU, and the load rises exponentially as the interval gets shorter.

This means that at 125 ms the update interval stalls, since calculating and setting the pixels is far too slow.


I must test a smart triple-buffer to see if it helps.


@Sven: Was this how you would like it to be? Should color schemes be selectable?


Note: This is just implemented in a prototype application so far.






>> Updating each 500ms takes about 9.7-10.6% cpu, raised exponentially.
Do you render the spectrogram on each update? Wouldn't it be easier/faster to render three bitmaps (unplayed, played, selected) and then simply blend in as needed?
Maybe you need to re-render the bitmaps on size changes - or you simply calculate at a specific size and re-size them as needed.
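A rough sketch of that idea, not taken from any actual implementation: the three bitmaps are rendered once, and each update only picks columns from them. Bitmaps are modelled here as plain lists of columns, and `compose_seekbar`, `play_x` and `hover` are names invented for the example.

```python
def compose_seekbar(unplayed, played, selected, play_x, hover=None):
    """Pick each column from one of three pre-rendered bitmaps.

    Each bitmap is a list of columns of equal length. Columns inside the
    hover range (start, end) come from `selected`, columns left of
    `play_x` from `played`, and everything else from `unplayed`.
    """
    out = []
    for x in range(len(unplayed)):
        if hover is not None and hover[0] <= x < hover[1]:
            out.append(selected[x])
        elif x < play_x:
            out.append(played[x])
        else:
            out.append(unplayed[x])
    return out


# Single letters stand in for real pixel columns.
frame = compose_seekbar(["u"] * 6, ["p"] * 6, ["s"] * 6, play_x=2, hover=(3, 5))
```

Per update this is only a column copy, so the expensive FFT-to-pixel work happens just once per track and size.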

>> Was this how you would like it to be?
Well, the first is best; the others have really huge black borders at the top. Besides that: yes, that's what I wanted.

>> Should color schemes be selectable?

Not necessarily, but it would be nice to have.

>>Maybe you need to re-render the bitmaps on size changes - or you simply calculate at a specific size and re-size them as needed.

Yes, that's what I meant by triple-buffering; I need to test that further :)


>>Well, the first is best; the others have really huge black borders at the top. Besides that: yes, that's what I wanted.

Possibly I can tweak the colors of the calculated version a bit further in a future build; I'll analyse that further.

The trick is to avoid starting with a black color.
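As an illustration only (the real palettes are not posted here), a palette can be built from a few color stops with linear interpolation between them. The stop values below are made up; the point is that the first stop is a dark blue rather than black, so quiet regions stay distinguishable from silence.

```python
def palette_color(value, stops):
    """Linearly interpolate an RGB color for `value` in [0, 1].

    `stops` is a list of at least two (r, g, b) tuples, evenly spaced
    over the value range.
    """
    value = min(max(value, 0.0), 1.0)
    pos = value * (len(stops) - 1)
    i = min(int(pos), len(stops) - 2)  # index of the lower stop
    t = pos - i                        # position between the two stops
    a, b = stops[i], stops[i + 1]
    return tuple(round(a[c] + (b[c] - a[c]) * t) for c in range(3))


# Hypothetical "blue to pink" stops; the first one is not black.
BLUE_TO_PINK = [(0, 0, 64), (0, 0, 255), (255, 0, 255)]
quiet = palette_color(0.0, BLUE_TO_PINK)
loud = palette_color(1.0, BLUE_TO_PINK)
```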

thumbs up
looks good


greets

>> have really huge black borders at the top.

That probably depends on the number of analyzed samples (FFT).
You could test with 1024 instead of 512
and see if it fills the window then.


greets

>>>> have really huge black borders at the top.

>> That probably depends on the number of analyzed samples (FFT).
>> You could test with 1024 instead of 512
>> and see if it fills the window then.

If I look at those samples, I think they're made using the same data but different colorings.

Yes, it is the exact same result data, only the palette is changed.


@Emil: I use 1024 already, my first test was using 2048 FFT but that was overkill :)

>>@Emil: I use 1024 already, my first test was using 2048 FFT but that was overkill :)

Okidoki


greets

>>Yes, it is the exact same result data, only the palette is changed.

yes logically .. I'm Stupid ;)


greets

A little more progress:


Backbuffering implemented, CPU time is now low for 50ms updates (~2%).

I also experimented with the coloring and added some new exponential formulas to get a better spread.
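The exact formulas are not given here, so the following is only a guess at the general idea: push the normalized magnitude through a curve before the palette lookup so that quiet values get more of the color range. The sketch uses a simple power curve, and `spread` and `gamma` are names chosen for the example.

```python
def spread(magnitude, gamma=0.5):
    """Map a magnitude in [0, 1] through a power curve.

    With gamma < 1 quiet values are lifted, so they use more of the
    palette; gamma = 1 leaves the value unchanged.
    """
    magnitude = min(max(magnitude, 0.0), 1.0)
    return magnitude ** gamma


lifted = spread(0.25)  # a quiet value ends up in the middle of the range
```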

The example is an MP3 at 128 kbps, hence the lack of magnitude in the high peaks (but the violet color shows that there's some noise).


The picture also shows the hover state (middle part) and the not-played state (rightmost part).


Still not implemented into Neon though.


To be continued...



Great, I'm really looking forward to seeing it in action. :-)

Implemented in #528


Sweep right without playing..



greets

OK, it is implemented,
but I see a big difference between my spectrogram and Neon's.
I think it has to do with the blocks.
Can this be?

I read pixel by pixel.


greets



I use the interpolation technique in Neon to make use of the full FFT buffer no matter what height the data will be rendered at.

So, if the FFT buffer is 8192 floats, I process ALL of them with a logarithmic formula so that no data is skipped, and the resulting data is shrunk to fit.
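The formula itself is not shown here, so take this as one possible way to do it rather than what Neon does: the row edges are spaced exponentially over the bin index (which gives a logarithmic frequency axis), and each row keeps the maximum of the bins it covers. The name `shrink_log` is made up.

```python
import math


def shrink_log(fft, height):
    """Reduce all FFT magnitudes to `height` rows on a log frequency axis.

    Row 0 holds the lowest frequencies. Each row takes the maximum of the
    bins it covers; neighbouring rows may share a bin at the low end, but
    no bin is left out.
    """
    n = len(fft)
    rows = []
    for r in range(height):
        # Row edges grow exponentially from bin 0 up to bin n.
        lo = int((n + 1) ** (r / height)) - 1
        hi = math.ceil((n + 1) ** ((r + 1) / height)) - 1
        hi = min(max(hi, lo + 1), n)  # every row covers at least one bin
        rows.append(max(fft[lo:hi]))
    return rows


rows = shrink_log([1, 2, 3, 4, 5, 6, 7, 8], 3)
```

With 8 bins and 3 rows, the rows cover bins 0-1, 1-3 and 3-7, so the low frequencies get proportionally more screen space than the high ones.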


In the code example I saw from you, you seemed to process only the first 512 blocks, which will not give the full frequency spectrum.

For 44100 Hz, the displayed range is 22.05 kHz down to 0 (the Nyquist frequency is half the sample rate).
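The arithmetic behind this, as a small sketch. The bin counts are assumptions for the sake of the example (a buffer of 1024 magnitude bins spanning 0 Hz to Nyquist), not figures from anyone's actual code.

```python
def nyquist_hz(sample_rate):
    """Highest frequency that can be represented at this sample rate."""
    return sample_rate / 2


def bin_frequency_hz(k, num_bins, sample_rate):
    """Frequency at the start of bin k when num_bins bins span 0..Nyquist."""
    return k * nyquist_hz(sample_rate) / num_bins


top = nyquist_hz(44100)
# Reading only the first 512 of a hypothetical 1024 bins stops here:
half = bin_frequency_hz(512, 1024, 44100)
```

Under these assumptions, reading only the first half of the buffer stops at about 11 kHz, and everything above that is never drawn.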


Possibly that's the reason.


Also, when presenting the data, you need to use a logarithmic scale (avoid linear scales!) to emphasize the lower frequencies; otherwise they may not be displayed.
