[Coladam] More about tone channels attenuation (volume) -and- digital sounds (PART 5 or 6?)

Daniel Bienvenu newcoleco at yahoo.fr
Sat Dec 12 17:58:16 CET 2009


Hello,

If you want to say "OH NO! NOT ANOTHER TEXT ABOUT SOUND!"... then just ignore this one.

** 5-BIT DELTAS @ 18 kHz **

* EXPERIMENTATION RESULTS *

I optimized my player, which uses what I call DELTAs plus a very basic RLE. The result is playback at approximately 18 kHz, which is close to 44 kHz / 2.5 = 17.6 kHz (please note that this is not an exact measurement). I don't think I can make the player any faster than that, but I can slow it down. At this rate, only about 3 seconds of sample can be played from 32K, even with the compression involved, because the compression is basically only good at reducing silences.

For the "Boys Boys Boys" sample I'm using, the original is a 4-second audio file at 44 kHz, which I reduce by a factor of 2.5, and it plays back well. I'm tired of optimizing this version, so I'll keep it as is and try something else. So the playback speed is somewhere near 18 kHz, or slower.


* AUDIO DATA STRUCTURE *

- HEADER -

1 byte : A value between 20 and 37 that gives the median (the "no signal" level) within the table of 46 calculated values (0: FFF, 1: EFF, 2: EEF, 3: EEE, 4: DEE, 5: DDE, 6: DDD, 7: CDD, 8: CCD, 9: CCC, 10: BCC, 11: BBC, 12: BBB, etc.)

32 bytes : An adapted DELTAs table in which all values are scaled by a factor of 3 (because each audio level is a configuration of 3 attenuations). Note that the last DELTA has no special meaning, unlike what I suggested in one of my first messages; everything is obtained purely by adding values (negative and positive).
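The 46-entry table that the median byte indexes can be rebuilt from the entries listed above. The following is only a sketch, assuming the pattern visible in those entries holds for the whole table: the three attenuations of level n sum to 45 - n and are spread as evenly as possible. The function names are made up for illustration and are not taken from the player.

```python
def level_to_attenuations(level):
    """Return the three 4-bit attenuations for a level in 0..45.

    Assumption: level 0 is FFF (mute), level 45 is 000, and the three
    values always differ by at most one step, as in the listed entries.
    """
    if not 0 <= level <= 45:
        raise ValueError("level must be in 0..45")
    total = 45 - level                 # sum of the three attenuation values
    base, extra = divmod(total, 3)
    # (3 - extra) channels get `base`, the other `extra` channels get base + 1
    return [base] * (3 - extra) + [base + 1] * extra


def level_to_hex(level):
    """Format a level the way the list above does, e.g. 4 -> 'DEE'."""
    return "".join(format(a, "X") for a in level_to_attenuations(level))


# Hypothetical name for the full 46-entry table.
LEVEL_TABLE = [level_to_hex(n) for n in range(46)]

if __name__ == "__main__":
    for n in (0, 1, 2, 3, 4, 12, 36, 37):
        print(n, LEVEL_TABLE[n])
```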

- DATA (STREAMING) -

Each byte holds the RLE counter in its 3 high bits and, in its 5 low bits, the number of the DELTA to use to get the next attenuation value(s). The end of data is marked by a single byte FF.

[3-bit COUNTER-1][5-bit DELTA #]
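As an illustration of the byte layout above, here is a minimal sketch of how a stream could be split into (count, delta number) pairs. It only covers the bit unpacking and the FF terminator; how the player applies each delta is not shown, and the names `unpack` and `iter_runs` are hypothetical.

```python
END_OF_DATA = 0xFF  # terminator byte described above


def unpack(byte):
    """Split one data byte into (run count, delta number)."""
    count = (byte >> 5) + 1      # 3 high bits store COUNTER-1, so 1..8
    delta_number = byte & 0x1F   # 5 low bits select one of the 32 deltas
    return count, delta_number


def iter_runs(stream):
    """Yield (count, delta number) pairs until the FF byte is reached."""
    for byte in stream:
        if byte == END_OF_DATA:
            return
        yield unpack(byte)


if __name__ == "__main__":
    data = bytes([0x23, 0xE0, 0xFF])
    print(list(iter_runs(data)))
```

Under this reading, the byte FF would otherwise mean a count of 8 with delta number 31, so that one combination cannot appear as ordinary data.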

* RESULTS *

I haven't run enough tests to draw a real conclusion, but this is what I have so far:

- The RLE compression is nearly useless, except that it gives roughly the same compression as packing the values side by side.

- The Delta table is nearly useless too, except that it frees 1 more bit for the RLE counter, which can then count up to 8... which is really only useful for Delta = +0.

* CONCLUSION *

Well, 5-bit deltas, depending on which of the 46 possible attenuation values we take as the middle one (the median), can be applied with a dithering effect to get an acceptable result for the data size. The strategy of choosing from 46 attenuations gives a good result, even with the noisy artifacts in the background, and it also seems to give louder results than the other techniques I've seen so far for the ColecoVision.

* BOYS BOYS BOYS TEST FOR VARIOUS MEDIAN VALUES *

For this test, it's a 4-second sample instead of 2, and not too loud this time.

Remember, the median value is chosen from the 46 possible attenuation values (0 = FFF (mute), 1 = EFF, 2 = EEF, 3 = EEE, 4 = DEE, etc.), and it's the only thing I changed for this test. The differences in data size come from the RLE compression; remember that it compresses almost nothing but runs without variation (Delta = +0). The more variations there are, the less the data compresses and the more audio information we get... which may also be seen as more encoded white noise that avoids silences (like cuts). The last, long value on each line is not a compression factor but a reminder of the perceived volume level, from 0 (mute) to 1 (no attenuation).

 17152 bytes : median = 37 -> 233 : 0.544443937244913
 17825 bytes : median = 36 -> 333 : 0.501187233627272
 18522 bytes : median = 35 -> 334 : 0.466827212602681
 19031 bytes : median = 34 -> 344 : 0.432467191578089
 19611 bytes : median = 33 -> 444 : 0.398107170553497
 20266 bytes : median = 32 -> 445 : 0.370814035707944
 20818 bytes : median = 31 -> 455 : 0.343520900862391
 21551 bytes : median = 30 -> 555 : 0.316227766016838
 22066 bytes : median = 29 -> 556 : 0.294548058394878
 22628 bytes : median = 28 -> 566 : 0.272868350772918
 23223 bytes : median = 27 -> 666 : 0.251188643150958
 23769 bytes : median = 26 -> 667 : 0.233967839266268
 24200 bytes : median = 25 -> 677 : 0.216747035381578
 24734 bytes : median = 24 -> 777 : 0.199526231496888
 25104 bytes : median = 23 -> 778 : 0.185847260746629
 25611 bytes : median = 22 -> 788 : 0.172168289996370
 26068 bytes : median = 21 -> 888 : 0.158489319246111
 26295 bytes : median = 20 -> 889 : 0.147623726557213
 26642 bytes : median = 19 -> 899 : 0.136758133868315
 26984 bytes : median = 18 -> 999 : 0.125892541179417
 27128 bytes : median = 17 -> 99A : 0.117261694119611
 27321 bytes : median = 16 -> 9AA : 0.108630847059806
 27406 bytes : median = 15 -> AAA : 0.100000000000000
 27420 bytes : median = 14 -> AAB : 0.093144274490809
 27413 bytes : median = 13 -> ABB : 0.086288548981619
 27315 bytes : median = 12 -> BBB : 0.079432823472428
 27143 bytes : median = 11 -> BBC : 0.073987127130959
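The volume figures in the last column can be reproduced with a short calculation. This is only a sketch under two assumptions that happen to match the numbers above: each attenuation step is worth 2 dB (so a channel's amplitude is 10^(-a/10) for attenuation a, with F treated as fully off), and the perceived level is the average of the three channel amplitudes. The function names are hypothetical.

```python
def level_to_attenuations(level):
    """Three 4-bit attenuations for a level in 0..45 (0 = FFF, 45 = 000)."""
    base, extra = divmod(45 - level, 3)
    return [base] * (3 - extra) + [base + 1] * extra


def channel_amplitude(attenuation):
    """Linear amplitude of one channel, assuming 2 dB per step and F = off."""
    if attenuation == 15:
        return 0.0
    return 10 ** (-attenuation / 10)   # 2 dB per step: 10^(-2a/20)


def perceived_volume(level):
    """Average amplitude of the three channels, from 0 (mute) to 1."""
    amplitudes = [channel_amplitude(a) for a in level_to_attenuations(level)]
    return sum(amplitudes) / 3


if __name__ == "__main__":
    for median in (37, 36, 23, 15, 11):
        print(median, format(perceived_volume(median), ".15f"))
```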

I did all my previous tests with median = 36 because its level of 0.5 is halfway between 0 and 1, but it's not the median of the 46 possibilities (from 0 to 45). A value around 23 is a better choice. Even if it sounds better, going as low as a median of 11 introduces distortion.

The following audio file plays the sample with medians 11, 15, 19, 23, 27, 31 and then 37.
http://www.ccjvq.com/newcoleco/sndtest5.mp3

Have a nice day!

Daniel (Newcoleco)


      

