Less than 10 samples delay in code??

Post by Nubeat7 » Sat Jun 07, 2014 8:41 pm

Hello, I have a question about small sample delays (less than 10 samples) and want to ask whether this code works the way I understand it:

Code:
streamin i;
streamout o1;
streamout o2;
streamout o3;

float x1,x2,x3;

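o3=x3; // 3 samples delay
o2=x2; // 2 samples delay
o1=x1; // 1 sample delay
x3=x2; // shift the chain, oldest value first, so nothing is overwritten before it is copied
x2=x1;
x1=i;  // store the current input for the next sample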


I also thought about writing some kind of recursive loop function in assembler that loops the input signal around, adding one sample of delay with each pass. That way you could set the delay time in samples (looping 5 times would give a 5-sample delay). Or am I thinking in the wrong direction?

The goal is to use simple code like this for small latency compensation between dry and effected signals. Or should I just use the one-sample delay primitives for this?
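
For reference, the shifting scheme in the code above can be modelled outside FlowStone. The following is a minimal Python sketch, not FlowStone code; the class name `BucketBrigade` and its `stages` parameter are made up for illustration. It generalises the three-variable chain to N stages, which is roughly what the loop idea would amount to: on every sample, each stored value moves one slot further along.

```python
class BucketBrigade:
    """Hypothetical model of an N-stage chain with one sample of delay per stage."""

    def __init__(self, stages):
        if stages < 1:
            raise ValueError("need at least one stage")
        self.x = [0.0] * stages  # x[0] is the newest stored sample

    def process(self, sample):
        # Read the taps first: taps[k] is the input delayed by k+1 samples.
        taps = list(self.x)
        # Then shift the chain by one slot and store the new input.
        self.x = [sample] + self.x[:-1]
        return taps


delay = BucketBrigade(3)
# outputs[k] holds [o1, o2, o3] for the k-th input sample (inputs are 1.0, 2.0, ...)
outputs = [delay.process(float(n)) for n in range(1, 6)]
```

Note that the work per sample grows with the number of stages, since every stored value is copied once per sample.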

Post by martinvicanek » Sun Jun 08, 2014 7:22 am

Yes, it works the expected way!
Attachment: smallDelay.png
;)
If you just want the delays you could even do without x1, x2 etc.

As you say, this bucket brigade approach is only viable for short delays of a few samples. For long delays you want to use arrays instead. However, arrays are slow, so bucket brigade is preferable for short delays. It would be interesting to find out where the break-even point is.
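
To make the trade-off concrete, here is a minimal Python sketch of the array (circular buffer) alternative. It is only a model of the idea, not FlowStone code; `RingDelay`, `size` and `pos` are illustrative names. Unlike a bucket brigade, it does one write and one read per sample no matter how long the delay is, at the price of index arithmetic.

```python
class RingDelay:
    """Hypothetical circular-buffer delay: one write and one read per sample."""

    def __init__(self, delay, size=16):
        if not 1 <= delay <= size:
            raise ValueError("delay must be between 1 and size")
        self.buf = [0.0] * size
        self.size = size
        self.delay = delay
        self.pos = 0  # index of the next write

    def process(self, sample):
        # The slot written `delay` samples ago holds the delayed value.
        out = self.buf[(self.pos - self.delay) % self.size]
        self.buf[self.pos] = sample
        self.pos = (self.pos + 1) % self.size
        return out


ring = RingDelay(delay=3)
ring_out = [ring.process(float(n)) for n in range(1, 8)]
```

With a delay of 3, the first three outputs are the initial zeros and the input sequence follows from the fourth sample on.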

Post by nix » Sun Jun 08, 2014 10:20 am

It's a mem,
so it's writing to memory.
Just to clarify, for some maybe.

Cheers, that's interesting, fellas.

Would you like the delay I built out of prims and snippets?
Seems to work OK.

Post by KG_is_back » Sun Jun 08, 2014 6:53 pm

martinvicanek wrote: As you say, this bucket brigade approach is only viable for short delays of a few samples. For long delays you want to use arrays instead. However, arrays are slow, so bucket brigade is preferable for short delays. It would be interesting to find out where the break-even point is.


Arrays are slow only in default code: because each channel might need a different index to read/write, each of the 4 channels has to be accessed individually. That means a 1-sample bucket brigade delay takes 4 times less CPU than an array delay (of any length). I've just checked that quickly with the analyzer primitive and it's true... So bucket brigade is more efficient for delays of up to 4 samples (maybe even a little more, depending on the index counter in the array delay).

However, when the delay is fixed (by fixed I mean both a hard-coded delay length and a variable delay length that is set on the first sample) and identical for all 4 channels, you can use assembly to greatly improve the delay algorithm, because you can read and write all 4 channels of the array at once (at the same index). With this approach an array delay can be as efficient as bucket brigade.
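
The difference between the two cases can be illustrated with a small Python model. This is a sketch under assumptions, not a measurement: the function names are invented, a "frame" stands for the 4 floats of one SSE register, and the access counter is only a stand-in for cost, not a cycle count.

```python
CHANNELS = 4  # one SSE register carries 4 floats


def delay_per_channel(frames, delays, size=8):
    """Each channel may have its own delay, so each lane is read on its own."""
    buf = [[0.0] * CHANNELS for _ in range(size)]
    out, reads = [], 0
    for n, frame in enumerate(frames):
        buf[n % size] = list(frame)  # store the new frame first
        row = []
        for ch in range(CHANNELS):
            row.append(buf[(n - delays[ch]) % size][ch])
            reads += 1  # one scalar read per channel
        out.append(row)
    return out, reads


def delay_shared(frames, delay, size=8):
    """All channels share one delay, so a whole frame is read in one go."""
    buf = [[0.0] * CHANNELS for _ in range(size)]
    out, reads = [], 0
    for n, frame in enumerate(frames):
        buf[n % size] = list(frame)
        out.append(list(buf[(n - delay) % size]))
        reads += 1  # one 4-wide read per sample
    return out, reads


frames = [[float(n)] * CHANNELS for n in range(1, 7)]
a, reads_a = delay_per_channel(frames, [2, 2, 2, 2])
b, reads_b = delay_shared(frames, 2)
```

Both functions return the same signal when all four delays are equal, but the per-channel version performs four reads for every one read of the shared version, which mirrors the factor of 4 mentioned above.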

Here's a schematic that shows it all: 3 code examples, all with loop(1024) to boost the CPU load into a visible range.

The first one is just a code that involves one read from and one write to an array. This is the default form, from the code component, that every array-based delay must contain (plus an index counter, which is not included here). On my machine it takes 25-28 cycles per sample.

The second one is a bucket brigade type one-sample delay. It takes 5-6 cycles on my machine.

The third one is an assembly-optimized fixed delay (defined on the first sample, then it stays fixed), and it takes 5-7 cycles on my machine. It also supports virtually any delay length (the given example is limited to 4096, but you can increase that easily by boosting the mem size and both integers in the "and eax,???;" operation to (memsize*16-1)).
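
The (memsize*16-1) value can be read as a wrap-around mask for a byte offset. The Python sketch below shows the arithmetic under two assumptions that are not stated in the post: that memsize counts 4-float frames of 16 bytes each, and that memsize is a power of two (the AND trick only equals a modulo in that case). `MEM_SIZE`, `FRAME_BYTES` and `next_offset` are illustrative names.

```python
MEM_SIZE = 4096                    # assumed: number of 4-float frames, a power of two
FRAME_BYTES = 16                   # assumed: 4 channels x 4 bytes per float
MASK = MEM_SIZE * FRAME_BYTES - 1  # the value used in the "and" operation


def next_offset(offset):
    """Advance a byte offset by one frame and wrap it with a bitwise AND."""
    return (offset + FRAME_BYTES) & MASK


# Walking once around the buffer returns to the start.
offset = 0
for _ in range(MEM_SIZE):
    offset = next_offset(offset)
```

For a power-of-two buffer this gives the same result as `(offset + FRAME_BYTES) % (MEM_SIZE * FRAME_BYTES)`, without a division.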

Note: inside the module the delay also runs with the 1024 loop; the fully working code is the one outside the module.
Attachment: delay comparison.fsm

Post by Nubeat7 » Sun Jun 08, 2014 9:11 pm

Wow, thx KG, that's very enlightening: your optimized delay already needs fewer cycles than the bucket brigade with 2 samples! :)

Post by martinvicanek » Sun Jun 08, 2014 9:34 pm

Good work, KG! I think your array optimization is fair if you compare to bucket brigade where you also have the same delay for all SSE channels. However, sometimes this may be a limitation. Have you looked at Trogz Toolz optimized delays over at the SM forum?

Post by KG_is_back » Mon Jun 09, 2014 1:05 am

martinvicanek wrote: Good work, KG! I think your array optimization is fair if you compare to bucket brigade where you also have the same delay for all SSE channels. However, sometimes this may be a limitation. Have you looked at Trogz Toolz optimized delays over at the SM forum?


Where is it? I can't find it anywhere...

Anyway, one of the main ways to optimize a delay is to make use of SSE instructions as much as possible.

Here's another comparison:

Stock FS delay:
- minimum delay is 1 sample
- uses approx. 36-40 cycles (possibly 5 cycles fewer, because it has one extra loop(1024), which costs approx. 5 cycles)

My reworked delay:
- works for a 0-sample delay too
- uses approx. 15 cycles
Attachment: delay comparison2.fsm

Post by martinvicanek » Mon Jun 09, 2014 6:05 am

KG_is_back wrote:
martinvicanek wrote: Have you looked at Trogz Toolz optimized delays over at the SM forum?

Where is it? I can't find it anywhere...

It's visible to registered users only: Trogz Tools. You have to be persistent: the pages in the SM forum almost never load on the first attempt; instead a misleading "Forbidden" message appears. Probably some timeout issue. It is a shame to see such a valuable source of material go down the drain.

Post by tester » Mon Jun 09, 2014 9:43 am

This happens with the SM forum from time to time. Malc isn't sure why. I noticed that it usually starts when spammers start to play.

Post by Tepeix » Fri Feb 11, 2022 8:43 pm

Hi,

This old topic inspired me to make my first asm delay:
a mini mod delay of 7 samples.

What do you think of it? Could it be optimized further?

Here's the code:
Code:
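streamin in; streamout out;
streamin mod;
float a=0; float a1=0;
float a2=0; float a3=0;
float a4=0; float a5=0;
float a6=0; float f0=0;
// abs = 0x7FFFFFFF, a mask that clears the float sign bit //
int abs=2147483647;
// delay chain: load each stored sample, then overwrite it with the newer one //
// afterwards xmm1..xmm7 hold the input delayed by 1..7 samples //
movaps xmm0,in;
movaps xmm1,a;
movaps a,xmm0;
movaps xmm2,a1;
movaps a1,xmm1;
movaps xmm3,a2;
movaps a2,xmm2;
movaps xmm4,a3;
movaps a3,xmm3;
movaps xmm5,a4;
movaps a4,xmm4;
movaps xmm6,a5;
movaps a5,xmm5;
movaps xmm7,a6;
movaps a6,xmm6;
// mod: where mod<0, replace the 7-sample tap in xmm7 by the 3-sample tap //
subps xmm3,xmm7;
movaps xmm0,mod;
movaps xmm1,xmm0;
cmpps xmm1,f0,1;
andps xmm1,xmm3;
addps xmm7,xmm1;
// abs: xmm0 = |mod| //
andps xmm0,abs;
// mix: out = tap5 + |mod|*(xmm7 - tap5) //
subps xmm7,xmm5;
mulps xmm7,xmm0;
addps xmm5,xmm7;
movaps out,xmm5;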


Combining the 4 channels in series, we get a little vibrato. :)
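
As a reading aid, here is a scalar Python model of what the assembly above appears to compute for a single channel. It is a sketch based on my reading of the instructions, not a replacement for the FlowStone code; `mini_mod_delay` and `history` are illustrative names. The output sits at the 5-sample tap for mod = 0 and crossfades towards the 3-sample tap for negative mod or the 7-sample tap for positive mod.

```python
def mini_mod_delay(signal, mod):
    """Scalar model: crossfade from the 5-sample tap towards 3 or 7 samples."""
    history = [0.0] * 7  # history[k] is the input delayed by k+1 samples
    out = []
    for x, m in zip(signal, mod):
        d3, d5, d7 = history[2], history[4], history[6]
        target = d3 if m < 0 else d7             # the cmpps/andps/addps selection
        out.append(d5 + abs(m) * (target - d5))  # the abs and mix steps
        history = [x] + history[:-1]             # shift the delay chain
    return out


ramp = [float(n) for n in range(1, 11)]
centre = mini_mod_delay(ramp, [0.0] * 10)   # plain 5-sample delay
short = mini_mod_delay(ramp, [-1.0] * 10)   # plain 3-sample delay
long_ = mini_mod_delay(ramp, [1.0] * 10)    # plain 7-sample delay
```

Intermediate mod values blend two taps rather than reading a true fractional position, so in this model the 4-sample and 6-sample taps never contribute to the output.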
Attachment: Minimodbucked v2.fsm
Last edited by Tepeix on Fri Feb 11, 2022 10:38 pm, edited 2 times in total.