Please evaluate my perception of these selectors in DSP/ASM
Hello all
The more I program, the more I find myself using multiplexers, selectors, and logic. This post examines "2-input, 1-output" selectors WRITTEN IN DSP/ASM, with some further questions that I hope you can help me with.
Prim Method (visual concept)
I have the code for several selectors that all aim to produce the same result from identical inputs.
Each implementation is the simplest/most optimized version of its approach that I could find on the forum. Each has its conventional purpose, which I will describe, but my questions are:
- When performing the task coded below, which implementation is the most efficient? (cpu, most important question)
- Which methods introduce latency?
- Which method is the most accurate?
- Which methods will have a higher chance of introducing noise/pop/dropout/break and why?
- Are they all identical? (I think not)
I have experienced problems with stream boolean switches when modulating quickly at sample rate (no hop), which leads me to believe these are all accomplishing the task in a different way (apparent in asm code).
Optimized Selector Method
DSP:
streamin select;
streamin in0;
streamin in1;
streamout out;
out=in0&(select==0);
out=out+in1&(select==1);
ASM:
streamin select, in0, in1;
streamout out;
float _F_0=0, _F_1=1;
// Comparison
movaps xmm0,select;
cmpps xmm0,_F_0,0;
// '&' Operator
andps xmm0,in0;
// Assignment to 'out'
movaps out,xmm0;
// Comparison
movaps xmm0,select;
cmpps xmm0,_F_1,0;
// '&' Operator
andps xmm0,in1;
// '+' Operator
addps xmm0,out;
// Assignment to 'out'
movaps out,xmm0;
^^ Pretty straightforward method. In 9 ASM instructions it compares the selector against constants with the cmpps instruction and passes the corresponding input to the output. It also has the bonus that you can add inputs simply by adding DSP lines like:
out=out+in2&(select==2);
out=out+in3&(select==3);
Cool, but that bulks it up, and we're only concerned with 2 inputs here.
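To make the masking idea concrete, here is a small Python model of a single SSE lane. It is not FlowStone code, and the helper names (`bits`, `unbits`, `cmpeq`, `select_masked`) are made up for illustration. It assumes that cmpps with predicate 0 writes an all-ones pattern when the operands are equal and all zeros otherwise, and that andps is a plain bitwise AND.

```python
import struct

ALL_ONES = 0xFFFFFFFF  # pattern assumed to be written by cmpps when the test is true


def bits(x):
    """Return the 32-bit IEEE-754 single-precision pattern of x."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def unbits(pattern):
    """Reinterpret a 32-bit pattern as a single-precision float."""
    return struct.unpack("<f", struct.pack("<I", pattern))[0]


def cmpeq(a, b):
    """Model one lane of cmpps with predicate 0 (equal)."""
    return ALL_ONES if a == b else 0


def select_masked(select, inputs):
    """Model: out = in0&(select==0); out = out + in1&(select==1); ..."""
    out = 0.0
    for index, value in enumerate(inputs):
        kept = unbits(bits(value) & cmpeq(select, float(index)))  # andps
        out = unbits(bits(out + kept))  # addps, rounded back to single precision
    return out


if __name__ == "__main__":
    print(select_masked(0.0, [0.25, -0.5]))  # 0.25
    print(select_masked(1.0, [0.25, -0.5]))  # -0.5
    print(select_masked(0.5, [0.25, -0.5]))  # 0.0
```

Note the last case: in this model, if select is anything other than exactly 0 or 1, no mask matches and the output is silence. That may be one source of dropouts when the selector is driven by a signal that is not strictly 0/1.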
If Then Else Method
DSP:
streamin select;
streamin in0;
streamin in1;
streamout out;
out = in1 + (in0 - in1)&(select == 0);
ASM:
streamin select, in0, in1;
streamout out;
float _F_0=0;
// '-' Operator
movaps xmm0,in0;
subps xmm0,in1;
// Comparison
movaps xmm1,select;
cmpps xmm1,_F_0,0;
// '&' Operator
andps xmm0,xmm1;
// '+' Operator
addps xmm0,in1;
// Assignment to 'out'
movaps out,xmm0;
^^ In 7 ASM instructions this accomplishes the same thing: it uses cmpps to test whether the selector equals 0, then chooses between the 2 inputs based on the result. Same thing with less code, so more efficient, right?
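On the accuracy question, a rough single-lane Python model of this expression (again not FlowStone code; `f32` and `if_then_else` are illustrative names) suggests the result is not always bit-exact. The sketch assumes each SSE instruction rounds to single precision and that the mask either keeps the difference or zeroes it.

```python
import struct


def f32(x):
    """Round x to single precision, as each SSE instruction would."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def if_then_else(select, in0, in1):
    """Model one lane of: out = in1 + (in0 - in1)&(select == 0)."""
    diff = f32(f32(in0) - f32(in1))        # subps
    kept = diff if select == 0.0 else 0.0  # cmpps + andps: keep diff or zero it
    return f32(f32(in1) + kept)            # addps


if __name__ == "__main__":
    print(if_then_else(0.0, 0.25, -0.5))   # 0.25
    print(if_then_else(1.0, 0.25, -0.5))   # -0.5
    print(if_then_else(0.0, 1e-8, 1.0))    # 0.0, although in0 was selected
```

In the last case in0 is tiny compared with in1, so `in0 - in1` rounds to -1.0 and adding in1 back gives 0.0 instead of in0. Similarly, in this model an infinite or NaN value on in1 turns the output into NaN even while in0 is selected. Unlike the masked-sum selector, any non-zero select value simply passes in1.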
Crossfade Method:
DSP:
streamin select;
streamin in0;
streamin in1;
streamout out;
out = (select*in0)+((1-select)*in1);
ASM:
streamin select, in0, in1;
streamout out;
float _F_1=1;
// '*' Operator
movaps xmm0,select;
mulps xmm0,in0;
// '-' Operator
movaps xmm1,_F_1;
subps xmm1,select;
// '*' Operator
mulps xmm1,in1;
// '+' Operator
addps xmm0,xmm1;
// Assignment to 'out'
movaps out,xmm0;
This method is also only 7 instructions, but it multiplies both inputs and outputs the sum.
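A minimal Python sketch of the same formula (the function name `crossfade` is just for illustration) shows two things worth keeping in mind. First, the polarity is the reverse of the two masked methods above: select = 1 passes in0 and select = 0 passes in1. Second, the selector has to be a numeric gain between 0 and 1; if an all-ones bit mask were fed in directly, it would read as NaN when treated as a float (an assumption about how stream bools are represented), and the NaN would propagate to the output.

```python
def crossfade(select, in0, in1):
    """Model: out = (select*in0) + ((1-select)*in1)."""
    return select * in0 + (1.0 - select) * in1


if __name__ == "__main__":
    print(crossfade(1.0, 0.25, -0.5))  # 0.25  (in0)
    print(crossfade(0.0, 0.25, -0.5))  # -0.5  (in1)
    print(crossfade(0.5, 0.25, -0.5))  # -0.125 (equal mix)
```

The fractional case is what makes a gradual transition possible, which the two masked methods cannot do.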
All of these assume you use a boolean, a boolean-to-mono comparator, or some other equivalent code or signal to generate the selector. If you're clever, you could use the input data itself to generate the control variable.
Now, while the first option (the selector) lets you have more inputs, I typically reach for the if/then/else out of simplicity and comfort, and after investigating I assume it is more efficient. That said, if memory serves, I've experienced pops, dropouts, and outright failure with certain inputs when modulating the selector quickly with the first two methods. For this reason I have at times used the crossfade, sometimes with a gradient to actually crossfade the inputs. When creating complex devices you can imagine this becoming a problem if the CPU spikes or the switching becomes unreliable. Since the crossfade is 7 instructions, I assume it is at least as efficient as the if/then/else method, but maybe mulps isn't as efficient as cmpps. Which is the better of the two? If the crossfade is best, would it be better to do it with the stream math functions?
Thanks, hope I informed someone who didn't grasp these tools, and hopefully someone can enlighten me on the details. I'm no expert.
-
guyman - Posts: 207
- Joined: Fri Mar 02, 2018 8:27 pm
Re: Please evaluate my perception of these selectors in DSP/
If you are using the FS alpha, you can also use this:
streamin select, in1, in2;
streamout out;
float F0 = 0;
movaps xmm0,select;
cmpps xmm0,F0,0;
movaps xmm1,xmm0;
andps xmm0,in1;
andnps xmm1,in2;
orps xmm0,xmm1;
movaps out,xmm0;
It uses the andnps (and-not) instruction; I don't know if it's available in older FlowStone versions.
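As a rough illustration of what the andps/andnps/orps sequence does, here is a single-lane Python model (not FlowStone code; `bits`, `unbits`, `blend` and `select_blend` are hypothetical helper names). It assumes andnps computes (NOT destination) AND source, so the copy of the mask in xmm1 is inverted before it gates in2.

```python
import struct

MASK32 = 0xFFFFFFFF


def bits(x):
    """Return the 32-bit IEEE-754 single-precision pattern of x."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def unbits(pattern):
    """Reinterpret a 32-bit pattern as a single-precision float."""
    return struct.unpack("<f", struct.pack("<I", pattern))[0]


def blend(mask, a, b):
    """(mask AND a) OR (NOT mask AND b), as with andps / andnps / orps."""
    return unbits((mask & bits(a)) | (~mask & MASK32 & bits(b)))


def select_blend(select, in1, in2):
    """Model the listing above: the mask is all ones when select == 0."""
    mask = MASK32 if select == 0.0 else 0  # cmpps xmm0,F0,0
    return blend(mask, in1, in2)


if __name__ == "__main__":
    print(select_blend(0.0, 0.25, -0.5))  # 0.25
    print(select_blend(1.0, 0.25, -0.5))  # -0.5
```

Because this is a pure bit-level blend, the chosen input passes through unchanged in this model, with no add or subtract that could round it.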
- adamszabo
- Posts: 667
- Joined: Sun Jul 11, 2010 7:21 am
Re: Please evaluate my perception of these selectors in DSP/
Interesting topic!
For anyone reading this who doesn’t know, when you operate a selector prim the schematic gets “recompiled”. This means that if there’s any live audio running there will be a brief stream interruption often causing a click. That doesn’t happen with multiplexor prims.
For this reason, if I have a green bool control, I use 2 stream multipliers, with the inverted bool feeding one of them, and connect the outputs together. Then I put a de-zipper on the control; since the de-zipper is linear, this produces a smooth cross-fade. I think this is important if the user might want to automate the switching, but if it’s a static setting, to choose a source for example, then it’s not so important.
Selecting at stream rates will also produce clicks unless the instantaneous value of both inputs is exactly the same when the control bool changes. So I would use the cross-fade option to avoid this effect.
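The click argument can be sketched numerically. The Python below is only an illustration, not FlowStone code: the function names and the 32-sample linear ramp are assumptions standing in for a de-zipped control. It compares the largest sample-to-sample jump of a hard switch with that of a linear cross-fade between two inputs that differ at the switch point.

```python
def hard_switch(a, b, switch_at):
    """Output a until switch_at, then b, with no transition."""
    return [a[n] if n < switch_at else b[n] for n in range(len(a))]


def ramped_crossfade(a, b, switch_at, ramp_len):
    """Fade linearly from a to b over ramp_len samples starting at switch_at."""
    out = []
    for n in range(len(a)):
        gain = min(max((n - switch_at) / ramp_len, 0.0), 1.0)
        out.append((1.0 - gain) * a[n] + gain * b[n])
    return out


def max_step(signal):
    """Largest absolute difference between neighbouring samples."""
    return max(abs(y - x) for x, y in zip(signal, signal[1:]))


if __name__ == "__main__":
    a = [0.5] * 64
    b = [-0.5] * 64
    print(max_step(hard_switch(a, b, 16)))           # 1.0
    print(max_step(ramped_crossfade(a, b, 16, 32)))  # 0.03125
```

With these constant test inputs the hard switch jumps by the full difference in one sample, while the ramp spreads the same change over 32 small steps.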
-
Spogg - Posts: 3358
- Joined: Thu Nov 20, 2014 4:24 pm
- Location: Birmingham, England
Re: Please evaluate my perception of these selectors in DSP/
Just to add another out-of-bounds 'You can do this in FS Alpha!' helper ...
streamin in1, in2;
streamboolin select;
streamout out;
out = in1&select + in2&!select;
which in assembler translates into :
streamin in1, in2;
streamboolin select;
streamout out;
movaps xmm0,in1;
andps xmm0,select;
xorps xmm1,xmm1; // sets all xmm1 bits to zero for bool functionality
cmpps xmm1,select,0;
andps xmm1,in2;
addps xmm0,xmm1;
movaps out,xmm0;
Same number of instructions but marginally quicker without the float variable, more so if adding channels to it. Whatever will we do with all these spare nanoseconds?
H
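The `!select` part of the listing above relies on a small trick: xmm1 is zeroed and then compared for equality with the bool. A single-lane Python model (not FlowStone code; the helper names are hypothetical, and it assumes a stream bool is an all-ones or all-zeros 32-bit pattern) shows why that inverts the mask. An all-ones pattern decodes as NaN, which never compares equal to 0.0, while all zeros decodes as 0.0, which does.

```python
import struct

MASK32 = 0xFFFFFFFF


def bits(x):
    """Return the 32-bit IEEE-754 single-precision pattern of x."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def unbits(pattern):
    """Reinterpret a 32-bit pattern as a single-precision float."""
    return struct.unpack("<f", struct.pack("<I", pattern))[0]


def not_via_cmpeq_zero(mask):
    """Model xorps xmm1,xmm1 followed by cmpps xmm1,select,0."""
    lane = unbits(mask)  # all ones decodes as NaN, all zeros as 0.0
    return MASK32 if 0.0 == lane else 0


def select_bool(mask, in1, in2):
    """Model: out = in1&select + in2&!select."""
    first = unbits(bits(in1) & mask)                       # andps
    second = unbits(bits(in2) & not_via_cmpeq_zero(mask))  # cmpps + andps
    return unbits(bits(first + second))                    # addps


if __name__ == "__main__":
    print(select_bool(MASK32, 0.25, -0.5))  # 0.25
    print(select_bool(0, 0.25, -0.5))       # -0.5
```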
-
HughBanton - Posts: 265
- Joined: Sat Apr 12, 2008 3:10 pm
- Location: Evesham, Worcestershire
Re: Please evaluate my perception of these selectors in DSP/
As you were .. I imagined streamboolin wasn't included in 3.0.6 but of course it is. Not out-of-bounds after all
-
HughBanton - Posts: 265
- Joined: Sat Apr 12, 2008 3:10 pm
- Location: Evesham, Worcestershire
Re: Please evaluate my perception of these selectors in DSP/
Thanks for the feedback fellas. SO:
Is using a bool variable as the selector "leaner" on CPU than using a float?
Are mulps and cmpps equivalent in terms of CPU in these "7 instruction" variants?
If they are equivalent, the crossfade method offers the most flexibility while mitigating noise, unless it costs much more CPU because it processes a float. It's all ones and zeros, but I guess a 1-digit bool never has to leave its format, unless you need the gradient...
This all adds up if you are using a lot of logic to govern complex machines.
-
guyman - Posts: 207
- Joined: Fri Mar 02, 2018 8:27 pm
Re: Please evaluate my perception of these selectors in DSP/
HughBanton wrote: Same number of instructions but marginally quicker without the float variable, more so if adding channels to it.
H
But once you add more channels you have to change that boolin to a streamin to select index values greater than 1, which brings you back to "stream select + 7 instructions"...
Are all things equal once you shave it down to this? What's the CPU difference between the cmpps methods and the crossfading mulps?
-
guyman - Posts: 207
- Joined: Fri Mar 02, 2018 8:27 pm
Re: Please evaluate my perception of these selectors in DSP/
Just wanted to say "Thanks" for posting the topic. It fits the "questions I never knew I had" section of my brain nicely.
As an example of why discussions like this help me: I have never been happy with my failure to grok a simple IF/THEN/ELSE for things like switching off modules if there's no audio input. Even though I'm the last thing from a natural-born programmer, having grown up on BASIC, there ARE situations where I'm looking at a schematic and thinking "I usually think in pictures, but right now I really just want to define my variables and perform relatively basic operations without Rube Goldberg being involved".
Sometimes it takes a thread like this, with a few example variations, to remove the wool from my eyes.
We have to train ourselves so that we can improvise on anything... a bird, a sock, a fuming beaker! This, too, can be music. Anything can be music. -Biff Debris
-
Duckett - Posts: 132
- Joined: Mon Dec 14, 2015 12:39 am
Re: Please evaluate my perception of these selectors in DSP/
guyman wrote:
HughBanton wrote: Same number of instructions but marginally quicker without the float variable, more so if adding channels to it.
H
but once you add more channels you have to change that boolin to a streamin to select index values greater than 1, thus returning equivalence to "stream select + 7 instructions" ...
Are all things equal once you shave it down to this? what's the cpu dif of the cmpps methods and the crossfading mulps?
Ah .. I was thinking 'channels' as in stereo - more ins & outs. But for replicating a selector component, with an int 'select' value, you would, as you say, need to use a regular streamin input, so there might be little in it speed-wise. You could of course decode the select number externally in green, producing a number of bools, and then provide a streamboolin input for each of them into dsp/assem, but that would only suit if you don't need to switch at sample rate. Depends on the application.
Re comparing the cpu demands of the various instructions, Trog warned me recently not to rely too heavily on this when trying to estimate the speed of different dsp approaches; modern processors do endless behind-the-scenes speed tricks, caching and predicting, that will very likely confound any human guesstimate!
But fwiw, I believe it runs something like this (won't be strictly accurate I'm sure):
Fastest : mov(aps), andps, andnps, orps, etc.
Next : addps, subps, any compare type, xorps
Slower : mulps
Considerably slower : divps
Worst (in dsp) : %(modulo), ^(power)
The more unpredictable bit is anything that involves a memory lookup, i.e. any declared floats and ints, (but applies also to all the dsp/assem inputs and outputs .. anything that comes up in green), because how long it takes to read/write from/to memory depends on what else is going on, and there we really are in the lap of the cpu. Always well worth optimising, as little green in the code as possible.
While I'm here, I discovered that Adam's code using andnps can be streamlined even further, like this :
streamin in1, in2;
streamboolin switch;
streamout out;
xorps xmm0,xmm0;
movaps xmm0,switch;
movaps xmm1,xmm0; //copy switch to xmm1
andps xmm0,in1;
andnps xmm1,in2;
orps xmm0,xmm1;
movaps out,xmm0;
Should be about the most efficient method.
H
-
HughBanton - Posts: 265
- Joined: Sat Apr 12, 2008 3:10 pm
- Location: Evesham, Worcestershire
Re: Please evaluate my perception of these selectors in DSP/
Nice find, Hugh! I don't know why it works, but you can remove the xorps line, since whatever you do to xmm0 there gets overwritten by the movaps xmm0,switch that follows.
- adamszabo
- Posts: 667
- Joined: Sun Jul 11, 2010 7:21 am