Use Ringbuffer for Input&output Buffer #157

howard0su · 2021-01-14T13:45:36Z

No description provided.

howard0su · 2021-01-14T13:46:02Z

The perf is not good after this change. Maybe we need to change the output to a ringbuffer as well; I'm not sure. Please review.
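
For readers unfamiliar with the idea, here is a minimal sketch of a single-producer/single-consumer ring of fixed-size blocks, the kind of structure this PR is about. It is not the code from this branch: the names `BlockRing`, `try_push` and `try_pop` are hypothetical, and it assumes exactly one producer thread and one consumer thread.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical SPSC ring of fixed-size blocks (illustrative, not the PR's class).
// One slot is always left empty so that "full" and "empty" can be told apart.
template <typename T>
class BlockRing {
public:
    BlockRing(std::size_t block_count, std::size_t block_len)
        : block_len_(block_len), count_(block_count),
          storage_(block_count * block_len) {
        assert(block_count >= 2 && block_len > 0);
    }

    // Producer side: copy one block in. Returns false when the ring is full.
    bool try_push(const T* src) {
        const std::size_t w = write_.load(std::memory_order_relaxed);
        const std::size_t next = (w + 1) % count_;
        if (next == read_.load(std::memory_order_acquire)) return false;  // full
        std::copy(src, src + block_len_, storage_.begin() + w * block_len_);
        write_.store(next, std::memory_order_release);  // publish the block
        return true;
    }

    // Consumer side: copy one block out. Returns false when the ring is empty.
    bool try_pop(T* dst) {
        const std::size_t r = read_.load(std::memory_order_relaxed);
        if (r == write_.load(std::memory_order_acquire)) return false;  // empty
        std::copy(storage_.begin() + r * block_len_,
                  storage_.begin() + (r + 1) * block_len_, dst);
        read_.store((r + 1) % count_, std::memory_order_release);  // free the slot
        return true;
    }

private:
    const std::size_t block_len_;
    const std::size_t count_;
    std::vector<T> storage_;
    std::atomic<std::size_t> write_{0};  // next slot the producer fills
    std::atomic<std::size_t> read_{0};   // next slot the consumer drains
};
```

With this layout a ring of N slots holds at most N-1 blocks, which keeps the full/empty test free of extra counters.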

howard0su · 2021-01-14T23:40:52Z

I am thinking we may want to use a callback (IOCompletionPort) in the USB stack.

ik1xpv

Looks fine to me. I tested the code on an I7-3770. It runs faster by a few percent.
There is still the #158 problem, which does not depend on this code.

howard0su · 2021-01-19T09:43:28Z

Unfortunately, it is slower than before on my laptop, which has an I7-8665. I need to do more testing, and I am also waiting for the fix for dynamic extio_len.

howard0su · 2021-01-21T14:57:23Z

Please review again. It no longer uses a dynamic ext_blocklen, so it should work with the current HDSDR. I need help validating the performance. On my laptop I cannot see much difference, as 32M cannot work well anyway.

ik1xpv · 2021-01-21T17:24:41Z

I ran some tests on the I7-3770 (my laptop still has a temperature problem :-)
I tested with a 64M ADC clock and USB 12k audio, with the LO set to a frequency that is not an exact bin multiple so that the shift is activated.
Both builds are compiled in release mode.

IF sample rate               32M  16M  8M  4M  2M  1M  0.5M
this branch #4f54221, CPU%    27   17  14  13  12  12    12
ver 1.1.0, CPU%               16   12  10   9   8   -     -

The old version still seems faster :-(

ik1xpv · 2021-01-21T17:48:26Z

I ran a test on my laptop at a sample rate of 8M. v1.1.0 is faster: 23-24% vs 29-30% for this branch.

howard0su · 2021-04-11T15:33:11Z

No intention to commit this, as it is a perf regression.

ik1xpv · 2021-04-25T16:24:20Z

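I compared the Windows build compiled by the action against master. I'm using the old I7-3770 :-)

[image: justarun] <https://user-images.githubusercontent.com/9883800/116001110-3c2ab580-a5f3-11eb-8719-479d76dcb6c9.jpg>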

howard0su · 2021-04-26T00:43:58Z

Thank you for the testing. The result is actually expected. The goal of this PR is to decouple the input and output so that I can add the functions to support multiple channels in a cleaner way. Since we have more threads here and need to coordinate between them, I added some busy loops in the code. I defined the following:

const int spin_count = 1000000;

The number may need adjusting to reduce CPU usage. As long as we don't see the overall perf slow down, it is fine to use a bit more CPU.
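
To illustrate the trade-off behind `spin_count`, here is a minimal sketch of a "spin first, then block" wait. It is an assumption about the general technique, not the code in this branch: only the name `spin_count` comes from the comment above, while `ReadySignal`, `notify` and `wait` are hypothetical. A larger spin count lowers wake-up latency at the cost of CPU time burned while idle.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Illustrative value; the thread above discusses 1000000 and later 100.
const int spin_count = 100;

// Hypothetical one-shot signal between one notifier and one waiter.
class ReadySignal {
public:
    void notify() {
        {
            // Set the flag under the mutex so a blocked waiter cannot miss it.
            std::lock_guard<std::mutex> lock(m_);
            ready_.store(true, std::memory_order_release);
        }
        cv_.notify_one();
    }

    // Returns true if the signal was consumed during the spin phase,
    // false if the caller fell through to the condition variable.
    bool wait() {
        for (int i = 0; i < spin_count; ++i) {
            if (ready_.exchange(false, std::memory_order_acq_rel)) return true;
            std::this_thread::yield();  // be polite to other threads while spinning
        }
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return ready_.load(std::memory_order_acquire); });
        ready_.store(false, std::memory_order_relaxed);  // consume the signal
        return false;
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    std::atomic<bool> ready_{false};
};
```

If data usually arrives within a few iterations, the waiter never touches the mutex; if it does not, the thread sleeps instead of burning a core.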


howard0su · 2021-04-26T10:46:23Z

Hi Oscar, I added one more change that reduces spin_count to 100, which seems to help CPU usage a bit. It would be great if you could test the 64M bandwidth to see if there is any regression. The focus here is turning the current implementation into a pipeline so that I can add more processing to it without complicating the code too much. As the next step I plan to add more channels, so that each channel is processed by one thread for the iFFT, and also to support sample rates down to 48 kHz with some software decimation after the current FFT approach. The decimation will be processed in another thread as well. This requires the whole process to run in a pipeline fashion.


ik1xpv · 2021-04-26T17:17:37Z

Yes. CMake 198 is faster than CMake 196. It still seems a little slower than master, CMake 194.

I will continue testing tomorrow. Thanks for the new code architecture !

ik1xpv · 2021-04-27T08:31:03Z

I made a comparison using Open Hardware Monitor to trace the CPU load, with wifi disabled.

I repeated the test several times, and CMake 198 looks a little better than master CMake 194.
The time window used is 8 minutes.

howard0su · 2021-04-27T11:03:00Z

Please also focus on performance in addition to CPU usage. This version still cannot play 64M well on my laptop. If there is no objection, I will first commit this version and then start breaking the current r2iq into 3 stages (or maybe 3).
Stage 1: Convert samples into frequency-domain samples
Stage 2: Shift the frequency-domain samples to the right LO, apply the filter, decimate, and do the iFFT
Stage 3: Do the fine tune
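
The staged layout above can be sketched as a chain of threads connected by blocking queues. This is only a rough illustration under stated assumptions: `Channel`, `run_stage` and `run_pipeline` are hypothetical names, and the stage functions are passed in by the caller as placeholders rather than implementing the FFT, shift/filter/iFFT, or fine-tune steps.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

using Block = std::vector<float>;
using StageFn = std::function<Block(Block)>;

// Hypothetical blocking FIFO; close() tells the consumer the stream has ended.
class Channel {
public:
    void push(Block b) {
        { std::lock_guard<std::mutex> lock(m_); q_.push_back(std::move(b)); }
        cv_.notify_one();
    }
    void close() {
        { std::lock_guard<std::mutex> lock(m_); closed_ = true; }
        cv_.notify_all();
    }
    // Blocks until a block is available; returns false once closed and drained.
    bool pop(Block& out) {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return !q_.empty() || closed_; });
        if (q_.empty()) return false;
        out = std::move(q_.front());
        q_.pop_front();
        return true;
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    std::deque<Block> q_;
    bool closed_ = false;
};

// One stage: pull from `in`, transform, push to `out`, then propagate the close.
void run_stage(Channel& in, Channel& out, const StageFn& fn) {
    Block b;
    while (in.pop(b)) out.push(fn(std::move(b)));
    out.close();
}

// Wires three caller-supplied stages together, one thread per stage.
std::vector<Block> run_pipeline(const std::vector<Block>& input,
                                const StageFn& s1, const StageFn& s2,
                                const StageFn& s3) {
    Channel c0, c1, c2, c3;
    std::thread t1([&] { run_stage(c0, c1, s1); });
    std::thread t2([&] { run_stage(c1, c2, s2); });
    std::thread t3([&] { run_stage(c2, c3, s3); });
    for (const Block& b : input) c0.push(b);
    c0.close();
    std::vector<Block> result;
    Block b;
    while (c3.pop(b)) result.push_back(b);  // FIFO order is preserved end to end
    t1.join(); t2.join(); t3.join();
    return result;
}
```

Because each stage owns its thread and only talks to its neighbours through a queue, adding a further stage (for example a decimator) means adding one more channel and one more thread, without touching the existing stages.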

ik1xpv · 2021-04-27T13:45:43Z

Howard,
I played with CMake 198 for a while and it seems to me equal to or better than CMake 194.
Here is a tone reception of my 20MHz reference.

It looks fine, with no phase discontinuity.
I made a comparison test with my laptop (ADC clk 64M, IF 32M).

The two releases have very similar performance.
Feel free to merge and test the new architecture.
Thanks :-)

ik1xpv

Looks great! Bravo

howard0su · 2021-04-27T14:47:09Z

Thank you for all your testing.

My next PR will be even bigger in terms of changes.

howard0su requested review from hayguen and ik1xpv January 14, 2021 13:46

ik1xpv approved these changes Jan 19, 2021


howard0su changed the title ~~Use Ringbuffer for InputBuffer~~ Use Ringbuffer for Input&output Buffer Jan 19, 2021

howard0su force-pushed the ringbuffer branch from 72aa27f to 1dfb5e3 Compare January 21, 2021 13:15

howard0su added 3 commits January 21, 2021 22:43

use Ringbuffer for input and output

7104ce8

Still use fixed EXT_BLOCKLEN

326e456

Fix Linux build

4f54221

howard0su force-pushed the ringbuffer branch from 1dfb5e3 to 4f54221 Compare January 21, 2021 14:44

howard0su marked this pull request as ready for review January 21, 2021 14:56

howard0su requested a review from ik1xpv January 21, 2021 14:56

howard0su mentioned this pull request Jan 21, 2021

Add more sample rates #156

Closed

howard0su closed this Apr 11, 2021

howard0su reopened this Apr 20, 2021

howard0su added 2 commits April 25, 2021 21:47

Merge remote-tracking branch 'origin/ringbuffer' into ringbuffer

3b66438

Merge remote-tracking branch 'upstream/master' into ringbuffer

109fe4c

howard0su added 2 commits April 26, 2021 09:58

Fix linux build

0e6acc3

reduce spin count to 100

a851c26

ik1xpv approved these changes Apr 27, 2021


howard0su merged commit 92323c1 into ik1xpv:master Apr 27, 2021
