Use Ringbuffer for Input&output Buffer #157

Merged
merged 7 commits into from
Apr 27, 2021

howard0su
Collaborator

No description provided.

@howard0su
Collaborator Author

Performance is not good after this change. Maybe we need to change the output to a ring buffer as well; I'm not sure. Please review.
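For readers unfamiliar with the pattern being discussed, here is a minimal sketch of a blocking ring buffer shared by one producer and one consumer. It is written in Python for brevity and is not the code from this PR; the class name `RingBuffer`, its methods, and the `demo` helper are hypothetical names chosen for illustration.

```python
import threading


class RingBuffer:
    """Fixed pool of equally sized blocks shared by a producer and a consumer."""

    def __init__(self, block_count, block_size):
        self._blocks = [bytearray(block_size) for _ in range(block_count)]
        self._count = block_count
        self._size = block_size
        self._read = 0    # index of the next block to consume
        self._filled = 0  # number of blocks waiting to be consumed
        self._cond = threading.Condition()

    def write(self, data):
        """Copy data into the next free block, waiting while the ring is full."""
        if len(data) != self._size:
            raise ValueError("data must be exactly one block long")
        with self._cond:
            while self._filled == self._count:
                self._cond.wait()
            index = (self._read + self._filled) % self._count
            self._blocks[index][:] = data
            self._filled += 1
            self._cond.notify_all()

    def read(self):
        """Return a copy of the oldest filled block, waiting while the ring is empty."""
        with self._cond:
            while self._filled == 0:
                self._cond.wait()
            block = bytes(self._blocks[self._read])
            self._read = (self._read + 1) % self._count
            self._filled -= 1
            self._cond.notify_all()
            return block


def demo(block_count=4, block_size=8, total_blocks=32):
    """Push total_blocks through the ring from a producer thread and collect them."""
    ring = RingBuffer(block_count, block_size)

    def producer():
        for i in range(total_blocks):
            ring.write(bytes([i % 256]) * block_size)

    thread = threading.Thread(target=producer)
    thread.start()
    received = [ring.read() for _ in range(total_blocks)]
    thread.join()
    return received
```

The blocks are allocated once and reused, so the steady state needs no per-block allocation; whether that helps or hurts overall throughput depends on the locking and copying around it, which is what the measurements in this thread are about.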

@howard0su howard0su requested review from hayguen and ik1xpv January 14, 2021 13:46
@howard0su
Collaborator Author

I am thinking we may want to use a callback (I/O completion port) in the USB stack.
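The general idea of handling each transfer in a callback when it completes, rather than blocking on transfers one by one, can be sketched in a platform-neutral way. This sketch does not use Windows I/O completion ports or any USB API; `fake_transfer`, `on_complete`, and `run_transfers` are hypothetical stand-ins, and a thread pool plays the role of the asynchronous I/O layer.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def fake_transfer(index, size):
    """Hypothetical stand-in for one asynchronous USB transfer."""
    time.sleep(0.01)  # pretend the hardware needs some time
    return index, bytes(size)


def run_transfers(count=8, size=16):
    """Queue count transfers and handle each one in a callback as it completes."""
    completed = []
    lock = threading.Lock()

    def on_complete(future):
        # Invoked once the transfer has finished, so the submitting code
        # never has to block on an individual transfer.
        index, data = future.result()
        with lock:
            completed.append((index, len(data)))

    with ThreadPoolExecutor(max_workers=4) as pool:
        for i in range(count):
            pool.submit(fake_transfer, i, size).add_done_callback(on_complete)
    # Leaving the with-block waits for the workers, so every callback has run.
    return sorted(completed)
```

Completion order is not guaranteed with several transfers in flight, which is why the sketch sorts the results before returning them.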

Owner

@ik1xpv ik1xpv left a comment

Looks fine to me. I tested the code on an i7-3770. It runs faster by a few percent.
There is still the #158 problem, which does not depend on this code.

@howard0su
Collaborator Author

howard0su commented Jan 19, 2021 via email

@howard0su howard0su changed the title Use Ringbuffer for InputBuffer Use Ringbuffer for Input&output Buffer Jan 19, 2021
@howard0su howard0su marked this pull request as ready for review January 21, 2021 14:56
@howard0su howard0su requested a review from ik1xpv January 21, 2021 14:56
@howard0su
Collaborator Author

Please review again. It no longer uses a dynamic ext_blocklen, so it should work with the current HDSDR. I need help validating the performance. On my laptop I cannot see much difference, as 32M does not work well anyway.

@howard0su howard0su mentioned this pull request Jan 21, 2021
@ik1xpv
Owner

ik1xpv commented Jan 21, 2021

I ran some tests on the i7-3770 (my laptop still has a temperature problem :-)
I tested with a 64M ADC clock, USB, 12k audio, and the LO at a frequency that is not an exact bin multiple, so that the shift is active.
Both builds are compiled in release mode.

IF sample rate                32M  16M   8M   4M   2M   1M  0.5M
this branch #4f54221, CPU %    27   17   14   13   12   12    12
ver 1.1.0, CPU %               16   12   10    9    8    -     -

The old version still seems faster :-(
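To put the figures above in relative terms, the ratio of the two CPU loads can be computed per sample rate. The numbers are copied from the comparison above; the variable and function names are only for illustration, and 1M and 0.5M are left out because no v1.1.0 figure was reported for them.

```python
# CPU % figures copied from the comparison above, keyed by IF sample rate.
branch = {"32M": 27, "16M": 17, "8M": 14, "4M": 13, "2M": 12}
v110 = {"32M": 16, "16M": 12, "8M": 10, "4M": 9, "2M": 8}


def cpu_ratio(new, old):
    """Return the new/old CPU load per sample rate, rounded to two decimals."""
    return {rate: round(new[rate] / old[rate], 2) for rate in old}


ratios = cpu_ratio(branch, v110)
for rate, ratio in ratios.items():
    print(f"{rate}: this branch uses {ratio}x the CPU of v1.1.0")
```

On these numbers the branch uses between 1.4x and about 1.7x the CPU of v1.1.0, with the largest gap at 32M.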

@ik1xpv
Owner

ik1xpv commented Jan 21, 2021

I ran a test on my laptop with an 8M sample rate. v1.1.0 is faster: 23-24% vs. 29-30% for this branch.

@howard0su
Collaborator Author

No intention to commit this, since it is a perf regression.

@howard0su howard0su closed this Apr 11, 2021
@howard0su howard0su reopened this Apr 20, 2021
@ik1xpv
Owner

ik1xpv commented Apr 25, 2021

I compared the Windows build compiled by the Action against master. I'm using the old i7-3770 :-)
[image: justarun]

@howard0su
Collaborator Author

howard0su commented Apr 26, 2021 via email

@howard0su
Collaborator Author

howard0su commented Apr 26, 2021 via email

@ik1xpv
Owner

ik1xpv commented Apr 26, 2021

Yes. CMake 198 is faster than CMake 196. It still seems a little slower than master (CMake 194).
[image: justarun2]

I will continue testing tomorrow. Thanks for the new code architecture!

@ik1xpv
Owner

ik1xpv commented Apr 27, 2021

I made a comparison using Open Hardware Monitor to trace the CPU load. I disabled Wi-Fi.
[image: justarun3]
I repeated the test several times, and CMake 198 looks a little better than master (CMake 194).
The time window used is 8 minutes.

@howard0su
Collaborator Author

Please also focus on performance in addition to CPU usage. This version still cannot play 64M well on my laptop. If there is no objection, I will first commit this version and then start breaking the current r2iq into 3 stages (or maybe 3).
Stage 1: Convert the samples into frequency-domain samples
Stage 2: Shift the frequency-domain samples to the right LO, apply the filter, decimate, and do the iFFT
Stage 3: Do the fine tune
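The three stages listed above can be sketched on a single block of samples as follows. This is an illustration only, not the r2iq code: a naive DFT stands in for a real FFT library, stage 3 is assumed to mean removing a residual offset smaller than one bin by rotating the time-domain output, and all function names (`stage1_to_freq`, `stage2_shift_filter_decimate`, `stage3_fine_tune`, `r2iq_sketch`) are hypothetical.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(n^2) DFT; stands in for a real FFT library."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
               for t in range(n)) for k in range(n)]
    return [v / n for v in out] if inverse else out


def stage1_to_freq(samples):
    """Stage 1: convert time-domain samples into frequency-domain samples."""
    return dft(samples)


def stage2_shift_filter_decimate(spectrum, lo_bin, decimation):
    """Stage 2: move lo_bin to DC, keep only the bins around it, then iFFT."""
    n = len(spectrum)
    m = n // decimation  # output length after decimation
    half = m // 2
    # Keeping only m bins around lo_bin acts as a brick-wall filter and
    # decimates at the same time.
    kept = [spectrum[(lo_bin + k) % n] for k in range(-half, half)]
    # Reorder to DFT layout: non-negative bins first, then negative bins.
    reordered = kept[half:] + kept[:half]
    # Scale by m/n so amplitudes match the original sample scale.
    return [v * m / n for v in dft(reordered, inverse=True)]


def stage3_fine_tune(iq, offset_bins):
    """Stage 3 (assumed meaning): remove a residual offset given in bins."""
    m = len(iq)
    return [v * cmath.exp(-2j * math.pi * offset_bins * t / m)
            for t, v in enumerate(iq)]


def r2iq_sketch(samples, lo_bin, decimation, offset_bins=0.0):
    """Run the three stages in order on one block of real samples."""
    spectrum = stage1_to_freq(samples)
    iq = stage2_shift_filter_decimate(spectrum, lo_bin, decimation)
    return stage3_fine_tune(iq, offset_bins)


def demo():
    """A real tone exactly at the LO bin should come out as a constant value."""
    n, tone_bin = 64, 10
    samples = [math.cos(2 * math.pi * tone_bin * t / n) for t in range(n)]
    return r2iq_sketch(samples, lo_bin=tone_bin, decimation=4)
```

In `demo`, the 64 input samples become 16 complex output samples, each close to 0.5, since only the positive-frequency half of the real tone is kept. A streaming implementation would also have to deal with block boundaries, which this single-block sketch ignores.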

@ik1xpv
Owner

ik1xpv commented Apr 27, 2021

Howard,
I played with CMake198 and it seems to me equal to or better than CMake194.
Here is a tone reception of my 20MHz reference.
[image: 20MHz_DIG]
It looks fine, with no phase discontinuity.
I made a comparison test with my laptop (ADC clk 64M, IF 32M).
[image: LaptopRun5]
The two releases have very similar performance.
Feel free to merge and test the new architecture.
Thanks :-)

Owner

@ik1xpv ik1xpv left a comment

Looks great! Bravo

@howard0su
Collaborator Author

Thank you for all your testing.

My next PR will be even bigger in terms of changes.

@howard0su howard0su merged commit 92323c1 into ik1xpv:master Apr 27, 2021