proto: optimize global (un)marshal lock using RWMutex #1004
Conversation
Signed-off-by: TennyZhuang <[email protected]>
@dsnet PTAL
The most computationally taxing thing here is acquiring the write lock. Rechecking your critical conditions is insignificant overhead and prevents the cost of reinitialization.
Eliding the recheck of critical conditions not only goes against good locking hygiene, it is at best a micro-optimization, which I guarantee is not saving more than a few nanoseconds every run of the program. The code under the write lock is not hot-loop code that needs to be micro-optimized.
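In code, the recheck-under-the-write-lock pattern being discussed looks roughly like this; a minimal sketch only, assuming a type-keyed cache like the one in `table_marshal.go`, with made-up names (`info`, `getInfo`, `infoMap`) rather than the actual identifiers:

```go
package cache

import (
	"reflect"
	"sync"
)

// info stands in for the per-type metadata (e.g. marshalInfo) being cached.
type info struct{ typ reflect.Type }

var (
	infoLock sync.RWMutex
	infoMap  = map[reflect.Type]*info{}
)

// getInfo returns the cached entry for t, creating it on first use.
func getInfo(t reflect.Type) *info {
	// Fast path: most calls find the entry under the shared lock.
	infoLock.RLock()
	u, ok := infoMap[t]
	infoLock.RUnlock()
	if ok {
		return u
	}

	// Slow path: take the exclusive lock and re-check, because another
	// goroutine may have inserted the entry between RUnlock and Lock.
	infoLock.Lock()
	defer infoLock.Unlock()
	if u, ok := infoMap[t]; ok {
		return u
	}
	u = &info{typ: t}
	infoMap[t] = u
	return u
}
```

Without the re-check, two goroutines that both miss on the read path would each build and store their own entry, which is exactly the reinitialization the comment above warns about.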
Signed-off-by: TennyZhuang <[email protected]>
OK, I've added a double check.
A simple reproduction can be found at https://github.com/TennyZhuang/protobuf-lock-reproduce. This PR also resolves #888.
Looks good now.
Any more reviewers?
@dsnet In our scenario, I added a counter log at the beginning of `getMarshalInfo`, and it shows about 1000000 calls in 1 second. Is there some other bug that causes the function to be called so many times?
Possibly? That's the more interesting question to figure out. This code here (protobuf/proto/table_marshal.go, lines 147 to 159 in ed6926b) atomically caches the computed `marshalInfo`, so it shouldn't happen again and again.
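The caching referred to here can be pictured roughly as follows; a simplified sketch of an atomic load-then-store cache, not the library's actual code, with invented names (`messageState`, `get`):

```go
package cache

import (
	"sync/atomic"
	"unsafe"
)

// info stands in for the computed marshaling metadata.
type info struct{ /* ... */ }

// messageState stands in for the per-message-type bookkeeping struct;
// cached holds a *info and is accessed atomically.
type messageState struct {
	cached unsafe.Pointer
}

// get returns the cached *info, computing and storing it on first use.
// Two goroutines racing here may both compute the value, but they
// compute the same thing, so the duplicate store is harmless.
func (s *messageState) get(compute func() *info) *info {
	if p := atomic.LoadPointer(&s.cached); p != nil {
		return (*info)(p)
	}
	u := compute()
	atomic.StorePointer(&s.cached, unsafe.Pointer(u))
	return u
}
```

Once that pointer has been stored, the slow path (and hence `getMarshalInfo`) should not be reached again for that message type, which is why a counter there climbing to ~1000000 calls per second suggests the cache is being bypassed rather than lock cost alone.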
Have you used the most recent version of this package?
I reproduced it in https://github.com/TennyZhuang/protobuf-lock-reproduce (very high latency) with protoc-gen-go 1.3.2. I will try to inspect it further later.
Sorry, the reproduction demo is not correct; I will try to create a correct reproduction case.
Sorry, this is a bug in gogo/protobuf#656.
Got it. I'm going to close this then. In v2, we use
Signed-off-by: TennyZhuang <[email protected]>
This PR uses `RWMutex` to optimize `getMarshalInfo` and `getUnmarshalInfo`. For these functions, only n calls (n being the number of message types) take the write path, while m - n calls (m being the number of messages) take the read path, which is the best case for using `RWMutex` instead of `Mutex`. This optimization introduces a huge improvement in our scenario.
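To see why shifting almost all calls onto the read path matters, a rough micro-benchmark along the following lines (not part of the PR; names are made up) contrasts a map lookup guarded by `sync.Mutex` with one guarded by `sync.RWMutex` under parallel readers:

```go
package cache

import (
	"sync"
	"testing"
)

var (
	mu   sync.Mutex
	rwmu sync.RWMutex
	m    = map[int]int{1: 1}
)

// BenchmarkMutexRead simulates the old behavior: every lookup takes
// the exclusive lock, so parallel readers serialize on it.
func BenchmarkMutexRead(b *testing.B) {
	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			mu.Lock()
			_ = m[1]
			mu.Unlock()
		}
	})
}

// BenchmarkRWMutexRead simulates the new behavior: lookups take the
// shared lock, so parallel readers do not block each other.
func BenchmarkRWMutexRead(b *testing.B) {
	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			rwmu.RLock()
			_ = m[1]
			rwmu.RUnlock()
		}
	})
}
```

Run with `go test -bench . -cpu 1,8`; the exact numbers vary by machine, but it gives a rough sense of how much an exclusive lock serializes parallel readers.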
We have 1000 workers and 1 controller, and the workers and controller keep a heartbeat over gRPC. They also exchange job info with each other.
Each Heartbeat message carries about 10000 jobs, and the heartbeat QPS at the controller is about 1000.
The controller handles a Heartbeat in about 10ms and the network latency is about 10ms, but the client can take up to about 30s to finish an RPC call.
We used the Go pprof block profile, and it seems that almost all blocking is caused by one global Mutex in the protobuf package.
After the optimization, in our use case, the RPC call from the client takes only about 30ms, as we expected.
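For reference, the block profile mentioned above can be collected with something like the following; a minimal sketch, where the listen address and profiling rate are arbitrary choices, not values from this report:

```go
package main

import (
	"net/http"
	_ "net/http/pprof" // registers the /debug/pprof handlers
	"runtime"
)

func main() {
	// Record every blocking event (rate 1); use a larger rate in
	// production to reduce overhead.
	runtime.SetBlockProfileRate(1)

	// Expose the profiles over HTTP.
	_ = http.ListenAndServe("localhost:6060", nil)
}
```

Fetch the profile with `go tool pprof http://localhost:6060/debug/pprof/block`; contention on a single mutex shows up as one dominant stack in that profile.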