Is there an input queue at the ingress port to store packets arriving at line rate? #729
Comments
I am not sure whether BMv2 simple_switch will always preserve the value of the ingress_port metadata field when recirculating or resubmitting a packet. It may replace it with 0 instead, by default, unless you request in the recirculate/resubmit operation to preserve the ingress_port value. That may not be the important part of your question, though.

Someone like Antonin Bas can answer more authoritatively how this is implemented in BMv2, but pretty much every other implementation of recirculate/resubmit I have seen in ASICs has at least a small buffer, and some kind of packet scheduler to choose whether a newly arriving packet or a recirculated/resubmitted packet is scheduled next for processing. If that buffer overflows, then something gets dropped or back-pressured. You have to be careful with how back-pressure is done, to avoid deadlock, so drop is inevitable at some point, but usually the buffers contain at least a few dozen packets, so dropping does not become likely unless you recirculate/resubmit many packets at a high enough rate, while new packets are arriving at a high enough rate at the same time.
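To make the "small buffer plus scheduler" idea concrete, here is a minimal sketch. It does not model any particular ASIC or BMv2: the class name `RoundRobinArbiter`, the round-robin policy, and the unbounded arrival queue are all assumptions chosen for illustration. It only shows a bounded recirculation buffer that drops on overflow, and an arbiter that alternates between recirculated and newly arrived packets.

```python
from collections import deque


class RoundRobinArbiter:
    """Hypothetical arbiter between recirculated and newly arrived packets."""

    def __init__(self, recirc_capacity):
        self.recirc_capacity = recirc_capacity
        self.recirc = deque()      # small, bounded recirculation buffer
        self.arrivals = deque()    # unbounded here only to keep the sketch short
        self.dropped = 0
        self._serve_recirc_next = True

    def offer_recirc(self, pkt):
        """Try to buffer a recirculated packet; drop it if the buffer is full."""
        if len(self.recirc) >= self.recirc_capacity:
            self.dropped += 1
            return False
        self.recirc.append(pkt)
        return True

    def offer_new(self, pkt):
        """Buffer a newly arrived packet."""
        self.arrivals.append(pkt)

    def next_packet(self):
        """Pick the next packet, alternating which queue gets the first chance."""
        if self._serve_recirc_next:
            order = (self.recirc, self.arrivals)
        else:
            order = (self.arrivals, self.recirc)
        self._serve_recirc_next = not self._serve_recirc_next
        for q in order:
            if q:
                return q.popleft()
        return None  # nothing to schedule
```

With a recirculation capacity of 2, a third recirculated packet offered before anything is drained is dropped, which is the overflow case described above. A real device might use strict priority or weighted scheduling instead of round-robin.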
I'll answer this question specifically for bmv2 simple_switch (every bmv2 target is free to adopt a different queuing scheme). simple_switch has an If this doesn't work for you, feel free to use different settings for There isn't really a notion of "line rate" here. The simple_switch ingress pipeline will handle packets as fast as your CPU can handle.
I got it. I'm starting to understand now. A follow-up question: suppose new packets arrive at such a high rate that input_buffer holds 1K packets (its maximum) at time k. What happens in this scenario when, one moment later (at time k+1), we resubmit/recirculate a packet? As you mentioned, the resubmitted/recirculated packet can never be dropped, right? Assuming input_buffer is drained in FIFO order, will the resubmitted/recirculated packet always be processed before any packets already in input_buffer? Or do the original (non-resubmitted/recirculated) packets in input_buffer have priority, so that all of them are served first?
That's a very good question :), and it made me realize that there is a deadlock in the code for resubmit, but because of the limited use of resubmit / recirculate, somehow I never observed it and no one reported it. If the input_buffer is full and a packet needs to be resubmitted, the enqueue operation will block and the input_buffer will not drain anymore, because the same thread is both blocking on the enqueue and responsible for servicing input_buffer.
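The shape of this deadlock can be sketched in a few lines. This is not the BMv2 code: `ingress_step` and `refill` are hypothetical names, and a timeout stands in for "blocks forever" so the sketch terminates. The single consumer dequeues a packet, other threads refill the freed slot, and then the consumer tries a blocking enqueue into the queue that only it can drain.

```python
import queue


def ingress_step(input_buffer, refill, timeout_s=0.05):
    """One iteration of a hypothetical ingress thread that resubmits a packet.

    `refill` models receive threads grabbing the slot freed by the dequeue.
    """
    pkt = input_buffer.get()   # the ingress thread drains one packet
    refill(input_buffer)       # meanwhile, new arrivals fill the buffer again
    try:
        # Blocking enqueue by the only thread that drains this queue.
        # Without the timeout, this call would never return on a full buffer.
        input_buffer.put(("resubmit", pkt), timeout=timeout_s)
        return "enqueued"
    except queue.Full:
        return "would deadlock"


# Usage: a full buffer that is topped up as soon as a slot frees.
buf = queue.Queue(maxsize=4)
for i in range(4):
    buf.put_nowait(i)
print(ingress_step(buf, lambda q: q.put_nowait("new")))
```

If nothing refills the freed slot, the same call returns "enqueued", which is probably why the problem is hard to observe unless the buffer stays full under load.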
Cycles plus backpressure does not always result in a system where deadlock is possible, but it certainly often does, unless you are careful to avoid it somehow. Not just in ASICs :-)
I read your first comment and I was like "no way I have a deadlock" :)
Resubmit packets were written to the input_buffer with a blocking call by the ingress thread. Since the ingress thread is also in charge of draining the input_buffer, this could have led to a deadlock. Resubmit packets can now be dropped if the buffer is full. To limit the number of resubmit packets being lost, we place them in a higher-priority queue than "normal" packets. Fixes #729
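A rough sketch of the behavior this commit message describes, not the actual patch: the class `TwoPriorityBuffer`, its method names, and the choice of a separate capacity per priority level are assumptions made for illustration. The enqueue never blocks (a full queue means a drop), and resubmitted packets are dequeued before "normal" ones.

```python
from collections import deque


class TwoPriorityBuffer:
    """Hypothetical input buffer with a non-blocking push and two priorities."""

    def __init__(self, capacity_per_priority):
        # Assumption: each priority level has its own capacity.
        self.capacity = capacity_per_priority
        self.high = deque()   # resubmitted packets
        self.low = deque()    # "normal" packets

    def push_nowait(self, packet, resubmit=False):
        """Enqueue without blocking; return False if the packet was dropped."""
        q = self.high if resubmit else self.low
        if len(q) >= self.capacity:
            return False
        q.append(packet)
        return True

    def pop(self):
        """Dequeue the oldest high-priority packet first, then normal ones."""
        if self.high:
            return self.high.popleft()
        if self.low:
            return self.low.popleft()
        return None
```

Because `push_nowait` returns instead of waiting, the thread that drains the buffer can always make progress. The cost is that a resubmitted packet is lost when its queue is already full.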
I have opened a PR to fix this potential deadlock issue. As a side effect, resubmit packets could now be dropped, although it is unlikely for most P4 programs.
The same possibility of dropping should also exist for recirculated and egress-to-egress cloned packets, too, yes? I would guess that ingress-to-egress cloned packets already had the possibility of dropping before, just as normal ingress-to-ingress unicast and multicast packets could, if the "packet buffer" queues got too deep.
The egress buffers are not blocking for any type of packet, and packets are just dropped immediately if they are full.
It is much clearer to me now. I'm glad the topic helped fix a possible deadlock. I didn't mean to do that, actually :) I will study the code to understand it better. Thanks @antoninbas and @jafingerhut for clarifying this for me.
Hi there.
I'm a beginner in P4 and BMv2. As far as I know, when we resubmit or recirculate a packet in bmv2, it is reinjected at the same ingress port on which it arrived, right? My question is: what happens when we resubmit or recirculate a packet in bmv2 while the same ingress port is receiving another packet? Which one would be parsed first, for example? Is there an input queue at the ingress port to store packets while bmv2 is receiving or parsing a resubmitted/recirculated packet?
Thanks in advance.