
Is there an input queue in the ingress port to store packets arriving at line rate? #729

Closed
castilhorosa opened this issue Mar 6, 2019 · 10 comments

Comments

@castilhorosa

Hi there.

I'm a beginner in P4 and BMv2. As far as I know, when we resubmit or recirculate a packet in bmv2, it will be reinjected at the same ingress port it arrived on, right? My question is: what happens when we resubmit or recirculate a packet in bmv2 while the same ingress port is receiving another packet at the same time? Which one would be parsed first, for example? Is there an input queue at the ingress port to store packets while bmv2 is receiving or parsing a resubmitted/recirculated packet?

Thanks in advance.

@jafingerhut
Contributor

jafingerhut commented Mar 6, 2019

I am not sure whether BMv2 simple_switch will always preserve the value of the ingress_port metadata field when recirculating or resubmitting a packet. It may replace it with 0 instead, by default, unless you request in the recirculate/resubmit operation to preserve the ingress_port value. That may not be the important part of your question, though.

Someone like Antonin Bas can answer more authoritatively how this is implemented in BMv2, but pretty much every other implementation of recirculate/resubmit I have seen in ASICs has at least a small buffer, and some kind of packet scheduler to choose whether a newly arriving packet or a recirculated/resubmitted packet is scheduled next for processing. If that buffer overflows, then something gets dropped or back-pressured. You have to be careful with how back-pressure is done, to avoid deadlock, so drop is inevitable at some point, but usually the buffers contain at least a few dozen packets, so dropping does not become likely unless you recirculate/resubmit many packets at a high enough rate, while new packets are arriving at a high enough rate at the same time.

@antoninbas
Member

I'll answer this question specifically for bmv2 simple_switch (every bmv2 target is free to adopt a different queuing scheme). simple_switch has an input_buffer queue (https://github.com/p4lang/behavioral-model/blob/master/targets/simple_switch/simple_switch.h#L172). By default this queue can hold 1K packets and is blocking, meaning a write operation will block until there is an available slot. In particular, this means that a resubmitted / recirculated packet can never be dropped. For incoming packets read through libpcap (for example), the write is also blocking, so back-pressure will build at the Linux interface, and I expect packets to be dropped there if there is more incoming traffic than simple_switch can handle.

If this doesn't work for you, feel free to use different settings for input_buffer or change the input queueing scheme altogether.

There isn't really a notion of "line rate" here. The simple_switch ingress pipeline will process packets as fast as your CPU allows.
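A minimal sketch of the queuing behavior described above, using Python's `queue.Queue` as a stand-in for simple_switch's input_buffer (the 1K default and the blocking-write semantics come from the comment above; all names here are illustrative, not bmv2's actual API):

```python
import queue

# Stand-in for simple_switch's input_buffer: a bounded, blocking queue.
# Default capacity of 1K packets is per the comment above.
INPUT_BUFFER_CAPACITY = 1024

def make_input_buffer(capacity=INPUT_BUFFER_CAPACITY):
    return queue.Queue(maxsize=capacity)

def receive_packet(input_buffer, pkt):
    # Blocking put: if all slots are taken, this call waits until the
    # ingress thread frees one -- back-pressure then builds upstream
    # (e.g. at the Linux interface for packets read through libpcap).
    input_buffer.put(pkt)

def drain_one(input_buffer):
    # Ingress-thread side: take the next packet for parsing and the
    # ingress pipeline, as fast as the CPU allows.
    return input_buffer.get()
```

With a small capacity you can see both properties: writes succeed until the buffer is full, and the buffer drains in FIFO order.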

@castilhorosa
Author

I got it. I'm starting to understand now. A follow-up question: suppose new packets are arriving at such a high rate that input_buffer holds 1K packets (the maximum) at time k. What happens in this scenario when, one moment later (time k+1), we resubmit/recirculate a packet? As you mentioned, the resubmitted/recirculated packet can never be dropped, right? Assuming input_buffer is drained FIFO, will the resubmitted/recirculated packet always be processed before the packets already in input_buffer? Or do the original packets (I mean non-resubmitted/recirculated packets) in input_buffer have priority, so that all of them are served first?

@antoninbas
Member

That's a very good question :), and it made me realize that there is a deadlock in the code for resubmit. Because of the limited use of resubmit / recirculate, somehow I never observed it and no one reported it. If the input_buffer is full and a packet needs to be resubmitted, the enqueue operation will block, and the input_buffer will never drain, because the thread that is blocking is the same thread that services input_buffer.
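The deadlock condition can be sketched without actually hanging a thread. This is an illustration of the reasoning above, not bmv2 code: the ingress thread is the only consumer of input_buffer, so a blocking put from that same thread on a full buffer can never be unblocked; the safe alternative is a non-blocking put that drops on overflow.

```python
import queue

def resubmit_blocking_would_deadlock(input_buffer):
    # True exactly when a blocking put() from the ingress thread would
    # wait forever: the buffer is full, and its only consumer is the
    # very thread that would be blocking.
    return input_buffer.full()

def resubmit_nonblocking(input_buffer, pkt):
    # Safe alternative: try to enqueue, drop on overflow instead of
    # blocking, so the ingress thread can keep draining the buffer.
    try:
        input_buffer.put_nowait(pkt)
        return True
    except queue.Full:
        return False  # packet dropped rather than deadlocking
```

Once the ingress thread drains a slot, the non-blocking resubmit succeeds again, which is why the drop is unlikely unless the buffer stays saturated.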

@jafingerhut
Contributor

jafingerhut commented Mar 6, 2019

Cycles plus backpressure does not always result in a system where deadlock is possible, but it certainly often does, unless you are careful to avoid it somehow. Not just in ASICs :-)

@antoninbas
Member

I read your first comment and I was like "no way I have a deadlock" :)

antoninbas added a commit that referenced this issue Mar 7, 2019
Resubmit packets were written to the input_buffer with a blocking call
by the ingress thread. Since the ingress thread is also in charge of
draining the input_buffer, this could have led to a deadlock. Resubmit
packets can now be dropped if the buffer is full. To limit the number of
resubmit packets being lost, we place them in a higher priority queue
than "normal" packets.

Fixes #729
@antoninbas
Member

I have opened a PR to fix this potential deadlock issue. As a side effect, resubmit packets could now be dropped, although it is unlikely for most P4 programs.
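The fix described in the commit message above can be sketched as a two-priority input buffer: resubmit/recirculate packets go into a higher-priority queue with a non-blocking write (so they can now be dropped, but only on overflow), while the ingress thread always services the high-priority queue first. Class and method names here are illustrative, not bmv2's actual implementation.

```python
import queue

class TwoPriorityInputBuffer:
    def __init__(self, capacity):
        self.hi = queue.Queue(maxsize=capacity)  # resubmit / recirculate
        self.lo = queue.Queue(maxsize=capacity)  # fresh packets from ports

    def push_resubmit(self, pkt):
        # Non-blocking: dropping a resubmit packet is now possible, but
        # the dedicated high-priority queue makes it unlikely in practice.
        try:
            self.hi.put_nowait(pkt)
            return True
        except queue.Full:
            return False  # dropped instead of deadlocking

    def push_new(self, pkt):
        # Fresh traffic can still block; back-pressure builds upstream.
        self.lo.put(pkt)

    def pop(self):
        # Ingress thread: always service resubmit packets first.
        try:
            return self.hi.get_nowait()
        except queue.Empty:
            return self.lo.get()
```

This preserves the back-pressure behavior for newly arriving packets while removing the self-blocking path that caused the deadlock.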

@jafingerhut
Contributor

The same possibility of dropping should also exist for recirculated and egress-to-egress cloned packets, yes?

I would guess that ingress-to-egress cloned packets already had the possibility of dropping before, just as normal ingress-to-ingress unicast and multicast packets could, if the "packet buffer" queues got too deep.

@antoninbas
Member

The egress buffers are not blocking for any type of packet; packets are simply dropped immediately if the buffers are full.

@castilhorosa
Author

It is much clearer to me now. I'm glad the topic was helpful for fixing a possible deadlock; I didn't mean to do that, actually :) I will study the code for better understanding. Thanks @antoninbas and @jafingerhut for clarifying this for me.
