
Is there an input queue in the ingress port to store packets arriving at line rate? #729

Closed
castilhorosa opened this issue Mar 6, 2019 · 10 comments

Comments

@castilhorosa

Hi there.

I'm a beginner in P4 and BMv2. As far as I know, when we resubmit or recirculate a packet in bmv2, it will be reinjected at the same ingress port it arrived on, right? My question is: what happens when we resubmit or recirculate a packet in bmv2 while the same ingress port is receiving another packet at the same time? Which one would be parsed first, for example? Is there an input queue at the ingress port to store packets while bmv2 is receiving or parsing a resubmitted/recirculated packet?

Thanks in advance.

@jafingerhut
Contributor

jafingerhut commented Mar 6, 2019

I am not sure whether BMv2 simple_switch will always preserve the value of the ingress_port metadata field when recirculating or resubmitting a packet. It may replace it with 0 instead, by default, unless you request in the recirculate/resubmit operation to preserve the ingress_port value. That may not be the important part of your question, though.

Someone like Antonin Bas can answer more authoritatively how this is implemented in BMv2, but pretty much every other implementation of recirculate/resubmit I have seen in ASICs has at least a small buffer, and some kind of packet scheduler to choose whether a newly arriving packet or a recirculated/resubmitted packet is scheduled next for processing. If that buffer overflows, then something gets dropped or back-pressured. You have to be careful with how back-pressure is done, to avoid deadlock, so drop is inevitable at some point, but usually the buffers contain at least a few dozen packets, so dropping does not become likely unless you recirculate/resubmit many packets at a high enough rate, while new packets are arriving at a high enough rate at the same time.

@antoninbas
Member

I'll answer this question specifically for bmv2 simple_switch (every bmv2 target is free to adopt a different queuing scheme). simple_switch has an input_buffer queue (https://github.com/p4lang/behavioral-model/blob/master/targets/simple_switch/simple_switch.h#L172). By default this queue can hold 1K packets and is blocking, meaning a write operation will block until there is an available slot. In particular, this means that a resubmitted / recirculated packet can never be dropped. For incoming packets read through libpcap (for example), the write is also blocking, so back-pressure will build at the Linux interface, and I expect packets to be dropped there if there is more incoming traffic than simple_switch can handle.

If this doesn't work for you, feel free to use different settings for input_buffer or change the input queueing scheme altogether.

There isn't really a notion of "line rate" here. The simple_switch ingress pipeline will process packets as fast as your CPU allows.
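A minimal sketch of the queuing behavior described above, using Python's `queue.Queue` as a stand-in for simple_switch's input_buffer (the 1K default and the blocking-write semantics come from the comment above; all names here are illustrative, not bmv2's actual API):

```python
import queue

# Stand-in for simple_switch's input_buffer: a bounded, blocking queue.
# Default capacity of 1K packets is per the comment above.
INPUT_BUFFER_CAPACITY = 1024

def make_input_buffer(capacity=INPUT_BUFFER_CAPACITY):
    return queue.Queue(maxsize=capacity)

def receive_packet(input_buffer, pkt):
    # Blocking put: if all slots are taken, this call waits until the
    # ingress thread frees one -- back-pressure then builds upstream
    # (e.g. at the Linux interface for packets read through libpcap).
    input_buffer.put(pkt)

def drain_one(input_buffer):
    # Ingress-thread side: take the next packet for parsing and the
    # ingress pipeline, as fast as the CPU allows.
    return input_buffer.get()
```

With a small capacity you can see both properties: writes succeed until the buffer is full, and the buffer drains in FIFO order.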

@castilhorosa
Author

I got it. I'm starting to understand now. A follow-up question: suppose new packets are arriving at such a high rate that input_buffer holds 1K packets (the maximum) at time k. What happens in this scenario when, one moment later (time k+1), we resubmit/recirculate a packet? As you mentioned, the resubmitted/recirculated packet can never be dropped, right? Assuming input_buffer is drained FIFO, will the resubmitted/recirculated packet always be processed before the packets already in input_buffer? Or do the original packets (I mean non-resubmitted/recirculated packets) in input_buffer have priority, so that all of them are served first?

@antoninbas
Member

That's a very good question :), and it made me realize that there is a deadlock in the code for resubmit. Because of the limited use of resubmit / recirculate, somehow I never observed it and no one reported it. If the input_buffer is full and a packet needs to be resubmitted, the enqueue operation will block, and the input_buffer will never drain, because the thread that is blocking is the same thread that services input_buffer.
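The deadlock condition can be sketched without actually hanging a thread. This is an illustration of the reasoning above, not bmv2 code: the ingress thread is the only consumer of input_buffer, so a blocking put from that same thread on a full buffer can never be unblocked; the safe alternative is a non-blocking put that drops on overflow.

```python
import queue

def resubmit_blocking_would_deadlock(input_buffer):
    # True exactly when a blocking put() from the ingress thread would
    # wait forever: the buffer is full, and its only consumer is the
    # very thread that would be blocking.
    return input_buffer.full()

def resubmit_nonblocking(input_buffer, pkt):
    # Safe alternative: try to enqueue, drop on overflow instead of
    # blocking, so the ingress thread can keep draining the buffer.
    try:
        input_buffer.put_nowait(pkt)
        return True
    except queue.Full:
        return False  # packet dropped rather than deadlocking
```

Once the ingress thread drains a slot, the non-blocking resubmit succeeds again, which is why the drop is unlikely unless the buffer stays saturated.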

@jafingerhut
Contributor

jafingerhut commented Mar 6, 2019

Cycles plus backpressure does not always result in a system where deadlock is possible, but it certainly often does, unless you are careful to avoid it somehow. Not just in ASICs :-)

@antoninbas
Member

I read your first comment and I was like "no way I have a deadlock" :)

antoninbas added a commit that referenced this issue Mar 7, 2019
Resubmit packets were written to the input_buffer with a blocking call
by the ingress thread. Since the ingress thread is also in charge of
draining the input_buffer, this could have led to a deadlock. Resubmit
packets can now be dropped if the buffer is full. To limit the number of
resubmit packets being lost, we place them in a higher priority queue
than "normal" packets.

Fixes #729
@antoninbas
Member

I have opened a PR to fix this potential deadlock issue. As a side effect, resubmit packets could now be dropped, although it is unlikely for most P4 programs.
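The fix described in the commit message above can be sketched as a two-priority input buffer: resubmit/recirculate packets go into a higher-priority queue with a non-blocking write (so they can now be dropped, but only on overflow), while the ingress thread always services the high-priority queue first. Class and method names here are illustrative, not bmv2's actual implementation.

```python
import queue

class TwoPriorityInputBuffer:
    def __init__(self, capacity):
        self.hi = queue.Queue(maxsize=capacity)  # resubmit / recirculate
        self.lo = queue.Queue(maxsize=capacity)  # fresh packets from ports

    def push_resubmit(self, pkt):
        # Non-blocking: dropping a resubmit packet is now possible, but
        # the dedicated high-priority queue makes it unlikely in practice.
        try:
            self.hi.put_nowait(pkt)
            return True
        except queue.Full:
            return False  # dropped instead of deadlocking

    def push_new(self, pkt):
        # Fresh traffic can still block; back-pressure builds upstream.
        self.lo.put(pkt)

    def pop(self):
        # Ingress thread: always service resubmit packets first.
        try:
            return self.hi.get_nowait()
        except queue.Empty:
            return self.lo.get()
```

This preserves the back-pressure behavior for newly arriving packets while removing the self-blocking path that caused the deadlock.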

@jafingerhut
Contributor

The same possibility of dropping should also exist for recirculated and egress-to-egress cloned packets, yes?

I would guess that ingress-to-egress cloned packets already had the possibility of dropping before, just as normal ingress-to-ingress unicast and multicast packets could, if the "packet buffer" queues got too deep.

@antoninbas
Member

The egress buffers are not blocking for any type of packet; packets are simply dropped immediately if the buffers are full.

@castilhorosa
Author

It is much clearer to me now. I'm glad the topic was helpful for fixing a possible deadlock; I didn't mean to do that, actually :) I will study the code for better understanding. Thanks @antoninbas and @jafingerhut for clarifying this for me.
