Probabilistic Forwarding - Using Heuristic Analysis of Network Conditions to Reduce Load #5629

Closed
medentem wants to merge 15 commits

Conversation

medentem
Contributor

Nodes continuously log information about the state of the mesh around them, which can be used to create a probabilistic forwarding scheme that mitigates unneeded and unwanted packet traffic without impacting reliability.

There are three core data points under study:

  1. The number of direct neighbors observed in the last X minutes
  2. The number of nodes that have sent the same packet
  3. The number of packets received in a given look-back period

These three data points can be used to estimate the likelihood that repeating the packet is unnecessary. Of course, it can never be a certainty, which is why a probabilistic forwarding scheme is used, and the degree to which each factor influences the forwarding probability can be tuned for typical Meshtastic levels of traffic.

In any case, even if the influence of these factors is reduced significantly so that traffic is only conservatively reduced, it would be a traffic reduction nonetheless.

To model the effect of this probabilistic forwarder, you can use this JSFiddle, which mirrors the calculations in FloodingRouter.cpp: https://jsfiddle.net/1ufmhry6/4/

Key lines of code:

Lines 61-63 of FloodingRouter.cpp call the probability calculation function and test the value against a random number to determine whether the packet will be forwarded.

Line 110 of FloodingRouter.cpp calculates the probability based on the data points noted above.
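
As an illustration of that flow, here is a minimal self-contained sketch of the approach. The function names, the NEIGHBOR_INFLUENCE and TRAFFIC_INFLUENCE constants, and the way the metrics are passed in are hypothetical stand-ins for this explanation, not the PR's exact code:

    #include <cstdlib>

    // Illustrative influence constants; these would need tuning for
    // typical Meshtastic traffic levels.
    static const float NEIGHBOR_INFLUENCE = 0.10f;
    static const float REDUNDANCY_INFLUENCE = 0.05f;
    static const float TRAFFIC_INFLUENCE = 0.02f;

    // Combine the three metrics into a forwarding probability in (0, 1].
    // More neighbors, more duplicate sources, or more recent traffic each
    // pull the probability down from 1.0.
    float getForwardingProbability(int neighbors, int distinctSources, float packetRate)
    {
        float density = 1.0f / (1.0f + neighbors * NEIGHBOR_INFLUENCE);
        float redundancy = 1.0f / (1.0f + distinctSources * REDUNDANCY_INFLUENCE);
        float congestion = 1.0f / (1.0f + packetRate * TRAFFIC_INFLUENCE);
        return density * redundancy * congestion;
    }

    // Forward only if a uniform random draw falls below the probability.
    bool shouldForward(int neighbors, int distinctSources, float packetRate)
    {
        float probability = getForwardingProbability(neighbors, distinctSources, packetRate);
        float roll = (float)std::rand() / (float)RAND_MAX; // uniform in [0, 1]
        return roll < probability;
    }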

@garthvh garthvh requested review from GUVWAF and thebentern December 21, 2024 02:01
@GUVWAF
Member

GUVWAF commented Dec 21, 2024

Thanks for this, it’s interesting. While I like that it’s using existing metrics obtained from the mesh without utilizing more airtime or RAM, I’m not sure about the metrics and the idea in general.

First of all, non-routers/repeaters (except non-rebroadcasters like CLIENT_MUTE) will always try to rebroadcast after a small delay based on SNR. If within that window, they hear a packet starting and it appears to be the packet they’re trying to rebroadcast, they will cancel this rebroadcast:

// cancel rebroadcast of this message *if* there was already one, unless we're a router/repeater!
if (Router::cancelSending(p->from, p->id))

So this already drastically limits the number of rebroadcasts. Based on your distinctSources metric, it looks like you didn’t take this into account, because this logic actually is similar but then for more than 1 distinct source, the probability becomes 0.
In your case, you also base the probability on historical metrics, which might cause it to not try rebroadcasting at all even if you’re the only one that can serve a certain set of receivers.
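
For readers following along, a minimal sketch of that cancel-on-overhear behavior (class and method names here are illustrative; the real logic lives in the firmware's flood router, and only Router::cancelSending is quoted from it above):

    #include <cstdint>
    #include <set>
    #include <utility>

    // Minimal sketch of the stock cancel-on-overhear behavior. Because the
    // "from" field is the original transmitter and never changes on relay,
    // an overheard rebroadcast matches the (from, id) pair we queued.
    struct PendingRebroadcasts {
        // (from, id) pairs queued for rebroadcast after the SNR-based delay
        std::set<std::pair<uint32_t, uint32_t>> queued;

        // First reception: schedule our own rebroadcast of the packet.
        void schedule(uint32_t from, uint32_t id) { queued.insert({from, id}); }

        // Heard the same packet from someone else before our delay expired:
        // drop our pending copy. Returns true if one was cancelled,
        // mirroring Router::cancelSending(p->from, p->id).
        bool cancelSending(uint32_t from, uint32_t id) { return queued.erase({from, id}) > 0; }
    };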

Now onto the metrics.
First, why would having more neighbors decrease the chance of rebroadcasting? Depending on where the packet came from, if you have more nodes to serve than another node, shouldn’t the chance of rebroadcasting be higher for you? For example here, for a packet originating from either 0 or 5, it’s better that node 1 is rebroadcasting, and not node 5 or 0.
[Image: example topology in which a packet originating from node 0 or node 5 is best rebroadcast by node 1, which sits between them]

Next, the distinctSources metric won’t work for several reasons. First, you determine based on the first time you receive a packet whether you will put it in the transmit queue, at which point this will always be 1. Subsequent packets will not enter perhapsRebroadcast() because of the wasSeenRecently() check. Furthermore, the sender in recentPackets is the from of a packet, which does not change when a different node rebroadcasts. It’s the original transmitter of a packet -- we don’t know who is rebroadcasting.

Lastly, while the recentUniquePacketRate will decrease channel utilization in case the mesh has more traffic to transfer, in my opinion this is not something that should influence the probability of rebroadcasting.
There is already throttling in place at high channel utilization for periodic broadcasts like DeviceTelemetry (and the interval scales with the number of nodes in the mesh). Furthermore, the delay before originating a packet scales with channel utilization in order to limit the chance of collisions. However, in case an event happens and everybody starts transmitting text messages, this is not a reason to lower the chance of rebroadcasting. If you were a good rebroadcaster, it doesn’t make you a worse rebroadcaster in case there are more packets to be delivered.
Also, this metric does not take the actual airtime into account. Using modem preset SHORT_TURBO you can tolerate many more packets than on LONG_SLOW.

@medentem
Contributor Author

Thank you for such thorough feedback!

First of all, non-routers/repeaters (except non-rebroadcasters like CLIENT_MUTE) will always try to rebroadcast after a small delay based on SNR. If within that window, they hear a packet starting and it appears to be the packet they’re trying to rebroadcast, they will cancel this rebroadcast:

// cancel rebroadcast of this message *if* there was already one, unless we're a router/repeater!
if (Router::cancelSending(p->from, p->id))

So this already drastically limits the number of rebroadcasts. Based on your distinctSources metric, it looks like you didn’t take this into account, because this logic actually is similar but then for more than 1 distinct source, the probability becomes 0. In your case, you also base the probability on historical metrics, which might cause it to not try rebroadcasting at all even if you’re the only one that can serve a certain set of receivers.

Good point. I see that now. But I don't follow this part -> "for more than 1 distinct source, the probability becomes 0"
Pseudocode here:

    float REDUNDANCY_INFLUENCE_FACTOR = 0.05f;
    int distinctSources = getDistinctSourcesCount(p->id);
    float redundancyFactor = 1.0f / (1.0f + distinctSources * REDUNDANCY_INFLUENCE_FACTOR);

    // Worked example with one distinct source:
    // redundancyFactor = 1 / (1 + 1 * 0.05) ≈ 0.952
    // probability = 1 * 0.952 ≈ 0.952

To your point though, it seems like you're deduplicating using a somewhat similar mechanism (i.e. the random broadcast delay in the packet pool). Probably not a good metric to utilize.

Next, the distinctSources metric won’t work for several reasons. First, you determine based on the first time you receive a packet whether you will put it in the transmit queue, at which point this will always be 1. Subsequent packets will not enter perhapsRebroadcast() because of the wasSeenRecently() check. Furthermore, the sender in recentPackets is the from of a packet, which does not change when a different node rebroadcasts. It’s the original transmitter of a packet -- we don’t know who is rebroadcasting.

Yes... you're right. We'd have to add something to the packet to indicate the original from vs. the last repeating node. Wouldn't this be useful though? If we had that information, any node could more effectively understand how packets were traversing the mesh.

Now onto the metrics. First, why would having more neighbors decrease the chance of rebroadcasting? Depending on where the packet came from, if you have more nodes to serve than another node, shouldn’t the chance of rebroadcasting be higher for you? For example here, for a packet originating from either 0 or 5, it’s better that node 1 is rebroadcasting, and not node 5 or 0.

I think it depends actually. The number of neighbors may mean this node should broadcast more in the case that this node is the only node that is capable of serving the other nearby nodes. But it could also mean that this node is one of many in a dense mesh where everyone can essentially serve everyone and therefore you'd want to lower its propensity to rebroadcast.

What about something like a Bloom filter, where the top N strongest immediate neighbors are added to the filter and passed along in the packet? When the next node receives the packet, it compares its own top N strongest immediate neighbors against the filter, and the number of likely-unique nodes it serves becomes an input to the probabilistic forwarding computation.

Let's assume we'd use 2 hashes per node (k = 2), a 64-bit total field size (m = 64), and a cap of about 20 entries (n = 21); you'd end up with a false positive rate of ~30%. With 21 entries on a 3-hop configuration, you'd be able to record each node's top 7 neighbors, and that assumes each node is holding a fully unique neighbor list.
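
As a sketch of what that could look like with those parameters (k = 2, m = 64; the multiplicative hash mixers below are arbitrary placeholders for illustration, not a concrete proposal):

    #include <cstdint>

    // Minimal 64-bit Bloom filter for the neighbor-set idea above.
    // k = 2 hash functions, m = 64 bits.
    struct NeighborBloom {
        uint64_t bits = 0;

        // Two cheap multiplicative hashes, each mapping a node number to a
        // bit index in 0..63.
        static uint8_t h1(uint32_t node) { return (node * 2654435761u) >> 26; }
        static uint8_t h2(uint32_t node) { return (node * 0x9E3779B9u) >> 26; }

        void add(uint32_t node) { bits |= (1ULL << h1(node)) | (1ULL << h2(node)); }

        // May return false positives (roughly the ~30% estimated above near
        // 21 entries), but never false negatives.
        bool mightContain(uint32_t node) const {
            return (bits & (1ULL << h1(node))) && (bits & (1ULL << h2(node)));
        }
    };

    // A receiver could count how many of its own strongest neighbors are
    // *not* already covered by the incoming filter and feed that count
    // into the forwarding probability.
    int likelyUniqueNeighbors(const NeighborBloom &incoming, const uint32_t *mine, int count)
    {
        int unique = 0;
        for (int i = 0; i < count; ++i)
            if (!incoming.mightContain(mine[i]))
                ++unique;
        return unique;
    }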

@GUVWAF
Member

GUVWAF commented Dec 22, 2024

But I don't follow this part -> "for more than 1 distinct source, the probability becomes 0"

With "this logic" I was referring to how it currently works in Meshtastic.

(ie. random broadcast delay in the packet pool). Probably not a good metric to utilize.

The random delay is taken from a contention window, which scales with SNR (after giving routers/repeaters priority). This means that nodes farther away will generally rebroadcast first, in order to minimize the number of hops used to spread a packet.
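
A rough sketch of that windowing for illustration only; the slot length, window count, and SNR range below are made up, and the firmware's actual implementation differs:

    #include <cstdint>
    #include <cstdlib>

    // Illustrative SNR-weighted contention window: routers/repeaters draw
    // from the earliest slots, and for everyone else a worse SNR (likely a
    // more distant node) maps to an earlier window, so distant nodes tend
    // to rebroadcast first.
    uint32_t snrWeightedDelayMs(float snr, bool isRouter)
    {
        const uint32_t slotMs = 50; // made-up slot length
        if (isRouter)
            return std::rand() % (2 * slotMs); // earliest window

        // Clamp SNR to an assumed -20 dB..+10 dB range, then map better
        // SNR (likely a closer node) to a later window index 0..8.
        float clamped = snr < -20.f ? -20.f : (snr > 10.f ? 10.f : snr);
        uint32_t window = (uint32_t)((clamped + 20.f) / 30.f * 8.f);
        return (2 + window) * slotMs + std::rand() % (2 * slotMs);
    }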

We'd have to add something to the packet to indicate the original from vs. the last repeating node. Wouldn't this be useful though?

I propose to add this for the Next-Hop Router (#2856), although only the last byte of the relayer's node number, because we only have 2 bytes left in the header.
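
For illustration, carrying the relayer in a single spare header byte could look roughly like this (the struct and field names are hypothetical; see #2856 for the actual proposal):

    #include <cstdint>

    // Sketch of a header that records who last relayed a packet. Only the
    // low byte of the relayer's node number fits in the remaining space.
    struct HeaderSketch {
        uint32_t from;      // original transmitter; unchanged by relays
        uint32_t id;        // packet id assigned by the original transmitter
        uint8_t relay_node; // low byte of the most recent relayer's node number
    };

    // Each relayer stamps its own low byte before retransmitting, letting
    // receivers distinguish relayers (modulo the ambiguity of truncating
    // a 4-byte node number to 1 byte).
    void stampRelayer(HeaderSketch &h, uint32_t myNodeNum)
    {
        h.relay_node = (uint8_t)(myNodeNum & 0xFF);
    }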

where the top N strongest immediate neighbors are added to the filter and passed along in the packet.

When considering adding any overhead, in my opinion we should first simulate (https://github.com/meshtastic/Meshtasticator) whether it gives significant improvements over the current method in all kinds of scenarios.

@medentem
Contributor Author

Very interesting. I'll review that PR. And I'll also utilize the simulator. Thank you.

@medentem medentem closed this Dec 22, 2024