Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix bizarre StayRTR behavior #84

Merged
merged 1 commit into from
Feb 3, 2023
Merged

Fix bizarre StayRTR behavior #84

merged 1 commit into from
Feb 3, 2023

Conversation

job
Copy link
Member

@job job commented Feb 1, 2023

Previously StayRTR would copy the client's Session ID back into the Cache Response to the client, even though the server's internal Session ID was something different.

The purpose of the Session ID is to help both client and server understand whether they are synchronized or not. There are two opportunities to fix desyncs: if the server recognises the client is desynced, the server informs the client (through an Error Report) to reconnect and and send a Reset Query. If the client recognises it is out of sync with the server, the client can send a Reset Query.

Related #82

@job job requested a review from benjojo February 1, 2023 14:37
@ties ties mentioned this pull request Feb 1, 2023
Copy link
Collaborator

@ties ties left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Strict local improvement. This is also a hard class of bug to catch.

Does it fix #82?

…d report an error

Previously StayRTR would copy the client's Session ID back into the Cache
Response send to the router, even though the cache's internal Session ID
was something different.

The purpose of the Session ID is to help both router and cache understand
whether they are synchronized or not. There are two opportunities to fix
desyncs: if the cache recognises the router is desynced, the cache informs
the router (through an Error Report) to reconnect and send a Reset Query.
If the router recognises it is out of sync with the cache, the router can
send a Reset Query.

According to RFC 8210 section 5.1 the cache should send "Corrupt Data" when
a router sends a Serial Query with an unknown Session ID:

```
  Session ID:  A 16-bit unsigned integer.  When a cache server is
    started, it generates a Session ID to identify the instance of the
    cache and to bind it to the sequence of Serial Numbers that cache
    instance will generate.  This allows the router to restart a
    failed session knowing that the Serial Number it is using is
    commensurate with that of the cache.  If, at any time after the
    protocol version has been negotiated (Section 7), either the
    router or the cache finds that the value of the Session ID is not
    the same as the other's, the party which detects the mismatch MUST
    immediately terminate the session with an Error Report PDU with
    code 0 ("Corrupt Data"), and the router MUST flush all data
    learned from that cache.
```

Reformat with gofmt from Ties
@job job force-pushed the send_error_to_desync_clients branch from 450f548 to d5be698 Compare February 3, 2023 21:41
@job
Copy link
Member Author

job commented Feb 3, 2023

Looks good. Strict local improvement. This is also a hard class of bug to catch.

Thanks

Does it fix #82?

#82 appears to be a different unrelated issue. @lamehost and myself are still working to collect sufficient data.

@job job merged commit 1319bef into master Feb 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants