Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adjust handling of version-session_id-serial touple #110

Merged
merged 3 commits into from
Aug 7, 2024

Conversation

cjeker
Copy link
Contributor

@cjeker cjeker commented Dec 24, 2023

RFC 8210 has this in Section 5.1

Note that sessions are specific to a particular protocol version.
That is, if a cache server supports multiple versions of this protocol,
happens to use the same Session ID value for multiple protocol
versions, and further happens to use the same Serial Number
values for two or more sessions using the same Session ID but
different Protocol Version values, the Serial Numbers are not
commensurate. The full test for whether Serial Numbers are
commensurate requires comparing Protocol Version, Session ID,
and Serial Number. To reduce the risk of confusion, cache servers
SHOULD NOT use the same Session ID across multiple protocol
versions, but even if they do, routers MUST treat sessions with
different Protocol Version fields as separate sessions even if they
do happen to have the same Session ID.

This changes the code to create session ids and spacing them 100 apart from each other.
With this the Serial Query will fail for a system that does a session down/upgrade.

@job
Copy link
Member

job commented Dec 24, 2023

So we were not reading the RFC carefully enough all this time?

I guess this mitigates the problem I outlined to some degree?

@cjeker
Copy link
Contributor Author

cjeker commented Dec 24, 2023

Yes but too be honest it did not really matter until now since nothing uses router keys which was the data addition in v1.

I'm not surprised; the RTR RFC are IMO sub-standard since stuff is hidden in unrelated sections that people just glance over. Also a lot is left out giving implementations too much wiggle room for unexpected behaviour.

@ties
Copy link
Collaborator

ties commented Dec 24, 2023

I'm not surprised; the RTR RFC are IMO sub-standard since stuff is hidden in unrelated sections that people just glance over. Also a lot is left out giving implementations too much wiggle room for unexpected behaviour.

And it has some choices where the reason may have been abundantly clear when written if you went through the process, but are vague now. This is one.

When I read this originally another idea came up though: if you can extract a (session, revision) tuple in the metadata of an RP, you can have multiple rtr servers be in sync when using the same data as input.

@cjeker
Copy link
Contributor Author

cjeker commented Dec 25, 2023

Even if the session_id is stored in the metadata it would require one session id by RTR version supported.
This metadata is normally generated by a tool like rpki-client and would require rpki-client to run with a well
defined session_id so that a stable number can be put into the metadata. In short this just moves the problem but does not solve it.

Tu run with a fixed session_id it is easier to use a command line argument in stayrtr. Even then I think the system should create 3 session_ids based on this initial number.
The reason for this is to prevent a router to keep stale data on a session downgrade. E.g. router starts with version 2 but then opens a new session downgraded to version 1. If the cache is not flushed the ASPA entries would stick around forever.

One could argue that the version-session_id tuple should be used instead but that requires the server to keep client information around after a RTR session is closed. This is a lot of complexity that can be avoided by using different session_ids per version.

@@ -829,7 +823,7 @@ func run() error {

if *Bind != "" {
go func() {
sessid := server.GetSessionId()
sessid := server.GetSessionId(protoverToLib[*RTRVersion])
log.Infof("StayRTR Server started (sessionID:%d, refresh:%d, retry:%d, expire:%d)", sessid, sc.RefreshInterval, sc.RetryInterval, sc.ExpireInterval)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sessionID per version should be printed in the log.Info() line

@ties
Copy link
Collaborator

ties commented Dec 25, 2023

Even if the session_id is stored in the metadata it would require one session id by RTR version supported.
This metadata is normally generated by a tool like rpki-client and would require rpki-client to run with a well
defined session_id so that a stable number can be put into the metadata. In short this just moves the problem but does not solve it.

It does move the problem: But you can have n instances reading their (rpki-client-disk-state-uuid, sequence-number) from the same source. And then they would not have historic deltas unless they were running. But I think this architecture would work w/ tcp loadbalancers.

Even then I think the system should create 3 session_ids based on this initial number.
The reason for this is to prevent a router to keep stale data on a session downgrade. E.g. router starts with version 2 but > then opens a new session downgraded to version 1. If the cache is not flushed the ASPA entries would stick around forever.

I think this is a legitimate improvement, LGTM.

sessids := make([]uint16, 0, int(configuration.ProtocolVersion) + 1)
s := GenerateSessionId()
for i := 0; i <= int(configuration.ProtocolVersion); i++ {
sessids = append(sessids, s + uint16(100 * i))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can't this overflow if s for example is 0xFFFF? I guess this is an explicit property of Go, right? https://stackoverflow.com/questions/34704843/on-purpose-int-overflow/34704898#34704898

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The overflow is handled as expected since we use uint16 as type for all session ids variables.
When you run the code below the output is:
65280
65380
65480
44
144
244

Code

package main

import "fmt"

func main() {
	s := uint16(0xff00)
	for i := 0; i < 6; i++ {
		n := s + uint16(100 * i)
		fmt.Println(n)
	}
}

@job
Copy link
Member

job commented Aug 6, 2024

@cjeker can you rebase?

Rand is randomly initalized on startup so no need to reshuffle it.
Surprised this code did not xor the PID into the seed for more profit
Just because there are slices and maps you don't need to use them for
something as simple (myserial - theirserial) to know which delta to send.
The version-session_id-serial touple defines the cache state.
When a client connects with a different version its cache is no
longer in sync and this is the simplest way to enforce this.
@cjeker
Copy link
Contributor Author

cjeker commented Aug 6, 2024

Too tired to rebase without fail. I double check in the morning but I think this is the right bits.

@cjeker
Copy link
Contributor Author

cjeker commented Aug 7, 2024

Too tired to rebase without fail. I double check in the morning but I think this is the right bits.

Rechecked the rebase and all is fine in the last version.

@job job merged commit deaee7e into bgp:master Aug 7, 2024
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants