Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

filter local addresses (for WAN) and localhost addresses (for LAN) #839

Merged
merged 2 commits into from
Jun 5, 2023

Conversation

marten-seemann
Copy link
Contributor

Filters are only set for the Dual DHT. Is that correct? Do they also need to be set for other ways to initialize the DHT?

@marten-seemann marten-seemann requested review from a team and guillaumemichel as code owners May 17, 2023 08:52
Copy link
Contributor

@Jorropo Jorropo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems duplicated with PrivateRoutingTableFilter and PublicRoutingTableFilter. it looks duplicated but this filtering records sent

@Jorropo Jorropo self-requested a review May 17, 2023 09:07
@guillaumemichel
Copy link
Contributor

What do you think of setting the Public IP Addr Filter as the default address filter?

// Defaults are the default DHT options. This option will be automatically
// prepended to any options you pass to the DHT constructor.
var Defaults = func(o *Config) error {
o.Validator = record.NamespacedValidator{}
o.Datastore = dssync.MutexWrap(ds.NewMapDatastore())
o.ProtocolPrefix = DefaultPrefix
o.EnableProviders = true
o.EnableValues = true
o.QueryPeerFilter = EmptyQueryFilter
o.RoutingTable.LatencyTolerance = time.Minute
o.RoutingTable.RefreshQueryTimeout = 1 * time.Minute
o.RoutingTable.RefreshInterval = 10 * time.Minute
o.RoutingTable.AutoRefresh = true
o.RoutingTable.PeerFilter = EmptyRTFilter
o.MaxRecordAge = providers.ProvideValidity
o.BucketSize = defaultBucketSize
o.Concurrency = 10
o.Resiliency = 3
// MAGIC: It makes sense to set it to a multiple of OptProvReturnRatio * BucketSize. We chose a multiple of 4.
o.OptimisticProvideJobsPoolSize = 60
return nil
}

The private DHT could then override this default filter with the IP Loopback filter.

@marten-seemann
Copy link
Contributor Author

@guillaumemichel, yes, that makes sense.

I'm wondering if we should be more careful which addresses we send out as well. We probably shouldn't even tell other nodes about private addresses, if we're connected to them over the public internet. I suggest to do that in a follow-up PR though.

@guillaumemichel
Copy link
Contributor

Many tests are failing when setting the Public IP address filter as a default option (probably because the tests create IpfsDHTs using private IP addresses).

It is fine for now to leave it on the dual DHT only (no default address filter). This way we won't break anyone that may use a (non-dual) DHT with private addresses.

I added basic tests for the addrFilter, please have a look and add more checks if some are missing. I am good to merge after that

Copy link
Member

@sukunrt sukunrt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@guillaumemichel the tests look good to me.

@guillaumemichel
Copy link
Contributor

Thanks @sukunrt!

@Jorropo are you good to merge?

@Jorropo
Copy link
Contributor

Jorropo commented Jun 5, 2023

@guillaumemichel I'm good if you are.

@Jorropo Jorropo removed their request for review June 5, 2023 15:51
@guillaumemichel guillaumemichel merged commit 8d07d57 into master Jun 5, 2023
@guillaumemichel guillaumemichel deleted the addr-filter branch June 5, 2023 15:53
@BigLep BigLep mentioned this pull request Jun 5, 2023
@sukunrt
Copy link
Member

sukunrt commented Jun 26, 2023

@Jorropo @guillaumemichel
What's required to get this landed in kubo?

@hsanjuan
Copy link
Contributor

Hey, so this is causing my tests in cluster to break because I create a swarm of dual-dht hosts on localhost. It was very hard to arrive here.

I don't fully understand the rationale on why Dual DHT will ignore localhost nodes now. Seems a bit arbitrary (localhost becomes second class versus everyone else in the LAN?). Probably not many people hit this, but when they hit it they will have a hard time figuring out why those dht lookups fail.

@Jorropo
Copy link
Contributor

Jorropo commented Aug 10, 2023

The LAN dht ignores localhost in order to not announce Localhost IPs to other nodes on your LAN (which cause libp2p to first attempt all localhost addresses before trying the LAN ones).
I think what happen in your case is that you have exclusively localhost addresses so all addresses are being filtered out which is a usecase we overlooked.

As a workaround you can add LAN ips to all the Kubo processes on the same machine, which has similar performance as running through localhost (MTU will be worst but QUIC does not use the higher MTU of localhost anyway)

I guess you want a triple DHT setup where you have a localhost only DHT (since we would like to avoid advertising localhost addresses on the DHT),
ideally Dual wouldn't exists anymore and you could do that with a custom Kubo config using the composable routing helpers but we never finished the composable refactor.

@hsanjuan
Copy link
Contributor

Understood. The fix is very simple for me (just tests setting dual.LanDHTOption(dht.AddressFilter(nil)),).

Given the explanation, it's better the way it is now.

@hsanjuan
Copy link
Contributor

It's also an issue coming from test-local-hosts being setup to explicitly need DHT lookups for dials, coming from a time when testing the rest of the ipfs stack was more important. I don't think anyone is going to do it this way nowadays.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants