Skip to content
This repository has been archived by the owner on May 5, 2023. It is now read-only.

Use the request's configured lookup if client-side lookup is enabled #72

Merged
merged 1 commit into from
Feb 15, 2022

Conversation

pimterry
Copy link
Contributor

HTTP requests can be configured with a lookup option, which provides a method that should be used instead of dns.lookup to resolve names. This is useful for lots of cases including DNS caching and custom resolution.

When using the Socks proxy agent with client-side DNS lookups, this was ignored, and dns.lookup was always called instead. In some environments (e.g. my use case, where I use custom resolution to resolve certain otherwise-unresolvable hostnames) this means that requests will fail to resolve only when using a socks proxy, but work fine the rest of the time.

This PR is a small change to fix that, by using the lookup param in preference to the default lookup method if it's set.

@pimterry pimterry changed the title Use the requests configured lookup if client-side lookup is enabled Use the request's configured lookup if client-side lookup is enabled Oct 29, 2021
@Kikobeats
Copy link
Collaborator

Hey @pimterry thanks for this PR, I want to integrate it 🙂

Just a comment: do you think we can use cacheable-lookup rather than the native dns.lookup?

Essentially the module is wrapping dns.lookup providing cache capabilities.

Also, we can use dnsCache parameter to provide one by the user or just new CacheableLookup() if not present

@pimterry
Copy link
Contributor Author

Just a comment: do you think we can use cacheable-lookup rather than the native dns.lookup?

Hi @Kikobeats - it's possible to do that, up to you, but honestly I wouldn't recommend it.

In my case, I am actually using a cacheable-lookup instance as the lookup option for my requests, and I've also previously written an article talking about why more people should use it - I fully agree that it's useful!

Even though it can be useful however, adding another layer of DNS caching in-process can be really confusing if you're not expecting it, and there are some subtle differences between the internal lookup behaviour of cacheable-lookup & dns.lookup, because cacheable-lookup primarily uses dns.resolve internally (the node docs has some context on why this isn't the same as dns.lookup). There are cases where cacheable-lookup will give you different results to dns.lookup, and cases where the performance will be consistently worse than just calling dns.lookup.

I do think lots of Node.js apps should cache their DNS requests, but given the tradeoffs I don't think it should enabled for everybody by default in unrelated contexts (e.g. when enabling a proxy server). Notably even Got (the library cacheable-lookup was originally built for) doesn't seem to enable it by default - it just offers it via an option for people who need DNS caching.

Does that make sense? Either way, with this PR in place, anybody who does want to use it can easily choose to so for themselves.

@Kikobeats
Copy link
Collaborator

Oh wow, thanks for the awesome explanation!

I think you're right, there are some scenarios where adding another caching layer is not worth it, although to be honest the point of my suggestion was in the way to bring to users the ability to customize the DNS layer, since there is no possible right now.

During the weekend, I released v6.2.0-beta.0 that it introduces dnsCache for that purpose:

https://github.com/TooTallNate/node-socks-proxy-agent/blob/master/src/index.ts#L138

What can be better there is to use the default DNS behavior rather than create a cacheable-lookup instance. WDYT?

Although now I have a more high level question: why this library should to take care about DNS resolution? I mean, under which circumstances make sense to sets shouldLookup: true?

@pimterry
Copy link
Contributor Author

why this library should to take care about DNS resolution? I mean, under which circumstances make sense to sets shouldLookup: true?

Sorry, just to be clear: the option this PR is introducing is just called lookup, and it's a callback, not a boolean.

shouldLookup here is internal state, not an option, and it's preexisting logic. It was a field called lookup, but I've renamed it because now that we have the option with the same name that becomes very confusing! This shouldLookup field is required because for some kinds of proxying (socks4a and socks5h) we don't do any lookup at all, ever, because that should happen on the remote end of the proxy not our end.

Hope that's clearer. Anyway, why would you want to configure a custom lookup callback? A few reasons:

  • To use custom DNS servers, e.g. inside enterprise office environments you often need to use the company DNS for all lookups, or for internal intranet sites. Alternatively, inside some microservice production environments you'll want to customize DNS servers to use tools like Consul for automatic discovery of other services.
  • To override some hostnames just inside your application, without using DNS at all. Because lookup is just a callback, you can add your own logic to match and override the results (or block lookup completely) for specific known hostnames, and then fall back to DNS like normal for everything else.
  • To support DNS lookup in offline airgapped environments, by using an internal name to IP mapping (either hardcoded, or just a file somewhere)
  • To add DNS timeouts, when you know you have very slow DNS servers and you'd rather fail requests than wait for them.
  • To add DNS caching, maybe using cacheable-lookup or maybe using any other library of your choice.
  • To rotate addresses used when there are multiple servers available, so that every request goes to a different server to even out the load they receive (note that this is the opposite of caching - every result must be different!)

In my specific example, some of the requests that I'm making are going to Docker containers on the machine, but I'm running code outside of Docker. To make this work, I run my own DNS server which connects to Docker to get all the names and possible addresses, and then I use this server to lookup IPs during requests to Docker containers, and use a SOCKS proxy that's hosted inside Docker to forward these requests (because containers are not externally routeable in virtualized Docker environments like Windows & Mac).

the point of my suggestion was in the way to bring to users the ability to customize the DNS layer, since there is no possible right now.

I released v6.2.0-beta.0 that it introduces dnsCache for that purpose:

I see what you're going for, and being able to configure DNS is definitely useful, but I don't think this library needs lots of options like this to do it for you. DNS logic is almost entirely independent from proxying logic, and it's not necessary or helpful to mix the two together.

I would recommend looking at the standard Node.js APIs and how they handle this. Node has a standard option that's used everywhere to configure DNS settings: you always just pass a custom lookup function. This is the same option that's supported on socket.connect, tls.connect, http.request, http2.connect and everywhere else.

With this PR, that same option works automatically here too. This is a great design by the Node team imo, because it allows users to do anything. By passing a callback, they can provide a lookup function that does caching, or a callback with custom overrides for some hostnames, or a callback that blocks all lookups, or a callback that uses custom DNS servers, or any other combination of that logic. It's just a callback that takes a name and returns an address.

I would keep it simple. Don't add a dnsCache option, don't add dnsResolver or dnsRotate or dnsOverride options or anything else - just let users pass a callback and leave it up to them.

@Kikobeats
Copy link
Collaborator

agree with all 👏

can you resolve the conflicts? I can do it as well

Client-side lookup is used for socks4 and socks5, but not socks,
socks4a or socks5h, which delegate lookup to the remote proxy itself.

This removes the dnsCache option that was briefly added in 6.2.0-beta.0
in favour of a more generic solution using the standard Node options,
but includes a test that adds caching using this 'lookup' option as an
alternative to using dnsCache.
@pimterry
Copy link
Contributor Author

@Kikobeats this is now updated. It has the same changes as before, but it now replaces the dnsCache option, and includes an extra test showing how you can use lookup with cachable-lookup to configure your own custom DNS caching any way you like.

As part of that, cacheable-lookup has moved to now be a dev-dependency, and there's a new dev dependency on dns2 which we use to create a quick demo DNS server in the tests.

@Kikobeats Kikobeats merged commit 1d327b2 into TooTallNate:master Feb 15, 2022
@Kikobeats
Copy link
Collaborator

What a great PR, thanks for this!

Shipped as [email protected], I want to test it for a while before land stable channel, it looks promising 🙂

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants