eth_getFilterChanges returns "filter not found" #11589

Closed
5 of 11 tasks
juliangruber opened this issue Jan 22, 2024 · 11 comments
@juliangruber
Member

juliangruber commented Jan 22, 2024

Checklist

  • This is not a security-related bug/issue. If it is, please follow the security policy.
  • I have searched on the issue tracker and the lotus forum, and there is no existing related issue or discussion.
  • I am running the latest release, the most recent RC (release candidate) for the upcoming release, or the dev branch (master), or have an issue updating to any of these.
  • I did not make any code changes to lotus.

Lotus component

  • lotus daemon - chain sync
  • lotus fvm/fevm - Lotus FVM and FEVM interactions
  • lotus miner/worker - sealing
  • lotus miner - proving(WindowPoSt/WinningPoSt)
  • lotus JSON-RPC API
  • lotus message management (mpool)
  • Other

Lotus Version

lotus deployed to glif

Repro Steps

$ curl https://api.node.glif.io/rpc/v0 -d'{"jsonrpc":"2.0","id":1,"method":"eth_newFilter","params":[{"topics":["0x2e84339036b9caef6da03565dd37a42d041d8af759ccfddc01625856146ce473"],"addresses":["0x811765acce724cd5582984cb35f5de02d587ca12"]}]}'
{"jsonrpc":"2.0","result":"0x43baae26e5514378adc824ca03b261c100000000000000000000000000000000","id":1}
$ sleep 10 # `sleep 0` and `sleep 5` also don't work
$ curl https://api.node.glif.io/rpc/v0 -d'{"jsonrpc":"2.0","id":1,"method":"eth_getFilterChanges","params":["0x43baae26e5514378adc824ca03b261c100000000000000000000000000000000"]}'
{"jsonrpc":"2.0","id":1,"error":{"code":1,"message":"filter not found"}}

Describe the Bug

After upgrading to ethers@6, subscribing to events fails. See the repro steps above: eth_getFilterChanges responds with "filter not found" even though it was called with the ID returned from eth_newFilter.

Logging Information

This was on glif. Same results on chain.love.

I tried reproducing locally, but failed on this:

{"jsonrpc":"2.0","id":1,"error":{"code":-32601,"message":"method 'eth_newFilter' not found"}}

I did already set EnableEthRPC = true

@juliangruber
Member Author

For anyone else having this issue, https://github.com/filecoin-station/on-contract-event/tree/main is a temporary workaround

@juliangruber
Member Author

Thanks to @dumikau for finding this code path in lotus-gateway, which is most likely the problem. When connecting to lotus directly, everything works as expected.

/* FILTERS: Those are stateful.. figure out how to properly either bind them to users, or time out? */

func (gw *Node) EthGetFilterChanges(ctx context.Context, id ethtypes.EthFilterID) (*ethtypes.EthFilterResult, error) {
	if err := gw.limit(ctx, stateRateLimitTokens); err != nil {
		return nil, err
	}

	ft := statefulCallFromContext(ctx)
	ft.lk.Lock()
	_, ok := ft.userFilters[id]
	ft.lk.Unlock()

	if !ok {
		return nil, filter.ErrFilterNotFound
	}

	return gw.target.EthGetFilterChanges(ctx, id)
}

@rvagg
Member

rvagg commented May 6, 2024

lotus/gateway/handler.go

Lines 89 to 96 in 1b2dde1

func (h RateLimiterHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	r = r.WithContext(context.WithValue(r.Context(), perConnLimiterKey, h.limiter))
	// also add a filter tracker to the context
	r = r.WithContext(context.WithValue(r.Context(), statefulCallTrackerKey, newStatefulCallTracker()))
	h.handler.ServeHTTP(w, r)
}

Every HTTP request gets its own new statefulCallTracker, which is apparently by design and intended only for websocket connections:

lotus/gateway/proxy_eth.go

Lines 647 to 648 in 1b2dde1

// called per request (ws connection)
func newStatefulCallTracker() *statefulCallTracker {

The problem is that filters are long-lived inside a Lotus node and it's perfectly valid to do this via non-websocket requests.
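To make the failure mode concrete, here is a minimal sketch of the behaviour described above. It is not the lotus-gateway code: `tracker`, `overHTTP` and `overWebsocket` are hypothetical names, and the tracker is reduced to a set of filter IDs. It only illustrates why a tracker created per HTTP request can never see a filter registered by an earlier request, while a tracker that lives as long as a websocket connection can.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errFilterNotFound stands in for filter.ErrFilterNotFound (assumption:
// only the error identity matters for this illustration).
var errFilterNotFound = errors.New("filter not found")

// tracker is a simplified, hypothetical stand-in for statefulCallTracker.
type tracker struct {
	lk          sync.Mutex
	userFilters map[string]struct{}
}

func newTracker() *tracker {
	return &tracker{userFilters: make(map[string]struct{})}
}

// newFilter records a filter ID in this tracker, as eth_newFilter would.
func (t *tracker) newFilter(id string) {
	t.lk.Lock()
	defer t.lk.Unlock()
	t.userFilters[id] = struct{}{}
}

// getFilterChanges mirrors the gateway check: IDs unknown to this tracker
// are rejected before the call would be forwarded to the node.
func (t *tracker) getFilterChanges(id string) error {
	t.lk.Lock()
	defer t.lk.Unlock()
	if _, ok := t.userFilters[id]; !ok {
		return errFilterNotFound
	}
	return nil
}

// overHTTP simulates two separate HTTP requests, each with a fresh tracker.
func overHTTP(id string) error {
	newTracker().newFilter(id) // request 1: eth_newFilter
	// request 2: eth_getFilterChanges sees an empty tracker
	return newTracker().getFilterChanges(id)
}

// overWebsocket simulates one connection that keeps a single tracker.
func overWebsocket(id string) error {
	t := newTracker()
	t.newFilter(id)
	return t.getFilterChanges(id)
}

func main() {
	fmt.Println("http:", overHTTP("0x01"))
	fmt.Println("websocket:", overWebsocket("0x01"))
}
```

In this sketch the HTTP path always fails, regardless of how quickly the second request follows the first, which matches the `sleep 0` observation in the repro steps.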

It seems to me that the desire here is to partition the filter and subscription space per-user, but that's not really possible to achieve with the way this all works.

However, filter IDs are generated via UUIDv4, so we have some guarantees about uniqueness and guess-ability already. I'm not sure what other leakage we would try and protect against in a public gateway. So, we could either share a statefulCallTracker across all requests, or just do away with it entirely since it just proxies to the original calls which do essentially the same map look-up operation.

@magik6k am I missing something from 22231dc and 1286d76? Is there a reason I'm missing that we can't just pass these through without checking?

@bajtos

bajtos commented May 16, 2024

FWIW, it's easy to configure Ethers v6 ethers.JsonRpcProvider to use the old polling-based approach that uses the well-supported RPC method eth_getLogs:

const provider = new ethers.JsonRpcProvider(fetchRequest, undefined, {
  polling: true
})

@rvagg
Member

rvagg commented May 17, 2024

IMO the action item here is to remove the stateful call tracker from this call path and just pass it through to the node; I don't see a good reason it's gated.

@rvagg
Member

rvagg commented Jul 26, 2024

Looking at this again; the tracking was originally introduced in #9863, and then extended in #10027 to cover subscribe.

  • userFilters is only used to track the number of filters applied per connection. EthMaxFiltersPerConn is fixed to 16, and when the number of filters reaches this number for a particular connection then they'll be rejected.
  • userSubscriptions is only used to track the number of Subscribe calls and also check it against EthMaxFiltersPerConn.
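The counting role of these two maps can be sketched roughly as follows. This is an illustration only: `connTracker`, `install` and `uninstall` are hypothetical names, the limit of 16 is taken from the text above, and the exact boundary (whether the 16th or the 17th filter is the first to be rejected) is an assumption rather than something checked against the lotus source.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// maxFiltersPerConn mirrors the value of 16 quoted for EthMaxFiltersPerConn.
const maxFiltersPerConn = 16

var errTooManyFilters = errors.New("too many filters for this connection")

// connTracker is a hypothetical per-connection filter counter.
type connTracker struct {
	lk      sync.Mutex
	filters map[string]struct{}
}

func newConnTracker() *connTracker {
	return &connTracker{filters: make(map[string]struct{})}
}

// install records a new filter, or rejects it once the limit is reached.
// Assumption: up to maxFiltersPerConn filters are accepted.
func (c *connTracker) install(id string) error {
	c.lk.Lock()
	defer c.lk.Unlock()
	if len(c.filters) >= maxFiltersPerConn {
		return errTooManyFilters
	}
	c.filters[id] = struct{}{}
	return nil
}

// uninstall frees the slot held by a filter, if it exists.
func (c *connTracker) uninstall(id string) {
	c.lk.Lock()
	defer c.lk.Unlock()
	delete(c.filters, id)
}

func main() {
	c := newConnTracker()
	for i := 0; i <= maxFiltersPerConn; i++ {
		if err := c.install(fmt.Sprintf("filter-%d", i)); err != nil {
			fmt.Printf("filter %d rejected: %v\n", i, err)
		}
	}
}
```

A counter like this only means something if the tracker outlives a single call, which is why it works per websocket connection and not per HTTP request.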

It seems to me that the desire here is to partition the filter and subscription space per-user

My original comment from above is wrong. The purpose of these checks is to limit the number of filters installed on a lotus node for each "user", which is an appropriate thing for a gateway to do because of the cost of having active filters.

This works fine when using websockets, but we currently don't have any per-IP tracking, and even if we did we'd have to deal with people using reverse proxies in front of lotus-gateway (like glif does). We're then in the realm of deciding whether to accept X-Forwarded-For or not (fine if you have a reverse proxy, dangerous if you don't). We can't give cookies because people are using this from curl or libraries that don't support cookies (making an assumption here about ethers).
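The X-Forwarded-For trade-off could look something like the sketch below. Nothing here exists in lotus-gateway: `clientKey` and the `trustProxy` switch are hypothetical, and the sketch assumes a single trusted reverse proxy that appends the address it saw as the last entry of the header. In a real handler the two string arguments would come from `r.RemoteAddr` and `r.Header.Get("X-Forwarded-For")`.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// clientKey returns the address used to attribute filters to a "user".
// trustProxy is a hypothetical operator setting: enable it only when a
// trusted reverse proxy sits in front, because otherwise any client can
// put an arbitrary address into the header.
func clientKey(remoteAddr, forwardedFor string, trustProxy bool) string {
	if trustProxy && forwardedFor != "" {
		// The header is a comma-separated list. Assumption: the trusted
		// proxy appends, so the last entry is the peer it saw; earlier
		// entries are client-supplied and not trustworthy.
		parts := strings.Split(forwardedFor, ",")
		last := strings.TrimSpace(parts[len(parts)-1])
		if ip := net.ParseIP(last); ip != nil {
			return ip.String()
		}
	}
	// Fall back to the TCP peer address, without the port.
	host, _, err := net.SplitHostPort(remoteAddr)
	if err != nil {
		return remoteAddr
	}
	return host
}

func main() {
	fmt.Println(clientKey("10.0.0.1:1234", "203.0.113.7", true))
	fmt.Println(clientKey("10.0.0.1:1234", "203.0.113.7", false))
}
```

With `trustProxy` disabled, every client behind the same reverse proxy collapses into one key, which is the per-IP tracking problem described above.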

It seems like glif doesn't expose websockets, but api.chain.love does, so this ~works (at least it doesn't error; I don't know an address to use to get something more active):

import { ethers } from 'ethers'

const provider = new ethers.WebSocketProvider('wss://api.chain.love/rpc/v1')

console.log('provider:', provider)

const filterId = await provider.send('eth_newFilter', [{
    address: ['0x811765acce724cd5582984cb35f5de02d587ca12'],
    topics: []
}])

console.log('filterId:', filterId)

provider.on('block', async() => {
    const logs = await provider.send('eth_getFilterChanges', [filterId])
    console.log('logs:', logs)
})

I think that we might be forced to just block these stateful API endpoints over HTTP, as suggested in #11153, unless we want to go down the rabbit hole of per-IP tracking. We could also encourage public API providers to offer a websocket option.

I'd really like to know how this is handled in Ethereum-land. How do public providers offer this normally?

@rvagg
Member

rvagg commented Jul 26, 2024

I was thinking that something like the Arbitrum option gets us around the limit problems with this. We get rid of the per-connection limit entirely but set up a liveness check in the gateway that will automatically remove the filter from the lotus node if it's not polled after a certain period of time.
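A liveness check of that kind might be sketched as below. All names (`livenessTracker`, `touch`, `expire`, `demo`) are hypothetical and the 5-minute TTL is an arbitrary value for the example; actually uninstalling a stale filter on the lotus node is left to the caller and not shown.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// livenessTracker is a hypothetical structure that remembers when each
// filter was last polled.
type livenessTracker struct {
	lk       sync.Mutex
	ttl      time.Duration
	lastPoll map[string]time.Time
}

func newLivenessTracker(ttl time.Duration) *livenessTracker {
	return &livenessTracker{ttl: ttl, lastPoll: make(map[string]time.Time)}
}

// touch records activity for a filter. It would be called when the filter
// is created and on every eth_getFilterChanges for it.
func (l *livenessTracker) touch(id string, now time.Time) {
	l.lk.Lock()
	defer l.lk.Unlock()
	l.lastPoll[id] = now
}

// expire forgets and returns every filter not polled within the TTL.
// The caller would then uninstall those filters on the node.
func (l *livenessTracker) expire(now time.Time) []string {
	l.lk.Lock()
	defer l.lk.Unlock()
	var stale []string
	for id, last := range l.lastPoll {
		if now.Sub(last) > l.ttl {
			stale = append(stale, id)
			delete(l.lastPoll, id)
		}
	}
	return stale
}

// size reports how many filters are still tracked.
func (l *livenessTracker) size() int {
	l.lk.Lock()
	defer l.lk.Unlock()
	return len(l.lastPoll)
}

// demo installs two filters, polls only one of them, then sweeps.
func demo() (stale []string, remaining int) {
	start := time.Unix(0, 0)
	t := newLivenessTracker(5 * time.Minute)
	t.touch("idle", start)
	t.touch("active", start)
	t.touch("active", start.Add(4*time.Minute)) // polled again
	stale = t.expire(start.Add(6 * time.Minute))
	return stale, t.size()
}

func main() {
	stale, remaining := demo()
	fmt.Println("evicted:", stale, "still tracked:", remaining)
}
```

Passing `now` in explicitly keeps the sketch deterministic; a real sweep would presumably run on a timer.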

I wouldn't mind offering more options for public API providers, but this is something we could evolve over time. And already now they have the option of excluding these APIs from what they offer with a reverse proxy and they could even do API key gating too.

@rvagg
Member

rvagg commented Jul 27, 2024

liveness check in the gateway that will automatically remove the filter from the lotus node if it's not polled after a certain period of time.

Alas we already have that with FilterTTL in the lotus node itself, which defaults to 24 hours. We probably want to document that this should be reduced dramatically for multi-tenant nodes.

@rvagg
Member

rvagg commented Jul 31, 2024

After some discussion on Slack I think that the way forward here is to:

@rvagg
Member

rvagg commented Aug 1, 2024

This should be resolved in #12327

@rjan90
Contributor

rjan90 commented Sep 12, 2024

Closing as completed, as this should be resolved by #12327, which shipped in Lotus v1.29.0. Most RPC providers have updated to that version by now. Please reopen if you still encounter this issue @juliangruber

@rjan90 rjan90 closed this as completed Sep 12, 2024
@github-project-automation github-project-automation bot moved this to 🎉 Done in FilOz Sep 12, 2024
@rjan90 rjan90 moved this from 🎉 Done to ☑️Done (Archive) in FilOz Sep 12, 2024