Transactions not being broadcasted #14669

Closed
vogelito opened this issue Jun 21, 2017 · 92 comments
Comments

@vogelito

vogelito commented Jun 21, 2017

System information

Geth version:

Geth
Version: 1.6.5-stable
Git Commit: cf87713dd42162861b7ed227f79f0638a33571df
Architecture: amd64
Protocol Versions: [63 62]
Network Id: 1
Go Version: go1.7
Operating System: linux
GOPATH=
GOROOT=/usr/local/go

OS & Version: Ubuntu 14.04.5 LTS, Trusty Tahr

Expected behaviour

eth.sendTransaction returns a hash and the transaction should be broadcasted to the network.

Actual behaviour

eth.sendTransaction returns a hash but the transaction is never broadcasted to the network.

Steps to reproduce the behaviour

This tends to happen when the network is congested (usually around large ICOs). But calling sendTransaction during this time returns transaction hashes and these seem to just get lost in the ether.

More background

We operate an ETH exchange and stopped withdrawals about 18 hours ago to prevent getting ourselves into a bigger mess and stop adding things to the queue. We had to upgrade and restart geth yesterday as 1.6.0 was refusing to sync and was losing peers. 1.6.5 is now syncing without issues.

When sending transactions we use dynamic gas prices, using the recommended value returned by eth.gasPrice.

Even after upgrading to 1.6.5 we see this behavior. Our node hasn't restarted since we upgraded to 1.6.5, we have txhashes returned by geth when calling sendTransaction, but txpool.content.pending[sending_address] returns undefined.

More questions

  1. When is it safe to credit funds back to users? I understand it's theoretically possible that after calling sendTransaction the txs were broadcasted and are still in some other node's txpool
  2. How do we prevent this from happening in the future?
  3. What else can we do to help you debug this issue?
@holiman
Contributor

holiman commented Jun 21, 2017

Thanks for the report.

  1. It's obviously not in txpool.content.pending[sending_address]. How about txpool.content.queued[sending_address]? If the tx is deemed non-executable, e.g. due to a nonce gap, it would wind up in queued instead of pending.

Some answers:

  1. If you sign a transaction X to a user A with nonce N and then "lose" that transaction, it remains valid forever, or until you create a new transaction Y that also has nonce N and is included in a block. Transaction Y can be sent to yourself, to clear that tx and 'burn' that nonce.
  2. You can keep a record of nonce : transaction_info , so you know which transaction has gone missing when you notice that there's a nonce-gap keeping the other transactions from executing.
  3. Provide details on how you submit transactions.
  • Sync or async?
  • Which rpc-method - eth_sendTransaction or eth_sendRawTransaction or personal... ?
  • Do you suffer from nonce gaps (which you should, if transactions get lost), or not (which implies that transactions are dropped and the next one gets the same nonce)?
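
To make the nonce record suggested in answer 2 concrete, here is a minimal sketch. The function name, the record shape, and the sample values are illustrative assumptions, not anything geth provides. It assumes you already know how many of the sender's transactions have been mined (for example from eth.getTransactionCount) and which nonces the node still holds in its pool.

```javascript
// Hypothetical helper: names and record shape are illustrative, not part of geth.
// ledger:         object mapping nonce -> info about the transaction we signed
// confirmedCount: number of the sender's transactions already mined, i.e. the
//                 next nonce the chain expects
// poolNonces:     nonces the node currently holds for the sender (pending + queued)
function findMissingNonces(ledger, confirmedCount, poolNonces) {
  var inPool = {};
  poolNonces.forEach(function (n) { inPool[n] = true; });

  var missing = [];
  Object.keys(ledger).forEach(function (key) {
    var nonce = Number(key);
    // Nonces below confirmedCount were consumed by some mined transaction.
    // Anything at or above it that the node no longer holds is a candidate
    // "lost" transaction.
    if (nonce >= confirmedCount && !inPool[nonce]) {
      missing.push(nonce);
    }
  });
  return missing.sort(function (a, b) { return a - b; });
}

// Example with made-up records: nonce 5 is mined, 7 is still in the pool,
// so only 6 is reported.
var exampleLedger = {
  5: { to: "0xaaa", value: "1" },
  6: { to: "0xbbb", value: "2" },
  7: { to: "0xccc", value: "3" }
};
var exampleMissing = findMissingNonces(exampleLedger, 6, [7]);
```

Note that a nonce below confirmedCount only tells you that some transaction with that nonce was mined. To know it was the one you recorded, compare the mined transaction's hash with the hash in your record.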

@vogelito
Author

vogelito commented Jun 21, 2017

Thanks @holiman.

I can confirm that the txs are not in txpool.content.queued[sending_address]. We actually monitor the length of txpool.content.queued[sending_address] and can confirm that we never see it increase, so this doesn't seem to be a nonce-gap issue.
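
For anyone who wants to run the same check, here is a rough sketch of looking up one address in a txpool.content snapshot. It assumes the snapshot is shaped as pending/queued, then address, then nonce; poolNoncesFor is a made-up name, and the case-insensitive address match is a defensive assumption rather than documented behaviour.

```javascript
// content: a snapshot shaped like the result of txpool.content, assumed to be
//   { pending: { address: { nonce: ... } }, queued: { address: { nonce: ... } } }
// Returns the nonces held for `address` in each section, sorted ascending.
function poolNoncesFor(content, address) {
  var wanted = address.toLowerCase();

  function noncesIn(section) {
    var result = [];
    Object.keys(section || {}).forEach(function (key) {
      // Compare case-insensitively so the caller need not match key casing.
      if (key.toLowerCase() === wanted) {
        Object.keys(section[key]).forEach(function (n) {
          result.push(Number(n));
        });
      }
    });
    return result.sort(function (a, b) { return a - b; });
  }

  return {
    pending: noncesIn(content.pending),
    queued: noncesIn(content.queued)
  };
}
```

An address that is missing from both sections yields two empty arrays, which corresponds to the undefined results described in this thread.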

We use eth_sendTransaction and I believe that nonces are somehow getting rewritten by geth. We do not manage nonces ourselves; we let geth manage them for us. We make the eth_sendTransaction call synchronously.

I have a process that repeatedly stores the output of txpool.content.pending[sending_address]. I have the output of 20-some transactions that appear in my dump but never made it into the blockchain. However, there are other, completely unrelated transactions that did make it into the blockchain with the nonces of the 20-some transactions I have a record of.

Geth was not restarted during this time (pid has been up for 77305 seconds at the time of writing).

@holiman
Contributor

holiman commented Jun 21, 2017

> I have the output of 20 some transactions that appear on my dump but never made it into the blockchain.

What do you mean by "output" - the returned hash or the complete object(s) ?
Were the transactions larger than the others around it ('around' in terms of nonce-space) - I'm wondering if there was maybe too low balance for one, so it was quickly dropped, and then the next came in. Has the balance been hovering close to zero ?

@holiman
Contributor

holiman commented Jun 21, 2017

I'd be interested in comparing one that failed (the full thing) with the one that replaced it. If you have the full details, you can mail one to me at martin.swende on the domain ethereum.org if you don't want to post it publicly.

@vogelito
Author

> What do you mean by "output" - the returned hash or the complete object(s) ?
> Were the transactions larger than the others around it ('around' in terms of nonce-space) - I'm wondering if there was maybe too low balance for one, so it was quickly dropped, and then the next came in. Has the balance been hovering close to zero ?

By output I meant the contents returned by txpool.content.pending[sending_address], so the complete objects. The balance was not even close to zero.

Sent you an email with the details.

@vogelito
Author

Our node kept some of the transactions that have been invalidated by other transactions with the same nonce. They were obviously not mined, but they don't appear in the txpool. @holiman I've sent you another email with these additional details.

@ucwong
Contributor

ucwong commented Jun 22, 2017

@holiman What do you mean by "lose"? That it's not in the queue? And where will this transaction go; will it be included in the blockchain in the future?

@holiman
Contributor

holiman commented Jun 22, 2017

> @holiman What do you mean by "lose"? That it's not in the queue? And where will this transaction go; will it be included in the blockchain in the future?

I mean, if you sign a transaction, and later on cannot find that transaction - and it isn't included in a block, but has potentially already been broadcast anyway. That tx is valid forever, or until you replace it with another transaction with the same nonce.

@ucwong
Contributor

ucwong commented Jun 22, 2017

@holiman Thank you for your reply. So what I should do is wait; it will be included in the blockchain some day, I think. Right?

[image: screenshot of the command, not preserved]

Nothing is returned by this command. Is anything dangerous?

@ucwong
Contributor

ucwong commented Jun 22, 2017

@holiman How do I send a transaction with the same nonce?

@holiman
Contributor

holiman commented Jun 22, 2017

@ucwong if you have a similar issue, please either enter some details (so we know if it's the same) or open a separate ticket.

You can send a transaction with the same nonce simply by specifying nonce:

var tx = { from: eth.accounts[0], to: xxx, nonce: some_nonce };
eth.sendTransaction(tx);

But you need to know the nonce of the transaction you're trying to replace.
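
As a rough illustration of a replacement that just clears a nonce, the sketch below builds a zero-value transfer from the sender to itself. buildReplacementTx and bumpPercent are hypothetical names, and the gas price is assumed to be a plain number of wei small enough for ordinary JavaScript arithmetic. Whether the node insists on a higher gas price than the original before it accepts a replacement depends on its version and configuration, so the caller chooses the bump.

```javascript
// Hypothetical helper, not a geth API: builds a transaction object that reuses
// `nonce` and sends 0 wei from the sender back to itself.
// oldGasPriceWei: gas price of the transaction being replaced (number, in wei)
// bumpPercent:    how much to raise the gas price, chosen by the caller
function buildReplacementTx(sender, nonce, oldGasPriceWei, bumpPercent) {
  var newGasPrice = Math.ceil(oldGasPriceWei * (100 + bumpPercent) / 100);
  return {
    from: sender,
    to: sender,        // sending to yourself only consumes the nonce
    value: 0,
    gas: 21000,        // gas used by a plain value transfer
    gasPrice: newGasPrice,
    nonce: nonce
  };
}

// Example with made-up values: replace nonce 42, raising 20 gwei by 10%.
var replacement = buildReplacementTx("0x0000000000000000000000000000000000000001", 42, 20000000000, 10);
```

The resulting object can then be passed to eth.sendTransaction in the same way as the tx object above.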

@ucwong
Contributor

ucwong commented Jun 22, 2017

@holiman Yes, I think I have a similar issue.
I sent ETH and received a txnId and recipient, so I think it should have been broadcast to the ETH network. But I can't find it in the blockchain.
https://etherscan.io/tx/0x638d8e91a55226a00f582e571f76cf1b08ee02ad991bf0a58254ef849a5ce46a
I can't confirm the status of this txn now or what I should do.
By the way, I only know the txnId but not the nonce. Where can I find the nonce?

@holiman
Contributor

holiman commented Jun 22, 2017

@ucwong I see you have a ticket at #14672. Let's keep your info there for now, because I still think they are separate issues, since this ticket deals with a massive number of transactions.

@ucwong
Contributor

ucwong commented Jun 22, 2017

@holiman thank you

@vogelito
Author

vogelito commented Jun 23, 2017

As an update:

We had roughly 70 transactions that disappeared as described above (geth returned a transaction hash, the transactions appeared at some point in the txpool, then they were replaced by other transactions with the same nonce). When we send transactions through the JSON-RPC interface, we send them the following way (pseudo code, but all relevant bits are here):

tx = {
    "from": sending_address,
    "to": destination_address,
    "gas": 0xA410, // 42000,
    "gasPrice": eth_gasPrice(), // use whatever geth recommends
    "value": amount_to_send
};
hash = eth_sendTransaction(tx);

To reiterate, we let geth manage our nonces.

The above code is called via a single threaded queue processing system that simply monitors the queue and pops elements when/if there are any transactions waiting to be processed.
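
For comparison, here is a sketch of the same send path with the nonce recorded explicitly, which is one way to notice the nonce reuse described earlier. It assumes a single-threaded sender like the queue above and an eth object offering getTransactionCount(address, "pending") and sendTransaction(tx) as in the geth console. sendAndRecord and the ledger object are hypothetical names.

```javascript
// Hypothetical wrapper: asks the node for the next nonce, refuses to reuse a
// nonce we already recorded, and stores nonce -> { hash, tx } for later checks.
// eth:    object with getTransactionCount(address, "pending") and sendTransaction(tx)
// ledger: plain object owned by the caller, persisted however you see fit
function sendAndRecord(eth, ledger, tx) {
  var nonce = eth.getTransactionCount(tx.from, "pending");

  // If the node proposes a nonce we already used, it no longer counts our
  // earlier transaction as pending. Stop instead of silently overwriting it.
  if (Object.prototype.hasOwnProperty.call(ledger, nonce)) {
    throw new Error("nonce " + nonce + " proposed again; earlier tx " +
      ledger[nonce].hash + " may have been dropped");
  }

  var withNonce = {
    from: tx.from,
    to: tx.to,
    gas: tx.gas,
    gasPrice: tx.gasPrice,
    value: tx.value,
    nonce: nonce
  };
  var hash = eth.sendTransaction(withNonce);
  ledger[nonce] = { hash: hash, tx: withNonce };
  return hash;
}
```

Throwing on a repeated nonce is a design choice for the sketch: it leaves the decision to resend, replace, or credit funds back with the operator instead of guessing.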

Given that we had those 70ish transactions that needed reprocessing (and that we had already built the infrastructure to check for missing ETH transactions, record nonces, etc.), I decided to write something that would send those 70ish transactions as quickly as I could figure out how to via geth's JSON-RPC interface, using an asynchronous, non-blocking process. 70 transactions is not a whole lot, but every single transaction ended up in the ethereum blockchain without an issue.

The only times we've seen this issue are when the ethereum network is under high stress.

We start geth with the following parameters:
--rpc --rpcapi "personal,eth,web3" --targetgaslimit 1000000 --cache 1024

Happy to try some more parameters before the next ICO and report further.

@etscrivner

etscrivner commented Jul 5, 2017

Can confirm that Coinbase is seeing this as well following the upgrade to v1.6.5 and continuing into v1.6.6.

The problem is solved by restarting or re-deploying the nodes and then rebroadcasting the transactions. When the logs are examined, the nodes are happily syncing and downloading blocks, and they even broadcast the occasional transaction, but for some reason the majority of transactions are going into a black hole from which they are never broadcast.

Happy to collect more data / logs if they would be helpful in getting to the bottom of this. Currently we see this every 1-4 days and have taken to running a group of hot secondaries to quickly recover when this issue occurs. My wild guess would be that this is somehow related to transaction volume? Hence why it doesn't appear to be very common.

@krobertson

@holiman do you think #14737 may resolve this issue? I was looking into it at Coinbase and noticed the new logic in master, and saw v1.6.7 was just released with it. Comparing 1.6.6 with master, it seemed like it could be possible that the transaction would be accepted, but then evicted before it is processed due to either gas price or capacity when network traffic is high.

@holiman
Contributor

holiman commented Jul 12, 2017 via email

@vogelito
Author

vogelito commented Jul 17, 2017

We have migrated to v1.6.7 as of this morning and will report any future findings.

Unfortunately, this morning (while still on v1.6.6) we saw geth return hashes for txs that didn't get broadcasted. However, this happened when the account had a balance close to 0, so it might be related to #14361 instead.

@krobertson

We upgraded to v1.6.7 the day it came out and haven't seen the issue since then. We're continuing to monitor.

@vogelito
Author

We unfortunately had to downgrade to v1.6.6 due to #14838

@vogelito
Author

This issue has gotten worse as our transaction volume has increased.

@krobertson

We've still been experiencing the issue as well. We've got in the habit of just regularly cycling our nodes.

@vogelito
Author

We just updated to 1.7.0 to see if this solves the issue

@raj-wadhwa

I'm also facing the same issue. Can you please confirm if the issue is solved with the latest version?
Thanks in advance.

@phutchins

I believe I'm hitting this same issue as well. Sending out 5300 transactions leaves a period of time where my node must stay online for all of the transactions to go out. I'm on 1.7.0. I'm getting hashes from the send but if I shut my node down too soon after, some of the transactions don't go out.

@vogelito
Author

We haven't seen this issue since upgrading to 1.7.0

@CockyCat

I'm also facing the same issue.

@plavsicm

plavsicm commented Dec 5, 2017

Hello,

I think I have the same issue. I wanted to transfer my ETH from CEX.io to Poloniex, and the transaction was successfully generated and confirmed on CEX. Unfortunately, it was never seen on Poloniex (even though the amount is higher than 0.5 ETH, which is the minimum for Poloniex). Is there a way to check what happened?

This is the transaction ID:
0xf63b25137e4e1321825275f97da8573f3ed6dcf86c29564aae4cb2ac561f13b5

And I can't see it on etherscan.io.
Can you please help or give guidance on what I should do?
Thank you in advance.

Best regards,
Milan

@ApsOps

ApsOps commented May 11, 2018

I've hit this multiple times as well.

Geth logs say:

INFO [05-09|13:53:25] Submitted transaction  fullhash=0xREDACTED recipient=0xREDACTED

But this txid doesn't show up on etherscan.

It's also not present in txpool.content.pending[sending_address] or txpool.content.queued[sending_address].

/ # geth version
Geth
Version: 1.8.7-stable
Git Commit: 66432f3821badf24d526f2d9205f36c0543219de
Architecture: amd64
Protocol Versions: [63 62]
Network Id: 1
Go Version: go1.10.1
Operating System: linux
GOPATH=
GOROOT=/usr/local/go

@vogelito
Author

They should be present in eth.pendingTransactions.

Our current fix for this is to get the list of pending transactions and manually relay them to some of our other nodes.
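
A rough sketch of what such a relay loop could look like. relayPending, getRaw, and the sender callbacks are hypothetical names. It assumes you can obtain the signed raw bytes for each pending hash and that each callback forwards them to another node, for example through that node's eth_sendRawTransaction method.

```javascript
// Hypothetical relay loop, not a geth API.
// pendingTxs: array of objects with a `hash` field (e.g. from eth.pendingTransactions)
// getRaw:     function(hash) -> signed raw transaction hex, or null if unavailable
// senders:    array of function(rawHex), one per node to relay to
// Returns the hashes for which a raw transaction was found and relayed.
function relayPending(pendingTxs, getRaw, senders) {
  var relayed = [];
  pendingTxs.forEach(function (tx) {
    var raw = getRaw(tx.hash);
    if (!raw) {
      return; // nothing to relay for this hash
    }
    senders.forEach(function (sendRaw) {
      try {
        sendRaw(raw);
      } catch (e) {
        // One node rejecting the transaction (for instance because it already
        // has it) should not stop the relay to the remaining nodes.
      }
    });
    relayed.push(tx.hash);
  });
  return relayed;
}
```

Relaying the identical signed transaction does not change its hash or nonce, so sending it to several nodes cannot produce a second payment.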

@ApsOps

ApsOps commented May 11, 2018

@vogelito I checked eth.pendingTransactions as well. They're not present.

Is there a possibility that these pending transactions are cleared when the geth node is restarted?

@vogelito
Author

Perhaps. The safest thing to do, in my opinion, is to resubmit your transaction by specifying the nonce that your lost transaction should've had.

@adamschmideg
Contributor

@vogelito are you still having the original issue?

@vogelito
Author

@adamschmideg Unclear, to be honest. We have a secondary parity node and a script that relays transactions to it.

This has been working fine for months, but I'd be happy to turn it off and monitor for any problems.

@adamschmideg
Contributor

@vogelito I'm going to mark this issue as a nice-to-have, since it's not critical for you and not many other users have reported it as a problem. Please let me know if you disagree.

@vogelito
Author

Sounds good to me. Thanks

@fjl
Contributor

fjl commented Jun 13, 2019

Related: #19705

@vkuznecovas

Can confirm that I'm also encountering this issue. We're currently in the test phase on goerli but occasionally see that transactions do not get broadcast to the network.

I'm using the go-ethereum/ethclient sendTransaction method.

I'm getting back a transaction hash but am unable to find the transaction afterwards. They just seem to vanish into thin air.

It seems to be related to congestion on the network, as it fails with higher probability during times of heavy congestion (when the transactions that do appear on the network tend to stay in a pending state for quite a while), although this is not a guarantee.

It also seems more likely to happen if I'm making quite a few transactions in a short period of time, but I cannot reliably reproduce it.

@developer2belfrics

@holiman The transaction was sent to the geth server; I confirmed this with eth.getTransaction(hash). eth.getTransactionReceipt() returns null. I also checked txpool.content.queued[sending_address] and txpool.content.pending[sending_address], but both return undefined. Can you please tell me what went wrong? Two transactions were made on Feb 14th and 20th but have still not been posted to the network.
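
The two lookups mentioned here can be combined into a simple status check. The sketch below is only an illustration: classifyTx and its labels are made up. It relies on eth.getTransaction returning null for a hash the node does not know and a null blockNumber for a transaction that is not yet in a block.

```javascript
// Hypothetical helper: interprets the results of eth.getTransaction(hash) and
// eth.getTransactionReceipt(hash) as seen by ONE node.
function classifyTx(tx, receipt) {
  if (receipt) {
    return "mined";            // a receipt exists only once the tx is in a block
  }
  if (!tx) {
    return "unknown-to-node";  // this node has no record of the hash
  }
  if (tx.blockNumber === null || tx.blockNumber === undefined) {
    return "known-but-unmined"; // the node knows the tx, but it is not in a block
  }
  return "mined";
}
```

A "known-but-unmined" result only says that this node still knows the transaction. It does not show that any peer ever received it, which is the situation reported in this thread.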

@pblab-dev

Same error here.

@nnzo

nnzo commented Jun 4, 2020

Having exactly this problem. #21167

@rjl493456442
Member

It's been fixed in v1.9.16

@coinwalletdev

coinwalletdev commented Nov 13, 2020

-- (moved to #21385 (comment))

@zeleniy

zeleniy commented Jun 4, 2021

I have the same problem as the topic starter. There are a lot of similar issues, but all of them were closed without any solution:

After sending a transaction I get the data structure described in #21167, which contains a hash, but it never appears on etherscan. Is that OK? What is the sense of returning tx data for a transaction that was never created? Or, if it was created, where is it?

@konradkonrad
Contributor

We see that issue, too. However, I don't think resurrecting this issue is the right way to deal with this...

@zeleniy

zeleniy commented Jun 7, 2021

Guys, what are you working on? Distributed data storage which doesn't provide storage ability? Maybe it will save data, maybe no. What is the right way to deal with this? Open one more issue?
