Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support TBA restarts/transient errors more gracefully #9

Open
nmlorg opened this issue Aug 16, 2024 · 3 comments
Open

Support TBA restarts/transient errors more gracefully #9

nmlorg opened this issue Aug 16, 2024 · 3 comments
Assignees
Labels
bug Something isn't working

Comments

@nmlorg
Copy link
Owner

nmlorg commented Aug 16, 2024

Despite what I recently suggested, TBA does go wonky every now and then, flooding the logs with both:

2024-08-13 20:16:49,251 ERROR loop.py:55] Ignoring uncaught error while polling:
Traceback (most recent call last):
  File "lib/python3.10/site-packages/ntelebot/loop.py", line 45, in _poll_bot
    updates = bot.get_updates(offset=offset, timeout=timeout)
  File "lib/python3.10/site-packages/ntelebot/bot.py", line 75, in __call__
    raise ntelebot.errors.Error(data)
ntelebot.errors.Error: {'ok': False, 'error_code': 429, 'description': 'Too Many Requests: retry after 5', 'parameters': {'retry_after': 5}}

and:

2024-08-13 20:16:51,556 ERROR loop.py:55] Ignoring uncaught error while polling:
Traceback (most recent call last):
  File "lib/python3.10/site-packages/ntelebot/loop.py", line 45, in _poll_bot
    updates = bot.get_updates(offset=offset, timeout=timeout)
  File "lib/python3.10/site-packages/ntelebot/bot.py", line 75, in __call__
    raise ntelebot.errors.Error(data)
ntelebot.errors.Error: {'ok': False, 'error_code': 502, 'description': 'Bad Gateway'}

mixed together. (I assume the latter causes rapid retries that trigger the former.)

To date, these only seem to have happened during getUpdates polls (which are designed to restart every 10 seconds no matter what), but I think I'd like to go ahead and retry at the ntelebot (versus metabot) level just in case this ever kills an actual mutation.

I'm not sure if this should be done at the requests level (that applies rate limits to things like api.weather.gov too), or just to ntelebot.bot._Request.__call__.

@nmlorg nmlorg added the bug Something isn't working label Aug 16, 2024
@nmlorg nmlorg self-assigned this Aug 16, 2024
nmlorg added a commit that referenced this issue Aug 20, 2024
I am not convinced this is strictly necessary — the underlying urllib3.connectionpool.HTTPConnectionPool is explicitly marked as thread-safe, but lots of pieces along the way seem to have had questions documented online that don't seem to have definitive answers. I might want to sit down and dig through this all myself one day. But, for now, because of the ambiguity and because I am not entirely sure how reusing a connection pool in a situation with potentially more than 10 long-lived connections at a time (one `getUpdates` per bot) might affect things, I think this should gain some benefit without introducing any problems.

I intend for a followup to introduce rate limiting and retries into ntelebot.requests._LOCAL.session (see #9).

I also intend for ntelebot.requests to be more or less a drop-in replacement for the top-level requests module. (You should be able to replace "import requests" with "from ntelebot import requests" with no other code changes, though for now I'm explicitly changing "requests.*" references to "ntelebot.requests.*" to avoid ambiguity.)

Release as 0.4.4.
@nmlorg
Copy link
Owner Author

nmlorg commented Aug 21, 2024

My test bot's 6 a.m. announcement didn't come through until 6:10 today.

2024-08-21 06:00:02,396 INFO reminders.py:42] Running periodic.
⋮
2024-08-21 06:00:14,419 INFO reminders.py:166] Sending reminder to -40….
2024-08-21 06:00:14,419 INFO bot.py:57] 3980803 https://api.telegram.org/bot44…/sendphoto {'chat_id': '-40…', 'photo': 'https://ssl.gstatic.com/calendar/…', 'caption': 'There are a bunch of events coming up:\n\n<b>…</a>', 'parse_mode': 'HTML', 'disable_notification': True}
2024-08-21 06:00:14,420 INFO requests.py:15] 3980803 send https://api.telegram.org/bot44…/sendphoto {'timeout': 1202, 'allow_redirects': True, 'proxies': OrderedDict(), 'stream': False, 'verify': True, 'cert': None}
2024-08-21 06:00:14,580 ERROR loop.py:72] Ignoring uncaught error while dispatching:
Traceback (most recent call last):
  File "lib/python3.11/site-packages/urllib3/connectionpool.py", line 789, in urlopen
    response = self._make_request(
               ^^^^^^^^^^^^^^^^^^^
  File "lib/python3.11/site-packages/urllib3/connectionpool.py", line 536, in _make_request
    response = conn.getresponse()
               ^^^^^^^^^^^^^^^^^^
  File "lib/python3.11/site-packages/urllib3/connection.py", line 464, in getresponse
    httplib_response = super().getresponse()
                       ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/http/client.py", line 1374, in getresponse
    response.begin()
  File "/usr/lib/python3.11/http/client.py", line 318, in begin
    version, status, reason = self._read_status()
                              ^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/http/client.py", line 287, in _read_status
    raise RemoteDisconnected("Remote end closed connection without"
http.client.RemoteDisconnected: Remote end closed connection without response

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "lib/python3.11/site-packages/requests/adapters.py", line 667, in send
    resp = conn.urlopen(
           ^^^^^^^^^^^^^
  File "lib/python3.11/site-packages/urllib3/connectionpool.py", line 843, in urlopen
    retries = retries.increment(
              ^^^^^^^^^^^^^^^^^^
  File "lib/python3.11/site-packages/urllib3/util/retry.py", line 474, in increment
    raise reraise(type(error), error, _stacktrace)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "lib/python3.11/site-packages/urllib3/util/util.py", line 38, in reraise
    raise value.with_traceback(tb)
  File "lib/python3.11/site-packages/urllib3/connectionpool.py", line 789, in urlopen
    response = self._make_request(
               ^^^^^^^^^^^^^^^^^^^
  File "lib/python3.11/site-packages/urllib3/connectionpool.py", line 536, in _make_request
    response = conn.getresponse()
               ^^^^^^^^^^^^^^^^^^
  File "lib/python3.11/site-packages/urllib3/connection.py", line 464, in getresponse
    httplib_response = super().getresponse()
                       ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/http/client.py", line 1374, in getresponse
    response.begin()
  File "/usr/lib/python3.11/http/client.py", line 318, in begin
    version, status, reason = self._read_status()
                              ^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/http/client.py", line 287, in _read_status
    raise RemoteDisconnected("Remote end closed connection without"
urllib3.exceptions.ProtocolError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "ntelebot/ntelebot/loop.py", line 70, in run
    callback()
  File "metabot/metabot/modules/reminders.py", line 45, in periodic
    _daily_messages(multibot, records)
  File "metabot/metabot/modules/reminders.py", line 119, in _daily_messages
    message = reminder_send(bot, groupid, text, url)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "metabot/metabot/modules/reminders.py", line 170, in reminder_send
    return bot.send_photo(chat_id=groupid,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "ntelebot/ntelebot/bot.py", line 59, in __call__
    data = ntelebot.requests.post(self.url, timeout=self.timeout, **_prepare(params)).json()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "ntelebot/ntelebot/requests.py", line 49, in post
    return _LOCAL.session.post(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "lib/python3.11/site-packages/requests/sessions.py", line 637, in post
    return self.request("POST", url, data=data, json=json, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "lib/python3.11/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "ntelebot/ntelebot/requests.py", line 17, in send
    return super().send(request, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "lib/python3.11/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "lib/python3.11/site-packages/requests/adapters.py", line 682, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
⋮
2024-08-21 06:10:03,509 INFO reminders.py:42] Running periodic.
2024-08-21 06:10:06,357 INFO bot.py:57] 3980804 https://api.telegram.org/bot44…/getupdates {'offset': 654822478, 'timeout': 1200}
2024-08-21 06:10:06,357 INFO requests.py:15] 3980804 send https://api.telegram.org/bot44…/getupdates {'timeout': 1202, 'allow_redirects': True, 'proxies': OrderedDict(), 'stream': False, 'verify': True, 'cert': None}
2024-08-21 06:10:10,510 INFO requests.py:15] 3980803 send https://api.weather.gov/alerts/active?zone=… {'timeout': 10, 'allow_redirects': True, 'proxies': OrderedDict(), 'stream': False, 'verify': True, 'cert': None}
2024-08-21 06:10:10,510 INFO connectionpool.py:291] Resetting dropped connection: api.weather.gov
2024-08-21 06:10:10,977 INFO reminders.py:166] Sending reminder to -40….
2024-08-21 06:10:10,977 INFO bot.py:57] 3980803 https://api.telegram.org/bot44…/sendphoto {'chat_id': '-40…', 'photo': 'https://ssl.gstatic.com/calendar/…', 'caption': 'There are a bunch of events coming up:\n\n<b>…</a>', 'parse_mode': 'HTML', 'disable_notification': True}
2024-08-21 06:10:10,978 INFO requests.py:15] 3980803 send https://api.telegram.org/bot44…/sendphoto {'timeout': 1202, 'allow_redirects': True, 'proxies': OrderedDict(), 'stream': False, 'verify': True, 'cert': None}
2024-08-21 06:10:10,978 INFO connectionpool.py:1051] Starting new HTTPS connection (8): api.telegram.org:443
2024-08-21 06:10:11,785 INFO reminders.py:192] Editing reminder -40…/8137.
2024-08-21 06:10:11,785 INFO bot.py:57] 3980803 https://api.telegram.org/bot44…/editmessagetext {'chat_id': -40…, 'message_id': 8137, 'text': 'There are a bunch of events coming up:\n\n<b>…</a>', 'parse_mode': 'HTML', 'disable_web_page_preview': True}
2024-08-21 06:10:11,786 INFO requests.py:15] 3980803 send https://api.telegram.org/bot44…/editmessagetext {'timeout': 1202, 'allow_redirects': True, 'proxies': OrderedDict(), 'stream': False, 'verify': True, 'cert': None}

 
This might be a side effect of 0045680 causing HTTPS sessions to be kept alive and reused. (Note "Starting new HTTPS connection" versus "Resetting dropped connection".)

@nmlorg
Copy link
Owner Author

nmlorg commented Aug 21, 2024

It looks like the production instance got the same error a single time during a poll a little earlier:

2024-08-21 05:46:22,780 ERROR loop.py:55] Ignoring uncaught error while polling:
Traceback (most recent call last):
  File "lib/python3.10/site-packages/urllib3/connectionpool.py", line 789, in urlopen
    response = self._make_request(
  File "lib/python3.10/site-packages/urllib3/connectionpool.py", line 536, in _make_request
    response = conn.getresponse()
  File "lib/python3.10/site-packages/urllib3/connection.py", line 464, in getresponse
    httplib_response = super().getresponse()
  File "/usr/lib/python3.10/http/client.py", line 1375, in getresponse
    response.begin()
  File "/usr/lib/python3.10/http/client.py", line 318, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python3.10/http/client.py", line 287, in _read_status
    raise RemoteDisconnected("Remote end closed connection without"
http.client.RemoteDisconnected: Remote end closed connection without response

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "lib/python3.10/site-packages/requests/adapters.py", line 667, in send
    resp = conn.urlopen(
  File "lib/python3.10/site-packages/urllib3/connectionpool.py", line 843, in urlopen
    retries = retries.increment(
  File "lib/python3.10/site-packages/urllib3/util/retry.py", line 474, in increment
    raise reraise(type(error), error, _stacktrace)
  File "lib/python3.10/site-packages/urllib3/util/util.py", line 38, in reraise
    raise value.with_traceback(tb)
  File "lib/python3.10/site-packages/urllib3/connectionpool.py", line 789, in urlopen
    response = self._make_request(
  File "lib/python3.10/site-packages/urllib3/connectionpool.py", line 536, in _make_request
    response = conn.getresponse()
  File "lib/python3.10/site-packages/urllib3/connection.py", line 464, in getresponse
    httplib_response = super().getresponse()
  File "/usr/lib/python3.10/http/client.py", line 1375, in getresponse
    response.begin()
  File "/usr/lib/python3.10/http/client.py", line 318, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python3.10/http/client.py", line 287, in _read_status
    raise RemoteDisconnected("Remote end closed connection without"
urllib3.exceptions.ProtocolError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "lib/python3.10/site-packages/ntelebot/loop.py", line 45, in _poll_bot
    updates = bot.get_updates(offset=offset, timeout=timeout)
  File "lib/python3.10/site-packages/ntelebot/bot.py", line 56, in __call__
    data = ntelebot.requests.post(self.url, timeout=self.timeout, **_prepare(params)).json()
  File "lib/python3.10/site-packages/ntelebot/requests.py", line 38, in post
    return _LOCAL.session.post(*args, **kwargs)
  File "lib/python3.10/site-packages/requests/sessions.py", line 637, in post
    return self.request("POST", url, data=data, json=json, **kwargs)
  File "lib/python3.10/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
  File "lib/python3.10/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
  File "lib/python3.10/site-packages/requests/adapters.py", line 682, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
2024-08-21 05:46:24,116 INFO connectionpool.py:1051] Starting new HTTPS connection (4): api.telegram.org:443

@nmlorg
Copy link
Owner Author

nmlorg commented Aug 22, 2024

(I'm now taking a detour of possibly switching from requests to httpx, both to switch to HTTP/2 and possibly moving from thread-per-poll loop to asyncio. The former would probably be a drop-in replacement, but the latter would be a major rewrite/API change.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

1 participant