msgpack - ValueError: 2369781118 exceeds max_bin_len(2147483647) #262
Comments
I recommend updating dask and distributed. This looks like an issue that
was recently fixed and released.
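Before and after updating, it can help to confirm which versions are actually installed in the environment. A minimal sketch using only the standard library; the helper name `installed_versions` and the package list are illustrative assumptions, not something from this thread:

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("dask", "distributed", "msgpack")):
    """Return {package name: version string, or None if it is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not present in this environment.
            found[name] = None
    return found


for name, ver in installed_versions().items():
    print(f"{name}: {ver or 'not installed'}")
```

Running this inside the same environment as the cluster (here, the `daskgeopandas` conda env) shows whether the update took effect.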
…On Sun, Jan 21, 2024 at 8:50 PM RichardScottOZ ***@***.***> wrote:
Attempted a join of around 250K polygons to 3M points on a LocalCluster.
The other way around did work a couple of days ago. I haven't tried a cold
machine start or cluster restart yet. I saw an error like this from
2021 for dask, which someone said an update fixed for a large graph.
C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\client.py:3162: UserWarning: Sending large graph of size 2.21 GiB.
This may cause some slowdown.
Consider scattering data ahead of time and using futures.
warnings.warn(
---------------------------------------------------------------------------
CancelledError Traceback (most recent call last)
File <timed exec>:1
File ~\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\dask\base.py:379, in DaskMethodsMixin.compute(self, **kwargs)
355 def compute(self, **kwargs):
356 """Compute this dask collection
357
358 This turns a lazy Dask collection into its in-memory equivalent.
(...)
377 dask.compute
378 """
--> 379 (result,) = compute(self, traverse=False, **kwargs)
380 return result
File ~\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\dask\base.py:665, in compute(traverse, optimize_graph, scheduler, get, *args, **kwargs)
662 postcomputes.append(x.__dask_postcompute__())
664 with shorten_traceback():
--> 665 results = schedule(dsk, keys, **kwargs)
667 return repack([f(r, *a) for r, (f, a) in zip(results, postcomputes)])
File ~\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\client.py:2244, in Client._gather(self, futures, errors, direct, local_worker)
2242 else:
2243 raise exception.with_traceback(traceback)
-> 2244 raise exc
2245 if errors == "skip":
2246 bad_keys.add(key)
CancelledError: ('sjoin-38ea83416d6236ee710acd39e46f004c', 7)
2024-01-22 13:07:30,743 - distributed.core - ERROR - Exception while handling op register-client
Traceback (most recent call last):
File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\core.py", line 969, in _handle_comm
result = await result
File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\scheduler.py", line 5602, in add_client
await self.handle_stream(comm=comm, extra={"client": client})
File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\core.py", line 1024, in handle_stream
msgs = await comm.read()
File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\comm\tcp.py", line 248, in read
msg = await from_frames(
File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\comm\utils.py", line 78, in from_frames
res = _from_frames()
File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\comm\utils.py", line 61, in _from_frames
return protocol.loads(
File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\protocol\core.py", line 175, in loads
return msgpack.loads(
File "msgpack/_unpacker.pyx", line 194, in msgpack._cmsgpack.unpackb
ValueError: 2369813480 exceeds max_bin_len(2147483647)
Task exception was never retrieved
future: <Task finished name='Task-1911149' coro=<Server._handle_comm() done, defined at C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\core.py:875> exception=ValueError('2369813480 exceeds max_bin_len(2147483647)')>
Traceback (most recent call last):
File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\core.py", line 969, in _handle_comm
result = await result
File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\scheduler.py", line 5602, in add_client
await self.handle_stream(comm=comm, extra={"client": client})
File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\core.py", line 1024, in handle_stream
msgs = await comm.read()
File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\comm\tcp.py", line 248, in read
msg = await from_frames(
File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\comm\utils.py", line 78, in from_frames
res = _from_frames()
File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\comm\utils.py", line 61, in _from_frames
return protocol.loads(
File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\protocol\core.py", line 175, in loads
return msgpack.loads(
File "msgpack/_unpacker.pyx", line 194, in msgpack._cmsgpack.unpackb
ValueError: 2369813480 exceeds max_bin_len(2147483647)
2024-01-22 13:13:07,709 - distributed.protocol.core - CRITICAL - Failed to deserialize
Traceback (most recent call last):
File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\protocol\core.py", line 175, in loads
return msgpack.loads(
File "msgpack/_unpacker.pyx", line 194, in msgpack._cmsgpack.unpackb
ValueError: 2369781116 exceeds max_bin_len(2147483647)
2024-01-22 13:13:07,755 - distributed.core - ERROR - Exception while handling op register-client
Traceback (most recent call last):
File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\core.py", line 969, in _handle_comm
result = await result
File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\scheduler.py", line 5602, in add_client
await self.handle_stream(comm=comm, extra={"client": client})
File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\core.py", line 1024, in handle_stream
msgs = await comm.read()
File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\comm\tcp.py", line 248, in read
msg = await from_frames(
File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\comm\utils.py", line 78, in from_frames
res = _from_frames()
File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\comm\utils.py", line 61, in _from_frames
return protocol.loads(
File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\protocol\core.py", line 175, in loads
return msgpack.loads(
File "msgpack/_unpacker.pyx", line 194, in msgpack._cmsgpack.unpackb
ValueError: 2369781116 exceeds max_bin_len(2147483647)
Task exception was never retrieved
future: <Task finished name='Task-2018400' coro=<Server._handle_comm() done, defined at C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\core.py:875> exception=ValueError('2369781116 exceeds max_bin_len(2147483647)')>
Traceback (most recent call last):
File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\core.py", line 969, in _handle_comm
result = await result
File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\scheduler.py", line 5602, in add_client
await self.handle_stream(comm=comm, extra={"client": client})
File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\core.py", line 1024, in handle_stream
msgs = await comm.read()
File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\comm\tcp.py", line 248, in read
msg = await from_frames(
File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\comm\utils.py", line 78, in from_frames
res = _from_frames()
File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\comm\utils.py", line 61, in _from_frames
return protocol.loads(
File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\protocol\core.py", line 175, in loads
return msgpack.loads(
File "msgpack/_unpacker.pyx", line 194, in msgpack._cmsgpack.unpackb
ValueError: 2369781116 exceeds max_bin_len(2147483647)
2024-01-22 13:17:51,917 - distributed.protocol.core - CRITICAL - Failed to deserialize
Traceback (most recent call last):
File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\protocol\core.py", line 175, in loads
return msgpack.loads(
File "msgpack/_unpacker.pyx", line 194, in msgpack._cmsgpack.unpackb
ValueError: 2369781118 exceeds max_bin_len(2147483647)
2024-01-22 13:17:51,965 - distributed.core - ERROR - Exception while handling op register-client
Traceback (most recent call last):
File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\core.py", line 969, in _handle_comm
result = await result
File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\scheduler.py", line 5602, in add_client
await self.handle_stream(comm=comm, extra={"client": client})
File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\core.py", line 1024, in handle_stream
msgs = await comm.read()
File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\comm\tcp.py", line 248, in read
msg = await from_frames(
File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\comm\utils.py", line 78, in from_frames
res = _from_frames()
File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\comm\utils.py", line 61, in _from_frames
return protocol.loads(
File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\protocol\core.py", line 175, in loads
return msgpack.loads(
File "msgpack/_unpacker.pyx", line 194, in msgpack._cmsgpack.unpackb
ValueError: 2369781118 exceeds max_bin_len(2147483647)
Task exception was never retrieved
future: <Task finished name='Task-2222100' coro=<Server._handle_comm() done, defined at C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\core.py:875> exception=ValueError('2369781118 exceeds max_bin_len(2147483647)')>
Traceback (most recent call last):
File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\core.py", line 969, in _handle_comm
result = await result
File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\scheduler.py", line 5602, in add_client
await self.handle_stream(comm=comm, extra={"client": client})
File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\core.py", line 1024, in handle_stream
msgs = await comm.read()
File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\comm\tcp.py", line 248, in read
msg = await from_frames(
File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\comm\utils.py", line 78, in from_frames
res = _from_frames()
File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\comm\utils.py", line 61, in _from_frames
return protocol.loads(
File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\protocol\core.py", line 175, in loads
return msgpack.loads(
File "msgpack/_unpacker.pyx", line 194, in msgpack._cmsgpack.unpackb
ValueError: 2369781118 exceeds max_bin_len(2147483647)
Attempted a join of around 250K polygons to 900K points on a LocalCluster. The other way around did work a couple of days ago. I haven't tried a cold machine start or cluster restart yet. I saw an error like this from 2021 for dask, which someone said an update fixed for a large graph.
[Realised I was only using a smaller point set - I need to do 3M.] Presumably I can get this to work for now by chopping off the extra 10% or so needed to stay under that long int boundary (2147483647, i.e. 2**31 - 1).
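As a rough check on that "10% or so", the arithmetic can be sketched from the two numbers in the error message. The helper name `required_reduction` is hypothetical, and the estimate assumes the serialized size shrinks roughly in proportion to the amount of data dropped, which may not hold exactly:

```python
# Both sizes are taken from the ValueError in this issue.
MAX_BIN_LEN = 2**31 - 1       # 2147483647, the limit quoted by msgpack
payload_bytes = 2369781118    # size msgpack refused to unpack


def required_reduction(payload: int, limit: int = MAX_BIN_LEN) -> float:
    """Fraction by which the payload must shrink to fit under the limit."""
    if payload <= limit:
        return 0.0
    return 1 - limit / payload


overshoot = payload_bytes / MAX_BIN_LEN - 1
print(f"over the limit by {overshoot:.1%}")
print(f"must shrink by at least {required_reduction(payload_bytes):.1%}")
```

So the payload is a little over 10% above the limit, and it would need to shrink by a bit under 10% to fit, before allowing any safety margin.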