[BUG] Cannot convert data of type str from v3 to v4 using the convert function #2981
Comments
Hey @CorneliaMelon, sorry about experiencing the problem and thanks for sharing the error. Can you please share |
Hey @davidbuniat thanks for the quick response!
Hey @CorneliaMelon , thanks for reporting this. We just released |
@khustup2 instruction works, now I get the error: |
@CorneliaMelon any chance you can provide dataset summary on V3? In order to do that, can you please install |
@khustup2 here it is:
obs/front_left_camera image (2111, 512, 512, 3) uint8 jpeg |
@CorneliaMelon thanks for the info! We have this fixed locally and will include it in the upcoming 4.0.2 release. I will let you know once the release is done. |
@khustup2 Awesome, thanks for the update! Looking forward to it. |
Hey @CorneliaMelon , we released |
Severity
P0 - Critical breaking issue or missing functionality
Current Behavior
I am trying to convert my datasets from v3 to v4. They contain several columns with different data types. My 'instruction' column holds strings of varying lengths, and the conversion fails with this error:
File "/opt/miniconda3/lib/python3.11/site-packages/deeplake/__init__.py", line 164, in convert
dest_ds.append(b)
deeplake._deeplake.InvalidColumnValueError: Invalid value for column 'instruction'. Reason - 'Data must have 2 dimensions provided 1'
Interestingly, when I explicitly append the instruction column in my own script, it works.
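A minimal sketch of what such a manual copy could look like, assuming the destination accepts a dict that maps a column name to a list of values. The helper name `copy_column_in_batches`, its parameters, and the batch size are hypothetical and not part of Deep Lake; this is not the exact script that was used.

```python
from typing import Any, Callable, Dict, Iterable, List


def copy_column_in_batches(
    values: Iterable[Any],
    append: Callable[[Dict[str, List[Any]]], None],
    column: str = "instruction",
    batch_size: int = 256,
) -> int:
    """Send `values` to `append` in batches and return the number of rows copied."""
    batch: List[Any] = []
    copied = 0
    for value in values:
        batch.append(value)
        if len(batch) == batch_size:
            # Assumption: the destination takes {column_name: [values, ...]}.
            append({column: batch})
            copied += len(batch)
            batch = []
    if batch:  # flush the final, possibly shorter, batch
        append({column: batch})
        copied += len(batch)
    return copied
```

Here `append` would be the destination dataset's append method, the same call that shows up in the traceback as `dest_ds.append(b)`. How the strings are read out of the v3 dataset is left to the caller.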
If I use the convert function only on the instruction column, I also get an error:
Terminal output showing source_ds (breakpoint at the print statement) and the resulting error message:
source_ds
PyDev console: starting.
Dataset(columns=(instruction), length=2013)
Source size: 2013
convert(source_ds, dest_ds)
Traceback (most recent call last):
File "/Applications/PyCharm CE.app/Contents/plugins/python-ce/helpers/pydev/_pydevd_bundle/pydevd_exec2.py", line 3, in Exec
exec(exp, global_vars, local_vars)
File "", line 1, in
File "/Users/...", line 9, in convert
deeplake.convert(source, target)
File "/opt/miniconda3/lib/python3.11/site-packages/deeplake/__init__.py", line 156, in convert
source_ds = deeplake.query(f'select * from "{src}"')
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: The query source - 'Dataset(columns=(instruction), length=2013)', is not found or not supported.
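The traceback suggests that `convert` interpolates its `src` argument into a query string, so passing a dataset object instead of a URL string puts the object's text representation into the query. A minimal sketch of that effect in plain Python follows; `FakeDataset` and `build_query` are hypothetical stand-ins for illustration, not Deep Lake code.

```python
class FakeDataset:
    """Hypothetical stand-in for a dataset object; only its repr matters here."""

    def __repr__(self) -> str:
        return "Dataset(columns=(instruction), length=2013)"


def build_query(src) -> str:
    # Mirrors the f-string shown in the traceback: the argument is formatted
    # as text, so a non-string object contributes its repr.
    return f'select * from "{src}"'


# Passing an object embeds its repr, which is not a valid query source.
query_from_object = build_query(FakeDataset())

# Passing a URL string embeds the URL itself.
query_from_url = build_query("al://org_name/existing_v3_dataset")

print(query_from_object)
print(query_from_url)
```

If this reading is right, the second error comes from handing `convert` a dataset object rather than from the 'instruction' column itself.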
If I query the whole dataset and only get the instruction column, I also get an error:
Error message:
Process finished with exit code 139 (interrupted by signal 11:SIGSEGV)
Is there any way I can still use the automatic deeplake.convert(src='al://org_name/existing_v3_dataset', dst='al://org_name/new_v4_dataset')?
Steps to Reproduce
To reproduce the errors, see code in the problem description above.
Expected/Desired Behavior
Dataset being converted from v3 to v4.
Python Version
Python 3.11.8
OS
No response
IDE
No response
Packages
No response
Additional Context
No response
Possible Solution
No response
Are you willing to submit a PR?