Failed to apply json.loads to string columns when first element is '[]' #3478

Closed
Yevgnen opened this issue May 23, 2022 · 0 comments · Fixed by #3480
Labels
bug Something isn't working

Yevgnen commented May 23, 2022

What language are you using?

Python

Which feature gates did you use?

Load JSON object from string with nested array.

Have you tried latest version of polars?

Yes

What version of polars are you using?

0.13.38

What operating system are you using polars on?

macOS 12.3.1

What language version are you using?

Python 3.8.13

Describe your bug.

Applying json.loads to a string column with values like ['[]', '[{"x": 1, "y": 2}, {"x": 3, "y": 4}]', '[{"x": 1, "y": 2}]'] raises a PanicException. Note that the first element of the string column is '[]'. This was originally mentioned here.

What are the steps to reproduce the behavior?

import json
import polars as pl

df = pl.DataFrame(
    {"text": ['[]', '[{"x": 1, "y": 2}, {"x": 3, "y": 4}]', '[{"x": 1, "y": 2}]']}
)
df.select(pl.col("text").apply(json.loads))
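To see why the position of '[]' might matter, the snippet below parses the same three strings with the standard library alone, without polars. It only shows what json.loads returns for each value. The idea that the output dtype is inferred from the first returned value is an assumption drawn from the issue title and the SchemaMisMatch message, not something confirmed in this report.

```python
import json

texts = ['[]', '[{"x": 1, "y": 2}, {"x": 3, "y": 4}]', '[{"x": 1, "y": 2}]']
parsed = [json.loads(t) for t in texts]

# The first value is an empty list, so it has no elements
# from which an inner type could be read.
first_inner_types = {type(item).__name__ for item in parsed[0]}

# Every later value is a non-empty list of dicts.
later_inner_types = {type(item).__name__ for row in parsed[1:] for item in row}

print(first_inner_types)  # set()
print(later_inner_types)  # {'dict'}
```

If inference really does rely on the first value (an assumption), an empty list gives it nothing to go on, which would be consistent with a type mismatch once the later lists of dicts arrive.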

What is the actual behavior?

In [8]: thread '<unnamed>' panicked at 'called `Result::unwrap()` on an `Err` value: SchemaMisMatch("cannot unpack Series; data types don't match")', /Users/runner/work/polars/polars/polars/polars-core/src/chunked_array/builder/list.rs:155:41
stack backtrace:
   0:        0x13c098ba6 - _ffi_select_with_compiled_path
   1:        0x13b47604b - _BrotliDecoderVersion
   2:        0x13c06d49c - _ffi_select_with_compiled_path
   3:        0x13c099c9d - _ffi_select_with_compiled_path
   4:        0x13c09ac78 - _ffi_select_with_compiled_path
   5:        0x13c09a764 - _ffi_select_with_compiled_path
   6:        0x13c09a6d9 - _ffi_select_with_compiled_path
   7:        0x13c09a695 - _ffi_select_with_compiled_path
   8:        0x13c1c6143 - _ffi_select_with_compiled_path
   9:        0x13c1c63a5 - _ffi_select_with_compiled_path
  10:        0x13b66862b - _ffi_select_with_compiled_path
  11:        0x13ab33e08 - <unknown>
  12:        0x13abde5d8 - <unknown>
  13:        0x13ae10f55 - _PyInit_polars
  14:        0x10c9f9ea6 - _method_vectorcall_VARARGS_KEYWORDS
  15:        0x10cacad6f - _call_function
  16:        0x10cac78e3 - __PyEval_EvalFrameDefault
  17:        0x10cacbc8b - __PyEval_EvalCodeWithName
  18:        0x10c9f135b - __PyFunction_Vectorcall
  19:        0x10c9f3b4c - _method_vectorcall
  20:        0x10cacad6f - _call_function
  21:        0x10cac7a3c - __PyEval_EvalFrameDefault
  22:        0x10cacbc8b - __PyEval_EvalCodeWithName
  23:        0x10c9f135b - __PyFunction_Vectorcall
  24:        0x10c9f0bd4 - _PyVectorcall_Call
  25:        0x13a9a092b - <unknown>
  26:        0x13ab601e1 - <unknown>
  27:        0x13ab610f9 - <unknown>
  28:        0x13a9aa1a1 - <unknown>
  29:        0x13be52954 - _ffi_select_with_compiled_path
  30:        0x13bee1f82 - _ffi_select_with_compiled_path
  31:        0x13bedc199 - _ffi_select_with_compiled_path
  32:        0x13bee2c70 - _ffi_select_with_compiled_path
  33:        0x13c24c5e8 - _ffi_select_with_compiled_path
  34:        0x13bff58ed - _ffi_select_with_compiled_path
  35:        0x13bff5440 - _ffi_select_with_compiled_path
  36:        0x13c09ba4a - _ffi_select_with_compiled_path
  37:     0x7ff80122b4e1 - __pthread_start
--- PyO3 is resuming a panic after fetching a PanicException from Python. ---
Python stack trace below:
---------------------------------------------------------------------------
PanicException                            Traceback (most recent call last)
File ~/.direnv/python-3.8.13/lib/python3.8/site-packages/polars/internals/expr.py:1547, in Expr.apply.<locals>.wrap_f(x)
   1546 def wrap_f(x: "pli.Series") -> "pli.Series":  # pragma: no cover
-> 1547     return x.apply(f, return_dtype=return_dtype)

File ~/.direnv/python-3.8.13/lib/python3.8/site-packages/polars/internals/series.py:2619, in Series.apply(self, func, return_dtype)
   2617 else:
   2618     pl_return_dtype = py_type_to_dtype(return_dtype)
-> 2619 return wrap_s(self._s.apply_lambda(func, pl_return_dtype))

PanicException: called `Result::unwrap()` on an `Err` value: SchemaMisMatch("cannot unpack Series; data types don't match")
---------------------------------------------------------------------------
PanicException                            Traceback (most recent call last)
Input In [8], in <cell line: 1>()
----> 1 df.select(pl.col("text").apply(json.loads))

File ~/.direnv/python-3.8.13/lib/python3.8/site-packages/polars/internals/frame.py:4479, in DataFrame.select(self, exprs)
   4437 def select(
   4438     self: DF,
   4439     exprs: Union[
   (...)
   4444     ],
   4445 ) -> DF:
   4446     """
   4447     Select columns from this DataFrame.
   4448 
   (...)
   4476 
   4477     """
   4478     return (
-> 4479         self.lazy()
   4480         .select(exprs)  # type: ignore
   4481         .collect(no_optimization=True, string_cache=False)
   4482     )

File ~/.direnv/python-3.8.13/lib/python3.8/site-packages/polars/internals/lazy_frame.py:586, in LazyFrame.collect(self, type_coercion, predicate_pushdown, projection_pushdown, simplify_expression, string_cache, no_optimization, slice_pushdown)
    576     projection_pushdown = False
    578 ldf = self._ldf.optimization_toggle(
    579     type_coercion,
    580     predicate_pushdown,
   (...)
    584     slice_pushdown,
    585 )
--> 586 return self._dataframe_class._from_pydf(ldf.collect())

PanicException: Unwrapped panic from Python code

What is the expected behavior?

The expression should run without raising an error.
