Object dtype support #2

Open
ForeverWintr opened this issue Nov 22, 2021 · 1 comment

@ForeverWintr

I'm seeing some inconsistencies when trying to create frames with object dtype:

>>> ff.parse('v(object)|i(IH,(object,object,object))|c(IH,(object,object,object))|s(4,4)')
<Frame>
<IndexHierarchy>                None     None     None     None     <object>
                                zRKC     zRKC     zaji     zaji     <<U4>
                                -314.34  zDdR     zuVU     zKka     <object>
<IndexHierarchy>
None             zRKC  -314.34  96520    3776.36  True     194224
None             zRKC  zDdR     -88017   -1378.5  False    -2981.64
None             zaji  zuVU     92867    ztJh     105269   3565.34
None             zaji  zKka     3884.98  zQkB     119909   3770.2
<object>         <<U4> <object> <object> <object> <object> <object>

Note that although most arrays do end up with an object dtype, the second level of each IndexHierarchy is <U4 rather than object.

@flexatone
Contributor

Thanks for posting this issue.

The cause is that the dtype is inferred only from the values present, and at this size the inner level happens to contain nothing but four-character strings, so it can be represented as U4. If we increase the size, and with it the diversity of values, we get the expected object dtype.

>>> ff.parse('v(object)|i(IH,(object,object,object))|c(IH,(object,object,object))|s(5,5)')
<Frame>
<IndexHierarchy>                   None     None     None     None     True     <object>
                                   zRKC     zRKC     zaji     zaji     172133   <object>
                                   -314.34  zDdR     zuVU     zKka     84967    <object>
<IndexHierarchy>
None             zRKC     -314.34  96520    3776.36  True     194224   -314.34
None             zRKC     zDdR     -88017   -1378.5  False    -2981.64 zDdR
None             zaji     zuVU     92867    ztJh     105269   3565.34  zuVU
None             zaji     zKka     3884.98  zQkB     119909   3770.2   zKka
True             172133   84967    -646.86  zvCj     194224   zMmd     84967
<object>         <object> <object> <object> <object> <object> <object> <object>

I see that this is undesirable, however, and will look into forcing the dtype to always match the requested type.
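
For illustration, here is a minimal NumPy-only sketch of the effect described above. It is not the library's actual code: `build_level` is a hypothetical helper name, and the sample values are simply copied from the output shown earlier.

```python
import numpy as np

# Values like those in the inner level at s(4,4): every element
# happens to be a four-character string.
small = ['zRKC', 'zRKC', 'zaji', 'zaji']

# With no dtype given, NumPy infers the narrowest dtype that fits the
# values, which here is a four-character unicode dtype (U4).
inferred = np.array(small)

# A more diverse sample (strings, an int, None) cannot be represented
# as a fixed-width string, so inference falls back to object.
diverse = np.array(['zRKC', 'zaji', 172133, None])


def build_level(values, requested_dtype):
    """Hypothetical helper: build an array with exactly the requested dtype.

    Passing dtype explicitly skips inference from the values, so the
    result no longer depends on which values happen to be present.
    """
    return np.array(values, dtype=requested_dtype)


forced = build_level(small, object)
print(inferred.dtype, diverse.dtype, forced.dtype)
```

With the dtype passed explicitly, the small sample stays object regardless of how uniform its values are.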
