Hive Connector. Translate Empty Value in NULL in Text Files #3583

Open
larandvit opened this issue Apr 29, 2020 · 5 comments

Comments

@larandvit

Hello,

I can't find a setting in the CREATE TABLE statement that tells PrestoSQL to interpret empty values in text files as NULLs.

I'm using the Hive connector with access to Minio S3 storage.

Create table statement

CREATE TABLE minio.lab.sample_table(
    column1 varchar,
    column2 integer)
WITH (
      external_location = 's3a://bucket-name/folder-name',
      format = 'TEXTFILE',
      skip_header_line_count=1
);

In a Hive table, this can be accomplished with the 'serialization.null.format'='' property in the TBLPROPERTIES section.

Thanks.
Vitaly.

@findepi
Member

findepi commented Apr 29, 2020

If you set serialization.null.format table property via Hive, are you getting the results you expect from Presto?
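
For reference, a minimal sketch of that check. It assumes the Presto table `minio.lab.sample_table` is visible to Hive as `lab.sample_table` in the same metastore (an assumption, not stated in this thread). Whether Presto honors the property is exactly what the second query is meant to find out.

```sql
-- Run in Hive (not Presto): mark the empty string as the NULL representation.
-- The Hive-side name lab.sample_table is an assumption based on the CREATE TABLE above.
ALTER TABLE lab.sample_table
SET TBLPROPERTIES ('serialization.null.format' = '');

-- Run in Presto: count rows where column1 is read as NULL vs. as an empty string.
SELECT
    count_if(column1 IS NULL) AS null_rows,
    count_if(column1 = '')    AS empty_string_rows
FROM minio.lab.sample_table;
```

If `null_rows` stays at zero while `empty_string_rows` is positive, the property is probably not being applied on the Presto side.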

@larandvit
Author

larandvit commented Apr 29, 2020

Thank you for the quick response.

The data is located in Minio storage, which is compatible with S3. PrestoDB is connected to Minio through the Hive connector. I mentioned the serialization.null.format setting as an example of what we need, because we are using the Hive connector and it should behave similarly to Hive.

What setting do we need to apply so that empty values in a text file are interpreted as NULL values in PrestoDB?

Sample of a text file with an empty value in "Col2":

Col1,Col2,Col3
19,,sample test

@findepi
Member

findepi commented Apr 30, 2020

While we want to make it very easy to use the most commonly used features, I have to say that serialization.null.format is currently not one of the common ones.
I think this should be addressed with #954.

@larandvit
Author

Thank you, @findepi, for the update.

@JeevansSP

@findepi hello, has this been added yet?
