TDL-12486: Added support of compressed files #32

Merged: 15 commits merged on Jun 1, 2021

dbshah1212 (Contributor) commented May 20, 2021

Description of change

Compressed File Support

  • Nested compression (ZIP, GZ) is not supported.

  • Files with the .tar.gz extension are not supported.

  • An error is raised if a .gz file is malformed.

    • Possible error(s):
      • Magic 2 bytes are empty
      • Not a gzipped file
      • Unknown compression method
      • The name is not in the header according to the flag
  • At most 5 files are taken for sampling (including the inner files of a ZIP archive).

    E.g.
    1. sample_compressed_file.zip (has 5 files inside it)
    |__ sample_csv_file_1.csv
    |__ sample_csv_file_2.csv
    |__ sample_exe_file_1.exe
    |__ sample_exe_file_2.exe
    |__ sample_gz_file.gz
    2. sample_csv_file_3.csv
    3. sample_jsonl_file.jsonl
    4. sample_csv_file_4.csv

    • In the above scenario, the first 5 supported files are taken.

        - New list of files (the .exe files are dropped):
      
      	1. sample_csv_file_1.csv
      	2. sample_csv_file_2.csv
      	3. sample_gz_file.gz
      	4. sample_csv_file_3.csv
      	5. sample_jsonl_file.jsonl
      	6. sample_csv_file_4.csv
      
         - First 5 files taken for sampling:
      
      	1. sample_csv_file_1.csv
      	2. sample_csv_file_2.csv
      	3. sample_gz_file.gz
      	4. sample_csv_file_3.csv
      	5. sample_jsonl_file.jsonl
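
The sampling rule above can be sketched roughly as follows. This is only an illustration, not the tap's actual code: `SUPPORTED_EXTENSIONS`, `expand_candidates` and `pick_sample_files` are hypothetical names, and the set of supported extensions is an assumption inferred from the example (the .exe files are dropped, while .csv, .jsonl and .gz are kept).

```python
import io
import itertools
import zipfile

# Assumption: inferred from the example above, not taken from the tap itself.
SUPPORTED_EXTENSIONS = (".csv", ".jsonl", ".gz")
MAX_SAMPLE_FILES = 5


def expand_candidates(files):
    """Yield supported file names in listing order.

    `files` is an iterable of (name, content_bytes) pairs. A ZIP archive is
    replaced in place by the names of its inner files.
    """
    for name, content in files:
        if name.lower().endswith(".zip"):
            with zipfile.ZipFile(io.BytesIO(content)) as archive:
                inner_names = archive.namelist()
        else:
            inner_names = [name]
        for inner in inner_names:
            lowered = inner.lower()
            # .tar.gz is listed as unsupported, so exclude it explicitly.
            if lowered.endswith(".tar.gz"):
                continue
            if lowered.endswith(SUPPORTED_EXTENSIONS):
                yield inner


def pick_sample_files(files, limit=MAX_SAMPLE_FILES):
    """Return at most `limit` supported files, counting inner ZIP files."""
    return list(itertools.islice(expand_candidates(files), limit))
```

Fed the listing from the example (a ZIP with five inner files followed by three plain files), `pick_sample_files` would return the same five names as the list above, leaving out sample_csv_file_4.csv.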
      

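The possible .gz errors listed earlier correspond to checks on the gzip header (magic bytes, compression method, and the FNAME flag). A minimal sketch of such a check is shown here, assuming the header layout from the gzip format specification; `read_gzip_original_name` is a hypothetical helper, not the function used by this PR or by singer-encodings, and the "Truncated gzip header" message is an extra guard added for the sketch.

```python
import struct

GZIP_MAGIC = b"\x1f\x8b"
DEFLATE_METHOD = 8
FEXTRA = 0x04  # header flag: an "extra" field follows the fixed header
FNAME = 0x08   # header flag: the original file name is stored


def read_gzip_original_name(stream):
    """Return the original file name stored in a gzip header.

    Raises ValueError with one of the messages listed above when the
    header is missing, invalid, or carries no name.
    """
    magic = stream.read(2)
    if not magic:
        raise ValueError("Magic 2 bytes are empty")
    if magic != GZIP_MAGIC:
        raise ValueError("Not a gzipped file")

    # Remaining fixed header: method, flags, mtime (4 bytes), xfl, os.
    header = stream.read(8)
    if len(header) < 8:
        raise ValueError("Truncated gzip header")
    method, flags = header[0], header[1]
    if method != DEFLATE_METHOD:
        raise ValueError("Unknown compression method")
    if not flags & FNAME:
        raise ValueError("The name is not in the header according to the flag")

    if flags & FEXTRA:
        # Skip the optional extra field: 2-byte little-endian length + data.
        (extra_len,) = struct.unpack("<H", stream.read(2))
        stream.read(extra_len)

    # The name is a zero-terminated Latin-1 string.
    name = bytearray()
    while True:
        byte = stream.read(1)
        if not byte or byte == b"\x00":
            break
        name += byte
    return name.decode("latin-1")
```

A file produced with `gzip --no-name` has no FNAME flag set, so this sketch would raise the last error in the list for it.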
Manual QA steps

  • Tested all the above-mentioned scenarios and ran all integration and unit tests.

Risks

Rollback steps

  • Revert this branch

KAllan357 merged commit 61f11c9 into master Jun 1, 2021
KAllan357 deleted the TDL-12486-Support-Compressed-Files branch June 1, 2021 14:29
mjdoor referenced this pull request in symon-ai/tap-s3-csv Feb 28, 2022
* Fix/config parsing (#21)

* allow search_prefix to be None

* handle both list and string for key_properties and date_overrides

* pylint

* Bump to v1.2.2 (#22)

* Bump to v1.2.2

* Changelog

* Check if search_prefix is present before popping (#23)

* Bump to v1.2.3 (#24)

* TDL-13258 move tests from tap-tester to tap-s3-csv (#29)

* TDL-13258:Added integration tests and resources to tap-s3-csv from tap-tester

* Add context and triggers to circleci config

* Run nosetests on the correct folder

* Remove nose tests because there are no unit tests

* Fix test properties

* TDL-13258:Updated non_rectangular_files test case in types_and_data

* Combine related tests into one

Co-authored-by: Savan Chovatiya <[email protected]>
Co-authored-by: Collin Simon <[email protected]>

* TDL-12589: Added the support of JSONL files (#31)

* TDL-12589: Added the support of JSONL files

* TDL-12589: Formatted code

* TDL-12589: test updated

* TDL-12589: Updated config.yml to expect failures

* TDL-12589: Added Stitch API token

* TDL-12589: Updated config and conversion of datatype

* TDL-12589: Updated priority of datatype like:
list
date-time
dict
integer
number
null - default in everyone
string - default in everyone

* TDL-12589: Updated as per priority

* TDL-12589: removed pylint failures

* TDL-12589: replaced

* TDL-12589: Added warning message for list inside list

* TDL-12589: Optimized code

* TDL-12589: Removed white space

* TDL-12589: Skipping row of JSONL file if it is empty instead of raising error.

* TDL: Removed extra white space

* TDL-12589: Updated test files

* TDL-12589: Updated code as per review comments changes

* TDL-12589: Added Unittests for the same

* TDL-12589: Pylint error resolved

* TDL-12589: Changed remove fields log from info to debug

* TDL-12589: Updated conversion code to support + sign

Co-authored-by: dbshah1212 <[email protected]>

* TDL-12464: Added support for handling the duplicate headers in the CS… (#30)

* TDL-12464: Added support for handling the duplicate headers in the CSV file

* Changed warning message

* Updated unit tests according to the warning message

* TDL-12464: Adding code to leverage duplicate headers support provided in singer-encoding library

* TDL-12464: Removed the unwanted code and made compatible with master repo

* TDL-12464: Upgraded singer-encodings library to fetch the latest version

* TDL-12464: Changing the data type of 'sdc_extra' key in the event

* TDL-12464: Updating test cases as per the code optimization

* TDL-12464: Updating version of singer-encoding library

* TDL-12464: Updating version of singer-python and backoff modules

Co-authored-by: Karan Panchal (C) <[email protected]>
Co-authored-by: harshpatel4_crest <[email protected]>

* TDL-12486: Added support of compressed files (#32)

* TDL-12486: Added support of compressed files

* TDL-12486: Updated singer encoding dependency

* TDL-12486: Added more doc strings.

* TDL-12486: Upgraded dependencies changed the logic of taking samples from zip

* TDL-12486: Increase coverage to test compressed files

* TDL-12486: Upgraded the singer-encoding version to 0.1.0

* TDL-12486: Removed trailing-whitespace

* TDL-12486: Updated test case of S3AllFilesSupport

* TDL-12486: Removed common self.conn_id

* TDL-12486: Changes reverted.

* TDL-12486: Changed start date format

* TDL-12486: Updated date format in test_All_supported_files.

* TDL-12486: Change in logger messages

Co-authored-by: dbshah1212 <[email protected]>

* Tdl 12589 change sdc extra logs from debug to warn (#33)

* TDL-12589: Changed sdc_extra log from debug to warn

* TDL-12589: Changed message to sync with csv message

* TDL-12589: Updated message

Co-authored-by: dbshah1212 <[email protected]>

* version bump to 1.3.0 (#34)

* Strictly enforce the ordering of type checking for integer vs number (#35)

* Strictly enforce the ordering of type checking for integer vs number

* Bump to v1.3.1 (#36)

* TDL-14068:fixed key-error exception (#38)

* TDL-14068:fixed key-error exception

* Added unit test cases and integration tests

* Running one integration test for debugging

* Debugging integration test case

* Updated integration test

* Updated integration test expected output

* Updated config.yml for running all integration test again

* Fix/tdl 14038 filename issue (#37)

* TDL-14038: Skipping .gz files that were gzipped using --no-name

* TDL-14038: Added final count of total skipped files for discover mode and sync mode

* tdl-14038: Updated warning message and added unit test for the same

* TDL-14038: Removed global variable and added integration test

* TDL-14038: Updated comments

* TDL-14038: Added blank line

* TDL-14038: Removed: trailing-whitespace

* TDL-14038: Added comment of pylint disable

* TDL-14038: Updated pylint comment

* TDL-14038: Updated the test file class name

* TDL-14038: Removed self file call and added global.

* TDL: Remove warning message for 0 file skipped

* TDL-14038: Removed trailing white space

* TDL-14068: Fixed key error exception.

* TDL-14038: Reverted another bug changes

* TDL-14038: updated skipped_files_count

* TDL-14038: Updated message, comments and counts

* TDL-14038: Removed trailing-whitespace

* TDL-14038: Updated unit test cases

* TDL-14038: Updated sync file code.

* Resolved: use-maxsplit-arg

* Refactor how we handle nameless files

* Fix comment placement

* Mention tar as a problem too

* Make pylint happy

Co-authored-by: dbshah1212 <[email protected]>
Co-authored-by: Andy Lu <[email protected]>

* Bump to v1.3.2, update changelog (#39)

* Bump to v1.3.2, update changelog

* Update changelog

* bump singer-encodings 0.1.1 (#41)

* bump 1.3.3 (#42)

* TDL-14228: Generate catalog file with the properties key if no samples found for sampling. (#40)

* Updated sampled schema when no samples found

* Running one integration test for debugging

* Debugging integration test

* Debugging integration test

* Updated integration test for catalog_with_empty_properties

* Running all integration test again

* Fix/wrong file extension error handling (#43)

* fix: Handled Unicode and JsonDecoder Error for wrong extension file.

* fix: Updated sync code and test case

* Fix: Handled StopIteration error for empty csv file.

* fix: Added unit test of StopIteration code handling

* fix: Resolved pylint errors

* Fix: removed trailing white space

* fix: disabled use-maxsplit-arg as we haven't changed the code as part of this branch

* fix: Removed exception and added Warning for empty Jsonl file.

* fix: Handled pylint error

* fix: Skipping records with empty json

* fix: Added unit tests and integration tests for empty json jsonl file.

* fix: Skipping empty JSON while syncing as well

* Skipping empty lines of CSV in sampling and sync

* fix: Upgraded latest version of singer-encoding.

* fix: Added some test files

* fix: Removed unused variable declaration

* fix: Added UnicodeDecodeError and JSONDecodeError handling scenario in comment.

* fix: Final touch

* Update spell mistake

* Corrected typo

* Updated warning messages and empty jsonl file in skip count

* fix: Put warning of skipping empty jsonl files.

* fix: Updated comment

Co-authored-by: dbshah1212 <[email protected]>
Co-authored-by: savan-chovatiya <[email protected]>
Co-authored-by: Kyle Allan <[email protected]>

* Bump to version 1.3.4 (#45)

* Bump to version 1.3.4

* Bump to version 1.3.4

* Bump to version 1.3.4

* Bump to version 1.3.4

* Bump to version 1.3.4

Co-authored-by: KrishnanG <[email protected]>

* WP-7630 Reintroduce role assumption capabilities

* WP-7630 Specify config for external source

* WP-7630 Test

* WP-7630 Undo test

* WP-7630 Resolve merge issues

* WP-7630 Try with setup.py file

* WP-7630 Modify setup.py

* WP-7630 Add recursive_search parameter

* WP-7630 Fix recursive_search

* WP-7630 Use appropriate version number

* WP-7630 Fix recursive_search with blank prefix

* WP-7630 Update readme, changelog

Co-authored-by: Nick McCoy <[email protected]>
Co-authored-by: cosimon <[email protected]>
Co-authored-by: savan-chovatiya <[email protected]>
Co-authored-by: Savan Chovatiya <[email protected]>
Co-authored-by: Collin Simon <[email protected]>
Co-authored-by: dbshah1212 <[email protected]>
Co-authored-by: dbshah1212 <[email protected]>
Co-authored-by: karanpanchal-crest <[email protected]>
Co-authored-by: Karan Panchal (C) <[email protected]>
Co-authored-by: harshpatel4_crest <[email protected]>
Co-authored-by: Leslie VanDeMark <[email protected]>
Co-authored-by: Andy Lu <[email protected]>
Co-authored-by: zachharris1 <[email protected]>
Co-authored-by: savan-chovatiya <[email protected]>
Co-authored-by: Kyle Allan <[email protected]>
Co-authored-by: KrisPersonal <[email protected]>
Co-authored-by: KrishnanG <[email protected]>
smayerv referenced this pull request in symon-ai/tap-s3-csv May 10, 2022
smayerv referenced this pull request in symon-ai/tap-s3-csv May 11, 2022
* Bump csv field width (#47)

* Maybe increase the field width we can handle

* Fix typo

* Just use sys.maxsize

* Make pylint happy

* Version bump to `v1.3.5` (#48)

* Make entry consistent with others

* Bump to v1.3.5, update changelog

Co-authored-by: Nick McCoy <[email protected]>
Co-authored-by: cosimon <[email protected]>
Co-authored-by: savan-chovatiya <[email protected]>
Co-authored-by: Savan Chovatiya <[email protected]>
Co-authored-by: Collin Simon <[email protected]>
Co-authored-by: dbshah1212 <[email protected]>
Co-authored-by: dbshah1212 <[email protected]>
Co-authored-by: karanpanchal-crest <[email protected]>
Co-authored-by: Karan Panchal (C) <[email protected]>
Co-authored-by: harshpatel4_crest <[email protected]>
Co-authored-by: Leslie VanDeMark <[email protected]>
Co-authored-by: Andy Lu <[email protected]>
Co-authored-by: zachharris1 <[email protected]>
Co-authored-by: savan-chovatiya <[email protected]>
Co-authored-by: Kyle Allan <[email protected]>
Co-authored-by: KrisPersonal <[email protected]>
Co-authored-by: KrishnanG <[email protected]>

3 participants