PARQUET-2343: Fixes NPE when rewriting file with multiple rowgroups #1136

Merged 2 commits on Sep 4, 2023

@ConeyLiu ConeyLiu commented Aug 31, 2023

Currently, ParquetRewriter creates a single ColumnReadStoreImpl (crStore) and reuses it for every block it rewrites. This is incorrect: a new crStore should be created for each block that needs to be rewritten. Otherwise, rewriting a file with multiple row groups fails as follows:

java.lang.NullPointerException
at org.apache.parquet.column.impl.ColumnReaderBase.readPage(ColumnReaderBase.java:620)
at org.apache.parquet.column.impl.ColumnReaderBase.checkRead(ColumnReaderBase.java:594)
at org.apache.parquet.column.impl.ColumnReaderBase.consume(ColumnReaderBase.java:735)
at org.apache.parquet.column.impl.ColumnReaderImpl.consume(ColumnReaderImpl.java:30)
at org.apache.parquet.column.impl.ColumnReaderImpl.<init>(ColumnReaderImpl.java:47)
at org.apache.parquet.column.impl.ColumnReadStoreImpl.getColumnReader(ColumnReadStoreImpl.java:82)
at org.apache.parquet.hadoop.rewrite.ParquetRewriter.processBlocksFromReader(ParquetRewriter.java:316)
at org.apache.parquet.hadoop.rewrite.ParquetRewriter.processBlocks(ParquetRewriter.java:250)
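The failure mode can be illustrated with a small, self-contained sketch. This is a toy model, not the Parquet code: `BlockPages`, `ReadStore`, and both `rewrite...` methods are hypothetical stand-ins, and the model assumes that a store is bound to one block's pages when it is constructed and that those pages are consumed as they are read. Under that assumption, a store reused for a second block finds no page left and dereferences `null`, while a store created per block does not.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Toy model only: every name here is hypothetical and none of it is Parquet API.
class PerBlockStoreSketch {

    /** Stand-in for one row group's pages; pages are consumed as they are read. */
    static final class BlockPages {
        private final Deque<String> pages;

        BlockPages(List<String> pages) {
            this.pages = new ArrayDeque<>(pages);
        }

        /** Returns the next page, or null once the block is exhausted. */
        String nextPage() {
            return pages.poll();
        }
    }

    /** Stand-in for a column read store: bound to ONE block at construction. */
    static final class ReadStore {
        private final BlockPages block;

        ReadStore(BlockPages block) {
            this.block = block;
        }

        int nextPageLength() {
            String page = block.nextPage();
            return page.length(); // throws NullPointerException if no page is left
        }
    }

    /** Buggy shape: one store, built from the first block, reused for all blocks. */
    static int rewriteReusingStore(List<BlockPages> blocks) {
        ReadStore store = new ReadStore(blocks.get(0));
        int total = 0;
        for (BlockPages ignored : blocks) {
            total += store.nextPageLength();
        }
        return total;
    }

    /** Fixed shape: a fresh store for each block. */
    static int rewriteWithFreshStore(List<BlockPages> blocks) {
        int total = 0;
        for (BlockPages block : blocks) {
            total += new ReadStore(block).nextPageLength();
        }
        return total;
    }

    /** Two blocks with one page each. */
    static List<BlockPages> twoBlocks() {
        return List.of(new BlockPages(List.of("aa")), new BlockPages(List.of("bbb")));
    }

    public static void main(String[] args) {
        System.out.println(rewriteWithFreshStore(twoBlocks()));
        try {
            rewriteReusingStore(twoBlocks());
        } catch (NullPointerException e) {
            System.out.println("NullPointerException on the second block");
        }
    }
}
```

In this model a single-block input works in both variants, which is consistent with the problem only surfacing for files with multiple row groups.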

Jira

Tests

  • My PR adds the following unit tests OR does not need testing for this extremely good reason:

Adds testRewriteFileWithMultipleBlocks in ParquetRewriterTest.

Commits

  • My commits all reference Jira issues in their subject lines. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters (not including Jira issue reference)
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Documentation

  • In case of new functionality, my PR adds documentation that describes how to use it.
    • All the public functions and the classes in the PR contain Javadoc that explain what it does

@ConeyLiu
Contributor Author

Hi @wgtmac, could you help to review this? Thanks a lot.

Member

@wgtmac wgtmac left a comment

Thanks for the fix! @ConeyLiu

@wgtmac wgtmac merged commit 9b5a962 into apache:master Sep 4, 2023
wgtmac pushed a commit to wgtmac/parquet-mr that referenced this pull request Sep 4, 2023
@ConeyLiu
Contributor Author

ConeyLiu commented Sep 4, 2023

Thanks @wgtmac for merging this.

@ConeyLiu ConeyLiu deleted the fixes-rewrite branch September 4, 2023 08:48
jacomago added a commit to jacomago/epicsarchiverap that referenced this pull request Nov 6, 2023
Only recompress the parquet rows when the compression mode
changes.

This is unnecessary when PR apache/parquet-java#1136 is in the parquet release.
jacomago added a commit to jacomago/epicsarchiverap that referenced this pull request Nov 7, 2023
* Only recompress the parquet rows when the compression mode
changes.

This is unnecessary when PR apache/parquet-java#1136 is in the parquet release.

* Fix ETLTimeTest to handle src made up of ZIP files.

* Fix NamedFlagETLTest to use a single configservice so as
to not conflict test with each other.

* Fix ETLPostProcessorTest to use a single configservice so as
to not conflict test with each other.

* Run workflows on all pull requests
jacomago added a commit to jacomago/epicsarchiverap that referenced this pull request Nov 8, 2023
Add support for using parquet files

Adds support in the plainpbstorage plugin for storing pb
messages in parquet files.

Parquet fix recompress (#20)

* Only recompress the parquet rows when the compression mode
changes.

This is unnecessary when PR apache/parquet-java#1136 is in the parquet release.

* Fix ETLTimeTest to handle src made up of ZIP files.

* Fix NamedFlagETLTest to use a single configservice so as
to not conflict test with each other.

* Fix ETLPostProcessorTest to use a single configservice so as
to not conflict test with each other.

* Run workflows on all pull requests

Parquet more test (#19)

* Create more test cases

Adds the parquet backend to more tests
Creates more variations in test setup
Removes repeated tests
removes need for single forks tests tag
Removes need for slow tests tag

* Test different compression options in the timetest

Parquet multi file (#18)

* Tidy up pb/parquet packages

* Combine parquet and pb compression mode

Adds a new wrapper class for CompressionCodec from parquet
and CompressionMode in pb

* Improve efficiency of using Parquet library

* Make the parquet file stream able to work with multiple files.
jacomago added a commit to jacomago/epicsarchiverap that referenced this pull request Nov 8, 2023
Add support for parquet storage.

Replace Timestamp with Instant

Also reduces length of ETL Tests by decreasing amount of data created.

Upgrade junit to junit 5.
jacomago added a commit to jacomago/epicsarchiverap that referenced this pull request Nov 28, 2023
jacomago added a commit to jacomago/epicsarchiverap that referenced this pull request Jan 11, 2024
jacomago added a commit to jacomago/epicsarchiverap that referenced this pull request Jan 11, 2024
Use constants for mgmt and other urls

Add parquet file extension to integration tests.
jacomago added a commit to jacomago/epicsarchiverap that referenced this pull request Jan 11, 2024
jacomago added a commit to jacomago/epicsarchiverap that referenced this pull request Jan 12, 2024
jacomago added a commit to jacomago/epicsarchiverap that referenced this pull request Feb 1, 2024
Adds support in the plainstorage plugin for storing pb
messages in parquet files.

* Only recompress the parquet rows when the compression mode
changes.

This is unnecessary when PR apache/parquet-java#1136 is in the parquet release.

* Fix ETLTimeTest to handle src made up of ZIP files.

* Fix NamedFlagETLTest to use a single configservice so as
to not conflict test with each other.

* Fix ETLPostProcessorTest to use a single configservice so as
to not conflict test with each other.

removes need for single forks tests tag
Removes need for slow tests tag

* Combine parquet and pb compression mode

Adds a new wrapper class for CompressionCodec from parquet
and CompressionMode in pb
jacomago added a commit to jacomago/epicsarchiverap that referenced this pull request Feb 1, 2024
jacomago added a commit to jacomago/epicsarchiverap that referenced this pull request Feb 1, 2024
jacomago added a commit to jacomago/epicsarchiverap that referenced this pull request Feb 2, 2024
jacomago added a commit to jacomago/epicsarchiverap that referenced this pull request Feb 5, 2024
jacomago added a commit to jacomago/epicsarchiverap that referenced this pull request Feb 7, 2024
jacomago added a commit to jacomago/epicsarchiverap that referenced this pull request Feb 7, 2024
jacomago added a commit to jacomago/epicsarchiverap that referenced this pull request Feb 7, 2024
jacomago added a commit to jacomago/epicsarchiverap that referenced this pull request Feb 7, 2024
jacomago added a commit to jacomago/epicsarchiverap that referenced this pull request Feb 7, 2024
jacomago added a commit to jacomago/epicsarchiverap that referenced this pull request Feb 21, 2024
jacomago added a commit to jacomago/epicsarchiverap that referenced this pull request Feb 21, 2024
jacomago added a commit to jacomago/epicsarchiverap that referenced this pull request Feb 22, 2024
jacomago added a commit to jacomago/epicsarchiverap that referenced this pull request Feb 22, 2024
jacomago added a commit to jacomago/epicsarchiverap that referenced this pull request Apr 24, 2024
jacomago added a commit to jacomago/epicsarchiverap that referenced this pull request Apr 25, 2024
jacomago added a commit to jacomago/epicsarchiverap that referenced this pull request Apr 25, 2024
jacomago added a commit to jacomago/epicsarchiverap that referenced this pull request May 13, 2024
jacomago added a commit to jacomago/epicsarchiverap that referenced this pull request May 13, 2024
jacomago added a commit to jacomago/epicsarchiverap that referenced this pull request May 29, 2024
jacomago added a commit to jacomago/epicsarchiverap that referenced this pull request Jul 5, 2024
jacomago added a commit to jacomago/epicsarchiverap that referenced this pull request Jul 5, 2024
jacomago added a commit to jacomago/epicsarchiverap that referenced this pull request Jul 5, 2024
jacomago added a commit to jacomago/epicsarchiverap that referenced this pull request Jul 5, 2024
jacomago added a commit to jacomago/epicsarchiverap that referenced this pull request Aug 5, 2024
jacomago added a commit to jacomago/epicsarchiverap that referenced this pull request Aug 5, 2024
jacomago added a commit to jacomago/epicsarchiverap that referenced this pull request Aug 6, 2024
jacomago added a commit to jacomago/epicsarchiverap that referenced this pull request Nov 21, 2024
jacomago added a commit to jacomago/epicsarchiverap that referenced this pull request Nov 29, 2024
jacomago added a commit to jacomago/epicsarchiverap that referenced this pull request Jan 22, 2025
jacomago added a commit to jacomago/epicsarchiverap that referenced this pull request Jan 22, 2025