diff --git a/CHANGELOG.md b/CHANGELOG.md index 4ecdf628355ea..413103bed1bb6 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,4 +1,483 @@ +# Apache Arrow 14.0.0 (2023-10-19) + +## Bug Fixes + +* [GH-15017](https://github.com/apache/arrow/issues/15017) - [Python] Harden test_memory.py for use with ARROW_USE_GLOG=ON (#36901) +* [GH-15281](https://github.com/apache/arrow/issues/15281) - [C++] Replace bytes_view alias with span (#36334) +* [GH-31621](https://github.com/apache/arrow/issues/31621) - [JS] Fix Union null bitmaps (#37122) +* [GH-32439](https://github.com/apache/arrow/issues/32439) - [Python] Fix off by one bug when chunking nested structs (#37376) +* [GH-32483](https://github.com/apache/arrow/issues/32483) - [Docs][Python] Clarify you need to use conda-forge for installing nightly conda package (#37948) +* [GH-33807](https://github.com/apache/arrow/issues/33807) - [R] Add a message if we detect running under emulation (#37777) +* [GH-34567](https://github.com/apache/arrow/issues/34567) - [JS] Improve build and do not generate `bin/bin` directory (#36607) +* [GH-34640](https://github.com/apache/arrow/issues/34640) - [R] Can't read in partitioning column in CSV datasets when both (non-hive) partition and schema supplied (#37658) +* [GH-34909](https://github.com/apache/arrow/issues/34909) - [C++] Avoid mean overflow on large integer inputs (#37243) +* [GH-35095](https://github.com/apache/arrow/issues/35095) - [C++] Prevent write after close in arrow::ipc::IpcFormatWriter (#37783) +* [GH-35167](https://github.com/apache/arrow/issues/35167) - [Docs][C++] Use new API for arrow::json::TableReader (#37301) +* [GH-35292](https://github.com/apache/arrow/issues/35292) - [Release] Retry "apt install" (#36836) +* [GH-35328](https://github.com/apache/arrow/issues/35328) - [Go][FlightSQL] Fix flaky test for FlightSql driver (#38044) +* [GH-35450](https://github.com/apache/arrow/issues/35450) - [C++] Return error when `RecordBatch::ToStructArray` called with mismatched 
column lengths (#36654) +* [GH-35581](https://github.com/apache/arrow/issues/35581) - [C++] Store offsets in scalars (#36018) +* [GH-35641](https://github.com/apache/arrow/issues/35641) - [CI][C++] Disable precompiled headers (#37502) +* [GH-35658](https://github.com/apache/arrow/issues/35658) - [Packaging] Sync conda recipes with feedstocks (#35637) +* [GH-35770](https://github.com/apache/arrow/issues/35770) - [Go][Documentation] Update TimestampType zero value as seconds in comment (#37905) +* [GH-35942](https://github.com/apache/arrow/issues/35942) - [C++] Improve Decimal ToReal accuracy (#36667) +* [GH-36069](https://github.com/apache/arrow/issues/36069) - [Java] Ensure S3 is finalized on shutdown (#36934) +* [GH-36154](https://github.com/apache/arrow/issues/36154) - [JS][CI] Use `jest` cache in CI (#36373) +* [GH-36189](https://github.com/apache/arrow/issues/36189) - [C++][Parquet] StreamReader::SkipRows() skips to incorrect place in multi-row-group files (#36191) +* [GH-36318](https://github.com/apache/arrow/issues/36318) - [Go] only decode lengths for the number of existing values, not for all nvalues. 
(#36322) +* [GH-36323](https://github.com/apache/arrow/issues/36323) - [Python] Fix Timestamp scalar repr error on values outside datetime range (#36942) +* [GH-36332](https://github.com/apache/arrow/issues/36332) - [CI][Java] Integration jobs with Spark fail with NoSuchMethodError:io.netty.buffer.PooledByteBufAllocator +* [GH-36371](https://github.com/apache/arrow/issues/36371) - [Java] CycloneDX Unable to load the mojo 'makeBom' +* [GH-36379](https://github.com/apache/arrow/issues/36379) - [C++] Bundled dependency include paths should override system include dirs (#37612) +* [GH-36502](https://github.com/apache/arrow/issues/36502) - [C++] Add run-end encoded array support to ReferencedByteRanges (#36521) +* [GH-36610](https://github.com/apache/arrow/issues/36610) - [CI][C++] Don't enable ARROW_ACERO by default (#36611) +* [GH-36619](https://github.com/apache/arrow/issues/36619) - [Python] Parquet statistics string representation misleading (#36626) +* [GH-36634](https://github.com/apache/arrow/issues/36634) - [Dev] Ensure merge script goes over all pages when requesting info from GitHub (#36637) +* [GH-36638](https://github.com/apache/arrow/issues/36638) - [R] Error with create_package_with_all_dependencies() on Windows (#37226) +* [GH-36645](https://github.com/apache/arrow/issues/36645) - [Go] returns writer.Close error to caller when writing parquet (#36646) +* [GH-36655](https://github.com/apache/arrow/issues/36655) - [Dev] Fix fury command to upload nightly wheels (#36657) +* [GH-36663](https://github.com/apache/arrow/issues/36663) - [C++] Fix the default value information for enum options (#36684) +* [GH-36680](https://github.com/apache/arrow/issues/36680) - [Python] Add missing pytest.mark.acero (#36683) +* [GH-36685](https://github.com/apache/arrow/issues/36685) - [R][C++] Fix illegal opcode failure with Homebrew (#36705) +* [GH-36688](https://github.com/apache/arrow/issues/36688) - [C#] Fix dereference error (#36691) +* 
[GH-36692](https://github.com/apache/arrow/issues/36692) - [CI][Packaging] Pin gemfury to 0.12.0 due to issue with faraday dependency (#36693) +* [GH-36708](https://github.com/apache/arrow/issues/36708) - [C++] Fully calculate null-counts so the REE allocations make sense (#36740) +* [GH-36712](https://github.com/apache/arrow/issues/36712) - [CI] Also update issue components when it's updated (#36723) +* [GH-36720](https://github.com/apache/arrow/issues/36720) - [R] stringr modifier functions cannot be called with namespace prefix (#36758) +* [GH-36726](https://github.com/apache/arrow/issues/36726) - [R] calling read_parquet on S3 connections results in error message being ignored (#37024) +* [GH-36730](https://github.com/apache/arrow/issues/36730) - [Python] Add support for Cython 3.0.0 (#37097) +* [GH-36771](https://github.com/apache/arrow/issues/36771) - [R] stringr helper functions drop calling environment when evaluating (#36784) +* [GH-36776](https://github.com/apache/arrow/issues/36776) - [C++] Make ListArray::FromArrays() handle sliced offsets Arrays containing nulls (#36780) +* [GH-36787](https://github.com/apache/arrow/issues/36787) - [R] lintr update leads to failing tests on main (#36788) +* [GH-36809](https://github.com/apache/arrow/issues/36809) - [Python] MapScalar.as_py with custom field name (#36830) +* [GH-36819](https://github.com/apache/arrow/issues/36819) - [R] Use RunWithCapturedR for reading Parquet files (#37274) +* [GH-36828](https://github.com/apache/arrow/issues/36828) - [C++][Parquet] Make buffered RowGroupSerializer using BufferedPageWriter (#36829) +* [GH-36850](https://github.com/apache/arrow/issues/36850) - [Go] Arrow Concatenate fix, ensure allocations are Free'd (#36854) +* [GH-36856](https://github.com/apache/arrow/issues/36856) - [C++] Remove needless braces from BasicDecimal256FromLE() arguments (#36987) +* [GH-36858](https://github.com/apache/arrow/issues/36858) - [Go] Fix dictionary builder leak (#36859) +* 
[GH-36860](https://github.com/apache/arrow/issues/36860) - [C++] Report CMake error when system Protobuf exists but system gRPC doesn't exist (#36904) +* [GH-36863](https://github.com/apache/arrow/issues/36863) - [C#] Remove unnecessary applied fix to not shutdown PythonEngine on CDataInterfacePythonTests if .NET is > 5.0 (#36872) +* [GH-36863](https://github.com/apache/arrow/issues/36863) - [C#][Packaging] Do not shutdown PythonEngine on CDataInterfacePythonTests if .NET is > 5.0 (#36868) +* [GH-36883](https://github.com/apache/arrow/issues/36883) - [R] Remove version number which triggers CRAN warning (#36884) +* [GH-36920](https://github.com/apache/arrow/issues/36920) - [Java][Docs] Add ARROW_JSON var to maven build profile (#36921) +* [GH-36922](https://github.com/apache/arrow/issues/36922) - [CI][C++][Windows] Search OpenSSL from PATH (#36923) +* [GH-36935](https://github.com/apache/arrow/issues/36935) - [Go] Fix Timestamp to Time dates (#36964) +* [GH-36939](https://github.com/apache/arrow/issues/36939) - [C++][Parquet] Direct put of BooleanArray is incorrect when called several times (#36972) +* [GH-36941](https://github.com/apache/arrow/issues/36941) - [CI][Docs] Use system Protobuf (#36943) +* [GH-36949](https://github.com/apache/arrow/issues/36949) - [C++] Fix KeyColumnArray's buffers array bounds assertion. 
(#36966) +* [GH-36973](https://github.com/apache/arrow/issues/36973) - [CI][Python] Archery linter integrated with flake8==6.1.0 (#36976) +* [GH-36975](https://github.com/apache/arrow/issues/36975) - [C++][FlightRPC] Skip unknown fields, don't crash (#36979) +* [GH-36981](https://github.com/apache/arrow/issues/36981) - [Go] Fix ipc reader leak (#36982) +* [GH-36983](https://github.com/apache/arrow/issues/36983) - [Python] Different get_file_info behaviour between pyarrow.fs.S3FileSystem and s3fs (#37768) +* [GH-36991](https://github.com/apache/arrow/issues/36991) - [Python][Packaging] Skip tests on Win that require a tz database (#36996) +* [GH-37017](https://github.com/apache/arrow/issues/37017) - [C++] Guard unexpected uses of BMI2 instructions (#37610) +* [GH-37022](https://github.com/apache/arrow/issues/37022) - [CI][Java] Use the official Maven download URL (#37119) +* [GH-37050](https://github.com/apache/arrow/issues/37050) - [Python][Interchange protocol] Add a workaround for empty dataframes (#38037) +* [GH-37056](https://github.com/apache/arrow/issues/37056) - [Java] Fix importing an empty data array from c-data (#37531) +* [GH-37067](https://github.com/apache/arrow/issues/37067) - [C++] Install bundled GoogleTest (#37483) +* [GH-37099](https://github.com/apache/arrow/issues/37099) - [C++] Fix build of Flight-UCX (#37105) +* [GH-37102](https://github.com/apache/arrow/issues/37102) - [Go][Parquet] Encoding: Make BitWriter Reserve when ReserveBytes (#37112) +* [GH-37106](https://github.com/apache/arrow/issues/37106) - [C++] Remove overflowed integer rounding benchmarks (#37109) +* [GH-37107](https://github.com/apache/arrow/issues/37107) - [C++] Suppress an unused variable warning with GCC 7 (#37240) +* [GH-37110](https://github.com/apache/arrow/issues/37110) - [C++] Expression: SmallestTypeFor lost tz for Scalar (#37135) +* [GH-37111](https://github.com/apache/arrow/issues/37111) - [C++][Parquet] Dataset: Fixing Schema Cast (#37793) +* 
[GH-37116](https://github.com/apache/arrow/issues/37116) - [C++][ORC] Link to absl::log_internal_check_op for ABSL_DCHECK*() (#37117) +* [GH-37120](https://github.com/apache/arrow/issues/37120) - [CI][Docs] Ensure removing existing Node.js (#37121) +* [GH-37129](https://github.com/apache/arrow/issues/37129) - [CI][Docs] Use Ubuntu 22.04 (#37132) +* [GH-37129](https://github.com/apache/arrow/issues/37129) - [CI][Docs] Free up disk space (#37131) +* [GH-37148](https://github.com/apache/arrow/issues/37148) - [C++] Explicitly list the integer values of the Type::type enum (#37149) +* [GH-37173](https://github.com/apache/arrow/issues/37173) - [C++][Go][Format] C-export/import Run-End Encoded Arrays (#37174) +* [GH-37208](https://github.com/apache/arrow/issues/37208) - [R] Use currently running R binary to compile test program (nix install) (#37225) +* [GH-37213](https://github.com/apache/arrow/issues/37213) - [C#] Updating a reference to FlatBuffers missed due to rebase/merge conflict (#37214) +* [GH-37217](https://github.com/apache/arrow/issues/37217) - [Python] Add missing docstrings to Cython (#37218) +* [GH-37239](https://github.com/apache/arrow/issues/37239) - [Ruby] Updated documentation for ArrowTable#initialize to clarify argument details (#37261) +* [GH-37245](https://github.com/apache/arrow/issues/37245) - [MATLAB] `arrow.internal.proxy.validate` throws `MATLAB:UndefinedFunction` when crafting the message to display when throwing the `arrow:proxy:ProxyNameMismatch` error (#37248) +* [GH-37266](https://github.com/apache/arrow/issues/37266) - [CI][C++] Use ARROW_CMAKE_ARGS not CMAKE_ARGS (#37272) +* [GH-37276](https://github.com/apache/arrow/issues/37276) - [C++] Skip multithread tests on single thread env (#37327) +* [GH-37294](https://github.com/apache/arrow/issues/37294) - [C++] Use std::string for HasSubstr matcher (#37314) +* [GH-37299](https://github.com/apache/arrow/issues/37299) - [C++] Fix clang-format version mismatch error with Homebrew's 
clang-format (#37300) +* [GH-37303](https://github.com/apache/arrow/issues/37303) - [Python] Update test_option_class_equality due to CumulativeSumOptions refactor (#37305) +* [GH-37308](https://github.com/apache/arrow/issues/37308) - [C++][Docs] Change name for CPP tutorial and minor fixes to the job (#37311) +* [GH-37325](https://github.com/apache/arrow/issues/37325) - [R] Update NEWS.md with missing changes for 13.0.0 (#37326) +* [GH-37329](https://github.com/apache/arrow/issues/37329) - [Release][Homebrew] Follow directory structure change (#37349) +* [GH-37340](https://github.com/apache/arrow/issues/37340) - [MATLAB] The `column(index)` method of `arrow.tabular.RecordBatch` errors if `index` refers to an `arrow.array.Time32Array` column (#37347) +* [GH-37352](https://github.com/apache/arrow/issues/37352) - [C++] Don't put all dependencies to ArrowConfig.cmake/arrow.pc (#37399) +* [GH-37373](https://github.com/apache/arrow/issues/37373) - [CI] Make integration build a bit leaner (#37366) +* [GH-37373](https://github.com/apache/arrow/issues/37373) - [CI][Integration] Free up disk space (#37374) +* [GH-37377](https://github.com/apache/arrow/issues/37377) - [C#] Throw OverflowException on overflow in TimestampArray.ConvertTo() (#37388) +* [GH-37386](https://github.com/apache/arrow/issues/37386) - [R] CRAN failures due to "invalid non-character version specification" (#37387) +* [GH-37406](https://github.com/apache/arrow/issues/37406) - [C++][FlightSQL] Add missing ArrowFlight::arrow_flight_{shared,static} dependencies (#37407) +* [GH-37408](https://github.com/apache/arrow/issues/37408) - [C++] Install arrow-compute.pc only when ARROW_COMPUTE=ON (#37409) +* [GH-37410](https://github.com/apache/arrow/issues/37410) - [C++][Gandiva] Add support for using LLVM shared library (#37412) +* [GH-37411](https://github.com/apache/arrow/issues/37411) - [C++][Python] Add string -> date cast kernel (fix python scalar cast) (#38038) +* 
[GH-37414](https://github.com/apache/arrow/issues/37414) - [Release][CI] Update references to wrong apache-arrow Homebrew formula path (#37415) +* [GH-37419](https://github.com/apache/arrow/issues/37419) - [Go][Parquet] Decimal256 support for pqarrow (#37503) +* [GH-37431](https://github.com/apache/arrow/issues/37431) - [R] Tests failing for R versions < 4.0 because of use of base pipe (|>) in tests (#37432) +* [GH-37433](https://github.com/apache/arrow/issues/37433) - [CI][Release] Increase timeout for macOS (#37530) +* [GH-37437](https://github.com/apache/arrow/issues/37437) - [C++] Fix MakeArrayOfNull for list array with large string values type (#37467) +* [GH-37453](https://github.com/apache/arrow/issues/37453) - [C++][Parquet] Performance fix for WriteBatch (#37454) +* [GH-37456](https://github.com/apache/arrow/issues/37456) - [R] CRAN incoming checks show NOTE due to internal function which isn't documented (#37457) +* [GH-37463](https://github.com/apache/arrow/issues/37463) - [R] CRAN incoming checks fail due to test run length (#37464) +* [GH-37466](https://github.com/apache/arrow/issues/37466) - [C++][Parquet] Fix Valgrind failure in DELTA_BYTE_ARRAY decoder (#37471) +* [GH-37470](https://github.com/apache/arrow/issues/37470) - [Python][Parquet] Add missing arguments to `ParquetFileWriteOptions` (#37469) +* [GH-37480](https://github.com/apache/arrow/issues/37480) - [Python] Bump pandas version that contains regression for pandas issue 50127 (#37481) +* [GH-37485](https://github.com/apache/arrow/issues/37485) - [C++][Skyhook] Don't use deprecated BufferReader API (#37486) +* [GH-37487](https://github.com/apache/arrow/issues/37487) - [C++][Parquet] Dataset: Implement sync `ParquetFileFormat::GetReader` (#37514) +* [GH-37488](https://github.com/apache/arrow/issues/37488) - [C++] Disable unity build for Azure SDK for C++ (#37489) +* [GH-37500](https://github.com/apache/arrow/issues/37500) - [CI][C++] Disable Dataset and Substrait by default (#37501) +* 
[GH-37507](https://github.com/apache/arrow/issues/37507) - [GLib] Don't use implicit include directories (#37508) +* [GH-37515](https://github.com/apache/arrow/issues/37515) - [C++] Remove memory address optimization from `ChunkedArray::Equals(const std::shared_ptr<ChunkedArray>& other)` if the `ChunkedArray` can have `NaN` values (#37579) +* [GH-37523](https://github.com/apache/arrow/issues/37523) - [C++][CI][CUDA] Don't use newer API and add missing CUDA dependencies (#37497) +* [GH-37535](https://github.com/apache/arrow/issues/37535) - [C++][Parquet] Add missing "thrift" dependency in parquet.pc (#37603) +* [GH-37539](https://github.com/apache/arrow/issues/37539) - [C++][FlightRPC] Fix binding to IPv6 addresses (#37552) +* [GH-37555](https://github.com/apache/arrow/issues/37555) - [Python] Update get_file_info_selector to ignore base directory (#37558) +* [GH-37560](https://github.com/apache/arrow/issues/37560) - [Python][Documentation] Replacing confusing batch size from 128Ki to 128_000 (#37605) +* [GH-37574](https://github.com/apache/arrow/issues/37574) - [Python] Compatibility with numpy 2.0 (#38040) +* [GH-37576](https://github.com/apache/arrow/issues/37576) - [R] Use `SafeCallIntoR()` to call garbage collector after a failed allocation (#37565) +* [GH-37601](https://github.com/apache/arrow/issues/37601) - [C++][Parquet] Add missing GoogleMock dependency (#37602) +* [GH-37608](https://github.com/apache/arrow/issues/37608) - [C++][Gandiva] TO_DATE function supports YYYY-MM and YYYY (#37609) +* [GH-37614](https://github.com/apache/arrow/issues/37614) - [R][CI] Update CI jobs due to duckdb repo moving (#37615) +* [GH-37621](https://github.com/apache/arrow/issues/37621) - [Packaging][Conda] Sync conda recipes with feedstocks (#37624) +* [GH-37639](https://github.com/apache/arrow/issues/37639) - [CI] Fix checkout on older OSes (#37640) +* [GH-37648](https://github.com/apache/arrow/issues/37648) - [Packaging][Linux] Fix libarrow-glib-dev/arrow-glib-devel dependencies (#37714) 
+* [GH-37650](https://github.com/apache/arrow/issues/37650) - [Python] Check filter inputs in FilterMetaFunction (#38075) +* [GH-37671](https://github.com/apache/arrow/issues/37671) - [R] legacy timezone symlinks cause CRAN failures (#37672) +* [GH-37712](https://github.com/apache/arrow/issues/37712) - [Go][Parquet] Fix ARM64 assembly for bitmap extract bits (#37785) +* [GH-37715](https://github.com/apache/arrow/issues/37715) - [Packaging][CentOS] Use default g++ on CentOS 9 Stream (#37718) +* [GH-37730](https://github.com/apache/arrow/issues/37730) - [C#] throw OverflowException in DecimalUtility if fractionalPart is too large (#37731) +* [GH-37735](https://github.com/apache/arrow/issues/37735) - [C++][FreeBSD] Suppress a shorten-64-to-32 warning (#38004) +* [GH-37738](https://github.com/apache/arrow/issues/37738) - [Go][CI] Update Go version for verification (#37745) +* [GH-37750](https://github.com/apache/arrow/issues/37750) - [R][C++] Add compatibility with IntelLLVM (#37781) +* [GH-37767](https://github.com/apache/arrow/issues/37767) - [C++][CMake] Don't touch .git/index (#38003) +* [GH-37771](https://github.com/apache/arrow/issues/37771) - [Go][Benchmarking] Update Conbench git info (#37772) +* [GH-37803](https://github.com/apache/arrow/issues/37803) - [Python][CI] Pin setuptools_scm to fix release verification scripts (#37930) +* [GH-37803](https://github.com/apache/arrow/issues/37803) - [CI][Dev][Python] Release and merge script errors (#37819) +* [GH-37805](https://github.com/apache/arrow/issues/37805) - [CI][MATLAB] Hard-code `release` to `R2023a` for `matlab-actions/setup-matlab` action in MATLAB CI workflows (#37808) +* [GH-37813](https://github.com/apache/arrow/issues/37813) - [R] add quoted_na argument to open_delim_dataset() (#37828) +* [GH-37829](https://github.com/apache/arrow/issues/37829) - [Java] Avoid resizing data buffer twice when appending variable length vectors (#37844) +* [GH-37834](https://github.com/apache/arrow/issues/37834) - 
[Gandiva] Migrate to new LLVM PassManager API (#37867) +* [GH-37845](https://github.com/apache/arrow/issues/37845) - [Go][Parquet] Check the number of logical fields instead of physical columns (#37846) +* [GH-37858](https://github.com/apache/arrow/issues/37858) - [Docs][JS] Fix check of remote URL to generate JS docs (#37870) +* [GH-37893](https://github.com/apache/arrow/issues/37893) - [Java] Move Types.proto in a subfolder (#37894) +* [GH-37907](https://github.com/apache/arrow/issues/37907) - [R] Setting rosetta variable is missing (#37961) +* [GH-37927](https://github.com/apache/arrow/issues/37927) - [CI][Dev][Archery] Badges for crossbow jobs always show \`no status\` even when they have failed or succeeded +* [GH-37936](https://github.com/apache/arrow/issues/37936) - [CI] Fix integration testing in rc-verify nightly builds (#37933) +* [GH-37950](https://github.com/apache/arrow/issues/37950) - [R] tests fail on R < 4.0 due to test calling data.frame() without specifying stringsAsFactors=FALSE (#37951) +* [GH-37952](https://github.com/apache/arrow/issues/37952) - [C++] Make unique->shared explicit to fix build failure on at least one compiler (#38136) +* [GH-37993](https://github.com/apache/arrow/issues/37993) - [CI] Fix conda-integration build (#37990) +* [GH-37999](https://github.com/apache/arrow/issues/37999) - [CI][Archery] Install python3-dev on ARM jobs to have access to Python.h (#38009) +* [GH-38011](https://github.com/apache/arrow/issues/38011) - [C++][Dataset] Change force close to tend to close on write (#38030) +* [GH-38014](https://github.com/apache/arrow/issues/38014) - [Python] pyarrow extension type is not converted to pandas properly in 13.0.0 +* [GH-38034](https://github.com/apache/arrow/issues/38034) - [Python] DataFrame Interchange Protocol - correct dtype information for categorical columns (#38065) +* [GH-38039](https://github.com/apache/arrow/issues/38039) - [C++][Parquet] Fix segfault getting compression level for a Parquet column 
(#38025) +* [GH-38049](https://github.com/apache/arrow/issues/38049) - [R] Prevent `on_rosetta()` from warning (#38052) +* [GH-38057](https://github.com/apache/arrow/issues/38057) - [Python][CI] Fix flaky hypothesis tests (#38058) +* [GH-38059](https://github.com/apache/arrow/issues/38059) - [Python][CI] Upgrade CUDA to 11.2.2 (#38081) +* [GH-38060](https://github.com/apache/arrow/issues/38060) - [Python][CI] Upgrade Spark versions (#38082) +* [GH-38068](https://github.com/apache/arrow/issues/38068) - [C++][CI] Fixing Parquet unittest `arrow_reader_writer_test.cc` compile (#38069) +* [GH-38074](https://github.com/apache/arrow/issues/38074) - [C++] Fix Offset Size Calculation for Slicing Large String and Binary Types in Hash Join (#38147) +* [GH-38076](https://github.com/apache/arrow/issues/38076) - [Java][CI][Java-Jars][MacOS] C++ libraries for MacOS AARCH 64 +* [GH-38077](https://github.com/apache/arrow/issues/38077) - [C++] Output bundled GoogleTest to ${BUILD_DIR}/${CONFIG} (#38132) +* [GH-38084](https://github.com/apache/arrow/issues/38084) - [R] Do not memory map when explicitly checking for file removal (#38085) +* [GH-38193](https://github.com/apache/arrow/issues/38193) - [CI][Java] Free up disk space for "AMD64 manylinux2014 Java JNI" (#38194) +* [GH-38197](https://github.com/apache/arrow/issues/38197) - [R] Update actions that used setup-r@v1 to use setup-r@v2 (#38218) +* [GH-38200](https://github.com/apache/arrow/issues/38200) - [CI][Release][Go] Ensure removing all module caches (#38222) +* [GH-38201](https://github.com/apache/arrow/issues/38201) - [CI][Packaging] Pin zlib 1.2.13 when using thrift on conan (#38202) +* [GH-38206](https://github.com/apache/arrow/issues/38206) - [CI] Remove more pre-installed files (#38233) +* [GH-38226](https://github.com/apache/arrow/issues/38226) - [R] Remove R 3.5 from test-r-versions (#38230) +* [GH-38227](https://github.com/apache/arrow/issues/38227) - [R] Fix non-unicode character errors in nightly builds (#38232) +* 
[GH-38228](https://github.com/apache/arrow/issues/38228) - [R] Fence examples that need dataset with `examplesIf` (#38229) +* [GH-38239](https://github.com/apache/arrow/issues/38239) - [CI][Python] Disable -W error on Python CI jobs temporarily (#38238) +* [GH-38263](https://github.com/apache/arrow/issues/38263) - [C++] Prefer to call string_view::data() instead of begin() where a char pointer is expected (#38265) +* [GH-38282](https://github.com/apache/arrow/issues/38282) - [C++] Implement ReplaceString with the right type signature (#38283) +* [GH-38286](https://github.com/apache/arrow/issues/38286) - [CI][R] Clean GitHub runner disk for ubuntu-r-only-r images (#38287) +* [GH-38293](https://github.com/apache/arrow/issues/38293) - [R] Fix non-deterministic duckdb test (#38294) +* [GH-38295](https://github.com/apache/arrow/issues/38295) - [CI][R] Free up disk space for Azure Pipelines jobs (#38302) +* [GH-38332](https://github.com/apache/arrow/issues/38332) - [CI][Release] Resolve symlinks in RAT lint (#38337) + + +## New Features and Improvements + +* [GH-20086](https://github.com/apache/arrow/issues/20086) - [C++] Cast between fixed size and variable size lists (#37292) +* [GH-21815](https://github.com/apache/arrow/issues/21815) - [JS] Add support for Duration type (#37341) +* [GH-24868](https://github.com/apache/arrow/issues/24868) - [C++] Add a Tensor logical value type with varying dimensions, implemented using ExtensionType (#37166) +* [GH-25659](https://github.com/apache/arrow/issues/25659) - [Java] Add DefaultVectorComparators for Large types (#37887) +* [GH-29184](https://github.com/apache/arrow/issues/29184) - [R] Read CSV with comma as decimal mark (#38002) +* [GH-29238](https://github.com/apache/arrow/issues/29238) - [C++][Dataset][Parquet] Support parquet modular encryption in the new Dataset API (#34616) +* [GH-29847](https://github.com/apache/arrow/issues/29847) - [C++] Build with Azure SDK for C++ (#36835) +* 
[GH-32863](https://github.com/apache/arrow/issues/32863) - [C++][Parquet] Add DELTA_BYTE_ARRAY encoder to Parquet writer (#14341) +* [GH-33032](https://github.com/apache/arrow/issues/33032) - [C#] Support fixed-size lists (#35716) +* [GH-33749](https://github.com/apache/arrow/issues/33749) - [Ruby] Add Arrow::RecordBatch#each_raw_record (#37137) +* [GH-33985](https://github.com/apache/arrow/issues/33985) - [C++] Add substrait serialization/deserialization for expressions (#34834) +* [GH-34031](https://github.com/apache/arrow/issues/34031) - [Python] Use PyCapsule for communicating C Data Interface pointers at the Python level +* [GH-34105](https://github.com/apache/arrow/issues/34105) - [R] Provide extra output for failed builds (#37727) +* [GH-34213](https://github.com/apache/arrow/issues/34213) - [C++] Use recursive calls without a delimiter if the user is doing a recursive GetFileInfo (#35440) +* [GH-34252](https://github.com/apache/arrow/issues/34252) - [Java] Support ScannerBuilder::Project or ScannerBuilder::Filter as a Substrait proto extended expression (#35570) +* [GH-34588](https://github.com/apache/arrow/issues/34588) - [C++][Python] Add a MetaFunction for "dictionary_decode" (#35356) +* [GH-34620](https://github.com/apache/arrow/issues/34620) - [C#] Support DateOnly and TimeOnly on .NET 6.0+ (#36125) +* [GH-34950](https://github.com/apache/arrow/issues/34950) - [C++][Parquet] Support encryption for page index (#36574) +* [GH-35116](https://github.com/apache/arrow/issues/35116) - [CI][C++] Enable compile-time AVX2 on some CI platforms (#36662) +* [GH-35176](https://github.com/apache/arrow/issues/35176) - [C++] Add support for disabling threading for emscripten (#35672) +* [GH-35243](https://github.com/apache/arrow/issues/35243) - [C#] Implement MapType (#37885) +* [GH-35273](https://github.com/apache/arrow/issues/35273) - [C++] Add integer round kernels (#36289) +* [GH-35287](https://github.com/apache/arrow/issues/35287) - [C++][Parquet] Add CodecOptions 
to customize the compression parameter (#35886) +* [GH-35296](https://github.com/apache/arrow/issues/35296) - [Go] Add arrow.Table.String() (#35580) +* [GH-35409](https://github.com/apache/arrow/issues/35409) - [Python][Docs] Clarify S3FileSystem Credentials chain for EC2 (#35312) +* [GH-35531](https://github.com/apache/arrow/issues/35531) - [Python] C Data Interface PyCapsule Protocol (#37797) +* [GH-35600](https://github.com/apache/arrow/issues/35600) - [Python] Allow setting path to timezone db through python API (#37436) +* [GH-35623](https://github.com/apache/arrow/issues/35623) - [C++][Python] FixedShapeTensorType.ToString() should print the type's parameters (#36496) +* [GH-35627](https://github.com/apache/arrow/issues/35627) - [Format][Integration] Add string-view to arrow format (#37526) +* [GH-35698](https://github.com/apache/arrow/issues/35698) - [C#] Update FlatBuffers (#35699) +* [GH-35740](https://github.com/apache/arrow/issues/35740) - Add documentation for list arrays' values property (#35865) +* [GH-35775](https://github.com/apache/arrow/issues/35775) - [Go][Parquet] Allow key value file metadata to be written after writing row groups (#37786) +* [GH-35903](https://github.com/apache/arrow/issues/35903) - [C++] Skeleton for Azure Blob Storage filesystem implementation (#35701) +* [GH-35916](https://github.com/apache/arrow/issues/35916) - [Java][arrow-jdbc] Add extra fields to JdbcFieldInfo (#37123) +* [GH-35934](https://github.com/apache/arrow/issues/35934) - [C++][Parquet] PageIndex Read benchmark (#36702) +* [GH-36078](https://github.com/apache/arrow/issues/36078) - [C#] Flight SQL implementation for C# (#36079) +* [GH-36103](https://github.com/apache/arrow/issues/36103) - [C++] Initial device sync API (#37040) +* [GH-36111](https://github.com/apache/arrow/issues/36111) - [C++] Refactor dict_internal.h to use Result (#37754) +* [GH-36124](https://github.com/apache/arrow/issues/36124) - [C++] Export compile_commands.json by default (#37426) +* 
[GH-36155](https://github.com/apache/arrow/issues/36155) - [C++][Go][Java][FlightRPC] Add support for long-running queries (#36946) +* [GH-36187](https://github.com/apache/arrow/issues/36187) - [C++] Display the name of the problematic field when returning status "Data type ... is not supported in join non-key field" for HashJoin (#36539) +* [GH-36199](https://github.com/apache/arrow/issues/36199) - [Python][CI][Spark] Update spark versions used on our nightly tests (#36347) +* [GH-36240](https://github.com/apache/arrow/issues/36240) - [Python] Refactor CumulativeSumOptions to a separate class for independent deprecation (#36977) +* [GH-36247](https://github.com/apache/arrow/issues/36247) - [R] Add write_csv_dataset (#36436) +* [GH-36326](https://github.com/apache/arrow/issues/36326) - [C++] Remove APIs deprecated in v9.0 or earlier (#36675) +* [GH-36363](https://github.com/apache/arrow/issues/36363) - [MATLAB] Create proxy classes for the DataType class hierarchy (#36419) +* [GH-36417](https://github.com/apache/arrow/issues/36417) - [C++] Add Buffer::data_as, Buffer::mutable_data_as (#36418) +* [GH-36420](https://github.com/apache/arrow/issues/36420) - [C++] Add An Enum Option For SetLookup Options (#36739) +* [GH-36433](https://github.com/apache/arrow/issues/36433) - [C++] Update fast_float version to 3.10.1 (#36434) +* [GH-36469](https://github.com/apache/arrow/issues/36469) - [Java][Packaging] Distribute linux aarch64 libs with mavencentral jars (#36487) +* [GH-36488](https://github.com/apache/arrow/issues/36488) - [C++] Import/Export ArrowDeviceArray (#36489) +* [GH-36511](https://github.com/apache/arrow/issues/36511) - [C++][FlightRPC] Get rid of GRPCPP_PP_INCLUDE (#36679) +* [GH-36512](https://github.com/apache/arrow/issues/36512) - [C++][FlightRPC] Add async GetFlightInfo client call (#36517) +* [GH-36546](https://github.com/apache/arrow/issues/36546) - [Swift] The initial implementation for swift arrow flight (#36547) +* 
[GH-36570](https://github.com/apache/arrow/issues/36570) - [Dev] Add "Component: Swift" label to PRs (#36571) +* [GH-36573](https://github.com/apache/arrow/issues/36573) - [CI] Remove Travis CI related files and mentions (#36741) +* [GH-36590](https://github.com/apache/arrow/issues/36590) - [Docs] Support Pydata Sphinx Theme 0.14.0 (#36591) +* [GH-36601](https://github.com/apache/arrow/issues/36601) - [MATLAB] Add a MATLAB "type traits" class hierarchy (#36653) +* [GH-36614](https://github.com/apache/arrow/issues/36614) - [MATLAB] Subclass arrow::Buffer to keep MATLAB data backing arrow::Arrays alive (#36615) +* [GH-36618](https://github.com/apache/arrow/issues/36618) - [C++] Add a test for evaluation of ARROW_CHECK payload (#36617) +* [GH-36621](https://github.com/apache/arrow/issues/36621) - [C++] Add documentation for ACERO_ALIGNMENT_HANDLING (#36622) +* [GH-36623](https://github.com/apache/arrow/issues/36623) - [Go] NullType support for csv (#36624) +* [GH-36642](https://github.com/apache/arrow/issues/36642) - [Python][CI] Configure warnings as errors during pytest (#37018) +* [GH-36643](https://github.com/apache/arrow/issues/36643) - [C++][Parquet] Use nested namespace in parquet (#36647) +* [GH-36652](https://github.com/apache/arrow/issues/36652) - [MATLAB] Initialize the `Type` property of `arrow.array.Array` subclasses from existing proxy ids (#36731) +* [GH-36666](https://github.com/apache/arrow/issues/36666) - [Python][CI] Re-enable skipped dask test_pandas_timestamp_overflow_pyarrow test (#38066) +* [GH-36671](https://github.com/apache/arrow/issues/36671) - [Go] BinaryMemoTable optimize allocations of GetOrInsert (#36811) +* [GH-36672](https://github.com/apache/arrow/issues/36672) - [Python][C++] Add support for vector function UDF (#36673) +* [GH-36674](https://github.com/apache/arrow/issues/36674) - [C++] Use anonymous namespace in arrow/ipc/reader.cc (#36937) +* [GH-36696](https://github.com/apache/arrow/issues/36696) - [Go] Improve the MapOf and 
ListOf helpers (#36697) +* [GH-36698](https://github.com/apache/arrow/issues/36698) - [Go][Parquet] Add a TimestampLogicalType creation function … (#36699) +* [GH-36709](https://github.com/apache/arrow/issues/36709) - [Python] Allow to specify use_threads=False in Table.group_by to have stable ordering (#36768) +* [GH-36734](https://github.com/apache/arrow/issues/36734) - [MATLAB] template arrow::matlab::proxy::NumericArray on ArrowType instead of CType (#36738) +* [GH-36735](https://github.com/apache/arrow/issues/36735) - Add `TimeUnit` and `TimeZone` to the `arrow.type.TimestampType` display (#36871) +* [GH-36750](https://github.com/apache/arrow/issues/36750) - [R] Fix test-r-devdocs on MacOS (#36751) +* [GH-36752](https://github.com/apache/arrow/issues/36752) - [Python] Remove AWS SDK bundling when building wheels (#36925) +* [GH-36762](https://github.com/apache/arrow/issues/36762) - [Dev] Remove only component labels when an issue is updated (#36763) +* [GH-36765](https://github.com/apache/arrow/issues/36765) - [Python][Dataset] Change default of pre_buffer to True for reading Parquet files (#37854) +* [GH-36767](https://github.com/apache/arrow/issues/36767) - [C++][CI] Fix test failure on i386 (#36769) +* [GH-36770](https://github.com/apache/arrow/issues/36770) - [C++] Use custom endpoint for s3 using environment variable AWS_ENDPOINT_URL (#36791) +* [GH-36773](https://github.com/apache/arrow/issues/36773) - [C++][Parquet] Avoid calculating prebuffer column bitmap multiple times (#36774) +* [GH-36789](https://github.com/apache/arrow/issues/36789) - [C++] Support divide(duration, duration) (#36800) +* [GH-36793](https://github.com/apache/arrow/issues/36793) - [Go] Allow NewSchemaFromStruct to skip fields if tagged with parquet:"-" (#36794) +* [GH-36795](https://github.com/apache/arrow/issues/36795) - [C#] Implement support for dense and sparse unions (#36797) +* [GH-36816](https://github.com/apache/arrow/issues/36816) - [C#] Reduce allocations (#36817) +* 
[GH-36824](https://github.com/apache/arrow/issues/36824) - [C++] Improve the test tracing of CheckWithDifferentShapes in the if-else kernel tests (#36825) +* [GH-36837](https://github.com/apache/arrow/issues/36837) - [CI][RPM] Use multi-cores to install gems (#36838) +* [GH-36843](https://github.com/apache/arrow/issues/36843) - [Python][Docs] Add dict to docstring (#36842) +* [GH-36845](https://github.com/apache/arrow/issues/36845) - [C++][Python] Allow type promotion on `pa.concat_tables` (#36846) +* [GH-36852](https://github.com/apache/arrow/issues/36852) - [MATLAB] Add `arrow.type.Field` class (#36855) +* [GH-36853](https://github.com/apache/arrow/issues/36853) - [MATLAB] Add utility to create proxies from existing `arrow::DataType` objects (#36873) +* [GH-36867](https://github.com/apache/arrow/issues/36867) - [C++] Add a struct_ and schema overload taking a vector of (name, type) pairs (#36915) +* [GH-36874](https://github.com/apache/arrow/issues/36874) - [MATLAB] Move type constructor functions from the `arrow.type` package to `arrow` package (#36875) +* [GH-36882](https://github.com/apache/arrow/issues/36882) - [C++][Parquet] Use RLE as BOOLEAN default encoding when both data page and version is V2 (#38163) +* [GH-36882](https://github.com/apache/arrow/issues/36882) - [C++][Parquet] Default RLE for bool values in the parquet version 2.x (#36955) +* [GH-36885](https://github.com/apache/arrow/issues/36885) - [Java][Docs] Add substrait dependency to maven build profiles (#36899) +* [GH-36886](https://github.com/apache/arrow/issues/36886) - [C++] Configure `azurite` in preparation for testing Azure C++ filesystem (#36988) +* [GH-36893](https://github.com/apache/arrow/issues/36893) - [Go][Flight] Expose underlying protobuf definitions (#36895) +* [GH-36905](https://github.com/apache/arrow/issues/36905) - [C++] Add support for SparseUnion to selection functions (#36906) +* [GH-36927](https://github.com/apache/arrow/issues/36927) - [Java][Docs] Enable Gandiva build 
as part of Java maven commands (#36929) +* [GH-36931](https://github.com/apache/arrow/issues/36931) - [C++] Add cumulative_mean function (#36932) +* [GH-36933](https://github.com/apache/arrow/issues/36933) - [Python] Pointless ellipsis in array repr (#37168) +* [GH-36936](https://github.com/apache/arrow/issues/36936) - [Go] Make it possible to register custom functions. (#36959) +* [GH-36944](https://github.com/apache/arrow/issues/36944) - [C++] Unify OpenSSL detection for building GCS (#36945) +* [GH-36950](https://github.com/apache/arrow/issues/36950) - [C++] Change std::vector<std::shared_ptr<Field>> to use its alias: FieldVector (#37101) +* [GH-36952](https://github.com/apache/arrow/issues/36952) - [C++][FlightRPC][Python] Add methods to send headers (#36956) +* [GH-36953](https://github.com/apache/arrow/issues/36953) - [MATLAB] Add gateway `arrow.array` function to create Arrow Arrays from MATLAB data (#36978) +* [GH-36961](https://github.com/apache/arrow/issues/36961) - [MATLAB] Add `arrow.tabular.Schema` class and associated `arrow.schema` construction function (#37013) +* [GH-36970](https://github.com/apache/arrow/issues/36970) - [C++][Parquet] Minor style fix for parquet metadata (#36971) +* [GH-36984](https://github.com/apache/arrow/issues/36984) - [MATLAB] Create `arrow.recordbatch` convenience constructor function (#37025) +* [GH-36990](https://github.com/apache/arrow/issues/36990) - [R] Expose Parquet ReaderProperties (#36992) +* [GH-36994](https://github.com/apache/arrow/issues/36994) - [Java] Use JDK 21 in CI (#38219) +* [GH-37012](https://github.com/apache/arrow/issues/37012) - [MATLAB] Remove the private property `ArrowArrays` from `arrow.tabular.RecordBatch` (#37015) +* [GH-37014](https://github.com/apache/arrow/issues/37014) - [C++][Parquet] Preserve some Parquet distinct counts when merging stats (#37016) +* [GH-37021](https://github.com/apache/arrow/issues/37021) - [Java][arrow-jdbc] Pluggable getConsumer (#37085) +* 
[GH-37028](https://github.com/apache/arrow/issues/37028) - [C++] Add support for duration types to if_else functions (#37064) +* [GH-37041](https://github.com/apache/arrow/issues/37041) - [MATLAB] Implement Feather V1 Reader using new MATLAB Interface APIs (#37044) +* [GH-37042](https://github.com/apache/arrow/issues/37042) - [MATLAB] Implement Feather V1 Writer using new MATLAB Interface APIs (#37043) +* [GH-37045](https://github.com/apache/arrow/issues/37045) - [MATLAB] Implement featherwrite in terms of arrow.internal.io.feather.Writer (#37047) +* [GH-37046](https://github.com/apache/arrow/issues/37046) - [MATLAB] Implement `featherread` in terms of `arrow.internal.io.feather.Reader` (#37163) +* [GH-37049](https://github.com/apache/arrow/issues/37049) - [MATLAB] Update feather `Reader` and `Writer` objects to work directly with `arrow.tabular.RecordBatch`s instead of MATLAB `table`s (#37052) +* [GH-37051](https://github.com/apache/arrow/issues/37051) - [Dev][JS] Add Dependabot configuration for npm (#37053) +* [GH-37073](https://github.com/apache/arrow/issues/37073) - [Java] JDBC: Only use username/pass auth if token is not provided (#37083) +* [GH-37093](https://github.com/apache/arrow/issues/37093) - [Python] Add async Flight client with GetFlightInfo (#36986) +* [GH-37096](https://github.com/apache/arrow/issues/37096) - [MATLAB] Add utility which makes valid MATLAB table variable names from an arbitrary list of strings (#37098) +* [GH-37124](https://github.com/apache/arrow/issues/37124) - [MATLAB] Add utility functions for validating numeric and string index values (#37150) +* [GH-37128](https://github.com/apache/arrow/issues/37128) - [Java] Bump CI job from JDK 18 to JDK 20 (#37125) +* [GH-37141](https://github.com/apache/arrow/issues/37141) - [GLib][FlightRPC] Add more ArrowFlight::ClientOptions properties (#37142) +* [GH-37143](https://github.com/apache/arrow/issues/37143) - [GLib][FlightSQL] Add support for prepared INSERT (#37196) +* 
[GH-37144](https://github.com/apache/arrow/issues/37144) - [C++] Add RecordBatchFileReader::To{RecordBatches,Table} (#37167) +* [GH-37145](https://github.com/apache/arrow/issues/37145) - [Python] support boolean columns with bitsize 1 in from_dataframe (#37975) +* [GH-37151](https://github.com/apache/arrow/issues/37151) - [MATLAB] Use `makeValidVariableNames` and `makeValidDimensionNames` in implementation of `table` method for `RecordBatch` (#37152) +* [GH-37155](https://github.com/apache/arrow/issues/37155) - [MATLAB] Use `arrow.internal.validate.index.numeric()` in the `column()` method of `arrow.tabular.RecordBatch` (#37156) +* [GH-37157](https://github.com/apache/arrow/issues/37157) - [MATLAB] Use `arrow.internal.validate.index.numericOrString()` in the `field()` method of `arrow.tabular.Schema` (#37162) +* [GH-37160](https://github.com/apache/arrow/issues/37160) - [MATLAB] `arrow.internal.validate.index.string()` should not error if given a string with zero characters (#37161) +* [GH-37170](https://github.com/apache/arrow/issues/37170) - [C++] Support schema rewriting of RecordBatch. 
(#37171) +* [GH-37175](https://github.com/apache/arrow/issues/37175) - [MATLAB] Support creating `arrow.tabular.RecordBatch` instances from a list of `arrow.array.Array` values (#37176) +* [GH-37179](https://github.com/apache/arrow/issues/37179) - [MATLAB] Add a test utility that creates a MATLAB `table` containing all supported types (#37191) +* [GH-37181](https://github.com/apache/arrow/issues/37181) - [MATLAB] Remove outdated test class `tArrowCppCall.m` (#37185) +* [GH-37182](https://github.com/apache/arrow/issues/37182) - [MATLAB] Add public `Schema` property to MATLAB `arrow.tabular.RecordBatch` class (#37184) +* [GH-37187](https://github.com/apache/arrow/issues/37187) - [MATLAB] Re-implement `tfeathermex.m` tests in terms of new internal Feather Reader and Writer objects (#37189) +* [GH-37188](https://github.com/apache/arrow/issues/37188) - [MATLAB] Move `test/util/featherRoundTrip.m` into a packaged test utility function (#37190) +* [GH-37203](https://github.com/apache/arrow/issues/37203) - [MATLAB] Remove unused feather V1 MEX infrastructure and code (#37204) +* [GH-37209](https://github.com/apache/arrow/issues/37209) - [CI][Docs][MATLAB] Remove support for `MATLAB_ARROW_INTERFACE` flag from CMake build system and build new MATLAB Interface code by default (#37211) +* [GH-37210](https://github.com/apache/arrow/issues/37210) - [Docs][MATLAB] Update MATLAB `README.md` to mention support for new MATLAB APIs (e.g. `RecordBatch`, `Field`, `Schema`, etc.) 
(#37215) +* [GH-37212](https://github.com/apache/arrow/issues/37212) - [C++] IO: Add FromString to ::arrow::io::BufferReader (#37360) +* [GH-37216](https://github.com/apache/arrow/issues/37216) - [Docs] adding documentation to deal with unreleased allocators (#37498) +* [GH-37222](https://github.com/apache/arrow/issues/37222) - [Docs][MATLAB] Rename `arrow.recordbatch` (all lowercase) to `arrow.recordBatch` (camelCase) (#37223) +* [GH-37228](https://github.com/apache/arrow/issues/37228) - [MATLAB] Add C++ `ARROW_MATLAB_EXPORT` symbol export macro (#37233) +* [GH-37229](https://github.com/apache/arrow/issues/37229) - [MATLAB] Add `arrow.type.Date32Type` class and `arrow.date32` construction function (#37348) +* [GH-37230](https://github.com/apache/arrow/issues/37230) - [MATLAB] Add `arrow.type.Date64Type` class and `arrow.date64` construction function (#37578) +* [GH-37231](https://github.com/apache/arrow/issues/37231) - [MATLAB] Add `arrow.type.Time32Type` class and `arrow.time32` construction function (#37250) +* [GH-37232](https://github.com/apache/arrow/issues/37232) - [MATLAB] Add `arrow.type.Time64Type` class and `arrow.time64` construction function (#37287) +* [GH-37234](https://github.com/apache/arrow/issues/37234) - [MATLAB] Create an abstract `arrow.type.TemporalType` class (#37236) +* [GH-37237](https://github.com/apache/arrow/issues/37237) - [C++] Set extraction time to all downloaded contents timestamp (#37238) +* [GH-37244](https://github.com/apache/arrow/issues/37244) - [Python] Remove support for pickle5 (#37644) +* [GH-37246](https://github.com/apache/arrow/issues/37246) - [Java] expose VectorAppender class to offer support to append vector values (#37247) +* [GH-37251](https://github.com/apache/arrow/issues/37251) - [MATLAB] Make `arrow.type.TemporalType` a "tag" class (#37256) +* [GH-37252](https://github.com/apache/arrow/issues/37252) - [MATLAB] Add `arrow.type.DateUnit` enumeration class (#37280) +* 
[GH-37253](https://github.com/apache/arrow/issues/37253) - [MATLAB] Add test cases which verify that the `NumFields`, `BitWidth`, and `ID` properties can not be modified to `hFixedWidth` test class (#37316) +* [GH-37254](https://github.com/apache/arrow/issues/37254) - [Python] Parametrize all pickling tests to use both the pickle and cloudpickle modules (#37255) +* [GH-37257](https://github.com/apache/arrow/issues/37257) - [Ruby][FlightSQL] Use the same options for auto prepared statement close request (#37258) +* [GH-37259](https://github.com/apache/arrow/issues/37259) - [Ruby] Add explicit csv gem dependency (#37506) +* [GH-37262](https://github.com/apache/arrow/issues/37262) - [MATLAB] Add an abstract class called `arrow.type.TimeType` (#37279) +* [GH-37268](https://github.com/apache/arrow/issues/37268) - [C++] adding move in some ctor in fs and dataset (#37264) +* [GH-37273](https://github.com/apache/arrow/issues/37273) - [C++] Bump vendored xxhash version (#37275) +* [GH-37290](https://github.com/apache/arrow/issues/37290) - [MATLAB] Add `arrow.array.Time32Array` class (#37315) +* [GH-37293](https://github.com/apache/arrow/issues/37293) - [C++][Parquet] Encoding: Add Benchmark for DELTA_BYTE_ARRAY (#37641) +* [GH-37306](https://github.com/apache/arrow/issues/37306) - [Go] Add binary dictionary unifier (#37309) +* [GH-37307](https://github.com/apache/arrow/issues/37307) - [Python][CI] Manually skip tests with skip_with_pyarrow_strings marker for nightly dask integration tests (#37324) +* [GH-37330](https://github.com/apache/arrow/issues/37330) - [Docs][CI] Increase the Timeout for the Sphinx build (#37331) +* [GH-37334](https://github.com/apache/arrow/issues/37334) - [Packaging][Release][RPM] Don't remove old repodata/* (#37351) +* [GH-37337](https://github.com/apache/arrow/issues/37337) - [MATLAB] Add `arrow.array.Time64Array` class (#37368) +* [GH-37345](https://github.com/apache/arrow/issues/37345) - [MATLAB] Add function handle to `fromMATLAB` static 
construction methods to `TypeTraits` classes (#37370) +* [GH-37364](https://github.com/apache/arrow/issues/37364) - [C++][GPU] Add CUDA impl of Device Event/Stream (#37365) +* [GH-37367](https://github.com/apache/arrow/issues/37367) - [MATLAB] Add `arrow.array.Date32Array` class (#37445) +* [GH-37379](https://github.com/apache/arrow/issues/37379) - [C++][Parquet] Thrift: Generate movable types (#37461) +* [GH-37384](https://github.com/apache/arrow/issues/37384) - [R] Set _R_CHECK_STOP_ON_INVALID_NUMERIC_VERSION_INPUTS_ = TRUE on CI (#37385) +* [GH-37391](https://github.com/apache/arrow/issues/37391) - [MATLAB] Implement the `isequal()` method on `arrow.array.Array` (#37446) +* [GH-37392](https://github.com/apache/arrow/issues/37392) - [JS] Remove lerna (#37393) +* [GH-37394](https://github.com/apache/arrow/issues/37394) - [C++][S3] Use AWS_SDK_VERSION_* instead of try_compile() (#37395) +* [GH-37416](https://github.com/apache/arrow/issues/37416) - [Go] Allow accessing underlying index builder of dictionary builders (#37417) +* [GH-37434](https://github.com/apache/arrow/issues/37434) - [C++] IO: Refactor BufferedInputStream::Read for small input (#37460) +* [GH-37440](https://github.com/apache/arrow/issues/37440) - [C#][Docs] Add Flight SQL supported functions to status.rst (#37441) +* [GH-37447](https://github.com/apache/arrow/issues/37447) - [C++][Docs] Document `ARROW_SUBSTRAIT` CMake flag (#37451) +* [GH-37448](https://github.com/apache/arrow/issues/37448) - [MATLAB] Add `arrow.array.ChunkedArray` class (#37525) +* [GH-37465](https://github.com/apache/arrow/issues/37465) - [Go] Add Value method to BooleanBuilder (#37459) +* [GH-37472](https://github.com/apache/arrow/issues/37472) - [MATLAB] Implement the `isequal()` method on `arrow.type.Type` (#37474) +* [GH-37473](https://github.com/apache/arrow/issues/37473) - [MATLAB] Add support for indexing `RecordBatch` columns by `Field` name (#37475) +* [GH-37477](https://github.com/apache/arrow/issues/37477) - [MATLAB] 
Add `AllowNonScalar` name-value pair to arrow.internal.validate.index.* validation functions (#37482) +* [GH-37510](https://github.com/apache/arrow/issues/37510) - [C++] Don't install bundled Azure SDK for C++ (#38176) +* [GH-37532](https://github.com/apache/arrow/issues/37532) - [CI][Docs][MATLAB] Remove `GoogleTest` support from the CMake build system for the MATLAB interface (#37784) +* [GH-37537](https://github.com/apache/arrow/issues/37537) - [Integration][C++] Add C Data Interface integration testing (#37769) +* [GH-37553](https://github.com/apache/arrow/issues/37553) - [Java] Allow FlightInfo#Schema to be nullable for long-running queries (#37528) +* [GH-37562](https://github.com/apache/arrow/issues/37562) - [Ruby] Add support for table.each_raw_record.to_a (#37600) +* [GH-37567](https://github.com/apache/arrow/issues/37567) - [C++] Migrate JSON Integration code to Result<> (#37573) +* [GH-37568](https://github.com/apache/arrow/issues/37568) - [MATLAB] Implement `isequal` for the `arrow.tabular.Schema` MATLAB class (#37619) +* [GH-37569](https://github.com/apache/arrow/issues/37569) - [MATLAB] Implement `isequal` for the `arrow.type.Field` MATLAB class (#37617) +* [GH-37570](https://github.com/apache/arrow/issues/37570) - [MATLAB] Implement `isequal` for the `arrow.tabular.RecordBatch` MATLAB class (#37627) +* [GH-37571](https://github.com/apache/arrow/issues/37571) - [MATLAB] Add `arrow.tabular.Table` MATLAB class (#37620) +* [GH-37572](https://github.com/apache/arrow/issues/37572) - [MATLAB] Add `arrow.array.Date64Array` class (#37581) +* [GH-37584](https://github.com/apache/arrow/issues/37584) - [Go] Add value len function to string array (#37586) +* [GH-37587](https://github.com/apache/arrow/issues/37587) - [C++] Move integration machinery into its own directory and namespace (#37588) +* [GH-37591](https://github.com/apache/arrow/issues/37591) - [MATLAB] Make `arrow.type.Type` inherit from `matlab.mixin.Heterogeneous` (#37593) +* 
[GH-37597](https://github.com/apache/arrow/issues/37597) - [MATLAB] Add `toMATLAB` method to `arrow.array.ChunkedArray` class (#37613) +* [GH-37628](https://github.com/apache/arrow/issues/37628) - [MATLAB] Implement `isequal` for the `arrow.tabular.Table` MATLAB class (#37629) +* [GH-37635](https://github.com/apache/arrow/issues/37635) - [Format][C++][Go] Add app_metadata to FlightInfo and FlightEndpoint (#37679) +* [GH-37636](https://github.com/apache/arrow/issues/37636) - [Go] Bump minimum go versions (#37637) +* [GH-37643](https://github.com/apache/arrow/issues/37643) - [C++] Enhance arrow::Datum::ToString (#37646) +* [GH-37651](https://github.com/apache/arrow/issues/37651) - [C#] expose ArrowArrayConcatenator.Concatenate (#37652) +* [GH-37653](https://github.com/apache/arrow/issues/37653) - [MATLAB] Add `arrow.array.StructArray` MATLAB class (#37806) +* [GH-37654](https://github.com/apache/arrow/issues/37654) - [MATLAB] Add `Fields` property to `arrow.type.Type` MATLAB class (#37725) +* [GH-37670](https://github.com/apache/arrow/issues/37670) - [C++] IO FileInterface extend from enable_shared_from_this (#37713) +* [GH-37681](https://github.com/apache/arrow/issues/37681) - [R] Update NEWS.md for 13.0.0.1 (#37682) +* [GH-37687](https://github.com/apache/arrow/issues/37687) - [Go] Don't copy in realloc when capacity is sufficient. 
(#37688) +* [GH-37694](https://github.com/apache/arrow/issues/37694) - [Go] Add SetNull to array builders (#37695) +* [GH-37701](https://github.com/apache/arrow/issues/37701) - [Java] Add default comparators for more types (#37748) +* [GH-37702](https://github.com/apache/arrow/issues/37702) - [Java] Add vector validation consistent with C++ (#37942) +* [GH-37703](https://github.com/apache/arrow/issues/37703) - [Java] Method for setting exact number of records in ListVector (#37838) +* [GH-37704](https://github.com/apache/arrow/issues/37704) - [Java] Add schema IPC serialization methods (#37778) +* [GH-37705](https://github.com/apache/arrow/issues/37705) - [Java] Extra input methods for VarChar writers (#37883) +* [GH-37705](https://github.com/apache/arrow/issues/37705) - [Java] Extra input methods for binary writers (#37791) +* [GH-37706](https://github.com/apache/arrow/issues/37706) - [Java] VarCharWriter should support writing from `Text` and `String` +* [GH-37722](https://github.com/apache/arrow/issues/37722) - [Java][FlightRPC] Deprecate stateful login methods (#37833) +* [GH-37724](https://github.com/apache/arrow/issues/37724) - [MATLAB] Add `arrow.type.StructType` MATLAB class (#37749) +* [GH-37742](https://github.com/apache/arrow/issues/37742) - [Python] Enable Cython 3 (#37743) +* [GH-37744](https://github.com/apache/arrow/issues/37744) - [Swift] Add test for arrow flight doGet FlightData (#37746) +* [GH-37770](https://github.com/apache/arrow/issues/37770) - [MATLAB] Add CSV `TableReader` and `TableWriter` MATLAB classes (#37773) +* [GH-37779](https://github.com/apache/arrow/issues/37779) - [Go] Link to the pkg.go.dev site for Go reference docs (#37780) +* [GH-37782](https://github.com/apache/arrow/issues/37782) - [C++] Add `CanReferenceFieldsByNames` method to `arrow::StructArray` (#37823) +* [GH-37789](https://github.com/apache/arrow/issues/37789) - [Integration][Go] Go C Data Interface integration testing (#37788) +* 
[GH-37795](https://github.com/apache/arrow/issues/37795) - [Java][FlightSQL] Add mock FlightSqlProducer and tests (#37837) +* [GH-37799](https://github.com/apache/arrow/issues/37799) - [C++] Compute: CommonTemporal support time32 and time64 casting (#37949) +* [GH-37825](https://github.com/apache/arrow/issues/37825) - [MATLAB] Improve `arrow.type.Field` display (#37826) +* [GH-37835](https://github.com/apache/arrow/issues/37835) - [MATLAB] Improve `arrow.tabular.Schema` display (#37836) +* [GH-37842](https://github.com/apache/arrow/issues/37842) - [R] Implement infer_schema.data.frame() (#37843) +* [GH-37849](https://github.com/apache/arrow/issues/37849) - [C++] Add cpp/src/**/*.cmake to cmake-format targets (#37850) +* [GH-37851](https://github.com/apache/arrow/issues/37851) - [C++] IPC: ArrayLoader style enhancement (#37872) +* [GH-37863](https://github.com/apache/arrow/issues/37863) - [Java] Add typed getters for StructVector (#37916) +* [GH-37864](https://github.com/apache/arrow/issues/37864) - [Java] Remove unnecessary throws from OrcReader (#37913) +* [GH-37873](https://github.com/apache/arrow/issues/37873) - [C++][Parquet] DELTA_BYTE_ARRAY: avoid copying data when possible (#37874) +* [GH-37876](https://github.com/apache/arrow/issues/37876) - [Format] Add list-view specification to arrow format (#37877) +* [GH-37880](https://github.com/apache/arrow/issues/37880) - [CI][Python][Packaging] Add support for Python 3.12 (#37901) +* [GH-37906](https://github.com/apache/arrow/issues/37906) - [Integration][C#] Implement C Data Interface integration testing for C# (#37904) +* [GH-37917](https://github.com/apache/arrow/issues/37917) - [Parquet] Add OpenAsync for FileSource (#37918) +* [GH-37923](https://github.com/apache/arrow/issues/37923) - [R] Move macOS build system to nixlibs.R (#37684) +* [GH-37934](https://github.com/apache/arrow/issues/37934) - [Doc][Integration] Document C Data Interface testing (#37935) +* 
[GH-37939](https://github.com/apache/arrow/issues/37939) - [C++] Use signed arithmetic for frame of reference when encoding DELTA_BINARY_PACKED (#37940) +* [GH-37941](https://github.com/apache/arrow/issues/37941) - [R][CI][Release] Add checksum verification for pre-compiled binaries (#38115) +* [GH-37945](https://github.com/apache/arrow/issues/37945) - [R] Update developer documentation (#38220) +* [GH-37971](https://github.com/apache/arrow/issues/37971) - [CI][Java] Don't use cache for nightly upload (#37980) +* [GH-37978](https://github.com/apache/arrow/issues/37978) - [C++] Add support for specifying custom Array element delimiter to `arrow::PrettyPrintOptions` (#37981) +* [GH-37984](https://github.com/apache/arrow/issues/37984) - [Release] Use ISO 8601 format for YAML date value (#37985) +* [GH-37994](https://github.com/apache/arrow/issues/37994) - [R] Create wrapper functions for the CSV*Options classes (#37995) +* [GH-37996](https://github.com/apache/arrow/issues/37996) - [MATLAB] Add a static constructor method named `fromMATLAB` to `arrow.array.StructArray` (#37998) +* [GH-38005](https://github.com/apache/arrow/issues/38005) - [Java] disable the debug log when running Java tests (#38006) +* [GH-38015](https://github.com/apache/arrow/issues/38015) - [MATLAB] Add `arrow.buffer.Buffer` class to the MATLAB Interface (#38020) +* [GH-38017](https://github.com/apache/arrow/issues/38017) - [Go][FlightSQL] Increment types handled by internal converter (#38028) +* [GH-38043](https://github.com/apache/arrow/issues/38043) - [R] Enable all features by default on macOS (#38195) +* [GH-38053](https://github.com/apache/arrow/issues/38053) - [C++][Go] Re-generate sources from Schema.fbs (#38054) +* [GH-38055](https://github.com/apache/arrow/issues/38055) - [C++] Don't find/use Threads::Threads with ARROW_ENABLE_THREADING=OFF (#38056) +* [GH-38063](https://github.com/apache/arrow/issues/38063) - [C++] Use absolute path for external project's ar/ranlib (#38064) +* 
[GH-38071](https://github.com/apache/arrow/issues/38071) - [C++][CI] Fix Overlap column chunk ranges for pre-buffer (#38073) +* [GH-38088](https://github.com/apache/arrow/issues/38088) - [R] Remove outdated references to brew and autobrew (#38089) +* [GH-38138](https://github.com/apache/arrow/issues/38138) - [R] Add curl to suggests for use of `skip_if_offline()` (#38140) +* [GH-38142](https://github.com/apache/arrow/issues/38142) - [R] Add NEWS for 14.0.0 (#38143) +* [GH-38145](https://github.com/apache/arrow/issues/38145) - [Docs][Python] Add tzdata on Windows subsection in Python install docs (#38146) +* [GH-38159](https://github.com/apache/arrow/issues/38159) - [CI][Release] Run only integration tests on integration test mode (#38177) +* [GH-38172](https://github.com/apache/arrow/issues/38172) - [CI][C++] Use system GoogleTest on Ubuntu 22.04 (#38173) +* [GH-38174](https://github.com/apache/arrow/issues/38174) - [C++] Update bundled Azure SDK for C++ to 1.10.3 (#38175) +* [GH-38209](https://github.com/apache/arrow/issues/38209) - [Docs] Reduce width of header items and keep header height default (small) on smaller screens (#38148) +* [GH-38240](https://github.com/apache/arrow/issues/38240) - [Docs] version_match should match the version from versions.json (#38241) +* [GH-38243](https://github.com/apache/arrow/issues/38243) - [CI][Python] Add missing dataset marker for dataset encryption tests (#38244) +* [GH-38285](https://github.com/apache/arrow/issues/38285) - [Go] Slight deps and docs update (#38284) +* [GH-38312](https://github.com/apache/arrow/issues/38312) - [Docs] Add the Arrow C Device data interface page to the sidebar TOC (#38313) +* [PARQUET-2323](https://issues.apache.org/jira/browse/PARQUET-2323) - [C++] Use bitmap to store pre-buffered column chunks (#36649) + + + # Apache Arrow 6.0.1 (2021-11-18) ## Bug Fixes