Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Is cargo fmt no longer working properly in parquet crate #6179

Closed
etseidl opened this issue Aug 1, 2024 · 6 comments · Fixed by #6328
Closed

Is cargo fmt no longer working properly in parquet crate #6179

etseidl opened this issue Aug 1, 2024 · 6 comments · Fixed by #6328
Labels
question Further information is requested

Comments

@etseidl
Copy link
Contributor

etseidl commented Aug 1, 2024

Which part is this question about
Code formatter.

Describe your question
I've noticed recently that running cargo fmt while I'm editing files doesn't always seem to catch problems. Running rustfmt directly will work. For instance, running cargo fmt on the parquet crate yields no output.

% cargo +stable fmt -p parquet -v -- --check
[bench (2021)] "/Users/seidl/src/arrow-rs/parquet/benches/arrow_reader.rs"
[bench (2021)] "/Users/seidl/src/arrow-rs/parquet/benches/arrow_statistics.rs"
[bench (2021)] "/Users/seidl/src/arrow-rs/parquet/benches/arrow_writer.rs"
[bench (2021)] "/Users/seidl/src/arrow-rs/parquet/benches/compression.rs"
[bench (2021)] "/Users/seidl/src/arrow-rs/parquet/benches/encoding.rs"
[bench (2021)] "/Users/seidl/src/arrow-rs/parquet/benches/metadata.rs"
[example (2021)] "/Users/seidl/src/arrow-rs/parquet/examples/async_read_parquet.rs"
[example (2021)] "/Users/seidl/src/arrow-rs/parquet/examples/read_parquet.rs"
[example (2021)] "/Users/seidl/src/arrow-rs/parquet/examples/read_with_rowgroup.rs"
[example (2021)] "/Users/seidl/src/arrow-rs/parquet/examples/write_parquet.rs"
[bin (2021)] "/Users/seidl/src/arrow-rs/parquet/src/bin/parquet-concat.rs"
[bin (2021)] "/Users/seidl/src/arrow-rs/parquet/src/bin/parquet-fromcsv.rs"
[bin (2021)] "/Users/seidl/src/arrow-rs/parquet/src/bin/parquet-index.rs"
[bin (2021)] "/Users/seidl/src/arrow-rs/parquet/src/bin/parquet-layout.rs"
[bin (2021)] "/Users/seidl/src/arrow-rs/parquet/src/bin/parquet-read.rs"
[bin (2021)] "/Users/seidl/src/arrow-rs/parquet/src/bin/parquet-rewrite.rs"
[bin (2021)] "/Users/seidl/src/arrow-rs/parquet/src/bin/parquet-rowcount.rs"
[bin (2021)] "/Users/seidl/src/arrow-rs/parquet/src/bin/parquet-schema.rs"
[bin (2021)] "/Users/seidl/src/arrow-rs/parquet/src/bin/parquet-show-bloom-filter.rs"
[lib (2021)] "/Users/seidl/src/arrow-rs/parquet/src/lib.rs"
[test (2021)] "/Users/seidl/src/arrow-rs/parquet/tests/arrow_reader/mod.rs"
[test (2021)] "/Users/seidl/src/arrow-rs/parquet/tests/arrow_writer_layout.rs"
rustfmt --edition 2021 --check /Users/seidl/src/arrow-rs/parquet/benches/arrow_reader.rs /Users/seidl/src/arrow-rs/parquet/benches/arrow_statistics.rs /Users/seidl/src/arrow-rs/parquet/benches/arrow_writer.rs /Users/seidl/src/arrow-rs/parquet/benches/compression.rs /Users/seidl/src/arrow-rs/parquet/benches/encoding.rs /Users/seidl/src/arrow-rs/parquet/benches/metadata.rs /Users/seidl/src/arrow-rs/parquet/examples/async_read_parquet.rs /Users/seidl/src/arrow-rs/parquet/examples/read_parquet.rs /Users/seidl/src/arrow-rs/parquet/examples/read_with_rowgroup.rs /Users/seidl/src/arrow-rs/parquet/examples/write_parquet.rs /Users/seidl/src/arrow-rs/parquet/src/bin/parquet-concat.rs /Users/seidl/src/arrow-rs/parquet/src/bin/parquet-fromcsv.rs /Users/seidl/src/arrow-rs/parquet/src/bin/parquet-index.rs /Users/seidl/src/arrow-rs/parquet/src/bin/parquet-layout.rs /Users/seidl/src/arrow-rs/parquet/src/bin/parquet-read.rs /Users/seidl/src/arrow-rs/parquet/src/bin/parquet-rewrite.rs /Users/seidl/src/arrow-rs/parquet/src/bin/parquet-rowcount.rs /Users/seidl/src/arrow-rs/parquet/src/bin/parquet-schema.rs /Users/seidl/src/arrow-rs/parquet/src/bin/parquet-show-bloom-filter.rs /Users/seidl/src/arrow-rs/parquet/src/lib.rs /Users/seidl/src/arrow-rs/parquet/tests/arrow_reader/mod.rs /Users/seidl/src/arrow-rs/parquet/tests/arrow_writer_layout.rs
%

But there are files that when run manually do (running the rustfmt command line above with the addition of parquet/src/compression.rs):

% rustfmt --edition 2021 --check /Users/seidl/src/arrow-rs/parquet/benches/arrow_reader.rs /Users/seidl/src/arrow-rs/parquet/benches/arrow_statistics.rs /Users/seidl/src/arrow-rs/parquet/benches/arrow_writer.rs /Users/seidl/src/arrow-rs/parquet/benches/compression.rs /Users/seidl/src/arrow-rs/parquet/benches/encoding.rs /Users/seidl/src/arrow-rs/parquet/benches/metadata.rs /Users/seidl/src/arrow-rs/parquet/examples/async_read_parquet.rs /Users/seidl/src/arrow-rs/parquet/examples/read_parquet.rs /Users/seidl/src/arrow-rs/parquet/examples/read_with_rowgroup.rs /Users/seidl/src/arrow-rs/parquet/examples/write_parquet.rs /Users/seidl/src/arrow-rs/parquet/src/bin/parquet-concat.rs /Users/seidl/src/arrow-rs/parquet/src/bin/parquet-fromcsv.rs /Users/seidl/src/arrow-rs/parquet/src/bin/parquet-index.rs /Users/seidl/src/arrow-rs/parquet/src/bin/parquet-layout.rs /Users/seidl/src/arrow-rs/parquet/src/bin/parquet-read.rs /Users/seidl/src/arrow-rs/parquet/src/bin/parquet-rewrite.rs /Users/seidl/src/arrow-rs/parquet/src/bin/parquet-rowcount.rs /Users/seidl/src/arrow-rs/parquet/src/bin/parquet-schema.rs /Users/seidl/src/arrow-rs/parquet/src/bin/parquet-show-bloom-filter.rs /Users/seidl/src/arrow-rs/parquet/src/lib.rs /Users/seidl/src/arrow-rs/parquet/tests/arrow_reader/mod.rs /Users/seidl/src/arrow-rs/parquet/tests/arrow_writer_layout.rs /Users/seidl/src/arrow-rs/parquet/src/compression.rs 
Diff in /Users/seidl/src/arrow-rs/parquet/src/compression.rs at line 150:
         CodecType::BROTLI(level) => {
             #[cfg(any(feature = "brotli", test))]
             return Ok(Some(Box::new(BrotliCodec::new(level))));
-            Err(ParquetError::General("Disabled feature at compile time: brotli".into()))
-        },
+            Err(ParquetError::General(
+                "Disabled feature at compile time: brotli".into(),
+            ))
+        }
         CodecType::GZIP(level) => {
             #[cfg(any(feature = "flate2", test))]
             return Ok(Some(Box::new(GZipCodec::new(level))));
Diff in /Users/seidl/src/arrow-rs/parquet/src/compression.rs at line 158:
-            Err(ParquetError::General("Disabled feature at compile time: flate2".into()))
-        },
+            Err(ParquetError::General(
+                "Disabled feature at compile time: flate2".into(),
+            ))
+        }
         CodecType::SNAPPY => {
             #[cfg(any(feature = "snap", test))]
             return Ok(Some(Box::new(SnappyCodec::new())));
Diff in /Users/seidl/src/arrow-rs/parquet/src/compression.rs at line 163:
-            Err(ParquetError::General("Disabled feature at compile time: snap".into()))
-        },
+            Err(ParquetError::General(
+                "Disabled feature at compile time: snap".into(),
+            ))
+        }
         CodecType::LZ4 => {
             #[cfg(any(feature = "lz4", test))]
             return Ok(Some(Box::new(LZ4HadoopCodec::new(
Diff in /Users/seidl/src/arrow-rs/parquet/src/compression.rs at line 168:
                 _options.backward_compatible_lz4,
             ))));
-            Err(ParquetError::General("Disabled feature at compile time: lz4".into()))
-        },
+            Err(ParquetError::General(
+                "Disabled feature at compile time: lz4".into(),
+            ))
+        }
         CodecType::ZSTD(level) => {
             #[cfg(any(feature = "zstd", test))]
             return Ok(Some(Box::new(ZSTDCodec::new(level))));
Diff in /Users/seidl/src/arrow-rs/parquet/src/compression.rs at line 175:
-            Err(ParquetError::General("Disabled feature at compile time: zstd".into()))
-        },
+            Err(ParquetError::General(
+                "Disabled feature at compile time: zstd".into(),
+            ))
+        }
         CodecType::LZ4_RAW => {
             #[cfg(any(feature = "lz4", test))]
             return Ok(Some(Box::new(LZ4RawCodec::new())));
Diff in /Users/seidl/src/arrow-rs/parquet/src/compression.rs at line 180:
-            Err(ParquetError::General("Disabled feature at compile time: lz4".into()))
-        },
+            Err(ParquetError::General(
+                "Disabled feature at compile time: lz4".into(),
+            ))
+        }
         CodecType::UNCOMPRESSED => Ok(None),
         _ => Err(nyi_err!("The codec type {} is not supported yet", codec)),
     }
%

Additional context
There are many reports of fmt silently failing, one such is rust-lang/rustfmt#3008.

Has anyone else noticed this or is it something to do with my setup.

%  cargo --version
cargo 1.80.0 (376290515 2024-07-16)
% rustfmt --version
rustfmt 1.7.0-stable (05147895 2024-07-21)
@etseidl etseidl added the question Further information is requested label Aug 1, 2024
@alamb
Copy link
Contributor

alamb commented Aug 2, 2024

Fascinatingly, I am seeing the same thing

For example I deliberately introduced a formatting issue:

andrewlamb@Andrews-MacBook-Pro-2:~/Software/arrow-rs$ git diff
diff --git a/parquet/src/compression.rs b/parquet/src/compression.rs
index 10560210e4e..119af7e156f 100644
--- a/parquet/src/compression.rs
+++ b/parquet/src/compression.rs
@@ -15,6 +15,10 @@
 // specific language governing permissions and limitations
 // under the License.

+
+
+
+
 //! Contains codec interface and supported codec implementations.
 //!
 //! See [`Compression`](crate::basic::Compression) enum for all available compression

And then when I ran fmt it didn't seem to fix it

andrewlamb@Andrews-MacBook-Pro-2:~/Software/arrow-rs$ cargo fmt
andrewlamb@Andrews-MacBook-Pro-2:~/Software/arrow-rs$ git diff
diff --git a/parquet/src/compression.rs b/parquet/src/compression.rs
index 10560210e4e..119af7e156f 100644
--- a/parquet/src/compression.rs
+++ b/parquet/src/compression.rs
@@ -15,6 +15,10 @@
 // specific language governing permissions and limitations
 // under the License.

+
+
+
+
 //! Contains codec interface and supported codec implementations.
 //!
 //! See [`Compression`](crate::basic::Compression) enum for all available compression
andrewlamb@Andrews-MacBook-Pro-2:~/Software/arrow-rs$ cargo --version
cargo 1.80.0 (376290515 2024-07-16)
andrewlamb@Andrews-MacBook-Pro-2:~/Software/arrow-rs$ rustfmt --version
rustfmt 1.7.0-stable (05147895 2024-07-21)
andrewlamb@Andrews-MacBook-Pro-2:~/Software/arrow-rs$

@alamb
Copy link
Contributor

alamb commented Aug 2, 2024

Interestingly I found that removing this line:

#[rustfmt::skip]

Make cargo fmt come back

Though it looks like (I) added that line 3 months ago: 98784bd 🤔

@Xuanwo
Copy link
Member

Xuanwo commented Aug 2, 2024

It's the first time for me to know #[rustfmt::skip] 😆

@etseidl
Copy link
Contributor Author

etseidl commented Aug 2, 2024

I tried reverting #5727 and still no output :(

@alamb alamb changed the title Is cargo fmt no longer working properly Is cargo fmt no longer working properly in parquet crate Aug 2, 2024
@etseidl
Copy link
Contributor Author

etseidl commented Aug 28, 2024

Still tracking this. Here's a list of non-compliant files in the parquet source (format.rs is expected).

% rustfmt --check -l `find . -name "*.rs"` |sort|uniq
/Users/seidl2/src/arrow-rs/parquet/src/arrow/array_reader/empty_array.rs
/Users/seidl2/src/arrow-rs/parquet/src/arrow/array_reader/fixed_size_list_array.rs
/Users/seidl2/src/arrow-rs/parquet/src/arrow/array_reader/list_array.rs
/Users/seidl2/src/arrow-rs/parquet/src/arrow/array_reader/map_array.rs
/Users/seidl2/src/arrow-rs/parquet/src/arrow/array_reader/struct_array.rs
/Users/seidl2/src/arrow-rs/parquet/src/arrow/schema/complex.rs
/Users/seidl2/src/arrow-rs/parquet/src/arrow/schema/mod.rs
/Users/seidl2/src/arrow-rs/parquet/src/compression.rs
/Users/seidl2/src/arrow-rs/parquet/src/encodings/encoding/dict_encoder.rs
/Users/seidl2/src/arrow-rs/parquet/src/encodings/levels.rs
/Users/seidl2/src/arrow-rs/parquet/src/encodings/rle.rs
/Users/seidl2/src/arrow-rs/parquet/src/format.rs
/Users/seidl2/src/arrow-rs/parquet/src/util/bit_util.rs
/Users/seidl2/src/arrow-rs/parquet/src/util/interner.rs
/Users/seidl2/src/arrow-rs/parquet/src/util/test_common/file_util.rs
/Users/seidl2/src/arrow-rs/parquet/src/util/test_common/mod.rs
/Users/seidl2/src/arrow-rs/parquet/src/util/test_common/rand_gen.rs

@etseidl
Copy link
Contributor Author

etseidl commented Aug 29, 2024

Ah, so apparently rustfmt will not format modules that are declared in macros, such as those in the experimental! macro

arrow-rs/parquet/src/lib.rs

Lines 132 to 137 in a937869

experimental!(#[macro_use] mod util);
#[cfg(feature = "arrow")]
pub mod arrow;
pub mod column;
experimental!(mod compression);
experimental!(mod encodings);

There is an open issue for this rust-lang/rustfmt#3253

One workaround is to tack a list of files to format to the end of the commandline

cargo +stable fmt --all --check -- `find parquet -name "*.rs" \! -name format.rs`

Perhaps something like this could be added to .github/workflows/rust.yml.

Edit: this has also been a problem in DataFusion apache/datafusion#9367. I don't know if a similar solution would work here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants