Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add new Performance/StringBytesize cop #474

Merged

Conversation

viralpraxis
Copy link
Contributor

@viralpraxis viralpraxis commented Oct 14, 2024


Before submitting the PR make sure the following are checked:

  • The PR relates to only one subject with a clear title and description in grammatically correct, complete sentences.
  • Wrote good commit messages.
  • Commit message starts with [Fix #issue-number] (if the related issue exists).
  • Feature branch is up-to-date with master (if not - rebase it).
  • Squashed related commits together.
  • Added tests.
  • Ran bundle exec rake default. It executes all tests and runs RuboCop on its own code.
  • Added an entry (file) to the changelog folder named {change_type}_{change_description}.md if the new code introduces user-observable changes. See changelog entry format for details.

@viralpraxis viralpraxis force-pushed the add-new-cop-performance-string-bytesize branch 2 times, most recently from 15a940b to 896ba88 Compare October 22, 2024 06:41
@viralpraxis viralpraxis changed the title Add new Performance#StringBytesize cop Add new Performance/StringBytesize cop Oct 22, 2024
@viralpraxis viralpraxis marked this pull request as ready for review October 22, 2024 06:41
@viralpraxis viralpraxis force-pushed the add-new-cop-performance-string-bytesize branch 2 times, most recently from a29b6dd to 24e33c5 Compare October 22, 2024 16:15
@koic
Copy link
Member

koic commented Oct 23, 2024

I understand the reasoning, but in Rails, replacing 42.bytes.size with 42.bytesize results in NoMethodError. This makes me concerned about its usefulness compared to the risk of false positives.

@viralpraxis
Copy link
Contributor Author

Yes, that's an issue. But isn't Integer#size less common than Array#size? Could we simply disable autocorrection, or only check for the length and count methods?

@viralpraxis
Copy link
Contributor Author

Any thoughts? @koic @dvandersluis

I completely understand that we should avoid false positives (if possible), but in this particular case I believe both true and false positives are quite rate; at the same time unnecessary array allocations are painful in case of huge strings. I was inspired by code like this

class BytesizeValidator < ActiveModel::Validations::LengthValidator
  def validate_each(record, attribute, value)
    super(record, attribute, value&.bytes)
  end
end

which was fixed by me to

class BytesizeValidator < ActiveModel::Validations::LengthValidator
  def validate_each(record, attribute, value)
    super(record, attribute, value&.dup&.force_encoding(Encoding::BINARY))
  end
end

to reduce memory consumption

@dvandersluis
Copy link
Member

I don't think 42.bytes.size is very likely in a Rails app (because it's essentially calling Integer#size), but it could be avoided in most cases by having this cop not report if the receiver is a numeric literal. It won't cover 100% of cases (ie. variables/methods), but would probably cover most.

It'd be good to have a reference and/or benchmarks for this performance suggestion. Your example doesn't actually make a good case for changing bytes.size to bytesize on its own, just avoiding using the bytes array (but that would apply to any unnecessary use of an array, not specifically to bytes.size).

Could we simply disable autocorrection, or only check for the length and count methods?

I wouldn't change this to only check for length and count, but that wouldn't solve the issue anyways, since all three are used in many contexts. I think that'd just diminish the utility of this cop. BTW - If this PR was to be accepted it'd be ideal for the tests to include all three methods (size/length/count).

@koic
Copy link
Member

koic commented Oct 30, 2024

@dvandersluis's opinion looks good. @viralpraxis Can you proceed with this?
However, for safety, it should use Safe: false instead of SafeAutoCorrect: false due to the risk of false positives.

@viralpraxis
Copy link
Contributor Author

@koic I agree with @dvandersluis too. I will apply the necessary changes.

@viralpraxis viralpraxis force-pushed the add-new-cop-performance-string-bytesize branch from 24e33c5 to d3637be Compare October 30, 2024 18:03
@koic koic merged commit 3806fc4 into rubocop:master Nov 2, 2024
14 checks passed
@koic
Copy link
Member

koic commented Nov 2, 2024

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants