Using UTF-8 regex on real header strings raises errors #2

Open
NealJMD opened this issue Apr 26, 2016 · 2 comments

NealJMD commented Apr 26, 2016

We're using this gem in production and have started to see the error 'Encoding::CompatibilityError: incompatible encoding regexp match (UTF-8 regexp with ASCII-8BIT string)', with a stack trace leading into this gem. I can reproduce the error in the console like this:

irb(main):001:0> require 'mobile_detect'
=> true
irb(main):002:0> device = MobileDetect.new({}, 'Gécko'.force_encoding('ASCII-8BIT'))
=> #<MobileDetect:0x007f943b679130 @http_headers={}, @user_agent="G\xC3\xA9cko">
irb(main):003:0> device.mobile?
Encoding::CompatibilityError: incompatible encoding regexp match (UTF-8 regexp with ASCII-8BIT string)
    from /Users/nealjmd/.rbenv/versions/2.3.0/lib/ruby/gems/2.3.0/gems/mobile-detect-0.2.0/lib/mobile_detect/core.rb:153:in `match'
    from /Users/nealjmd/.rbenv/versions/2.3.0/lib/ruby/gems/2.3.0/gems/mobile-detect-0.2.0/lib/mobile_detect/core.rb:162:in `block in match_detection_rules_against_UA'
    from /Users/nealjmd/.rbenv/versions/2.3.0/lib/ruby/gems/2.3.0/gems/mobile-detect-0.2.0/lib/mobile_detect/core.rb:161:in `each'
    from /Users/nealjmd/.rbenv/versions/2.3.0/lib/ruby/gems/2.3.0/gems/mobile-detect-0.2.0/lib/mobile_detect/core.rb:161:in `match_detection_rules_against_UA'
    from /Users/nealjmd/.rbenv/versions/2.3.0/lib/ruby/gems/2.3.0/gems/mobile-detect-0.2.0/lib/mobile_detect/core.rb:33:in `mobile?'
    from (irb):3
    from /Users/nealjmd/.rbenv/versions/2.3.0/bin/irb:11:in `<main>'

This is obviously a toy example, but we're getting the same error from real headers, mostly from people in Spain. For the time being, we're forcing all the header encodings to UTF-8, but it would be great if this was handled internally to mobile_detect.
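For anyone hitting the same thing, here is a minimal sketch of the kind of workaround described above, assuming the header bytes really are UTF-8 and only carry the wrong encoding label. The helper name `utf8_header` is hypothetical and not part of mobile_detect; it relabels the string and replaces any invalid byte sequences with "?".

```ruby
# Hypothetical helper (not part of mobile_detect): relabel a header value as
# UTF-8 without changing its bytes, replacing any invalid sequences with "?".
def utf8_header(value)
  str = value.to_s.dup.force_encoding(Encoding::UTF_8)
  str.valid_encoding? ? str : str.scrub("?")
end

raw_ua  = "G\u{E9}cko".b   # binary (ASCII-8BIT) copy, like the string in the report
pattern = /G\u{E9}cko/     # the \u escape makes this a UTF-8 regexp

# Matching the binary string directly raises the error from the report.
error = begin
  pattern.match(raw_ua)
  nil
rescue Encoding::CompatibilityError => e
  e
end

# After relabelling, the same bytes match without raising.
ua = utf8_header(raw_ua)
matched = !pattern.match(ua).nil?
```

The result of `utf8_header` can then be passed as the user agent argument to `MobileDetect.new`, as in the console session above.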

Thanks!

ktaragorn (Owner) commented

Hi @NealJMD, I am honored that you are using this gem in production.

This error likely came from a recent fix where we updated the regex comparison to UTF-8, but I am having trouble understanding your issue. As you mentioned, this is a toy example. In particular, force_encoding doesn't actually re-encode the string; it just relabels the existing bytes as that encoding. The string you provided cannot actually be transcoded to ASCII-8BIT, as seen by:

2.3.0 :018 > "Gécko".encode("ASCII-8BIT")
Encoding::UndefinedConversionError: U+00E9 from UTF-8 to ASCII-8BIT
        from (irb):18:in `encode'
        from (irb):18
        from ./bin/mobile_detect:14:in `<main>'

Is it possible for you to provide a legitimate example? I wonder if the headers at your end are forced into ASCII-8BIT somewhere in your stack.

The reason I ask is that the way to fix this might be to encode the string into UTF-8 before comparing, but I can't do that with the example you provided (after force_encoding).
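To make the distinction concrete, here is a small sketch (plain Ruby, nothing from mobile_detect) of how `encode` and `force_encoding` behave on the toy string. It assumes the bytes are UTF-8 underneath: `encode` tries to transcode from ASCII-8BIT and fails on the high bytes, while `force_encoding` only changes the label.

```ruby
utf8   = "G\u{E9}cko"                                   # UTF-8 string
binary = utf8.dup.force_encoding(Encoding::ASCII_8BIT)  # relabel only, bytes untouched

# The two strings hold identical bytes; only the encoding label differs.
same_bytes = binary.bytes == utf8.bytes

# Transcoding from ASCII-8BIT to UTF-8 fails, because bytes above 0x7F
# have no defined character in ASCII-8BIT.
encode_error = begin
  binary.encode(Encoding::UTF_8)
  nil
rescue Encoding::UndefinedConversionError => e
  e
end

# Relabelling back to UTF-8 recovers the original string.
relabelled = binary.dup.force_encoding(Encoding::UTF_8)
round_trip = relabelled == utf8 && relabelled.valid_encoding?
```

So for input like this, a fix inside the gem would have to relabel (and probably validate) the string rather than call `encode` on it.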

ktaragorn (Owner) commented

Alternatively, you could try .encode('utf-8') on your breaking examples. If that works for you, I can add it to the project. Until then, you can use the previous version of the gem.
