"Mongolian Vowel Separator" is not considered whitespace after Ruby 2.2
[Ruby Issue 9092](https://bugs.ruby-lang.org/issues/9092) updated Ruby from Unicode 6.1 to Unicode 7.0 ([commit here](ruby/ruby@64c81e4#diff-67d181a69374a75e4b8f706fa9b81fbc)).  This patch was included in Ruby 2.2.

From the [Unicode 6.3.0 release notes](http://www.unicode.org/versions/Unicode6.3.0/):

    The General_Category property value of U+180E MONGOLIAN VOWEL SEPARATOR has been changed from Zs to Cf. The values of other related properties such as Bidi_Class, White_Space, and Other_Default_Ignorable_Code_Point have been updated accordingly.

As of Ruby 2.2, the `[[:space:]]` bracket expression in Ruby's Regexp [no longer matches U+180E](https://github.com/ruby/ruby/blob/9fefa6063797f94704c09663db13cea9e390eaba/enc/unicode/name2ctype.h#L2740-L2753).

This updates fast_blank so that `blank_as?` treats U+180E the same way: as blank on Rubies before 2.2, and as non-blank on 2.2 and later.
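
To make the version dependence concrete, here is a minimal Ruby sketch (not part of this commit) that checks whether the running interpreter's `[[:space:]]` matches U+180E and compares that with what the version rule predicts. The helper names `mvs_matches_space?` and `mvs_expected_blank?` are hypothetical, and the 2.2 cutoff is taken from the description above.

```ruby
require "rubygems" # for Gem::Version; normally loaded already

# U+180E MONGOLIAN VOWEL SEPARATOR
MVS = "\u180e"

# Hypothetical helper: does this interpreter's [[:space:]] match U+180E?
def mvs_matches_space?
  !(MVS =~ /\A[[:space:]]\z/).nil?
end

# Hypothetical helper: the rule described above. U+180E counts as
# whitespace only on Rubies older than 2.2.
def mvs_expected_blank?(ruby_version = RUBY_VERSION)
  Gem::Version.new(ruby_version) < Gem::Version.new("2.2")
end

puts "Ruby #{RUBY_VERSION}: [[:space:]] matches U+180E? #{mvs_matches_space?}"
puts "Expected from the version rule: #{mvs_expected_blank?}"
```

If the two printed values disagree on some interpreter, the version cutoff assumed here does not hold for it.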

Prior to this change, the tests failed on Ruby 2.2.2:

    1) String provides a parity with active support function
       Failure/Error: expect("#{i.to_s(16)} #{c.blank_as?}").to eq("#{i.to_s(16)} #{c.blank2?}")
    -
         expected: "180e false"
              got: "180e true"
    -
         (compared using ==)
       # ./spec/fast_blank_spec.rb:22:in `block (3 levels) in <top (required)>'
       # ./spec/fast_blank_spec.rb:19:in `times'
       # ./spec/fast_blank_spec.rb:19:in `block (2 levels) in <top (required)>'

The suite now passes on all the Rubies in the Travis matrix (RBX, 1.9, 2.0, 2.1, and 2.2).
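
For anyone checking the code point list by hand, the following Ruby sketch (an illustration, not the project's spec) scans the Basic Multilingual Plane and collects every code point that the running interpreter's `[[:space:]]` matches. The method name `space_codepoints` is hypothetical. The result can be compared against the `case` labels in `rb_str_blank_as`.

```ruby
# Hypothetical helper: list code points up to `limit` whose single-character
# UTF-8 string matches [[:space:]] on the running Ruby.
def space_codepoints(limit = 0xFFFF)
  (0..limit).select do |cp|
    s = [cp].pack("U")            # encode the code point as UTF-8
    next false unless s.valid_encoding? # skips surrogates (U+D800..U+DFFF)
    !(s =~ /\A[[:space:]]\z/).nil?
  end
end

puts space_codepoints.map { |cp| format("0x%x", cp) }.join(", ")
```

On a Ruby from 2.2 onward, `0x180e` should be absent from the printed list; on older Rubies it should be present.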
tjschuck committed Aug 2, 2015
1 parent 7d166e5 commit 9cf94b4
Showing 2 changed files with 15 additions and 1 deletion.
1 change: 1 addition & 0 deletions .travis.yml
@@ -1,5 +1,6 @@
language: ruby
rvm:
- 2.2.2
- 2.1.6
- 2.0.0
- 1.9.3
15 changes: 14 additions & 1 deletion ext/fast_blank/fast_blank.c
@@ -2,6 +2,7 @@
#include <ruby.h>
#include <ruby/encoding.h>
#include <ruby/re.h>
#include <ruby/version.h>

#define STR_ENC_GET(str) rb_enc_from_index(ENCODING_GET(str))

@@ -13,6 +14,17 @@
#define RSTRING_LEN(s) (RSTRING(s)->len)
#endif

static int
ruby_version_before_2_2()
{
#ifdef RUBY_API_VERSION_MAJOR
if (RUBY_API_VERSION_MAJOR > 2 || (RUBY_API_VERSION_MAJOR == 2 && RUBY_API_VERSION_MINOR >= 2)) {
return 0;
}
#endif
return 1;
}

static VALUE
rb_str_blank_as(VALUE str)
{
@@ -38,7 +50,6 @@ rb_str_blank_as(VALUE str)
case 0x85:
case 0xa0:
case 0x1680:
case 0x180e:
case 0x2000:
case 0x2001:
case 0x2002:
@@ -57,6 +68,8 @@ rb_str_blank_as(VALUE str)
case 0x3000:
/* found */
break;
case 0x180e:
if (ruby_version_before_2_2()) break;
default:
return Qfalse;
}

1 comment on commit 9cf94b4

@squarism

This is the weirdest commit message (and implementation) I can currently think of. It's a gift on a personal level. 🎁 Thank you.
