What have you found for these years?

2011-04-25

Ruby gotchas from 1.8 to 1.9

  1. Encoding

Strings carry encoding information. Concatenating two
strings with incompatible encodings raises an
Encoding::CompatibilityError, so be aware of this.
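
A minimal sketch of that failure, assuming Ruby 1.9 or later. The helper name concat_error is made up for illustration; it only reports which error, if any, the concatenation raised.

```ruby
# Hypothetical helper: returns the exception class raised by a + b,
# or nil if the concatenation succeeds.
def concat_error(a, b)
  a + b
  nil
rescue Encoding::CompatibilityError => e
  e.class
end

utf8   = "\u6C38"                        # "永", tagged as UTF-8
binary = [0xE6, 0xB0, 0xB8].pack('C*')   # the same bytes, tagged as ASCII-8BIT

p concat_error(utf8, binary)                      # Encoding::CompatibilityError
p concat_error(utf8, "abc".encode('US-ASCII'))    # nil
```

The second call succeeds because a string containing only ASCII characters is compatible with UTF-8, which is why this kind of bug often stays hidden until real non-ASCII data shows up.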

In files that contain UTF-8 literals, we should put
this magic encoding comment at the beginning of the file:

# encoding: utf-8

Otherwise, Ruby might complain about an invalid byte
sequence in US-ASCII encoding.

  2. Locale and Encoding

The locale affects the default external encoding!
On my computer the locale is en_US.UTF-8, but on
Heroku it defaults to POSIX (also called C), which means
US-ASCII (7-bit ASCII). I am not really sure why, but on
Heroku this gave me some errors.

So let’s switch the locale on Heroku to en_US.UTF-8 by:

heroku config:add LC_ALL=en_US.UTF-8 --remote staging
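
To check what Ruby actually picked up from the environment, something like the following can be run on each machine. This is only a sketch: encoding_report is a made-up helper name, and the values printed depend on the platform and locale.

```ruby
# Hypothetical helper: collect the encodings Ruby derived from the environment.
def encoding_report
  {
    :locale_charmap   => Encoding.locale_charmap,    # a String naming the locale's charset
    :locale_encoding  => Encoding.find('locale'),    # the Encoding Ruby maps it to
    :default_external => Encoding.default_external,  # used for data read from outside
    :default_internal => Encoding.default_internal   # nil unless something has set it
  }
end

encoding_report.each { |name, value| puts "#{name}: #{value.inspect}" }
```

Running it once under the usual locale and once with LC_ALL=C should make the difference in the default external encoding visible.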

  3. Database, Memcache, outside world Encoding

Still regarding encoding! Data read from the outside world
sometimes carries the wrong encoding “tag”! So we
might need to use String#force_encoding to change
the encoding tag without touching the bytes.

An example:

s = [0xE6, 0xB0, 0xB8].pack('C*') # raw bytes, tagged as binary
s.encoding                   # => #<Encoding:ASCII-8BIT>
s.force_encoding('utf-8')    # => "永" (same bytes, new tag)
s.encoding                   # => #<Encoding:UTF-8>
s                            # => "永"
s.force_encoding('binary')   # => "\xE6\xB0\xB8"
s                            # => "\xE6\xB0\xB8"
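
Since force_encoding never checks the bytes, it can be worth validating right after retagging. Here is a small sketch of that idea; tag_as_utf8 is a hypothetical helper, not something provided by any driver or library.

```ruby
# Hypothetical helper: retag raw bytes as UTF-8 and verify they are valid.
def tag_as_utf8(bytes)
  s = bytes.dup.force_encoding('utf-8')   # changes the tag only, not the bytes
  raise ArgumentError, "not valid UTF-8: #{bytes.inspect}" unless s.valid_encoding?
  s
end

raw = [0xE6, 0xB0, 0xB8].pack('C*')   # e.g. bytes handed back by a driver
str = tag_as_utf8(raw)
p str.encoding                        # #<Encoding:UTF-8>
p str.bytes.to_a == raw.bytes.to_a    # true
```

Working on a copy (dup) leaves the caller's string untouched, and failing early on invalid bytes is usually easier to debug than an error that surfaces much later in a template.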

  4. Rails ERB template source Encoding

In normal Ruby source, we can simply use the built-in
magic encoding comment, while in Rails’ ERB templates,
there’s a hack that lets us put this as the magic comment:

<%# encoding: utf-8 %>

Note that this is provided by a hack in Rails: the format is
parsed by a regexp in Rails’ source code. Usually, as far as I
can tell, we don’t really need this as long as the locale and
Encoding.default_internal (I’ve set this in apps) are set up.
I mention it just in case we need to test something, or as a
possible workaround when something goes wrong.
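
For reference, setting the defaults could look roughly like this. It is a sketch, and where it goes is an assumption (somewhere early in the app's boot, before any files or templates are read). The guard keeps it harmless on Ruby 1.8, where Encoding does not exist.

```ruby
# Assumed placement: early in boot, before templates or data are loaded.
if defined?(Encoding)
  Encoding.default_external = Encoding::UTF_8   # how incoming data is tagged
  Encoding.default_internal = Encoding::UTF_8   # what it gets transcoded to
end

p Encoding.default_internal   # #<Encoding:UTF-8>
```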



Here are some comprehensive documents about Ruby 1.9.2’s
complex encoding system… For now I think it’s OK to skip them
and come back when you are not sure about something or
run into some weird encoding issue.

The one below is a lot tougher; if you’re not trying to become
an expert on the system, there’s no need to read it.

By the way, I guess mongo_mapper is a red flag for moving to
Ruby 1.9.2!! I’ll try mongomatic to replace it. mongo_mapper
has been a headache of mine for a long time. delayed_job is
one too, but it's easy to work around.


Updated 2011-04-25 18:35: Oh well, mongomatic didn't work on
Rails 2 as it claims! Then I'll just use the official driver directly.
People, stop depending on ActiveSupport!! :(

All texts are licensed under CC Attribution 3.0