星之一角: 12/9/12

2012-12-14

Some notes from chatting with ko1 (2)

Continuing from the previous post:
2223. 12-13 Some notes from chatting with ko1

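I forgot to mention the part about fibers. Apparently my email was too long and
he couldn't be bothered to read it XD I posted it here:
2222. 12-09 Regarding Fibers
So he asked me about the content directly. Unfortunately we ran short of time,
so I only covered the beginning and never got to the rest.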

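In short, he eventually understood what this usage is for. I told him that in
nodejs countless things have been invented to solve this problem, and that in
ruby this is one of them. He still considers it a bad way to write code,
though. I'm not sure of the real reason, but what he said on the surface was
that fibers were designed for writing generators, not for this kind of thing.
I also mentioned that I sometimes need to double resume intentionally, and he
pointed out that I could use Fiber#transfer.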

I knew about it, but couldn't think of any situation that would call for it.
ko1's answer: only computer scientists would be interested XD

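Later, though, I experimented for quite a while with fumin. fumin wrote a
future implementation, and after playing with it I realized that my
understanding of Fiber#transfer had been incomplete. ko1 was also right:
when I need a double resume, transfer is the thing to use.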

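That actually confuses me more, because transfer feels like just a more
powerful resume. So are resume and transfer kept separate because of a
difference in implementation performance? Too bad it's too late to ask :P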

Anyway, transfer is a bit like resume + yield, so control can be passed around
freely. resume, however, does not yield: the current fiber stays in the
resumed state, so you cannot write:

f1 = nil
f0 = Fiber.new{ f1.resume }
f1 = Fiber.new{ f0.resume }
f1.resume # raises FiberError (double resume)

But you can write:

require 'fiber'
f1 = nil
f0 = Fiber.new{ f1.transfer }
f1 = Fiber.new{ f0.transfer }
f1.transfer # ok

Another difference is that resume does not make the current fiber yield,
for example:

f1 = nil
f0 = Fiber.new{ true }
f1 = Fiber.new{ f0.resume; false }
f1.resume # false
f1.alive? # false

With transfer, however, the current fiber does yield:

require 'fiber'
f1 = nil
f0 = Fiber.new{ true }
f1 = Fiber.new{ f0.transfer; false }
f1.transfer # true
f1.alive?   # true

2012-12-13

Some notes from chatting with ko1

edited 2012-12-14 00:21
I forgot to mention the part about fibers; see:
2224. 12-14 Some notes from chatting with ko1 (2)

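Sorry for being several days late with this. Originally I only wanted to ask
for his views on llvm and rubinius, but we ended up chatting at Dante Coffee
(丹堤) for about an hour. I'll tell the story of how that came about when I
write up my thoughts on rubyconf.tw/2012.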

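The conversation mostly revolved around what ko1 wants to do next, which you
could also call the future of cruby. I have to admit that after hearing all
this I have considerably more hope (?) for cruby. That said, nearly every one
of these is a huge and very difficult project; never mind that they may take a
long time to finish, it's hard to say whether they can be finished at all.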

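Besides what ko1 actually said, what follows also includes my own thoughts, to
keep the writing flowing. I took no notes; everything is recalled from memory
of events three or four days ago, so this post does not represent ko1's
statements XD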

*

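Let's start with the GIL/GVL. Everyone knows cruby has always refused to remove
the GIL/GVL, but what is the real reason? It's hard to sum up in one sentence;
there are several reasons. I also recalled later that I had read a discussion
saying ko1 removed the GIL/GVL once, years ago. ko1 said it was about five
years ago, and that merging that work with the current code would take a lot
of effort. In any case the result was not very satisfying: single-threaded
programs ran slower, code became harder to write, the internal structure
became more complex, and so on.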

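jruby's approach is fine grained locking, i.e. locking only where a lock is
really needed. Apparently many jvm based languages are implemented this way.
ko1 said this works because the jvm's jit is very good[0]: if the program
itself is single threaded, synchronized blocks can be optimized away. The cost
of this approach is therefore lower for jruby than it would be for cruby. At
the same time, building something as capable as the jvm is very hard
(ko1 laughed XD)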

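ko1 also gave an example to show that jruby's approach is really like c++'s:
the libraries themselves are not thread-safe, and users who want thread-safe
operations must lock by themselves. We know that the c++ STL is basically
thread-unsafe, and it seems jruby's string is the same. Here is the example: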

size = 10000000
s = ''
t0 = Thread.new{
  size.times{ s << 'x' }
}
t1 = Thread.new{
  size.times{ s << 'y' }
}
t0.join
t1.join

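Under cruby, the program runs without problems. Under jruby, on my computer,
it blows up, and the cause is a java exception. I don't know whether jruby
itself can rescue java exceptions; my reasonable guess is that it can't, and
if it can't, how is that any different from cruby segfaulting?[1]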

jruby 1.7.1 (1.9.3p327) 2012-12-03 30a153b on Java HotSpot(TM) 64-Bit Server VM 1.7.0_09-b05 [darwin-x86_64]
Exception in thread "RubyThread-1: crash.rb:1" java.lang.ArrayIndexOutOfBoundsException
 at org.jruby.util.ByteList.ensure(ByteList.java:342)
 at org.jruby.RubyString.modify(RubyString.java:923)
 at org.jruby.RubyString.cat(RubyString.java:1359)
 at org.jruby.RubyString.cat19(RubyString.java:1324)
 at org.jruby.RubyString.cat19(RubyString.java:1317)
 at org.jruby.RubyString.append19(RubyString.java:2580)
 at org.jruby.RubyString.concat19(RubyString.java:2611)
[...]
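
The "users lock by themselves" approach can be sketched as below. This is a
minimal illustration, not something from the conversation: the Mutex, the
variable name lock, and the reduced size are assumptions made for the sketch.

```ruby
size = 100_000 # smaller than the original so the sketch finishes quickly
s = String.new # mutable string shared by both threads
lock = Mutex.new

# Every append happens while holding the lock, so the two threads never
# modify the string's internal buffer at the same time.
t0 = Thread.new { size.times { lock.synchronize { s << 'x' } } }
t1 = Thread.new { size.times { lock.synchronize { s << 'y' } } }
[t0, t1].each(&:join)

puts s.size
```

Locking on every append is deliberately coarse here; it trades speed for a
string that stays consistent on implementations without a global lock.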

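Another interesting thing: fumin and I later checked what s ends up
containing. cruby seems to produce something like xxxxxxyyyyyy, where each run
of x is very, very long, perhaps as long as 1000000. A reasonable guess is that
this is meant to reduce the cost of switching under the GIL/GVL; in practice
YARV probably uses time slices to decide which thread gets to run.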

When actually running this program, cruby does indeed saturate one core.
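
One way to inspect how the two threads interleaved is to measure the lengths
of the consecutive runs in s. The helper name run_lengths and the sample
string are made up for this sketch.

```ruby
# Returns the length of each consecutive run of 'x' or 'y' in the string.
def run_lengths(str)
  str.scan(/x+|y+/).map(&:size)
end

sample = 'xxxxxxyyyyyyxy'
p run_lengths(sample) # => [6, 6, 1, 1]
```

Long runs suggest coarse time slicing between threads, while runs of length
one correspond to the fully alternating xyxy pattern.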

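Next I ran rubinius, and the result was a very pretty xyxyxyxyxy while
saturating two cores. So I think rubinius's scheduling must be quite "fair".
Unsurprisingly, rubinius also ran faster than cruby, but only slightly, far
from the ideal "twice as fast": about 1.2 times as fast on my computer. This
may also mean that cruby is still unbeatable at running single threaded
programs.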

ko1 doesn't know whether rubinius uses fine grained locking either. I haven't
looked at that level of detail, so I don't know.

*

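Instead, ko1 and matz want to go the multi-process route. ko1 thinks thread
safe programs are too hard to write, and that processes are the better way to
use multiple cores.[2] But naturally, multiple processes run into the problem
of using too much memory, so they have thought up many ways to try to improve
on that.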
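
As a rough picture of the multi-process style, here is a minimal sketch that
forks one worker per chunk of input and collects the results over pipes. The
helper name parallel_map is hypothetical, and the sketch assumes a platform
where fork is available.

```ruby
# Runs the block once per chunk, each in its own forked process, and returns
# the results in order. Requires a platform that supports fork.
def parallel_map(chunks)
  workers = chunks.map do |chunk|
    reader, writer = IO.pipe
    reader.binmode
    writer.binmode
    pid = fork do
      reader.close
      writer.write(Marshal.dump(yield(chunk)))
      writer.close
      exit!(0) # skip at_exit handlers inherited from the parent
    end
    writer.close
    [pid, reader]
  end

  workers.map do |pid, reader|
    result = Marshal.load(reader.read)
    reader.close
    Process.wait(pid)
    result
  end
end

sums = parallel_map([[1, 2, 3], [4, 5, 6]]) { |numbers| numbers.inject(0, :+) }
p sums # => [6, 15]
```

Each worker is a full copy of the interpreter, which is where the memory
concern mentioned above comes from.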

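First and most obviously, ko1 wants to AOT compile ruby to binary code.[3]
Next comes the part I don't quite understand. He listed various binary object
code formats, such as ELF on linux and a few others I have forgotten. I'm not
sure this is what he meant, but if I didn't misunderstand, and if it is
feasible, ko1 wants to invent another kind of container that holds ruby binary
code and put it into a .so file, so that different processes can share a
single copy of the ruby standard library!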

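Compiling to .so also needs to cover packaging, and even encryption and
obfuscation. ko1 mentioned that python has packaging, but apparently nothing
offers encryption and obfuscation. (java? ko1 didn't really answer) These are
obviously things "big companies" want. They aren't goals I care about, but if
they really get big companies interested, there are plenty of benefits.
Look at the jvm?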

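The next step is MVM (Multiple VM), which would make managing multiple
processes more convenient. The difficulty, ko1 said, is that cruby is a
project that has been running for 20 years; it uses many old-style structures
and contains a lot of global state, so multiple vm is not easy to achieve and
may require changing a great deal of code.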

mruby, by contrast, uses a relatively new architecture, so multiple vm is much
easier to achieve there. At this point I mentioned that I hope it can beat
lua.... XD

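Then there is the problem of communication between processes. ko1 said that
back when he was teaching at a university, a student built a protocol for
process-to-process communication on linux. I don't remember the details well,
because I'm not that familiar with linux and don't know what methods it offers
for sharing memory. Roughly, though, the idea is that for ruby primitive
objects, such as numbers or strings, we would like to use a different
marshaling format. For example, rather than really marshaling a ruby string
and sending it over, just use a tag to say this is a ruby string and then send
the actual bytes.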

That way, at least passing primitive types would be efficient. The student's
experiment was done on a 24-core machine, very very expensive XD
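
The "tag plus raw bytes" idea might look something like the sketch below. The
wire layout (a 1-byte tag, a 4-byte big-endian length, then the payload), the
tag letters, and the method names are all invented for illustration; this is
not the student's protocol, whose details I don't know.

```ruby
# Hypothetical format: 1-byte tag, 4-byte big-endian length, raw payload.
def encode_primitive(obj)
  case obj
  when String
    bytes = obj.b # binary copy of the string's bytes
    ['S', bytes.bytesize].pack('a1N') + bytes
  when Integer
    # 64-bit signed big-endian; larger integers raise RangeError
    ['I', 8].pack('a1N') + [obj].pack('q>')
  else
    raise ArgumentError, "unsupported type: #{obj.class}"
  end
end

def decode_primitive(data)
  tag, length = data.unpack('a1N')
  payload = data.byteslice(5, length)
  case tag
  when 'S' then payload # comes back with binary encoding
  when 'I' then payload.unpack('q>').first
  else raise ArgumentError, "unknown tag: #{tag.inspect}"
  end
end

p decode_primitive(encode_primitive('hello'))
p decode_primitive(encode_primitive(-7))
```

The sketch drops the string's encoding, which a real format would need to
carry as well.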

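This might also make distributed objects easier. ko1 said it gets slow once
you go over the network, though. Of course the network is slow, but if we need
communication across machines, hmm, maybe, as I heard before, there is some
special protocol that lets machines communicate with each other at higher
speed. Perhaps what you communicate over is the real point, rather than what
you communicate.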

*

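ko1 also mentioned that someone at IBM has done something ruby related with
HTM (hardware transactional memory)? They haven't had a chance to talk yet, so
he wants to find an opportunity. I remember that I then said I also hope ruby
can provide STM (software transactional memory) support.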

ko1 said the problem is that some things cannot be rolled back. I then said
I/O operations could be separated out as in haskell, but unfortunately got no
real response XD

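I also mentioned that I had seen pypy using STM to remove the GIL/GVL. Since
that STM is for internal use only, it is always used in a controlled
environment, so the problem of I/O that can't be rolled back is less likely to
arise; or, when something involving I/O does come up, ordinary locking can
still be used.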

The links are here:
We need Software Transactional Memory
STM update: back to threads?
STM with threads

Unfortunately there was no real response to this either.

*

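Next, about rubinius. This was actually what we talked about first, but it's
probably less important, so I put it last. ko1 mentioned one thing he dislikes
about rubinius: because much of the internals is written in ruby, when we run
just ruby -e raise we see: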

-e:1:in `<main>': unhandled exception

But what do we see when we run just rbx -e raise?

An exception occurred evaluating command line code
    No current exception (RuntimeError)

Backtrace:
              { } in Object#__script__ at -e:1
 Rubinius::BlockEnvironment#call_on_instance at kernel/common
                                                /block_environment.rb:75
         Kernel(Rubinius::Loader)#eval at kernel/common/eval.rb:75
                Rubinius::Loader#evals at kernel/loader.rb:584
                 Rubinius::Loader#main at kernel/loader.rb:815

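I agree that for people who don't want to dig into details, all of this is
just noise. So it is a trade off; I think being able to see how raise is
implemented is no bad thing. One possible fix is to tamper with the place
where the backtrace is printed and strip all the kernel/* entries. Well,
that's exactly what rails does~~~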
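
Stripping the kernel/* entries could be sketched as follows. The helper name
filter_backtrace and the sample lines are hypothetical; the sketch assumes
backtrace entries in the usual "path:line:in `method'" string form, which may
not match what rubinius produces internally.

```ruby
# Drops backtrace entries whose path starts with kernel/, i.e. frames that
# come from the implementation's own ruby code rather than user code.
def filter_backtrace(lines, internal = %r{\Akernel/})
  lines.reject { |line| line =~ internal }
end

sample = [
  "-e:1:in `__script__'",
  "kernel/common/block_environment.rb:75:in `call_on_instance'",
  "kernel/common/eval.rb:75:in `eval'",
  "kernel/loader.rb:584:in `evals'"
]

puts filter_backtrace(sample)
```

With the sample above only the -e:1 entry is left.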

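Another point, about optimization in rubinius or jruby: ko1 said method
inlining may not be that beneficial. In a micro benchmark, method inlining is
of course very beneficial, because the bottleneck to optimize is very clear.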

loop do lambda{}.call end

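For something like this, i.e. the kind of meaningless example that shows up in
micro benchmarks, optimizing it obviously pays off. But in a large program the
hot spot could be any code, and the execution paths may even keep changing, so
that the hot spot keeps moving.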

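In that situation method inlining isn't so beneficial. This is also why
luajit's tracing jit gives up on jit in some situations, or only starts to jit
under certain conditions. I think analyzing this will be the biggest
difficulty, and also the part with the most decisive impact.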

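Because it is so hard to do, not doing it at all is sometimes much better. I
guess this may also be why cruby is still the fastest in some of my tests,
while some of the others are frighteningly slow. To be honest, though, I
haven't tried in a long time, and things may have changed a lot recently.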

*

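Last of all, about llvm. ko1 said that when he tried llvm 2.6 a few years ago,
the api was completely different from today's XD The llvm api does change a
lot, and keeping up with them isn't easy. But ko1 also said there are now
plenty of llvm resources available in Japan, and the api may have stabilized,
so maybe it's time to get in.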

*

That's everything I still remember.

--
[0] jit was my word; ko1 first said no, thought for a moment, and then said
yes, yes, so I guess jit is right?

[1] ko1 first mentioned that other people hate cruby segfaults, and I then
chimed in that a java exception on jruby means basically the same thing as a
cruby segfault XD

[2] """it's my opinion""" :D

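[3] As it happens, I had just told ko1 that a few years ago I tried to do AOT
compilation with RubyVM::InstructionSequence and failed. I couldn't quite
follow ko1's English after that... but I vaguely guess he meant that the byte
code InstructionSequence compiles is not complete byte code, and that actually
executing it requires some read/write barrier code. He explained quite a lot,
but I'm really sorry, I couldn't follow... he probably needed a computer to
type out some of the words :P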

2012-12-09

Regarding Fibers

Sorry, I think I'm too excited about talking with ko1, so I decided
to post this email on my blog, minus the parts unrelated to
fibers. Hopefully this is helpful for others, too.

from: Lin Jen-Shin (godfat)
to: ko1
date: Sun, Dec 9, 2012 at 12:37 AM
subject: Regarding Fibers

Hi ko1 san,

We talked about fibers at rubyconf.tw/2012. It's really nice to be
chatting with you. I knew YARV for a long while, although I never
really read the source, but I am still quite interested in compiler
and virtual machine. I didn't expect I could talk with you in person :D
I am really happy about this.

There are a number of libraries I wrote involved with fibers, and
I'll list all of them. First I want to mention em-synchrony as I told
you yesterday, because actually I learned that fiber usage from
there.

And here's my slide if you missed it.
http://godfat.org/slide/2012-12-07-concurrent.pdf

# Fiber.current

Here's a very simple piece of code that demonstrates the idea, and this is
the reason why we would want `Fiber.current'.
simple-fiber-demo.rb

require 'eventmachine'
require 'fiber'

def process
  f = Fiber.current
  EM.add_timer(2){ puts "Time's Up!"; f.resume }
  Fiber.yield

  puts "Process Done!"
  EM.stop
end

EM.run{
  Fiber.new{
    # here i don't want to pass the fiber in, so Fiber.current would be useful
    process
  }.resume
}
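
For readers without eventmachine installed, the same control flow can be
sketched with a plain array standing in for the reactor's timer queue. The
timers and log arrays and the extra parameters of process are assumptions
made for this sketch; nothing here waits for real time to pass.

```ruby
require 'fiber' # provides Fiber.current on Ruby versions where it is not in core

def process(timers, log)
  f = Fiber.current
  # Register a callback that resumes this fiber later, like EM.add_timer.
  timers << lambda { log << "Time's Up!"; f.resume }
  Fiber.yield # hand control back to whoever resumed us
  log << 'Process Done!'
end

timers = []
log = []

Fiber.new { process(timers, log) }.resume
log << 'Back in the main loop'

# Stand-in for the reactor: run pending callbacks until none are left.
timers.shift.call until timers.empty?

p log
```

The log shows the main loop regaining control before the callback fires,
which is what keeps the reactor free while the fiber is waiting.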

# Fiber.root

And here's my code I showed you yesterday, which would need `Fiber.root'
Check if we're under a context of fiber or thread.

def self.create *args, &block
  if Fiber.respond_to?(:current) && RootFiber != Fiber.current &&
     # because under a thread, Fiber.current won't return the root fiber
     Thread.main == Thread.current
    FutureFiber .new(*args, &block)
  else
    FutureThread.new(*args, &block)
  end
end

So the idea is that I want to implement Futures.

Essentially whenever we do an HTTP request, rest-core would
pick whether to use FutureFiber or FutureThread depending on
the context automatically. Say that we're inside eventmachine's
event loop *and* a fiber, then I assume that the user would want
to use fibers. Otherwise if we're inside a thread, then I assume
the user want to use threads.

If we're inside a non-root fiber, then I assume that we're in a
context where there's a fiber wrapped around, and also we're
using eventmachine! So that I can `Fiber.yield' later on and keep
the reactor in eventmachine running, and only resume whenever
eventmachine has done its job.
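
The snippet above refers to RootFiber without showing where it comes from.
One plausible way to get such a reference, sketched here as an assumption
rather than as the actual rest-core code, is to capture Fiber.current once
while still in the main thread outside any fiber. The names root_fiber and
inside_wrapping_fiber are made up.

```ruby
require 'fiber' # provides Fiber.current on Ruby versions where it is not in core

# Assumes this line runs in the main thread, outside of any fiber.
root_fiber = Fiber.current

# True only when called in the main thread from a fiber other than the root.
inside_wrapping_fiber = lambda do
  Thread.current == Thread.main && Fiber.current != root_fiber
end

p inside_wrapping_fiber.call                      # at top level
p Fiber.new { inside_wrapping_fiber.call }.resume # inside a non-root fiber
p Thread.new { inside_wrapping_fiber.call }.value # inside another thread
```

Only the call made from the non-root fiber should report true, which matches
the case where fibers are chosen over threads.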

You can read example/use-cases.rb for all possible use cases.
There are 4 different configurations:

* pure_ruby (thread)
* eventmachine_fiber
* eventmachine_thread
* eventmachine_rest_client (thread)

# Fiber#resumed? or Fiber#running? or Fiber#started?

I forgot to tell you that this method would be useful too.
Here's the code in FutureFiber.

current_fibers.each{ |f|
  next unless f.alive?
  next_tick{
    begin
      f.resume
    rescue FiberError
      # whenever timeout, it would be already resumed,
      # and we have no way to tell if it's already resumed or not!
    end
  }
}

It would be good if we could tell whether the fiber is actually running or not.
`Fiber#alive?' cannot tell this. This happens when the HTTP request times out
while the original callback cannot be canceled or is hard to cancel. That means
we would hit `Fiber#resume' twice intentionally. It would be good if we
didn't have to rescue FiberError and accidentally rescue the `cannot yield
from root fiber' error. Well, should I parse the error message to distinguish
them? :P Reraise if it's a root fiber error, which would be a bug, but ignore
it if it's an intentional double resume.

Better to have a check for that instead of using exception handling and
parsing error messages.
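
Until such a check exists, one workaround is to track the state outside the
fiber. The sketch below wraps the resume in a one-shot callable so that the
timeout path and the response path can both call it and only the first call
reaches the fiber. The name one_shot_resumer is hypothetical, and this is not
how FutureFiber is implemented.

```ruby
# Returns a lambda that resumes the fiber on its first call only.
# Later calls do nothing and return false instead of raising FiberError.
def one_shot_resumer(fiber)
  fired = false
  lambda do |*args|
    next false if fired
    fired = true
    fiber.resume(*args)
    true
  end
end

outcomes = []
waiter = Fiber.new { outcomes << Fiber.yield }
waiter.resume # run until the fiber parks itself in Fiber.yield

wake = one_shot_resumer(waiter)
first  = wake.call(:timeout)  # resumes the fiber
second = wake.call(:response) # ignored, the fiber was already woken

p outcomes
p [first, second]
```

Because the second call never reaches `Fiber#resume', no FiberError has to be
rescued, and a genuine `cannot yield from root fiber' bug would still surface.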

# Fiber#[] for fiber local variables

Sorry that it seems I've already deleted the code because I dropped the
support for cool.io. The commit which removed it could be found here.
remove coolio support. sorry, i guess no one is using it :(

The exact code is:

Thread.current[:coolio_http_client].detach if
  Thread.current[:coolio_http_client].kind_of?(::Coolio::HttpFiber)

While this should be read `Fiber.current[:coolio_http_client]' here,
because this code is only used inside a fiber. The class name is
CoolioFiber :)

This is all for readability though.
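
The reason the `Thread.current[...]' spelling works inside a fiber at all is
that Thread#[] stores its values per fiber. A minimal sketch, with a made-up
key name:

```ruby
Thread.current[:demo_key] = :set_in_root_fiber

# A new fiber starts with its own, empty set of these variables.
seen_in_new_fiber = Fiber.new { Thread.current[:demo_key] }.resume

p seen_in_new_fiber          # => nil
p Thread.current[:demo_key]  # => :set_in_root_fiber
```

So a `Fiber.current[...]' spelling would mostly make the existing behavior
easier to read, which fits the readability point above.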

p.s. I thought I used it in sync-defer, but apparently I have bad memory.
This gem is actually obsolete, no longer really used though.

# How we configure to use fibers in a web server.

I also want to show you this since you told me that you never
thought of using fibers this way. Here's my playground for various
server configurations. https://github.com/godfat/ruby-server-exp
The rack application is located at: config.ru
It's quite complicated because I tried to use the same application
for all possible configurations, so it's not quite readable.

The simplest configuration would be using Rainbows! with the FiberSpawn
model, which is located here: rainbows-fiber-spawn.rb

And here's the script to launch the server: rainbows-fiber-spawn.sh

rainbows -E none -c config/rainbows-fiber-spawn.rb

Here's the code which defines FiberSpawn in Rainbows! (The author is Eric Wong)
fiber_spawn.rb

Essentially it wraps every HTTP request in a fiber, so that we can yield
in the application and resume back whenever the resource is ready.

Celluloid is using fibers heavily, too.

I remember there's an issue that fiber stack is too small!
Actually we had the same issue before, since we're using Rails,
and you know that Rails has a quite huge call stack.....

I also know that in Ruby 2.0, on a 64bit machine, the stack
is much larger than it was in 1.9.3. I am looking forward to it :D
Hope that would be large enough for a stack monster like Rails.

That's all that's on my mind currently. Thank you so much for listening!
And sorry for writing it so long :P Personally I still like fibers a lot, the
idea is cool, and I think every language should have some coroutine
support. (maybe except Haskell though) It would be good that if we
could push it forward. Umm, Matz is not a thread guy, but a fiber guy? :P

[snip]

Cheers, and safe travel!

*

Doh, I should have remembered Goliath,
which is a step further from em-synchrony.
