Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Don't believe me? Try it:

Node.js: https://gist.github.com/3200829

Clojure: https://gist.github.com/3200862

Note that I picked the really small messages here--integers, to give node the best possible serialization advantage.

    $ time node cluster.js 
    Finished with 10000000

    real 3m30.652s
    user 3m17.180s
    sys	 1m16.113s
Note the high sys time: that's IPC. Node also uses only 75% of each core. Why?

    $ pidstat -w | grep node
    11:47:47 AM     25258     48.22      2.11  node
    11:47:47 AM     25260     48.34      1.99  node
96 context switches per second.

Compare that to a multithreaded Clojure program which uses a LinkedTransferQueue--which eats 97% of each core easily. Note that the times here include ~3 seconds of compilation and jvm startup.

    $ time lein2 run queue
    10000000
    "Elapsed time: 55696.274802 msecs"

    real 0m58.540s
    user 1m16.733s
    sys	 0m6.436s
Why is this version over 3 times faster? Partly because it requires only 4 context switches per second.

    $ pidstat -tw -p 26537
    Linux 3.2.0-3-amd64 (azimuth) 	07/29/2012 	_x86_64_	(2 CPU)
    
    11:52:03 AM      TGID       TID   cswch/s nvcswch/s  Command
    11:52:03 AM     26537         -      0.00      0.00  java
    11:52:03 AM         -     26540      0.01      0.00  |__java
    11:52:03 AM         -     26541      0.01      0.00  |__java
    11:52:03 AM         -     26544      0.01      0.00  |__java
    11:52:03 AM         -     26549      0.01      0.00  |__java
    11:52:03 AM         -     26551      0.01      0.00  |__java
    11:52:03 AM         -     26552      2.16      4.26  |__java
    11:52:03 AM         -     26553      2.10      4.33  |__java
And queues are WAY slower than compare-and-set, which involves basically no context switching:

    $ time lein2 run atom
    10000000
    "Elapsed time: 969.599545 msecs"

    real 0m3.925s
    user 0m5.944s
    sys	 0m0.252s

    $ pidstat -tw -p 26717
    Linux 3.2.0-3-amd64 (azimuth) 	07/29/2012 	_x86_64_	(2 CPU)

    11:54:49 AM      TGID       TID   cswch/s nvcswch/s  Command
    11:54:49 AM     26717         -      0.00      0.00  java
    11:54:49 AM         -     26720      0.00      0.01  |__java
    11:54:49 AM         -     26728      0.01      0.00  |__java
    11:54:49 AM         -     26731      0.00      0.02  |__java
    11:54:49 AM         -     26732      0.00      0.01  |__java
 
TL;DR: node.js IPC is not a replacement for a real parallel VM. It allows you to solve a particular class of parallel problems (namely, those which require relatively infrequent communication) on multiple cores, but shared state is basically impossible and message passing is slow. It's a suitable tool for problems which are largely independent and where you can defer the problem of shared state to some other component, e.g. a database. Node is great for stateless web heads, but is in no way a high-performance parallel environment.


Yep, serialized message passing between threads is slower - you didn't have to go through all that work. But it doesn't matter because that's not the bottleneck for real websites.

Also Node starts up in 35ms and doesn't require all those parentheses - both of which are waaaay more important.


It is definitely a bottleneck I face daily in designing crowdflower. Our EC2 bill is much higher than it could be because we have to rely so much on process-level forking.

A great example of this is resque. It'd be great if we could have multiple resque workers per process per job type. This would save a ton of resources and greatly improve processing for very expensive classes of jobs. It's a very real-world consideration. But instead, because our architecture follows this "share nothing in my code, pass that buck to someone else" model like a religion, we waste a lot of computing resources and lose opportunities for better reliability.

What I find most confusing about this argument is that I challenge you to find me a website written in your share nothing architecture that, at the end of the day, isn't basically a bunch of CRUD and UX chrome around a big system that does in-process parallelism for performance and consistency considerations. Postgresql, MySQL, Zookeeper, Redis, Riak, DynamoDB ... all these things are where the actual heavy lifting gets done.

Given how pivotal these things are to modern websites, it's bizarre for you to suggest it is not something to consider.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: