Richard Huang

Upgrade to capistrano3

2013-11-02T00:00:00+00:00

New things

I upgraded capistrano from 2.x to 3.0 for one project, and it was a huge change. Here are the new things:

1. New structure. If you use cap install to generate the base structure, you will see some new syntax.

In Capfile, you need to require all dependencies/plugins you need for deployment.

require 'capistrano/setup'
require 'capistrano/deploy'

require 'capistrano/rvm'
require 'capistrano/bundler'
require 'capistrano/rails'

Dir.glob('lib/capistrano/tasks/*.cap').each { |r| import r }

It also generates 2 stage files, config/deploy/production.rb and config/deploy/staging.rb, which means you don't need capistrano-ext anymore: capistrano 3 supports multiple stages itself.

2. New plugins. Capistrano plugins written for 2.x can't be used in capistrano 3, including the capistrano integrations built into bundler and rvm. Fortunately the capistrano team already wrote bundler, rvm and rails plugins. So you should remove the old capistrano 2.x plugins

# capistrano 2.x
require 'bundler/capistrano'
require 'rvm/capistrano'

and use new capistrano 3 plugins

# capistrano 3
require 'capistrano/bundler'
require 'capistrano/rvm'

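3. New flow. Tasks now hook into a fixed sequence of steps. The sequence below is the one for a rollback; a normal deploy runs deploy:updating, deploy:updated and deploy:finishing in place of the reverting, reverted and finishing_rollback steps.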

# capistrano 3
deploy:starting
deploy:started
deploy:reverting           - revert server(s) to previous release
deploy:reverted            - reverted hook
deploy:publishing
deploy:published
deploy:finishing_rollback  - finish the rollback, clean up everything
deploy:finished

4. New syntax. Capistrano 3 introduces lots of new syntax and new variables.

# capistrano 2.x
set :repository, "git@github.com:railsbp/rails-bestpractices.com.git"
# capistrano 3
set :repo_url, "git@github.com:railsbp/rails-bestpractices.com.git"

capistrano 3 uses the repo_url variable instead of repository.

# capistrano 3
set :linked_files, %w{config/database.yml config/memcache.yml}
set :linked_dirs, %w{bin log tmp/pids tmp/cache tmp/sockets vendor/bundle public/system}

linked_files and linked_dirs are very useful: capistrano automatically creates symbolic links for these files and dirs, where in capistrano 2.x you had to write your own task.
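For comparison, here is a rough sketch in plain Ruby of what linking shared files and dirs into a release amounts to, roughly the kind of task that had to be written by hand in capistrano 2.x. The link_shared helper and the temporary directory layout are made up for illustration; this is not capistrano's implementation.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper: symlink each shared entry into the release directory.
def link_shared(shared_path, release_path, entries)
  entries.each do |entry|
    target = File.join(release_path, entry)
    FileUtils.mkdir_p(File.dirname(target))
    FileUtils.rm_rf(target) # drop any copy that came with the checkout
    FileUtils.ln_s(File.join(shared_path, entry), target)
  end
end

# Throwaway layout that mimics shared/ and releases/<timestamp>/.
root    = Dir.mktmpdir
shared  = File.join(root, 'shared')
release = File.join(root, 'releases', '20131102000000')
FileUtils.mkdir_p(File.join(shared, 'config'))
FileUtils.mkdir_p(File.join(shared, 'log'))
FileUtils.mkdir_p(release)
File.write(File.join(shared, 'config', 'database.yml'), "production:\n  adapter: mysql2\n")

link_shared(shared, release, %w{config/database.yml log})
```

After this runs, config/database.yml and log inside the release are symlinks pointing into shared/, so every release sees the same files.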

# capistrano 2.x
namespace :css_sprite do
  task :build, :roles => :app do
    run "cd #{release_path}; #{rake} RAILS_ENV=#{rails_env} css_sprite:build"
  end
end

# capistrano 3
namespace :css_sprite do
  task :build do
    on roles(:app) do
      within release_path do
        with rails_env: fetch(:rails_env) do
          execute :rake, "css_sprite:build"
        end
      end
    end
  end
end

capistrano 3 leans on its dsl (on, within, with, etc.) rather than on raw shell strings.

Capistrano 3 also added parallel and sequential execution, and other features.

Problems

But I also ran into some problems with capistrano 3

1. capistrano 3 doesn't allow invoke inside an on() block, so sometimes I have to write duplicated code, see this pull request.

2. capistrano creates linked_files and linked_dirs only on app servers, so when deploy:migrate is executed on the db server, it can't find config/database.yml, here is my temp solution.

Use paperclip without activerecord

2013-10-31T00:00:00+00:00

Recently I built an image upload api which didn't use activerecord, but I didn't want to handle resizing image thumbnails by myself, so I decided to reuse paperclip.

Paperclip is an easy file attachment library for ActiveRecord, but we used activemodel without activerecord. I found a gist which gave me a simple solution, but it was not enough, so we continued the hacking work.

1. define the attachment path and url. Paperclip uses the AR id partition in its default path, but an activemodel object doesn't have an id attribute, so I had to override the attachment path and url

# config/initializers/paperclip.rb
Paperclip.interpolates :uuid_partition do |attachment, style|
  attachment.instance.uuid.scan(/.{1,8}/m).join("/")
end

# app/models/image.rb
has_attached_file :attachment,
  styles: { three_dot_five_inch: "640x960>", four_inch: "640x1136>" },
  path: ":rails_root/public/system/:attachment/:uuid_partition/:style/:filename",
  url: "/system/:attachment/:uuid_partition/:style/:filename"

def initialize
  @uuid = UUID.new.generate.gsub('-', '')
end

Instead of an auto incremented id, I used a uuid partition for the attachment path and url, because it's more scalable.
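To see what the partition looks like, here is a small sketch of the same splitting rule in plain Ruby. It uses SecureRandom from the standard library as a stand-in for the uuid gem, and the uuid_partition method name is only for illustration.

```ruby
require 'securerandom'

# Same splitting rule as the :uuid_partition interpolation above.
def uuid_partition(uuid)
  uuid.scan(/.{1,8}/m).join("/")
end

# 32 hex characters, standing in for UUID.new.generate.gsub('-', '')
uuid      = SecureRandom.uuid.delete('-')
partition = uuid_partition(uuid)

example = uuid_partition("0123456789abcdef0123456789abcdef")
# example == "01234567/89abcdef/01234567/89abcdef"
```

A 32 character uuid always becomes 4 directory levels of 8 characters each.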

2. run_callbacks during save, which will also trigger paperclip callbacks

define_model_callbacks :save, only: [:after]

def save
  run_callbacks :save do
  end
end

Then you can handle the attachment with activemodel and paperclip. I pasted all the code in a gist here.

railsrumble 2013 - designapis

2013-10-21T00:00:00+00:00

I took part in railsrumble last weekend, my entry is designapis.com, vote it up here.

Why did I build it? It comes from my working experience. I worked on several projects that needed http apis designed for ios clients. As we worked remotely, we had to write down the apis so that ios developers and ruby developers could work independently.

Before, we used google docs and github gists. They are both good for sharing within a team, but they don't provide any style/format, so I had to set the format by myself. It's not convenient: I had to remember all the formats for requests, responses and parameters, otherwise the styles came out wrong.

So I decided to build a service to simplify api design/documentation. As railsrumble lasts only 2 days, I just implemented a small set of the features I wanted. Currently you can generate CRUD apis from a template (inspired by the rails generator) and add any request, response and parameter. It also gives you some hints that you should take care of error responses like 404 and 422.

Feel free to try it without registration, any feedback is welcome.

mongoid 3.0.x not set relation properly

2013-09-08T00:00:00+00:00

I was trying to fix a bullet test failure with mongoid 3.0.23. The failed test checks a 1-1 relationship, as follows

it "should detect non preload association" do
  Mongoid::Company.all.each do |company|
    company.address.name
  end
  ......
end

Reading the logs, I saw it generated 2 unexpected queries

MOPED: 127.0.0.1:27017 QUERY        database=bullet collection=mongoid_companies selector={} flags=[:slave_ok] limit=0 skip=0 batch_size=nil fields=nil (196.5840ms)
MOPED: 127.0.0.1:27017 QUERY        database=bullet collection=mongoid_companies selector={} flags=[:slave_ok] limit=0 skip=0 batch_size=nil fields=nil (0.8612ms)
MOPED: 127.0.0.1:27017 QUERY        database=bullet collection=mongoid_addresses selector={"$query"=>{"company_id"=>"522c78a4c41a6b019b000014"}, "$orderby"=>{:_id=>1}} flags=[:slave_ok] limit=-1 skip=0 batch_size=nil fields=nil (2.9750ms)
MOPED: 127.0.0.1:27017 QUERY        database=bullet collection=mongoid_companies selector={"$query"=>{"_id"=>"522c78a4c41a6b019b000014"}, "$orderby"=>{:_id=>1}} flags=[:slave_ok] limit=-1 skip=0 batch_size=nil fields=nil (1.2510ms)
MOPED: 127.0.0.1:27017 QUERY        database=bullet collection=mongoid_addresses selector={"$query"=>{"company_id"=>"522c78a4c41a6b019b000015"}, "$orderby"=>{:_id=>1}} flags=[:slave_ok] limit=-1 skip=0 batch_size=nil fields=nil (0.9012ms)
MOPED: 127.0.0.1:27017 QUERY        database=bullet collection=mongoid_companies selector={"$query"=>{"_id"=>"522c78a4c41a6b019b000015"}, "$orderby"=>{:_id=>1}} flags=[:slave_ok] limit=-1 skip=0 batch_size=nil fields=nil (0.2601ms)

As you can see, every time it queries an address, it also queries a company, even though it has already queried all companies, how stupid! It is caused by Mongoid::Relations::Accessors#set_relation not setting the relation properly. I don't want to explain the details here, but the solution is simple, just update mongoid to 3.1.x. Here are the logs for the same test running on mongoid 3.1.4

MOPED: 127.0.0.1:27017 QUERY        database=bullet collection=mongoid_companies selector={} flags=[:slave_ok] limit=0 skip=0 batch_size=nil fields=nil (0.4551ms)
MOPED: 127.0.0.1:27017 QUERY        database=bullet collection=mongoid_companies selector={} flags=[:slave_ok] limit=0 skip=0 batch_size=nil fields=nil (0.3150ms)
MOPED: 127.0.0.1:27017 QUERY        database=bullet collection=mongoid_addresses selector={"$query"=>{"company_id"=>"522c7b8ec41a6b3712000014"}, "$orderby"=>{:_id=>1}} flags=[:slave_ok] limit=-1 skip=0 batch_size=nil fields=nil (0.8950ms)
MOPED: 127.0.0.1:27017 QUERY        database=bullet collection=mongoid_addresses selector={"$query"=>{"company_id"=>"522c7b8ec41a6b3712000015"}, "$orderby"=>{:_id=>1}} flags=[:slave_ok] limit=-1 skip=0 batch_size=nil fields=nil (0.5548ms)

Great, the issue is solved.

Safari video tag without referer header

2013-08-07T00:00:00+00:00

We're building a website which needs to play video online. It's pretty easy in modern browsers that support html5, like chrome and safari: they all support the video tag, which can play video files directly.

<video controls>
  <source src="RESOURCE URL HERE" />
</video>

It's supposed to work in most cases, but our video resources are uploaded to s3, and our s3 policy only allows access to them when the HTTP referer header is one of our websites. This prevents our videos from being played on other websites.

I noticed the videos played well on chrome, but not on safari. After some digging, I found chrome sends requests for video resources with the expected http headers, including the referer header, but safari's video requests don't carry these headers, so they are refused by s3.

One solution is to use a flash video player, which sends requests with proper http headers. Flash is old but still installed on most computers, however it won't help on ios devices (iphone and ipad). So the only way to make it work on ios is to change our s3 policy to also allow the special case of an empty referer header.
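The resulting rule can be pictured as a small predicate. This is only an illustration of the logic in Ruby, not S3 policy syntax, and the host names and the referer_allowed? method are hypothetical.

```ruby
require 'uri'

ALLOWED_HOSTS = %w{www.example.com example.com}.freeze # hypothetical site hosts

# Returns true when a request with this Referer header should be served.
def referer_allowed?(referer, allow_empty: true)
  # Safari's video requests arrive here with no referer at all.
  return allow_empty if referer.nil? || referer.strip.empty?
  ALLOWED_HOSTS.include?(URI.parse(referer).host)
rescue URI::InvalidURIError
  false
end

from_site   = referer_allowed?('http://www.example.com/videos/1')
from_other  = referer_allowed?('http://other-site.test/embed')
from_safari = referer_allowed?(nil)
```

The trade-off is visible in the first line of the method: once the empty referer is allowed, any client that sends no referer gets through, not only safari.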

JQuery AMD Plugin Template

2013-08-06T00:00:00+00:00

Several years ago I posted how to write a jquery plugin template, but browser javascript has been evolving in recent years, and developers are more likely to use the asynchronous module definition (AMD) API, e.g. with Require.js. So the jquery plugin template should also be updated, like

(function (factory) {
  if (typeof define === 'function' && define.amd) {
    // AMD. Register as anonymous module.
    define(['jquery'], factory);
  } else {
    // Browser globals.
    factory(jQuery);
  }
}(function ($) {
  $.fn.pluginName = function(options) {
    var defaults = {
      // define default options
    }

    var o = $.extend({}, defaults, options);

    return this.each(function() {
      var e = $(this);
      // write logic here
    });
  };
}));

The difference is that the plugin uses the asynchronously loaded jquery module if AMD is available, and otherwise uses the global jQuery as usual.

Migrate rails-bestpractices.com to rails4

2013-07-14T00:00:00+00:00

Over the last 2 weeks I migrated rails-bestpractices.com from rails 3.2.13 to rails 4. Here is some experience I'd like to share with you.

Make sure you have good test code

rails-bestpractices.com has a lot of rspec and cucumber test code, which can find most of the warnings and errors after the migration.

Update Gems

First, update rails to 4.0.0 in Gemfile. Soon you will find you have to update many other gems: devise, compass-rails, cucumber-rails, etc. Some of them are rc versions or rails4 branches.

You also need to remove some gems, like strong_parameters and turbo-sprockets-rails3.

Update bin executables

A Rails 4 app looks for executables in the bin/ directory, run

rake rails:update:bin

to get bin/bundle, bin/rails and bin/rake

Remove unused configs

config.whiny_nils
config.active_record.mass_assignment_sanitize
config.active_record.auto_explain_threshold_in_seconds

Add new configs

config.eager_load = false

to config/environments/development.rb and config/environments/test.rb

config.eager_load = true

to config/environments/production.rb

New secret_token

Rails 4 encrypts the contents of cookie-based sessions, so you need to use secret_key_base instead of secret_token.

Application.config.secret_token = 'xxx'
# =>
Application.config.secret_key_base = 'yyy'
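The new value should be a long random string (rake secret prints one). As a minimal sketch, assuming you simply want to generate a value by hand, the standard library can produce one:

```ruby
require 'securerandom'

# 64 random bytes rendered as 128 hex characters.
secret_key_base = SecureRandom.hex(64)
```

Keep the generated value out of public repositories, since anyone who knows it can forge session cookies.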

Remove assets group

Rails 4 has removed the assets group, you should remove it from Gemfile and config/application.rb

# Gemfile
group :assets do
  gem 'sass-rails'
  gem 'coffee-rails'
  gem 'uglifier'
end
# =>
gem 'sass-rails'
gem 'coffee-rails'
gem 'uglifier'

# config/application.rb
if defined?(Bundler)
  # If you precompile assets before deploying to production, use this line
  Bundler.require(*Rails.groups(:assets => %w(development test)))
  # If you want your assets lazily compiled in production, use this line
  # Bundler.require(:default, :assets, Rails.env)
end
# =>
Bundler.require(:default, Rails.env)

Filter parameters in initializer

Rails 4 prefers setting filter_parameters in an initializer.

# config/application.rb
config.filter_parameters += [:password]
# =>
# config/initializers/filter_parameter_logging.rb
Rails.application.config.filter_parameters += [:password]

Fix routes

Rails 4 doesn't allow using match without setting via to get or post, so use get or post instead, like

match '/auth/failure' => redirect('/')
# =>
get '/auth/failure' => redirect('/')

New scope syntax

Rails 4 requires the body of a scope to be a proc

scope :published, where(:published => true)
# =>
scope :published, -> { where(:published => true) }
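A likely motivation for the change: a bare where(...) is evaluated once, when the class is loaded, so anything time dependent inside it goes stale, while a lambda is evaluated on every call. Here is a toy sketch of that difference in plain Ruby. ToyModel and its scope method are made up for illustration and are not ActiveRecord.

```ruby
# Toy stand-in for a model class; not ActiveRecord.
class ToyModel
  # Stores a callable and runs it again on each call of the scope.
  def self.scope(name, body)
    define_singleton_method(name) { body.call }
  end
end

counter = 0
eager = (counter += 1)                      # old form: evaluated once, at definition time
ToyModel.scope(:fresh, -> { counter += 1 }) # new form: evaluated on every call

first_call  = ToyModel.fresh # 2
second_call = ToyModel.fresh # 3
# eager is still 1, it is never re-evaluated
```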

Enable turbolinks

Assuming you are not using a client side MVC framework like Backbone or Ember, turbolinks can speed up your page loads.

1. add turbolinks gem in Gemfile

gem "turbolinks"

2. require turbolinks in application.js

//= require turbolinks

Please let me know if you have any problems migrating to rails 4 :-)

How to render, upload and download large files on heroku with s3

2013-06-18T00:00:00+00:00

I'm consulting on a rails project on heroku. It involves generating a large pdf for the customer, so as you may already guess, it led to the 30s timeout on heroku.

At first, I handled it with common sense: move the pdf rendering to a background job, and have the client side poll the status of the job. When the job is complete, render the pdf.

Everything worked fine on my laptop, but after pushing to heroku, it succeeded in running the job and polling the status, but in the end it couldn't find the generated pdf. Then I realized the web dyno and worker dyno run on different servers, which means the web dyno can't find pdfs generated on the worker dyno. Okay, we need a cloud storage service, and of course s3 is the first choice.

I used aws-sdk as the s3 client, and it's pretty easy to upload the pdf to s3. As pdfs are private on s3, the web dyno has to download the pdf and render it to the client after polling succeeds. The timeout problem still exists if the pdf file is large or the network is not good (it takes a long time to send the pdf content to the client).

After googling some solutions, I decided to use S3 Temporary Security Credentials: it creates a resource url with a temporary credential, and you can set the expire date for the resource. It sacrifices some privacy: the resources are still private, but they can be accessed by anyone holding the url with the temporary credential. We set the expire date to 1 hour later, so it's not a big deal.

I didn't find resource urls with a temporary security credential in the aws-sdk gem, so I had to implement it by myself.

require 'openssl'
require 'base64'
require 'cgi'

# expire_date is a Unix timestamp (seconds since epoch)
def signed_url(path, expire_date)
  # OpenSSL::Digest::Digest is deprecated, use OpenSSL::Digest directly
  digest = OpenSSL::Digest.new('sha1')
  string_to_sign = "GET\n\n\n#{expire_date}\n/#{S3_BUCKET}/#{path}"
  hmac = OpenSSL::HMAC.digest(digest, S3_SECRET_ACCESS_KEY, string_to_sign)
  signature = CGI.escape(Base64.encode64(hmac).strip)
  "https://#{S3_BUCKET}.s3.amazonaws.com/#{path}?AWSAccessKeyId=#{S3_ACCESS_KEY_ID}&Expires=#{expire_date}&Signature=#{signature}"
end

It can generate signed urls for different resources with different expire dates. Now I just tell the client the signed_url, and the client renders the pdf from s3 rather than from heroku, so no timeout anymore, awesome!
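As a usage sketch, here is a parameterized variant that runs without the S3_* constants. The bucket name, keys and path are placeholders, and the s3_signed_url name is only for illustration. It encodes with pack('m0') and URI.encode_www_form_component instead of Base64 and CGI, which give the same result for this input while needing fewer requires.

```ruby
require 'openssl'
require 'uri'

# Hypothetical variant of signed_url that takes everything as arguments.
def s3_signed_url(bucket, access_key_id, secret_access_key, path, expires_at)
  string_to_sign = "GET\n\n\n#{expires_at}\n/#{bucket}/#{path}"
  hmac = OpenSSL::HMAC.digest(OpenSSL::Digest.new('sha1'), secret_access_key, string_to_sign)
  base64 = [hmac].pack('m0') # base64 without a trailing newline
  signature = URI.encode_www_form_component(base64)
  "https://#{bucket}.s3.amazonaws.com/#{path}?AWSAccessKeyId=#{access_key_id}&Expires=#{expires_at}&Signature=#{signature}"
end

expires_at = Time.now.to_i + 3600 # Unix timestamp, one hour from now
url = s3_signed_url('example-bucket', 'AKIDEXAMPLE', 'not-a-real-secret', 'pdfs/report.pdf', expires_at)
```

The expire date has to be passed as seconds since the epoch, because the same number goes into both the signed string and the Expires query parameter.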

So the whole process is as follows

Client clicks the "Print PDF" button, which sends a request to the web dyno.
Web dyno asks the worker dyno to render the pdf.
Client keeps polling the job status.
Worker dyno renders the pdf, uploads it to s3 and generates a signed url.
Client gets the job complete message with the s3 signed url.
Client prints the pdf from s3.
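The steps above can be sketched as a small in-process simulation. A thread stands in for the worker dyno and a hash stands in for the shared job store; the method names and the placeholder url are all hypothetical, and a real app would use a background job library and a real signed url.

```ruby
require 'securerandom'

JOBS = {}            # in-memory stand-in for a shared job store
JOBS_LOCK = Mutex.new

# "Web dyno": register the job, hand it to the worker, return the job id.
def enqueue_pdf_job
  job_id = SecureRandom.hex(4)
  JOBS_LOCK.synchronize { JOBS[job_id] = { status: 'working' } }
  Thread.new do      # "worker dyno"
    sleep 0.05       # pretend to render the pdf and upload it to s3
    url = "https://example-bucket.s3.amazonaws.com/pdfs/#{job_id}.pdf?Signature=placeholder"
    JOBS_LOCK.synchronize { JOBS[job_id] = { status: 'complete', url: url } }
  end
  job_id
end

# "Client": poll until the job is complete, then return the signed url.
def poll_for_url(job_id, interval = 0.01, max_attempts = 500)
  max_attempts.times do
    job = JOBS_LOCK.synchronize { JOBS[job_id] }
    return job[:url] if job[:status] == 'complete'
    sleep interval
  end
  nil                # gave up waiting
end

job_id  = enqueue_pdf_job
pdf_url = poll_for_url(job_id)
```

Each request in this flow is short, which is what keeps the web dyno under the 30s limit.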

My presentation at reddotrubyconf 2013

2013-06-07T00:00:00+00:00

These are the notes from my presentation at reddotrubyconf 2013, building asynchronous apis.

3. Several years ago when I started learning rails, many people said rails was not fast, but that it could significantly speed up development. The famous words are "Hardware is cheap, programmers are expensive".

4. It is true when your business is still young. At first you may have only one server, then you can buy more servers and distribute web, app and db to different servers. It's the easiest way to handle more traffic.

5. but...

6. one day your business gets successful, more and more users bring more and more traffic, and you have to buy more and more web servers, app servers and db servers.

7. At this stage, machines are not cheap any more, so programmers try to find ways to improve server performance, increase concurrency and reduce the number of machines.

8. So you see linkedin moved from rails to node, 27 servers cut and up to 20x faster.

9. Iron.io moved from rails to go, 28 servers cut.

10. Does it mean ruby or rails is that slow? Does it mean we should drop ruby and use node.js or go instead?

11. From my experience, the answer is no, this is what I want to share with you today.

12. During my last job, I was lucky to have the opportunity to build the same leaderboard api service on multi process, multi thread and asynchronous non-blocking io ruby servers. At first we built the api service with rails and ree, with mysql as the db. The average response time was 50 ms, and it handled 60k rpm in production with 13 machines, 6 passenger instances per machine. After that we migrated to JRuby 1.7.0, which brought a 40% performance improvement: the average response time decreased to 30 ms, and it handled the same 60k rpm in production with only 10 machines, 1 torquebox instance with 5 threads per machine. Finally, we rewrote the api service on ruby 1.9.3 with goliath, a non-blocking ruby web server, and switched the db to redis. The average response time decreased to 4 ms, and it handled 240k rpm in production with only 4 machines, 4 goliath instances per machine. If the old rails api service had to handle 240k rpm, it would need 52 machines.

13. So I can say: moving from ruby to ruby, 48 servers cut and 10 times faster.
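The arithmetic behind those two numbers, as a quick check. It assumes the old setup scales linearly with traffic, which is the assumption the estimate of 52 machines already makes.

```ruby
old_rpm, old_machines = 60_000, 13   # rails + ree + passenger
new_rpm, new_machines = 240_000, 4   # goliath + redis

machines_needed = old_machines * new_rpm / old_rpm # 13 * 4 = 52
servers_cut     = machines_needed - new_machines   # 52 - 4 = 48
speedup         = 50.0 / 4                         # 12.5, rounded down to "10 times"
```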

14. With rails we ran the api service synchronously and all IO operations were blocking, while with goliath it is asynchronous and IO operations are non-blocking. It's a bit unfair to compare the performance of Rails and Goliath, or Rails and Node.js, directly, but it's good to know that building an api service with asynchronous non-blocking io can significantly increase concurrency.

15. How does blocking IO work? E.g. when a request comes in, a process gets cpu time and runs the code, but when it calls a database query, it has to wait for the query to complete. We all know IO operations are slow, and blocking IO blocks the whole process.

16. In the multi process model, when process A is blocked on IO, the cpu is scheduled from process A to process B and keeps running. When the IO operation in process A completes, the cpu is scheduled back to process A and continues working.

17. The advantages of multiple processes are that they can execute truly in parallel on a multi core cpu, and that the model is easy to manage: we can start or stop a process by sending a signal. The disadvantages are that process switching is expensive, since it involves switching out all of the process resources, and that it consumes a lot of memory: multiple processes mean multiple copies of memory.

18. The multi thread model is similar to the multi process model: when thread A is blocked, the cpu is scheduled from thread A to thread B within one process.

19. The advantages of multiple threads are that thread switching is cheap, since it involves switching out only the resources unique to each thread, and that they consume much less memory, as many resources are shared between threads in one process. The disadvantage is that if you share mutable data across threads, you need to synchronize access to that data for thread safety, which affects performance and concurrency. With ruby 1.9 or 2.0 the GIL is still there, which means only one thread can execute ruby code at a time. The exceptions are JRuby and Rubinius, which have no GIL and can make use of multiple cores.
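A minimal sketch of the synchronization point from note 19: several threads update one shared counter, and a Mutex serializes the read-modify-write so no update is lost. The thread and iteration counts are arbitrary.

```ruby
counter = 0
lock = Mutex.new

threads = 10.times.map do
  Thread.new do
    1_000.times do
      # Only one thread at a time may run this block.
      lock.synchronize { counter += 1 }
    end
  end
end
threads.each(&:join)
# counter is now 10 * 1_000
```

The lock is also where the cost comes from: while one thread holds it, the others wait.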

20. The evented model runs a main loop that never blocks. All io operations are asynchronous: when calling an io operation, instead of waiting, the main loop processes other requests and comes back when the response from the io call is ready.

21. The advantages of the evented model are that there is no blocking io, no context switching, and it uses the least memory compared to the multi process and multi thread models. The disadvantage is that your code will be full of callbacks, which makes it difficult to understand.

22. In the ruby world, we usually use eventmachine to implement non-blocking io, but it's very common to end up with many nested callbacks, like this.

23. Luckily, ruby 1.9 introduced fibers, and a gem named em-synchrony, a fiber aware eventmachine, can help solve the too many callbacks issue.

24. The code works the same as the last example, but using em-synchrony, with no callbacks and more readable.
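The slide code is not included in these notes, so here is a small stand-in for the idea in notes 22 to 24. ToyLoop is a made-up class, not the eventmachine or em-synchrony API: it queues callbacks like an event loop would, and its get method shows the fiber trick of pausing the current fiber until the callback fires.

```ruby
require 'fiber'

# Tiny stand-in for an event loop: callbacks are queued and run later.
class ToyLoop
  def initialize
    @queue = []
  end

  # Pretend async IO: the result is delivered to the callback on a later tick.
  def async_get(key, &callback)
    @queue << lambda { callback.call("value-of-#{key}") }
  end

  # Fiber aware wrapper: reads like a blocking call, but yields to the loop.
  def get(key)
    fiber = Fiber.current
    async_get(key) { |value| fiber.resume(value) }
    Fiber.yield
  end

  def run
    @queue.shift.call until @queue.empty?
  end
end

reactor = ToyLoop.new
results = []

# Callback style: each dependent call nests one level deeper.
reactor.async_get(:user) do |user|
  reactor.async_get(:score) do |score|
    results << [:callbacks, user, score]
  end
end

# Fiber style: the same two calls, written top to bottom.
Fiber.new do
  user  = reactor.get(:user)
  score = reactor.get(:score)
  results << [:fibers, user, score]
end.resume

reactor.run
```

Both styles end up with the same values, but the fiber version has no nesting, which is the readability gain note 24 describes.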

25. Let's clarify the definitions of concurrency and parallelism: concurrency is making progress on 2 operations in overlapping time periods, while parallelism is performing 2 operations at the same instant.

26. The evented model is used to increase concurrency within one process, so in practice we run multiple evented processes in order to utilize all of the cores on the CPU.

27. We already talked about why we should use async non-blocking IO, but how? I wrote a project on github named apis-bench. It implements a simple leaderboard api service with multiple frameworks and runs on multi process, multi thread and asynchronous non-blocking ruby servers.

28. Assume we use rails to build the api service: the router dispatches the request to a controller, the controller creates, reads, updates or destroys models, then the controller generates a json view and sends the response.

29. A good practice is skinny controller and fat model, so I write most logic in models.

30. Write the controller as simple as possible, and use respond_to / respond_with to render a json response.

31. Here is the router.

32. Instead of migrating to asynchronous io directly, let's take a small step first, migrating from rails to the grape framework. Grape is a micro framework for building REST apis, and using grape instead of a rails controller can decrease response time.

33. Grape is responsible for router, controller and view: it gets the request, asks the model to do something and then renders the response.

34. Most developers prefer using activerecord, as it provides many powerful ways to develop models rapidly. We can use activerecord without rails, so here we don't need to change anything in the model layer.

35. Grape provides its own DSL, so we have to replace the rails controller with Grape::API, but as we already followed the skinny controller practice, it should not be too much work.

36. Next step, let's migrate to asynchronous non-blocking io. Goliath is a non-blocking ruby web server framework, and adding goliath can significantly increase the throughput.

37. Here, each HTTP request is handled by goliath and executed within a ruby fiber, then goliath proxies the request to grape. All IO operations are asynchronous.

38. In this migration, we also don't need to change any code except adding a goliath api and telling it to delegate requests to the grape api, very simple.

39. Besides, we must replace our existing blocking io libraries with eventmachine's libraries, like mysql2 to em_mysql2, mongo to em_mongo, etc.

40. We have another option besides grape: sinatra, which can also decrease the response time.

41. Similar to grape, sinatra takes care of router, controller and view.

42. The only place we need to change when moving from rails to sinatra is the controller, where we define the route and the action and render json.

43. After moving to sinatra, it's also easy to migrate to asynchronous non-blocking io by adding sinatra-synchrony. As its name implies, sinatra-synchrony adds em-synchrony to sinatra.

44. sinatra-synchrony is not a web server, so we have to use an evented ruby server like thin. Then sinatra-synchrony executes each http request within a ruby fiber, just like goliath.

45. Adding sinatra-synchrony is also easy: all you need to do is register Sinatra::Synchrony as an extension, then it works.

46. As before, you need to replace blocking io libraries with eventmachine's libraries.

47. Finally, let's see the benchmark results.

48. The first benchmark is a CPU bound action, where db time takes about 10% of the total response time. It's tested with apache benchmark.

49. I tested several groups: sending 1000 requests with 10, 50, 100, 200 and 500 concurrency, and a last one sending 2000 requests with 1000 concurrency. In each group, I tried rails, sinatra, grape, sinatra with threads, grape with threads, sinatra-synchrony with thin and grape with goliath. They all run in a single process: rails, sinatra and grape run in unicorn, sinatra with threads and grape with threads run in rainbows. The value here is the time taken to complete all requests. As you can see, the sinatra api is 40% faster than the rails api and the grape api is 30% faster than the rails api. Sinatra threads and grape threads are about 2 times slower in this case, though they should perform better when running threads with jruby. Asynchronous non-blocking io performs best, especially sinatra-synchrony. When sending requests with 200 concurrency, rails, sinatra, grape, sinatra threads and grape threads all timed out, failing to handle so many requests, but grape with goliath and sinatra-synchrony with thin can handle 1000 concurrency without any errors. With CPU bound actions, threads didn't perform well, but asynchronous non-blocking io can handle many more requests.

50. What about an IO bound action? The following test is an action in which db time takes more than 90% of the total response time. I sent requests at stable rates and measured the performance on newrelic.

51. I sent 1200 rpm to the rails api, but it could only handle 984 rpm, the other requests returned 502 timed out errors.

52. The sinatra api can only handle 1080 rpm.

53. and the grape api handles 1130 rpm. All of them failed to handle 1200 rpm.

54. I set rainbows to work with 200 threads and 200 connections in the db pool, and sinatra threads successfully handled 3000 rpm.

55. Grape threads also succeeded.

56. grape with goliath passed as well.

57. I failed to add newrelic to sinatra-synchrony, but it also succeeded in handling 3000 rpm. Another known issue is that I failed to add fiber_pool to goliath. I'd appreciate it if you can help solve the issue and open a pull request.

58. Okay, the conclusion is that rails is good, it's still a good choice for building an api service rapidly. We can migrate from rails to sinatra or grape to decrease response time, then migrate to sinatra-synchrony or goliath to increase throughput. Finally, I have to say asynchronous non-blocking io is awesome, you should give it a try.

another zero downtime deployment solution

2013-01-30T00:00:00+00:00

I wrote a post about a jruby migration 2 months ago. It mentioned a solution for zero downtime deployment: pull a server out of the load balancers, restart it, and then put it back in. It works but has some cons

you must have more than 1 app host.
the deployment process gets much slower if you have lots of app hosts.
you lose one host's throughput during deployment.

I'm using a different solution for zero downtime deployment now. Instead of processing app hosts one by one

it starts replicated ruby instances on all app hosts.
it reloads the load balancer (proxy) to send traffic to the replicated ruby instances.
it stops the original ruby instances.

It won't slow down your deployment process, it also works well if you only have 1 app host, and you don't lose any throughput during deployment.

The disadvantage is that it needs more memory on your app hosts: it occupies x2 the ruby instances' memory during deployment. Our project is an api service built on ruby, not rails, and memory usage is pretty low, only 50 mb per ruby instance, so x2 memory usage is not a big deal.
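The three steps and the memory peak can be pictured with a toy model of one app host. ToyProxy and the port numbers are hypothetical; ports simply stand in for running ruby instances.

```ruby
# Toy model of the rollout on one host; not a real proxy.
class ToyProxy
  attr_reader :upstreams

  def initialize(upstreams)
    @upstreams = upstreams
  end

  def reload(upstreams)
    @upstreams = upstreams
  end
end

originals = [9001, 9002]
running   = originals.dup
proxy     = ToyProxy.new(originals.dup)

replicas = originals.map { |port| port + 100 } # 1. start replicated instances
running += replicas
peak_instances = running.size                  #    memory peaks here, at x2 instances
proxy.reload(replicas)                         # 2. send traffic to the replicas
running -= originals                           # 3. stop the original instances
```

With 50 mb per instance, the peak in this toy setup is 4 instances, about 200 mb, for a short time.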

another redis automatic failover solution for ruby

2013-01-13T00:00:00+00:00

Redis gets more and more popular as a backend storage, so you need a redis failover solution before you use redis as a critical resource.

Currently the most popular automatic master/slave failover solution for ruby is redis_failover. It's based on ZooKeeper, so if you already have ZooKeeper in your infrastructure, it's great.

But I noticed that redis already has a built-in automatic failover solution, called Redis Sentinel. In case you haven't heard of it, please read the official document. It's simple and has no external dependency. I searched github for ruby clients, but none was working well, so I had to implement one by myself.

The key point is that you never connect to the redis master server directly. Instead, you talk to the redis sentinel servers, ask them where the master server is, and then connect to the redis master server.

When your redis master server goes down, your redis sentinel servers will tell you the new master server, so you just disconnect from the old server and connect to the new master server.

My solution is redis-sentinel, a monkey-patch to the redis-rb gem. Before it connects to a redis server, it first asks the redis sentinels where the master server is, then connects as usual. Try it and give me feedback.
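The discovery step can be sketched like this. It is not the code of the redis-sentinel gem: discover_master is a hypothetical helper, and the sentinels are plain lambdas standing in for real connections. The only real piece is the sentinel command itself, SENTINEL get-master-addr-by-name, which answers with the master's host and port.

```ruby
# Ask each sentinel in turn for the current master; the first answer wins.
# A sentinel here is anything with a #call method that runs a command.
def discover_master(sentinels, master_name)
  sentinels.each do |sentinel|
    begin
      host, port = sentinel.call('sentinel', 'get-master-addr-by-name', master_name)
      return [host, port.to_i] if host
    rescue StandardError
      next # this sentinel is unreachable, try the next one
    end
  end
  raise "no sentinel could name a master for #{master_name}"
end

# Stand-ins for real sentinel connections (hypothetical addresses).
down_sentinel = lambda { |*_args| raise IOError, 'connection refused' }
up_sentinel   = lambda { |*_args| ['10.0.0.2', '6379'] }

master = discover_master([down_sentinel, up_sentinel], 'mymaster')
```

Running the same lookup again after a connection error is what lets the client follow a failover to the new master.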

newrelic-grape released

2012-12-21T00:00:00+00:00

No instrumentation, no performance tuning!

This is my first time using grape to build an api service. The grape repo has more than 2k watchers, but I'm surprised there is no existing newrelic support for grape. I just found some gists that do it, and this blog post gave me the idea to add newrelic instrumentation as grape middleware, but it's not the standard way newrelic recommends.

So I released newrelic-grape gem to help you integrate newrelic into grape.

What you need to do is

require "newrelic-grape"
require "rpm_contrib"

and monitor the performance on newrelic.

How I find out a memory leak in grape

2012-12-16T00:00:00+00:00

I've been helping my customer build a high performance api service these weeks. We are close to release, but when I did a load test this Wednesday, I found the memory kept growing while I sent traffic and never went down. It was obviously a memory leak.

Luckily I could reproduce the memory leak on my local machine, so I could investigate it easily. Our api service is simple, it only contains a model layer (AR and redis) and an api layer (based on grape). At first, I disabled the model layer, but the memory leak was still there, so I was pretty sure the leak was in the api layer.

A memory leak is never easy to find, especially when I'm not sure where it is, in my own code or in some dependent library. I needed some help from tools.

First, I used heap_dump to dump the ruby heap after sending 10 minutes of traffic, and searched for keywords used in our repository. I noticed every request path string stayed in memory, why? Was there an array or hash holding them? heap_dump couldn't tell me.

Then I tried ruby 1.9.3's ObjectSpace to find more info. I changed the behavior of Grape::API.call to print the live object counts for each request.

def call_with_gc(env)
  GC.start
  result = call_without_gc(env)
  p ObjectSpace.count_objects
  result
end

The following are parts of the result

{:TOTAL=>331126, :FREE=>218067, :T_OBJECT=>3339, :T_CLASS=>3394, :T_MODULE=>474, :T_FLOAT=>195, :T_STRING=>55324, :T_REGEXP=>1135, :T_ARRAY=>20188, :T_HASH=>926, :T_STRUCT=>125, :T_BIGNUM=>22, :T_FILE=>4, :T_DATA=>16011, :T_MATCH=>13, :T_COMPLEX=>1, :T_RATIONAL=>33, :T_NODE=>11273, :T_ICLASS=>602}
[23153:INFO] 2012-12-16 21:59:55 :: Status: 200, Content-Length: 19, Response Time: 42.43ms
{:TOTAL=>331126, :FREE=>218066, :T_OBJECT=>3339, :T_CLASS=>3394, :T_MODULE=>474, :T_FLOAT=>195, :T_STRING=>55325, :T_REGEXP=>1135, :T_ARRAY=>20188, :T_HASH=>926, :T_STRUCT=>125, :T_BIGNUM=>22, :T_FILE=>4, :T_DATA=>16011, :T_MATCH=>13, :T_COMPLEX=>1, :T_RATIONAL=>33, :T_NODE=>11273, :T_ICLASS=>602}
[23153:INFO] 2012-12-16 21:59:56 :: Status: 200, Content-Length: 20, Response Time: 43.29ms
{:TOTAL=>331126, :FREE=>218065, :T_OBJECT=>3339, :T_CLASS=>3394, :T_MODULE=>474, :T_FLOAT=>195, :T_STRING=>55326, :T_REGEXP=>1135, :T_ARRAY=>20188, :T_HASH=>926, :T_STRUCT=>125, :T_BIGNUM=>22, :T_FILE=>4, :T_DATA=>16011, :T_MATCH=>13, :T_COMPLEX=>1, :T_RATIONAL=>33, :T_NODE=>11273, :T_ICLASS=>602}
[23153:INFO] 2012-12-16 21:59:57 :: Status: 200, Content-Length: 20, Response Time: 45.74ms
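
Reading these hashes by eye is tedious, so two snapshots can also be diffed. This is only a sketch: growing_types is a hypothetical helper name, and the sample numbers are copied (abbreviated) from the first two snapshots in the output above.

```ruby
# Hypothetical helper: report which object types grew between two
# ObjectSpace.count_objects snapshots, ignoring the :TOTAL and :FREE slots.
def growing_types(before, after)
  after.each_with_object({}) do |(type, count), diff|
    next if type == :TOTAL || type == :FREE
    delta = count - before.fetch(type, 0)
    diff[type] = delta if delta > 0
  end
end

first  = { :TOTAL => 331126, :FREE => 218067, :T_STRING => 55324, :T_ARRAY => 20188 }
second = { :TOTAL => 331126, :FREE => 218066, :T_STRING => 55325, :T_ARRAY => 20188 }

p growing_types(first, second)
```

For these two snapshots the only type that grows is T_STRING, by one object per request.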

As you can see, on every request there was one string that couldn't be garbage collected, but I still didn't know where it was. I commented out my logic code in the api layer and just returned an empty json, and the string leak still existed. Then I went to the grape source code and replaced the body of the Grape::API#call method with the following code

def call(env)
  [200, {}, ""]
end

After that, the string memory leak disappeared, so it was very likely that the memory leak was in the grape gem.

The next step was pretty easy: replace the grape middlewares one by one. A grape middleware has 3 methods, call!, before and after, which are called on every request. I replaced all of them to figure out where the leak was.

Finally, I found that the format_from_extension method in the grape Formatter middleware caused the memory leak. It generates a symbol no matter whether there is an extension in the request path, e.g.

if requesting /v1/blog/posts/1, it will create symbol :"/v1/blog/posts/1"

if requesting /v1/blog/posts/2, it will create symbol :"/v1/blog/posts/2"

......

In case you don't know, symbols are not garbage collected in ruby 1.9.3, so every time it got a request path different than before, it created a symbol in memory that would never be garbage collected.

Problem detected, solution is here.

In conclusion, be careful with ruby symbols, do not convert any uncontrolled string to a symbol.
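
One way to follow this advice is to look the string up in a fixed table instead of calling to_sym on it. The sketch below is only an illustration, not the actual grape fix. KNOWN_FORMATS and safe_format_from_extension are hypothetical names.

```ruby
# Hypothetical whitelist of the formats the api is willing to serve.
KNOWN_FORMATS = { "json" => :json, "xml" => :xml, "txt" => :txt }.freeze

# Returns a symbol only for known extensions, nil otherwise. No symbol is
# ever created from the request path itself.
def safe_format_from_extension(path)
  extension = File.extname(path).delete(".")
  KNOWN_FORMATS[extension]
end

p safe_format_from_extension("/v1/blog/posts/1.json")
p safe_format_from_extension("/v1/blog/posts/1")
```

Because the symbols already exist in the table, an unlimited number of different request paths can never add new symbols.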

JRuby at OpenFeint - a JRuby migration success story

2012-11-14T00:00:00+00:00

TL;DR: OpenFeint gets 40% performance improvement after migrating to JRuby from REE.

About OpenFeint

OpenFeint was the largest mobile social gaming platform in the world. It was acquired by GREE for $104 million last year, and a new global platform is being built to replace OpenFeint. It is still one of the biggest rails applications, with hundreds of thousands of API calls per minute.

OpenFeint platform is using rails 2.3.14 and was running on ree 1.8.7.

Why try JRuby

My main job is to improve the performance and scalability of the OpenFeint platform. This April, I attended RailsConf in Austin, where there was a panel discussion about real world rails apps, with speakers from New Relic, Zendesk, Groupon, etc. They use an architecture similar to ours: ree 1.8.7, rails 2.3, mysql, memcached, redis, rabbitmq and so on. They all complained about the slow gc of ruby 1.8.7, and so did we. After that, there were 2 jruby sessions that interested me.

Up and to the right - how Spiceworks is scaling 200 million requests per month, which showed how they migrated their rails app to jruby and got a 20% performance improvement.
Complex Made Simple: Sleep Better with Torquebox, which introduced torquebox, a ruby application server that is built on JRuby and JBoss AS 7.

When I went back to the hotel, I googled jruby performance and found the torquebox performance benchmark, which looked pretty exciting. At that point I decided to try jruby on the OpenFeint platform.

Note: you probably know new relic and zendesk have already migrated to ruby 1.9.

Quick and dirty performance test with JRuby

I always prefer running performance tests myself rather than blindly believing benchmarks online. So the first thing I wanted to do was a quick performance test with JRuby on the OpenFeint platform.

As expected, the OpenFeint platform didn't work with JRuby out of the box. To quickly verify whether JRuby could give us a big performance improvement, I fixed incompatible ruby gems, e.g. adding the jruby-openssl gem, removing the SystemTimer gem and using activerecord-jdbcmysql-adapter instead of the mysql gem. I also did some dirty hacks, e.g. I disabled database sharding, background jobs and other non working parts, since I just wanted a quick performance test. Then I deployed the app to one of our qa servers. The result of the quick performance test was as follows

response time of ree + passenger is 331ms
response time of jruby + torquebox is 51.5ms

I was shocked that JRuby was so fast, and that made it easy to persuade my manager to migrate the OpenFeint platform to JRuby.

Note: our qa environment is quite different from the production environment. Databases are shared between qa servers, but memcached, redis, rabbitmq and the app server all run on one host, and ree on the qa server didn't have any gc tuning.

JRuby migration strategy

After the quick performance test, JRuby looked very promising, and I was allowed to focus my work on the JRuby migration. Before I tell you how we migrated to JRuby, let me give you a short introduction to what the OpenFeint platform uses

load balancer servers with nginx.
app servers with nginx + passenger.
memcached servers for caches.
redis servers for feature flags, high score caches, device mapper, etc.
mysql servers for data storage.
rabbitmq servers and workling servers to handle background jobs.

Of course the OpenFeint platform also uses other servers for cron jobs, performance tests, continuous integration, full text search, log analytics, etc.

To handle the massive number of requests, the OpenFeint platform splits app and database servers into different pools according to their functionality.

Each app pool is isolated, the pools don't know about each other. Load balancer servers decide which pool to send a request to according to the request url. Each pool connects to all db servers, e.g. high score app servers fetch high score info from the high score dbs and fetch user/game info from the core dbs.

Considering that we don't have experienced java ops and only 1 or 2 qas can be involved, it is a big risk to migrate the whole OpenFeint platform to JRuby at once. So I decided to do the JRuby migration one app pool at a time.

The advantage of migrating one pool at a time is that OpenFeint gets JRuby's speed earlier, 1 or 2 qas are enough to ensure the app works correctly for one pool, and ops can set up the jruby environment and tune jvm performance on one pool's hosts to accumulate jruby experience.

The disadvantage is that we have to ensure the OpenFeint platform works well on both REE and JRuby, running the app with REE on some pools and with JRuby on others.

Note: only load balancers and mysql servers are dedicated servers, others are VPS.

Fix incompatible gems

Most problems when migrating a rails app to JRuby come from incompatible gems, like c extension gems or non thread-safe ruby gems. I encountered 2 incompatible gems that cost me a lot of time.

1. typhoeus, one of the fastest http client ruby gems. It's a c extension gem, and we used it to synchronize data between the OpenFeint platform and the new global platform. The official document says it is built with FFI and is ready for use with any Ruby implementation. But during the performance test, I found it always crashed the JVM after running for about 1 hour. Based on the crash log, I fixed a missing attach_function here, but it didn't help. I ended up using net-http-persistent in JRuby while still using typhoeus in REE. In the performance test, I was surprised to find that JRuby + net-http-persistent isn't slower than REE + typhoeus.

2. memcached, the fastest memcached client ruby gem, which is also a c extension gem. At first I used jruby-memcache-client, but jruby-memcache-client uses Base64 to encode/decode values, so it can't share a cache with the memcached gem. Then I chose dalli, which supports both REE and JRuby, but it uses different hash and distribution algorithms, which caused too many cache misses on production. I looked at some other jruby memcached clients, but none of them is compatible with the memcached gem, so I ended up writing the jruby-memcached gem myself, based on spymemcached. I wrote a post about this gem before, check it out here.

Enable threadsafe

By default, threadsafe is disabled in rails 2.3.14, which means every request is locked by Rack::Lock. It's not a big deal when running on multi-process servers like unicorn or passenger, but it throws away JRuby's native multi-threading power. So make sure you enable threadsafe when migrating to JRuby.

Enabling threadsafe means rails won't automatically load libraries under the lib/ directory, you have to load them yourself.

Enabling threadsafe also means you must take thread safety seriously. The OpenFeint platform uses long-running threads to communicate with scribe, with an eager loaded global queue and a lazy loaded thread for each process. When doing performance tests with JRuby + Torquebox, it sometimes generated several lazy loaded threads, which finally caused a memory leak. The solution was to eager load the long-running thread.
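
The eager loading idea can be sketched as follows. This is not the OpenFeint code: LogShipper and SHIPPER are hypothetical names, and the worker just collects messages in an array where a real client would send them to scribe.

```ruby
require "thread"

# Hypothetical stand-in for the scribe client described above.
class LogShipper
  attr_reader :delivered

  def initialize
    @queue = Queue.new
    @delivered = []
    # Eager: the worker thread is created exactly once, here, before any
    # request thread exists, so nothing can race to create a second one.
    @worker = Thread.new do
      while (message = @queue.pop) != :stop
        @delivered << message # a real client would send this to scribe
      end
    end
  end

  def log(message)
    @queue << message
  end

  def shutdown
    @queue << :stop
    @worker.join
  end
end

# Created at boot time (e.g. in an initializer), not on first use.
SHIPPER = LogShipper.new

10.times.map { |i| Thread.new { SHIPPER.log("request #{i}") } }.each(&:join)
SHIPPER.shutdown
puts SHIPPER.delivered.size
```

With a lazy `@worker ||= Thread.new { ... }` inside log, two request threads could both see a nil @worker and each start a thread, which is the kind of duplication described above.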

Pass all tests

It's common sense that you must have good coverage with unit, functional and integration tests before doing a big migration. When all tests passed, I was confident enough to go further.

Note: JRuby always eats much more memory when running tests, for the OpenFeint platform I have to allocate 2 GB of memory

JRUBY_OPTS=-J-Xmx2g jruby --client -S bundle exec rake test

Pick up a JRuby server

There are 4 JRuby servers that I could choose from

Trinidad, built on JRuby::Rack and Tomcat.
Torquebox, built on JBoss AS.
Mizuno, built on Jetty.
Puma, a new ruby web server built for concurrency.

Puma depends on rack ~> 1.2 but rails 2.3.14 depends on rack ~> 1.1.0, so I couldn't try Puma for the OpenFeint platform.

I chose Torquebox over the other 3 servers, for the following reasons.

1. Torquebox runs faster than Trinidad and Mizuno according to our own performance test. I think this is because Torquebox is mostly written in Java while the other servers are written in Ruby.

2. Some Torquebox core team members are paid by Red Hat to work on the Torquebox project, which means we can get better support.

3. The Torquebox project is very actively developed, and always keeps up with the latest JBoss AS server and JRuby.

Note: Recently I replaced torquebox with torquebox-lite, which is a smaller, web-only version of Torquebox, you can easily add other jboss submodules when necessary.

Monitor JVM

Running on the JVM is quite different from running on REE, and you will probably face some new issues, like memory leaks and thread safety. We use New Relic to monitor response time, throughput, etc., but it doesn't help monitor jvm heap / non heap memory and thread stacks. Fortunately we also use scout to monitor our servers, and scout provides a JMX Monitoring plugin which collects the memory usage of the jvm. It is okay for production so far, but we will use zabbix for better monitoring in the future.

In the Java world, there are a lot of monitoring tools: command line tools like jstat, jstack and jmap, and graphical tools like jconsole and visualvm. With them you can easily get heap / non heap memory usage, gc stats, the stack trace of each thread, etc.

It's really important to monitor the JVM when doing performance / stress tests, it can help you find memory leaks and thread safety issues before running on production. Here are 2 examples.

1. memory leak. I noticed that heap memory (both eden and old) reached 100% during a stress test. Although no OutOfMemoryError was raised, it was definitely a memory leak. I used jmap to dump the whole heap and read the dump with Eclipse MAT, here is the result.

It's a typical memory leak, objects in a container can't be garbage collected.

2. thread safety. I also found that the db connection pool in activerecord 2.3.14 is not thread safe. The throughput declined after running for a long time. I used jstack to dump the stack traces of all threads and saw that most threads were blocked in connection_pool as follows.

"http--127.0.0.1-8180-1" daemon prio=10 tid=0x00007f4a17609800 nid=0x725a in Object.wait() [0x0000000049dfc000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
  at java.lang.Object.wait(Native Method)
  - waiting on <0x0000000704f40e18> (a org.jruby.libraries.ThreadLibrary$ConditionVariable)
  ......
  at rubyjit.ActiveRecord::ConnectionAdapters::ConnectionPool#checkout_0978F3C1EFB2CBFA2CD717B12DA76E3113CD78B7.block_1$RUBY$__file__(/home/deploy/rails_apps/openfeint_platform/shared/bundle/jruby/1.8/gems/activerecord-2.3.14/lib/active_record/connection_adapters/abstract/connection_pool.rb:192)
  at rubyjit$ActiveRecord::ConnectionAdapters::ConnectionPool#checkout_0978F3C1EFB2CBFA2CD717B12DA76E3113CD78B7$block_1$RUBY$__file__.call(rubyjit$ActiveRecord::ConnectionAdapters::ConnectionPool#checkout_0978F3C1EFB2CBFA2CD717B12DA76E3113CD78B7$block_1$RUBY$__file__:65535)
  ......

But the number of http threads is equal to the number of db connections, so no thread should be blocked. Considering our situation, I added a monkey patch to connection_pool that uses one db connection per thread. It's not perfect but works well.
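
The idea of one connection per thread can be sketched in plain ruby like this. It is not the actual monkey patch to activerecord: PerThreadConnections is a hypothetical class, Object.new stands in for opening a db connection, and thread_variable_get/thread_variable_set assume ruby 2.0 or later.

```ruby
require "thread"

# Hypothetical sketch: hand every thread its own connection object,
# created on first use and reused for the life of the thread.
class PerThreadConnections
  KEY = :per_thread_connection

  def initialize(&factory)
    @factory = factory
  end

  def connection
    current = Thread.current
    conn = current.thread_variable_get(KEY)
    if conn.nil?
      conn = @factory.call
      current.thread_variable_set(KEY, conn)
    end
    conn
  end
end

opened = Queue.new
pool = PerThreadConnections.new do
  opened << :opened # count how many connections get created
  Object.new        # stands in for a real db connection
end

# Each thread asks twice and should get the same object both times.
pairs = 3.times.map { Thread.new { [pool.connection, pool.connection] } }.map(&:value)
puts "connections opened: #{opened.size}"
```

No thread ever waits for another thread to check a connection back in, which is why this avoids the blocked threads in the dump above. The cost is that the number of connections grows with the number of threads.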

Tune JVM performance

There are several jvm settings you should set for JRuby performance.

1. Xms and Xmx. When we hot deployed the app to Torquebox by touching -knob.yml.dodeploy, it took more than 20 minutes to complete, which was unacceptable. After discussing with the Torquebox support team, I learned that the default value for Xms is 64m and for Xmx is 256m, which are too small. After I increased them to 2g, a hot deploy took only 100 seconds. The root cause is that hot deployment increases memory usage a lot, which causes lots of full GCs.

2. CodeCache. During the performance test, I found the response time suddenly jumped after running for a few minutes, and the torquebox log told me "CodeCache is full. Compiler has been disabled." CodeCache is a part of non heap memory in the HotSpot JVM, it's 64m by default, so I increased it to 256m by setting -XX:ReservedCodeCacheSize=256m, and I haven't seen the response time jump since.

There are a lot of JVM parameters you can tune for your application, talk to and learn from some Java experts.

Performance / Stress test

I mentioned I already did a quick performance test, but it didn't mean much, because qa and production have different environments. So this time I ran performance / stress tests on a reserved host, which has exactly the same environment as the production servers and connects to the production database, memcache and redis servers.

Here are test results for actions in one pool.

	read action	write action
REE 1.8.7 + passenger	448 ms	44 ms
Ruby 1.9.3 + passenger	374 ms	42 ms
JRuby 1.7.0.RC2 + torquebox-lite	187 ms	38 ms

JRuby is much faster than REE 1.8.7 and Ruby 1.9.3 in both read and write actions. It's promising we can get a big performance improvement on production.

Make sure you run your stress tests multiple times and for a long time, some memory leak and thread safety issues don't reproduce every time or don't occur within a short run.

Note: REE in reserved host is already optimized with twitter's settings.

Deployment strategy

Everything was ready, it was time to think about deployment strategy.

In the Java world, you can deploy an app by packaging your source code into a war file and copying the war package to the app server. We could do the same thing with JRuby, but it would break our existing capistrano deployment script.

We kept the existing capistrano deployment script except for the deploy:restart task, replacing

touch tmp/restart.txt

with

touch /opt/torquebox/current/jboss/standalone/deployments/openfeint_platform-knob.yml.dodeploy

Torquebox will detect openfeint_platform-knob.yml.dodeploy, undeploy the old openfeint_platform and deploy the new openfeint_platform, which works very similarly to passenger. But I found that every time we redeployed the app, the non heap memory jumped a lot and the app was super slow (multiple times slower than usual) during the redeployment process.

So I decided to deploy the app by restarting jboss instead of hot deployment.

sudo /etc/init.d/jboss-as-standalone restart

It solved the memory issue and mitigated the slow requests, but introduced a new issue: requests are lost while jboss is restarting. The solution we used is a rolling restart to provide zero downtime deployment, e.g. we have 3 app servers A, B, C

tell load balancers to stop sending http requests to server A.
restart jboss on server A.
tell load balancers to send http requests to server A again once jboss on server A is ready.

Then restart servers B and C one by one following the same steps.
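
The steps above can be sketched as a small ruby loop. This is not our capistrano script: rolling_restart is a hypothetical helper, and the four callables stand in for whatever commands talk to the load balancers and to jboss in a real deployment.

```ruby
# Hypothetical helper: restart servers one at a time so that the others
# keep serving traffic. drain, restart, ready and enable are callables
# supplied by the caller.
def rolling_restart(servers, drain:, restart:, ready:, enable:, poll_interval: 1, max_polls: 60)
  servers.each do |server|
    drain.call(server)   # load balancers stop sending requests
    restart.call(server) # restart jboss on this server
    polls = 0
    until ready.call(server)
      polls += 1
      raise "#{server} did not come back" if polls >= max_polls
      sleep poll_interval
    end
    enable.call(server)  # load balancers send requests again
  end
end

events = []
rolling_restart(%w[A B C],
                drain:   ->(server) { events << "drain #{server}" },
                restart: ->(server) { events << "restart #{server}" },
                ready:   ->(server) { true },
                enable:  ->(server) { events << "enable #{server}" })
puts events
```

Raising when a server never becomes ready stops the loop, so a bad release takes down at most one server instead of the whole pool.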

So far it works perfectly, no memory jump and no lost requests.

JRuby on production

Finally we successfully migrated to JRuby on production and the response time dropped a lot.

It was about a 40% performance improvement. Although it was expected, I was still very excited. Actually, after fully warming up, it runs even faster than what you see in the figure.

The following is the response time compared to ree's from 1 week earlier.

This is the successful migration of one pool on the OpenFeint platform. We have already migrated 5 pools to JRuby, and all got a ~40% performance improvement. I'm still working on migrating the remaining pools and looking forward to moving all OpenFeint servers to JRuby.

Some JRuby servers have been running on OpenFeint platform for more than 2 months, they are running stably and much faster than before according to New Relic's weekly report.

Further

Java 7 introduced the invokedynamic feature, and a lot of people said enabling invokedynamic made JRuby 1.7 run much faster, closer to Java speed. But I failed to enable the invokedynamic feature with Torquebox and saw the following error

18:29:03,515 ERROR [org.torquebox.core.runtime] (Thread-71) Error during execution: ENV['RAILS_ROOT']=RACK_ROOT
ENV['RAILS_ENV']=RACK_ENV
require %q(org/torquebox/web/rails/boot)
: org.jruby.exceptions.RaiseException: (LoadError) load error: haml/buffer -- java.lang.NoClassDefFoundError: org/jruby/runtime/ThreadContext
     at org.jruby.RubyKernel.require(org/jruby/RubyKernel.java:1010) [jruby.jar:]
     at ActiveSupport::Dependencies::Loadable.require(/home/deploy/rails_apps/openfeint_platform/shared/bundle/jruby/1.8/gems/activesupport-2.3.14/lib/active_support/dependencies.rb:182)

The Torquebox team is trying to fix this issue. I will definitely enable invokedynamic with the new Torquebox release, and am looking forward to another big performance improvement.

Some Resources

If you join the JRuby world, the first thing you need to do is to follow Charles Nutter on twitter, he is one of the JRuby core team members and always shares a lot of JRuby knowledge. Also check out his presentations to get latest JRuby features and benchmarks. JRuby wiki pages are helpful to learn everything about JRuby.

Finally, please allow me to thank the JRuby and Torquebox teams for providing such great things, and to thank GREE for allowing me to share this knowledge.

switch_user 0.9 released

2012-10-31T00:00:00+00:00

switch_user provides a convenient way to switch the current user, which speeds up your development and helps reproduce user specific errors on production.

Today switch_user gem 0.9.0 is released, all thanks to Luke Cowell.

He is a collaborator on the switch_user gem, and did a great job on 0.9.0.

did lots of refactors.
added unit tests.
made switch_user a rails engine.

check out the changelog here.

I also updated switch_user example to use switch_user 0.9.0.

zero downtime deployment

2012-10-23T00:00:00+00:00

This is my new post on jrubytips. It teaches you how to achieve zero downtime deployment for jruby servers.

http://jrubytips.com/posts/5-zero-downtime-deployment

set proper value for CodeCache

2012-10-19T00:00:00+00:00

This is my new post on jrubytips. It tells you about the jvm CodeCache, which may affect your server performance.

http://jrubytips.com/posts/4-set-proper-value-for-codecache

newrelic-rake released

2012-10-12T00:00:00+00:00

4 months ago, I released the newrelic-workling gem, which helps us monitor the performance of background jobs. We used it to find a GC performance issue. But we still have some cron jobs, which call rake tasks, running in a black box.

So I created a new project newrelic-rake that adds newrelic instrumentation for rake tasks. Now when I go to newrelic, I can see the rake tasks listed in the Background tasks section, and it shows me the average execution time and call count for all rake tasks.

I can also see the performance breakdown for each rake task.

This rake task probably needs to use persistent net http or some c extension http client, and reduce the GC calls.

It's really important to monitor first, then optimize.

avoid using rubyzip

2012-10-02T00:00:00+00:00

More precisely, I want to say: allocate as few objects as you can. rubyzip is just an example.

We have a background job that compresses webui assets and uploads them to S3, so the mobile sdk can download the assets to update the webui dynamically.

After the iphone5 and ios6 came to the market, we received many more webui requests than before. It was expected, but our background job couldn't keep up with so many asynchronous messages. We could easily scale out by adding more background job servers, but I decided to dive deeply into the webui job to see if I could speed it up and increase throughput.

Thanks to newrelic for providing a great monitoring service, I saw that the webui job took 725ms on average to complete, and 80% of the time was taken by GC calls, WTF. Instead of blaming the ruby gc, I blamed our bad code.

I noticed that we used rubyzip to compress webui assets, and it was the root cause of so much GC.

def create(path, files)
  Zip::ZipFile.open(path, Zip::ZipFile::CREATE) do |z|
    files.each do |file|
      source_path = "#{Rails.root}/public/webui/#{file}"
      expand_dirs(file).each do |dir|
        begin
          z.mkdir dir
        rescue Errno::EEXIST
        end
      end
      z.add file, source_path
    end
  end
end

It sucks, all files are read and compressed inside the ruby VM, too many objects are allocated, which then causes many GC calls. So I tried to use the shell zip command instead of rubyzip.

I did an experiment comparing rubyzip and shell zip. The following are the code examples.

require 'zip/zip'
GC::Profiler.enable
before_stats = ObjectSpace.count_objects
start = Time.now
Zip::ZipFile.open("test.zip", Zip::ZipFile::CREATE) do |z|
  Dir["**/*"].each do |file|
    z.add file, file
  end
end
puts "Total time: #{Time.now - start}"
after_stats = ObjectSpace.count_objects
puts "[GC Stats] #{before_stats[:FREE] - after_stats[:FREE]} new allocated objects."

# Total time: 0.75344
# [GC Stats] 718691 new allocated objects.

GC::Profiler.enable
before_stats = ObjectSpace.count_objects
start = Time.now
# reject directories so only regular files are passed to zip (map would leave nil entries in the list)
files = Dir["**/*"].reject { |file| File.directory?(file) }
`zip test.zip #{files.join(" ")}`
puts "Total time: #{Time.now - start}"
after_stats = ObjectSpace.count_objects
puts "[GC Stats] #{before_stats[:FREE] - after_stats[:FREE]} new allocated objects."

# Total time: 0.349816
# [GC Stats] 2269 new allocated objects.

As you can see, rubyzip allocates > 700k objects for reading and compressing, and it also takes more than twice as long to finish the script, so the shell zip command is a much better solution. So I replaced rubyzip with shell zip in our product.

def create(path, files)
  `cd #{Rails.root}/public/webui && zip #{path} #{files.join(' ')}`
end

After deploying to the background job server, I saw a big performance improvement: it takes only 218ms for the webui job to finish, and only 28% of the time is taken by GC calls. The throughput also increased from 44cpm to 64cpm, so it can keep up with the webui asynchronous messages and we don't need to add more servers, money saved. :-)

So keep in mind, allocating fewer objects means fewer GC calls, which also means better performance.
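
To check how many objects a piece of code allocates, a small helper like the one below can wrap it. This is a sketch and not what I used in the experiments above: allocated_objects is a hypothetical name, and it assumes ruby 2.2 or later, where GC.stat accepts the :total_allocated_objects key.

```ruby
# Hypothetical helper: number of objects allocated while the block runs.
# Assumes ruby 2.2 or later, where GC.stat accepts :total_allocated_objects.
def allocated_objects
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

many = allocated_objects { 10_000.times.map { |i| "asset-#{i}" } }
few  = allocated_objects { 10.times.map { |i| "asset-#{i}" } }
puts "many: #{many}, few: #{few}"
```

Unlike the difference of FREE slots, this counter only ever goes up, so a GC run in the middle of the block does not distort the number.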

Updated: the zipruby gem gives a speed similar to the shell zip command.

require 'zipruby'
GC::Profiler.enable
before_stats = ObjectSpace.count_objects
start = Time.now
Zip::Archive.open("test.zip", Zip::CREATE) do |z|
  Dir["**/*"].each do |file|
    z.add_file file, file
  end
end
puts "Total time: #{Time.now - start}"
after_stats = ObjectSpace.count_objects
puts "[GC Stats] #{before_stats[:FREE] - after_stats[:FREE]} new allocated objects."

# Total time: 0.367729
# [GC Stats] 1116 new allocated objects.

speed up git deployment with depth 1

2012-09-14T00:00:00+00:00

By default, when you deploy your application with capistrano and git, it clones the repository with its entire history onto the production server, but that's pointless. You should never go to a production host and check git log, you just need the latest code there.

As your application grows, git clone with the entire history may take longer than you expect. The following is the time spent on a full clone.

$ time git clone git@github.com:railsbp/rails-bestpractices.com.git
Cloning into 'rails-bestpractices.com'...
remote: Counting objects: 11438, done.
remote: Compressing objects: 100% (3915/3915), done.
remote: Total 11438 (delta 7012), reused 11277 (delta 6886)
Receiving objects: 100% (11438/11438), 5.52 MiB | 127 KiB/s, done.
Resolving deltas: 100% (7012/7012), done.
git clone git@github.com:railsbp/rails-bestpractices.com.git  0.55s user 0.26s system 1% cpu 55.275 total

But if you clone with depth 1, it finishes much faster since only 1 revision is fetched.

$ time git clone --depth 1 git@github.com:railsbp/rails-bestpractices.com.git
Cloning into 'rails-bestpractices.com'...
remote: Counting objects: 1635, done.
remote: Compressing objects: 100% (1243/1243), done.
remote: Total 1635 (delta 265), reused 1367 (delta 189)
Receiving objects: 100% (1635/1635), 3.02 MiB | 134 KiB/s, done.
Resolving deltas: 100% (265/265), done.
git clone --depth 1 git@github.com:railsbp/rails-bestpractices.com.git  0.24s user 0.17s system 1% cpu 34.236 total

It's time to apply this in your capistrano file to speed up your deployment.

set :scm, :git
set :git_shallow_clone, 1

Warning: git_shallow_clone can't work with branch.

enable threadsafe for rails

2012-09-09T00:00:00+00:00

This is my new post on jrubytips. It explains what threadsafe is in rails, and what the benefits of threadsafe are on a jruby server.

http://jrubytips.com/posts/3-enable-threadsafe-for-rails

rolling out with feature flags

2012-09-02T00:00:00+00:00

This is my new post on rails-bestpractices. It tells how we use feature flags to roll out our features.

http://rails-bestpractices.com/posts/697-rolling-out-with-feature-flags

how to write a jruby gem - part 2

2012-08-23T00:00:00+00:00

In my previous post, I introduced how to write a jruby gem with ruby code, today I will show you how to write a jruby extension with java code, which can give you better performance.

Standard Steps

1. create java classes to wrap any java library you need. The java classes must extend RubyObject so that they can be called from jruby. e.g.

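import java.io.IOException;
import java.net.InetSocketAddress;

import net.spy.memcached.MemcachedClient;
import org.jruby.Ruby;
import org.jruby.RubyClass;
import org.jruby.RubyObject;
import org.jruby.runtime.ThreadContext;
import org.jruby.runtime.builtin.IRubyObject;

class Memcached extends RubyObject {
    // MemcachedClient is what we want to wrap
    private MemcachedClient client;

    // java constructor
    public Memcached(final Ruby ruby, RubyClass rubyClass) {
        super(ruby, rubyClass);
    }

    // ruby initialize, the server address is only an example
    public IRubyObject initialize(ThreadContext context) throws IOException {
        client = new MemcachedClient(new InetSocketAddress("localhost", 11211));
        return context.getRuntime().getNil();
    }

    // wrapper method, the first argument for jruby methods must be ThreadContext
    public IRubyObject get(ThreadContext context, IRubyObject key) {
        // convert the ruby key to a java String and the java result back to a ruby object
        Object value = client.get(key.asJavaString());
        Ruby ruby = context.getRuntime();
        return value == null ? ruby.getNil() : ruby.newString(value.toString());
    }
}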

Keep in mind, every object you read from ruby or return to ruby must be a RubyObject. So you have to convert between RubyObject and java Object in your wrapper methods.

2. add JRubyModule, JRubyClass, JRubyMethod and JRubyConstant annotations.

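@JRubyClass(name = "Memcached")
class Memcached extends RubyObject {
    @JRubyMethod
    public IRubyObject initialize(ThreadContext context) throws IOException {
        // the server address is only an example
        client = new MemcachedClient(new InetSocketAddress("localhost", 11211));
        return context.getRuntime().getNil();
    }

    @JRubyMethod
    public IRubyObject get(ThreadContext context, IRubyObject key) {
        // convert the ruby key to a java String and the java result back to a ruby object
        Object value = client.get(key.asJavaString());
        Ruby ruby = context.getRuntime();
        return value == null ? ruby.getNil() : ruby.newString(value.toString());
    }
}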

JRuby annotations tell jruby which classes and methods should be exposed to the ruby world. They can describe the details of classes and methods, like what the parent class is, how many arguments a method takes, and so on.

3. load all jruby modules, classes and methods with BasicLibraryService.

public class MemcachedService implements BasicLibraryService {
    public boolean basicLoad(final Ruby ruby) throws IOException {
        // define Memcached class
        RubyClass memcached = ruby.defineClass("Memcached", ruby.getObject(), new ObjectAllocator() {
            public IRubyObject allocate(Ruby ruby, RubyClass klazz) {
                return new Memcached(ruby, klazz);
            }
        });
        // define all methods with @JRubyMethods in Memcached class
        memcached.defineAnnotatedMethods(Memcached.class);
        return true;
    }
}

BasicLibraryService is the standard load mechanism for easy extensions, you should implement basicLoad method to define ruby modules, classes and methods.

4. finally, load MemcachedService in your ruby file

# MemcachedService is in com.openfeint.memcached package
require 'com/openfeint/memcached/memcached'

Then you can load your jruby gem, and use any Memcached classes and methods you defined.

Some Advanced Tips:

1. JRuby method names.

different name

in ruby

def active?
end

in java

@JRubyMethod(name = "active?")
public IRubyObject active_p(ThreadContext context) {
}

alias methods

in ruby

def get(key)
end

alias :"[]" :get

in java

@JRubyMethod(name = { "get", "[]" })
public IRubyObject get(ThreadContext context, IRubyObject key) {
}

2. JRuby method arguments.

rest arguments

in ruby

def initialize(*args)
end

in java

@JRubyMethod(name = "initialize", rest = true)
public IRubyObject initialize(ThreadContext context, IRubyObject[] args) {
}

arguments with default value

in ruby

def get(key, marshal=true)
end

in java

@JRubyMethod(name = "get", required = 1, optional = 1)
public IRubyObject get(ThreadContext context, IRubyObject[] args) {
    Ruby ruby = context.getRuntime();
    RubyString key = (RubyString) args[0];
    // marshal defaults to true when the optional argument is not given
    IRubyObject marshal = ruby.getTrue();
    if (args.length > 1) {
        marshal = args[1];
    }
}

3. custom exceptions

Exception is also a class, so you can define an Exception in jruby just like defining a class.

@JRubyClass(name = "Memcached::Error", parent = "RuntimeError")
public class Error {
    // you should wrap your custom exception with RaiseException for java land throwing purpose.
    public static RaiseException newNotFound(Ruby ruby, String message) {
        RubyClass errorClass = ruby.getModule("Memcached").getClass("NotFound");
        return new RaiseException(RubyException.newException(ruby, errorClass, message), true);
    }
}

// Yes, it is a subclass.
@JRubyClass(name="Memcached::NotFound", parent="Memcached::Error")
public class NotFound extends Error {
}

// Finally, load the Error in MemcachedService.
public class MemcachedService implements BasicLibraryService {
    public boolean basicLoad(final Ruby ruby) throws IOException {
        RubyClass runtimeError = ruby.getRuntimeError();
        RubyClass memcachedError = memcached.defineClassUnder("Error", runtimeError, runtimeError.getAllocator());
        memcached.defineClassUnder("NotFound", memcachedError, memcachedError.getAllocator());
        return true;
    }
}

so when you throw the exception returned by Error.newNotFound(ruby, "Not Found") in your java code, it can be rescued as Memcached::NotFound in ruby.
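
Seen from the ruby side, the java code above defines the same hierarchy as the pure ruby sketch below. The lookup method is a hypothetical stand-in for a call into the java extension, it is not part of jruby-memcached.

```ruby
# Pure ruby equivalent of the exception classes the java code defines.
class Memcached
  class Error < RuntimeError; end
  class NotFound < Error; end
end

# Hypothetical stand-in for a method implemented in the java extension.
def lookup(key)
  raise Memcached::NotFound, "Not Found"
end

begin
  lookup("missing")
rescue Memcached::Error => e
  # NotFound is a subclass of Error, so rescuing Error catches it as well.
  puts "#{e.class}: #{e.message}"
end
```

Rescuing Memcached::Error is enough to catch every error the extension raises, while callers who only care about missing keys can rescue Memcached::NotFound.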

4. object conversion

RubyObject to java Object

you can use RubyObject convertToXXX methods

convertToArray
convertToFloat
convertToHash
convertToInteger
convertToString

e.g.

List<String> keys = (List<String>) args[0].convertToArray();

java Object to RubyObject

you can use Ruby newXXX methods

newArray
newBoolean
newFixnum
newFloat
newString

e.g.

ruby.newString("hello world");

You can read the source code of jruby-memcached to get more information. Feel free to leave a comment if you have any questions or suggestions.

jruby-memcached 0.5.0 released

2012-08-22T00:00:00+00:00

I just released jruby-memcached 0.5.0, it contains the following changes:

add travis-ci support, testing the jruby-18mode, jruby-19mode and jruby-head environments.
update spymemcached to 2.8.3, which sets shouldOptimize to false by default; there are still some bugs when shouldOptimize is true.
fix increment/decrement issue: before 0.5.0, incr/decr encoded values without marshal while get decoded them with marshal.
accept exception_retry_limit option.
add Memcached::ATimeoutOccurred error to handle the timeout case, otherwise you will probably see the following error.

ActionView::TemplateError: undefined method `clean_message' for #<Java::NetSpyMemcached::OperationTimeoutException:0x26e02e71>

check out the full code changes here.

jruby-memcached 0.4.0 released

2012-08-17T00:00:00+00:00

I just released jruby-memcached 0.4.0, it contains the following changes:

run spymemcached as a daemon thread. I found when running rake task with jruby-memcached < 0.4.0, it won't stop unless you press Ctrl+C.
get method can accept multiple keys.
add Memcached::Rails as a rails cache_store. Of course, it is compatible with Memcached::Rails in memcached.gem.
make full use of jruby annotation to reduce method definitions with optional and rest arguments.

check out the full code changes here.

jruby-memcached 0.3.0 released

2012-08-07T00:00:00+00:00

I just released jruby-memcached 0.3.0. It runs about 10%-20% faster than 0.2.0 because I removed the ruby code and rewrote it entirely in java, check out the file changes.

2 weeks ago, I released jruby-memcached 0.1.0. In that post I mentioned that the jruby-memcached response time in a request is 40+ms while the memcached.gem response time is 30+ms. It looked fine, but I kept looking for a way to improve jruby-memcached performance.

After reading the jruby-spymemcached gem and the jruby source code, I rewrote jruby-memcached in pure java instead of ruby, because calling java from java is much faster than calling it from ruby.

I compared the performance of the new jruby-memcached, the result is as follows:

MBP 2.8G i7    jruby-memcached 0.3.0

ruby-1.9.3-p194
                              user     system      total        real
memcached set              1.110000   1.020000   2.130000 (  4.592509)
memcached get              0.970000   1.000000   1.970000 (  4.172170)
                               user     system      total        real
dalli set                  8.360000   1.650000  10.010000 ( 10.193101)
dalli get                  8.040000   1.670000   9.710000 (  9.828392)

jruby-1.6.7.2
                              user     system      total        real
jruby-memcached set       5.842000   0.000000   5.842000 (  5.842000)
jruby-memcached get       5.561000   0.000000   5.561000 (  5.561000)
                              user     system      total        real
jruby-spymemcached set    5.919000   0.000000   5.919000 (  5.919000)
jruby-spymemcached get    5.615000   0.000000   5.615000 (  5.615000)
                              user     system      total        real
dalli set                10.132000   0.000000  10.132000 ( 10.132000)
dalli get                10.600000   0.000000  10.600000 ( 10.600000)

As you can see, jruby-memcached runs as fast as jruby-spymemcached, and it provides memcached.gem compatible apis and hashing algorithm. jruby-memcached is still slower than memcached.gem, but on production the response time for memcached has dropped to 40-ms, which is very close to the memcached.gem performance.

how to write a jruby gem - part 1

2012-08-06T00:00:00+00:00

In my previous post, I mentioned that I have written a jruby memcached gem. I'm glad to share my experience of extending jruby here.

JRuby is a 100% java implementation of the ruby programming language, and it allows you to call java code from ruby code. The java world has many more libraries than there are ruby gems, and making use of those java jars can make your code simpler and faster.

I assume you already have experience creating a pure ruby gem. The first step of creating a jruby gem is the same as for a ruby gem, and the gem structure is as follows:

|- lib/                   // ruby implementation code
|   |- memcached/
|   |- memcached.rb
|- test/                  // ruby test code (rspec or minitest)
|   |- memcached/
|   |- memcached_test.rb
|   |- test_helper.rb
|- Gemfile
|- jruby_memcached.gemspec // your gem manifest
|- Rakefile
|- README.md

Then let's introduce the java jar into our jruby gem.

Maven2 is a well-known tool for managing java source code and dependencies. It uses pom.xml as a config file to define the compile, test and package processes as well as the dependencies, so it looks like a combination of rake and bundler. All the java implementation and test code goes in the src directory, while compiled classes and jar files go in the target directory. Now the structure looks like:

|- lib/
|   |- memcached/
|   |- memcached.rb
|- test/
|   |- memcached/
|   |- memcached_test.rb
|   |- test_helper.rb
|- src/                    // java source code
|   |- main/               // java implementation code
|   |   |- java/
|   |- test                // java test code
|       |- java/
|- target/
|   |- classes/            // compiled classes files
|   |- spymemcached-ext-0.0.1.jar // package java source code to a jar
|- Gemfile
|- jruby_memcached.gemspec
|- pom.xml                 // maven config file, compile, test, package
|- Rakefile
|- README.md

In pom.xml, I declared a dependency on the spymemcached 2.8.1 jar, so I can import spymemcached in my hack code under src/main/java. I also configured the shade plugin, which packages the spymemcached 2.8.1 jar and my hack code together into target/spymemcached-ext-0.0.1.jar.

The last step is to combine the ruby and java code. JRuby provides the power to easily use spymemcached-ext-0.0.1.jar in ruby code.

require 'java' # load jruby's java integration before java_import
require 'target/spymemcached-ext-0.0.1.jar'

java_import 'net.spy.memcached.MemcachedClient'
java_import 'net.spy.memcached.ConnectionFactoryBuilder'
java_import 'net.spy.memcached.AddrUtil'

builder = ConnectionFactoryBuilder.new
@client = MemcachedClient.new builder.build, AddrUtil.getAddresses(Array(addresses).join(' '))

As you can see, after requiring the java jar file, you can import the java classes and call the java methods with ruby syntax; jruby is smart enough to convert the ruby calls and arguments into their java equivalents. Check out more about calling java from jruby here.

You can check out the jruby-memcached 0.2.0 source code here to get more details. This is the simplest way to create a jruby gem. In the next post I will show you how to write a real jruby ext, which can improve the performance of your jruby gem.

Update: I have written part 2 about a real jruby ext, don't miss it.

jruby-memcached 0.1.0 released

2012-07-24T00:00:00+00:00

I just released the jruby-memcached 0.1.0 gem, which is the fastest jruby memcached client so far and is also compatible with memcached.gem. The following is the story of why I created the jruby-memcached gem.

We are trying to migrate our service from ree to jruby. It's a big project for us: our repository dates from early 2009 and keeps growing, and nobody can promise to migrate it to jruby without any errors. Fortunately we split our service into different pools, like one pool to handle high score requests and one pool to handle achievement requests, so our strategy is to migrate to jruby one pool at a time. It makes the migration easier, because each time we only focus on one pool.

This raises one problem. We are using evan's memcached gem, which is the fastest memcached client for MRI, but it doesn't work in jruby because it's a ruby gem with a c extension. Yet the memcached client must work on both the ree pools and the jruby pools.

The first idea in my mind was to use a jruby memcached client. After googling I found jruby-memcached-client, but I soon learned that the two can't be used together. jruby-memcached-client marshal dumps the value, encodes it to base64 and then saves it to memcached, while the memcached gem only marshal dumps and does not encode to base64. Although I could fork and change jruby-memcached-client, there is still a string issue when passing values from ruby to java, which I will mention later.

So I had to give up jruby-memcached-client, and then tried a pure ruby memcached client like memcache-client or dalli. I picked dalli, which is faster than memcache-client, but after we deployed it on production, memcached misses jumped far higher than we could afford, so we had to revert the release.

I did some research about memcached, then I realized memcached and dalli are incompatible. Why? You may think memcached is simple: just set a key/value pair and get the value by key. That is right in general, but if you have more than one memcached server, the memcached client must know which key lives on which server; it must not fetch a key from server1 now and from server2 10 minutes later.

Let me introduce 2 important client configurations

Hashing, it is the algorithm to convert your string key to a long hash.
Consistent Hashing, aka distribution in memcached gem, it is the algorithm to map the generated hash to one of your memcached servers, it promises that only a low ratio of keys are reassigned when you add or remove a memcached server. See more about consistent hashing on wikipedia.

memcached uses fnv1_32 as its hash algorithm by default and dalli uses crc32, and their distribution algorithms are not compatible either.
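To make the incompatibility concrete, here is a minimal ruby sketch, not code from either gem. It hashes the same keys with a hand-written 32-bit FNV-1 and with Zlib.crc32, then picks a server with a plain modulo. The modulo step is a simplification I chose for brevity (the real clients use their own distribution algorithms), and the server addresses and keys are made up.

```ruby
require 'zlib'

# 32-bit FNV-1: multiply by the FNV prime, then XOR in each byte.
def fnv1_32(key)
  key.each_byte.inject(2_166_136_261) do |hash, byte|
    ((hash * 16_777_619) & 0xffffffff) ^ byte
  end
end

# Simplified modulo distribution, NOT the consistent hashing real clients use.
def pick_server(hash, servers)
  servers[hash % servers.size]
end

servers = %w[server1:11211 server2:11211 server3:11211] # hypothetical addresses

%w[user:1 user:2 user:3 user:4].each do |key|
  by_fnv = pick_server(fnv1_32(key), servers)
  by_crc = pick_server(Zlib.crc32(key), servers)
  puts "#{key}: fnv1_32 -> #{by_fnv}, crc32 -> #{by_crc}"
end
```

Whenever the two columns disagree for a key, a client using one hash writes the value to a server where a client using the other hash will never look, which shows up as a cache miss.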

Finally I decided to write a jruby memcached client myself, based on a java memcached library. After googling, I found 2 options: xmemcached and spymemcached. The author of xmemcached shows a benchmark which says xmemcached is a bit faster than spymemcached, and it provides a libmemcached compatible hashing algorithm (although I was cheated in the end :-) ), so I gave it a try first.

xmemcached's LibmemcachedMemcachedSessionLocator is not compatible with libmemcached, at least not with libmemcached 0.32 which is used by the memcached gem. I had to dive into the libmemcached 0.32 source code, override xmemcached's MemcachedSessionLocator, and write a jruby gem to wrap xmemcached. (Writing a jruby gem is not difficult, I will probably write a new post to introduce it in the future.) Then I released it on our reverse pool and sent high traffic to it to see the performance. I was disappointed: memcached get time increased from 30+ ms to 60 ms, and it generated about 200 threads for xmemcached (we have 30 memcached servers and 2 memcached client instances).

I quickly replaced xmemcached with spymemcached, and memcached get time decreased to 40+ ms with only 2 threads generated, awesome. Its hash and distribution algorithms are also 100% compatible with libmemcached 0.32. You can read the source code in src/main/java to see all the hacks I did for spymemcached.

I mentioned a string issue above: when we pass a zlib deflated value like "x\234c?P??/?I\001\000\b8\002a", it changes to "x?c?P??/?8a" in java, so we can't pass a deflated string directly and pass bytes instead.

I also ran some benchmarks comparing memcached, jruby-memcached and dalli.

in ruby-1.9.3
                           user     system      total        real
memcached set          1.110000   1.020000   2.130000 (  4.592509)
memcached get          0.970000   1.000000   1.970000 (  4.172170)
                           user     system      total        real
dalli set              8.330000   1.560000   9.890000 ( 10.094499)
dalli get              8.530000   1.680000  10.210000 ( 10.331083)

in jruby-1.6.7.2
                          user     system      total        real
jruby-memcached set   6.902000   0.000000   6.902000 (  6.902000)
jruby-memcached get   6.845000   0.000000   6.845000 (  6.845000)
                          user     system      total        real
dalli set            13.251000   0.000000  13.251000 ( 13.251000)
dalli get            13.536000   0.000000  13.536000 ( 13.536000)

See more here. As you can see, both memcached and jruby-memcached are about 2x faster than dalli.

my railsconf 2012 video

2012-06-14T00:00:00+00:00

This is my railsconf 2012 video on youtube.

newrelic-workling released

2012-06-07T00:00:00+00:00

We are using workling with RabbitMQ as our background service and monitoring RabbitMQ on scout. Last month, we released a new background job which generated tons of messages in RabbitMQ, and the RabbitMQ queue kept growing, which meant we did not have enough workling processes to handle those messages. We fixed it by reverting that job and using a cron job instead.

We thought about this incident and decided to add newrelic support to instrument workling, so that we know how many messages each job generates and how much it costs to consume one message.

We finally released the newrelic-workling 1.0 gem today. Thanks to newrelic's help, we are the official support for newrelic workling, feel free to ping us if you have any questions. The following is the screenshot for the workling instrument on newrelic.

bullet 4.0.0 released

2012-05-09T00:00:00+00:00

bullet is designed to help you reduce the number of db queries, such as adding eager loading to kill n+1 queries and removing unused eager loadings.

bullet has worked well with activerecord from 2.1 to 3.2. Today I released bullet 4.0.0, which starts to support mongoid (>= 2.4.1).

Why does bullet need to support mongoid? Does mongo also have the n+1 queries issue?

The answer is yes, check out the performance metric of mongoid eager loading, which shows about a 40% performance improvement. 1 year ago I created the gem mongoid-eager-loading to add eager loading to mongoid; it is now deprecated as mongoid supports eager loading natively.

Be aware that bullet for mongoid doesn't support 2 level deep eager loading and counter cache because they are not supported in mongoid so far.

What about mongomapper? I'd like to support it in the future, but I have no experience with it. Is anybody interested in implementing it? Feel free to contact me.

Another big improvement in 4.0.0 is much better integration tests. If you check out the source code, you will see I separated the integration tests for activerecord 2, activerecord 3 and mongoid, added them to different Gemfiles, and asked travis to test all of them for bullet, see the build result.

If you have any problems using the bullet gem, feel free to mail me, tweet me or open an issue on github.

my presentation on railsconf 2012

2012-05-03T00:00:00+00:00

I attended and spoke at railsconf 2012 last week, the following is my presentation

Semi Automatic Code Review

If you have any questions or suggestions, feel free to email me, tweet me or open issues on github.

redis mget/mset vs get/set

2012-04-05T00:00:00+00:00

Our application uses redis a lot to perform large numbers of data reads/writes. But we didn't use it well enough, e.g. we called redis get and set in a loop, just like hitting mysql and memcache many times. Sending many redis commands takes a long time; if we can reduce the number of commands, we save on round trip time.

The following script benchmarks different command counts.

require 'redis'
require 'benchmark'

redis = Redis.new
Benchmark.bm(50) do |bm|
  bm.report "redis set" do
    10000.times do |i|
      redis.set("key#{i}", "value#{i}")
    end
  end

  bm.report "redis get" do
    10000.times do |i|
      redis.get("key#{i}")
    end
  end

  bm.report "redis mset with 1000" do
    1000.times do |i|
      keys = (10*i...10*i+10).map { |j| ["yek#{j}", "value#{j}"] }.flatten
      redis.mset(*keys)
    end
  end

  bm.report "redis mget with 1000" do
    1000.times do |i|
      keys = (10*i...10*i+10).map { |j| "yek#{j}" }
      redis.mget(*keys)
    end
  end

  bm.report "redis mset with 100" do
    100.times do |i|
      keys = (100*i...100*i+100).map { |j| ["eky#{j}", "value#{j}"] }.flatten
      redis.mset(*keys)
    end
  end

  bm.report "redis mget with 100" do
    100.times do |i|
      keys = (100*i...100*i+100).map { |j| "eky#{j}" }
      redis.mget(*keys)
    end
  end

  bm.report "redis mset with 10" do
    10.times do |i|
      keys = (1000*i...1000*i+1000).map { |j| ["eyk#{j}", "value#{j}"] }.flatten
      redis.mset(*keys)
    end
  end

  bm.report "redis mget with 10" do
    10.times do |i|
      keys = (1000*i...1000*i+1000).map { |j| "eyk#{j}" }
      redis.mget(*keys)
    end
  end

  bm.report "redis mset with 1" do
    keys = (0...10000).map { |j| ["kye#{j}", "value#{j}"] }.flatten
    redis.mset(*keys)
  end

  bm.report "redis mget with 1" do
    keys = (0...10000).map { |j| "kye#{j}" }
    redis.mget(*keys)
  end
end

This is the benchmark result.

#                             user   system      total        real
#  redis set              0.280000   0.170000   0.450000 (  0.809112)
#  redis get              0.290000   0.160000   0.450000 (  0.806711)
#  redis mset with 1000   0.070000   0.020000   0.090000 (  0.148474)
#  redis mget with 1000   0.080000   0.020000   0.100000 (  0.142837)
#  redis mset with 100    0.050000   0.000000   0.050000 (  0.067859)
#  redis mget with 100    0.050000   0.010000   0.060000 (  0.063040)
#  redis mset with 10     0.040000   0.000000   0.040000 (  0.060200)
#  redis mget with 10     0.050000   0.000000   0.050000 (  0.057818)
#  redis mset with 1      0.040000   0.000000   0.040000 (  0.062318)
#  redis mget with 1      0.050000   0.000000   0.050000 (  0.057483)

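It's obvious that fewer redis commands mean a faster running time: going from 10000 commands to 1000 cuts the real time from about 0.81s to about 0.15s, and from 100 commands down it levels off at around 0.06s.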
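In application code the same idea can be wrapped in a small helper. The sketch below is only an illustration: batched_mget and FakeRedis are names I made up, and FakeRedis is an in-memory stand-in so the example runs without a redis server. With the redis gem you would pass a Redis.new instance instead, since it responds to mget in the same way as in the benchmark above.

```ruby
# Fetch keys in batches; client is anything that responds to mget(*keys).
def batched_mget(client, keys, batch_size = 100)
  keys.each_slice(batch_size).flat_map do |batch|
    client.mget(*batch)
  end
end

# In-memory stand-in for a redis client that counts the commands it receives.
class FakeRedis
  attr_reader :commands

  def initialize(data)
    @data = data
    @commands = 0
  end

  def mget(*keys)
    @commands += 1
    keys.map { |key| @data[key] }
  end
end

data = {}
250.times { |i| data["key#{i}"] = "value#{i}" }
fake = FakeRedis.new(data)

values = batched_mget(fake, data.keys, 100)
puts "fetched #{values.size} values with #{fake.commands} commands"
```

A batch size of 100 is an arbitrary choice here; the benchmark above suggests that batches of a hundred or more keys already capture most of the gain.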

master slave replication in rails

2012-03-28T00:00:00+00:00

Introduction

By default activerecord works well with a single db, which is fine for most websites with small/medium traffic. But if your website grows fast and gets many more reads than writes, you should definitely set up master slave replication for your database. All inserts/updates are sent to the master db and reads are sent to the slave db, which reduces the read load on your master db.

Master slave replication allows you to set up as many slave dbs as you need. It's scalable, which means you can easily increase your db read throughput by adding more slave dbs. It also allows you to move some tasks, like analytics, to a slave db without affecting your master db.

Replication in rails

How do we configure master slave replication in a rails app? There are a lot of choices, pick one and set it up according to its documentation. I don't want to discuss these tools here; I will tell you how to use master slave replication in rails on top of these tools.

Problems

Master slave replication looks good, but it has a big problem in practice - replication lag. There is a lag between data being inserted into the master db and being synced to the slave db, let's see a case.

a user creates a post on your application.
the post is inserted into master db.
your application redirects the user to the post show page.
your application reads from slave db, but the post is not synced yet.
a 404 page is shown. :-(
the post is synced to slave db. (too late)

Lots of similar issues will arise after you apply master slave replication, how to solve them?

Solution

The solution is to send some reads to master db to guarantee fresh data.

By default all reads inside a db transaction are sent to master db, like

BEGIN
SELECT * from users where id = 1;
INSERT INTO posts(title, user_id) VALUES('test', 1);
COMMIT

In the following cases I send reads to master db as well

queries in background job, like delayed_job, resque, workling, etc.

class Post < ActiveRecord::Base
  after_create :notify

  protected
  def notify
    Delayed::Job.enqueue(DelayedJob::NotifyAdmin.new(self.id))
  end
end

class DelayedJob::NotifyAdmin < Struct.new(:post_id)
  def perform
    post = Post.find(post_id)
    ......
  end
end

The post probably does not exist yet when the background job reads it from slave db.

queries in the request which follows a redirect response

class PostsController < ApplicationController
  def show
    @post = Post.find(params[:id])
  end

  def create
    @post = Post.new(params[:post])
    if @post.save
      redirect_to post_path(@post)
    else
      render :new
    end
  end
end

This case is very common: creating/updating then redirecting. If the resource is not synced to slave db before the next request, the user will get a 404 page or stale data.

We know when we should explicitly send reads to master db, but how can we do that? It's

ActiveRecord::Base.with_master {
  User.find(post.user_id)
}

Almost all replication gems provide a with_master method, any queries in the block will be sent to master db. I added a monkey patch to the background job, wrapping it with with_master.
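How such a block method can work is easy to picture with a toy version. The sketch below is not taken from any replication gem: ToyReplication, its thread-local flag and current_db are all made-up names that only show the set, yield, restore pattern.

```ruby
# Toy model of a with_master block; real gems switch connections instead.
module ToyReplication
  # Route every read inside the block to master, then restore the previous setting.
  def self.with_master
    previous = Thread.current[:toy_force_master]
    Thread.current[:toy_force_master] = true
    yield
  ensure
    Thread.current[:toy_force_master] = previous
  end

  # Which db a read would go to right now.
  def self.current_db
    Thread.current[:toy_force_master] ? :master : :slave
  end
end

before = ToyReplication.current_db
inside = ToyReplication.with_master { ToyReplication.current_db }
after  = ToyReplication.current_db
puts "before: #{before}, inside: #{inside}, after: #{after}"
```

Restoring the previous value in ensure keeps nested with_master calls and raised exceptions from leaving the flag stuck on master.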

I added a monkey patch to action controller as well, adding a parameter if the response is a redirect, then added an around_filter to the controller to check whether the reads in such a request should be sent to master or slave db.

class ApplicationController < ActionController::Base
  around_filter :manage_slaving

  def manage_slaving
    if force_master?
      ActiveRecord::Base.with_master { yield }
    else
      yield
    end
  end
end

force_master? is a convenient way to manage your master/slave db at the controller level, you can also enable/disable master/slave for some specified requests.

Finally test your application and add ActiveRecord::Base.with_master {} if necessary.

bullet 2.3.0 released

2012-03-25T00:00:00+00:00

bullet is a gem to help you increase your application's performance by reducing the number of sql requests it makes. Today I released bullet 2.3.0, which better supports rails 3.1 and 3.2 and improves performance. I hadn't made any changes to bullet for a long time, so let me tell you the story of my work on bullet 2.3.0.

At the beginning of this month, bullet got its 1000th watcher on github, and I realized it was time to improve it, e.g. to speed it up and make it compatible with edge rails.

The first thing I did was to refactor the tests. I had created several rspec tests before, but they were more like integration tests than unit tests, so I moved them to the spec/integration/ directory. Then I added a bunch of unit tests to cover all the code, which guards the correctness of further refactoring. I also switched from watchr to guard for auto testing. Why do I prefer guard? It's much easier and has more extensions, like guard-rspec.

Then I moved the AR models used by the integration tests to spec/models, and moved the db connection, db schema, db seed and test helpers to spec/support/. Now my tests look much cleaner and run much faster (they only connect to the db once).

After refactoring the tests, I tried to improve bullet's performance. I had already created a benchmark script, and bullet 2.2.1 with rails 3.0.12 took 30s to complete

bullet 2.2.1 with rails 3.0.12
                                                                             user     system      total        real
Querying & Iterating 1000 Posts with 10000 Comments and 100 Users       29.970000   0.270000  30.240000 ( 30.452083)

Then I used perftools.rb to measure cpu time per method, and the top results were garbage_collector, String#=~ and Kernel#caller

garbage_collector, it depends on how many objects are allocated
String#=~, bullet uses a regexp to check if caller contains load_target
Kernel#caller, bullet uses caller to tell what code caused the n+1 query

I found the easiest one to mitigate was String#=~. As bullet only matches against the constant string load_target, I simply used .include?("load_target") instead.

bullet 2.3.0 with rails 3.0.12
                                                                             user     system      total        real
Querying & Iterating 1000 Posts with 10000 Comments and 100 Users       26.120000   0.430000  26.550000 ( 27.179304)
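The String#=~ versus include? difference can be checked in isolation with a micro benchmark like the one below. It is only a sketch: the frame string is a made-up example of a caller line, the iteration count is arbitrary, and the timings will differ between ruby versions.

```ruby
require 'benchmark'

# Hypothetical stack frame line, similar to what Kernel#caller returns.
frame = "activerecord/lib/active_record/associations/association.rb:154:in `load_target'"
n = 200_000

Benchmark.bm(10) do |bm|
  bm.report('=~')       { n.times { frame =~ /load_target/ } }
  bm.report('include?') { n.times { frame.include?('load_target') } }
end
```

Both checks answer the same question for a constant substring, so the swap does not change behaviour.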

Another change is to store the object's ar_key instead of the object itself.

{<#Post id:1, title:"post1", body:"post body", created_at:..., updated_at:...> => [:comments]}

{"Post:1" => [:comments]}

It speeds up hash comparison and reduces the hash size.
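The two hashes above can be reproduced with a small sketch. The Post struct is a stand-in for an activerecord model, and ar_key is written here as a standalone helper in the "Class:id" format shown above; it is an illustration, not bullet's actual code.

```ruby
# Stand-in for an activerecord model, for illustration only.
Post = Struct.new(:id, :title, :body)

# Build a short string key in the "Class:id" format.
def ar_key(record)
  "#{record.class}:#{record.id}"
end

post = Post.new(1, 'post1', 'post body')

by_object = { post => [:comments] }         # key is the whole record
by_key    = { ar_key(post) => [:comments] } # key is a short string

puts by_key.inspect
```

Looking up by_key only hashes and compares a short string, while by_object has to hash and compare the record's attributes.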

I also hacked ActiveRecord::Associations::SingularAssociation#reader instead of ActiveRecord::Associations::Association#load_target for rails 3.1 and 3.2. It fixes activerecord 3.1 and 3.2 compatibility, and as there is no need to call caller in Association#load_target, it runs much faster in rails 3.1 and 3.2. The following is the benchmark result

bullet 2.3.0 with rails 3.2.2
                                                                             user     system      total        real
Querying & Iterating 1000 Posts with 10000 Comments and 100 Users       16.460000   0.190000  16.650000 ( 16.968246)

bullet 2.3.0 with rails 3.1.4
                                                                             user     system      total        real
Querying & Iterating 1000 Posts with 10000 Comments and 100 Users       14.600000   0.130000  14.730000 ( 14.937590)

Enjoy the new bullet gem!

multiple_mailers - send emails by different smtp accounts

2012-03-21T00:00:00+00:00

I use gmail to send email notifications on my website, it's really easy to build based on actionmailer

ActionMailer::Base.smtp_settings = {
  :address => 'smtp.gmail.com',
  :port => 587,
  :domain => 'railsbp.com',
  :authentication => :plain,
  :user_name => 'notification@railsbp.com',
  :password => 'password'
}

But I found it does not allow setting up 2 different smtp accounts, e.g. I want to send notification emails from notification@railsbp.com and exception notifier emails from exception.notifier@railsbp.com. After googling, I hacked my mailer classes with

class NotificationMailer < ActionMailer::Base
  if Rails.env.production?
    class <<self
      def smtp_settings
        # use the account key that belongs to this mailer, as named in config/mailers.yml
        options = YAML.load_file("#{Rails.root}/config/mailers.yml")[Rails.env]['notification']
        @@smtp_settings = {
          :address              => options["address"],
          :port                 => options["port"],
          :domain               => options["domain"],
          :authentication       => options["authentication"],
          :user_name            => options["user_name"],
          :password             => options["password"]
        }
      end
    end
  end
end

then add a new config file config/mailers.yml

production:
  common: &common
    address: 'smtp.gmail.com'
    port: 587
    domain: 'rails-bestpractices.com'
    authentication: 'plain'

  notification:
    <<: *common
    user_name: 'notification@rails-bestpractices.com'
    password: 'password'

  exception.notifier:
    <<: *common
    user_name: 'exception.notifier@rails-bestpractices.com'
    password: 'password'

That allows me to set up one smtp account per actionmailer class. Keep in mind that you should only hack smtp_settings for the environments where you really want to send emails (here production); if you don't check Rails.env, it will send email even in development and test environments.
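To see what the &common anchor and the <<: *common merge key produce, the config can be loaded outside rails. The sketch below is an assumption-laden illustration: load_yaml_with_aliases and smtp_settings_for are helper names I made up, and the unsafe_load branch is there because newer versions of ruby's YAML library may refuse aliases in a plain YAML.load by default.

```ruby
require 'yaml'

MAILERS_YML = <<~YAML
  production:
    common: &common
      address: 'smtp.gmail.com'
      port: 587
      domain: 'rails-bestpractices.com'
      authentication: 'plain'

    notification:
      <<: *common
      user_name: 'notification@rails-bestpractices.com'
      password: 'password'
YAML

# Hypothetical helper: load YAML that uses anchors/aliases on old and new rubies.
def load_yaml_with_aliases(source)
  if YAML.respond_to?(:unsafe_load)
    YAML.unsafe_load(source)
  else
    YAML.load(source)
  end
end

# Hypothetical helper: build the settings hash for one account.
def smtp_settings_for(config, env, account)
  options = config.fetch(env).fetch(account)
  {
    :address        => options['address'],
    :port           => options['port'],
    :domain         => options['domain'],
    :authentication => options['authentication'],
    :user_name      => options['user_name'],
    :password       => options['password']
  }
end

config = load_yaml_with_aliases(MAILERS_YML)
settings = smtp_settings_for(config, 'production', 'notification')
puts settings.inspect
```

The merged account carries the shared address, port, domain and authentication from common plus its own user_name and password.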

Now it works fine and I can send emails from as many smtp accounts as I like, but it looks ugly and I don't like hacking code all over my mailer classes. So I extracted it into a new gem, multiple_mailers. Like the hack above, you define the config file config/mailers.yml, and for each mailer class all you need is to declare its mailer account name

class NotificationMailer < ActionMailer::Base
  mailer_account "notification"
end

class ExceptionNotifier
  class Notifier < ActionMailer::Base
    mailer_account "exception.notifier"
  end
end

passenger with http_gzip_static_module

2012-02-27T00:00:00+00:00

Rails 3.1 was released a long time ago and the asset pipeline is becoming more and more popular, so I upgraded my rails website as well.

I use nginx + passenger for my rails projects. By default nginx only does dynamic gzip (compressing at runtime), but there is an http_gzip_static_module for nginx which serves precompressed .gz files and so makes full use of the rails asset pipeline.

I don't like customizing my Nginx installation during passenger installation. I found there is a pull request to add http_gzip_static_module, so I applied that change to the source code of the passenger gem, then installed nginx with the defaults. :-)

rake arguments

2011-12-13T00:00:00+00:00

Long ago I began to write some rake tasks. It's simple, but there were no instructions about how to add arguments to a rake task. What I did before was to use environment variables.

task :try_argument do
  ENV['GLOBAL_ARGUMENT1'] or ENV['GLOBAL_ARGUMENT2']
end

GLOBAL_ARGUMENT1=xxx GLOBAL_ARGUMENT2=yyy rake try_argument

As you can see, I have to set a global environment variable to pass an argument to a rake task.

But there is another way to pass the arguments to rake task via []

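task :try_argument, [:key1, :key2] do |t, args|
  # 'value1' and 'value2' are placeholder defaults
  args.with_defaults(:key1 => 'value1', :key2 => 'value2')
  args[:key1] or args[:key2]
end

# no space after the comma, otherwise the shell splits the argument
rake try_argument[xxx,yyy]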

and if there is dependent task, you should define it like

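task :try_argument, [:key1, :key2] => :environment do |t, args|
  # 'value1' and 'value2' are placeholder defaults
  args.with_defaults(:key1 => 'value1', :key2 => 'value2')
  args[:key1] or args[:key2]
end

# no space after the comma, otherwise the shell splits the argument
rake try_argument[xxx,yyy]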

It looks like the difference between hash arguments and normal arguments.

Both of them have disadvantages:

ENV arguments also change the system env variables.
normal arguments do not make sense when calling, it is difficult to remember the meaning of each argument.

Both work fine, which one to use is up to you.
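The bracket form can also be tried from plain ruby without a Rakefile. This is a small sketch assuming the rake gem is installed; it uses Rake::Task.define_task, which is what the task method calls, and the RESULTS array and the default values are made up for the demo.

```ruby
require 'rake'

RESULTS = []

# Same shape as the task above, defined without the top-level DSL.
Rake::Task.define_task(:try_argument, [:key1, :key2]) do |_t, args|
  args.with_defaults(:key1 => 'default1', :key2 => 'default2')
  RESULTS << [args[:key1], args[:key2]]
end

# Equivalent to running: rake try_argument[xxx]
Rake::Task[:try_argument].invoke('xxx')

puts RESULTS.inspect
```

Arguments are positional, so the first value fills key1 and any argument that is not passed falls back to its default.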

passenger with redis

2011-12-12T00:00:00+00:00

Today I encountered an issue where passenger forks more workers than we configured (6) on qa servers. I used strace and saw that a passenger worker was blocked while failing to write to a socket, like

select(15, [], [13], [], [58, 915000])

fd 13 is a socket.

I also tried netstat and found that the status of some redis socket connections was CLOSE_WAIT.

So I judged that the problem was the ruby redis clients not being closed correctly. This reminded me of passenger's fork() nature. I checked our source code, and unfortunately we didn't do anything special for passenger fork.

This is the link that tells you how to close the redis connection after passenger forks a worker. After deploying the new code to qa servers, passenger never forked more workers than we expected. But the workers still hung according to the strace result, which means some workers stayed inactive and wouldn't be able to handle any requests. Wooops...
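The fork handling usually looks like the sketch below. It is a hedged illustration, not our production code: PhusionPassenger.on_event(:starting_worker_process) is the hook passenger documents for this purpose, while $redis is a hypothetical global holding the client, reconnect_redis_after_fork is a made-up helper name, and redis.client.reconnect assumes a redis-rb 2.x style client.

```ruby
# Hypothetical helper: give a freshly forked worker its own redis socket.
def reconnect_redis_after_fork(forked, redis)
  # forked is true in smart spawning mode, where the worker inherits the
  # parent's open socket and must not keep sharing it.
  redis.client.reconnect if forked
end

# Only runs under passenger; elsewhere this block is skipped.
if defined?(PhusionPassenger)
  PhusionPassenger.on_event(:starting_worker_process) do |forked|
    reconnect_redis_after_fork(forked, $redis) # $redis is an assumed global client
  end
end
```

In a rails app this would typically live in an initializer, next to the code that creates the redis client.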

I looked through the redis-rb source code. We used redis 2.0.5, which didn't handle the TIMEOUT error and always retried writing to redis. Fortunately, the latest redis version is 2.2.2 and it has already fixed this issue: it retries 3 times and, if it still fails, releases the connection.

Now it works fine, no unexpected additional passenger workers and no unexpected inactive workers.

avoid committing git conflicts

2011-11-14T00:00:00+00:00

I made a mistake when merging a branch last week: I forgot to remove a conflict marker "<<<<<<< HEAD" and pushed it to the remote repository. It broke other people's development. So stupid to make such a mistake.

To avoid making such a mistake again, I wrote a git hook .git/hooks/pre-commit to check for the conflict markers "<<<<<<<" and ">>>>>>>"

#!/usr/bin/env ruby

`git diff-index --name-status HEAD`.split("\n").each do |status_with_filename|
  status, filename = status_with_filename.split(/\s+/)
  next if status == 'D'
  File.open(filename) do |file|
    while line = file.gets
      if line.include?("<<<<<<<") || line.include?(">>>>>>>")
        puts "ERROR: #{filename} is conflict"
        exit(1)
      end
    end
  end
end

It will prevent you from committing conflicts.

after_commit

2011-11-06T00:00:00+00:00

We are using RabbitMQ as our message queue system, and the ruby client is workling. This week we encountered a strange issue. When we create a notification, an after_create callback asks workling to find that notification and push it to twitter or facebook. It works fine, except that sometimes it raises an error saying "can't find the notification with the specified ID"

class Notification < ActiveRecord::Base
  after_create :async_send_notification
  ......
  def async_send_notification
    NotificationWorker.async_send_notification({:notification_id => id})
  end
end

class NotificationWorker < Workling::Base
  def send_notification(params)
    notification = Notification.find(params[:notification_id])
    ......
  end
end

It's weird: the notification_id is passed to NotificationWorker, which means the notification has already been created, so it is supposed to exist.

After talking with our MySQL DBA, we found the problem: the find sql is executed before the insert transaction is committed.

Let me describe it step by step.

1. Notification sends "Transaction Begin" command
2. Notification sends "INSERT" command
3. Notification gets "next sequence value" as new object id
4. Notification sends "new object id" to NotificationWorker
5. NotificationWorker sends "SELECT" command to find notification object
6. Notification sends "Transaction Commit" command

As you can see, at step 5 the transaction has not been committed yet, so the new notification is not visible to the worker's database connection and the "Not found" error is raised. It only fails sometimes because the worker runs asynchronously, so its SELECT may arrive either before or after the commit.

To solve this issue, we can use after_commit callback.

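In rails 2.x you need to install the after_commit gem; in rails 3.x the after_commit callback is supported by default.

class Notification < ActiveRecord::Base
  # after_commit gem syntax (rails 2.x); in rails 3.x write
  # after_commit :async_send_notification, :on => :create
  after_commit_on_create :async_send_notification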
  ......
  def async_send_notification
    NotificationWorker.async_send_notification({:notification_id => id})
  end
end

So Notification asks NotificationWorker to run only after the transaction is committed.
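To make the ordering concrete, here is a toy model in plain ruby. It is not ActiveRecord or workling code: ToyDatabase and lookup_from are made-up names, and the "worker" is just a lambda that runs synchronously at the chosen callback point, so the race becomes deterministic.

```ruby
# Toy store: a row becomes visible to find only once it is "committed".
class ToyDatabase
  def initialize
    @committed = {}
    @next_id = 0
  end

  # Raises KeyError when the row has not been committed.
  def find(id)
    @committed.fetch(id)
  end

  # after_create runs inside the transaction, after_commit runs after it.
  def create(row, after_create: nil, after_commit: nil)
    id = (@next_id += 1)
    after_create.call(id) if after_create   # INSERT done, not committed yet
    @committed[id] = row                    # COMMIT
    after_commit.call(id) if after_commit
    id
  end
end

# Runs the worker's lookup from the given callback and reports what it saw.
def lookup_from(callback_name)
  db = ToyDatabase.new
  worker = lambda { |id| db.find(id) }
  db.create({ :text => "hello" }, callback_name => worker)
  :found
rescue KeyError
  :not_found
end

puts lookup_from(:after_create)  # not_found
puts lookup_from(:after_commit)  # found
```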

reset_counters in rails

2011-10-31T00:00:00+00:00

I thought the reset_counters method reset a counter_cache column to 0, but it does not. After trying several times, I finally realized that reset_counters updates the counter_cache column to the exact count of the associated records. The use case for reset_counters is when you add a counter_cache column in a migration and need to fill in its value, like

def self.up
  add_column :posts, :comments_count, :integer, :default => 0
  Post.all.each do |post|
    Post.reset_counters(post.id, :comments)
  end
end

It adds a comments_count column to the posts table, counts the comments for each post, and stores the result in that post's comments_count column.

I didn't find a method to reset the counter_cache column to 0. Why? Because counter_cache is used to cache the association count, it is incremented and decremented automatically, and you should never reset it to 0. If you find you need to reset a counter_cache to 0, that means you are using counter_cache the wrong way.

use rspec filter to speed up tests

2011-10-21T00:00:00+00:00

Rspec 2 introduces a very efficient way to run only one test or one test suite: filter_run.

You should first add filter_run in spec/spec_helper.rb

config.filter_run :focus => true

Then you can tell rspec to run only the test you are focusing on by

it "should focus now", :focus => true do
  ...
end

rspec will only run this spec. :focus => true can be applied to describe/context as well.

One problem is that if no test has :focus => true, rspec will do nothing, but most of the time we expect it to run all specs when nothing is focused, so you should add another line to spec_helper as well.

config.run_all_when_everything_filtered = true

As the name implies, rspec will run all specs if nothing matches the focus filter.

Another thing you may be interested in is that you can also define filter_run_excluding

config.filter_run_excluding :slow => true

rspec will run all specs except those marked as slow.

rubykaigi presentation

2011-07-17T00:00:00+00:00

My presentation at RubyKaigi 2011 today.

Rails best practices_rubykaigi


and the video is here: http://www.ustream.tv/recorded/16051491

beijing ruby offline meetup

2011-03-27T00:00:00+00:00

Slides from Sunday's ruby offline meetup in Beijing

Code review with Rails best practices


Upgrade Mongoid - Multiple databases

2011-03-22T00:00:00+00:00

My recent post, Use different mongodb instances in mongoid, shows how to use multiple databases. It works, but mongoid has supported multiple databases itself since 2.0.0.rc.1, which is much better than my hack.

It's really easy to use. First, you define multiple databases in mongoid.yml like

development:
  <<: *defaults
  host: localhost
  database: main_mongo_instance
  databases:
    other_mongo_instance_name:
      database: other_mongo_instance
      host: localhost

As you can see, besides the usual database param, I have defined a new param, databases. Under it you give each mongo instance a name along with its database and host, and of course you can define as many mongo instances as you need.

Then, you can choose which mongo instance to use in mongoid model.

class User
  include Mongoid::Document

  set_database :other_mongo_instance_name
end

The set_database method tells mongoid that the model uses another mongo instance instead of the main one. The name, other_mongo_instance_name here, must exactly match the name defined in mongoid.yml. If you don't call set_database, the model uses main_mongo_instance.

So all the user data will be stored in the database configured under other_mongo_instance_name, and the other data will be stored in main_mongo_instance. Great!

Upgrade Mongoid - update_attribute

2011-03-21T00:00:00+00:00

Before mongoid 2.0.0.rc.6 there was no update_attribute method on Mongoid::Document, which made me unhappy. In the ActiveRecord world I always use update_attribute to change one attribute and update_attributes to change two or more.

The good news is that mongoid introduced the update_attribute method in 2.0.0.rc.6, so now I can follow the same practice in mongoid. As in ActiveRecord, update_attribute takes the attribute name and the value as two arguments, while update_attributes takes a hash.

post.update_attribute(:title, "New Post")

post.update_attributes(:title => "New Post", :body => "New Body")

Upgrade Mongoid - Many to many association

2011-03-08T00:00:00+00:00

Before mongoid 2.0.0.rc1 there was no built-in support for many to many associations, so we used a join document (like a join table in a relational database) to implement them.

For example, we have two documents, users and accounts. One user has many accounts and one account contains many users. To establish the many to many relationship between users and accounts, we create a new document named user_accounts, which looks like

{'_id': '4d76d3a70bdb822d08000001', 'user_id': BSON::ObjectId('4d76d3b80bdb822d080015b3'), 'account_id': BSON::ObjectId('4d76d3b90bdb822d080015b7')}

and the models are defined as follows

class User
  include Mongoid::Document
  references_many :user_accounts
end

class Account
  include Mongoid::Document
  references_many :user_accounts
end

class UserAccount
  include Mongoid::Document
  referenced_in :user
  referenced_in :account
end

Does it look familiar? It's what activerecord does for a many to many association.

I'm glad that mongoid began to support many to many associations after mongoid 2.0.0.rc1. The new syntax is "references_and_referenced_in_many".

class User
  include Mongoid::Document
  references_and_referenced_in_many :accounts
end

class Account
  include Mongoid::Document
  references_and_referenced_in_many :users
end

We don't need the join document any more. Mongoid's implementation differs from activerecord's: it stores the relationship in an array attribute on both sides, like this.

These are user documents

{'_id': '4d76d3a90bdb822d08000009', account_ids: [BSON::ObjectId('4d76d3aa0bdb822d0800001b'), BSON::ObjectId('4d76d3aa0bdb822d0800001d'), BSON::ObjectId('4d76d3aa0bdb822d08000017')]}
{'_id': '4d76d3a80bdb822d08000005', account_ids: [BSON::ObjectId('4d76d3aa0bdb822d08000017'), BSON::ObjectId('4d76d3a90bdb822d08000015')]}

And these are account documents

{'_id': '4d76d3aa0bdb822d08000017', user_ids: [BSON::ObjectId('4d76d3a80bdb822d08000005'), BSON::ObjectId('4d76d3a90bdb822d08000009'), BSON::ObjectId('4d76d3a90bdb822d0800000d')]}
{'_id': '4d76d3aa0bdb822d0800001b', user_ids: [BSON::ObjectId('4d76d3a90bdb822d08000009'), BSON::ObjectId('4d76d3a90bdb822d08000011')]}

As mongodb supports the Array type, it is really easy to maintain the many to many relationship.
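To illustrate the two-sided storage, here is a plain ruby sketch. It is not mongoid code: User and Account are simple Struct stand-ins for the documents, and link is a made-up helper that records the relationship in both id arrays.

```ruby
# Stand-ins for the two documents: each keeps an array of the other's ids.
User = Struct.new(:id, :account_ids)
Account = Struct.new(:id, :user_ids)

# Stores the relationship on both sides; |= skips ids already present.
def link(user, account)
  user.account_ids |= [account.id]
  account.user_ids |= [user.id]
end

user = User.new("u1", [])
account = Account.new("a1", [])
link(user, account)
link(user, account)  # linking twice does not duplicate ids

p user.account_ids   # ["a1"]
p account.user_ids   # ["u1"]
```

Either side can list its related ids without a join document, at the cost of keeping both arrays in sync on every change.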

By the way, if you previously used

references_many :name, :stored_as => :array

you will now receive a runtime error. You should use

references_and_referenced_in_many :name

instead.

Upgrade Mongoid - Hash arguments for group

2011-03-01T00:00:00+00:00

You will receive a warning for the group method call after upgrading mongoid.

Collection#group no longer take a list of paramters. This usage is deprecated.

This is because the mongo gem changed the definition of the group method.

Before

key = ["ad_id"]
conditions = { 'ad_id' => { '$in' => ad_ids } }
initial = { "impressions" => 0.0, "clicks" => 0.0 }
reduce = "a reduce javascript function"

AdStat.collection.group(key, conditions, initial, reduce).each do |e|
  ......
end

After

key = ["ad_id"]
conditions = { 'ad_id' => { '$in' => ad_ids } }
initial = { "impressions" => 0.0, "clicks" => 0.0 }
reduce = "a reduce javascript function"

AdStat.collection.group(:key => key, :cond => conditions, :initial => initial, :reduce => reduce).each do |e|
  ......
end

The method now takes hash arguments, which makes the group call more readable.

Upgrade Mongoid - Default Type for Field

2011-01-28T00:00:00+00:00

If you have watched the railscasts episode about mongoid, you saw ryanb leave out the type for String fields, because String was the default type. For example,

class Article
  field :name, :type => String
  field :content, :type => String
end

can be written as

class Article
  field :name
  field :content
end

but this is no longer valid as of mongoid 2.0.0.rc.1. The default type of a field changed from String to Object, which means we should explicitly set the type for each field.