yz/ibrowse configuration and interaction with Solr #330

@wbrown

Currently, Yokozuna uses the ibrowse application to stage index updates to Solr. The default ibrowse settings are decidedly non-optimal for this: each host:port pair is limited to a maximum of 10 sessions with a pipeline depth of 10.

As the Solr indexes grow to millions of entries, the latency of the index updates exerts a backpressure upon Riak, slowing down key updates and PUTs. htop shows a back and forth pattern of Riak and Solr exchanging CPU cycles. At this point, Riak performance completely tanks, especially with the LevelDB backend.

Working with @evanmcc, we found that the ibrowse author unfortunately doesn't do configuration like the rest of the Erlang world. He suggested a configuration line update, which I had to alter; hooray for documentation that doesn't match reality.

I added the following line to ${RIAK_HOME}/lib/ibrowse-4.0.1/priv/ibrowse.conf:

{dest, "localhost", 8093, 100, 100000, []}.
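For anyone repeating this, here is a minimal sketch of adding that line without duplicating it on a second run. The function name `add_dest` is hypothetical, and the field labels in the comment follow the description in this issue (connection maximum, then pipeline depth) rather than any official reference.

```shell
# Hypothetical helper: append the dest line to an ibrowse.conf file once.
# Fields, as described in this issue:
#   {dest, Host, Port, MaxSessions, MaxPipelineSize, Options}.
add_dest() {
  conf="$1"
  line='{dest, "localhost", 8093, 100, 100000, []}.'
  # -x: match the whole line, -F: treat the pattern as a fixed string
  grep -qxF "$line" "$conf" 2>/dev/null || printf '%s\n' "$line" >> "$conf"
}

# Usage (path taken from this issue; restart Riak afterwards):
#   add_dest "${RIAK_HOME}/lib/ibrowse-4.0.1/priv/ibrowse.conf"
```

Calling it twice leaves a single copy of the line in the file.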

Note that {127,0,0,1} in place of "localhost" did not work. It really wants the hostname as a string, probably because yz itself connects to localhost. By the way, using the hostname means localhost will resolve to IPv6 on some systems, which is an interesting and potentially problematic side effect.

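The above settings set the connection maximum to 100 with a pipeline depth of 100000. I restarted Riak and, sure enough, the connection count went straight to that limit, which suggests I may need to raise it further for my throughput of about 400-500 keys indexed a second. (The count below is 200 rather than 100, presumably because both ends of each connection are on localhost and netstat lists each end separately.)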

[root@altostratus log]#  netstat -anp | grep 8093 | grep EST | wc -l
200

Checking for TIME_WAIT revealed hundreds of leftover connections, which indicates that connections are being closed and reopened rather than reused, i.e. there is no pipelining.

[root@altostratus log]#  netstat -anp | grep 8093 | grep TIME_WAIT | wc -l
594
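The two checks above can be folded into one pass. This is a rough sketch only: the function name `count_states` is made up, and it assumes the Linux net-tools column layout (local address in column 4, remote address in column 5, state in column 6). Unlike a plain `grep 8093`, it compares the port field exactly, so an unrelated ephemeral port such as 80930 is not counted.

```shell
# Hypothetical helper: count TCP sockets per state for one port.
# Reads `netstat -an` style lines on stdin.
count_states() {
  port="$1"
  awk -v port="$port" '
    $1 ~ /^tcp/ {
      # The port is the last colon-separated field (works for IPv4 and IPv6)
      n = split($4, l, ":")
      m = split($5, r, ":")
      if (l[n] == port || r[m] == port) count[$6]++
    }
    END { for (s in count) print s, count[s] }
  ' | sort
}

# Usage:
#   netstat -an | count_states 8093
```

As with the commands above, a loopback connection shows up once per endpoint, so each ESTABLISHED connection between Riak and Solr on the same host contributes two lines to the count.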

@coderoshi confirmed my suspicion. This is problematic for deployments that index hundreds or thousands of keys a second, because every index update pays the overhead of establishing a new socket connection.

So, I'd suggest the following changes:

  • Add a more reasonable default max session value for localhost:8093
  • Use HTTP pipelining so that we don't run out of socket descriptors/file handles and also so we don't incur the punishing latency and overhead of a TCP connection for every key update.
