Monday, October 28, 2013

Sticky Sessions with Cookies


This module tracks upstream servers using cookies, so that a client keeps being served by the same backend server and its session persists. Each incoming request is examined for a cookie (we assume the cookie is named "route"), and the request is forwarded to the server whose digest value matches the cookie. The module cannot apply stickiness when the client has cookies disabled; in that case it switches back to the classic round robin load balancing mechanism available in nginx. When a cookie is present but the corresponding server is unavailable, it either falls back to round robin or returns a Bad Gateway error, depending on the fallback option specified in the directive.

We decided to examine the control flow and the source code of this sticky module, hoping it would give us a fair idea of how to implement our own module, which will use URL rewriting for session persistence. This post is a brief description of what we inferred from reading the source code.

The module is added by recompiling nginx from source:

./configure ... --add-module=/absolute/path/to/nginx-sticky-module
make
make install

To use the sticky module, add a "sticky" directive to the upstream block, with options as shown below:

Usage
# nginx requires a name for the upstream block; "backend" is a placeholder
upstream backend {
    sticky;
    server 127.0.0.1:9000;
    server 127.0.0.1:9001;
    server 127.0.0.1:9002;
}

sticky [name=route] [domain=.foo.bar] [path=/] [expires=1h] [hash=index|md5|sha1] [no_fallback];

When nginx encounters the sticky directive, it invokes the handler mapped to that directive. The handler is ngx_http_sticky_set, and it is registered in the module's command table (the enabling directive):

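static ngx_command_t  ngx_http_sticky_commands[] = {

    { ngx_string("sticky"),               /* directive name */
      NGX_HTTP_UPS_CONF|NGX_CONF_ANY,     /* allowed in upstream blocks, any number of arguments */
      ngx_http_sticky_set,                /* handler called when the directive is parsed */
      0,
      0,
      NULL },

    ngx_null_command                      /* terminates the command table */
};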

The callback: ngx_http_sticky_set (registration function)
It reads and validates the arguments given in the directive and saves them in the sticky module's own configuration structure, ngx_http_sticky_srv_conf_t. The saved settings are the name of the cookie, its lifetime, domain and path, a callback chosen according to the digest type named in the directive, the fallback setting, and a reference that will later hold the mapping between peers and their computed digest values. Validation also rejects unknown parameters and checks whether another upstream module has already been set up for the block (in which case an error is raised). Finally, the function sets the upstream initialization callback.
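
To make the registration step concrete, here is a minimal sketch of how the directive's arguments might be folded into a configuration structure. The names sticky_conf_t, sticky_apply_arg and parse_duration are hypothetical names of ours, not the module's; the real module works with nginx's own string types and stores the result in ngx_http_sticky_srv_conf_t. The accepted duration suffixes are also an assumption made for this example.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for ngx_http_sticky_srv_conf_t. */
typedef struct {
    char name[32];      /* cookie name, e.g. "route" */
    char domain[64];
    char path[64];
    long expires_sec;   /* cookie lifetime in seconds */
    char hash[8];       /* "index", "md5" or "sha1" */
    int  no_fallback;
} sticky_conf_t;

/* Convert "30s", "10m", "1h" or "2d" to seconds; returns -1 on bad input. */
static long parse_duration(const char *s)
{
    char *end;
    long n = strtol(s, &end, 10);
    long mult;

    if (end == s || n < 0) return -1;
    switch (*end) {
    case 's': mult = 1;     break;
    case 'm': mult = 60;    break;
    case 'h': mult = 3600;  break;
    case 'd': mult = 86400; break;
    default:  return -1;
    }
    return end[1] == '\0' ? n * mult : -1;
}

/* Apply one directive argument; returns 0 on success, -1 if unknown or invalid. */
int sticky_apply_arg(sticky_conf_t *c, const char *arg)
{
    if (strcmp(arg, "no_fallback") == 0) { c->no_fallback = 1; return 0; }
    if (strncmp(arg, "name=", 5) == 0) {
        snprintf(c->name, sizeof c->name, "%s", arg + 5);
        return 0;
    }
    if (strncmp(arg, "domain=", 7) == 0) {
        snprintf(c->domain, sizeof c->domain, "%s", arg + 7);
        return 0;
    }
    if (strncmp(arg, "path=", 5) == 0) {
        snprintf(c->path, sizeof c->path, "%s", arg + 5);
        return 0;
    }
    if (strncmp(arg, "expires=", 8) == 0) {
        c->expires_sec = parse_duration(arg + 8);
        return c->expires_sec < 0 ? -1 : 0;
    }
    if (strncmp(arg, "hash=", 5) == 0) {
        const char *h = arg + 5;
        if (strcmp(h, "index") && strcmp(h, "md5") && strcmp(h, "sha1"))
            return -1;                      /* unsupported digest type */
        snprintf(c->hash, sizeof c->hash, "%s", h);
        return 0;
    }
    return -1;                              /* unknown parameter */
}
```

Calling sticky_apply_arg once per argument of "sticky name=route expires=1h hash=md5;" would fill the structure, and any argument it does not recognise is reported as an error, which mirrors the validation described above.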

The upstream initialization function: ngx_http_init_upstream_sticky
It computes the digest for every server using the hash type specified (md5, sha1, etc.), calls the round robin module (which resolves host names and allocates sockets), and sets the handler for the peer initialization function.
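
The digest table built at this stage can be pictured with the sketch below. It is only an illustration: FNV-1a is swapped in for md5/sha1 so that the example needs no crypto library, and peer_digest_t, build_digests and find_peer_by_digest are names we made up for it.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MAX_PEERS 16

typedef struct {
    char digest[9];   /* 8 hex characters plus the terminating NUL */
} peer_digest_t;

/* FNV-1a, used here only as a stand-in for md5/sha1. */
static uint32_t fnv1a(const char *s)
{
    uint32_t h = 2166136261u;
    while (*s) {
        h ^= (unsigned char) *s++;
        h *= 16777619u;
    }
    return h;
}

/* Precompute one digest per server address, once, at configuration time. */
int build_digests(const char *addrs[], int n, peer_digest_t out[])
{
    int i;
    if (n > MAX_PEERS) return -1;
    for (i = 0; i < n; i++)
        snprintf(out[i].digest, sizeof out[i].digest, "%08x",
                 (unsigned) fnv1a(addrs[i]));
    return 0;
}

/* Return the index of the peer whose digest equals the cookie value, or -1. */
int find_peer_by_digest(const peer_digest_t peers[], int n, const char *cookie)
{
    int i;
    for (i = 0; i < n; i++)
        if (strcmp(peers[i].digest, cookie) == 0)
            return i;
    return -1;
}
```

Because the digests are computed once per configuration load, a request only pays for a string comparison against each server's stored digest.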

The peer initialization function: ngx_http_init_sticky_peer
It is invoked once per request. The configuration settings that were saved by the registration function are packaged into a peer data structure and attached to the HTTP request. The round robin module is invoked to prepare peer selection, and ngx_http_get_sticky_peer is set as the callback that picks the next peer. This function also performs a good chunk of the cookie handling. It checks the incoming request headers for the sticky cookie, "route". If the cookie is found and a hash type is configured, it compares the cookie value with the precomputed digests to find the matching peer. (These are digests, not encryption.) If no hash type is specified, the cookie value is taken to be the index of the peer directly. The selected peer is saved in the peer data structure, to be used later by the ngx_http_get_sticky_peer callback. If no cookie was found, which is indicated by an index value of -1, the regular round robin load balancing mechanism is used.
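
The index case (no hash type) is simple enough to sketch in a few lines. The code below is our own illustration and not taken from the module: cookie_value and peer_index_from_cookie are hypothetical helpers that work on a plain C string, whereas the module reads nginx's parsed header structures.

```c
#include <stdlib.h>
#include <string.h>

/* Copy the value of cookie `name` from a Cookie header into `out`.
 * Returns 0 if found, -1 if absent or too long for the buffer. */
int cookie_value(const char *header, const char *name, char *out, size_t outlen)
{
    size_t nlen = strlen(name);
    const char *p = header;

    while (*p) {
        while (*p == ' ' || *p == ';') p++;        /* skip separators */
        if (strncmp(p, name, nlen) == 0 && p[nlen] == '=') {
            const char *v = p + nlen + 1;
            size_t vlen = strcspn(v, ";");
            if (vlen >= outlen) return -1;
            memcpy(out, v, vlen);
            out[vlen] = '\0';
            return 0;
        }
        p += strcspn(p, ";");                      /* jump to the next pair */
    }
    return -1;
}

/* Index mode: the cookie value is the peer index itself.
 * Returns -1 when there is no usable sticky peer, so the caller
 * can fall through to round robin. */
int peer_index_from_cookie(const char *header, int npeers)
{
    char val[16];
    char *end;
    long idx;

    if (header == NULL || cookie_value(header, "route", val, sizeof val) != 0)
        return -1;
    idx = strtol(val, &end, 10);
    if (end == val || *end != '\0' || idx < 0 || idx >= npeers)
        return -1;
    return (int) idx;
}
```

A header such as "lang=en; route=2" with three configured peers yields index 2, while a missing cookie, a non-numeric value or an out-of-range index all yield -1.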

ngx_http_get_sticky_peer is the callback set in the peer initialization function. It is called at least once per request to select the next peer. The function examines the state of the selected peer and makes sure that it has not already been tried and is not down. Once these checks pass, it performs the core operation: it hands this peer to the upstream module and sets the cookie for it. For the latter, ngx_http_sticky_misc_set_cookie() is invoked. If a cookie named "route" already exists, it is overwritten so that its lifetime (the expires attribute) is refreshed; if not, a new cookie is created and added to the output headers.
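
The selection logic can be summarised with the sketch below. It reflects our reading of the behaviour rather than the module's code: peer_state_t, get_sticky_peer, round_robin_pick and format_route_cookie are hypothetical names, the round robin is a bare unweighted version, and the cookie uses Max-Age instead of an Expires date to keep the example short.

```c
#include <stdio.h>

typedef struct {
    int down;    /* marked down in the configuration or after failures */
    int tried;   /* already attempted for this request */
} peer_state_t;

#define PEER_NONE (-1)

/* Plain, unweighted round robin over usable peers, starting at *cursor. */
static int round_robin_pick(peer_state_t peers[], int n, int *cursor)
{
    int i;
    for (i = 0; i < n; i++) {
        int k = (*cursor + i) % n;
        if (!peers[k].down && !peers[k].tried) {
            *cursor = (k + 1) % n;
            return k;
        }
    }
    return PEER_NONE;
}

/* Prefer the sticky peer when it is usable. If the cookie named a peer that
 * is not usable, either give up (the caller would answer 502) when
 * no_fallback is set, or fall back to round robin. Without a cookie
 * (sticky < 0), round robin is always used. */
int get_sticky_peer(peer_state_t peers[], int n, int sticky,
                    int no_fallback, int *cursor)
{
    int k;

    if (sticky >= 0 && sticky < n && !peers[sticky].down && !peers[sticky].tried)
        k = sticky;
    else if (sticky >= 0 && no_fallback)
        return PEER_NONE;
    else
        k = round_robin_pick(peers, n, cursor);

    if (k != PEER_NONE)
        peers[k].tried = 1;   /* do not offer the same peer twice per request */
    return k;
}

/* Build the Set-Cookie value that issues or refreshes the route cookie. */
int format_route_cookie(char *buf, size_t len, int peer, long max_age)
{
    return snprintf(buf, len, "route=%d; Path=/; Max-Age=%ld", peer, max_age);
}
```

Marking the chosen peer as tried is what lets the function be called again for the same request after a failure without handing out the same server twice.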

This is how cookie-based sticky load balancing works.

However, this approach is not truly session-based load balancing.
Consider an application whose sessions last only an hour, while the lifetime of the cookie is set to 2 days. The cookie is stored on the client side, so subsequent requests keep going to the same backend server for the next 2 days, long after the original session has expired. This skews the load balancing, and the stickiness is not truly tied to the session.
We plan to correct this by using query parameters and maintaining a mapping from each session id generated by a backend server to that server.
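
As a rough sketch of the idea, and not of any finished design, the mapping could be a table keyed by session id in which every entry carries its own expiry time. All names below (session_table_t, session_bind, session_lookup) are hypothetical, and the fixed-size array and linear search are simplifications for the example.

```c
#include <string.h>
#include <time.h>

#define MAX_SESSIONS 64

typedef struct {
    char   id[33];        /* session id as issued by the backend */
    int    backend;       /* index of the backend that owns the session */
    time_t expires_at;    /* moment the session stops being valid */
} session_entry_t;

typedef struct {
    session_entry_t entries[MAX_SESSIONS];
    int count;
} session_table_t;

/* Record that session `id` lives on `backend` for `ttl` seconds from `now`. */
int session_bind(session_table_t *t, const char *id, int backend,
                 time_t now, long ttl)
{
    session_entry_t *e;

    if (t->count >= MAX_SESSIONS || strlen(id) >= sizeof t->entries[0].id)
        return -1;
    e = &t->entries[t->count++];
    strcpy(e->id, id);
    e->backend = backend;
    e->expires_at = now + ttl;
    return 0;
}

/* Return the backend for a live session, or -1 if unknown or expired. */
int session_lookup(const session_table_t *t, const char *id, time_t now)
{
    int i;
    for (i = 0; i < t->count; i++)
        if (strcmp(t->entries[i].id, id) == 0)
            return now < t->entries[i].expires_at ? t->entries[i].backend : -1;
    return -1;
}
```

With a one hour lifetime, a lookup made after that hour returns -1, so the request can be balanced afresh instead of staying pinned to the old server for two days.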

2 comments:

  1. Are the connections to the upstream servers pre-established? If yes, what happens when a particular upstream server goes down?

    Further, it is assumed that while checking the cookie value you also make use of the hostname (and possibly the URI path), since the nginx LB can be the LB for multiple websites, and using the same cookie name across them would require extra care.

    I suggest that you also provide insight into why server session based load balancing is quite different from cookie based load balancing, and why one should be chosen over the other.
    Also, based on your understanding of this work, highlight the key challenges in implementing session based load balancing and the assumptions under which the session based LB will be implemented.

  2. No, the connections are not pre-established. The load balancer only supplies the address of the backend server to which the connection has to be made. The actual connection is not handled by the sticky module.
    Each server has a max_fails count; once it is reached, the server is considered down and is not tried again until fail_timeout has elapsed.
    The sticky module supports only one domain, the one specified in the configuration, and its code looks only at the cookie itself when reading or updating cookie data.

    Yes Sir, we'll reflect on session based load balancing in more detail in upcoming posts.
