<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
  <head>
    <title>Jetty Optimization Guide</title>
    <link rel="stylesheet" href="jetty.css" type="text/css"/>
    <meta name="generator" content="DocBook XSL Stylesheets V1.62.0"/>
  </head>

  <body>
    <h1 class="title">Jetty Optimization Guide</h1>

    <h1>Introduction</h1>

    <p>This guide describes techniques for optimizing a deployment of the
    Jetty HTTP server and servlet container. While some of the techniques
    described here are particular to the Jetty server, many are generally
    applicable to any similar servlet server. Note that for a J2EE application
    server, often it is the web tier that controls the vast majority of
    requests entering the server. Thus optimization of the web tier is key to
    the optimization of the entire container.</p>

    <p>Optimization is more of an art than a science, so this document does
    not present a specific solution. Instead the issues and parameters that
    need to be considered are discussed and &#34;rules of thumb&#34; are given
    where appropriate.</p>

    <h1>Optimization Overview</h1>

    <h2>HTTP Traffic Profile</h2>

    <p>In order to optimize a servlet container it is important to understand
    how requests are delivered to the container and what resources are used to
    handle them.</p>

    <h3>Browser Connection Handling</h3>

    <p>Each user connecting to the webapp container will be using a browser or
    other HTTP client application. How that client connects to the server
    greatly affects the optimization process. Historically browsers would only
    send a single HTTP request over a TCP connection, which meant that each
    HTTP request incurred the latency and resource costs of establishing a
    connection to the server. In order to quickly render a page with many
    images, each requiring a request, browsers could open up to 8 connections
    to the server so that multiple requests could be outstanding at once. In
    some specific circumstances with HTTP/1.0 browsers multiple requests could
    be sent over a single connection.</p>

    <p>Modern browsers are now mostly using HTTP/1.1 persistent connections
    that allow multiple requests per connection in almost all circumstances.
    Thus browsers now typically open only 1 or 2 connections to each server
    and send many requests over those connections. Browsers are increasingly
    using request pipelining so that multiple requests may be outstanding on a
    single connection, thus decreasing request latency and reducing the need
    for multiple connections.</p>

    <p>This situation results in a near linear relationship between the number
    of server connections and the number of simultaneous users of the server:</p>

    <pre>SimultaneousUsers * ConnectionsPerUser == SimultaneousConnections</pre>

    <h3>Server Connection Handling</h3>

    <p>For Jetty and almost all java HTTP servers, each connection accepted by
    the server is allocated a thread to listen for requests and to handle
    those requests. While non-blocking solutions are available to avoid this
    allocation of a thread per connection, the blocking nature of the servlet
    API prevents these being efficiently used with a servlet container.</p>

    <pre>SimultaneousConnections &#60;= Threads</pre>

    <h3>Persistent Connections</h3>

    <p>Persistent connections are supported by the HTTP/1.1 protocol and to a
    lesser extent by the HTTP/1.0 protocol. The duration of these connections
    and how they interact with a webapp can greatly affect the optimization of
    the server and webapp.</p>

    <p>A typical webapp will be composed of a dynamically generated page with
    many static components such as style sheets and/or images. Thus to display
    a page, a cluster of requests is sent for the main page and for the
    resources that it uses. It is highly desirable for persistent connections
    to be held at least long enough for all the requests of a single page view
    to be completed.</p>

    <p>After a page is served to a user, there is typically a delay while the
    user reads or interacts with the page, after which another request cluster
    is sent in order to obtain the next page of the webapp. The delay between
    request clusters can be anything from seconds to minutes. It is desirable
    for the persistent connections to be held longer than this delay in order
    to improve the responsiveness of the webapp and to reduce the costs of new
    connections. However the cost of doing so may be many idle connections on
    the server, consuming resources while contributing nothing to throughput.</p>

    <p>The duration that persistent connections are held is under the control
    of both the client and the server, either of which can close a connection
    at any time. The browser's cache settings may also greatly affect the use
    of persistent connections, as many requests for resources on a page may
    not be issued or may be handled with a simple 304-NotModified response.</p>

    <h2>Optimization Objectives</h2>

    <p>There are several key objectives when optimizing a webapp container;
    unfortunately not all of them are compatible, and you are often faced with
    a trade-off between two or more objectives.</p>

    <h3>Maximize Throughput</h3>

    <p>Throughput is the primary measure used to rate the performance of a web
    container and it is mostly measured in requests per second. Your efforts
    in optimizing the container will mainly be aimed at maximizing the request
    rate, or at least ensuring a minimal rate is achievable. However you must
    remember that request rate is an imperfect measure, as not all requests are
    the same, and it is easy to measure a request rate for a load that is
    unlike real traffic. Specifically:</p>

    <ul>
      <li>Containers will be more efficient handling high request rates from
      a few long held persistent connections. Unfortunately this is often not
      a real traffic profile, and requests more often come in from many
      connections which are mostly idle and/or short held. Thus it is key to
      also consider connection rate, or at least the number of simultaneous
      connections, when considering the meaning of a request rate figure.</li>

      <li>Requests with content or large responses take more time to package
      and process and may be exposed to more network inefficiencies. Thus
      request rates of realistically sized requests must be considered, and in
      some circumstances it is useful to consider data rate.</li>

      <li>There are several different ways that a webapp may serve a request,
      and features may be applied that will affect throughput, e.g. static
      versus dynamic content, fixed versus variable length responses, or
      security. The complexity of the requests must be considered when
      measuring throughput.</li>
    </ul>

    <h3>Minimize Latency</h3>

    <p>Latency is a delay in the processing of requests and it is desirable to
    reduce latency so that web applications appear responsive to the users.
    There are two key sources of latency to consider:</p>

    <ul>
      <li>The latency between when a request is initiated and when the
      handling of that request starts. This latency is affected by the time
      taken to establish a connection and the scheduling of threads within the
      server.</li>

      <li>The latency between requests in a request cluster. This latency can
      be large if the response for a previous request must complete before the
      next request can be issued. Browsers reduce this latency by using
      multiple connections or pipelining requests over a single connection.</li>
    </ul>

    <p>While latency is not directly related to throughput, there is often a
    trade-off to be made between reducing latency and increasing throughput.
    Server resources that are allocated to idle connections may be better
    deployed handling actual requests.</p>

    <h3>Minimize Resources</h3>

    <p>The processing of each request consumes server resources in the form of
    memory, CPU and time. Memory is used for buffers, program stack space and
    application objects. Keeping memory usage within a server's physically
    available memory is important for maximum throughput. Conversely, using a
    server's virtual memory may allow increased simultaneous users and can also
    decrease latency.</p>

    <p>Servers will have 1 or more CPUs available to process requests. It is
    important that the scheduling of these processors is done in such a way
    that they spend more time handling requests and less time organizing and
    switching between tasks.</p>

    <p>Servers often allocate resources based on time, and it is important
    to tune timeouts so that those resources have a high probability of being
    productively used.</p>

    <h3>Graceful Degradation</h3>

    <p>Much of optimization is focused on providing maximum throughput under
    average or high offered load rates. However for many systems that wish to
    offer high availability and high quality of service, it is important to
    optimize the performance under extreme offered load, either to continue
    providing reasonable service to some of the offered load or to gracefully
    degrade service to all of the offered load.</p>

    <h1>Analyzing Traffic</h1>

    <p>Before beginning to optimize the configuration of your HTTP server and
    servlet container, it is fundamental that you analyse the profile of the
    traffic you expect your server to handle. This can be estimated or
    measured from an actual live server. The type of information that is
    useful to gather includes:</p>

    <table border="1">
      <tr>
        <th class="attribute">Attribute</th>

        <th class="variations">Variations</th>

        <th>Comment</th>
      </tr>

      <tr>
        <td>Request rate</td>

        <td>average, peak</td>

        <td>The number of requests per second</td>
      </tr>

      <tr>
        <td>Connection rate</td>

        <td>average, peak</td>

        <td>The number of new connections established with the server per
        second</td>
      </tr>

      <tr>
        <td>Simultaneous Users</td>

        <td>average, peak</td>

        <td>The number of users that are simultaneously interacting with the
        server.</td>
      </tr>

      <tr>
        <td>Requests per page</td>

        <td>average</td>

        <td>The number of requests that are required to render a page of the
        webapp. Includes images and style sheets, but may be affected by
        client caching.</td>
      </tr>

      <tr>
        <td>Page view time</td>

        <td>average</td>

        <td>The period of time that a typical user will view a page before
        requesting another from the webapp.</td>
      </tr>

      <tr>
        <td>Session duration</td>

        <td>average</td>

        <td>The period of time that an average user will remain in contact
        with the server. This can be used to estimate session and memory
        requirements.</td>
      </tr>
    </table>

    <h2>Measuring Traffic</h2>

    <p>The most accurate way to measure the attributes listed above is to
    measure them on a live server that is handling real traffic for the webapp
    that you are trying to optimize. Statistics and log analysis can then be
    used to derive the information above.</p>

    <p>Jetty supports statistics collection at both the server and context
    level. The following configuration excerpt shows how to turn on statistics
    for the server and for a particular web application:</p>

    <pre>&#60;Configure class=&#34;org.mortbay.jetty.Server&#34;&#62;
  ...
  &#60;Call name=&#34;addWebApplication&#34;&#62;
    &#60;Arg&#62;/myapp&#60;/Arg&#62;
    &#60;Arg&#62;./webapps/myapp&#60;/Arg&#62;
    &#60;Set name=&#34;statsOn&#34;&#62;true&#60;/Set&#62;
  &#60;/Call&#62;
  ...
  &#60;Set name=&#34;statsOn&#34;&#62;true&#60;/Set&#62;
  ...
&#60;/Configure&#62;
</pre>

    <p>While statistics can be enabled as above, it is probably just as
    convenient to turn them on using a JMX agent connected to the Jetty MBeans.
    If Jetty is run with JBoss or within a JMX server, then a JMX agent can be
    used to configure and view statistics collection.</p>

    <h3>Jetty HttpServer Statistics</h3>

    <p>The following statistics attributes are available on the
    org.mortbay.http.HttpServer class or via the associated MBean which is
    normally named like &#34;org.mortbay:Jetty=0&#34;:</p>

    <table border="1">
      <tr>
        <th>Attribute</th>

        <th>Comment</th>
      </tr>

      <tr>
        <td>statsOn</td>

        <td>True if statistics collection is turned on.</td>
      </tr>

      <tr>
        <td>statsOnMs</td>

        <td>Time in milliseconds stats have been collected for</td>
      </tr>

      <tr>
        <td>statsReset()</td>

        <td>Reset statistics</td>
      </tr>

      <tr>
        <td>connections</td>

        <td>Number of connections accepted by the server since statsReset()
        called</td>
      </tr>

      <tr>
        <td>connectionsOpen</td>

        <td>Number of connections currently open that were opened since
        statsReset() called</td>
      </tr>

      <tr>
        <td>connectionsOpenMax</td>

        <td>Maximum number of connections opened simultaneously since
        statsReset() called</td>
      </tr>

      <tr>
        <td>connectionsDurationAve</td>

        <td>Sliding average duration in milliseconds of open connections since
        statsReset() called</td>
      </tr>

      <tr>
        <td>connectionsDurationMax</td>

        <td>Maximum duration in milliseconds of an open connection since
        statsReset() called</td>
      </tr>

      <tr>
        <td>connectionsRequestsAve</td>

        <td>Sliding average number of requests per connection since
        statsReset() called</td>
      </tr>

      <tr>
        <td>connectionsRequestsMax</td>

        <td>Maximum number of requests per connection since statsReset()
        called</td>
      </tr>

      <tr>
        <td>errors</td>

        <td>Number of errors since statsReset() called. An error is a request
        that resulted in an exception being thrown by the handler</td>
      </tr>

      <tr>
        <td>requests</td>

        <td>Number of requests since statsReset() called</td>
      </tr>

      <tr>
        <td>requestsActive</td>

        <td>Number of requests currently active</td>
      </tr>

      <tr>
        <td>requestsActiveMax</td>

        <td>Maximum number of active requests since statsReset() called</td>
      </tr>

      <tr>
        <td>requestsDurationAve</td>

        <td>Average duration of request handling in milliseconds since
        statsReset() called</td>
      </tr>

      <tr>
        <td>requestsDurationMax</td>

        <td>Maximum duration in milliseconds of request handling since
        statsReset() called.</td>
      </tr>
    </table>

    <h3>Jetty HttpContext Statistics</h3>

    <p>The following statistics attributes are available on the
    org.mortbay.http.HttpContext class or via the associated MBean which is
    normally named like
    &#34;org.mortbay:Jetty=0,HttpContext=0,context=/myapp&#34;:</p>

    <table border="1">
      <tr>
        <th>Attribute</th>

        <th>Comment</th>
      </tr>

      <tr>
        <td>statsOn</td>

        <td>True if statistics collection is turned on</td>
      </tr>

      <tr>
        <td>statsOnMs</td>

        <td>Time in milliseconds that stats have been collected for</td>
      </tr>

      <tr>
        <td>statsReset()</td>

        <td>Reset statistics</td>
      </tr>

      <tr>
        <td>requests</td>

        <td>Number of requests since statsReset() called</td>
      </tr>

      <tr>
        <td>requestsActive</td>

        <td>Number of requests currently active</td>
      </tr>

      <tr>
        <td>requestsActiveMax</td>

        <td>Maximum number of active requests since statsReset() called</td>
      </tr>

      <tr>
        <td>responses1xx</td>

        <td>Number of responses with 1xx status (Informational) since statsReset()
        called</td>
      </tr>

      <tr>
        <td>responses2xx</td>

        <td>Number of responses with 2xx status (Success) since statsReset()
        called</td>
      </tr>

      <tr>
        <td>responses3xx</td>

        <td>Number of responses with 3xx status (Redirection) since
        statsReset() called</td>
      </tr>

      <tr>
        <td>responses4xx</td>

        <td>Number of responses with 4xx status (Client Error) since
        statsReset() called</td>
      </tr>

      <tr>
        <td>responses5xx</td>

        <td>Number of responses with 5xx status (Server Error) since
        statsReset() called</td>
      </tr>
    </table>

    <h2>Estimating Traffic</h2>

    <p>It may not be possible to measure actual live traffic of a deployment
    to be optimized. In this case estimates must be made to obtain a traffic
    profile on which to base your optimization. The following worksheet gives
    an example of how this may be done:</p>

    <table border="1">
      <tr>
        <th>Attribute</th>

        <th>Formula</th>

        <th>Example</th>

        <th>Comment</th>
      </tr>

      <tr>
        <td>SimultaneousUsers</td>

        <td>-</td>

        <td>1000</td>

        <td>Estimated from marketing or other sources.</td>
      </tr>

      <tr>
        <td>UserSessionDuration</td>

        <td>-</td>

        <td>180 seconds</td>

        <td>Time a single user spends interacting with the webapp. Estimated
        from marketing, usage trials or other sources.</td>
      </tr>

      <tr>
        <td>AvePageViewTime</td>

        <td>-</td>

        <td>30 seconds</td>

        <td>Time between page requests from a single user. Estimated from
        marketing, usage trials or other sources.</td>
      </tr>

      <tr>
        <td>PagesPerUserSession</td>

        <td>UserSessionDuration/PageViewTime</td>

        <td>6</td>

        <td> </td>
      </tr>

      <tr>
        <td>RequestsPerPageNoCache</td>

        <td>-</td>

        <td>12</td>

        <td>Calculated from inspection of HTML</td>
      </tr>

      <tr>
        <td>RequestsPerPageCache</td>

        <td>-</td>

        <td>3</td>

        <td>Calculated from inspection of HTML and usage trials.</td>
      </tr>

      <tr>
        <td>RequestsPerUserSession</td>

        <td>RequestsPerPageNoCache+ (RequestsPerPageCache*
        (PagesPerUserSession-1))</td>

        <td>27</td>

        <td> </td>
      </tr>

      <tr>
        <td>RequestsPerSecPerUser</td>

        <td>RequestsPerUserSession/ UserSessionDuration</td>

        <td>0.15</td>

        <td> </td>
      </tr>

      <tr>
        <td>RequestsPerSec</td>

        <td>SimultaneousUsers* RequestsPerSecPerUser</td>

        <td>150</td>

        <td> </td>
      </tr>

      <tr>
        <td>ConnectionsPerUser</td>

        <td>-</td>

        <td>2.5</td>

        <td>Measured from usage trials with estimated browser mix.</td>
      </tr>

      <tr>
        <td>AverageConnections</td>

        <td>SimultaneousUsers* ConnectionsPerUser</td>

        <td>2500</td>

        <td></td>
      </tr>

      <tr>
        <td>ConnectionsPerSecond</td>

        <td>ConnectionsPerUser* SimultaneousUsers/ UserSessionDuration</td>

        <td>13.88</td>

        <td>Assuming persistent connections that will span the entire user
        session. If connections will not span the session, then multiply by
        PagesPerUserSession.</td>
      </tr>

      <tr>
        <td>PeakRequestsPerSecond</td>

        <td>2*ConnectionsPerSecond + (RequestsPerPageNoCache-
        RequestsPerPageCache) * SimultaneousUsers/ UserSessionDuration</td>

        <td>77.76</td>

        <td>Based on SimultaneousUsers doubling in UserSessionDuration. The
        formula represents double the normal request rate, plus the
        additional load of the new users loading the initial page with no
        cache.</td>
      </tr>
    </table>

    <p>This worksheet is only indicative of an estimation process that can be
    used, especially the method for determining the peak request rate. If
    possible, several estimation techniques should be used and the worst case
    numbers assumed.</p>
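
    <p>To make the arithmetic explicit, the following minimal Java sketch
    reproduces the example column of the worksheet. All inputs are the
    illustrative estimates from the table above, not measured values:</p>

    <pre>public class TrafficEstimate
{
    public static void main(String[] args)
    {
        double simultaneousUsers = 1000;       // estimated
        double userSessionDuration = 180;      // seconds, estimated
        double avePageViewTime = 30;           // seconds, estimated
        double requestsPerPageNoCache = 12;    // from inspection of HTML
        double requestsPerPageCache = 3;       // from usage trials
        double connectionsPerUser = 2.5;       // from usage trials

        double pagesPerUserSession =
            userSessionDuration / avePageViewTime;                    // 6
        double requestsPerUserSession = requestsPerPageNoCache
            + requestsPerPageCache * (pagesPerUserSession - 1);       // 27
        double requestsPerSecPerUser =
            requestsPerUserSession / userSessionDuration;             // 0.15
        double requestsPerSec =
            simultaneousUsers * requestsPerSecPerUser;                // 150
        double averageConnections =
            simultaneousUsers * connectionsPerUser;                   // 2500
        double connectionsPerSecond = connectionsPerUser
            * simultaneousUsers / userSessionDuration;                // ~13.88
        double peakRequestsPerSecond = 2 * connectionsPerSecond
            + (requestsPerPageNoCache - requestsPerPageCache)
            * simultaneousUsers / userSessionDuration;                // ~77.76

        System.out.println(&#34;RequestsPerSec     = &#34; + requestsPerSec);
        System.out.println(&#34;AverageConnections = &#34; + averageConnections);
        System.out.println(&#34;PeakRequestsPerSec = &#34; + peakRequestsPerSecond);
    }
}
</pre>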

    <h2>Clustered Traffic</h2>

    <p>When running a cluster of application servers, it is often desirable to
    be able to handle the maximum expected load in the event of a node failure.
    Thus once the single node traffic has been estimated or measured, the
    traffic loads for failure modes can be calculated:</p>

    <table border="1">
      <tr>
        <th>Nodes in Cluster</th>

        <th>Failed Nodes</th>

        <th>Load</th>
      </tr>

      <tr>
        <td>2</td>

        <td>1</td>

        <td>200%</td>
      </tr>

      <tr>
        <td>3</td>

        <td>1</td>

        <td>150%</td>
      </tr>

      <tr>
        <td>3</td>

        <td>2</td>

        <td>300%</td>
      </tr>

      <tr>
        <td>4</td>

        <td>1</td>

        <td>133%</td>
      </tr>

      <tr>
        <td>4</td>

        <td>2</td>

        <td>200%</td>
      </tr>
    </table>
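
    <p>Each row of the table above follows from sharing the full load among
    the surviving nodes:</p>

    <pre>Load == NodesInCluster / (NodesInCluster - FailedNodes)</pre>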

    <h2>Generating Traffic</h2>

    <p>Once the expected traffic profile has been analysed, a test client can
    be used to generate load on the server that reflects that profile. It is
    important to make sure that the test client used is generating realistic
    load:</p>

    <ul>
      <li>Are persistent connections supported? Persistent connections are
      much more efficient than non-persistent connections and a realistic mix
      should be used to represent the expected browser population using the
      server.</li>

      <li>Are connections held idle for realistic times? Idle connections
      reduce latency for individual users at the expense of server resources.
      A test client that does not hold connections idle will not test the
      server's ability to balance these competing resource requirements.</li>

      <li>Does the test client account for client caching and
      if-modified-since headers? Most pages of a webapp are rendered from a
      cluster of requests for the initial page and its included resources
      such as images and style sheets. Most client browsers will cache many of
      the included resources and may often issue no requests for them or a
      request with an if-modified-since header that can be responded to with a
      simple 304-Not-Modified response. Test clients that do not model client
      caching will be measuring an unlikely worst case scenario.</li>

      <li>Is the test client run on a different machine to the server? Local
      networking has different characteristics to remote networking and a
      local test client will consume resources that could have been used by
      the server.</li>
    </ul>

    <h1>Optimizing Jetty</h1>

    <p>Jetty has a few features that have been deprecated or that are
    particularly resource hungry. Before starting to optimize the more
    conventional attributes, it is worthwhile to make sure that these features
    are turned off or minimally configured.</p>

    <h2>Request Log Buffering</h2>

    <p>The Jetty request log mechanism has the ability to buffer its output in
    memory before writing it to a file, which was intended to reduce
    synchronization load on the server. Unfortunately analysis of actual
    performance shows that a server with buffering turned on has around 5%
    lower maximum throughput. Prior to Jetty release 4.2.9 log buffering was
    turned on by default. This should be turned off:</p>

    <pre>&#60;Configure class=&#34;org.mortbay.jetty.Server&#34;&#62;
  ...
  &#60;Set name=&#34;RequestLog&#34;&#62;
    &#60;New class=&#34;org.mortbay.http.NCSARequestLog&#34;&#62;
      &#60;Arg&#62;&#60;SystemProperty name=&#34;jetty.home&#34; default=&#34;.&#34;/&#62;/logs/yyyy_mm_dd.request.log&#60;/Arg&#62;
      &#60;Set name=&#34;retainDays&#34;&#62;90&#60;/Set&#62;
      &#60;Set name=&#34;append&#34;&#62;true&#60;/Set&#62;
      &#60;Set name=&#34;extended&#34;&#62;false&#60;/Set&#62;
      &#60;Set name=&#34;buffered&#34;&#62;false&#60;/Set&#62;
      &#60;Set name=&#34;logTimeZone&#34;&#62;GMT&#60;/Set&#62;
    &#60;/New&#62;
  &#60;/Set&#62;
  ...
&#60;/Configure&#62;
</pre>

    <h2>Statistics</h2>

    <p>The Jetty server supports statistic collection at the server and at the
    context level. While stats collection itself does not involve significant
    work load, it does require synchronization in order to correctly count
    some statistics. On a multi CPU machine, this extra synchronization could
    significantly affect the performance of the server, thus statistics should
    be turned off while optimizing the server. Note that this is somewhat
    counter-productive, as the statistics are very useful for measuring the
    results of optimization. Thus the recommended use of the server statistics
    is to measure the profile of real load being handled by the server. This
    profile can then be used in generating test load, hopefully from a test
    client that can itself generate statistics with which to evaluate
    optimizations.</p>

    <h2>NIO SocketChannelListener</h2>

    <p>Jetty releases from release 4.0.0 to 4.2.9 contained the
    SocketChannelListener implementation of the HttpListener interface. This
    implementation used the java 1.4 NIO library to place idle connections on
    non-blocking sockets. The intent was to avoid allocating a java thread to
    idle connections. Unfortunately, due to the nature of the servlet API, the
    sockets had to be returned to blocking mode before control was passed to a
    servlet. The resulting constant changing of the NIO select sets proved to
    consume significantly more system resources than was saved by reducing the
    required number of threads. The SocketChannelListener has been deprecated
    since 4.2.10 and should not be used in any release except for experimental
    purposes.</p>

    <h2>Max Read Time</h2>

    <p>The Jetty HTTP Listeners in versions prior to 4.1.1 had a parameter
    called maxReadTime, which was used to limit the time a request handler
    would wait for request content (e.g. on a form POST). This parameter, like
    maxIdleTime, was used to set the SO timeout value on the underlying
    connection socket. Unfortunately, if the maxReadTime value was different
    to the maxIdleTime value, then the SO timeout value was changed twice for
    every request. This proved to cause a significant reduction of throughput
    of the server, in the order of 10%. Thus for Jetty versions prior to 4.1.1
    it is important to set the maxIdleTime and maxReadTime parameters to the
    same value:</p>

    <pre>&#60;Configure class=&#34;org.mortbay.jetty.Server&#34;&#62;
  &#60;Call name=&#34;addListener&#34;&#62;
    &#60;Arg&#62;
      &#60;New class=&#34;org.mortbay.http.SocketListener&#34;&#62;
        &#60;Set name=&#34;port&#34;&#62;8080&#60;/Set&#62;
        &#60;Set name=&#34;minThreads&#34;&#62;25&#60;/Set&#62;
        &#60;Set name=&#34;maxThreads&#34;&#62;255&#60;/Set&#62;
        &#60;Set name=&#34;maxIdleTimeMs&#34;&#62;60000&#60;/Set&#62;
        &#60;Set name=&#34;maxReadTimeMs&#34;&#62;60000&#60;/Set&#62;
      &#60;/New&#62;
    &#60;/Arg&#62;
  &#60;/Call&#62;
  ...
&#60;/Configure&#62;
</pre>

    <p>For Jetty versions 4.1.1 or later, maxReadTime should not be set as it
    is ignored and produces a warning.</p>

    <h1>Optimizing Memory</h1>

    <p>Memory is a key resource that must be managed in any optimization of a
    web container. The procedure is to:</p>

    <ol>
      <li>Measure the static and dynamic memory requirements of your
      application.</li>

      <li>Configure the JVM's memory limits.</li>

      <li>Adjust the thread pool to constrain dynamic memory use.</li>

      <li>Tune garbage collection.</li>
    </ol>

    <p>To tune memory usage, a profiling tool like OptimizeIt or JProbe can
    be very useful; however, it can also be done simply by monitoring the
    memory allocated to the process by the operating system.</p>

    <h2>Measuring memory usage</h2>

    <p>Running a webapp can consume memory for:</p>

    <ul>
      <li>Statically allocated memory during initialization.</li>

      <li>Heap space allocated for Session objects per user of the webapp.</li>

      <li>Stack space allocated per thread.</li>

      <li>Heap space allocated for objects created during the processing of
      requests.</li>
    </ul>

    <h3>Check for memory leaks</h3>

    <p>Before optimizing your memory, it is important to establish that your
    webapp does not have any memory leaks. This is to say that no memory is
    allocated when processing requests that cannot be freed when the server
    returns to idle. This can be determined by running the application with a
    constant low to medium load and monitoring the memory usage. The memory
    allocated should increase to a level and then stabilize. If the memory
    continues to grow and/or an OutOfMemoryError is eventually thrown,
    then the application has an object/memory leak. Such an application will
    not be able to run long term and the leak should be fixed before
    optimizing or deploying the webapp.</p>

    <p>Note that application data caches or poor garbage collection (GC)
    behaviour may appear as a memory leak. If possible, disable application
    caches or configure them to a small size in order to test the application's
    underlying memory requirements. The JVM may be forced to perform a GC
    after a fixed number of requests by the requestsPerGC attribute of
    HttpServer. This can be set to a low value to avoid large fluctuations in
    memory usage during this measurement phase: </p>

    <pre>&#60;Configure class=&#34;org.mortbay.jetty.Server&#34;&#62;
  ... 
  &#60;Set name=&#34;requestsPerGC&#34;&#62;100&#60;/Set&#62;
  ...
&#60;/Configure&#62;
</pre>

    <h3>Stack space Usage</h3>

    <p>JVMs allocate a fixed amount of stack space per thread created. The
    stack space is used for storing parameters and other objects associated
    with a method call. The more nested method calls that your application
    requires (deeper stacks), the more stack space is required. Typically the
    default stack settings for JVMs are rather generous and are allocated per
    thread, thus significant savings can be made by tuning this allocation.</p>

    <p>For many JVMs, the stack space allocation is controlled with the -Xss
    option and the following command runs Jetty with 96kb allocated per stack:</p>

    <pre>java -Xss96k -jar start.jar</pre>

    <p>The simplest way to measure your stack requirements is to reduce the
    stack allocated until complex requests fail with a StackOverflowError.
    You then need to increase your stack allocation with a good safety margin,
    the size of which will depend greatly on your application, as some may have
    large variation in stack usage, especially those that use recursion.</p>

    <h3>Static &#38; Dynamic Memory Usage</h3>

    <p>An estimate of the static and dynamic heap space usage is needed to
    optimize the memory allocation. This is best done by measuring memory
    usage under realistic steady load at several load levels. The Jetty
    HttpListener should be configured with a low minThreads setting, so that
    idle threads do not affect the measurements.</p>
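
    <p>For example, on a unix system the resident set size of the server
    process can be sampled at each load level as follows (the process id 12345
    is a placeholder):</p>

    <pre>ps -o rss -p 12345</pre>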

    <p>The following table shows some results from a simple test of memory
    usage, using the unix ps command to determine the resident memory set size:</p>

    <table border="1">
      <tr>
        <th>Active connections/threads</th>

        <th>Process size in kb</th>

        <th>kb per connection</th>
      </tr>

      <tr>
        <td>0</td>

        <td>23076</td>

        <td></td>
      </tr>

      <tr>
        <td>20</td>

        <td>27540</td>

        <td>224</td>
      </tr>

      <tr>
        <td>40</td>

        <td>29352</td>

        <td>90</td>
      </tr>

      <tr>
        <td>60</td>

        <td>31868</td>

        <td>125</td>
      </tr>

      <tr>
        <td>100</td>

        <td>33852</td>

        <td>49</td>
      </tr>

      <tr>
        <td>150</td>

        <td>38264</td>

        <td>88</td>
      </tr>
    </table>

    <p>Extrapolating from this table gives the following approximate formula
    for memory usage for this webapp:</p>

    <pre>memoryRequired = 23Mb + threads * 200kb</pre>

    <p>Ideally this formula should be tested with direct measurement under all
    load levels.</p>

    <p>This formula can now be used to calculate the memory requirements for
    your system, and the JVM parameters should be set to ensure that enough
    memory is available when the maximum number of threads are in use. For the
    above example, if a maximum of 500 threads are required (see below) and a
    128k stack size is used, then approximately 120MB of memory is required and
    the JVM's memory parameters should be configured as follows:</p>

    <pre>java -Xss128k -Xms120m -jar start.jar</pre>

    <p>Alternatively, the memory formula can be used in reverse. If a known
    amount of physical or virtual memory is available and must not be
    exceeded, then the maximum number of threads can be determined.</p>
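
    <p>For example, with a hypothetical budget of 43Mb, the example formula
    above gives:</p>

    <pre>maxThreads == ( 43Mb - 23Mb ) / 200kb == 100</pre>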

    <h2>Clustered Memory Usage</h2>

    <p>Memory usage for a node in a cluster cannot be measured by looking at a
    single node. If distributed sessions or EJBs are being used, then memory
    used on one node may be replicated on all nodes. For example, with
    distributed HTTP sessions, each node must have capacity to store all the
    sessions for all the nodes in the cluster.</p>

    <p>For this reason, it is often desirable not to have large homogeneous
    clusters. Rather, a cluster-of-clusters topology can reduce the memory and
    failure contingency load on each node.</p>

    <h1>Optimizing Threads</h1>

    <p>Once you have determined your traffic profile and your memory profile,
    it is now possible to tune your server by adjusting the parameters of the
    thread pool. Each Jetty HttpListener has a pool of threads that is used to
    allocate threads to accepted connections. The following parameters can be
    used to configure the thread pool of each listener:</p>

    <table border="1">
      <tr>
        <th>Parameter</th>

        <th>Comment</th>
      </tr>

      <tr>
        <td>maxThreads</td>

        <td>The limit on the number of threads that can be allocated to
        connections for that HTTP listener. This will effectively limit the
        number of simultaneous users of the server as well as the maximum
        memory usage.</td>
      </tr>

      <tr>
        <td>minThreads</td>

        <td>The minimum number of unused threads to keep within the thread
        pool. A large number of unused threads allows a server to respond
        to a sudden increase in load with little latency. More importantly, a
        HTTP listener is considered to be low on resources once its pool
        cannot allocate minThreads unused threads without exceeding
        maxThreads.</td>
      </tr>

      <tr>
        <td>maxIdleTimeMs</td>

        <td>The maximum time in milliseconds that a thread can be allocated to
        a connection without a request being received. This limits the
        duration of idle persistent connections.</td>
      </tr>

      <tr>
        <td>lowResourcePersistTimeMs</td>

        <td>An alternative value for maxIdleTimeMs to be used when the
        listener is low on resources (see minThreads).</td>
      </tr>

      <tr>
        <td>poolName</td>

        <td>If multiple HTTP Listeners are used, those with the same pool name
        will share the same thread pool. This avoids one listener running low
        on threads while another has idle threads.</td>
      </tr>
    </table>

    <h2>Setting maxThreads</h2>

    <p>The primary objective of the maxThreads setting is to protect the server
    from excess resource utilization from high connection or request rates.
    Without a limit to the maximum threads, it would be possible for arbitrarily
    high load to be accepted by the server, which would eventually lead to one
    of the following failure modes:</p>

    <ul>
      <li>Out of memory. Each accepted connection/thread consumes memory and
      unlimited threads will eventually result in an OutOfMemoryError.
      Note that the memory allocated to the JVM can be increased to avoid this
      limit, but at some level physical memory will be exceeded and the server
      performance will decline. Eventually virtual memory can be exhausted.
      </li>

      <li>Out of threads. Threads are normally implemented by the host
      operating system and are a finite resource that can be exhausted. The OS
      can normally be tuned to increase this limit, but not indefinitely as
      system performance will eventually degrade.</li>

      <li>Out of file descriptors. TCP/IP connections are implemented by most
      operating systems using file descriptors and are a finite resource that
      can be exhausted. The OS can normally be tuned to increase this limit,
      but not indefinitely as system performance will eventually degrade.</li>

      <li>100% CPU. Each connection accepted allows a flow of requests
      into the system, each of which takes CPU to process. Once 100% CPU has
      been reached, any additional connections accepted just increase latency
      for all connections and eventually reduce total throughput.</li>
    </ul>

    <p>There are two main approaches to setting maxThreads:</p>

    <ol>
      <li>If a good estimate or measurement of the maximum load is known, then
      maxThreads is set high enough to handle this, and the system is verified
      to check that none of the failure modes are breached. This approach
      results in a server that is good enough for the webapp and can leave
      server resources available for other uses.</li>

      <li>Various maxThreads values are tested with a test client generating a
      load of approximately the same value. The tested maxThreads value is
      increased until such time as one of the failure modes above is detected
      or the measured throughput starts to decrease. This approach results in
      a server that uses all the system resources and requires a dedicated
      machine.</li>
    </ol>

    <p>If, with either of these approaches, the estimated, measured or required
    maximum load requires a maxThreads value that exhausts the system memory,
    CPU, connections or other resources, then the machine is not sufficient
    for that webapp. In this case, additional server resources (memory, CPU,
    kernel configuration) are required or a clustering solution can be
    considered.</p>

    <p>Once a server has reached its maximum number of threads, then any
    new connections attempted are held by the operating system until either
    they time out, a thread becomes available to accept the connection or they
    are refused when the operating system queue becomes full.</p>

    <h2>Setting minThreads</h2>

    <p>The minThreads value is used to control how a server degrades under
    extreme load. Once there are fewer than minThreads threads available in the
    thread pool, the lowResourcePersistTimeMs parameter can be used to free up
    other idle threads.</p>

    <p>If a good estimate or measure of average and maximum load are known,
    then the minThreads value can be set to half the difference between the
    average and maximum.</p>

    <pre>minThreads == (maxThreads - averageConnections) / 2</pre>

    <p>Thus if maxThreads is 3000 and averageConnections is 2500, then
    minThreads could be set at 250, so that low resource timeouts will be
    applied once the actual connections exceeds 2750.</p>

    <p>Alternatively, minThreads may be set to protect against excess memory
    usage. If maxThreads requires more memory than is physically available,
    then minThreads can be set to free resources once physical memory is
    exceeded. Using the memory formula example from above, and if 47Mb of
    physical memory is available on the system (when running the OS), then for
    maxThreads == 200:</p>

    <pre>minThreads == maxThreads - ( ( 47Mb - 23Mb ) / 200kb ) == 80</pre>

    <h2>Setting maxIdleTimeMs</h2>

    <p>The idle time of a thread is used to limit the time that a persistent
    connection can be idle. Higher values are desirable to reduce latency for
    a user and avoid the expense of recreating TCP/IP connections. However, if
    the value is set too high, it will result in many connections being left
    open when the user is no longer browsing the webapp, and the resources
    allocated to them are effectively wasted for a long period of time.</p>

    <p>A good value to use for the maxIdleTimeMs is slightly longer than the
    average page view time for the application, so that persistent connections
    are held long enough to span the time between page requests for an average
    user.</p>
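
    <p>For example, with the 30 second AvePageViewTime estimated in the
    worksheet above, this rule of thumb suggests:</p>

    <pre>maxIdleTimeMs &#62; AvePageViewTime == 30000ms</pre>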

    <h2>Setting lowResourcePersistTimeMs</h2>

    <p>A HTTP Listener is considered low on resources if there are fewer than
    minThreads threads available in the thread pool, and a
    lowResourcePersistTimeMs can be set to replace maxIdleTimeMs so that idle
    connections can be freed for other connections. The reasoning for this is
    that once a server is low on resources, there is little benefit in keeping
    resources allocated to idle connections in the hope that new requests will
    come from them.</p>

    <p>With a low lowResourcePersistTimeMs value set, performance will
    degrade more gracefully as maxThreads is approached.</p>

    <p>The value of lowResourcePersistTimeMs should be long enough to ensure
    that all requests in the request cluster for a page view can be served by a
    persistent connection. This is typically governed by the network latency;
    it should not be more than a few seconds and can be as low as a few
    hundred milliseconds on a good network.</p>

    <h2>Setting poolName</h2>

    <p>If a server has multiple HTTP listeners configured, it may be desirable
    to share the thread pool between listeners, so that one listener is not
    starved of resources while another has free threads. If you wish to reserve
    capacity for a particular listener, then a shared thread pool should not
    be used:</p>

    <pre>&#60;Configure class=&#34;org.mortbay.jetty.Server&#34;&#62;
  ...
 &#60;Call name=&#34;addListener&#34;&#62;
    &#60;Arg&#62;
      &#60;New class=&#34;org.mortbay.http.SocketListener&#34;&#62;
        &#60;Set name=&#34;port&#34;&#62;8080&#60;/Set&#62;
        &#60;Set name=&#34;minThreads&#34;&#62;80&#60;/Set&#62;
        &#60;Set name=&#34;maxThreads&#34;&#62;200&#60;/Set&#62;
        &#60;Set name=&#34;maxIdleTimeMs&#34;&#62;30000&#60;/Set&#62;
        &#60;Set name=&#34;lowResourcePersistTimeMs&#34;&#62;2500&#60;/Set&#62;
        &#60;Set name=&#34;poolName&#34;&#62;Listener&#60;/Set&#62;
      &#60;/New&#62;
    &#60;/Arg&#62;
  &#60;/Call&#62;

  &#60;Call name=&#34;addListener&#34;&#62;
    &#60;Arg&#62;
      &#60;New class=&#34;org.mortbay.http.SunJsseListener&#34;&#62;
        &#60;Set name=&#34;port&#34;&#62;443&#60;/Set&#62;
        &#60;Set name=&#34;poolName&#34;&#62;Listener&#60;/Set&#62;
        &#60;Set name=&#34;keystore&#34;&#62;./etc/demokeystore&#60;/Set&#62;
        &#60;Set name=&#34;password&#34;&#62;OBF:1vny1zlo1x8e1vnw1vn61x8g1zlu1vn4&#60;/Set&#62;
        &#60;Set name=&#34;keyPassword&#34;&#62;OBF:1u2u1wml1z7s1z7a1wnl1u2g&#60;/Set&#62;
      &#60;/New&#62;
    &#60;/Arg&#62;
  &#60;/Call&#62;
  ...
&#60;/Configure&#62;
</pre>

    <h1>Other Optimizations</h1>

    <h2>Buffering</h2>

    <p>Providing larger buffers for the HTTP Listeners allows more efficient
    processing and generation of content, with less blocking and context
    switching. It also allows the TCP/IP protocol to more efficiently run
    its sliding window protocol and avoid network latencies. Prior to
    Jetty release 4.2.10, the default buffer size was 4096 bytes. This has now
    been increased to 8192 bytes. The buffer size can be set as follows:</p>

    <pre>&#60;Configure class=&#34;org.mortbay.jetty.Server&#34;&#62;
  ...
 &#60;Call name=&#34;addListener&#34;&#62;
    &#60;Arg&#62;
      &#60;New class=&#34;org.mortbay.http.SocketListener&#34;&#62;
        &#60;Set name=&#34;port&#34;&#62;8080&#60;/Set&#62;
        &#60;Set name=&#34;minThreads&#34;&#62;80&#60;/Set&#62;
        &#60;Set name=&#34;maxThreads&#34;&#62;200&#60;/Set&#62;
        &#60;Set name=&#34;maxIdleTimeMs&#34;&#62;30000&#60;/Set&#62;
        &#60;Set name=&#34;lowResourcePersistTimeMs&#34;&#62;2500&#60;/Set&#62;
        &#60;Set name=&#34;poolName&#34;&#62;Listener&#60;/Set&#62;
        &#60;Set name=&#34;bufferSize&#34;&#62;8192&#60;/Set&#62;
      &#60;/New&#62;
    &#60;/Arg&#62;
  &#60;/Call&#62;
  ...
&#60;/Configure&#62;
</pre>

    <h2>Security</h2>

    <p>Authenticated security constraints on a webapp can be expensive to
    check, as a realm is often implemented using cryptographic algorithms or
    involves a remote AAA server or database.</p>

    <p>Frequently a webapp page is constructed with many images that are not
    sensitive and do not need to be protected with an authenticated security
    constraint. Significant performance gains can be obtained by excluding
    such static resources from a security constraint.</p>

    <p>For example, consider a webapp that protects the directory /private
    with an authenticated constraint, but has a number of non-sensitive images
    in the /private/images directory. The following web.xml excerpt can be
    used to protect the private directory without the expense of protecting
    the images directory:</p>

    <pre>  ...
  &#60;security-constraint&#62;
    &#60;web-resource-collection&#62;
      &#60;web-resource-name&#62;Authed User Required&#60;/web-resource-name&#62;
      &#60;url-pattern&#62;/private/*&#60;/url-pattern&#62;
    &#60;/web-resource-collection&#62;
    &#60;auth-constraint&#62;
      &#60;role-name&#62;*&#60;/role-name&#62;
    &#60;/auth-constraint&#62;
  &#60;/security-constraint&#62;
  
  &#60;security-constraint&#62;
    &#60;web-resource-collection&#62;
      &#60;web-resource-name&#62;Images Not Protected&#60;/web-resource-name&#62;
      &#60;url-pattern&#62;/private/images/*&#60;/url-pattern&#62;
      &#60;http-method&#62;GET&#60;/http-method&#62;
      &#60;http-method&#62;HEAD&#60;/http-method&#62;
    &#60;/web-resource-collection&#62;
  &#60;/security-constraint&#62;</pre>

    <h2>Logging</h2>

    <p>Logging of requests can add extra CPU load per request and an extra
    synchronization point. The following points should be considered to
    optimize the logging configuration: </p>

    <ul>
      <li>Is logging really required? Many webapps collect request logs that
      are never viewed or analyzed. If the logs are unlikely to be used, then
      it would be better not to generate them. Note that there is a security
      audit aspect to collecting request logs that may require them to be
      generated even if seldom viewed.</li>

      <li>Is the extended log format required? The extra content of the
      extended log is only useful if detailed log analysis is being performed.
      </li>

      <li>Are all requests required to be logged? Images and style sheets often
      do not add any significant information to a request log. The ignorePaths
      attribute of the NCSARequestLog class can be used to exclude some paths
      from the log.</li>

      <li>Turn off buffering.</li>
    </ul>

    <p>The following request log configuration applies the points above.</p>

    <pre>&#60;Configure class=&#34;org.mortbay.jetty.Server&#34;&#62;
  ...
  &#60;Set name=&#34;RequestLog&#34;&#62;
    &#60;New class=&#34;org.mortbay.http.NCSARequestLog&#34;&#62;
      &#60;Set name=&#34;filename&#34;&#62;./logs/yyyy_mm_dd.request.log&#60;/Set&#62;
      &#60;Set name=&#34;buffered&#34;&#62;false&#60;/Set&#62;
      &#60;Set name=&#34;retainDays&#34;&#62;90&#60;/Set&#62;
      &#60;Set name=&#34;append&#34;&#62;true&#60;/Set&#62;
      &#60;Set name=&#34;extended&#34;&#62;false&#60;/Set&#62;
      &#60;Set name=&#34;logTimeZone&#34;&#62;GMT&#60;/Set&#62;
      &#60;Set name=&#34;ignorePaths&#34;&#62;
        &#60;Array type=&#34;String&#34;&#62;
          &#60;Item&#62;/images/*&#60;/Item&#62;
          &#60;Item&#62;*.css&#60;/Item&#62;
        &#60;/Array&#62;
      &#60;/Set&#62;
    &#60;/New&#62;
  &#60;/Set&#62;
  ...
&#60;/Configure&#62;</pre>

    <h2>Application</h2>

    <p>The way a web application is written can greatly affect the efficiency
    of the service. The following points should be considered when writing or
    reviewing your web application:</p>

    <ul>
      <li>Do not flush the response output stream or writers. This can result
      in inefficient packet fragmentation.</li>

      <li>If possible, implement the HttpServlet.getLastModified() method so
      that content is only generated and served if the browser does not have a
      cached copy of the page (see the sketch after this list).</li>

      <li>If possible, set the content length of the content served. This
      allows simple persistent connections for both HTTP/1.0 and HTTP/1.1
      clients.</li>
    </ul>
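
    <p>As an illustrative sketch only (the class, content and modification
    time source are hypothetical, not part of any Jetty API), a servlet
    applying these points might look like:</p>

    <pre>import java.io.IOException;
import java.io.PrintWriter;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class EfficientServlet extends HttpServlet
{
    // Placeholder: a real servlet would track when its content changes
    private final long lastModified = System.currentTimeMillis();

    // Lets the container answer if-modified-since requests with a
    // 304-NotModified response instead of regenerating the page
    protected long getLastModified(HttpServletRequest request)
    {
        return lastModified;
    }

    protected void doGet(HttpServletRequest request,
                         HttpServletResponse response) throws IOException
    {
        String page = &#34;Hello from an optimized servlet&#34;;
        response.setContentType(&#34;text/plain&#34;);
        // An explicit content length allows persistent connections for
        // both HTTP/1.0 and HTTP/1.1 clients (page is single byte chars)
        response.setContentLength(page.length());
        PrintWriter out = response.getWriter();
        out.print(page);
        // No explicit flush: the container flushes the complete response,
        // avoiding packet fragmentation
    }
}
</pre>

    <p>With a meaningful getLastModified() implementation, repeat visits can
    be served with a cheap 304 response, and the explicit content length means
    the container does not need to close the connection to mark the end of the
    response.</p>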
  </body>
</html>