Carrier Class DHCP Testing Setup and Suggested Performance Numbers
Devising a carrier-class test environment to measure near-real-world DHCP performance
By: Bruce Bahlmann - Contributing Author
Created: September 9, 2002
The purpose of this document is to define a testing environment for a
DHCP server that closely represents the load that such a server would see in a large
broadband operator or Internet Service Provider (ISP). The following are necessary
components in this testing environment:
DHCP client generator: The goal of the DHCP client generator
is to non-invasively test the performance of a DHCP server through a battery of
performance and packet-handling tests. The generator operates much like a DHCP client,
performing various interactions with a DHCP server and occasionally throwing an
errant or incomplete packet at it. The client measures, or times, these
transactions with the DHCP server both individually and overall. An individual transaction
represents a handshaking event between server and client (e.g. DISCOVER-OFFER) and can be
measured in terms of loop times and stored by the client. The client also keeps track of
completed overall DHCP cycles, that is, those that complete the full DHCP exchange with the
server (i.e. from DHCP DISCOVER to DHCP ACK), and maintains overall cycle
times for each of them. Some handy indications on a running
DHCP client generator might include:
- Average transaction time: The current average of all completed DHCP transactions between
server and client. This number is helpful because it gradually increases as the server and
the network become increasingly taxed.
- Average overall cycle time: The current average of all completed DHCP cycles with the
server (DISCOVER to ACK).
- % of completed DHCP transactions: The percentage of transactions with the server that
have been successfully completed by the DHCP client generator (completed as opposed to
timed out or dropped).
- Current transaction rate: The number of transactions currently being sent to the server
per second.
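The four indicators above could be tracked with a small bookkeeping object. The following is only a minimal sketch: the class name `GeneratorStats`, its method names, and the one-second rate window are assumptions made for illustration and do not come from any particular DHCP test tool.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, List


@dataclass
class GeneratorStats:
    """Running indicators for one DHCP client generator (hypothetical helper)."""
    transaction_times: List[float] = field(default_factory=list)  # completed only, seconds
    cycle_times: List[float] = field(default_factory=list)        # DISCOVER to ACK, seconds
    attempted: int = 0                                             # transactions sent so far
    send_stamps: Deque[float] = field(default_factory=deque)       # recent send timestamps

    def record_send(self, now: float) -> None:
        """Note that a transaction was sent to the server at time `now`."""
        self.attempted += 1
        self.send_stamps.append(now)

    def record_transaction(self, elapsed: float) -> None:
        """Store the loop time of one completed transaction (e.g. DISCOVER-OFFER)."""
        self.transaction_times.append(elapsed)

    def record_cycle(self, elapsed: float) -> None:
        """Store the time of one completed DISCOVER-to-ACK cycle."""
        self.cycle_times.append(elapsed)

    def average_transaction_time(self) -> float:
        times = self.transaction_times
        return sum(times) / len(times) if times else 0.0

    def average_cycle_time(self) -> float:
        times = self.cycle_times
        return sum(times) / len(times) if times else 0.0

    def percent_completed(self) -> float:
        """Completed transactions as a percentage of those sent."""
        if not self.attempted:
            return 0.0
        return 100.0 * len(self.transaction_times) / self.attempted

    def current_rate(self, now: float, window: float = 1.0) -> float:
        """Transactions sent per second over the last `window` seconds."""
        while self.send_stamps and self.send_stamps[0] <= now - window:
            self.send_stamps.popleft()
        return len(self.send_stamps) / window
```

Timed-out or dropped transactions are simply never passed to `record_transaction`, so they lower the completion percentage without affecting the averages.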
As each client transaction is about to begin, it is helpful to take
a snapshot of these average times, the last completed individual transaction time, and the
overall cycle time, and then store these along with the record assigned to the impending
transaction. The purpose of this snapshot is to be able to determine the
overall performance of the DHCP server as of the last good transactions before it begins
dropping packets (finding this point should be the goal of any quality DHCP testing).
When stress testing, you want to find the spot at which the server fails, begins to drop
packets, and/or does not complete requested DHCP transactions with clients. Note that each
of these spots may occur at a different time (if at all) as load is increased. The server
may not fail at all unless the incoming packets somehow overload the
application, available resources [disk, connection, memory, etc.], or the operating system
[swap, memory, disk, etc.]. If the server does not fail, it may just drop some packets
while completing others; it all depends on the capability of the server to
prioritize its processing and complete the work it has started. It is the duty
of the client generator to determine this point as well as the performance of the server
leading up to that point. The sweet spot of the server (how many DHCP clients it can
effectively maintain at any given time) may well be the spot at which the server can
no longer keep up with any additional load, or potentially just beyond this point, depending
on what the server does upon reaching saturation as well as its ability to overcome these
instances and catch back up with the incoming requests. The ability to catch back up
with incoming requests must be a feature of carrier class DHCP servers and should be
thoroughly tested!
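One way to picture the per-transaction snapshot is as a small record stored with each transaction, which can later be scanned for the last good transaction before the first drop. This is a sketch under assumptions: the `Snapshot` fields and the function name are hypothetical, and the records are assumed to be kept in the order the transactions began.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Snapshot:
    """State captured as a transaction is about to begin (hypothetical record)."""
    avg_transaction_time: float   # running average at that moment, seconds
    avg_cycle_time: float         # running DISCOVER-to-ACK average, seconds
    last_transaction_time: float  # most recently completed transaction, seconds
    rate_tps: float               # load being offered at that moment
    completed: bool               # set once the transaction completes or is dropped


def last_good_before_first_drop(records: List[Snapshot]) -> Optional[Snapshot]:
    """Return the snapshot of the last completed transaction before the first drop.

    Returns None if nothing was dropped during the run, or if the very
    first transaction was itself dropped.
    """
    last_good = None
    for rec in records:
        if not rec.completed:
            return last_good
        last_good = rec
    return None
```

The returned snapshot carries the averages and offered rate that describe server performance just before packets began to drop.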
In terms of which performance numbers should be used to explain the
overall performance of the server, three options exist:
- Emphasizes Client Capacity: The average transaction time measured at the point where the
server supports the most possible DHCP clients. This point may not guarantee that the server
doesn't drop any packets, only that any additional clients added merely increase the
number of packets dropped by the same amount, so no additional client capacity is
realized. It is assumed that the transaction times for dropped packets do not
figure into the average response time at this point. This point presents a client capacity
ceiling for the server, which is helpful design information for architecting
solutions that would likely prevent customers from reaching the server's
capacity ceiling.
- Emphasizes Client Response Time: The average transaction time measured at some optimum
point in the performance curve of the server, perhaps the center of the sweet spot for server
response time performance. This point may be desirable so long as it represents
some reasonably useful client numbers (e.g. 100,000) while still representing an
economical solution to the customer.
- Emphasizes No Packets Dropped: The average transaction time measured at the point where
the server begins dropping packets. This response time represents a compromise between the
above extremes (response time versus client capacity). These numbers may represent the most
economical situation for customers, as they would provide the most client capacity per box
while still performing at fairly low response times.
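The three options can be illustrated by picking points off a measured load curve. The sketch below rests on several assumptions that are not in the text: the `LoadStep` record is hypothetical, the steps are listed in order of increasing load, the "no packets dropped" point is taken as the highest step that was still fully served, and the "response time" point is taken as the fastest step that serves at least a chosen minimum number of clients.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LoadStep:
    """One step of a stepped load test (hypothetical record)."""
    clients: int                 # clients offered at this step
    served: int                  # clients whose transactions completed
    avg_transaction_time: float  # seconds, completed transactions only


def capacity_point(steps: List[LoadStep]) -> LoadStep:
    """Lowest load at which the number of served clients peaks."""
    return max(steps, key=lambda s: s.served)  # max() keeps the first peak


def no_drop_point(steps: List[LoadStep]) -> Optional[LoadStep]:
    """Highest load that was still served without any drops."""
    clean = [s for s in steps if s.served == s.clients]
    return max(clean, key=lambda s: s.clients) if clean else None


def response_time_point(steps: List[LoadStep], min_clients: int) -> Optional[LoadStep]:
    """Fastest step that still serves a useful number of clients."""
    useful = [s for s in steps if s.served >= min_clients]
    return min(useful, key=lambda s: s.avg_transaction_time) if useful else None


# Illustrative, made-up measurements; not results from any real server.
steps = [
    LoadStep(50_000, 50_000, 0.004),
    LoadStep(100_000, 100_000, 0.006),
    LoadStep(150_000, 150_000, 0.011),
    LoadStep(200_000, 180_000, 0.030),
    LoadStep(250_000, 180_000, 0.031),
]
```

With these made-up numbers, the capacity point is the 200,000-client step (served clients stop growing there), the no-drop point is the 150,000-client step, and the response-time point for a 100,000-client minimum is the 100,000-client step.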
Since multiple DHCP client generators should be used, these numbers
should be averaged across all instances of the DHCP client generators. Each DHCP client
generator should be assigned a range of valid (registered) and invalid (unregistered) MAC
addresses to generate. No MAC address should be duplicated across DHCP client
generators; however, all MAC addresses with registered leases in the DHCP server should be
used during the testing.
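Dividing MAC addresses among generators without duplication can be sketched as a simple round-robin deal. The function names, the generator labels, and the locally administered base address `02:00:00:00:00:00` are assumptions chosen for illustration; in a real test, the registered addresses would be the ones actually loaded into the DHCP server.

```python
from typing import Dict, List


def mac_from_int(value: int) -> str:
    """Format a 48-bit integer as a colon-separated MAC address."""
    return ":".join(f"{b:02x}" for b in value.to_bytes(6, "big"))


def assign_macs(generators: List[str], registered: int, unregistered: int,
                base: int = 0x020000000000) -> Dict[str, Dict[str, List[str]]]:
    """Deal MAC addresses out round-robin so no address lands on two generators.

    The first `registered` addresses from `base` are treated as registered and
    the next `unregistered` addresses as unregistered (an assumed layout).
    """
    plan = {g: {"registered": [], "unregistered": []} for g in generators}
    for i in range(registered):
        owner = generators[i % len(generators)]
        plan[owner]["registered"].append(mac_from_int(base + i))
    for j in range(unregistered):
        owner = generators[j % len(generators)]
        plan[owner]["unregistered"].append(mac_from_int(base + registered + j))
    return plan
```

For example, `assign_macs(["gen-a", "gen-b", "gen-c"], 100_000, 5_000)` would cover every registered address exactly once while giving each generator its own share of unregistered addresses.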
Product Feature Idea: It may be helpful to track performance
at each of the points explained earlier: point of failure, point of dropped packets, point
of not completing DHCP cycles. Each of these points could reveal product-differentiating
processing efficiencies in how best to optimize server performance during resource
overload. Some transactions (e.g. unregistered hosts) may well consume the most processor
time during resource overload, thus adding precious seconds to recovery time. It may be
helpful to consider assigning weights or priorities to different types of transactions
based on the state of server operation so as to help the server more successfully navigate
through broadcast storms. It may also be desirable to remember such hosts, or repeated
incomplete requests with a matching checksum, so as to immediately drop these packets during
performance-critical events.
Relay Agent: Since the goal of one or more DHCP client
generator(s) is to emulate the load of a very large system on the DHCP server, these
transactions (server to client) must take place through a relay agent. Large systems all
have one or more relay agents between their DHCP server and their clients.
However, with a large amount of traffic being generated, a single relay agent could distort
the perceived load of the number of clients being generated because the requests can become
serialized through the relay agent. Therefore multiple relay agents (a minimum of three)
should be used to relay traffic to the DHCP server. Behind each relay agent may be one or
more DHCP client generator(s).
DHCP Server: The DHCP server used for testing should be loaded
as it would be to support at least 100,000 clients (or network nodes). Such a server should
have at least 600 subnets entered: 300 primary subnets and 300 secondary subnets. The
secondary subnets may be within the 10net range (10.0.0.0) and the primary subnets
may be within the 24net range (24.0.0.0). Spread equally across these subnets, the server
should be loaded* with 100,000 leases, the MAC addresses of which are divided among the
various DHCP client generators being used to test. Each subnet requires a gateway
or sub-interface on the relay agent so that the DHCP server subnets are in alignment with
the network subnets configured on the router/relay agent.
*NOTE: The object of loading 100,000 leases and 600 subnets into the
DHCP server is to allow performance testing of a simulated production server. Starting with
an empty lease database could skew the packet processing and server startup performance
results.
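The subnet and lease layout described above can be generated with Python's standard `ipaddress` module. This is a sketch under one notable assumption: the text does not state a subnet size, so a /24 prefix is used purely for illustration. The function names are likewise hypothetical.

```python
import ipaddress
from typing import List, Tuple

Network = ipaddress.IPv4Network


def build_subnets(count: int = 300, prefix: int = 24) -> Tuple[List[Network], List[Network]]:
    """Carve `count` primary subnets from 24.0.0.0/8 and `count` secondary
    subnets from 10.0.0.0/8. The /24 prefix length is an assumption."""
    primary_pool = ipaddress.ip_network("24.0.0.0/8").subnets(new_prefix=prefix)
    secondary_pool = ipaddress.ip_network("10.0.0.0/8").subnets(new_prefix=prefix)
    primaries = [next(primary_pool) for _ in range(count)]
    secondaries = [next(secondary_pool) for _ in range(count)]
    return primaries, secondaries


def spread_leases(total: int, subnet_count: int) -> List[int]:
    """Split `total` leases as evenly as possible across the subnets."""
    base, extra = divmod(total, subnet_count)
    return [base + (1 if i < extra else 0) for i in range(subnet_count)]


primaries, secondaries = build_subnets()
leases_per_subnet = spread_leases(100_000, len(primaries) + len(secondaries))
```

Splitting 100,000 leases over 600 subnets gives 166 or 167 leases per subnet, which fits comfortably within the 254 usable hosts of the assumed /24 size.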
The DHCP server should either have
leases directly loaded (if possible) or just load the registered MAC addresses and then
perform a test run that assigns leases to each of the registered MAC addresses. If
these leases are not loaded prior to the performance testing, the server response time may
be impacted by having to assign a lease as part of the response instead of accessing an
existing lease record, modifying it, and then sending a response. Responses to new clients
must additionally deal with ping timeouts and manage a list of temporarily assigned addresses
before committing the lease to the database. Although it may turn out that the response
time to a new client (one that does not exist within the database) is the same as the response
time to an existing client, it seems logical to provide the server with as much
preliminary information as possible about the networks, clients, leases, etc. prior to running
performance tests.
Once the server is loaded with leases and subnets, testing should
commence only after the server has been started up and is idle. No testing should be done
on a server in the process of starting up or initializing.
DHCP Configuration Settings:
The DHCP server has many configuration settings that can vary from
server to server. Some of these settings could impact performance when used, so DHCP
testing is often done in an ideal sense, with the server lightly configured and
optimized for startup and processing speed. However, such a server does not reflect actual
performance in practice. These settings include filtering, callouts or scripting
points, ping timeouts, and database persistence, along with numerous DHCP option settings,
policies, automated DNS updates, client classes, etc. When testing the DHCP server
it's a good idea to determine what, if any, performance impact these settings have
rather than merely seek out the best possible performance. One should expect that at
least a couple dozen client classes exist for testing, along with perhaps as many subnet
policies.
Measuring DHCP Server Startup Time:
The startup time is a measurement of the time it takes from the point
of launching the DHCP application to the point at which the server begins responding to
DHCP clients. Startup time should be measured in two different cases.
- Case #1: Take a DHCP server with multiple DHCP client generators pumping transactions
into it. Execute a graceful DHCP server shutdown. Start the DHCP client generators again and
then execute a DHCP server start. Measure the time it takes the server to go from a dead
stop to responding to DHCP client requests.
- Case #2: Run the DHCP server as before with DHCP client generators pumping transactions
into it. Pull the power plug or perform a non-graceful shutdown of the server process. Start
the DHCP client generators again and then execute a DHCP server start. Measure the time it
takes the server to go from a dead stop to responding to DHCP client requests.
Measuring startup time can differentiate servers that otherwise show
fast processing times by exposing how much these applications depend on caching
transactions or writing to a transaction log, and how quickly they can load their previous
operational configurations into memory.
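Either startup case boils down to timing the interval between launching the server and its first response to a client. The following is a minimal sketch of that timing loop; `start_server` and `server_responds` are hypothetical callables that the test harness would supply (for example, a command that launches the server process and a probe that sends a DISCOVER and waits briefly for an OFFER).

```python
import time
from typing import Callable, Optional


def measure_startup(start_server: Callable[[], None],
                    server_responds: Callable[[], bool],
                    timeout: float = 60.0,
                    poll_interval: float = 0.1,
                    clock: Callable[[], float] = time.monotonic,
                    sleep: Callable[[float], None] = time.sleep) -> Optional[float]:
    """Return seconds from launching the server to its first response.

    Returns None if the server has not responded within `timeout` seconds.
    `clock` and `sleep` are injectable so the loop can be tested without waiting.
    """
    began = clock()
    start_server()
    while clock() - began < timeout:
        if server_responds():
            return clock() - began
        sleep(poll_interval)
    return None
```

The measured value is only as precise as `poll_interval` plus however long each probe takes, so the interval should be small relative to the startup times being compared.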
Suggested Performance Numbers:
The start time should be measured in seconds and should in no case
exceed a minute. Ideally, a good start time would be 1-10 seconds. A fair start
time would be 10-20 seconds, and a poor start time would be 20-30 seconds. More than 30
seconds is unacceptable.
DHCP packet performance will be less than the theoretical maximum of 2,439 transactions per
second (which represents best-case persistent writes to a disk drive). Ideally, a great
realistic performance would be in the range of 800-1000 transactions per second (tps). A
good (acceptable) performance would be in the range of 600-800 tps. Fair performance would
be in the range of 400-600 tps, and below 400 tps would be unacceptable.
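These suggested ranges can be turned into a simple grading helper for test reports. The sketch below makes two assumptions that the text leaves open: a value that falls exactly on a shared boundary (e.g. 10 seconds, or 600 tps) is given the better rating, and any throughput of 800 tps or more is rated "great", including values above 1000 tps.

```python
def rate_startup(seconds: float) -> str:
    """Grade a measured startup time using the suggested ranges."""
    if seconds <= 10:
        return "good"
    if seconds <= 20:
        return "fair"
    if seconds <= 30:
        return "poor"
    return "unacceptable"


def rate_throughput(tps: float) -> str:
    """Grade measured transactions per second using the suggested ranges."""
    if tps >= 800:
        return "great"
    if tps >= 600:
        return "good"
    if tps >= 400:
        return "fair"
    return "unacceptable"
```

For example, a server that starts in 15 seconds and sustains 650 tps would be graded "fair" on startup and "good" on throughput.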