Distributing Services
Brings a whole new meaning to “building out” data services
By: Bruce Bahlmann - Contributing Author
Created: April 30, 2001
Note: For help determining the extent to which
you need to distribute your services, or for
developing tools to help you determine if such measures are really
necessary, contact Birds-Eye.Net.
The Internet provides a wealth of lessons learned for the aspiring
broadband operator. Called the mother of all networks, the Internet has
relatively quickly become a mainstay for reliable communications delivery
for millions of users. Certainly, one can appreciate the mechanics (or art)
of moving all this data around on a daily basis, but what is truly remarkable
is the degree of reliability built into this enormous network of networks.
While the Internet was nurturing the tasks of routing and delivering
reliable data, applications that enabled various services offered over the
Internet have remained relatively the same and often merely rely on the ever
improving data handling of the Internet to provide increased reliability.
Although there have been significant technical advancements through the
years which have contributed to the overall reliability of the Internet,
uses of this medium for conducting business or even building applications
that exploit its power are still in their infancy. However, the idea of
building a single web site that millions of people can visit is all but a
memory. So what has happened to change people’s minds, convincing them that
one site can’t provide for all the hungry appetites out there? The answer is
simple: distributed services.
What Brick and Mortar Businesses Have Known All Along
Traditional brick and mortar businesses thrive while a majority
of the Internet businesses struggle to keep afloat. Why is this? It is
because the traditional businesses have learned how to expand and survive
in many different markets.
For example: If you have one storefront in one city, one way to
expand is to build a bigger building, expand the parking lot, etc. The
assumption here is if you build it, more people will come. However, by doing
this you make perhaps an even bigger assumption: that all roads into your
storefront can carry the added traffic, so traffic problems will not be an
issue. In reality, companies don’t expand this way. Instead, they spread
out, building
storefronts in many communities. Why? So they can be closer to even more
customers.
A simple analogy can provide a history of this problem and evolution of
thought regarding service delivery over the Internet. Let’s say you would
like to start a business selling self-installation kits over the Internet.
You’ve done your research and know that you could make a killing signing up
broadband operators. With your business plan intact, you begin signing up
operators and provide them with an easy-to-use central web site. As business
improves, you sock more money into your business by providing fail-over
services for your web site and possibly even a redundant path to the
Internet – all the while placing increasing dependency on the Internet to
secure your business and reliably complete customer business.
Just as you might sock more money into your distributors or invest in the
core assets of the company, you invest in your sole touch point with your
customers. Makes sense: you have created a successful business, you want it
to survive and expand, so you do what comes naturally and get bigger. Like a
free phone line, the Internet provides you with the opportunity to conduct
business without significant expense and without much effort, so it is easy
to forget about. Why invest in the Internet (i.e. distribute)? It works,
it’s reliable, so put your money where it counts, right? This reliance is
similar to that of businesses that recently depended solely on a certain
satellite or shipping company. When a particular satellite went out of orbit,
rendering a number of pagers inoperable, several companies were completely
unprepared to deal with the consequences. When UPS went on strike, many
businesses that depended solely on its service were left without any
established delivery options for their products. These businesses were
forced to make other arrangements to send packages at a much higher cost
than what UPS charged them. Business is great, so long as everything remains
the same.
Rather than building more web sites, you create the equivalent of a mega
storefront on the Internet. However, when you place all your eggs in one
basket, something has to give. The complexity of combining all your
different businesses into a single virtual storefront slows your development
and gives your more agile competitors room to grow up through the cracks in
your foundation. The Internet actually started out this way, with companies
building monster web sites. These became increasingly expensive to maintain,
so companies began to look for alternatives. Technologies like caching and
load balancing provide some relief to these large web sites but inevitably
encourage more of a hardware intensive solution.
Hardware intensive solutions (see Figure 1.0) are the result of limited
capabilities of software to provide expanded growth and scalability options.
Perhaps more importantly, hardware intensive solutions place you on a road
without a clear endpoint (fix) – constantly upgrading equipment, increasing
link speeds, etc. to keep up with demand. Once caught in this cyclone, it
becomes increasingly hard to capitalize on what you’ve already invested, as
it is cheaper to simply buy another upgrade than to fall back and punt.

Figure 1.0 Roadmap to a Hardware Intensive Solution
By the Numbers Example
Let’s take a look at one of the key services that enable
broadband operators to deliver the Internet today. Dynamic Host
Configuration Protocol (or DHCP) is one of the hottest distributed service
applications going. Why? Because if the scalability of this application is
not addressed, this service becomes a barrier to profitability. Distributing
DHCP is quite complex and controversial, and has been debated in the Internet
Engineering Task Force (IETF) for years. In the meantime, centralized DHCP
has been the standard for many large systems (hundreds of thousands of
subscribers).
But just how much work must a centralized DHCP server do? By the
numbers*:
- Each subscriber’s computer represents an average of ~16 transactions
a day. This number takes into account extraneous transactions from other
equipment that may or may not belong to each subscriber.
- Since each customer represents both a modem and a computer, this
number is at least doubled.
- Depending on the system, each customer may be allowed to have
multiple computers. In this case each subscriber (or head of household)
could represent more than one computer. The most widely accepted number
here has been ~1.2. This may change (for the worse) as more and more
Customer Premise Equipment (CPE) becomes DHCP enabled (e.g. home
appliances, Set Top Boxes, etc.).
- Depending on the server policy, the lease/renewal times may force
the number of transactions to be greater or slightly less. For the
benefit of simplicity, we will not take this into account.
*Numbers based on an actual DHCP and LDAP transaction study performed
by Bruce Bahlmann at MediaOne January 6, 1999
At a minimum, we know that each subscriber represents at least the
following:
  16 CPE transactions
+ 16 modem transactions
= 32 minimum transactions per customer per day
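The tally above, along with the effect of the ~1.2 computers-per-subscriber figure mentioned in the list, can be sketched in a few lines of Python. The function name and parameters are illustrative only and do not come from the study; as in the text, lease and renewal policy is ignored.

```python
def daily_transactions(per_device=16, computers=1, modems=1):
    """Estimate DHCP transactions per subscriber per day.

    per_device: average transactions per device per day (~16 in the text).
    computers:  computers per subscriber (1 for the minimum case).
    modems:     modems per subscriber (one each in the text).
    """
    return per_device * (computers + modems)


# Minimum case from the text: one computer plus one modem.
print(daily_transactions())

# Using the ~1.2 computers-per-subscriber figure instead of exactly one.
print(daily_transactions(computers=1.2))
```

The second call shows how quickly the per-subscriber count grows once households run more than one DHCP-enabled device.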
Based on this minimum number of transactions, each subscriber represents,
averaged over the day, around 0.037% load on a server capable of handling
one transaction per second (t/s): such a server can complete 86,400
transactions per day, and 32 / 86,400 = ~0.037%. If you increase the number
of subscribers by some factor (e.g. 4x) the load on the server increases by
that amount. For example:
Impact of 4 subscribers on a 1 t/s server:
4 subscribers x 0.037% load = ~0.148% load on a 1 t/s server (or
~4x increase in the load that one subscriber represents)
If you increase the transaction capacity of a server by some factor (e.g.
4x) you decrease the load that any one subscriber represents to that server
by that amount. For example:
Impact of 1 subscriber on a 4 t/s server:
0.037% load / 4 = ~0.009% load for a single subscriber (or ~4x
reduction in the load that one subscriber represents)
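As a rough check on the proportional reasoning above, the sketch below computes average server load in Python. It assumes transactions are spread evenly over a day of 86,400 seconds, which the text does not state, and it uses the 32 transactions per subscriber per day minimum. The function and variable names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_load_pct(subscribers, tx_per_subscriber_per_day=32, capacity_tps=1.0):
    """Average percent load on one server.

    Assumes transactions arrive evenly across the day (an assumption made
    for this sketch; real DHCP traffic is bursty).
    """
    daily_capacity = capacity_tps * SECONDS_PER_DAY
    return 100.0 * subscribers * tx_per_subscriber_per_day / daily_capacity


one = average_load_pct(1)                            # 1 subscriber, 1 t/s
four_subscribers = average_load_pct(4)               # 4 subscribers, 1 t/s
four_x_capacity = average_load_pct(1, capacity_tps=4.0)  # 1 subscriber, 4 t/s

print(f"1 subscriber on a 1 t/s server:  {one:.4f}%")
print(f"4 subscribers on a 1 t/s server: {four_subscribers:.4f}%")
print(f"1 subscriber on a 4 t/s server:  {four_x_capacity:.4f}%")
```

Under these assumptions the three cases differ by exactly a factor of four in each direction, which is the linear relationship the examples describe.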
Well, you might say that if you had a super fast server you should be OK
handling quite a number of customers. However, it is much easier to
increase the load (or number of subscribers) by some factor (e.g. 4x) than
it is to increase the capacity of the server by some factor (e.g. 4x). In a
future article, I will explore this area further. But for now suffice it
to say that high server transaction numbers don’t pave the way to peace of
mind when providing critical services such as DHCP. In fact, there are quite
a number of assumptions involved in actually being able to sustain central
DHCP server performance, many of which are beyond the server’s control.
Similar to the assumptions made when building a large storefront, a
centralized server also makes some pretty large assumptions. Network
bandwidth and capacity, router performance, and database performance are
just some of the factors that will limit any vendor’s stated DHCP
performance numbers.
The moral of this story is that the number of subscribers and their CPEs
will inevitably overrun the operational capability of any one server, as it
is always easier to double (or better) your subscriber count than to double
some application’s performance numbers. Maybe not initially, but seeking the
fastest DHCP server money can buy will not bring you peace of mind and can
likely place you on a collision course with a hardware intensive solution.
Next thing you know, you’ll be looking for more bandwidth and a bigger box,
at which point you’ll know you’re locked in.
In addition to this break point, the centralized model also brings with
it a huge risk factor. Like placing the needs of one’s business on a certain
shipping company, the risk is just as great in placing one’s business on a
single application (even if it does fail over), thus the need for a
distributed service.
When is it Time To Distribute?
About the time you’ve heard your last rah-rah speech about why you need
fail-over capability, you should take a hard look at distributed services.
Not all vendors currently provide distributed capability. In fact, since
the standards have not been readily adopted or adhered to, various
applications have taken slightly different approaches towards solving this
problem (all within the spirit of the standards but slightly different from
one another). As a result, to obtain this functionality you might be forced
to delve into more of a vendor specific (a kind word for proprietary)
solution. In other words, it is unlikely you can mix and match vendors to
provide distributed services.
Despite the “use with caution” labeling that distributed services are
tagged with, their use promotes a superior model for providing a critical
service. Unlike centralized solutions, where you are constantly battling the
need for speed with your growing subscriber count, building a distributed
solution permits a simple “add as needed” approach, which is the ultimate
in scalability and reliability. It also allows you to plan additional
purchases more accurately, as opposed to the firefighting that goes on with
use of a centralized solution.
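To illustrate how an “add as needed” approach lends itself to planning, here is a minimal Python sketch that estimates how many distributed servers a subscriber base would call for. Everything numeric in it is an assumption made for illustration: the 50 t/s per-server capacity, the 50% target utilization, and the even spread of 32 transactions per subscriber over the day. None of these figures come from a vendor or from the study cited above.

```python
import math

SECONDS_PER_DAY = 86_400


def servers_needed(subscribers, tx_per_subscriber_per_day=32,
                   capacity_tps=1.0, target_utilization=0.5):
    """Estimate the number of servers for a given subscriber count.

    capacity_tps and target_utilization are hypothetical planning inputs.
    Demand is an average rate; peak traffic would be higher.
    """
    demand_tps = subscribers * tx_per_subscriber_per_day / SECONDS_PER_DAY
    usable_tps = capacity_tps * target_utilization
    return max(1, math.ceil(demand_tps / usable_tps))


# Hypothetical growth path: subscriber count doubling three times.
for subs in (100_000, 200_000, 400_000, 800_000):
    print(subs, servers_needed(subs, capacity_tps=50.0))
```

Because the server count tracks the subscriber count step by step, each purchase can be scheduled against a forecast rather than triggered by an overloaded box.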
Recommended Model
One of the best ways to provide distributed DHCP services is
to look for one of the hybrids. Hybrids contain some centralized capability
alongside the benefits of the distributed capability. Three important
centralized features within some of these hybrids are:
- Registry – Permits a greatly simplified registration (provisioning)
process that is independent of subscriber location. While each
distributed server has its own registry that it uses in its stand-alone
mode (for only the subscribers it directly supports), existence of a
central registry permits each distributed server to learn about new or
discontinued subscribers based on the requests it receives from the
field.
- Configuration Management – Permits one to set up, deploy, and maintain
all servers from a central point. Centralizing this permits reduction of
multiple entries, standardized configurations (such as policies,
filters, performance tuning, etc.), and streamlined troubleshooting.
- IP Address Management – This could be a subset of configuration
management but has taken on a life of its own. It is extremely useful to
have this information centralized.
Without these centralized features, distributing services can become
quite complex to provision as well as manage. However, it should be
emphasized that the failure of any centralized component must not cause an
outage to any distributed component. The centralized component should only
provide a value add, like a convenience.
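The relationship between a central registry and a stand-alone distributed server can be sketched as follows. This is a simplified Python illustration, not any vendor’s design: the class names, the MAC-address keys, and the `available` flag are all hypothetical, and handling of discontinued subscribers is left out for brevity.

```python
class CentralRegistryDown(Exception):
    """Raised when the (hypothetical) central registry cannot be reached."""


class CentralRegistry:
    def __init__(self):
        self.subscribers = set()
        self.available = True  # toggled here to simulate an outage

    def is_registered(self, mac):
        if not self.available:
            raise CentralRegistryDown()
        return mac in self.subscribers


class DistributedServer:
    def __init__(self, central):
        self.central = central
        self.local_registry = set()  # used in stand-alone mode

    def handle_request(self, mac):
        """Return True if the requesting device should be served."""
        if mac in self.local_registry:
            return True  # served locally, no central dependency
        try:
            if self.central.is_registered(mac):
                self.local_registry.add(mac)  # learn the new subscriber
                return True
        except CentralRegistryDown:
            pass  # a central outage must not break local service
        return False


central = CentralRegistry()
central.subscribers.add("00:11:22:33:44:55")
server = DistributedServer(central)

first = server.handle_request("00:11:22:33:44:55")    # learned from central
central.available = False                             # simulate central failure
second = server.handle_request("00:11:22:33:44:55")   # still served locally
unknown = server.handle_request("66:77:88:99:aa:bb")  # cannot be learned now
print(first, second, unknown)
```

The point of the design is the `except` branch: when the central registry is down, known subscribers keep working and only new registrations wait.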
Fail-over: A means of providing a software workaround for
hardware failures. If the hardware platform that hosts an
application becomes sick, the software will sense this and start up an image
of itself running on different hardware.
Distributed: A means of providing a workaround for hardware/software
failures by using multiple systems placed in strategic locations throughout
a large network, each of which can run independently.
The case for distributing
Building a centralized service is a step toward a
hardware intensive solution. Based on the successes of the Internet and
brick-and-mortar businesses, distributed services provide a realistic option
for widespread service delivery. Distributed services provide the greatest
flexibility and scalability.