Customer Premise vs End-of-Line Monitoring
Monitoring last mile transmission lines in lieu of end user
devices offers similar benefits with fewer devices
By: Bruce Bahlmann - Contributing Author
Created: August 15, 2000
Consider a situation where you’re placed in charge of a
network consisting of a vast collection of hardware that is all connected,
but by a variety of different means. Additionally, consider that this
hardware is owned and maintained by several different organizations in your
company, each of which independently monitors its portion of the equipment. Finally,
consider that none of your organizations’ monitoring systems can communicate
with one another and that there is nothing on the market that can
consolidate all this information. Well, this is broadband and, unfortunately,
this is the way it is.
The vast collection of hardware described above
includes cable modems, residential service units (RSUs), Intersects, lasers,
CMTS, switches, routers, and HFC actives (amplifiers, power supplies, etc.).
These devices are connected via fiber, coax, and Category 5 cabling and are
monitored via a diverse group of both standards-based and proprietary
network management systems (NMS). A little-known fact is that collectively
these monitoring systems actually yield far less than comprehensive network
coverage. As a result, broadband operators must remain dependent on customer
calls to detect a large number of outages. Overcoming this dependency
involves solving the pesky problem of monitoring for partial node outages.
HFC’s Weakness
The single most challenging task for broadband operators
today is monitoring for partial node outages (see the previous month’s article
on where your services fail). Partial node outages occur when a portion
of a node becomes damaged (e.g. animals chew through plant – this really
happens!) or experiences an equipment failure (e.g. amplifier fails). For
customers, the results of a partial node outage can range from noisy
reception and intermittent HSD services to a loss of all services. The
number of customers affected by a partial node outage depends greatly on the
size of the node (how many customers are served by the node) and where the
break occurs (near the top of the node or further down the line).
Unfortunately, HFC is not designed with sufficient
detail to permit partial node outages to be detected and located.
Traditionally, HFC is designed around ‘actives’ and everything is related
back to these actives. An active is generally a piece of hardware that
requires some type of power and ‘may’ be visible remotely (e.g. amplifiers,
lasers, etc.). Figure 1.0 shows a very simplified node with its actives
highlighted in red. This node splits into
three lengths (runs) of coax that terminate at various distances. Along each
run lies a variety of taps that permit connections and amplifiers to boost
the signal. In a more typical node there are multiple runs and sub runs. A
sub run is generally a small length of coax that branches off a main run.
The key thing about sub runs is that they typically do not connect to the
main run via an active.

Figure 1.0 Extremely Simplified Node
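As a rough illustration of why the location of a break matters, the node described above can be modeled as a tree of coax segments. The sketch below is a minimal Python model, not an operator tool. The segment names and customer counts are hypothetical, and it assumes that a break cuts off every customer on the broken segment and on everything downstream of it.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Segment:
    """One stretch of coax. Names and customer counts are hypothetical."""
    name: str
    customers: int = 0  # customers tapped directly on this segment
    children: List["Segment"] = field(default_factory=list)


def customers_affected(seg: Segment) -> int:
    """Customers lost if this segment breaks: its own plus all downstream."""
    return seg.customers + sum(customers_affected(c) for c in seg.children)


def find(seg: Segment, name: str) -> Optional[Segment]:
    """Depth-first search for a segment by name."""
    if seg.name == name:
        return seg
    for child in seg.children:
        hit = find(child, name)
        if hit is not None:
            return hit
    return None


# A three-run node loosely following the simplified layout in Figure 1.0.
node = Segment("node", children=[
    Segment("run1-top", 40, [
        Segment("run1-sub1", 15),   # sub run branching off run 1
        Segment("run1-end", 25),    # stretch beyond the last amplifier
    ]),
    Segment("run2-top", 35, [Segment("run2-end", 20)]),
    Segment("run3-top", 30, [Segment("run3-end", 15)]),
])

for name in ("run1-top", "run1-sub1", "run3-end"):
    print(name, customers_affected(find(node, name)))
```

With these invented counts, a break near the top of run 1 takes out far more customers than a break on its sub run, which is the dependence on break location described above.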
Because sub runs do not interface directly with an
active, they are generally invisible to broadband operators (unless they
contain a remotely monitored active). ‘Invisible’ in this case means that
sub runs are not monitored by an NMS and are not labeled on a node map.
Therefore most outages that happen on a sub run are undetectable by
broadband operators. The only way one could ‘currently’ detect such a
partial node outage would be through some ‘slight’ rise in return noise, but since
sub runs are not labeled, broadband operators would be hard pressed to
localize the source of the noise (i.e. it is often dismissed). In addition,
any portions of the node that are not directly between two actives are also
invisible to broadband operators. This means that the portion of each run
beyond the last active is also invisible to broadband operators. Figure 2
shows the portion of the node (shown in Figure 1.0) that is invisible to
broadband operators. This portion of the node is shown in red
and can represent as much as 40% of the plant on a working node. In terms of
the number of customers this represents, it is projected that somewhere
between 30% and 50% of all customers on a node fall into the area of the node that is
invisible to broadband operators.

Figure 2 Portion of Node Invisible to Broadband Operators
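The visibility rule described above (a stretch of plant is visible only if it lies between two actives) can be expressed as a short calculation. The following Python sketch is an assumption-laden illustration: the `seg` helper, the segment names, and all customer counts are invented, and the fiber node itself is treated as the active at the top of every run.

```python
def seg(name, customers, ends_at_active=False, children=()):
    """Build one coax segment (all names and counts are hypothetical)."""
    return {"name": name, "customers": customers,
            "ends_at_active": ends_at_active, "children": list(children)}


def feeds_active(s):
    """True if the segment ends at an active or leads to one downstream."""
    return s["ends_at_active"] or any(feeds_active(c) for c in s["children"])


def total_customers(s):
    """All customers on this segment and everything downstream of it."""
    return s["customers"] + sum(total_customers(c) for c in s["children"])


def invisible_customers(s):
    """Customers on segments that do not lie between two actives."""
    own = 0 if feeds_active(s) else s["customers"]
    return own + sum(invisible_customers(c) for c in s["children"])


node = seg("trunk", 0, children=[
    seg("run1-a", 20, children=[
        seg("run1-sub1", 15),                            # sub run, no active
        seg("run1-b", 20, True, [seg("run1-eol", 25)]),  # amp, then end of line
    ]),
    seg("run2-a", 35, True, [seg("run2-eol", 20)]),
    seg("run3-a", 30, True, [seg("run3-eol", 15)]),
])

hidden = invisible_customers(node)
total = total_customers(node)
print(f"{hidden} of {total} customers ({hidden / total:.0%}) are invisible")
```

The percentage this prints is purely a consequence of the invented counts. The point of the sketch is that sub runs and the stretch beyond the last amplifier on each run are the segments that drop out of view.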
Monitoring Options for Detecting Partial Node Outages
With this large portion of the node being invisible,
broadband operators are seeking ways to monitor this portion of nodes in
whole or in part. One attractive option is to monitor customer premise
equipment (CPE). The information gained from sampling CPEs (e.g. cable
modems, telephony RSUs, etc.) in a specific area can provide information
regarding the health of the plant in that area. In theory, these devices
would be used to complement the coverage of monitoring actives.
Unfortunately, unless this is done right it can also lead to inconclusive
information. Attempts to use CPEs as a means of detecting partial node
outages depend greatly on a high level of penetration of these services on
fiber nodes. Thus, if you’re planning on monitoring CPEs you should be aware
of the following tradeoffs:
- CPE monitoring can use customer-purchased devices (no additional
hardware required, which sounds great, but what is the catch?) as well as the same
network management station already used to poll other HSD resources, and in
exchange it provides basic node outage detection.
- Coverage for all nodes may be years away. Take a city of 30,000 with a
penetration of 5% for HSD and 30% on telephony in an area where the maximum
customers per node are 250. A total of 10,500 customers would have to be
polled to determine partial node service interruptions on the city’s 120
nodes. This would mean that approximately 88 subscribers per node would
provide sufficient information to enable us to detect partial node outages.
While it is reasonable to expect good coverage across the whole node at that level,
most broadband operators are nowhere near these numbers for both services
(HSD and telephony). Related to this is the fact that one cannot control who
takes the service. Thus there will be nodes with more than sufficient
coverage and some with little or none – and it could remain this way
forever!
- Correlating polling information with physical location so that one can draw
conclusions from devices on similar stretches of plant is also a problem.
Polling customer modems in an area is further complicated in the DOCSIS
model due to the use of combining. This is because customers with the same
network address or CMTS interface may all be on separate fiber nodes. Thus
any polling that is done among them “may” not be usable. For example, two
down subscriber devices could mean that two node segments are down or only
one is down. This depends on how the plant runs through the community as two
people who live on the same street could reside on different nodes.
- There is also an issue regarding the interval of polling. For example, when
multiple modems are polled, the time that elapses between polls of modems geographically
close to one another may vary. If this time varies too much, the samples
taken from the modems would have no relationship to one another.
- There is also the subject of scalability in terms of polling an increasing
number of RSUs and cable modems to determine partial node outages. Since
this number is directly proportional to the number of customers it will
require an increasing amount of resources (hardware, software, and perhaps
most importantly network bandwidth) to manage this system.
- The subject of ownership can negatively impact reliability. Since the
broadband operator does not maintain the CPE, it cannot ensure it is always
on. This means the customer may decide to rewire their home, shut off the
CPE when they go on vacation or when they’re done using it, etc., which can
lead to false alarms. Thus any information gathered cannot be taken as fact
but instead must be averaged to get a sense of the general health of each
node. Determining this general health requires intelligent handling of all
devices that can be always on or mostly on (unfortunately, none of these
devices are under the broadband operator’s control).
- Related to ownership is also the fact that it takes longer to poll a CPE
that is down than one that is working (as there are often timeouts
involved). Because CPEs will go up and down with regularity, this will impact
the speed and efficiency with which all the CPEs can be polled as well as
contribute to a theoretical limit on how many CPEs can be polled by any
one NMS. However, since the NMS has no control over maintaining these CPEs,
it will have to attempt to reach out and touch every CPE regardless of its
previous state, as it needs any and all data it can get, even if the CPE in
question has had a long history of being down.
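The coverage arithmetic in the second tradeoff above can be reproduced with a few lines of Python. This is only a sketch of that one example: the function name `cpe_coverage` is made up here, and it assumes, as the example does, that the two penetration rates simply add (no household takes both services) and that every node is filled to its maximum size.

```python
import math


def cpe_coverage(population, hsd_pct, telephony_pct, max_per_node):
    """Return (nodes, pollable_devices, devices_per_node) for a service area.

    Assumes the penetration rates add (no household takes both services)
    and that nodes are filled to their maximum size.
    """
    nodes = math.ceil(population / max_per_node)
    pollable = population * (hsd_pct + telephony_pct) // 100
    return nodes, pollable, pollable / nodes


# Figures from the example: 30,000 customers, 5% HSD, 30% telephony,
# at most 250 customers per node.
nodes, pollable, per_node = cpe_coverage(30_000, 5, 30, 250)
print(nodes, pollable, per_node)
```

The result is an average only. Because operators cannot control who subscribes, the real per-node counts will scatter around it, with some nodes well covered and others barely covered at all.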
The reality of monitoring CPEs is that it cannot
possibly overcome all these technical challenges. This leaves only one other
option for broadband operators interested in detecting partial node outages
– end of line monitoring.
Because customer devices will never represent 100% of
the end of line for each and every fiber node there is a need to seek a more
reliable measure of this important operational aspect. Knowing the
operational status of ALL the end of lines replaces other incomplete
monitoring systems. End of line monitoring also represents a fixed number of
devices that can be modeled by location. The following challenges exist
before end of line monitoring can be realized:
- A rework of the way broadband operators label nodes is needed. This new
naming convention must allow broadband operators to reference specific sub
runs of a node, unlike the existing naming convention, which keys on
various actives along the node. The naming convention must also help relate
sub runs to their respective run and amplifiers so as to leverage as many
existing naming conventions as possible.
- Although end of line monitoring represents a fixed number of devices the
number of end of line devices required for each node is staggering (anywhere
from 3-20+ per node). Thus each of these devices must be very inexpensive
(perhaps in the sub $20 range). This cost could be ‘partially’ offset by
broadband operators NOT placing end of line devices on portions of the plant
where no customers reside.
- End of line devices must be plant-powered. Not all end of lines have
external power available, and bringing power to every single end of
line in a broadband distribution system would be cost prohibitive. The
power already available on the broadband plant favors
plant-powered devices over those requiring an external power source.
- End of line devices must also be compact. Ideally, end of line monitoring
agents should be no larger than a small filter and be able to screw into the
last tap on each run or sub run. The recent development of a cable modem on
a chip is a step towards one such end of line monitoring device.
- Since the number of end of line devices required to monitor a large system
could easily exceed the capacity of a carrier class NMS, great care must be
taken to select the appropriate polling frequency. The information gathered
during these polls (if used correctly) can help predict outage conditions.
- End of line devices should be mapped by GPS for their exact location so as
to direct service personnel to problem areas with the highest level of
accuracy. Current detection only tells plant maintenance personnel which
node – which means they’d need to traverse long stretches of plant and test
several points to find the problem. It would also greatly help service
personnel locate defective end of line devices in times of limited
visibility (e.g. snow storm, heavy rain, or fog).
- End of line devices must be spectrum focused rather than service focused.
Effective end of line monitoring cannot be achieved by service focused
devices (such as a telephony RSU or cable modem). This is because they only
monitor the health of their specific portion of the overall spectrum. The
best kind of end of line device would report the health of the entire
spectrum or perhaps just the portion that is questionable or failing. In
this way, the monitoring would be service independent and benefit all
broadband services.
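To show how a sub-run-aware naming convention could help localize a fault, the sketch below assumes a hypothetical path-style label for each end of line device, such as `N042/R1/A1/A2/EOL` (node, run, then each amplifier or sub run passed on the way down). This labeling scheme is invented for illustration and is not an existing industry convention. Given which devices answered the last poll, the function returns the shortest label prefix under which every device is silent, which is where a technician would start looking.

```python
def suspect_segments(status):
    """Return the shortest label prefixes under which every device is silent.

    `status` maps an end-of-line label to True (reporting) or False (silent).
    Labels are hypothetical "/"-separated paths from the node downward.
    """
    reports = {}  # prefix -> up/down flags of all devices beneath it
    for label, up in status.items():
        parts = label.split("/")
        for i in range(1, len(parts) + 1):
            reports.setdefault("/".join(parts[:i]), []).append(up)

    silent = {p for p, flags in reports.items() if not any(flags)}

    def parent(prefix):
        return prefix.rsplit("/", 1)[0] if "/" in prefix else None

    # Keep only the topmost silent prefix on each branch.
    return sorted(p for p in silent if parent(p) not in silent)


status = {
    "N042/R1/A1/S1/EOL": True,      # sub run off run 1, after amp 1
    "N042/R1/A1/A2/EOL": False,     # end of run 1, after amp 2
    "N042/R1/A1/A2/S2/EOL": False,  # sub run after amp 2
    "N042/R2/A1/EOL": True,
    "N042/R3/EOL": True,
}
print(suspect_segments(status))
```

In this made-up poll, everything beyond amplifier A2 on run 1 is silent while the sub run ahead of it still reports, so the search narrows to the stretch at or just upstream of A2 rather than the whole node. Each label could also be keyed to the device's GPS coordinates, as suggested above.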
The subject of total node visibility (TNV) is an area
no broadband operator is prepared to discuss; yet most claim multiple nines
(e.g. 99.9%) in overall network reliability. Perhaps this reliability figure
only applies to the portion of the plant to which broadband operators have
visibility? Regardless, a significant portion of the plant that services
customers remains invisible. This is why broadband operators ‘must’ rely on
customers to detect plant problems and why to this day no vendor has stepped
forward with a cost-effective solution to this problem.
Summary
The only way that broadband operators can achieve their
stated reliability numbers is to seek total node visibility (TNV). TNV could
be achieved by monitoring CPEs or by installing end of line monitoring.
Although broadband operators regularly monitor CPEs, this method of TNV has
a limited shelf life and is riddled with problems. End of line monitoring
offers the only long-term solution for TNV.
(C) Copyright Birds-Eye.Net, All rights reserved.
It is against the law to reproduce this content or any portion of it in any form without the explicit written permission of Birds-Eye Network Services, LLC. Federal copyright law (17 USC 504) makes it illegal, punishable with fines up to $100,000 per violation plus attorney's fees.