Axiros | Open Device & Service Management

NQI - One Of Many DOCSIS Health Indicators For Cable Modems In HFC Networks

#DOCSIS #PNM #ProactiveNetworkMaintenance #DeviceManagement #AI #Tech #DataScience

In their paper "Performance Monitoring Challenges in HFC Networks", Milan et al. argue that collecting and processing big data is integral to calculating the health of an operator's HFC network. Their KPIs include some interesting ones such as Flap List and SNR - let's take a closer look at them. Specifically, we will focus on a single cable modem (CM) and how to evaluate its health.

Assessing the health of a cable modem can be done in a multitude of ways. A naive way could be to fetch the downstream SNR of all channels on a given cable modem directly through SNMP, average across those channels and define a threshold under which the cable modem is deemed of marginal health or even bad health.
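The naive approach can be sketched in a few lines of Python. This is only an illustration: the SNMP polling itself is left out (the function simply takes the per-channel SNR readings as a list), and both threshold values are hypothetical placeholders, not values from any guideline.

```python
from statistics import mean

# Hypothetical thresholds in dB. These are placeholders for illustration,
# not values taken from CableLabs or any vendor documentation.
MARGINAL_BELOW_DB = 35.0
BAD_BELOW_DB = 30.0


def classify_downstream_snr(snr_per_channel_db):
    """Average the per-channel downstream SNR (dB) and map it to a label."""
    if not snr_per_channel_db:
        raise ValueError("need at least one channel reading")
    avg = mean(snr_per_channel_db)
    if avg < BAD_BELOW_DB:
        return avg, "bad"
    if avg < MARGINAL_BELOW_DB:
        return avg, "marginal"
    return avg, "good"


# Example: a modem with four downstream channels
avg_snr, health = classify_downstream_snr([36.5, 37.0, 33.0, 34.5])
print(f"average SNR {avg_snr:.2f} dB -> {health}")
```

A single average hides outliers: one very poor channel can disappear behind several good ones, which is one reason this approach is called naive.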

Another idea, in case pre-equalization is available, is to take the main tap ratio (MTR) of the pre-equalization coefficients and classify the health of cable modems using the MTR thresholds defined in the official CableLabs PNM guidelines.
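As a rough sketch of the MTR idea: MTR is commonly defined as the ratio, in dB, of the energy in the main tap to the combined energy of all other taps. The snippet below assumes the coefficients have already been decoded into complex numbers and that the position of the main tap is known; both function names are hypothetical, and the thresholds are deliberately left as parameters to be filled in from the guidelines rather than hard-coded here.

```python
import math


def main_tap_ratio_db(taps, main_index):
    """Return MTR in dB for a list of complex pre-equalization taps."""
    energies = [abs(t) ** 2 for t in taps]
    main_energy = energies[main_index]
    # Sum the remaining taps directly to avoid subtracting large numbers.
    other_energy = sum(e for i, e in enumerate(energies) if i != main_index)
    if main_energy == 0:
        raise ValueError("main tap has zero energy")
    if other_energy == 0:
        return math.inf  # all energy sits in the main tap
    return 10 * math.log10(main_energy / other_energy)


def classify_mtr(mtr_db, bad_below_db, marginal_below_db):
    """Map an MTR value to a label; thresholds must be supplied by the caller."""
    if mtr_db < bad_below_db:
        return "bad"
    if mtr_db < marginal_below_db:
        return "marginal"
    return "good"


# Example with made-up taps: a strong main tap and two small echoes
example_taps = [0.1 + 0j, 1 + 0j, 0.1j]
print(round(main_tap_ratio_db(example_taps, main_index=1), 2))
```

A higher MTR means the equalizer has to do less correction, so lower values point towards more upstream distortion.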

An interesting honorable mention is the Flap List. If one is lucky, the CMTS already does the heavy lifting and can return a readily available list of bad cable modems.

The indicator we will be taking a close look at today is NQI, as proposed by Tungsakul et al. The main point of their approach is to take a larger set of KPIs and build a weighted sum of them to create a single health indicator called NQI-9 (since they used 9 KPIs). For us, this paper was more of an inspiration - by adding more KPIs and adapting the weights, we could improve it considerably, making it much more predictive of actual issues. We call our approach NQI-X.

The metric itself is a percentage, which makes it easy to understand and suitable for aggregation. For example, aggregations per CMTS/Interface/Fiber Node are possible. Comparisons between different environments also become easy: an environment with an 80% NQI-X is much worse than one with 99% NQI-X. Another interesting use case: a sharp drop in the historical aggregated NQI-X of a CMTS, for example, could indicate a major sudden issue for a sizable chunk of the network.
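Watching for sharp drops in an aggregated NQI-X history can be sketched very simply. The function name and the 5-percentage-point default below are assumptions for illustration only; a production setup would likely use something more robust than a comparison of consecutive samples.

```python
def find_sharp_drops(series, min_drop_points=5.0):
    """Return (index, drop) pairs where the value fell by at least
    min_drop_points percentage points compared to the previous sample."""
    drops = []
    for i in range(1, len(series)):
        delta = series[i - 1] - series[i]
        if delta >= min_drop_points:
            drops.append((i, delta))
    return drops


# Example: aggregated NQI-X of one CMTS over four polling intervals
history = [99.0, 98.5, 91.0, 90.5]
print(find_sharp_drops(history))
```

In this made-up history, only the step from 98.5 to 91.0 is large enough to be flagged.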

Among the approaches discussed so far, NQI can be seen as an extension of the first idea. It does not use pre-equalization data at all (though that is an interesting idea...).

The KPIs proposed by Tungsakul et al. are:

  • Downstream SNR

  • Upstream SNR

  • T3 timeout

  • T4 timeout

  • Downstream CER

  • Upstream CER

  • Downstream Receive Power (Rx)

  • Upstream Transmit Power (Tx)

  • Upstream Receive Power (Rx)


Note that data from both the CMTS and the CM needs to be collected.

The paper gives a nice example of how the weighted sum works. Given a CM with all green (good) KPIs except for the Downstream CER, which is red (bad), the following NQI in percent is calculated:
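To make the mechanics of this scenario concrete, here is a minimal sketch under stated assumptions: all nine KPIs are weighted equally, a green KPI scores 1.0 and a red KPI scores 0.0. These are illustrative choices, not the weights used by Tungsakul et al., so the resulting number is not the paper's result. The KPI key names are hypothetical as well.

```python
# The nine KPIs listed above (key names are hypothetical).
KPIS = [
    "downstream_snr", "upstream_snr", "t3_timeout", "t4_timeout",
    "downstream_cer", "upstream_cer", "downstream_rx_power",
    "upstream_tx_power", "upstream_rx_power",
]

# Assumed scoring: green counts fully, red not at all.
STATUS_SCORE = {"green": 1.0, "red": 0.0}


def nqi_percent(statuses, weights=None):
    """Weighted sum of KPI scores, normalized to a percentage."""
    if weights is None:
        weights = {kpi: 1.0 for kpi in statuses}  # assumption: equal weights
    total_weight = sum(weights[kpi] for kpi in statuses)
    score = sum(weights[kpi] * STATUS_SCORE[statuses[kpi]] for kpi in statuses)
    return 100.0 * score / total_weight


# All KPIs green except Downstream CER, which is red
statuses = {kpi: "green" for kpi in KPIS}
statuses["downstream_cer"] = "red"
print(f"{nqi_percent(statuses):.1f}%")
```

With equal weights, one red KPI out of nine costs one ninth of the score. With adapted weights, a red KPI that is more predictive of real issues would pull the percentage down further.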

This calculation can now be scheduled every X minutes, where X depends entirely on the customer's needs and the capabilities of their CMTSes, enabling a historical overview of a single CM. For instance, 5-minute and 60-minute schedules are both valid scenarios. After aggregation, even sub-networks such as a whole fiber node or a whole CMTS can be compared and monitored. This can then be used to prioritize network maintenance.
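The aggregation step can be sketched as a plain group-and-average. The record layout (dictionaries with `fiber_node`, `cmts` and `nqi` keys) and the use of a simple mean are assumptions for illustration; other aggregates such as a median or a low percentile may suit some networks better.

```python
from collections import defaultdict
from statistics import mean


def aggregate_nqi(records, key):
    """Average the per-modem NQI for each value of `key` (e.g. 'fiber_node')."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key]].append(rec["nqi"])
    return {group: mean(values) for group, values in groups.items()}


def worst_first(aggregated):
    """Sort groups by ascending NQI so the worst ones come first."""
    return sorted(aggregated.items(), key=lambda item: item[1])


# Example with made-up modems
records = [
    {"cm": "cm-1", "cmts": "cmts-a", "fiber_node": "FN-1", "nqi": 100.0},
    {"cm": "cm-2", "cmts": "cmts-a", "fiber_node": "FN-1", "nqi": 90.0},
    {"cm": "cm-3", "cmts": "cmts-a", "fiber_node": "FN-2", "nqi": 80.0},
]
per_node = aggregate_nqi(records, "fiber_node")
print(worst_first(per_node))
```

Sorting worst-first gives a simple maintenance priority list: the fiber node at the top is the one to look at first.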

Having seen a few indicators, and especially the NQI indicator for the health of a single cable modem and the KPIs required to calculate it, it should be clear by now that Milan et al. had the right idea in 2017. Axiros can help here - with deep know-how at the junction between Telco domain knowledge (DOCSIS/HFC) and Computer Science including Big Data (refer to other posts on the Tech Blog), we have the means to supercharge your customer experience by using the data that is readily available (often in large quantities, such as 2M CMs) but needs a smart way of handling.


*Figure directly from Tungsakul et al.