...

The "task_health_monitor.sh" script monitors RDK-B processes. It is located at "/fss/gw/usr/ccsp/tad/task_health_monitor.sh". Any RDK-B process can be monitored by adding a check for its PID to this script.

Self-heal stores Reset Count and Reboot Count

...

On Raspberry Pi, the self-heal functionality is provided by systemd.


Code Flow

[Diagram: CODE FLOW 1.drawio]

Resource Monitoring - resource_monitor.sh

[Diagram: RESOURCE MONITORING DIAG.drawio]

  • resource_monitor.sh monitors memory and CPU usage
  • The average memory and CPU thresholds are read from syscfg.db (defaults: avg_cpu_threshold = 100, avg_memory_threshold = 100)
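
The threshold lookup described above can be sketched as follows. This is a minimal illustration, not the actual resource_monitor.sh code: the helper names get_cfg and exceeds_threshold are hypothetical, and the sketch assumes the syscfg command-line tool is used to read syscfg.db and that the comparison is a simple greater-than check.

```shell
#!/bin/sh
# get_cfg KEY DEFAULT (hypothetical helper)
# Reads KEY through the syscfg tool when it is available and
# falls back to DEFAULT when the tool or the key is missing.
get_cfg() {
    key="$1"
    default="$2"
    value=""
    if command -v syscfg >/dev/null 2>&1; then
        value="$(syscfg get "$key" 2>/dev/null)"
    fi
    echo "${value:-$default}"
}

# exceeds_threshold USAGE THRESHOLD (hypothetical helper)
# Succeeds when the integer USAGE is greater than THRESHOLD.
exceeds_threshold() {
    [ "$1" -gt "$2" ]
}

# Defaults taken from the text above: both thresholds are 100.
cpu_threshold="$(get_cfg avg_cpu_threshold 100)"
mem_threshold="$(get_cfg avg_memory_threshold 100)"
```

A caller could then write `exceeds_threshold "$avg_cpu" "$cpu_threshold" && echo "CPU above threshold"`, where avg_cpu is a value the caller has measured.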

...

Process Monitoring - task_health_monitor.sh

[Diagram: process monitoring.drawio]

  • task_health_monitor.sh periodically checks the status of various tasks and takes corrective action
  • The default monitoring interval is 15 minutes and can be changed with resource_monitor_interval in syscfg.db
  • It monitors
    • The health of the peer processor, in the case of dual-core processors
    • Other tasks added to the script
  • New tasks can be added by editing the script
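
A new task check might look roughly like the sketch below. The function name check_task, the log message, and the example process name are hypothetical and are not taken from task_health_monitor.sh; the sketch only assumes that a task is considered down when pidof returns no PID for it.

```shell
#!/bin/sh
# check_task NAME CMD [ARGS...] (hypothetical helper)
# Looks up the PID of NAME. If no PID is found, prints a message and
# runs CMD with ARGS as the corrective action.
check_task() {
    name="$1"
    shift
    pid="$(pidof "$name" 2>/dev/null)"
    if [ -n "$pid" ]; then
        # The task is running, so nothing needs to be done.
        return 0
    fi
    echo "$name is not running, taking corrective action"
    "$@"
}
```

For example, `check_task my_component systemctl restart my_component.service` would restart a placeholder service named my_component when its process is missing.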

...

Connectivity Test - self_heal_connectivity_test.sh

[Diagram: connectivity test.drawio]

  • self_heal_connectivity_test.sh runs ping and DNS tests.
  • ConnTest_PingInterval in syscfg.db specifies how often the connectivity test runs.
  • If it is not set, the interval defaults to 60 seconds.

...

  • runPingTest
    • Gets the IP (default_router IP) from syscfg.db
    • If no IP is specified, it pings the default gateway
    • If the ping fails, it takes the corrective action, which is none by default
  • runDNSPingTest
    • Disabled by default; can be enabled with selfheal_dns_pingtest_enable in syscfg.db
    • Gets the urlToVerify from syscfg.db
    • If nslookup fails, it takes the corrective action, which is none by default
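
The two tests above can be sketched as follows. This is an illustration under stated assumptions, not the code of self_heal_connectivity_test.sh: the function names, the PING_CMD and DNS_CMD override variables, and the use of `ip route` to find the default gateway are all hypothetical choices made for this example.

```shell
#!/bin/sh
# PING_CMD and DNS_CMD are hypothetical override hooks, so the logic
# can be exercised without network access.
PING_CMD="${PING_CMD:-ping}"
DNS_CMD="${DNS_CMD:-nslookup}"

# The corrective action is "none" by default, so a failure is only reported.
corrective_action() {
    echo "connectivity check failed: $1"
}

# run_ping_test [IP]
# Pings the given IP, or the default gateway when no IP is given.
run_ping_test() {
    target="$1"
    if [ -z "$target" ]; then
        # Assumption: the default gateway is read from the routing table.
        target="$(ip route show default 2>/dev/null | awk '/^default/ {print $3; exit}')"
    fi
    if [ -z "$target" ]; then
        corrective_action "ping (no target)"
        return 1
    fi
    # Three echo requests, two-second reply timeout.
    if ! $PING_CMD -c 3 -W 2 "$target" >/dev/null 2>&1; then
        corrective_action "ping $target"
        return 1
    fi
    return 0
}

# run_dns_test NAME
# Resolves NAME with nslookup and reports a failure if it cannot.
run_dns_test() {
    if ! $DNS_CMD "$1" >/dev/null 2>&1; then
        corrective_action "nslookup $1"
        return 1
    fi
    return 0
}
```

A periodic caller would invoke run_ping_test with the configured IP (or no argument), and invoke run_dns_test with the configured urlToVerify only when the DNS test is enabled.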

Objects

Self-heal objects in the DML layer:

Device.SelfHeal.X_RDKCENTRAL-COM

Self-heal can be enabled or disabled through the data model parameter below. It is enabled by default.

Code Block
$ dmcli eRT getv Device.SelfHeal.X_RDKCENTRAL-COM_Enable
CR component name is: eRT.com.cisco.spvtg.ccsp.CR
subsystem_prefix eRT.
getv from/to component(eRT.com.cisco.spvtg.ccsp.tdm): Device.SelfHeal.X_RDKCENTRAL-COM_Enable
Execution succeed.
Parameter    1 name: Device.SelfHeal.X_RDKCENTRAL-COM_Enable
               type:       bool,    value: true


Verify that the self-heal feature is running

Code Block
$ ps -Af | grep -i self
 4449 root       0:00 {self_heal_conne} /bin/sh /usr/ccsp/tad/self_heal_connectivity_test.sh
18921 root       0:00 grep -i self

Resource monitoring

The data model parameter below reports the average CPU threshold. The default value is 100.

Code Block
$ dmcli eRT getv Device.SelfHeal.ResourceMonitor.X_RDKCENTRAL-COM_AvgCPUThreshold
CR component name is: eRT.com.cisco.spvtg.ccsp.CR
subsystem_prefix eRT.
getv from/to component(eRT.com.cisco.spvtg.ccsp.tdm): Device.SelfHeal.ResourceMonitor.X_RDKCENTRAL-COM_AvgCPUThreshold
Execution succeed.
Parameter    1 name: Device.SelfHeal.ResourceMonitor.X_RDKCENTRAL-COM_AvgCPUThreshold
               type:       uint,    value: 100


The data model parameter below reports the average memory threshold. The default value is 100.

Code Block
$ dmcli eRT getv Device.SelfHeal.ResourceMonitor.X_RDKCENTRAL-COM_AvgMemoryThreshold
CR component name is: eRT.com.cisco.spvtg.ccsp.CR
subsystem_prefix eRT.
getv from/to component(eRT.com.cisco.spvtg.ccsp.tdm): Device.SelfHeal.ResourceMonitor.X_RDKCENTRAL-COM_AvgMemoryThreshold
Execution succeed.
Parameter    1 name: Device.SelfHeal.ResourceMonitor.X_RDKCENTRAL-COM_AvgMemoryThreshold
               type:       uint,    value: 100
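
When these checks are scripted, the value can be pulled out of the dmcli output shown above. The helper below is a hypothetical sketch that assumes the output keeps the "type: ..., value: ..." layout seen in the examples in this section.

```shell
#!/bin/sh
# dmcli_value (hypothetical helper)
# Reads dmcli getv output on stdin and prints the text that follows
# "value:" on the first matching line, with trailing whitespace removed.
dmcli_value() {
    sed -n 's/^.*value:[[:space:]]*\(.*\)$/\1/p' | sed 's/[[:space:]]*$//' | head -n 1
}
```

For example, `dmcli eRT getv Device.SelfHeal.ResourceMonitor.X_RDKCENTRAL-COM_AvgCPUThreshold | dmcli_value` would print only the threshold value.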