This Page is under Development

1. Introduction

SelfHeal is another feature implemented in the Test and Diagnostic component. It is a monitoring and recovery module.

SelfHeal periodically monitors the following:

  • CPU usage
  • Memory usage
  • Critical RDK-B processes

SelfHeal stores the reset count and the reboot count.
SelfHeal takes the required action, such as rebooting the device or restarting a process, based on predefined conditions.
SelfHeal also runs a connectivity test.

2. Design Considerations

SelfHeal functionality is handled by a set of scripts. These scripts are included in the RDK-B RPI build by default and are customized to the RPI system specification, using the actual devices as a reference.

Ensure that the following SelfHeal scripts are present on the device under "/usr/ccsp/tad":

  • resource_monitor.sh

  • task_health_monitor.sh

  • corrective_action.sh

  • self_heal_connectivity_test.sh
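The presence check above can be scripted. The following is a minimal sketch, assuming a POSIX shell on the device; the function name missing_selfheal_scripts and the TAD_DIR variable are hypothetical and are not part of the SelfHeal scripts.

```shell
#!/bin/sh
# Sketch: report which SelfHeal scripts are missing from a directory.
# TAD_DIR defaults to the path named in the text above (overridable for testing).
TAD_DIR="${TAD_DIR:-/usr/ccsp/tad}"

missing_selfheal_scripts() {
    dir="$1"
    for script in resource_monitor.sh task_health_monitor.sh \
                  corrective_action.sh self_heal_connectivity_test.sh; do
        # Print the name of every expected script that is not a regular file.
        [ -f "$dir/$script" ] || echo "$script"
    done
}

# Usage on the device:
#   missing_selfheal_scripts "$TAD_DIR"
```

An empty output means all four scripts are in place.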

Resource Monitoring

The "resource_monitor.sh" script monitors memory and CPU usage periodically (for example, every 60 seconds). If "Average Memory Used" reaches the threshold value, the reboot action is executed.

Resource monitor sequence:

  i) From the first cycle onwards, the sleep time is calculated as Device.SelfHeal.ResourceMonitor.X_RDKCENTRAL-COM_UsageComputeWindow * 60. For example, the default value (RMInterval) is 1, so the sleep time is 60 seconds.
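The timing and the threshold check can be sketched as below. This is an illustration only: the function names are hypothetical and are not taken from resource_monitor.sh, and the reboot limit reflects the MaxRebootCount parameter described in the Data Model section.

```shell
#!/bin/sh
# Sketch of the resource monitor timing and threshold check.
# All function names here are hypothetical.

# Sleep time in seconds = UsageComputeWindow * 60.
rm_sleep_seconds() {
    echo $(( $1 * 60 ))
}

# Succeeds (exit status 0) when usage has reached the threshold.
threshold_reached() {
    usage="$1"; threshold="$2"
    [ "$usage" -ge "$threshold" ]
}

# Decide what to do for one cycle: prints "reboot" or "none".
# A reboot is suppressed once reboot_count has reached max_reboot_count.
resource_action() {
    avg_mem="$1"; mem_threshold="$2"; reboot_count="$3"; max_reboot_count="$4"
    if threshold_reached "$avg_mem" "$mem_threshold" &&
       [ "$reboot_count" -lt "$max_reboot_count" ]; then
        echo "reboot"
    else
        echo "none"
    fi
}

# Usage:
#   sleep "$(rm_sleep_seconds 1)"     # default window of 1
#   resource_action 100 100 0 3       # memory at threshold, no reboots yet
```

With the default values (threshold 100, maximum reboot count 3), a reboot is chosen only while fewer than three reboots have been recorded.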

Process Monitoring

  1. The "task_health_monitor.sh" script monitors all RDK-B processes periodically (for example, every 60 seconds) based on their process IDs (PIDs).
  2. Depending on whether a process ID is found, the required action is taken, such as restarting the process or rebooting the device.
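One common way to test whether a PID still belongs to a live process is signal 0. The sketch below assumes this technique for illustration; pid_alive is a hypothetical helper, and the actual script may obtain and check PIDs differently.

```shell
#!/bin/sh
# Sketch: test whether a process ID still refers to a live process.
# pid_alive is a hypothetical helper, not a function from task_health_monitor.sh.
pid_alive() {
    pid="$1"
    # An empty PID means the process was not found at all.
    [ -n "$pid" ] || return 1
    # Signal 0 delivers nothing; kill only reports whether the PID can be signalled.
    # This needs permission to signal the process (for example, running as root).
    kill -0 "$pid" 2>/dev/null
}

# Usage:
#   if pid_alive "$some_pid"; then echo "running"; else echo "not running"; fi
```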

Important points to remember:

  • CCSP processes: if any of these processes crashes, SelfHeal restarts it.
  • "CcspCrSsp": if this process crashes, the device is rebooted.
  • "syseventd": if syseventd crashes, the device is rebooted.
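The rules above amount to a lookup from process name to recovery action. A minimal sketch follows, assuming that CCSP process names start with "Ccsp"; process_action is a hypothetical name and the real script may structure this differently.

```shell
#!/bin/sh
# Sketch: map a crashed process name to the recovery action described above.
# process_action is a hypothetical name; the "Ccsp*" pattern is an assumption.
process_action() {
    case "$1" in
        CcspCrSsp|syseventd) echo "reboot" ;;   # device is rebooted
        Ccsp*)               echo "restart" ;;  # other CCSP processes are restarted
        *)                   echo "none" ;;     # not covered by the rules above
    esac
}

# Usage:
#   process_action CcspCrSsp
#   process_action syseventd
```

The order of the patterns matters: "CcspCrSsp" must be matched before the general "Ccsp*" pattern, otherwise it would be restarted instead of triggering a reboot.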


Task monitor sequence:

  i) From the first cycle onwards, the sleep time is calculated as Device.SelfHeal.ResourceMonitor.X_RDKCENTRAL-COM_UsageComputeWindow * 60. For example, the default value (RMInterval) is 1, so the sleep time is 60 seconds.


Connectivity Test

The "self_heal_connectivity_test.sh" script runs a ping test against a server IP/URI, which must be configured. If no server IP/URI is configured, the ping test is not executed and no action is taken. If a server is configured and the ping test fails, the device stops the LAN functionality.

Connectivity test sequence:

  i) After boot-up, in the very first cycle, a random sleep function is called.

  ii) From the second cycle onwards, the sleep time is calculated as Device.SelfHeal.ConnectivityTest.X_RDKCENTRAL-COM_PingInterval * 60. For example, the default value of PingInterval is 60, so the sleep time is 3600 seconds.
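The interval calculation and the outcome of one test cycle can be sketched as follows. The function names are hypothetical and are not taken from self_heal_connectivity_test.sh; the 15 to 1440 range is the one listed for PingInterval in the Data Model section, and the ping result is passed in as an exit status so the decision logic can be read on its own.

```shell
#!/bin/sh
# Sketch of the connectivity test timing and decision.
# All function names here are hypothetical.

# Sleep time in seconds = PingInterval * 60, rejecting out-of-range values.
ping_sleep_seconds() {
    interval="$1"
    if [ "$interval" -lt 15 ] || [ "$interval" -gt 1440 ]; then
        echo "PingInterval out of range: $interval" >&2
        return 1
    fi
    echo $(( interval * 60 ))
}

# Prints the outcome of one cycle:
#   skip      no server IP/URI configured, so no ping and no action
#   ok        ping succeeded
#   stop_lan  ping failed, LAN functionality is stopped
# $1 = configured server (may be empty), $2 = exit status of the ping command
connectivity_action() {
    server="$1"; ping_status="$2"
    if [ -z "$server" ]; then
        echo "skip"
    elif [ "$ping_status" -eq 0 ]; then
        echo "ok"
    else
        echo "stop_lan"
    fi
}

# Usage:
#   ping -c 3 "$server" >/dev/null 2>&1
#   connectivity_action "$server" $?
```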

SelfHeal Logs

SelfHeal logs are written to the following file:

          /rdklogs/logs/SelfHeal.txt.0
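To inspect recent SelfHeal activity, the last lines of this file can be printed. A minimal sketch, assuming the standard tail utility is available on the device; recent_selfheal_logs and SELFHEAL_LOG are hypothetical names.

```shell
#!/bin/sh
# Sketch: show the most recent SelfHeal log lines.
# The default path is the one given above; it can be overridden for testing.
SELFHEAL_LOG="${SELFHEAL_LOG:-/rdklogs/logs/SelfHeal.txt.0}"

recent_selfheal_logs() {
    count="${1:-20}"   # number of lines to show, 20 if not given
    if [ ! -r "$SELFHEAL_LOG" ]; then
        echo "log file not readable: $SELFHEAL_LOG" >&2
        return 1
    fi
    tail -n "$count" "$SELFHEAL_LOG"
}

# Usage:
#   recent_selfheal_logs 50
```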

Architecture

[Figure: Self Heal DataModel Flow]

[Figure: Process Monitor Flow]

Data Model

List of data model parameters supported by SelfHeal

| S.No | Module | DMCLI Command | Description |
|------|--------|---------------|-------------|
| 1 | TDM (TestAndDiagnostic); XML Mapper: TestAndDiagnostic.XML | Device.SelfHeal.X_RDKCENTRAL-COM_Enable | Enables or disables the SelfHeal functionality. |
| 2 | TDM (TestAndDiagnostic); XML Mapper: TestAndDiagnostic.XML | Device.SelfHeal.X_RDKCENTRAL-COM_MaxRebootCount | Sets the maximum number of times the RPI device is rebooted after the CPU and memory threshold value (default 100) is reached. The default is 3. Once the reboot count reaches this value, no further reboots are performed. The value can be increased if required. |
| 3 | TDM (TestAndDiagnostic); XML Mapper: TestAndDiagnostic.XML | Device.SelfHeal.X_RDKCENTRAL-COM_MaxResetCount | Sets the maximum reset count for the connectivity test. For example, with a value of 3 the LAN functionality is stopped at most 3 times; after that it is no longer stopped. The value can be increased if required. |
| 4 | TDM (TestAndDiagnostic); XML Mapper: TestAndDiagnostic.XML | Device.SelfHeal.X_RDKCENTRAL-COM_DNS_PINGTEST_Enable | Enables the ping function for the connectivity test. The default is TRUE. |
| 5 | TDM (TestAndDiagnostic); XML Mapper: TestAndDiagnostic.XML | Device.SelfHeal.X_RDKCENTRAL-COM_DNS_URL | Sets the DNS URL used by the ping function for the connectivity test. The default is www.google.com. |
| 6 | TDM (TestAndDiagnostic); XML Mapper: TestAndDiagnostic.XML | Device.SelfHeal.ConnectivityTest.X_RDKCENTRAL-COM_PingInterval | Sets the ping interval for the connectivity test. The default is 60. The valid range is 15 to 1440. |
| 7 | TDM (TestAndDiagnostic); XML Mapper: TestAndDiagnostic.XML | Device.SelfHeal.ConnectivityTest.X_RDKCENTRAL-COM_CorrectiveAction | Enables or disables corrective action by the SelfHeal scripts. The default is TRUE. |
| 8 | TDM (TestAndDiagnostic); XML Mapper: TestAndDiagnostic.XML | Device.SelfHeal.ResourceMonitor.X_RDKCENTRAL-COM_UsageComputeWindow | Sets the resource monitor interval. The default is 1. |
| 9 | TDM (TestAndDiagnostic); XML Mapper: TestAndDiagnostic.XML | Device.SelfHeal.ResourceMonitor.X_RDKCENTRAL-COM_AvgCPUThreshold | Sets the average CPU threshold value. The default is 100. |
| 10 | TDM (TestAndDiagnostic); XML Mapper: TestAndDiagnostic.XML | Device.SelfHeal.ResourceMonitor.X_RDKCENTRAL-COM_AvgMemoryThreshold | Sets the average memory threshold value. The default is 100. |
| 11 | TDM (TestAndDiagnostic); XML Mapper: TestAndDiagnostic.XML | Device.SelfHeal.ConnectivityTest.X_RDKCENTRAL-COM_RebootInterval | Sets the reboot interval for the connectivity test. By default, it can be set to 28800. If DNS or the WAN IP goes down, the device stops the LAN functionality, but only when the ping logic finds that the difference between the current time and the last reboot time is greater than the reboot interval. |
| 12 | PAM Module; XML Mapper: TR181-USGv2.XML | Device.DeviceInfo.X_RDKCENTRAL-COM_LastRebootReason | Reports why the RPI device was rebooted. The value is the current reboot status. |
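The parameters above are read and written with the dmcli tool. The sketch below wraps those calls and illustrates the RebootInterval check from row 11. It rests on several assumptions: dmcli is on the PATH, the subsystem prefix is eRT, and the getv output contains a "value: <v>" field. The helper names and the DMCLI variable are hypothetical.

```shell
#!/bin/sh
# Sketch: read and write SelfHeal parameters with dmcli (assumptions noted above).
DMCLI="${DMCLI:-dmcli}"

# Extract the value from dmcli getv output read on stdin.
# Assumes the output contains a field of the form "value: <v>".
dm_parse_value() {
    sed -n 's/.*value: *//p' | sed 's/ *$//' | head -n 1
}

# Print the current value of a parameter.
dm_get() {
    "$DMCLI" eRT getv "$1" | dm_parse_value
}

# Set a parameter. $1 = parameter, $2 = type (for example bool), $3 = value
dm_set() {
    "$DMCLI" eRT setv "$1" "$2" "$3"
}

# Succeeds when more than reboot_interval has passed since the last reboot.
# All three arguments must use the same time unit as RebootInterval.
interval_elapsed() {
    now="$1"; last_reboot="$2"; reboot_interval="$3"
    [ $(( now - last_reboot )) -gt "$reboot_interval" ]
}

# Usage on the device:
#   dm_get Device.DeviceInfo.X_RDKCENTRAL-COM_LastRebootReason
#   dm_set Device.SelfHeal.X_RDKCENTRAL-COM_Enable bool true
```

Keeping the command name in a variable makes it possible to substitute a stub when the helpers are exercised off the device.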

Limitations

RPI does not support IPv6, so the IPv6 logic is skipped in "self_heal_connectivity_test.sh" and "task_health_monitor.sh".

            
