Overview

Maintenance Manager is a WPEFramework (Thunder) plugin that manages the daily maintenance activities on an RDK device.

Source Code

RDKVMaintenanceManager (rdkservices main)
RDKEMaintenanceManager (entservices-softwareupdate main)

Terminology

Daily Maintenance: A set of activities (or tasks) that must be executed daily without impacting the UX. It is executed when the user is not actively using the CPE.

Maintenance Activity: An activity (or task) that is part of Daily Maintenance. Maintenance activities are considered critical and wake the device from the DEEPSLEEP power state so that they can be executed.

DCM: Device Configuration Manager. It fetches and updates the configuration for the device.

RFC: RDK Runtime Feature Control. It is used for remotely controlling CPE software features on RDK platforms. Release Management uses RFC for staged rollout of new features.

Log Upload: Uploads all logs at the time specified in DCMSettings.conf or when the log size reaches a predefined limit.

FW Download (swupdate/rdkfwupgrader): Checks whether a new firmware update is required and performs the update on the device if needed.

Types of Maintenance

Unsolicited Maintenance

Also known as Maintenance on Bootup. RDK automatically performs maintenance as soon as the device boots up, and this happens on every bootup. The device looks for critical updates on boot and performs the critical maintenance activities without the application triggering maintenance. isRebootPending is set to false unless an RFC/FWUpdate is needed. Below are the tasks executed as part of Unsolicited Maintenance.

Solicited Maintenance

Also known as Scheduled Maintenance or Daily Maintenance. It is executed daily at the Maintenance Start Time and is triggered by the application. isRebootPending is set to true irrespective of the XConf configuration. Below are the tasks executed as part of Solicited Maintenance.

Note: Both Unsolicited and Solicited maintenance cannot be triggered at the same time. If a Solicited Maintenance is started while an Unsolicited Maintenance is in progress, then the Solicited Maintenance will be notified and postponed to the next day.
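The isRebootPending rules described for the two maintenance types can be summarised in a small sketch. The function name and the string labels are hypothetical and only model the behaviour stated above; they are not taken from the plugin's code.

```python
def is_reboot_pending(maintenance_type, update_needed):
    """Model the isRebootPending rule for one maintenance run.

    maintenance_type: "UNSOLICITED" or "SOLICITED" (illustrative labels).
    update_needed: True when an RFC/firmware update is needed.
    """
    if maintenance_type == "SOLICITED":
        # Solicited maintenance sets the flag irrespective of XConf configuration.
        return True
    # Unsolicited maintenance sets the flag only when an update is needed.
    return update_needed


# Print the full truth table for both maintenance types.
for kind in ("UNSOLICITED", "SOLICITED"):
    for needed in (False, True):
        print(kind, needed, is_reboot_pending(kind, needed))
```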

Functionality Observations

WhoAmI

WhoAmI is used by the device to determine its operating context (partner, product/experience, Regional configService) and load the right firmware, so that the service the user signed up for can be activated and used. It runs only in Unsolicited Maintenance. It is not enabled by default in RDKE yet (as of 25Q2). WhoAmI depends on components such as AuthService, which has to be refactored before it is suitable for use in community builds.

Maintenance Activity/ Task

| Activity/Task Name | RDKV Wrapper | RDKV Task | RDKE Wrapper | RDKE Task |
| --- | --- | --- | --- | --- |
| RFC | /lib/rdk/Start_RFC.sh | /lib/rdk/RFCbase.sh | /lib/rdk/StartMaintenanceTasks.sh RFC | ENABLE_RFC_MANAGER flag Enabled: /usr/bin/rfcMgr; ENABLE_RFC_MANAGER flag Disabled: /lib/rdk/RFCbase.sh |
| SWUPDATE | /lib/rdk/swupdate_utility.sh | FWDOWNLOAD RFC Enabled: /usr/bin/rdkvfwupgrader 0 1; FWDOWNLOAD RFC Disabled: /lib/rdk/deviceInitiatedFWDnld.sh | /lib/rdk/StartMaintenanceTasks.sh SWUPDATE | /usr/bin/rdkvfwupgrader 0 1 |
| LOGUPLOAD | /lib/rdk/Start_uploadSTBLogs.sh | /lib/rdk/uploadSTBLogs.sh | /lib/rdk/StartMaintenanceTasks.sh LOGUPLOAD | /lib/rdk/uploadSTBLogs.sh |

XConf Settings

DCMScript.sh (in RDKV) or dcmd (in RDKE) parses the /tmp/DCMSettings.conf file to get the XConf data fetched by Telemetry. The urn:settings:CheckSchedule:cron entry provides the start_hr and start_min values, and urn:settings:TimeZoneMode provides the tz_mode value. The device is expected to receive the time based on the time zone configured on the device. DCM then stores the parsed data in the /opt/rdk_maintenance.conf file. This data is used to calculate the Maintenance Start Time for the next scheduled (SOLICITED) maintenance, either with /lib/rdk/getMaintenanceStartTime.sh (in RDKV) or internally with the CalculateStartTime() API in MaintenanceManager.
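As an illustration of the parsing and start-time steps, the sketch below extracts start_hr, start_min and tz_mode from DCMSettings.conf-style text and computes the next occurrence of that time. It assumes each setting is stored as a key=value line and that the cron value uses the standard five-field layout (minute first, hour second). The helper names, the sample values and the "next occurrence" rule are hypothetical; the real DCMScript.sh, dcmd and CalculateStartTime() logic may differ, and tz_mode handling is left out.

```python
from datetime import datetime, timedelta

CRON_KEY = "urn:settings:CheckSchedule:cron"
TZ_KEY = "urn:settings:TimeZoneMode"


def parse_dcm_settings(text):
    """Return start_hr, start_min and tz_mode from key=value settings text.

    Hypothetical helper: assumes one key=value pair per line, with
    optional double quotes around keys and values.
    """
    settings = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip().strip('"')] = value.strip().strip('"')

    # Standard cron order: minute hour day-of-month month day-of-week.
    fields = settings[CRON_KEY].split()
    return {
        "start_hr": int(fields[1]),
        "start_min": int(fields[0]),
        "tz_mode": settings.get(TZ_KEY, ""),
    }


def next_start_time(now, start_hr, start_min):
    """Return the next datetime at start_hr:start_min strictly after now."""
    candidate = now.replace(hour=start_hr, minute=start_min,
                            second=0, microsecond=0)
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate


# Sample input; the values are made up for illustration.
sample = (
    "urn:settings:CheckSchedule:cron=15 2 * * *\n"
    "urn:settings:TimeZoneMode=Local time\n"
)
parsed = parse_dcm_settings(sample)
print(parsed)
print(next_start_time(datetime(2025, 1, 1, 3, 0),
                      parsed["start_hr"], parsed["start_min"]))
```

With the sample input, the maintenance window is 02:15, so a device that evaluates the schedule at 03:00 would pick 02:15 on the following day.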

Architecture Overview (HLA)

Activity Diagram

@startuml
start
:Initialize();
:InitializeIARM();
:MAINTENANCE_IDLE;
if (isDeviceConnectedToInternet?) then (yes)
else (Not Connected)
    repeat
        :Sleep 30s;
    repeat while (Retry Count < 4)
endif
if (Network Plugin Active?) then (yes)
    :Subscribe to Internet Status Event;
else (no)
    :Internet Status Event Subscription Failed;
endif

:task_execution_thread;
if (isDeviceOnline?) then (yes)
else (no)
    :MAINTENANCE_ERROR;
    stop
endif

group Task Execution (RFC/ SWUPDATE/ LOGUPLOAD)
    :Start RFC/ SWUPDATE/ LOGUPLOAD Task;
    :Start Timer;
    repeat
        :Execute system() for RFC/ SWUPDATE/ LOGUPLOAD Task;
        if (RFC/ SWUPDATE/ LOGUPLOAD success? before timer ends) then (yes)
            :Stop Timer;
            :MAINTENANCE_<RFC/FWDOWNLOAD/LOGUPLOAD>_COMPLETE;
        else (no, fail or timeout)
            if (System Call Failed) then (yes)
                if (Retry Count < TASK_RETRY_COUNT) then (yes)
                    :Retry after TASK_RETRY_DELAY;
                    :Decrement Retry Count;
                    :Restart Timer;
                else (no more retries)
                    :MAINTENANCE_<RFC/FWDOWNLOAD/LOGUPLOAD>_ERROR;
                    :Stop Timer;
                    break
                endif
            else (RFC/ SWUPDATE/ LOGUPLOAD Task Timed Out)
                :MAINTENANCE_<RFC/FWDOWNLOAD/LOGUPLOAD>_ERROR;
                :Stop Timer;
                break
            endif
        endif
    repeat while (Retry Count > 0)
end group

if (At least one of the RFC/ FWDOWNLOAD/ LOGUPLOAD tasks failed?) then (yes)
    :MAINTENANCE_ERROR;
else (no)
    :MAINTENANCE_COMPLETE;
endif
:Task Execution Completed;
:Thread Join;
:Deinitialize();
if (Timer Exists) then (yes)
    :Delete Timer;
    if (Signal Registered) then (yes)
        :Unregister Signal;
    endif
endif
stop
@enduml
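
The task execution group in the diagram above can be approximated by the following sketch. It is not the plugin's code (MaintenanceManager runs the tasks via system() from its own thread). The constant values are assumptions, a non-zero exit code stands in for "System Call Failed", and the timeout argument of subprocess.run plays the role of the per-task timer. The run and sleep parameters exist only so the function can be exercised without starting real processes.

```python
import subprocess
import time

TASK_RETRY_COUNT = 1    # assumed value; the real constant may differ
TASK_RETRY_DELAY = 0.1  # seconds, assumed value
TASK_TIMEOUT = 5.0      # seconds, assumed value


def run_task(name, command, run=subprocess.run, sleep=time.sleep):
    """Run one maintenance task and return its final status string.

    name: "RFC", "FWDOWNLOAD" or "LOGUPLOAD".
    command: argument list for the task wrapper.
    """
    retries_left = TASK_RETRY_COUNT
    while True:
        try:
            result = run(command, timeout=TASK_TIMEOUT)
        except subprocess.TimeoutExpired:
            # A timed-out task is reported as an error and is not retried.
            return f"MAINTENANCE_{name}_ERROR"
        if result.returncode == 0:
            return f"MAINTENANCE_{name}_COMPLETE"
        if retries_left == 0:
            # No retries left after a failed call.
            return f"MAINTENANCE_{name}_ERROR"
        retries_left -= 1
        sleep(TASK_RETRY_DELAY)
```

Passing stub callables for run and sleep makes it possible to check the retry and timeout paths in isolation.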

High Level Sequence Diagram

@startuml
participant "Maintenance Manager" as mm
participant "Network Manager" as nm
participant "WPEFramework" as wpe
participant "RFC service" as rfc
participant "swupdate" as swu
participant "Log Upload" as lu

wpe -> mm: Activate Maintenance Manager Plugin
note over mm: MAINTENANCE_IDLE

== Network check ==
mm -> nm: isDeviceConnectedToInternet?
alt Success case
    nm --> mm: Connected
else Failure case
    nm --> mm: Not Connected
    loop 4 times max
        mm -> mm: Sleep 30s
        mm -> nm: isDeviceConnectedToInternet?
    end
end

alt No internet?
    mm --> mm: MAINTENANCE_ERROR
else Internet available
    mm -> mm: proceed to execution thread
end

note over mm: MAINTENANCE_STARTED

== Maintenance tasks ==
mm -> rfc: Start RFC
rfc --> mm: RFC Complete/ Error

mm -> swu: Start SWU
swu --> mm: SWU Complete/ Error

mm -> lu: Start Log upload
lu --> mm: Log upload Complete/ Error

note over mm: MAINTENANCE_COMPLETE/ MAINTENANCE_ERROR
@enduml
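
The status flow in the sequence above can be sketched as a function that returns the statuses reported during one run. The function name and the callable-based task interface are hypothetical; the status names and the task order come from the diagrams.

```python
def run_maintenance(is_online, tasks):
    """Return the list of statuses reported during one maintenance run.

    is_online: result of the network check.
    tasks: ordered mapping of task name -> callable returning True on success.
    """
    statuses = ["MAINTENANCE_IDLE"]
    if not is_online:
        # Without internet connectivity no task is started.
        statuses.append("MAINTENANCE_ERROR")
        return statuses

    statuses.append("MAINTENANCE_STARTED")
    any_failed = False
    for name, task in tasks.items():
        ok = task()
        statuses.append(f"MAINTENANCE_{name}_{'COMPLETE' if ok else 'ERROR'}")
        any_failed = any_failed or not ok

    # One failed task is enough to report the whole run as an error.
    statuses.append("MAINTENANCE_ERROR" if any_failed else "MAINTENANCE_COMPLETE")
    return statuses


# Example run in which only the firmware download fails.
example = run_maintenance(True, {
    "RFC": lambda: True,
    "FWDOWNLOAD": lambda: False,
    "LOGUPLOAD": lambda: True,
})
print(example)
```

In this sketch the remaining tasks still run after one task fails, which matches the sequence diagram, where each task reports Complete or Error before the next one starts.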

Maintenance Status Notifications

Distros and RFCs used in Maintenance Manager

Notable features

  1. Network check - Before triggering the maintenance tasks, Maintenance Manager checks the device for internet connectivity. If the device is connected to the internet, it proceeds to execute the maintenance tasks. Otherwise it waits for up to 2 minutes, and if there is still no internet connection, it exits from maintenance with the status MAINTENANCE_ERROR.
  2. If the device cannot acquire a network connection within 2 minutes, Maintenance Manager subscribes to the network event before exiting from the maintenance activities. Once the device gets internet connectivity, Maintenance Manager is notified and triggers RFC and xconfImageCheck.sh.
  3. StopMaintenance is allowed only while maintenance is in progress, and it sets the status to MAINTENANCE_ERROR.
  4. StartMaintenance is allowed only when maintenance is not already in progress.
  5. WhoAmI integration changes have been made in Maintenance Manager to acquire the device initialization context.
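
The 2-minute wait in the network check corresponds to the 4 retries of 30 seconds shown in the diagrams. A minimal sketch of that loop follows; the function and constant names are hypothetical, and is_connected stands in for the isDeviceConnectedToInternet query.

```python
import time

NETWORK_RETRY_COUNT = 4      # from the activity diagram
NETWORK_RETRY_INTERVAL = 30  # seconds, from the activity diagram


def wait_for_internet(is_connected, sleep=time.sleep):
    """Return True once is_connected() succeeds, False after all retries."""
    if is_connected():
        return True
    for _ in range(NETWORK_RETRY_COUNT):
        sleep(NETWORK_RETRY_INTERVAL)
        if is_connected():
            return True
    return False


# Longest possible wait before giving up: 4 * 30 s = 120 s (2 minutes).
max_wait = NETWORK_RETRY_COUNT * NETWORK_RETRY_INTERVAL
print(max_wait)
```

When wait_for_internet returns False, the behaviour described in items 1 and 2 applies: the status becomes MAINTENANCE_ERROR and the network event subscription is used to resume later.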

Logs

Maintenance Manager logs can be found in wpeframework log. 

Grep string: grep -inr "MaintenanceManager.cpp" /opt/logs/maintenance.log*