...

  • Removed is_timeposition_accurate as it was always set to true
  • Removed language as it was not used
  • Removed stream_type as it was not used
  • Moved decryption info before the variable-length fields
  • Added Media Keys ID field for multi-CDM instance support


Parameter                  Size
Shared memory buffer       8 MB
Max frames to request      24
Metadata size per frame    ~100 bytes
Metadata region size       ~2.4 KiB
Video frame region size    7 MB
Max video frame size       7 MB - 2.4 KiB
Audio frame region size    1 MB
Max audio frame size       1 MB - 2.4 KiB
Web Audio region size      -
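
The limits above compose straightforwardly: the metadata region is the per-frame metadata size times the maximum number of frames, and the maximum frame size is the region size less that metadata region. A hedged restatement in code, with illustrative names:

#include <cstddef>

/* Illustrative constants restating the table above. */
constexpr size_t kShmBufferSize    = 8 * 1024 * 1024;                  /* 8 MB total */
constexpr size_t kMaxFrames        = 24;
constexpr size_t kMetadataPerFrame = 100;                              /* ~100 bytes */
constexpr size_t kMetadataRegion   = kMaxFrames * kMetadataPerFrame;   /* 2400 bytes, the table's ~2.4 KiB */
constexpr size_t kVideoRegion      = 7 * 1024 * 1024;                  /* 7 MB */
constexpr size_t kAudioRegion      = 1 * 1024 * 1024;                  /* 1 MB */
constexpr size_t kMaxVideoFrame    = kVideoRegion - kMetadataRegion;   /* 7 MB - 2.4 KiB */
constexpr size_t kMaxAudioFrame    = kAudioRegion - kMetadataRegion;   /* 1 MB - 2.4 KiB */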


The following diagram shows schematically how the shared memory buffer is partitioned into two regions for a playback session, for audio and video data respectively. Within each region, the metadata for each frame is written sequentially from the beginning and the media data is stored in the remainder of the region; the offset and length of each frame are specified in its metadata.

...

Frame metadata is stored in a fixed format in the shm region as follows; undefined values should be set to 0.

    uint32_t                        offset;                              /* Offset of first byte of sample in shm buffer */
    uint32_t                        length;                              /* Number of bytes in sample */
    int64_t                         time_position;                       /* Position in stream in nano-seconds */
    int64_t                         sample_duration;                     /* Frame/sample duration in ns */
    uint32_t                        stream_id;                           /* stream id (unique ID for ES, as defined in attachSource()) */
    uint32_t                        extra_data_size;                     /* extraData size */
    uint8_t                         extra_data[32];                      /* buffer containing extradata */

    uint32_t                        media_keys_id;                       /* Identifier of MediaKeys instance to use for decryption. If 0 use any CDM containing the MKS ID */
    uint32_t                        media_key_session_identifier_offset; /* Offset to the location of the MediaKeySessionIdentifier */
    uint32_t                        media_key_session_identifier_length; /* Length of the MediaKeySessionIdentifier */
    uint32_t                        init_vector_offset;                  /* Offset to the location of the initialization vector */
    uint32_t                        init_vector_length;                  /* Length of initialization vector */
    uint32_t                        sub_sample_info_offset;              /* Offset to the location of the sub sample info table */
    uint32_t                        sub_sample_info_len;                 /* Length of sub-sample Info table */
    uint32_t                        init_with_last_15;                   /* initWithLast15 value for decryption */

    if (IS_AUDIO(stream_id))
    {
        uint32_t                    sample_rate;                         /* Samples per second */
        uint32_t                    channels_num;                        /* Number of channels */
    }
    else if (IS_VIDEO(stream_id))
    { 
        uint32_t                    width;                               /* Video width in pixels */
        uint32_t                    height;                              /* Video height in pixels */
    }
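
For illustration, here is a minimal sketch of how a client might read one frame out of the shared memory buffer using the V1 layout above. All names are hypothetical; it assumes the writer and reader share byte order, and relies on the field ordering above giving naturally aligned 64-bit fields (which it does on typical ABIs, so no packing pragma is needed).

#include <cstddef>
#include <cstdint>
#include <cstring>

/* Hypothetical reader-side mirror of the fixed V1 layout (audio variant). */
struct FrameMetadataV1Audio
{
    uint32_t offset;                 /* offset of first byte of sample in shm buffer */
    uint32_t length;                 /* number of bytes in sample */
    int64_t  time_position;
    int64_t  sample_duration;
    uint32_t stream_id;
    uint32_t extra_data_size;
    uint8_t  extra_data[32];
    uint32_t media_keys_id;
    uint32_t media_key_session_identifier_offset;
    uint32_t media_key_session_identifier_length;
    uint32_t init_vector_offset;
    uint32_t init_vector_length;
    uint32_t sub_sample_info_offset;
    uint32_t sub_sample_info_len;
    uint32_t init_with_last_15;
    uint32_t sample_rate;            /* audio-only tail */
    uint32_t channels_num;
};

/* Copies the i-th frame's metadata from the start of a region and returns a
 * pointer to the frame's media data; 'offset' is relative to the shm base. */
const uint8_t *readFrame(const uint8_t *shmBase, const uint8_t *region,
                         size_t i, FrameMetadataV1Audio &md)
{
    std::memcpy(&md, region + i * sizeof(md), sizeof(md)); /* metadata is written sequentially */
    return shmBase + md.offset;
}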


Metadata Format Version 2

Parameter                  Size
Shared memory buffer       8 MB
Max frames to request      24
Metadata size per frame    Clear: variable but <100 bytes
                           Encrypted: TODO
Video frame region size    7 MB
Max video frame size       TODO
Audio frame region size    1 MB
Max audio frame size       TODO
Web Audio region size      10 KB


The following diagram shows schematically how the shared memory buffer is partitioned into two regions for a playback session, for audio and video data respectively. Within each region there is a 4-byte version field indicating V2 metadata, followed by concatenated metadata/frame pairs.

...

V2 metadata uses protobuf to serialise the frames' properties to the shared memory buffer. This use of protobuf aligns with the IPC protocol but also allows support for optional fields and for fields to be added and removed without causing backward/forward compatibility issues. It also supports variable length fields so the MKS ID, IV & sub-sample information can all be directly encoded in the metadata, avoiding the complexities of interleaving them with the media frames and referencing them with offsets/lengths as used in the V1 metadata format.

enum SegmentAlignment {
    ALIGNMENT_UNDEFINED = 0;
    ALIGNMENT_NAL       = 1;
    ALIGNMENT_AU        = 2;
}

message MediaSegmentMetadata {
    required uint32                 length               = 1;             /* Number of bytes in sample */
    required sint64                 time_position        = 2;             /* Position in stream in nanoseconds */
    required sint64                 sample_duration      = 3;             /* Frame/sample duration in nanoseconds */
    required uint32                 stream_id            = 4;             /* stream id (unique ID for ES, as defined in attachSource()) */
    optional uint32                 sample_rate          = 5;             /* Samples per second for audio segments */
    optional uint32                 channels_num         = 6;             /* Number of channels for audio segments */
    optional uint32                 width                = 7;             /* Frame width in pixels for video segments */
    optional uint32                 height               = 8;             /* Frame height in pixels for video segments */
    optional SegmentAlignment       segment_alignment    = 9;            /* Segment alignment can be specified for H264/H265, will use NAL if not set */
    optional bytes                  extra_data           = 10;            /* buffer containing extradata */
    optional bytes                  media_key_session_id = 11;            /* Buffer containing key session ID to use for decryption */
    optional bytes                  key_id               = 12;            /* Buffer containing Key ID to use for decryption */
    optional bytes                  init_vector          = 13;            /* Buffer containing the initialization vector for decryption */
    optional uint32                 init_with_last_15    = 14;            /* initWithLast15 value for decryption */
    repeated SubsamplePair          sub_sample_info      = 15;            /* If present, use gather/scatter decryption based on this list of clear/encrypted byte lengths. */
                                                                          /* If not present and content is encrypted then the entire media segment needs decryption.       */
}

message SubsamplePair {
    required uint32                 num_clear_bytes      = 1;             /* How many of the next bytes in the sequence are clear */
    required uint32                 num_encrypted_bytes  = 2;             /* How many of the next bytes in the sequence are encrypted */
}
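
As a sketch of how a writer might populate a V2 region under these definitions: the generated header name is hypothetical, and the 4-byte length prefix in front of each serialised metadata blob is an assumption here, since protobuf messages are not self-delimiting and the delimiting scheme is not specified above.

#include <cstddef>
#include <cstdint>
#include <cstring>
#include "media_segment_metadata.pb.h"   /* hypothetical generated header for the messages above */

/* Appends one metadata/frame pair at 'pos' within a V2 region. Returns the
 * new write position, or nullptr if the pair does not fit before 'regionEnd'.
 * Assumes the 4-byte version field (value 2) was already written at the start
 * of the region. */
uint8_t *appendFrame(uint8_t *pos, const uint8_t *regionEnd,
                     const MediaSegmentMetadata &metadata,
                     const uint8_t *frame, uint32_t frameLen)
{
    const uint32_t mdLen = static_cast<uint32_t>(metadata.ByteSizeLong());
    if (sizeof(mdLen) + mdLen + frameLen > static_cast<size_t>(regionEnd - pos))
        return nullptr;
    std::memcpy(pos, &mdLen, sizeof(mdLen));                 /* assumed length prefix */
    pos += sizeof(mdLen);
    metadata.SerializeToArray(pos, static_cast<int>(mdLen)); /* serialised metadata */
    pos += mdLen;
    std::memcpy(pos, frame, frameLen);                       /* frame data follows its metadata */
    return pos + frameLen;
}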


Playback Control

Rialto interactions with Client & GStreamer

...

@startuml
title Cobalt play/pause/set speed

autonumber

box "Container" #LightGreen
participant Cobalt
participant Starboard
participant GStreamer_client
participant rialtoClient
end box


Cobalt             ->  Starboard:        SbPlayerSetPlaybackRate(player, rate)
note over GStreamer_client
For now we assume max 1 pipeline session in
GStreamer_client and therefore store its
Rialto handle in a local variable. This
will need to be fixed to support dual
playback.
end note

alt rate == 0
Starboard          ->  GStreamer_client: Set pipeline state PAUSED
GStreamer_client   ->  rialtoClient:     pause(pipeline_session)
else rate != 0
Starboard          ->  GStreamer_client: Set pipeline state PLAYING
GStreamer_client   ->  rialtoClient:     play(pipeline_session)
end
rialtoClient       --> GStreamer_client: status
GStreamer_client   --> Starboard:        status

Starboard          ->  GStreamer_client: Set pipeline playback rate
GStreamer_client   ->  rialtoClient:     setPlaybackRate(rate)
rialtoClient       --> GStreamer_client: status
GStreamer_client   --> Starboard:        status

Starboard          --> Cobalt:           status



alt Pause->play successful
rialtoClient       -/  GStreamer_client: notifyPlaybackState(pipeline_session, PLAYBACK_STATE_PLAYING)
else Play->pause successful
rialtoClient       -/  GStreamer_client: notifyPlaybackState(pipeline_session, PLAYBACK_STATE_PAUSED)
else Play<->pause state change failed
rialtoClient       -/  GStreamer_client: notifyPlaybackState(pipeline_session, PLAYBACK_STATE_FAILURE)
end

@enduml 
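
The client-side handling in the diagram reduces to a small amount of glue. The sketch below is illustrative only: the rialtoClient calls are as named in the diagram, not taken from a real header, and the single static session handle mirrors the diagram's one-pipeline-session assumption.

/* Hypothetical client-side API, using the call names from the diagram. */
namespace rialtoClient
{
    using SessionId = int;
    bool pause(SessionId session);
    bool play(SessionId session);
    bool setPlaybackRate(SessionId session, double rate);
}

/* Single pipeline session handle, per the note in the diagram; this needs
 * rework to support dual playback. */
static rialtoClient::SessionId g_pipelineSession = 0;

/* Starboard -> GStreamer_client handling of SbPlayerSetPlaybackRate():
 * rate == 0 maps to pause(), any other rate maps to play(), and the rate
 * itself is then forwarded, matching the two exchanges in the diagram. */
bool handleSetPlaybackRate(double rate)
{
    const bool ok = (rate == 0.0) ? rialtoClient::pause(g_pipelineSession)
                                  : rialtoClient::play(g_pipelineSession);
    return ok && rialtoClient::setPlaybackRate(g_pipelineSession, rate);
}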


Render Frame (Video Peek)

Render frame may be called while playback is paused, either at the start of playback or immediately after a seek operation. The client must wait for the readyToRenderFrame() callback before issuing the render, as sketched below.
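
A sketch of the required ordering; all names except readyToRenderFrame() are hypothetical, and the condition variable is just one way for the client to gate its render call on the callback.

#include <condition_variable>
#include <mutex>

/* Illustrative gating of a render request on readyToRenderFrame(). */
class FramePeeker
{
public:
    /* Invoked from the Rialto client when a frame can be rendered. */
    void readyToRenderFrame()
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_ready = true;
        m_cond.notify_all();
    }

    /* Blocks until the callback has fired, then renders the held frame. */
    void renderFrameWhenReady()
    {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_cond.wait(lock, [this] { return m_ready; });
        renderFrame();  /* hypothetical call performing the video peek */
    }

private:
    void renderFrame();  /* declaration only; rendering is out of scope here */

    std::mutex m_mutex;
    std::condition_variable m_cond;
    bool m_ready = false;
};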

...