The Audio Data characteristic is used to send data from an RVS Server to an RVS Client.
Only a single instance of this characteristic shall exist as part of the RDK Voice Service.

The characteristic UUID shall be set to: TBD

Audio Data Characteristic Descriptors

Client Characteristic Configuration Descriptor

A Client Characteristic Configuration descriptor shall be included in the Audio Data characteristic.

The Audio Data Characteristic value consists of 20 octets of encoded audio data.
When audio data streaming is enabled via the Audio Control Characteristic and notifications are enabled in the Client Characteristic Configuration descriptor, each 20 octets of Audio Data shall be sent as a GATT notification.
Audio data shall be stored in frames. Each frame is a multiple of 20 octets in length and is segmented and sent in order, 20 octets per GATT notification. The format and size of the frame depend on the codec used to encode the audio data. Only complete frames shall be sent, each as a fixed-size batch of GATT notifications. If the Remote Control Device has to discard data (e.g. in noisy environments), it shall discard a complete audio frame.
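
The segmentation rule above can be illustrated with a minimal Python sketch. It is not part of the specification: the names NOTIFICATION_SIZE and segment_frame are hypothetical, and the sketch only assumes what the text states, namely that a frame is a non-empty multiple of 20 octets and is sent in order, 20 octets per notification.

```python
NOTIFICATION_SIZE = 20  # octets carried by one Audio Data notification


def segment_frame(frame: bytes) -> list[bytes]:
    """Split one complete audio frame into ordered 20-octet segments.

    Hypothetical helper: rejects anything that is not a whole number of
    20-octet segments, since only complete frames may be sent.
    """
    if len(frame) == 0 or len(frame) % NOTIFICATION_SIZE != 0:
        raise ValueError("frame length must be a non-zero multiple of 20 octets")
    return [
        frame[offset:offset + NOTIFICATION_SIZE]
        for offset in range(0, len(frame), NOTIFICATION_SIZE)
    ]
```

For a 100-octet ADPCM frame this yields five segments, which a receiver can concatenate in arrival order to recover the frame.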

G.726 and IMA/DVI ADPCM Frame Format

Both G.726 and IMA/DVI ADPCM frames shall be 100 octets in length, consisting of 4 octets of metadata followed by 96 octets of encoded audio sample data.
Figure 5 shows the layout of an audio frame and how it is sent in 20-octet GATT notifications.

The 4-octet metadata header contains information to be used by the decoder. The table below shows the format of the metadata.

Octet   Description
0       Frame sequence number
1       Index into stepsize table for the start of the frame
2-3     Predicted value of first sample from the previous frame (little endian byte order)
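
As an illustration of the header layout in the table, the following Python sketch decodes the first four octets of a frame. The names FrameHeader and parse_header are hypothetical. The sketch also assumes that the 16-bit predicted value is signed, as is usual for ADPCM predictors; the table itself only specifies little endian byte order, so treat the signedness as an assumption.

```python
import struct
from typing import NamedTuple


class FrameHeader(NamedTuple):
    sequence: int    # octet 0: frame sequence number
    step_index: int  # octet 1: index into the stepsize table
    predicted: int   # octets 2-3: predicted value, little endian


def parse_header(frame: bytes) -> FrameHeader:
    """Decode the 4-octet metadata header at the start of a frame."""
    if len(frame) < 4:
        raise ValueError("frame is shorter than the 4-octet header")
    # "<BBh": little endian, two unsigned octets, one 16-bit integer.
    # ASSUMPTION: "h" (signed) is used; the table does not state signedness.
    sequence, step_index, predicted = struct.unpack_from("<BBh", frame, 0)
    return FrameHeader(sequence, step_index, predicted)
```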


The figure below shows the layout of the start of an ADPCM audio frame.



The sequence number shall be incremented for every audio frame encoded, even if the frame was discarded (e.g. in noisy environments). The sequence number shall wrap back to 0x00 after 0xFF.
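
Because the sequence number advances even for discarded frames, a receiver can estimate how many frames were dropped between two received frames. The following Python sketch shows one way to do that with modulo-256 arithmetic. The function name frames_missing is hypothetical, and the sketch assumes frames arrive in order without duplicates.

```python
def frames_missing(prev_seq: int, seq: int) -> int:
    """Return how many frames were skipped between two received frames.

    Both arguments are 8-bit sequence numbers. The subtraction is done
    modulo 256 so that the wrap from 0xFF to 0x00 counts as one step.
    ASSUMPTION: in-order delivery, and fewer than 256 frames lost.
    """
    return (seq - prev_seq - 1) % 256
```

A result of 0 means the two frames are consecutive; a decoder might use a non-zero result to insert silence or reset its state.
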
Both audio encoders shall use 4 bits per sample, and only one audio channel shall be sampled. The 4-bit samples are packed two per octet, with the first sample in the four most significant bits and the second sample in the four least significant bits.
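
The packing order can be sketched in Python as follows. The function name unpack_samples is hypothetical; it only separates the 4-bit codes in the order described above and does not perform any ADPCM decoding.

```python
def unpack_samples(payload: bytes) -> list[int]:
    """Split packed octets into 4-bit sample codes, first sample first."""
    samples = []
    for octet in payload:
        samples.append(octet >> 4)    # first sample: four most significant bits
        samples.append(octet & 0x0F)  # second sample: four least significant bits
    return samples
```

Applied to the 96 octets of sample data in one frame, this produces 192 sample codes.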

Audio Frame Buffering

Audio data is always buffered in 100-octet frames, and the device shall ensure that only complete frames are sent (as 5 x 20-octet GATT notifications). No partial frames shall ever be transferred. This ensures that the STB host can always determine where an audio frame starts and ends. The RVS Server shall be able to buffer a minimum of 2 audio frames. If the frame buffers are exhausted, the complete frame shall be discarded; however, the frame sequence number shall still be incremented for the next encoded frame. All buffered content shall be purged when a disconnection event occurs, when audio streaming is disabled via the Audio Control Characteristic, or when notifications are disabled for the characteristic.
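
The buffering rules above can be modelled with a short Python sketch. It is an illustration under stated assumptions, not a reference implementation: the class FrameBuffer and its methods are hypothetical, the predicted value is assumed to be a signed 16-bit integer, and the sequence counter is assumed to keep running across a purge because the text does not say otherwise.

```python
import struct
from collections import deque

SAMPLE_OCTETS = 96       # encoded sample data per frame
NOTIFICATION_SIZE = 20   # octets per GATT notification


class FrameBuffer:
    """Hypothetical model of the RVS Server frame buffer."""

    def __init__(self, capacity: int = 2):
        self._frames = deque()
        self._capacity = capacity
        self._next_seq = 0

    def push(self, step_index: int, predicted: int, samples: bytes) -> bool:
        """Queue one encoded frame; return False if it had to be discarded."""
        if len(samples) != SAMPLE_OCTETS:
            raise ValueError("expected 96 octets of sample data")
        seq = self._next_seq
        # The sequence number advances even when the frame is discarded.
        self._next_seq = (seq + 1) & 0xFF
        if len(self._frames) >= self._capacity:
            return False  # buffers exhausted: drop the complete frame
        # ASSUMPTION: predicted value packed as signed 16-bit little endian.
        header = struct.pack("<BBh", seq, step_index, predicted)
        self._frames.append(header + samples)
        return True

    def pop_notifications(self) -> list[bytes]:
        """Return the oldest frame as five 20-octet payloads, or [] if empty."""
        if not self._frames:
            return []
        frame = self._frames.popleft()
        return [frame[i:i + NOTIFICATION_SIZE]
                for i in range(0, len(frame), NOTIFICATION_SIZE)]

    def purge(self) -> None:
        """Drop all buffered frames (disconnect, streaming or notifications off)."""
        self._frames.clear()
```

In this sketch a frame leaves the buffer only as a complete batch of five notifications, so a partial frame is never handed to the transport.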

