BLE RDK Voice Service Specification
RDK-SP-BLE-RV-Service-D02-200120
Document Status: Draft
January 20, 2020
Document Status

...

Document Control Number: RDK-SP-BLE-RV-Service-D02-200120
Document Title: BLE RDK Voice Service Specification
Versions: D01 – June 20, 2019; D02 – January 20, 2020
Date: January 20, 2020
Status: Draft
Distribution: RDK members only

...

Tables
Table 1 - Typographical Conventions
Table 2 - Terms and Definitions
Table 3 - Abbreviations and Acronyms
Table 4 - GATT Sub-Procedure Requirement
Table 5 - Voice Service Characteristics
Table 6 - Audio Codecs Supported Bit Mask
Table 7 - Audio Control Characteristic Value
Table 8 - Audio Control Characteristic Encoding Values
Table 9 - Audio Control Characteristic Enable Values
Table 10 - Audio Response Values
Table 11 - Audio Keyword Detect Characteristic
Table 12 - Audio Beamformer Characteristic
Table 13 - Beam Data Field
Table 14 - Beam Description Byte
Table 15 - Audio Frame Metadata Format

Figures
Figure 2 - Push-to-talk Voice Session
Figure 3 - Explicit Reject of Voice Session
Figure 4 - Far-Field Voice Session
Figure 5 - G.726 and IMA/DVI ADPCM Frame Format
Figure 6 - G.726 and IMA/DVI ADPCM Frame Metadata

...

The RDK Voice Service is a service for transmitting voice data over BLE in the RDK ecosystem. The RDK Voice Service is specifically designed to enable voice control of a BLE voice client.

...

This document defines detailed requirements for the RDK Voice Service. It is intended to specify transmission of voice data from an RVS server to an RVS client.

...

| Typeface | Usage |
|---|---|
| Boldface | Used to call attention to a piece of information. For example: This specification does not include headend diagnostic screens. |
| Boldface & Uppercase | Used to emphasize information and for readability. For example: ENTER, MUTE, INFO, VOL +/- and other buttons on the remote control. |
| Italics | Used to emphasize that the information being presented is for informational purposes only and is not a requirement even though it may contain conformance language. For example: Note: The voice controller uses the Channel Check Request to verify that the voice target has disabled frequency agility. |
| Uppercase | Used to define and signify a requirement. For example: MUST, SHOULD, and MAY. |

...

| Version | Date | Author | Remarks |
|---|---|---|---|
| D01 | 09 Aug 2019 | Comcast | Initial Version |
| D02 | 20 Jan 2020 | Comcast | Clean-up |

...

Reasonable effort is made to keep references up to date with respect to versions and release dates; however, manufacturers are responsible for ensuring they have the most recent version of a reference specification (unless otherwise noted).
Where conflicts exist between requirements contained in this specification and normative references, the specification requirements govern.

...

[ADPCM] Recommended Practices for Enhancing Digital Audio Compatibility in Multimedia Systems, IMA Digital Audio Focus and Technical Working Groups; DATWG Recommendation, October 21, 1992.
[BLUETOOTH] Bluetooth Core Specification version 4.0 or later

...

| Term | Definition |
|---|---|
| Adaptive Differential Pulse-Code Modulation | A compression algorithm that varies the size of the quantization step, allowing further reduction of the required bandwidth for a given signal-to-noise ratio. |
| Opus | |

...

| Abbreviation | Definition |
|---|---|
| ADPCM | Adaptive Differential Pulse-Code Modulation |
| RVS | RDK Voice Service |

...

The RDK Voice Service exposes data and associated formatting for streaming voice audio from an RDK Remote Control Device to an RDK-based STB.

...

All capabilities indicated as mandatory for this Service shall be supported in the specified manner (process-mandatory). This also applies for all optional and conditional capabilities for which support is indicated.

...

This service is not dependent upon any other services.

...

This specification is compatible with any Bluetooth core specification as defined in [BLUETOOTH] that includes the Generic Attribute Profile (GATT) specification and the Bluetooth Low Energy Controller specification.

...

| GATT Sub-Procedure | Requirement |
|---|---|
| Read Characteristic Value | M |
| Write Characteristic Value | M |
| Write Without Response | O |
| Notification | M |
| Read Characteristic Descriptors | M |
| Write Characteristic Descriptors | M |

...

The service shall only operate over an LE transport.

...

This service does not define any application error codes that are used in Attribute Protocol.

...

All characteristics used with this service shall be transmitted with the least significant octet first (i.e., little endian).

...

Service Roles

A remote control or similar low-power device equipped with one or more microphones should function as an RVS Server.
A set-top box or other host capable of processing transmitted voice data should function as an RVS Client.

...

| Characteristic Name | Requirement | Mandatory Properties | Optional Properties | Security Permissions |
|---|---|---|---|---|
| Audio Codecs | M | Read | | None |
| Audio Gain | O | Read, Write, Write Without Response | | None |
| Audio Control | M | Read, Write, Write Without Response | | None |
| Audio Response | O | Write | | None |
| Keyword Data | O | Notify | | None |
| Beamformer Data | O | Notify | | None |
| Audio Data | M | Notify | | None |

Notes:

  • Security Permissions of "None" means that this service does not impose any requirements.
  • Profiles utilising this Service may impose security requirements beyond those defined in Table 5 for all characteristics defined in Table 5.
  • Properties not listed as mandatory (M) or optional (O) are excluded.

...

Audio Codecs Characteristic Value

...

| Bit | Audio Codec | Bits per Sample | Audio Sample Rate | Audio Channels |
|---|---|---|---|---|
| 0 | G.726-32 ADPCM | 4 | 16000 per second | Single (Mono) |
| 1 | IMA/DVI ADPCM | 4 | 16000 per second | Single (Mono) |
| 2 | Opus | | | Single (Mono) |
| 3 - 31 | Reserved for future use | | | |
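
As an informative sketch, the bit mask above could be decoded on the RVS Client as follows. The function name `decode_audio_codecs` is hypothetical, and the 4-octet size of the value is an assumption inferred from the bit range 0 - 31 together with the little-endian rule stated earlier; this section does not state the size explicitly.

```python
import struct

# Bit positions taken from the Audio Codecs Supported Bit Mask table.
CODEC_BITS = {
    0: "G.726-32 ADPCM",
    1: "IMA/DVI ADPCM",
    2: "Opus",
}


def decode_audio_codecs(value: bytes) -> list[str]:
    """Return the names of the codecs whose bits are set in the mask.

    Assumption: the characteristic value is a 4-octet little-endian bit mask.
    """
    if len(value) != 4:
        raise ValueError("expected a 4-octet Audio Codecs value")
    (mask,) = struct.unpack("<I", value)
    # Reserved bits are ignored rather than treated as an error.
    return [name for bit, name in CODEC_BITS.items() if mask & (1 << bit)]
```

Ignoring reserved bits keeps a client written against this draft tolerant of servers that advertise codecs assigned in a later revision.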

...

The Audio Gain characteristic is used to expose the gain level of the microphone used for voice capture.
Only a single instance of this characteristic shall exist as part of the RDK Voice Service.
The characteristic UUID of the Audio Gain shall be set to:
0000EA01-BDF0-407C-AAFF-D09967F31ACD

...

The Audio Gain Characteristic value is an unsigned 8-bit value that contains the current audio gain value. The minimum value is 0 and the maximum value is 64.
The Audio Gain characteristic value shall be persistent across connections for bonded devices. The default value for the Audio Gain Characteristic value is vendor specific. Upon connection of non-bonded clients, this characteristic value is set to the default value.
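
As an informative sketch, the range check implied by the paragraph above might look like this. The helper names are hypothetical; only the single-octet unsigned format and the 0 to 64 range come from the text.

```python
AUDIO_GAIN_MIN = 0
AUDIO_GAIN_MAX = 64


def encode_audio_gain(gain: int) -> bytes:
    """Encode a gain level as the single-octet Audio Gain value."""
    if not AUDIO_GAIN_MIN <= gain <= AUDIO_GAIN_MAX:
        raise ValueError(f"gain {gain} is outside {AUDIO_GAIN_MIN}..{AUDIO_GAIN_MAX}")
    return bytes([gain])


def decode_audio_gain(value: bytes) -> int:
    """Decode the Audio Gain value and reject out-of-range levels."""
    if len(value) != 1:
        raise ValueError("expected a 1-octet Audio Gain value")
    gain = value[0]
    if gain > AUDIO_GAIN_MAX:
        raise ValueError(f"gain {gain} exceeds {AUDIO_GAIN_MAX}")
    return gain
```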

...

Audio Control Characteristic Value

...

| Name | Requirement | Format | Default Value |
|---|---|---|---|
| Audio Encoding | Mandatory | uint8 | 0x00 |
| Audio Enable | Mandatory | uint8 | 0x00 |

...

| Value | Description |
|---|---|
| 0 | Audio is to be encoded using the G.726 ADPCM codec |
| 1 | Audio is to be encoded using the IMA/DVI ADPCM codec |
| 2 | Audio is to be encoded using the Opus codec |
| 3 - 255 | Reserved for future use |

...

| Value | Description |
|---|---|
| 0 | Disable audio streaming |
| 1 | Enable audio streaming |
| 2 - 255 | Reserved for future use |

...

When the Audio Enable setting is set to 0x01, audio data shall start to be sent via the Audio Data characteristic, provided notifications are enabled for that characteristic.
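
As an informative sketch, an RVS Client might build and parse the Audio Control value as shown below. The class and function names are hypothetical. The octet order (Audio Encoding first, Audio Enable second) is an assumption taken from the order in which the fields are listed; it is not stated explicitly in this section.

```python
from enum import IntEnum


class AudioEncoding(IntEnum):
    G726_ADPCM = 0
    IMA_DVI_ADPCM = 1
    OPUS = 2


class AudioEnable(IntEnum):
    DISABLE = 0
    ENABLE = 1


def encode_audio_control(encoding: AudioEncoding, enable: AudioEnable) -> bytes:
    """Pack the two uint8 fields; field order is an assumption (see above)."""
    return bytes([encoding, enable])


def decode_audio_control(value: bytes) -> tuple[AudioEncoding, AudioEnable]:
    """Unpack the value; reserved field values raise ValueError."""
    if len(value) != 2:
        raise ValueError("expected a 2-octet Audio Control value")
    return AudioEncoding(value[0]), AudioEnable(value[1])
```

Under these assumptions, a client that wants IMA/DVI ADPCM audio would write `encode_audio_control(AudioEncoding.IMA_DVI_ADPCM, AudioEnable.ENABLE)` after enabling notifications on the Audio Data characteristic.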

...

The Audio Response characteristic is used to allow the RVS Client to provide a reason to the RVS Server for not initiating a voice session.
The Audio Response characteristic is optional.
The characteristic UUID shall be set to **.

...

| Session Response Enum | Value | Description |
|---|---|---|
| Busy | 0x1 | The device is performing other tasks that prevent it from beginning a voice session. For example, another voice session from a different device may already be in progress, or a captive firmware update may be in progress. |
| Voice Server Not Ready | 0x2 | An associated endpoint to send voice data to is not available for a voice session. |
| Not Supported | 0x3 | Device does not support voice control. |
| Failure | 0x4 | An unspecified failure has occurred on the device preventing it from beginning a voice session. |
| Reserved | 0x5-0xFF | Reserved for future use. |

...

If the RVS Server supports the Audio Response characteristic, the RVS Client MAY write the Audio Response characteristic if it is unable to accept notifications on the Audio Data characteristic.
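
As an informative sketch, the Session Response values could be modelled as an enumeration. The names below are hypothetical, and the single-octet encoding is an assumption based on the 0x1 to 0xFF value range; this section does not state the format of the characteristic value.

```python
from enum import IntEnum
from typing import Optional


class AudioResponse(IntEnum):
    BUSY = 0x1
    VOICE_SERVER_NOT_READY = 0x2
    NOT_SUPPORTED = 0x3
    FAILURE = 0x4


def encode_audio_response(reason: AudioResponse) -> bytes:
    """Value the RVS Client would write to reject a voice session."""
    return bytes([reason])


def decode_audio_response(value: bytes) -> Optional[AudioResponse]:
    """Return the reason, or None for a value that is reserved or unassigned."""
    if len(value) != 1:
        raise ValueError("expected a 1-octet Audio Response value")
    try:
        return AudioResponse(value[0])
    except ValueError:
        return None
```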

...

The Keyword Detect characteristic is used to allow the RVS Server to provide information to the RVS Client about the keyword detection that initiated the voice session.
The Audio Keyword Detect characteristic is optional.
The characteristic UUID shall be set to:
TBD

...

A Client Characteristic Configuration descriptor shall be included in the Keyword Detect characteristic.

...

| Field | Size | Description |
|---|---|---|
| Pre-Keyword Sample Qty | 4 Octets | The number of samples available prior to the start of the keyword. |
| Keyword Sample Qty | 4 Octets | The number of samples contained in the keyword. |
| Estimated Direction of Arrival | 2 Octets | The angle from which the audio is being captured (0-360 deg). |
| Standard Search Point | 1 Octet | The sensitivity of the first level keyword detector. |
| High Search Point | 1 Octet | The sensitivity of the second level keyword detector. This value can be set to 0xFF if the second level keyword detector does not exist or is disabled. |
| Dynamic Gain | 1 Octet | The amount of gain applied to the audio stream. |

...

If the RVS Server supports the Audio Keyword Detect characteristic and the audio session was initiated by a keyword, the RVS Server MAY send a notification on the Audio Keyword Detect characteristic before it sends notifications on the Audio Data characteristic.
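
As an informative sketch, a 13-octet Keyword Detect value could be unpacked as shown below. The names are hypothetical. Two assumptions are made that this section does not confirm: the fields are packed in the listed order without padding, and every field is unsigned. Little-endian order follows the general rule stated earlier for all characteristics.

```python
import struct
from typing import NamedTuple


class KeywordDetect(NamedTuple):
    pre_keyword_sample_qty: int
    keyword_sample_qty: int
    estimated_doa_deg: int
    standard_search_point: int
    high_search_point: int
    dynamic_gain: int


# 4 + 4 + 2 + 1 + 1 + 1 = 13 octets, little endian, no padding.
_KEYWORD_STRUCT = struct.Struct("<IIHBBB")


def parse_keyword_detect(value: bytes) -> KeywordDetect:
    """Unpack a Keyword Detect notification (layout assumed, see above)."""
    if len(value) != _KEYWORD_STRUCT.size:
        raise ValueError(f"expected {_KEYWORD_STRUCT.size} octets, got {len(value)}")
    return KeywordDetect(*_KEYWORD_STRUCT.unpack(value))
```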

...

The Audio Beamformer Data characteristic is used to allow the RVS Server to provide information about beamformers used in the voice session.
The Audio Beamformer Data characteristic is optional.
The characteristic UUID shall be set to:
TBD

...

| Field | Size | Description |
|---|---|---|
| Beam Data 1 | 5 Octets | Beam 1 data. |
| Beam Data 2 | 5 Octets | Beam 2 data. |
| Beam Data 3 | 5 Octets | Beam 3 data. |
| Beam Data 4 | 5 Octets | Beam 4 data. |

...

| Field | Size (bits) |
|---|---|
| Beam Description | 8 |
| Confidence | 16 |
| Signal Noise Ratio | 16 |

...

| Bits | Field |
|---|---|
| 0-3 | Reserved |
| 4 | Selected |
| 5 | Triggered |
| 6-7 | Angle (0, 90, 270, 360) |

...

A Client Characteristic Configuration descriptor shall be included in the Beamformer Data characteristic.

...

If the RVS Server supports the Beamformer Data characteristic and the audio session was initiated by a keyword, the RVS Server MAY send a notification on the Beamformer Data characteristic before it sends notifications on the Audio Data characteristic.
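
As an informative sketch, the 20-octet Beamformer Data value (four 5-octet Beam Data fields) could be unpacked as shown below. The names are hypothetical. Several assumptions are made that this section does not confirm: the Beam Description octet comes first, followed by Confidence and then Signal Noise Ratio; both 16-bit fields are unsigned; and bit 0 is the least significant bit. The 2-bit angle field is returned as a raw code because the mapping from code to angle is not spelled out.

```python
import struct
from typing import NamedTuple


class Beam(NamedTuple):
    selected: bool
    triggered: bool
    angle_code: int  # raw value of bits 6-7; not converted to degrees
    confidence: int
    snr: int


# 1-octet description + two 16-bit fields = 5 octets per beam.
_BEAM_STRUCT = struct.Struct("<BHH")
BEAM_COUNT = 4


def parse_beamformer_data(value: bytes) -> list[Beam]:
    """Split the value into four beams and decode each description byte."""
    if len(value) != BEAM_COUNT * _BEAM_STRUCT.size:
        raise ValueError("expected a 20-octet Beamformer Data value")
    beams = []
    for description, confidence, snr in _BEAM_STRUCT.iter_unpack(value):
        beams.append(
            Beam(
                selected=bool(description & 0x10),    # bit 4
                triggered=bool(description & 0x20),   # bit 5
                angle_code=(description >> 6) & 0x03,  # bits 6-7
                confidence=confidence,
                snr=snr,
            )
        )
    return beams
```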

...

The Audio Data characteristic is used to send data from an RVS Server to an RVS Client.
Only a single instance of this characteristic shall exist as part of the RDK Voice Service. The characteristic UUID shall be set to:
0000EA03-BDF0-407C-AAFF-D09967F31ACD

...

A Client Characteristic Configuration descriptor shall be included in the Audio Data characteristic.

...

The Audio Data Characteristic value consists of 20 octets of encoded audio data.
When the audio data streaming is enabled via the Audio Control Characteristic and notifications are enabled in the Client Characteristic Configuration descriptor then 20 octets of Audio Data shall be sent as a GATT notification.
Audio data shall be stored in frames. Each frame is a multiple of 20 octets and is segmented and sent in order as GATT notifications. The format and size of the frame depend on the codec used for encoding the audio data. Only complete frames shall be sent, in fixed-size batches of GATT notifications. If the Remote Control Device has to discard data (e.g., in noisy environments), it shall discard a complete audio frame.

...

| Octet | Description |
|---|---|
| 0 | Frame sequence number |
| 1 | Index into stepsize table for the start of the frame |
| 2-3 | Predicted value of first sample from the previous frame (little endian byte order) |
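
As an informative sketch, the 4-octet frame metadata could be read from the start of a frame as shown below. The names are hypothetical. Treating the predicted value as a signed 16-bit integer is an assumption, based on ADPCM predictors normally being signed PCM samples; the table itself only gives the byte order.

```python
import struct
from typing import NamedTuple


class FrameHeader(NamedTuple):
    sequence_number: int
    step_index: int
    predicted_value: int


# uint8 sequence number, uint8 step index, int16 predictor (little endian).
_HEADER_STRUCT = struct.Struct("<BBh")


def parse_frame_header(frame: bytes) -> FrameHeader:
    """Read the metadata from the first four octets of an audio frame."""
    if len(frame) < _HEADER_STRUCT.size:
        raise ValueError("frame is shorter than the 4-octet metadata")
    return FrameHeader(*_HEADER_STRUCT.unpack_from(frame))
```

The decoder would seed its ADPCM state with `step_index` and `predicted_value` before decoding the remaining octets of the frame, so that a discarded frame does not corrupt the frames that follow.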

...

Audio data is always buffered in 100 octet frames, and the device shall ensure that only complete frames are sent (as 5 x 20 octet GATT notifications). No partial frames shall ever be transferred. This ensures that the STB host can always determine where an audio frame starts and ends.
The RVS Server shall be able to buffer a minimum of 2 audio frames.
If frame buffers are exhausted then the complete frame shall be discarded; however, the frame sequence number should still be incremented for the next encoded frame.
All buffered content shall be purged when a disconnection event occurs, when audio streaming is disabled via the Audio Control Characteristic, or when notifications are disabled for the characteristic.
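
As an informative sketch, the segmentation rule above and a matching client-side reassembler could look like this. All names are hypothetical. The gap calculation assumes that the sequence number is the single octet at the start of each frame, that it increases by one per encoded frame, and that it wraps from 255 to 0; only the first of these is taken from the frame metadata table.

```python
from typing import Optional

FRAME_SIZE = 100
NOTIFICATION_SIZE = 20


def segment_frame(frame: bytes) -> list[bytes]:
    """Split one complete frame into 5 x 20-octet notification payloads."""
    if len(frame) != FRAME_SIZE:
        raise ValueError("only complete 100-octet frames may be sent")
    return [frame[i:i + NOTIFICATION_SIZE] for i in range(0, FRAME_SIZE, NOTIFICATION_SIZE)]


class FrameReassembler:
    """Collects notifications into frames and counts discarded frames."""

    def __init__(self) -> None:
        self._buffer = bytearray()
        self._last_seq: Optional[int] = None
        self.lost_frames = 0

    def push(self, notification: bytes) -> Optional[bytes]:
        """Add one notification; return a frame once five have arrived."""
        if len(notification) != NOTIFICATION_SIZE:
            raise ValueError("expected a 20-octet notification")
        self._buffer += notification
        if len(self._buffer) < FRAME_SIZE:
            return None
        frame = bytes(self._buffer)
        self._buffer.clear()
        seq = frame[0]
        if self._last_seq is not None:
            # A jump in the sequence number means the server discarded frames.
            self.lost_frames += (seq - self._last_seq - 1) % 256
        self._last_seq = seq
        return frame

    def reset(self) -> None:
        """Purge state on disconnect or when streaming is disabled."""
        self._buffer.clear()
        self._last_seq = None
```

Because the server never sends partial frames, the reassembler can rely on a simple count of notifications to find frame boundaries, provided `reset()` is called whenever streaming stops.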
