VAS_VOD Audio Transcoding

Last update: 2022-04-13 13:26:58

1 VAS Intro

1.1 Brief Introduction

Common audio encapsulation formats in media streaming include MP3, AAC, WMA, and OGG. Each format has its own characteristics and suits different business scenarios.

To help customers reduce technical investment and adapt to different application scenarios and terminals, CDNetworks provides an audio transcoding feature that converts between MP3, AAC, WMA, and other audio encapsulation formats, in any combination of input and output.

Audio transcoding includes the following features: audio format conversion, audio transrating, and audio channel and sampling rate settings.

1.2 Applicable Product Line

• Media Acceleration

1.3 Application Scenarios

  • Converting audio input in multiple formats into MP3 output, for cross-platform and cross-terminal playback.
  • Producing audio-only output from a video or audio input file by configuring audio transcoding, for audio playback on weak networks.
  • Inserting an audio clip to protect copyright. For example, a customer may want to insert a clip such as an advertisement or an audio identifier (a kind of watermark). In normal playback, when the audio file is requested by the original app or player, the clip is not played. In the case of hotlinking, where the audio file is usually requested by a third-party player, the full audio file (with the clip inserted) is played. CDNetworks offers a service for inserting such identifying audio clips.
  • Volume settings that keep the volume consistent across different episodes of a series. Writing the volume parameters into an audio file requires transcoding, and the parameters must comply with a unified standard that players recognize, so that players read them and play different files at a consistent volume.

2 Feature Detail

Audio transcoding supports common video/audio encapsulation formats and encoding formats:
(1) Video formats: MP4, FLV, M3U8, TS, MKV, MOV, WMV, AVI, VP8, VP9, RealVideo, Windows Media Video, etc.
(2) Audio formats: AAC, AC-3, MP1, MP2, MP3, PCM, RealAudio, Windows Media Audio (WMA), OGG, etc.
(3) Encoding Formats: H.265, H.264, H.263, MPEG, etc.

2.1 Working Process

Figure 1 Audio Transcoding Working Process

(1) The customer calls the cloud storage API to upload the audio file and specifies a transcoding instruction;
(2) After receiving the transcoding task, cloud storage assembles the customer's transcoding instructions and sends them to cloud transcoding;
(3) Cloud transcoding fetches the audio file from cloud storage, transcodes it, and uploads the processed audio file back to cloud storage;
(4) Cloud storage notifies the customer of the transcoded audio address through an API callback;
(5~8) For customers who have enabled the CDN acceleration service, their users use the audio address that cloud storage sent to the customer to send requests to CDN PoPs and receive responses.

2.2 Instruction

When customers upload a video or audio file, they can set the transcoding parameters in the same request, so there is no need to call the API separately.

If a customer has not enabled the audio transcoding service, the audio transcoding parameters have no effect even if they are submitted through the API.

The audio transcoding feature can be used together with VoD video transcoding, VoD transmuxing, and VoD file processing; that is, multiple features can be combined by concatenating operations in a single API command.

2.3 Interface Description

2.3.1 Request Description

POST /fops

Table 1 Request Description

Parameter Required Description
Host Yes Management domain name, used to perform file operations such as audio and video processing and file deletion. It can be obtained from CDNetworks Nova once the service application is approved.
Authorization Yes Access token, used to verify the legitimacy of requests to the resource management interface. It is generated from AK, SK, Path, and Body.

Table 2 Access Token Authorization Parameter Description

Parameter Required Description
AK Yes Access key. It can be obtained from CDNetworks Nova once the service application is approved.
SK Yes Security key. It can be obtained from CDNetworks Nova once the service application is approved.
Path Yes Request path of the operation type; for example, the path for video transcoding is /fops.
Body Yes Request content, described in detail below.

AK, SK, and the host (management domain) can be obtained from CDNetworks Nova.

2.3.2 Parameter Description

Request parameters are submitted in the request content as key=value pairs joined with “&”, as in the example in 2.3.4, where the bucket, key, and fops values are Urlsafe_Base64_Encode values.

Table 3 Request Parameter Description

Parameter Required Description
bucket Yes Space name: the space where the original file is located.
key Yes File name of the original file.
fops Yes List of processing operations. For the meaning of the parameters, refer to the audio/video processing Ops parameter format, which describes transcoding parameters such as transrating, resolution conversion, frame rate conversion, watermark adding, and subtitle adding. One request can carry multiple processing operations; use “;” to separate the operations in the list.
notifyURL No URL that receives the processing result notification. The notification includes detailed information about the processed content, such as video bitrate and duration. Refer to the data notification content description.
force No Whether to force data processing. Allowed values: 0: if the specified processing result already exists, return a status indicating that the file already exists and skip processing, which avoids wasting resources on repeated processing; 1: force data processing and overwrite the existing file. The default value is 0.
separate No Whether to send the processing result notifications separately. Allowed values: 0: notify notifyURL once all transcoding instructions have been executed; 1: notify notifyURL each time a single transcoding instruction has been executed. The default value is 0.
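
The request content built from these parameters can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name build_fops_body is hypothetical and not part of any CDNetworks SDK, bucket, key, and fops are assumed to be URL-safe Base64 encoded as in the example in 2.3.4, and the field name separate follows the spelling used in the curl command there. notifyURL is left out because this section does not show how its value is encoded.

```python
import base64


def _b64(text: str) -> str:
    """URL-safe Base64 of a UTF-8 string, padding kept."""
    return base64.urlsafe_b64encode(text.encode("utf-8")).decode("ascii")


def build_fops_body(bucket: str, key: str, fops: str,
                    force: int = 0, separate: int = 0) -> str:
    """Return the POST body for /fops as key=value pairs joined by "&"."""
    if force not in (0, 1) or separate not in (0, 1):
        raise ValueError("force and separate must be 0 or 1")
    fields = [
        ("bucket", _b64(bucket)),      # space of the original file
        ("key", _b64(key)),            # file name of the original file
        ("fops", _b64(fops)),          # operations, separated by ";"
        ("force", str(force)),         # 1 overwrites an existing result
        ("separate", str(separate)),   # 1 notifies after each operation
    ]
    return "&".join(f"{name}={value}" for name, value in fields)


print(build_fops_body("vod-wcs-test001", "test.mp4", "avthumb/mp3/ab/64k"))
```

Both defaults are 0, matching the default values listed in Table 3.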

Fill in the audio transcoding parameters in fops according to actual requirements. Multiple operations can be joined with “;” and submitted together as the fops parameter; in other words, fops = Urlsafe_Base64_Encode(operation 1;operation 2;…).
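
A minimal Python sketch of this encoding step is shown here. The helper names urlsafe_base64_encode and encode_fops are hypothetical, and the sketch assumes that Urlsafe_Base64_Encode is standard Base64 with "+" and "/" replaced by "-" and "_" and with padding kept, which is consistent with the encoded values in the example in 2.3.4. The two operation strings are made up for illustration.

```python
import base64
from typing import Sequence


def urlsafe_base64_encode(text: str) -> str:
    """URL-safe Base64 of a UTF-8 string ("+" and "/" become "-" and "_")."""
    return base64.urlsafe_b64encode(text.encode("utf-8")).decode("ascii")


def encode_fops(operations: Sequence[str]) -> str:
    """Join the operations with ";" and encode the whole list once."""
    return urlsafe_base64_encode(";".join(operations))


# Hypothetical operation strings, for illustration only.
ops = ["avthumb/mp3/ab/64k", "avthumb/aac/ab/128k"]
print(encode_fops(ops))
```

The operations are joined first and encoded once, so the ";" separators are part of the encoded value rather than separators between encoded pieces.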

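Write the operation as avthumb, then the destination format, then the parameters from Table 4 as /name/value pairs, optionally followed by |saveas/ and its value, as in avthumb/mp3/acodec/libmp3lame/ab/64k from the example in 2.3.4. Then put the Urlsafe_Base64_Encode value of the whole string in the fops parameter.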

Table 4 Audio Transcoding Parameters

Parameter Required Description
avthumb Yes Operation type: audio/video processing.
/acodec/ Yes Audio encoding scheme. Supported schemes include libmp3lame, libfaac, and libvorbis. The value copy is also supported and preserves the original audio encoding scheme.
/aq/ No Audio quality. Range: 0-9 for mp3, where a smaller value means higher quality; 10-500 for aac, where a larger value means higher quality. Only mp3 and aac are supported. It cannot be used together with the audio bitrate parameter ab.
/ab/ No Audio bitrate, in bits per second (bit/s). Common bitrates: 64k, 128k, 192k, 256k, 320k.
/ar/ No Audio sampling rate, in hertz (Hz). Common sampling rates: 8000, 11025, 22050, 44100. Note: flv only supports 44100, 22050, and 11025.
/ac/ No Number of audio channels: 1 is mono, 2 is stereo.
|saveas/bucket:filekey No Save the output as a designated file. Fill in the Urlsafe_Base64_Encode value of “Space:File name”.
Audio output only Yes Destination audio format to be output, such as mp3, aac, or wma. It directly follows avthumb/ (for example, avthumb/mp3).
Video output Yes Destination video format to be output, such as flv, mp4, or m3u8.
/an/ No Whether to remove the audio stream: 0 preserves it and 1 removes it. The default value is 0.
/vn/ No Whether to remove the video stream: 0 preserves it and 1 removes it. The default value is 0.
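
To illustrate how these parameters combine into one operation string, here is a hedged Python sketch. The function build_audio_op and its argument names are hypothetical, and the parameter order simply follows the example in 2.3.4; this section does not state whether a different order is accepted. The two checks mirror the constraints in Table 4: aq cannot be combined with ab, and flv only supports three sampling rates.

```python
import base64
from typing import Optional


def build_audio_op(fmt: str, acodec: str, *,
                   aq: Optional[int] = None, ab: Optional[str] = None,
                   ar: Optional[int] = None, ac: Optional[int] = None,
                   save_bucket: Optional[str] = None,
                   save_key: Optional[str] = None) -> str:
    """Assemble one avthumb operation from the Table 4 parameters."""
    if aq is not None and ab is not None:
        raise ValueError("aq cannot be used together with ab")
    if fmt == "flv" and ar is not None and ar not in (44100, 22050, 11025):
        raise ValueError("flv only supports 44100, 22050 and 11025 Hz")
    parts = ["avthumb", fmt, "acodec", acodec]
    # Optional parameters are appended as /name/value pairs when set.
    for name, value in (("aq", aq), ("ab", ab), ("ar", ar), ("ac", ac)):
        if value is not None:
            parts += [name, str(value)]
    op = "/".join(parts)
    if save_bucket is not None and save_key is not None:
        # saveas takes the URL-safe Base64 value of "Space:File name".
        target = f"{save_bucket}:{save_key}".encode("utf-8")
        op += "|saveas/" + base64.urlsafe_b64encode(target).decode("ascii")
    return op


print(build_audio_op("mp3", "libmp3lame", ab="64k",
                     save_bucket="vod-wcs-test001",
                     save_key="test_audio.mp3"))
```

The returned string is the plain operation; it still has to be passed through Urlsafe_Base64_Encode before being sent as fops.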

2.3.3 Response Description

  • If the request succeeds, the following JSON string is returned:

{ "persistentId": <persistentId> }

  • If the request fails, the following JSON string is returned:

{
"code": "<code string>",
"message": "<ErrMsg string>"
}


Table 5 Response Parameter Description

Field name Required Description
persistentId Yes ID of the upload preprocessing task or of the persistent processing task that was triggered.
code Yes HTTP request response code. Please refer to HTTP response status codes.
message Yes Error message describing the transcoding failure.
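
A caller might tell the two response shapes apart as in the Python sketch below. The function name parse_fops_response and the sample values are hypothetical; the sketch only assumes that a successful response carries persistentId and a failed one carries code and message, as described in Table 5.

```python
import json


def parse_fops_response(text: str) -> str:
    """Return persistentId on success; raise RuntimeError on failure."""
    data = json.loads(text)
    if "persistentId" in data:
        return str(data["persistentId"])
    raise RuntimeError(
        f"fops failed: code={data.get('code')}, message={data.get('message')}"
    )


# Made-up response bodies, for illustration only.
print(parse_fops_response('{"persistentId": "abc123"}'))
try:
    parse_fops_response('{"code": "400", "message": "invalid fops"}')
except RuntimeError as err:
    print(err)
```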

2.3.4 Example

This example removes the video stream from the test.mp4 video in the vod-wcs-test001 space to keep only the audio, converts the audio encoding scheme to libmp3lame at a bitrate of 64k, and saves the result in the vod-wcs-test001 space as test_audio.mp3.

(1) Step 1: Obtain AK and SK from CDNetworks Nova.
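(2) Step 2: The path for audio transcoding is /fops.
(3) Step 3: The request content Body is as follows.

Before encoding:

bucket=vod-wcs-test001&key=test.mp4&fops=avthumb/mp3/acodec/libmp3lame/ab/64k|saveas/dm9kLXdjcy10ZXN0MDAxOnRlc3RfYXVkaW8ubXAz&force=1&separate=1

After encoding:

bucket=dm9kLXdjcy10ZXN0MDAx&key=dGVzdC5tcDQ=&fops=YXZ0aHVtYi9tcDMvYWNvZGVjL2xpYm1wM2xhbWUvYWIvNjRrfHNhdmVhcy9kbTlrTFhkamN5MTBaWE4wTURBeE9uUmxjM1JmWVhWa2FXOHViWEF6&force=1&separate=1
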
(4) Step 4: Use AK, SK, Path, and Body to generate the access token Authorization.

(5) Step 5: Obtain the management domain name from CDNetworks Nova. In the command in Step 6, mgrDomain stands for the management domain name.

(6) Step 6: Execute the command.

curl -v -X POST -d "bucket=dm9kLXdjcy10ZXN0MDAx&key=dGVzdC5tcDQ=&fops=YXZ0aHVtYi9tcDMvYWNvZGVjL2xpYm1wM2xhbWUvYWIvNjRrfHNhdmVhcy9kbTlrTFhkamN5MTBaWE4wTURBeE9uUmxjM1JmWVhWa2FXOHViWEF6&force=1&separate=1" -H "Authorization:mgrAuthorization_A:mgrAuthorization_B" --url "http://mgrDomain/fops"
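
For callers who prefer a script over curl, the same request could be prepared with Python's standard library, as sketched below. The function name build_request is hypothetical, and mgrDomain and the Authorization value are the placeholders used in the command above, not real values, so the sketch only builds the request object and does not send it.

```python
import urllib.request

MGR_DOMAIN = "mgrDomain"  # placeholder for the management domain name
AUTHORIZATION = "mgrAuthorization_A:mgrAuthorization_B"  # placeholder token
BODY = (
    "bucket=dm9kLXdjcy10ZXN0MDAx&key=dGVzdC5tcDQ="
    "&fops=YXZ0aHVtYi9tcDMvYWNvZGVjL2xpYm1wM2xhbWUvYWIvNjRrfHNhdmVhcy9k"
    "bTlrTFhkamN5MTBaWE4wTURBeE9uUmxjM1JmWVhWa2FXOHViWEF6"
    "&force=1&separate=1"
)


def build_request(mgr_domain: str, authorization: str,
                  body: str) -> urllib.request.Request:
    """Prepare the POST /fops request without sending it."""
    return urllib.request.Request(
        url=f"http://{mgr_domain}/fops",
        data=body.encode("ascii"),
        headers={"Authorization": authorization},
        method="POST",
    )


request = build_request(MGR_DOMAIN, AUTHORIZATION, BODY)
print(request.get_method(), request.full_url)
# With a real domain and token, urllib.request.urlopen(request) would send it.
```

When a request with a body and no explicit Content-Type is sent through urlopen, urllib uses application/x-www-form-urlencoded, which is also what curl sends for -d.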

Figure 2 Audio Transcoding Command Execution Process

Finally, check for the generated test_audio.mp3 in the vod-wcs-test001 space.

3 Notice

Audio transcoding only processes video/audio files stored on CDNetworks media cloud storage, so Media Acceleration and the media cloud storage feature must both be enabled.
