Demand for streaming capabilities among video content distributors is at an all-time high. Consumers and developers alike are racing to find and distribute the best content at their disposal. Unfortunately, this high demand for video content is often undermined by a lack of security around original content. As a result, creators and distributors alike find themselves needing to protect their work. Enter DRM technologies: what are they, and how do they work?
Digital Rights Management (DRM) refers to the algorithms and processes created to enforce copyright compliance when video content is consumed. Without DRM, content can be easily copied, so it is a necessary part of an online video distribution architecture, even though it is not visible to the consumer. DRM is also used offline to provide copyright protection for CDs, DVDs, and Blu-rays.
The most common DRM technologies are:
- FairPlay: developed by Apple; uses Cipher Block Chaining (CBC) encryption. It is the only option for Safari and is used only on Apple devices.
- Widevine: developed by Widevine Technologies, which was later acquired by Google. Used natively on Android devices and in Chrome, Edge (soon), Roku, and smart TVs. Uses the protobuf format for metadata.
- PlayReady: developed and maintained by Microsoft. Supported on Windows and on most set-top boxes and TVs. Uses WRMHEADER objects as its metadata format.
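As a rough illustration of how a player might turn the list above into a lookup, here is a minimal sketch. The EME key system strings and the platform table are not taken from this article: the key system strings are the identifiers commonly used in browsers, and `PLATFORM_TO_DRM` and `pick_drm` are hypothetical names for a deliberately simplified mapping.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DrmSystem:
    name: str
    key_system: str  # identifier a browser player would pass to EME


# Key system strings are assumptions added for illustration.
DRM_SYSTEMS = {
    "fairplay": DrmSystem("FairPlay", "com.apple.fps.1_0"),
    "widevine": DrmSystem("Widevine", "com.widevine.alpha"),
    "playready": DrmSystem("PlayReady", "com.microsoft.playready"),
}

# Hypothetical, simplified platform table based on the list above.
PLATFORM_TO_DRM = {
    "safari": "fairplay",
    "ios": "fairplay",
    "chrome": "widevine",
    "android": "widevine",
    "roku": "widevine",
    "windows": "playready",
}


def pick_drm(platform: str) -> DrmSystem:
    """Return the DRM system this sketch associates with a platform."""
    try:
        return DRM_SYSTEMS[PLATFORM_TO_DRM[platform.lower()]]
    except KeyError:
        raise ValueError(f"no DRM mapping for platform: {platform}") from None


print(pick_drm("Safari").name)
print(pick_drm("android").key_system)
```

Real devices often support more than one system, so a production player would usually probe for support at runtime rather than rely on a static table like this one.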
Additional DRM types can be seen in Irdeto’s graphic below:
Image source: Data Rights Management Basics Webinar ft. Irdeto & Irdeto.com
This segmented market of DRM systems is matched by equally fragmented adoption, as indicated by our 2021 Video Developer Report. The following graph shows how DRM systems are currently used within the developer community:
What DRM/Content Protection Systems do you use?
DRM can currently be implemented in software, in hardware, or both. Regardless of the implementation type, all providers seeking to protect their content will see their files pass through an encryption and decryption cycle (as seen below).
The Encryption
To begin the “security” cycle, communications between the requesting playback software and the license server are encrypted. Each segment is encrypted according to the MPEG Common Encryption (CENC) specification for ISO-BMFF and/or MPEG-TS streams, where either all content is encrypted or only selected parts of it (subsamples). CENC protection information is signaled in XML-style formats (for example, in an MPEG-DASH manifest) and requires at least a key and a key ID. CENC is also used for HLS if the segments are in an fMP4 container. Standard content encryption is done according to the Advanced Encryption Standard (AES), using 128-bit keys and a block cipher mode, usually either Counter Mode (CTR) or Cipher Block Chaining (CBC). The two modes differ in how a payload is encrypted.
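To make the key ID and XML signaling more concrete, the sketch below reads the protection information from a small MPEG-DASH manifest fragment. The manifest snippet, the key ID value, and the function name `read_content_protection` are made up for illustration; the element and attribute names follow the common DASH signaling convention for CENC, and the two `urn:uuid:` entries are the system IDs generally associated with Widevine and PlayReady.

```python
import xml.etree.ElementTree as ET

# Hypothetical manifest fragment; the default_KID is an invented value.
MPD_SNIPPET = """<MPD xmlns="urn:mpeg:dash:schema:mpd:2011"
     xmlns:cenc="urn:mpeg:cenc:2013">
  <Period>
    <AdaptationSet mimeType="video/mp4">
      <ContentProtection schemeIdUri="urn:mpeg:dash:mp4protection:2011"
                         value="cenc"
                         cenc:default_KID="00112233-4455-6677-8899-aabbccddeeff"/>
      <ContentProtection schemeIdUri="urn:uuid:edef8ba9-79d6-4ace-a3c8-27dcd51d21ed"/>
      <ContentProtection schemeIdUri="urn:uuid:9a04f079-9840-4286-ab92-e65be0885f95"/>
    </AdaptationSet>
  </Period>
</MPD>"""

NS = {"mpd": "urn:mpeg:dash:schema:mpd:2011"}
KID_ATTR = "{urn:mpeg:cenc:2013}default_KID"
GENERIC_SCHEME = "urn:mpeg:dash:mp4protection:2011"


def read_content_protection(mpd_xml: str) -> dict:
    """Collect the encryption scheme, default key ID, and DRM system IDs."""
    root = ET.fromstring(mpd_xml)
    info = {"scheme": None, "default_kid": None, "system_ids": []}
    for cp in root.iterfind(".//mpd:ContentProtection", NS):
        uri = cp.get("schemeIdUri", "").lower()
        if uri == GENERIC_SCHEME:
            # Generic CENC descriptor: carries the scheme and the key ID.
            info["scheme"] = cp.get("value")
            info["default_kid"] = cp.get(KID_ATTR)
        elif uri.startswith("urn:uuid:"):
            # One descriptor per DRM system able to license this content.
            info["system_ids"].append(uri[len("urn:uuid:"):])
    return info


protection = read_content_protection(MPD_SNIPPET)
print(protection["scheme"], protection["default_kid"])
print(protection["system_ids"])
```

Note that only the key ID appears in the manifest. The key itself is never published there; it reaches the client through the license exchange.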
It’s important to note that only the audio and video data within a segment is encrypted; the metadata is not. There are at least three DRM systems for video, most notably Widevine, FairPlay, and PlayReady. Which one applies can vary greatly based on many factors: having to select a system that matches the content distributor’s delivery and playback needs (based on which devices are supported) can introduce a lot of complexity to the DRM implementation process.
In order to improve security and decrease the risk of reverse engineering, DRM systems typically produce no clear log statements. In fact, parts of the process are treated as a black box, and as a result debugging can be even harder on devices such as smart TVs or set-top boxes with older versions of DRM software.
The content will then be decrypted by a Content Decryption Module (CDM), which decrypts each encrypted audio and video segment.
The Decryption
When a web player identifies protected content, it calls the processes and interfaces defined by Encrypted Media Extensions (EME), which browsers use to initiate the license request process. License requests are generated by the CDM and passed to the player through EME. All of the decryption work is done by the CDM; EME is simply the interface to the module. The CDM also updates its sessions when the player calls the appropriate function on the EME interface. The CDM lives at the operating system or browser level and handles the decryption of the segments. The playback client application never interfaces with the CDM directly, only through EME, and the decrypted content is available only to the CDM.
In order to decrypt protected content, the player or playback software initiates a request to the licensing server. This request must complete before the content can be decrypted or played back; if licenses are cached locally, it can happen well ahead of playback. License server information can be carried in the manifest (for example, an MPEG-DASH manifest or an HLS playlist), in the player’s configuration, or within the individual segments. A player-issued request must include a specific device signature, the signer data, and the content ID for the server to be able to grant the license. Although it is not a requirement, the request typically also includes authentication data from the requesting device. The authentication data may include information about the content’s decryption security level; for example, decrypting content in software is significantly less secure than decrypting it in hardware. Once all mandatory information is provided, the server may grant the player or playback software a license containing the decryption keys needed to allow secure playback of the requested content on the client.
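The mandatory and optional parts of a license request can be sketched as a small data structure with a server-side check. Everything here is hypothetical: the field names, the `decide` function, and the policy flag are illustrative only and do not reflect the request format of any particular DRM system, which is typically an opaque binary message.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class LicenseRequest:
    # Mandatory fields named in the text above.
    device_signature: Optional[str]
    signer_data: Optional[str]
    content_id: Optional[str]
    # Optional authentication data (hypothetical shape).
    auth_token: Optional[str] = None
    security_level: str = "software"  # "software" or "hardware"


REQUIRED_FIELDS = ("device_signature", "signer_data", "content_id")


def missing_fields(request: LicenseRequest) -> List[str]:
    """List the mandatory fields that are absent or empty."""
    return [name for name in REQUIRED_FIELDS if not getattr(request, name)]


def decide(request: LicenseRequest, require_hardware: bool = False) -> Tuple[bool, str]:
    """Toy license decision: check mandatory data, then the security policy."""
    missing = missing_fields(request)
    if missing:
        return False, "missing: " + ", ".join(missing)
    if require_hardware and request.security_level != "hardware":
        return False, "hardware decryption required"
    return True, "granted"


request = LicenseRequest("sig-abc", "signer-xyz", "movie-42")
print(decide(request))
print(decide(request, require_hardware=True))
```

A policy such as `require_hardware` is one way a distributor might restrict higher-value content to devices with hardware-backed decryption.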
From the perspective of the content requester, license acquisition using EME starts with the playback client creating a key session unique to the client, the device, and the metadata found in the segments. The CDM then generates a signed key message. The client then sends the secured message to the license server. The license server returns its response: either the requested license, granting the client playback rights to the content, or a refusal, in which case playback is halted and an error is shown. In successful scenarios, the client updates the session data with the returned license. The content decryption is handled fully by the CDM. In some circumstances, the license is cached for a set time and can be used to play back protected content offline (e.g., Netflix). The license and the decrypted data must not be accessible to clients other than the licensed content requester. Therefore, the private keys and decrypted data are kept in a secure environment within the browser, operating system, and hardware (if supported), such as a Trusted Execution Environment.
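The sequence described above can be simulated end to end in a few lines. This is a toy model, not EME and not a real CDM: `ToyCdm`, `ToyLicenseServer`, and `acquire_license` are hypothetical names, and the shared-secret HMAC signature is a simplifying assumption standing in for the device credentials a real DRM system would use.

```python
import hashlib
import hmac
import json

DEVICE_SECRET = b"demo-device-secret"  # hypothetical; for illustration only


class ToyCdm:
    """Stands in for the CDM: signs key messages and holds licenses."""

    def __init__(self, secret: bytes):
        self._secret = secret
        self._licenses = {}

    def create_key_message(self, session_id: str, key_id: str) -> dict:
        payload = json.dumps({"session": session_id, "kid": key_id}, sort_keys=True)
        signature = hmac.new(self._secret, payload.encode(), hashlib.sha256).hexdigest()
        return {"payload": payload, "signature": signature}

    def update_session(self, session_id: str, license_data: dict) -> None:
        self._licenses[session_id] = license_data

    def can_play(self, session_id: str, now: float) -> bool:
        # A cached license stays usable until it expires.
        license_data = self._licenses.get(session_id)
        return license_data is not None and now < license_data["expires_at"]


class ToyLicenseServer:
    def __init__(self, secret: bytes, entitled_key_ids: set, ttl_seconds: float):
        self._secret = secret
        self._entitled = entitled_key_ids
        self._ttl = ttl_seconds

    def handle(self, message: dict, now: float) -> dict:
        expected = hmac.new(self._secret, message["payload"].encode(), hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, message["signature"]):
            return {"granted": False, "reason": "bad signature"}
        key_id = json.loads(message["payload"])["kid"]
        if key_id not in self._entitled:
            return {"granted": False, "reason": "not entitled"}
        return {"granted": True, "kid": key_id, "expires_at": now + self._ttl}


def acquire_license(cdm, server, session_id: str, key_id: str, now: float) -> bool:
    message = cdm.create_key_message(session_id, key_id)  # CDM signs the request
    response = server.handle(message, now)                # client sends it to the server
    if not response["granted"]:
        return False                                      # playback would halt here
    cdm.update_session(session_id, response)              # session updated with license
    return True


cdm = ToyCdm(DEVICE_SECRET)
server = ToyLicenseServer(DEVICE_SECRET, {"kid-1"}, ttl_seconds=60)
print(acquire_license(cdm, server, "session-1", "kid-1", now=1000.0))
print(cdm.can_play("session-1", now=1030.0))
```

The expiry timestamp models the cached license used for offline playback: once `now` passes `expires_at`, the toy CDM refuses to play until a new license is acquired.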
The use of different container formats, like fMP4 and MPEG-2 TS, has made it hard to distribute the same content across all platforms. However, the rapid adoption of CMAF and the standardization of CENC across hardware manufacturers and software developers are reducing implementation complexity for the industry. Although CMAF and CENC still allow both AES-CTR and AES-CBC, DRM providers are gradually converging on AES-CBC.