AAC (Advanced Audio Coding)
AAC (Advanced Audio Coding) is an audio compression standard that is part of MPEG-4. It is the more efficient successor of MP3 and is widely supported in all major browsers and devices, including TVs, gaming consoles, and streaming devices. AAC has a low-complexity version (AAC-LC) and a high-efficiency version (AAC-HE). AAC-LC is typically used to reach as many devices as possible, or to support specific stripped-down devices that do not support AAC-HE.
ABR (Adaptive Bitrate)
Adaptive Bitrate (ABR) streaming is a technique that enables a server to offer multiple resolutions and bitrates of the same video to the client. A video player or client application can then choose the version of the video that best fits the device, based on its capabilities, resolution, and network conditions. During playback, the client application can also switch between resolutions and bitrates as the available bandwidth changes, to reduce buffering and avoid a poor user experience.
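The selection logic can be sketched in a few lines of Python; the ladder and the 0.8 safety factor here are hypothetical, purely for illustration (real players such as hls.js or Shaka use far more sophisticated heuristics):

```python
# Hypothetical rendition ladder: (height, bitrate in kbps), highest first.
RENDITIONS = [
    (1080, 8000),
    (720, 4000),
    (480, 2000),
    (360, 1000),
]

def pick_rendition(measured_kbps, safety_factor=0.8):
    """Pick the highest-bitrate rendition that fits the measured bandwidth,
    leaving headroom so small throughput dips don't cause a stall."""
    budget = measured_kbps * safety_factor
    for height, kbps in RENDITIONS:      # ladder is sorted highest first
        if kbps <= budget:
            return height, kbps
    return RENDITIONS[-1]                # fall back to the lowest rung

print(pick_rendition(6000))  # -> (720, 4000)
```

During playback the player re-measures throughput and calls this kind of selection again for each segment, which is what produces the quality switches described above.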
AC-3 (Audio Codec 3)
AC-3 (Audio Codec 3) is an audio compression standard developed by Dolby. It is typically used by TV stations, DVDs, and Blu-rays, and is therefore supported by all major TVs. Not all browsers and connected devices support AC-3, which makes device compatibility an ongoing problem.
AOM (Alliance for Open Media)
The Alliance for Open Media (AOM) is a non-profit organization founded by Google, Amazon, Cisco, Intel, Microsoft, Mozilla, and Netflix. AOM focuses on providing royalty-free media technologies for the internet. Its most notable project is the AV1 video codec, which serves as the successor of VP9 and can achieve better visual quality than HEVC.
AV1 (AOMedia Video 1)
AOMedia Video 1 (AV1) is a royalty-free video codec standardized by AOM. AV1 is the successor of VP9 and can achieve even better quality than HEVC. Compared to HEVC and most codecs from MPEG, AV1 has been specifically designed and optimized for web-based delivery and is already supported in a variety of browsers and connected devices.
Read more: The State of AV1 Playback Support: 2022
AVC (Advanced Video Coding)
AVC stands for Advanced Video Coding, and it’s also known as H.264. It’s a video compression standard that was developed by the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG) in 2003.
It is a widely used video codec, supported by most consumer electronics such as Blu-ray players, smart TVs, and streaming devices, as well as most web browsers and mobile devices.
AVC defines several profiles, each with different capabilities and target applications: for example, the Baseline Profile for low-complexity applications, the Main Profile for standard-definition digital TV broadcasting, and the High Profile for high-definition digital TV broadcasting.
Aspect Ratio
The aspect ratio of a video refers to the proportional relationship between its width and height. A video’s aspect ratio can affect how it’s displayed on different devices, such as televisions, computer monitors, and mobile devices.
The most common aspect ratios for streaming videos are:
- 4:3 (1.33:1), which was common for standard-definition TV and older movies.
- 16:9 (1.78:1), which is the most common aspect ratio for high-definition TV and computer monitors, and is also used for most movies.
When a video’s aspect ratio doesn’t match the aspect ratio of the device on which it’s being displayed, the video will either have black bars on the top and bottom (letterbox), or on the left and right sides (pillarbox) to maintain the original aspect ratio.
For streaming videos, providers will often create multiple versions of their videos at different aspect ratios to ensure that the videos will look good on a wide range of devices. Some streaming services also provide an option to adjust the aspect ratio on the fly based on the user’s preference.
Read more: Wikipedia – Display Aspect Ratio
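The letterbox/pillarbox behavior follows directly from comparing the two aspect ratios. A minimal Python sketch (the dimensions used are illustrative):

```python
def fit_with_bars(video_w, video_h, screen_w, screen_h):
    """Scale a video to fit a screen while preserving its aspect ratio,
    and report whether letterbox (top/bottom) or pillarbox (side) bars
    are needed, plus the size of each bar in pixels."""
    scale = min(screen_w / video_w, screen_h / video_h)
    w, h = round(video_w * scale), round(video_h * scale)
    if h < screen_h:
        return "letterbox", (screen_h - h) // 2   # bar height top and bottom
    if w < screen_w:
        return "pillarbox", (screen_w - w) // 2   # bar width left and right
    return "exact fit", 0

# 4:3 content on a 16:9 screen needs pillarbox bars:
print(fit_with_bars(640, 480, 1920, 1080))  # -> ('pillarbox', 240)
```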
Beacon
In the context of video streaming, a beacon is a small piece of code embedded in a video player or web page. It is designed to send analytics data about the viewer’s interactions with the player, such as play, pause, seek, and stop events, to a server for analysis.
This data can be used to gain insights into how viewers interact with the video content, such as how long they watch a video, how far they are into the video, and what actions they take while watching, such as rewinding or fast-forwarding. This information can be valuable for content providers, as they can use it to improve the streaming experience and optimize their content.
Beacons can also be used to gather data on the viewer’s device, such as their device type, operating system, and browser. This information can be used to optimize the video playback for different devices and operating systems, and to identify any compatibility issues that need to be addressed.
Bitrate
A bitrate is a measure of the amount of data used to represent a unit of time in a video stream. It’s typically measured in “bits per second” (bps) or “kilobits per second” (kbps). In video streaming, the bitrate of a video stream directly affects its quality and the amount of data that needs to be transmitted over the internet.
A higher bitrate means that more data is used to represent a unit of time, and this results in higher video quality and larger file sizes. Conversely, a lower bitrate means that less data is used to represent a unit of time, which results in lower video quality and smaller file sizes.
When streaming videos, the available bandwidth is a major factor determining the video bitrate that can be used. Streaming providers often offer multiple versions of their videos at different bitrates, which allows the viewer to select a version appropriate for their device and internet connection. Adaptive bitrate streaming, a technique that automatically adjusts the video bitrate based on the viewer’s internet connection and device capabilities, is widely used to improve video quality and prevent buffering.
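The relationship between bitrate, duration, and data volume is simple arithmetic, which this short Python sketch illustrates (the 4000 kbps figure is just an example):

```python
def stream_size_mb(bitrate_kbps, duration_s):
    """Approximate data transferred for a stream: bitrate x duration.
    kbps -> bytes/s: multiply by 1000, divide by 8; result in megabytes."""
    return bitrate_kbps * 1000 / 8 * duration_s / 1_000_000

# A 2-hour movie streamed at a constant 4000 kbps:
print(round(stream_size_mb(4000, 2 * 3600)))  # -> 3600 (MB)
```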
Breakdown Data
In the context of video streaming, breakdown data refers to detailed information that breaks down various aspects of a video stream, such as bitrate, resolution, and frame rate, for a specific period of time.
Breakdown data can provide detailed insights into how a video stream changes over time, such as how the bitrate and resolution vary throughout the video, and how these changes affect the overall quality of the video stream. This information can be used to identify any problems or issues that may be affecting the streaming experience, such as buffering or dropped frames.
Additionally, breakdown data can be used to optimize the video streaming experience by adjusting encoding parameters such as resolution, bitrate, or frame rate in order to achieve the best balance of video quality and network usage; this is commonly known as adaptive bitrate streaming.
Buffering
When a video buffers, it means that the video player is temporarily pausing the playback of the video in order to load more video data into the buffer. This typically happens when the internet connection is slow or unstable, and there isn’t enough data in the buffer to sustain the playback at the current video bitrate.
The player stops playback, displays a loading icon (often a spinning wheel or buffer bar), and waits until a certain amount of data has accumulated in the buffer before resuming playback. The slower the internet connection, the longer buffering takes and the more often it interrupts playback.
There are a number of reasons why buffering might occur, including:
- A slow internet connection: If the internet connection is slow, the video player may not be able to download video data fast enough to keep up with the playback.
- Network congestion: If there is a lot of traffic on the network, there may not be enough bandwidth available for the video player to download video data quickly.
- Server overload: If the server hosting the video is under heavy load, it may not be able to provide the video data quickly enough to keep up with the playback.
- High bitrate: The video bitrate that is being used may be too high for the viewer’s internet connection to handle, causing buffering to occur.
Adaptive bitrate streaming (ABR) and per-title encoding, in combination with a Content Delivery Network (CDN), typically mitigate these issues.
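A toy Python model shows why a bitrate above the available bandwidth inevitably drains the buffer and causes stalls; the one-second tick and the 5-second startup buffer are simplifying assumptions, not how real players work:

```python
def simulate_playback(bitrate_kbps, bandwidth_kbps, duration_s, startup_buffer_s=5):
    """Toy model: the player downloads at bandwidth_kbps and consumes at
    bitrate_kbps; count the seconds of stalling over the session."""
    buffered = startup_buffer_s              # seconds of video in the buffer
    stalls = 0
    for _ in range(duration_s):
        buffered += bandwidth_kbps / bitrate_kbps  # video seconds downloaded this tick
        if buffered >= 1:
            buffered -= 1                    # one second of playback consumed
        else:
            stalls += 1                      # not enough data: playback pauses
    return stalls

# Bandwidth below the bitrate eventually drains the buffer and causes stalls:
print(simulate_playback(bitrate_kbps=4000, bandwidth_kbps=3000, duration_s=60))
# With headroom, the buffer only grows and playback never stalls:
print(simulate_playback(bitrate_kbps=4000, bandwidth_kbps=5000, duration_s=60))
```

This is exactly the situation ABR avoids: instead of stalling, the player would drop to a lower rung of the bitrate ladder.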
CBR (Constant Bitrate)
Constant Bitrate (CBR) is a method of encoding video and audio files where the bitrate, which is the amount of data used to represent a unit of time, is held constant throughout the entire file. This means that the same amount of data is used to represent each second of the video or audio.
CBR is generally considered less efficient than Variable Bitrate (VBR) encoding because it does not account for variations in the complexity of the video or audio, which wastes data on simple passages and can starve complex ones. However, CBR has the advantage of a predictable, constant data rate, which makes bandwidth requirements easy to plan for during transmission.
Additionally, CBR is also more compatible with certain devices and broadcasting standards that require a constant bitrate for transmission.
CDM (Content Decryption Module)
A Content Decryption Module (CDM) is a software component that is used to decrypt and play back protected digital content, such as video or audio files that have been encrypted using a Digital Rights Management (DRM) system.
A CDM interacts with the video player software on a device and the DRM system on the server to decrypt the protected content and make it available for playback.
CDMs are necessary to play back protected content on a device, as they provide the functions needed to decrypt the content and make it available for playback. DRM systems use different encryption techniques, and CDMs are designed to work with a specific DRM system, so the correct CDM must be present on the device in order to play the content.
CDN (Content Delivery Network)
A Content Delivery Network (CDN) is a network of servers distributed across multiple geographic locations that work together to deliver content to users as quickly and efficiently as possible. The main goal of a CDN is to reduce latency and improve the performance of web and video content by caching and delivering content from the server that is closest to the end user.
CDNs typically consist of a large number of servers, known as “edge servers,” that are strategically placed at the edge of the internet, close to where users are located. When a user requests content, the CDN routes the request to the edge server that is closest to the user, reducing the distance that the data has to travel and decreasing latency.
CDNs can also handle other functionalities such as load balancing, traffic management, and security features, like DDoS protection and SSL offloading, that can improve the user experience.
The use of CDN is particularly important for high-traffic websites and online video platforms, as it helps to ensure that content is delivered quickly and reliably to users around the world. CDN also helps to reduce the load on the origin server, improving scalability and reliability, and also can be used for live streaming events.
CENC (Common Encryption)
MPEG Common Encryption (CENC) is a standard for encrypting and decrypting video content using AES-128 encryption. It allows the same set of encryption keys to be used across multiple DRM systems, making it easier for content providers to deliver their video to a wide range of devices. This simplifies the content packaging and distribution process, while ensuring that the content is protected from unauthorized access.
CENC has been adopted by a number of major players in the industry, including Google, Microsoft, and Apple, which means it is widely supported by the devices and platforms that consumers use to watch video content. This allows for greater flexibility in how video is distributed and consumed, and helps content providers reach the widest possible audience.
Cache
In video streaming, a cache is a temporary storage location for video data. Caching is used to reduce the amount of data that needs to be sent over the internet to a viewer’s device, which can help to improve the overall streaming experience by reducing buffering and providing faster start times.
Caches can be placed at various points in the streaming workflow, such as on the viewer’s device, at the edge of a content delivery network (CDN), or in the cloud. When a viewer requests a piece of video content, the cache checks to see if it already has a copy of that data. If it does, it can serve the data to the viewer directly from the cache, which is much faster than having to retrieve it from the original source. This helps to ensure a smooth streaming experience and improves the overall performance.
Overall, caching plays an important role in improving the quality of the streaming experience for viewers, as well as reducing the costs associated with delivering video content over the internet.
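The hit-or-miss check at the heart of any cache can be sketched in Python; `fetch_from_origin` here is a hypothetical stand-in for a request back to the origin server:

```python
cache = {}  # URL -> cached segment bytes (a plain dict stands in for an edge cache)

def get_segment(url, fetch_from_origin):
    """Serve from the cache on a hit; otherwise fetch from the origin,
    store the result, and report a miss."""
    if url in cache:
        return cache[url], "HIT"
    data = fetch_from_origin(url)
    cache[url] = data
    return data, "MISS"

origin_calls = []
def fetch_from_origin(url):
    """Hypothetical origin fetch; records each call so we can see the savings."""
    origin_calls.append(url)
    return f"<bytes of {url}>"

print(get_segment("/video/seg1.ts", fetch_from_origin)[1])  # prints "MISS"
print(get_segment("/video/seg1.ts", fetch_from_origin)[1])  # prints "HIT"
```

Real caches add eviction policies (LRU, TTL) and validation, but the origin only being contacted once per object is the core of the bandwidth and latency savings described above.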
Clipping
Clipping in video streaming refers to the process of removing unwanted parts from a video, usually from the beginning and/or the end of the video. This can be done for a variety of reasons, such as removing commercials or dead air from a live broadcast, or cropping a video to focus on a specific part of the content.
Clipping can be done by using specialized software or online tools, which allows users to select the start and end points of the clip, and then export the desired segment of the video.
It can also be done as part of the encoding process, where the encoder only encodes and packages the specific segments of video to be delivered to the clients. This produces smaller video files, reduces latency, and eliminates unwanted parts.
Clipping is a valuable tool for content creators and publishers, as it allows them to create shorter, more focused video content that is better suited to different platforms and use cases. It also helps to reduce the amount of storage space required to store video content, and can make it easier to share video content with others.
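For segment-aligned clipping of an already-segmented stream, the segments that cover a requested clip can be computed directly; the 6-second segment duration below is an assumption, and frame-accurate clipping would additionally require re-encoding around the cut points:

```python
import math

def segments_for_clip(start_s, end_s, segment_duration_s=6):
    """Return the indices of the segments that cover [start_s, end_s)
    in a stream cut into fixed-duration segments."""
    first = math.floor(start_s / segment_duration_s)        # segment containing the start
    last = math.ceil(end_s / segment_duration_s) - 1        # segment containing the end
    return list(range(first, last + 1))

# A clip from 0:10 to 0:25 of a stream with 6-second segments:
print(segments_for_clip(10, 25))  # -> [1, 2, 3, 4]
```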
Codec
A codec is a software or hardware component that is used to compress and decompress digital video and audio data. Codecs are an essential part of video streaming because they allow the video data to be transmitted over the internet at a manageable size and quality level.
Video codecs use a variety of techniques to compress the video data, such as removing redundant information and using advanced compression algorithms. There are many different video codecs available, each with their own strengths and weaknesses. Some common codecs used in video streaming include H.264, VP9, H.265 and AV1.
One important factor to consider when choosing a codec is the balance between video quality, compression efficiency and device compatibility. Some codecs, such as H.264, are known for their high device compatibility but may produce lower-quality video compared to newer codecs like AV1.
It’s also worth noting that the codecs used in video streaming can change over time, as new technology becomes available, and devices and platforms become more powerful. Therefore, it’s important to stay up to date with the latest codecs and best practices in video streaming.
CMS (Content Management System)
A Content Management System (CMS) is a software application that is used to create, manage, and publish digital content, such as video and audio files. A CMS for video streaming is designed specifically to handle the unique needs of video content, including large file sizes, high resolution, and the ability to handle multiple video formats.
A video streaming CMS allows users to easily upload, organize, and manage their video content. It can provide features such as automatic video transcoding, video metadata management, and video analytics. Many CMS also come with built-in video players that can be customized and embedded on websites or mobile apps to allow users to watch the videos.
One of the main benefits of using a CMS for video streaming is that it allows users to manage their video content in a centralized location, making it easy to keep track of different versions of videos and manage user permissions. Additionally, a CMS can automate many of the tedious and time-consuming tasks involved in managing video content, such as transcoding, which can be done with a single click.
Another important aspect of a CMS for video streaming is the ability to monetize content through paywalls, subscriptions, or advertising.
A CMS for video streaming can also integrate with other tools, such as analytics or marketing platforms, providing additional insights and capabilities.
DAM (Digital Asset Management)
A Digital Asset Management (DAM) system is a software application that is used to organize, store, and distribute digital assets, such as images, videos, and audio files. In the context of video streaming, a DAM system is a centralized repository that can be used to manage and distribute video content.
DAM systems provide a way to store, organize, and manage digital assets, including videos, in a central location. They can be used to manage the entire lifecycle of a video, from creation and editing to distribution and archiving.
DAM systems typically provide a user-friendly interface for browsing and searching for assets, as well as tools for managing metadata and permissions. They can also integrate with other systems, such as content management systems (CMS) and video players, to provide additional functionality.
One key feature of a DAM system for video streaming is the ability to handle a wide range of video formats and codecs, allowing users to store and distribute videos in different formats to meet the needs of different devices and platforms.
Additionally, DAM systems for video streaming can transcode and distribute videos to multiple channels and platforms, ensuring the right format is delivered to the right audience.
DAM systems can also provide analytics on video performance and access, enabling better decision-making in the management and distribution of assets.
MPEG-DASH (Dynamic Adaptive Streaming over HTTP)
MPEG-DASH (Dynamic Adaptive Streaming over HTTP) is a technology standard for streaming video over the internet. It is an adaptive bitrate streaming technology, which means it adjusts the quality of the video based on the viewer’s internet connection and device capabilities, in order to provide the best possible viewing experience.
MPEG-DASH works by breaking a video into small segments, or chunks, that can be downloaded and played back by the viewer’s device. The chunks are encoded at different bitrates and resolutions, so the viewer’s device can choose the best quality level for the available internet connection.
One of the key advantages of MPEG-DASH is that it is an open standard, which means it is not tied to any specific platform or device. It can be used with a wide range of video codecs, such as H.264, H.265, VP9, and AV1, and it is supported by many different browsers and devices, including smartphones, tablets, and smart TVs.
MPEG-DASH also allows for a high level of flexibility in the way the video is delivered. It supports both live and on-demand streaming, and can be used in combination with other technologies, such as content delivery networks (CDN) and digital rights management (DRM) systems, to improve the delivery and security of video content.
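DASH manifests often describe those chunks with a SegmentTemplate whose URL contains a `$Number$` placeholder. A simplified Python sketch of how a client expands such a template into segment requests (real templates also support other identifiers, such as `$Time$` and zero-padded `$Number%05d$`, which this sketch ignores):

```python
def segment_urls(template, start_number, count):
    """Expand a DASH-style SegmentTemplate containing $Number$ into the
    sequence of segment URLs a client would request (simplified)."""
    return [template.replace("$Number$", str(start_number + i)) for i in range(count)]

# The template and filenames here are illustrative:
print(segment_urls("video_720p_$Number$.m4s", 1, 3))
# -> ['video_720p_1.m4s', 'video_720p_2.m4s', 'video_720p_3.m4s']
```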
DRM (Digital Rights Management)
Digital Rights Management (DRM) is a technology used to control access to and distribution of digital content, such as video and audio files. In the context of video streaming, DRM is used to protect the rights of content owners by controlling who can access and view the video content, and under what conditions.
DRM systems typically use encryption to protect the video content, making it unreadable to anyone who does not have the proper decryption key. The keys are typically controlled by the content owner, who can restrict access to the video content based on a variety of factors, such as user identity, device type, and geographic location.
DRM systems can also be integrated with other technologies, such as content delivery networks (CDNs), to provide additional security and functionality, such as preventing the unauthorized playback of downloaded videos.
There are several DRM technologies available, each with their own strengths and weaknesses. Some of the more widely used DRM systems include Apple’s FairPlay, Google’s Widevine, and Microsoft’s PlayReady. These DRM systems can be integrated with a wide range of devices and platforms, including smartphones, tablets, smart TVs, and web browsers.
DRM License
A DRM (Digital Rights Management) license is a set of rules and permissions that govern how protected digital content, such as a video, can be used. DRM licenses are typically issued by the content owner or their representative, and they specify what actions are allowed and prohibited with respect to the protected content.
In the context of video streaming, a DRM license controls the actions that can be performed on the protected content, such as playback, recording, and distribution. The license also specifies the conditions under which the protected content can be accessed, such as the type of device, the geographic location, and the user’s identity.
DRM licenses are typically delivered alongside the protected content, and they are used by the playback software to determine whether the content can be played, and if so, under what conditions. The licenses are encrypted and protected by a unique decryption key, and only authorized devices or users with the correct key will be able to access the content.
A DRM license can also set expiration dates, and access to the content can be revoked if the license becomes invalid.
Decoding
Video decoding is the process of converting compressed video data into a visual format that can be displayed on a screen. The compressed video data is typically encoded using a video codec, which is a set of algorithms that are used to compress and decompress video data. During video decoding, the video codec decompresses the video data so that it can be displayed by the video player.
In video streaming, video decoding is a crucial step in the process of delivering video content to viewers. As the video data is transmitted over the internet, it is often compressed to reduce the amount of bandwidth required for the stream. The video player on the viewer’s device then decodes the compressed video data so that it can be played back in real-time.
Video decoding is a computationally intensive task that requires significant processing power. It is typically performed by the computer’s central processing unit (CPU) or a specialized video decoding chip. As the resolution and quality of video content has increased, the requirements for video decoding have also increased. Today, many devices are equipped with hardware-accelerated video decoding, which uses specialized chips to offload some of the decoding process from the CPU. This allows for smooth playback of high-resolution video even on devices with lower-end CPUs.
Demultiplexing / Demuxing
Demuxing, also known as demultiplexing, is the process of separating multiple streams of data from a single container file. In the context of video streaming, demuxing refers to the process of separating the audio and video streams from a container file, such as an MP4 or MKV file.
In video streaming, video files are often stored in container formats such as MP4, which can contain multiple streams of data, such as video, audio, and subtitles. When the video is streamed over the internet, it is typically sent as a single container file. The process of demuxing is performed by a demuxer, which is a software component that separates the different streams of data into separate files.
Demuxing is an important step in the process of delivering video content to viewers. It allows the video player to access the individual streams of data (e.g. video, audio) separately and play them back simultaneously. This is important for synchronizing the video and audio streams and ensuring that the video playback is smooth.
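Container formats like MP4 (ISO BMFF) arrange their streams in a tree of “boxes,” each prefixed by a size and a type, and walking those boxes is the first step a demuxer performs. A simplified Python sketch (real files may also use 64-bit sizes and deeply nested boxes, which this ignores):

```python
import struct

def list_boxes(data):
    """Walk the top-level boxes of an ISO BMFF (MP4) byte string.
    Each box starts with a 4-byte big-endian size and a 4-byte type."""
    boxes, offset = [], 0
    while offset + 8 <= len(data):
        size, = struct.unpack(">I", data[offset:offset + 4])
        box_type = data[offset + 4:offset + 8].decode("ascii")
        if size < 8:            # malformed box; stop rather than loop forever
            break
        boxes.append((box_type, size))
        offset += size          # skip the box payload to the next box
    return boxes

# A synthetic buffer: a 16-byte 'ftyp' box followed by an 8-byte 'moov' box.
fake = struct.pack(">I4s", 16, b"ftyp") + b"\x00" * 8 \
     + struct.pack(">I4s", 8, b"moov")
print(list_boxes(fake))  # -> [('ftyp', 16), ('moov', 8)]
```

A real demuxer would then descend into the `moov` box to find the track metadata and locate the audio and video samples.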
Dolby Digital
Dolby Digital, also known as AC-3, is a digital audio compression format developed by Dolby Laboratories. It is one of the most widely used audio codecs for consumer and professional applications, and is commonly used in the cinema and home theater market.
In the context of video streaming, Dolby Digital is used to compress and deliver multi-channel audio, such as surround sound, over the internet. It is commonly used in streaming services such as Netflix, Hulu, and Amazon Prime Video to deliver high-quality audio to viewers.
Dolby Digital supports a range of bitrates and channel configurations, including mono, stereo, and surround sound. The most common configuration is 5.1 channel audio, which includes five full-range channels (left, center, right, left surround, and right surround) and a low-frequency effects (LFE) channel for bass.
Dolby Digital is a widely supported audio codec and is compatible with most consumer devices, such as televisions, Blu-ray players, and streaming devices. However, not all devices are capable of decoding Dolby Digital; some may only support stereo audio. Streaming service providers should therefore also provide alternative audio codecs for devices that cannot decode Dolby Digital.
Dolby Digital Plus
Dolby Digital Plus, also known as Enhanced AC-3 (E-AC-3) is an advanced, multichannel audio codec developed by Dolby Laboratories. It is an extension of the original Dolby Digital (AC-3) codec and is designed to deliver higher quality audio at lower bitrates than its predecessor.
In the context of video streaming, Dolby Digital Plus is used to deliver high-quality, surround sound audio over the internet. It is compatible with a wide range of devices and platforms, including smart TVs, streaming devices, and gaming consoles. It is widely supported by streaming services such as Netflix, Amazon Prime, and Hulu to deliver high-quality audio to their viewers.
Dolby Digital Plus supports a wide range of channel configurations and bitrates, including mono, stereo, and surround sound. It also supports various sample rates and bit depths, and uses advanced audio coding techniques to deliver high-quality audio at lower bitrates.
It’s worth noting that while Dolby Digital Plus is considered an improvement over its predecessor, it is not as widely supported: some devices may only support Dolby Digital, or other audio codecs such as AAC. It is therefore important for streaming service providers to provide alternative audio codecs for viewers using those devices.
EME (Encrypted Media Extensions)
Encrypted Media Extensions (EME) is a standard for encrypting and decrypting video content during streaming. This technology allows streaming service providers to protect their video content from unauthorized access and distribution, while also allowing users to access the content on a wide range of devices.
EME works by encrypting the video content on the server-side, and then delivering the encrypted content to the viewer’s device. The device then decrypts the content using a license or key, which is provided by the streaming service provider.
This process ensures that only authorized users can access the video content, and that the content is only playable on approved devices. It can also restrict playback to a specific duration or to specific devices, and can be used to prevent the content from being downloaded or recorded.
EME is widely supported by web browsers, such as Google Chrome, Microsoft Edge, and Safari, as well as by popular streaming platforms, such as Netflix and Hulu. It is used in conjunction with other technologies, such as Digital Rights Management (DRM) and Media Source Extensions (MSE), to create a robust and secure video streaming experience for users.
It is worth noting that EME has some downsides: it requires additional setup and configuration for streaming service providers, and it can make the streaming experience more complex for users. This is the trade-off made to enhance the security and protection of the provider’s content.
Encoding
Encoding in video streaming refers to the process of converting raw video files into a format that can be streamed over the internet. This process involves compressing the video files, as well as packaging them into a format that can be read by the viewer’s device.
Encoding is a crucial step in the video streaming workflow, as it allows video files to be compressed to a manageable size while still maintaining an acceptable level of video quality. The video is then packaged into a container format, such as MP4 or MKV, which contains the video and audio data, as well as any additional information such as subtitles or metadata.
There are several different video codecs that can be used for encoding, such as H.264, H.265, VP9 and AV1. Each codec has its own set of advantages and disadvantages and the choice of codec will depend on the specific use case and the target device. For example, H.264 is widely supported, but it is not as efficient at compressing video as newer codecs like H.265 or AV1.
Encoding is a compute-intensive process, and it can take a significant amount of time to encode a video file, especially for large or high-resolution videos. It is important to have capable encoding hardware, or to use cloud-based encoding services to speed up the process.
In addition to traditional encoding, adaptive bitrate encoding is commonly used in video streaming. This type of encoding creates multiple versions of a video at different resolutions and bitrates. This allows the video to be dynamically adapted to the viewer’s device and network conditions, ensuring the best possible streaming experience.
Encoding Ladder
An encoding ladder in video streaming refers to a set of encoded versions of a video that are created at different resolutions and bitrates. This allows the video to be dynamically adapted to the viewer’s device and network conditions, ensuring the best possible streaming experience.
An encoding ladder is created by taking the original source video and encoding it multiple times at different resolutions and bitrates. The encoded versions are then organized in a ladder-like structure with the highest resolution and bitrate at the top and the lowest resolution and bitrate at the bottom.
For example, a typical encoding ladder might have versions encoded at 1080p, 720p, 480p, and 360p resolutions with corresponding bitrates of 8Mbps, 4Mbps, 2Mbps, and 1Mbps.
The goal of the encoding ladder is to ensure that the video will look good and play smoothly on a wide range of devices, while also minimizing the overall bandwidth usage. This is achieved by providing multiple versions of the video at different resolutions and bitrates, so that the viewer’s device can choose the version that best fits its capabilities and network conditions.
Adaptive bitrate streaming technologies such as HLS and MPEG-DASH (often with CMAF-packaged segments) use this encoding ladder technique: during playback, the player switches between the different versions of the video in real time, based on the viewer’s device and network conditions.
It is worth noting that creating an encoding ladder takes a significant amount of time and computing power, but this step is crucial for the final user experience and is an important consideration for streaming service providers. That is why cloud-based encoding services can be a good option for these providers, as they can quickly create encoding ladders at scale.
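Using the example ladder figures above, simple arithmetic shows the storage cost of keeping every rung of the ladder for a single title (a sketch; real ladders vary per title and per codec):

```python
# The example ladder from the text: resolution -> bitrate in Mbps.
LADDER = {"1080p": 8, "720p": 4, "480p": 2, "360p": 1}

def ladder_storage_gb(duration_s):
    """Total storage for all rungs of the ladder. Every rendition is kept
    so the player can switch between them; size = sum of bitrate x duration."""
    total_megabits = sum(mbps * duration_s for mbps in LADDER.values())
    return total_megabits / 8 / 1000   # megabits -> megabytes -> gigabytes

# A 2-hour title encoded across the whole ladder:
print(round(ladder_storage_gb(2 * 3600), 1))  # -> 13.5 (GB)
```

This is why per-title encoding and cloud encoding services matter at scale: the storage and encoding cost multiplies across every rung and every title in the catalog.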
eAC3 (Enhanced AC-3), also known as Dolby Digital Plus, is a digital audio codec that extends the original AC-3 (Dolby Digital) codec and is used to deliver high-quality, surround sound audio over the internet. It is designed to deliver higher quality audio at lower bitrates than its predecessor.
In the context of video streaming, eAC3 is used to deliver high-quality, multi-channel audio to viewers. It is compatible with a wide range of devices and platforms, including smart TVs, streaming devices, and gaming consoles. Many streaming services use eAC3 to deliver high-quality audio to their viewers.
eAC3 supports a wide range of channel configurations and bitrates, including mono, stereo, and surround sound. It also supports various sample rates and bit depths and uses advanced audio coding techniques to deliver high-quality audio at lower bitrates.
It’s worth noting that although eAC3 is a widely supported audio codec, not all devices support it, and some may only support stereo audio or other codecs like AAC. That’s why it is important for streaming service providers to also provide alternative audio codecs for viewers using those devices.
FFmpeg is a free, open-source software project that includes a library of tools for handling multimedia files and streams. It is commonly used for video compression, format conversion, and streaming.
In the context of video streaming, FFmpeg can be used to compress video files into a variety of different formats, such as H.264 and H.265, which are widely supported by video players and streaming platforms. It can also be used to convert video files from one format to another, such as converting a video from MKV to MP4.
FFmpeg can also be used for streaming video over the internet. It can act as an encoding and packaging engine for live streaming, providing live streams in different formats and protocols like HLS, DASH, and RTMP. It can also be used as a decoder to play these streams back.
FFmpeg is widely supported and available for a variety of platforms, including Windows, macOS, and Linux. It is also supported by many popular video players and streaming platforms, such as VLC.
FFmpeg is commonly used as a command-line tool, but it can also be used as a library integrated into custom-developed applications and solutions.
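As an illustration of the command-line usage, here is a sketch in Python that only builds an FFmpeg argument list using standard FFmpeg options (`-i`, `-vf`, `-c:v`, `-b:v`); the filenames and the `ffmpeg_cmd` helper are placeholders, and the command is constructed but not executed:

```python
# Sketch: building an FFmpeg command line for one rung of an encoding
# ladder. The flags are standard FFmpeg options; the filenames are
# placeholders. The command is only constructed here, not executed.
def ffmpeg_cmd(src, height, bitrate_kbps, out):
    return [
        "ffmpeg", "-i", src,
        "-vf", f"scale=-2:{height}",    # resize, keeping the aspect ratio
        "-c:v", "libx264",              # encode video with H.264
        "-b:v", f"{bitrate_kbps}k",     # target video bitrate
        "-c:a", "aac", "-b:a", "128k",  # encode audio with AAC-LC
        out,
    ]

cmd = ffmpeg_cmd("input.mp4", 720, 4000, "out_720p.mp4")
print(" ".join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```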
In video technology, a frame refers to an individual still image that, when played in sequence, creates the illusion of motion. In other words, a frame is one of the many static images that make up a video.
Frames are usually displayed at a certain rate, called the frame rate, measured in frames per second (FPS). Common frame rates include 24, 25, 30, and 60 FPS. The higher the frame rate, the smoother the motion in the video will appear.
In video streaming, frames are encoded and compressed by the encoder, grouped into GOPs (Groups of Pictures), and divided into smaller packets of data that are sent to the player via a streaming protocol. The player then decompresses and decodes the frames to display the video.
In video compression, frames can be divided into two types: I-frames and P/B-frames. I-frames are full picture frames, while P/B-frames only hold difference information (P-frames reference previous frames, and B-frames can reference both previous and future frames). These frame types work together to reduce the overall bitrate of the video without sacrificing too much of its quality.
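The I/P/B structure can be illustrated with a small sketch that generates the frame-type pattern of a GOP. The "IBBP" pattern and the `gop_pattern` helper are illustrative only; real encoders choose frame types adaptively:

```python
# Sketch: generating the frame-type pattern of a simple GOP. An
# illustrative "IBBP" structure: one I-frame followed by repeating
# B, B, P frames until the GOP is full.
def gop_pattern(gop_size):
    frames = ["I"]
    while len(frames) < gop_size:
        for t in ("B", "B", "P"):
            if len(frames) < gop_size:
                frames.append(t)
    return frames

print(gop_pattern(10))  # ['I', 'B', 'B', 'P', 'B', 'B', 'P', 'B', 'B', 'P']
```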
H.264 is a video compression standard that is widely used to compress digital video files. It is known for its ability to achieve high-quality video at low bitrates, making it well-suited for streaming over the internet. The H.264 standard was developed by the Joint Video Team (JVT), a partnership between the International Telecommunications Union (ITU) and the Moving Picture Experts Group (MPEG). It is also known as MPEG-4 Part 10, or AVC (Advanced Video Coding). This standard is supported by a wide range of software and hardware, making it a popular choice for video streaming across different platforms.
H.265 (High Efficiency Video Coding) is a video compression standard that is designed to improve upon its predecessor, H.264. It uses advanced algorithms to compress video files to a smaller size while maintaining similar or better visual quality. H.265 is able to achieve higher compression efficiency, particularly for high-resolution and visually complex videos. This standard was developed by the Joint Collaborative Team on Video Coding (JCT-VC), which is a partnership between the International Telecommunications Union (ITU) and the Moving Picture Experts Group (MPEG). It is also known as HEVC (High Efficiency Video Coding). Due to its improved efficiency, H.265 is commonly used for streaming video over limited bandwidth networks such as cellular and satellite connections. H.265 is a more complex standard and requires more computing power to decode than H.264, but it is better for storage and archiving, and it reduces the bandwidth required for streaming.
H.266 (also known as Versatile Video Coding or VVC) is the latest video compression standard that was developed by the Joint Video Experts Team (JVET) which is a partnership between the International Telecommunications Union (ITU) and the Moving Picture Experts Group (MPEG). This standard is designed to be even more efficient than H.265 and aims to improve video quality and reduce file sizes further.
It uses advanced algorithms and techniques to achieve a greater compression ratio, which allows the same video quality to be achieved at a lower bitrate or to achieve better video quality at the same bitrate.
This standard is suitable for various use cases such as Ultra High Definition (UHD) and 8K resolution video, 360-degree video, and virtual reality (VR) video.
H.266 is still a new standard; it is not yet widely adopted, and its support is limited to specific devices and software.
HDCP (High-bandwidth Digital Content Protection) is a digital rights management (DRM) technology that is designed to protect digital video and audio content as it is transmitted over HDMI, DisplayPort, and other digital interfaces.
It’s designed to prevent unauthorized duplication or copying of premium video content as it travels across connections and devices. This is done by encrypting the video signal and requiring devices to authenticate one another before allowing the video to be displayed. If the devices do not authenticate, the video will not be displayed or will be displayed at a lower resolution.
HDCP is commonly used in streaming video services, and it’s supported by many HDMI-enabled devices such as TVs, streaming players, and game consoles. HDCP version 2.2 is the most widely adopted version, which supports higher resolutions and refresh rates, such as 4K and 8K video.
However, HDCP is also an additional layer to take into consideration during the streaming process, and it can sometimes cause problems. For example, if your TV doesn’t support HDCP version 2.2, you won’t be able to stream the content even if your streaming device does.
HDR (High Dynamic Range)
HDR (High Dynamic Range) is a video technology that allows for a wider range of colors, brightness, and contrast than traditional SDR (Standard Dynamic Range) video. It is designed to provide a more realistic and immersive viewing experience by reproducing colors and brightness levels that are closer to what the human eye can perceive in the real world.
An HDR video has a higher contrast ratio and wider color gamut than an SDR video. It allows for brighter whites, deeper blacks, and more vibrant colors, which results in a more dynamic and lifelike image. Additionally, HDR also helps to provide greater detail in the highlights and shadows, which allows for more detail to be seen in the darkest and brightest areas of the image.
HDR video is becoming more common in streaming services and devices, but it requires specific streaming devices, TVs, and content that are compatible with the technology. There are multiple HDR standards, such as HDR10, Dolby Vision, HDR10+, and HLG, which differ in color space and dynamic range. Therefore, it is important to check the compatibility of devices, content, and standards.
HEVC (High Efficiency Video Coding)
HEVC (High Efficiency Video Coding) is a video compression standard that is designed to be more efficient than its predecessor, H.264. It uses advanced algorithms to compress video files to a smaller size while maintaining similar or better visual quality. HEVC is able to achieve higher compression efficiency, particularly for high-resolution and visually complex videos.
The standard was developed by the Joint Collaborative Team on Video Coding (JCT-VC), which is a partnership between the International Telecommunications Union (ITU) and the Moving Picture Experts Group (MPEG).
It is also known as H.265 and can compress video files to be up to 50% smaller than H.264 while maintaining the same level of video quality.
HEVC is well suited for streaming over limited bandwidth networks, such as cellular networks or satellite connections. However, the standard is more computationally demanding than H.264, and it may require more powerful hardware and software to decode the video. Due to the complexity of the standard, not all devices are capable of playing HEVC files, but it’s increasingly being supported by streaming devices and software.
HLS (HTTP Live Streaming)
HLS (HTTP Live Streaming) is a video streaming protocol developed by Apple. It’s designed to efficiently deliver video over the internet by breaking the video into small segments and delivering them to the viewer over HTTP. This allows the video to be adapted to different network conditions in real-time, so the viewer can continue to watch the video without interruption even if their internet connection is slow or unreliable.
HLS segments the video into small chunks, typically between 2 to 10 seconds in length, and then creates an index file in the form of an M3U8 playlist. This playlist contains information about the location of the video segments and the order in which they should be played. The HLS player on the viewer’s device requests the M3U8 file, reads the segment URLs it contains, and then requests and plays the segments in sequence.
HLS is widely supported across multiple platforms and devices, including iOS, Android, and desktop web browsers, making it a popular choice for video streaming providers. Additionally, HLS allows for dynamic stream switching and adaptive bitrate streaming, which enable the video quality to be adjusted in real-time based on the viewer’s internet connection and device capabilities.
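To make the playlist mechanics concrete, here is a minimal sketch that generates an HLS media playlist; the segment names and durations are made up, and real playlists carry many more tags:

```python
# Sketch: generating a minimal HLS media playlist (M3U8). Segment
# names and durations are illustrative.
def media_playlist(segments, target_duration):
    lines = [
        "#EXTM3U",
        "#EXT-X-VERSION:3",
        f"#EXT-X-TARGETDURATION:{target_duration}",
        "#EXT-X-MEDIA-SEQUENCE:0",
    ]
    for uri, duration in segments:
        lines.append(f"#EXTINF:{duration:.3f},")  # duration of the next segment
        lines.append(uri)
    lines.append("#EXT-X-ENDLIST")  # marks the playlist as complete (VOD)
    return "\n".join(lines)

print(media_playlist([("seg0.ts", 6.0), ("seg1.ts", 6.0)], 6))
```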
HSL color model
HSL (Hue, Saturation, Lightness) is a color model used to represent colors in a way that is more intuitive for artists and designers. It is a cylindrical representation of colors, where the hue is represented by an angle, the saturation is represented by the distance from the center, and the lightness is represented by the height.
Hue refers to the color itself, such as red, blue, or green, and is represented by an angle from 0 to 360 degrees. Saturation is a measure of the intensity of the color and is represented by a value from 0 to 100%, where 0% is grayscale and 100% is the most intense color. Lightness is a measure of the brightness of the color and is represented by a value from 0 to 100%, where 0% is black and 100% is white.
It’s worth noting that HSL and HSV are similar color models, although their values are defined differently, and their applications may vary depending on the use case.
HSV color model
HSV (Hue, Saturation, Value) is a color model that is similar to HSL but it is more related to how humans perceive color. It is a cylindrical representation of colors, where the hue is represented by an angle, the saturation is represented by the distance from the center, and the value is represented by the height.
Hue refers to the color itself, such as red, blue, or green, and is represented by an angle from 0 to 360 degrees. Saturation is a measure of the purity or intensity of the color and is represented by a value from 0 to 100%, where 0% is achromatic (grayscale) and 100% is fully saturated. Value refers to the lightness or brightness of the color and is represented by a value from 0 to 100%, where 0% is black and 100% is white.
It’s worth noting that HSV and HSL are similar color models, although their values are defined differently, and their applications may vary depending on the use case.
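Both models are available in Python’s standard-library colorsys module, which uses the name HLS for HSL and keeps every component in the 0..1 range instead of degrees and percentages:

```python
# Converting RGB to the HSL/HSV models with the standard-library
# colorsys module (colorsys calls HSL "HLS" and returns all components
# in the 0..1 range rather than degrees/percent).
import colorsys

r, g, b = 1.0, 0.0, 0.0           # pure red
h, l, s = colorsys.rgb_to_hls(r, g, b)
print(h * 360, s * 100, l * 100)  # hue 0 deg, saturation 100%, lightness 50%

h, s, v = colorsys.rgb_to_hsv(r, g, b)
print(h * 360, s * 100, v * 100)  # hue 0 deg, saturation 100%, value 100%
```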
HTML5 is the latest version of Hypertext Markup Language (HTML), which is the standard markup language used to create web pages. HTML5 introduces new features that allow for more dynamic and interactive web pages, such as support for audio and video playback without the need for additional plug-ins.
In video streaming, HTML5 provides a standardized way to embed video into web pages, using the <video> tag. This eliminates the need for browser plug-ins like Adobe Flash or Microsoft Silverlight, which were previously required to play videos on the web. HTML5 video is supported by most modern web browsers, including Google Chrome, Firefox, Safari, and Microsoft Edge, which means that video can be streamed to a wide range of devices and platforms.
Therefore, using HTML5 in video streaming can provide a more seamless experience for the viewer, as they don’t need to have any additional software or plugins, and it can provide a more flexible way of delivering the video by combining it with other technologies.
Ingest in video streaming refers to the process of acquiring or receiving video content, typically in a raw format, and preparing it for encoding and distribution. This process includes capturing video from a live source, transferring video files from a storage device, or pulling video from a remote source.
Ingest typically involves several steps including, but not limited to: capturing or acquiring the video source, verifying the video and audio quality, converting the video file to a format suitable for encoding, and preparing the video for storage.
In streaming, ingest points are crucial because they are the entry point of your video. This makes the quality of the ingested video essential, as a low-quality ingest will result in a low-quality output. Thus, ensuring that the ingested video meets the desired specifications is crucial for a successful streaming experience.
MPEG (Moving Picture Experts Group) is a standardization organization that develops and publishes video and audio compression standards. It is responsible for several widely used compression standards such as MPEG-2, MPEG-4, H.264/AVC, and H.265/HEVC.
In video streaming, MPEG standards are widely used to compress video files and are supported by most video players and devices. They therefore provide a widely accepted standard that ensures compatibility and high video quality.
MPEG-DASH (Dynamic Adaptive Streaming over HTTP) is a video streaming protocol that allows for adaptive bitrate streaming of video over the internet. It is an international standard and is based on the HTTP protocol, which allows for easy integration with existing web infrastructure.
MPEG-DASH works by breaking the video into small segments and creating an index file in the form of a Media Presentation Description (MPD) that contains information about the location of the video segments and their associated bitrates. The viewer’s device requests these segments and plays them in sequence, and it can adapt the video quality in real-time based on network conditions.
One of the key benefits of MPEG-DASH is that it is an open standard, which means that it can be implemented by any company or organization, and is supported by multiple platforms and devices. Additionally, it allows for the use of common encryption schemes and can be easily integrated with existing content delivery networks (CDNs) and streaming servers.
MPEG-DASH is widely supported by major streaming platforms and devices, and it is becoming a widely adopted standard for streaming video over the internet; it is used by OTT platforms such as YouTube, Netflix, and Amazon Prime.
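A minimal sketch of reading Representation bitrates out of an MPD with the Python standard library. The MPD string is heavily simplified: a real manifest declares the urn:mpeg:dash:schema:mpd:2011 namespace and many more attributes, which are omitted here to keep the example short:

```python
# Sketch: extracting Representation bitrates from a (heavily simplified)
# MPD using the standard library's XML parser.
import xml.etree.ElementTree as ET

MPD = """
<MPD>
  <Period>
    <AdaptationSet mimeType="video/mp4">
      <Representation id="720p" bandwidth="4000000" width="1280" height="720"/>
      <Representation id="360p" bandwidth="1000000" width="640" height="360"/>
    </AdaptationSet>
  </Period>
</MPD>
"""

root = ET.fromstring(MPD)
reps = [(r.get("id"), int(r.get("bandwidth"))) for r in root.iter("Representation")]
print(reps)  # [('720p', 4000000), ('360p', 1000000)]
```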
MSE (Media Source Extensions)
MSE (Media Source Extensions) is a W3C JavaScript API that lets web applications construct media streams for playback in the HTML5 video and audio elements, which is what enables adaptive streaming technologies such as HLS and MPEG-DASH to run in the browser without plug-ins. MSE also allows for the use of Common Encryption (CENC) together with the Encrypted Media Extensions (EME) API, which provides a secure way to play back encrypted video content.
A manifest in video streaming is a file that contains information about the video segments and their associated bitrates that make up a video stream. It is used to inform the video player about the available segments, their location, and the order in which they should be played. The manifest file typically contains information such as video and audio codecs, video resolution and frame rate, subtitles and captions, and alternative audio tracks.
In video streaming, there are different types of manifest files, such as the M3U8 for HLS and the MPD for DASH. Both of them contain information about the video segments and their location, as well as information about encryption, subtitles and alternative audio tracks.
The manifest file plays a crucial role in the streaming process, as the player uses the information it contains to request the video segments and adapt the video quality in real time based on network conditions and device capabilities. It also makes it possible to implement features such as adaptive bitrate streaming and alternative audio tracks.
The manifest file is usually generated during the encoding process. It can be stored on the same server as the video segments or on a separate server, and it is typically served over HTTP, which allows for easy integration with existing web infrastructure and content delivery networks (CDNs).
Midroll in video streaming refers to an advertisement or commercial break that is placed in the middle of video content. It is a way for creators to monetize their video content by providing an opportunity for brands to advertise to their audience. Midrolls are typically 15-30 seconds long, but the duration can vary depending on the content and the advertiser.
Midroll advertisements can be placed at specific points in the video, such as after a specific segment or at the end of a chapter. They can be pre-recorded and added to the video during the editing process, or they can be dynamically inserted using client-side ad insertion (CSAI) or server-side ad insertion (SSAI).
Midrolls offer a way for creators to monetize their content and for brands to reach a targeted audience. However, it is important for creators to consider how many ads to include and where to place them, as too many ads can disrupt the viewing experience and cause viewers to lose interest. A good practice is to follow the industry standards on the number of ads, and to keep in mind the duration of the video, as well as the context and the audience.
It’s also worth noting that Midrolls are one of the three types of ads in video streaming, the other two being pre-rolls, which are ads placed before the video content, and post-rolls, which are ads placed after the video content.
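As a sketch of cue-point placement, the following assumes evenly spaced midrolls; real placements usually come from an ad server or editorial markers, and the `midroll_cues` helper is hypothetical:

```python
# Sketch: computing evenly spaced midroll cue points for a video of a
# given duration. The spacing policy is illustrative only.
def midroll_cues(duration_s, num_ads):
    """Place num_ads cue points evenly inside (0, duration_s)."""
    step = duration_s / (num_ads + 1)
    return [round(step * i, 1) for i in range(1, num_ads + 1)]

print(midroll_cues(600, 2))  # a 10-minute video gets cues at 200 s and 400 s
```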
A Multi-CDN (Content Delivery Network) in video streaming is a strategy that involves using multiple CDN providers to distribute and deliver video content. It is designed to improve the availability, performance, and security of video streaming by leveraging the strengths of multiple CDNs.
A CDN is a network of servers that are strategically placed around the world to deliver content to viewers with low latency and high throughput. CDNs cache and distribute video content to servers that are closer to the viewer, which improves the viewing experience by reducing buffering and providing a higher-quality video.
A Multi-CDN strategy involves using multiple CDN providers to deliver video content. This makes it possible to balance the load across different networks and to improve video quality and availability for a wider audience. It also improves redundancy and failover capabilities, which can increase the overall reliability of the video streaming service.
Multi-CDN can also provide different features and functionalities depending on the CDN providers, such as specific geographic coverage, specific video format support, and built-in security features. A Multi-CDN approach provides a way to benefit from the best features of different providers, and it can also help to prevent vendor lock-in.
Using a Multi-CDN strategy for video streaming is considered best practice for large-scale and high-demand video streaming, as it allows for a more reliable and efficient delivery of video content to a global audience.
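A minimal sketch of one possible Multi-CDN selection policy: weighted random choice among providers that pass a health check. The provider names, weights, and the `pick_cdn` helper are all hypothetical:

```python
# Sketch: a simple weighted Multi-CDN selector that skips providers
# marked unhealthy. Provider names and weights are hypothetical.
import random

CDNS = [
    {"name": "cdn-a", "weight": 3, "healthy": True},
    {"name": "cdn-b", "weight": 1, "healthy": True},
    {"name": "cdn-c", "weight": 2, "healthy": False},  # failed health check
]

def pick_cdn(cdns=CDNS, rng=random):
    candidates = [c for c in cdns if c["healthy"]]
    if not candidates:
        raise RuntimeError("no healthy CDN available")
    weights = [c["weight"] for c in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]["name"]

print(pick_cdn())  # returns "cdn-a" about 3 times as often as "cdn-b"
```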
Multiplexing / Muxing
Multiplexing, also known as muxing, in video streaming refers to the process of combining multiple video, audio, and subtitles streams into a single container format. This allows for the efficient delivery of multiple streams in one file, rather than separate files for each stream.
Multiplexing can be done in different ways, such as by interleaving the video and audio data in a specific order, or by encapsulating the video and audio data into a single stream. It can be done either during the encoding process or during packaging.
Common container formats that are used for multiplexing are MP4, MKV, and TS for example. Each container format has its own specific characteristics and may support different codecs, but all of them have the main goal of multiplexing different video and audio streams into a single file.
Multiplexing is an essential step in the video streaming process: it increases the efficiency of video delivery and reduces the complexity of the streaming infrastructure. It is used widely across the industry and is a common practice in the encoding and packaging steps.
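The interleaving idea can be sketched by merging video and audio packets by timestamp; a real muxer writes the packets into a container such as MP4 or TS rather than a Python list, and the timestamps here are made up:

```python
# Sketch: interleaving video and audio packets by timestamp, the core
# idea behind muxing. Packets are (timestamp_ms, stream) tuples here.
import heapq

video = [(0, "video"), (40, "video"), (80, "video")]
audio = [(0, "audio"), (21, "audio"), (43, "audio")]

# Both inputs are already sorted by timestamp, so a sorted merge
# produces the interleaved packet order a muxer would write.
muxed = list(heapq.merge(video, audio))
print(muxed)
```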
m3u8 is a file format that is used to create a playlist of video segments for HTTP Live Streaming (HLS) in video streaming. It is an extension of the m3u file format, which is commonly used to create audio playlists. The m3u8 file contains a list of URLs that point to the individual video segments and the associated playlist files.
In HLS, video is broken down into small segments, usually around 10 seconds each, and encoded at multiple bitrates. These segments are stored on a server and the m3u8 file is used to inform the player about the location of the video segments and their associated bitrates. The player then requests the video segments and plays them in sequence, adapting the video quality in real-time based on network conditions.
The m3u8 file also contains information about encryption and subtitles, which allows for a more secure and accessible streaming experience. It is supported by most modern web browsers and devices, including iOS, Android, and many smart TVs, which makes it a widely accepted standard for streaming video over the internet.
It’s worth noting that the m3u8 file is only the playlist file; it contains the URLs of the actual video segments, which are stored on the server. The video segments should be encoded in a supported format such as H.264 or HEVC.
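A minimal sketch of parsing such a playlist, handling only the EXTINF/URI pairs; real playlists contain many more tags, and the `parse_m3u8` helper is illustrative:

```python
# Sketch: a minimal M3U8 media-playlist parser that pairs each EXTINF
# duration with the segment URI on the following line.
def parse_m3u8(text):
    segments, duration = [], None
    for line in text.strip().splitlines():
        line = line.strip()
        if line.startswith("#EXTINF:"):
            # "#EXTINF:6.000,<title>" -> 6.0
            duration = float(line[len("#EXTINF:"):].split(",")[0])
        elif line and not line.startswith("#"):
            segments.append((line, duration))  # a segment URI line
    return segments

PLAYLIST = """#EXTM3U
#EXT-X-TARGETDURATION:6
#EXTINF:6.000,
seg0.ts
#EXTINF:5.960,
seg1.ts
#EXT-X-ENDLIST"""

print(parse_m3u8(PLAYLIST))  # [('seg0.ts', 6.0), ('seg1.ts', 5.96)]
```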
OVP (Online Video Platform)
An Online Video Platform (OVP) is a service that provides the infrastructure, tools, and services needed to host, manage, and distribute video content over the internet. These platforms typically include features such as video hosting, transcoding, streaming, storage, and analytics, as well as other tools for creating, managing, and monetizing video content.
OVPs are used by a wide range of businesses and organizations, from media companies and content creators to corporations and educational institutions. They are used to host and distribute a wide range of video content, including live and on-demand video, video podcasts, and video blogs.
An OVP typically provides a set of features that can be used to improve the user experience, such as adaptive streaming, which improves the video quality based on network conditions, closed captions and subtitles, which makes the video accessible to a wider audience, and different monetization options such as ads, subscriptions and pay-per-view.
Using an OVP can simplify the process of hosting, managing, and distributing video content, allowing providers to focus on content creation and monetization strategies while the platform takes care of the underlying technical aspects.
On2 Technologies is a company that developed video compression technology, specifically, video codecs. It was founded in 1992, and it’s one of the pioneers in video compression. It developed several video codecs, the most notable being VP3 and VP4, which were among the first video codecs to be optimized for internet video streaming.
On2 Technologies was acquired by Google in 2010, and its technology was used to develop the VP8 codec, an open-source codec that is part of the WebM project. VP8 was later succeeded by VP9, which is also open source and became popular among browsers, especially in low-bandwidth scenarios.
On2’s codecs were widely used in the early days of internet video streaming, when several companies and platforms used them to deliver video over the internet; they were also widely supported by web browsers, mobile devices, and smart TVs.
Although On2 codecs are not as popular as they once were, they are still supported and used today by some platforms. They were among the first codecs optimized for streaming video over the internet, and their contributions to video compression are widely acknowledged.
Opus is an audio codec that was developed by the Internet Engineering Task Force (IETF) and it’s designed for interactive real-time applications such as video conferencing, streaming, and web-based interactive audio and video.
Opus is an open, royalty-free, and flexible audio codec that can adapt to different types of audio. It can be used for both music and speech and is able to deliver high-quality audio at low bitrates. It also supports a wide range of sampling rates, from narrowband to fullband audio, and can handle low-delay, variable-bitrate, and packet-loss scenarios.
Opus was standardized in 2012 and has been widely adopted since; it is now supported by most modern web browsers and operating systems and is becoming a popular choice for interactive audio and video applications. It is also supported by WebRTC, a widely adopted standard for real-time communications in web applications.
Opus is a good alternative to other codecs such as G.711, G.722, G.722.1, G.722.1C, and G.723, especially in low-bandwidth scenarios. It can be used for video conferencing, streaming audio, and interactive audio and video experiences.
In video streaming context, Opus could be used as the audio codec to complement video codecs such as H.264, HEVC and VP9. It’s also used to improve the audio quality in live-streaming and low-latency scenarios.
In the context of video streaming, an origin is a server or group of servers responsible for storing and serving the original video files; it is typically the entry point for a content delivery network (CDN).
The origin server is where the original video files are stored and it is responsible for processing and responding to requests for video segments and manifests. In addition, it’s also where the video files are typically transcoded into multiple bitrates and packaged into different streaming formats (e.g., HLS, DASH, etc.).
When a viewer requests a video, the request is typically handled by a CDN edge server which is closer to the viewer, but if the edge server doesn’t have the requested video, it will forward the request to the origin server.
An origin server can also be used to serve video files directly to the viewer, but this is typically only done in cases where the video files are small or if the origin server is located close to the viewer.
Using an origin server makes it possible to centralize the management of video files and to handle requests for video files as well as transcoding and packaging tasks. It also allows integration with different storage solutions, such as local storage, cloud storage, and object storage.
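The edge-to-origin request flow described above can be sketched with an in-memory cache; the path and stored bytes are placeholders:

```python
# Sketch: an edge cache that falls back to the origin on a miss, the
# basic request flow between a CDN edge server and the origin.
ORIGIN = {"/video/seg0.ts": b"segment-0-bytes"}  # stand-in for origin storage

edge_cache = {}

def fetch(path):
    if path in edge_cache:   # cache hit: served from the edge
        return edge_cache[path], "edge"
    body = ORIGIN[path]      # cache miss: forward the request to the origin
    edge_cache[path] = body  # store at the edge for subsequent viewers
    return body, "origin"

print(fetch("/video/seg0.ts")[1])  # first request is filled by the origin
print(fetch("/video/seg0.ts")[1])  # second request is a cache hit at the edge
```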
PDT (Program Date and Time)
Program Date and Time (PDT) in video streaming refers to a metadata attribute that indicates the date and time when a specific piece of content was originally broadcast. The program date and time is used to schedule, organize, and identify video content and it’s commonly used in linear television broadcast and also in live streaming.
PDT is important for live streaming as it makes it possible to schedule the live event and to organize and find the recorded version of the live event later. It is also commonly used in traditional linear television broadcasts for the same purpose, and it helps identify when a specific piece of video content was originally broadcast.
The PDT can be encoded into the video file or stored as a separate metadata file. It can also be used in combination with other metadata attributes, such as title, synopsis, or show name, making it easier to organize and search video content.
PDT is used by different platforms and devices, such as TVs, set-top boxes, and streaming devices, to provide features such as program guides, DVR, and time-shifting. It’s also used by streaming platforms to provide similar features and also for use cases such as managing the live stream events schedule.
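In HLS, the program date and time surfaces as the EXT-X-PROGRAM-DATE-TIME playlist tag. The following sketch maps segments to wall-clock time from such a value, assuming fixed 6-second segments (the date and segment length are made up):

```python
# Sketch: mapping a segment index to wall-clock time from an
# EXT-X-PROGRAM-DATE-TIME value, assuming fixed-length segments.
from datetime import datetime, timedelta

pdt = datetime.fromisoformat("2024-01-01T20:00:00+00:00")  # playlist PDT tag
segment_duration = timedelta(seconds=6)

def segment_start(index):
    """Wall-clock start time of segment number `index` (0-based)."""
    return pdt + index * segment_duration

print(segment_start(10).isoformat())  # 2024-01-01T20:01:00+00:00
```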
Per-title encoding in video streaming refers to the practice of adjusting video encoding settings for each individual video based on its specific characteristics. This approach aims to provide the best video quality at the lowest possible bitrate, taking into account factors such as resolution, frame rate, and scene complexity.
Traditionally, video encoding has been done using a one-size-fits-all approach, which results in a fixed set of encoding settings for all videos. This approach often leads to suboptimal results, as some videos may require higher bitrates to maintain a certain level of quality, while others can maintain the same quality at a lower bitrate.
Per-title encoding addresses this issue by analyzing each video individually and adjusting the encoding settings accordingly. This can be done by using machine learning algorithms to analyze the video content and predict the optimal encoding settings, or by manually analyzing the video content and adjusting the settings as needed.
The main benefit of per-title encoding is that it improves video quality, reduces the bitrate needed to deliver a given quality of video, and saves on storage and delivery costs. It also improves the user experience on different devices, as the video can be optimized for the specific device’s capabilities and the network conditions.
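A deliberately simplified sketch of the idea: scale a base bitrate by a content-complexity score. The score, the 50%-150% range, and the `per_title_bitrate` helper are assumptions for illustration, not a real per-title algorithm (real pipelines derive the settings from test encodes or machine-learning analysis, as described above):

```python
# Sketch: picking a bitrate from a hypothetical complexity score
# between 0.0 (near-static content) and 1.0 (high-motion sports).
def per_title_bitrate(base_kbps, complexity):
    """Scale the base bitrate between 0.5x and 1.5x by content complexity."""
    factor = 0.5 + complexity  # 0.0 -> 0.5x, 1.0 -> 1.5x
    return int(round(base_kbps * factor))

print(per_title_bitrate(4000, 0.2))  # simple content: 2800 kbps
print(per_title_bitrate(4000, 0.9))  # complex content: 5600 kbps
```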
A postroll in video streaming refers to a segment of content that is played after the main video content has ended. This segment is commonly used to display ads, promotional content, or other types of content that is not directly related to the main video.
Postrolls are often used to monetize video content by displaying ads, sponsorships or promotions. It is also used by content creators to cross-promote their other content, such as a trailer for a new movie, or a teaser for an upcoming series. Additionally, it can also be used for other types of content such as credits, or a message to the audience.
Postrolls are typically defined in the video’s manifest file, and it is up to the player to determine whether or not to display the postroll. Some players are configured to automatically play postrolls, while others allow the viewer to skip or opt out of viewing the postroll.
Postrolls are common in live streaming, especially in gaming or sports streams where the main event has ended but the audience stays tuned for additional commentary, highlights, or an after-show. They are also used in on-demand recordings of live events, where the organizers want to show information about upcoming events or sponsors.
It’s worth noting that postrolls can be a good opportunity to monetize content, but keep in mind that a postroll that is too long, or not relevant to the audience, can create a negative experience. It’s important to find the right balance between monetizing the content and providing a good user experience.
A preroll in video streaming refers to a segment of content that is played before the main video content begins. This segment is commonly used to display ads, promotional content, or other types of content that is not directly related to the main video.
Prerolls are often used to monetize video content through ads, sponsorships, or promotions. Content creators also use them to cross-promote their other content, such as a trailer for a new movie or a teaser for an upcoming series. Additionally, they can carry other material, such as public service announcements or relevant notices.
Prerolls are typically defined in the video’s manifest file, and it is up to the player to determine whether or not to display the preroll. Some players are configured to automatically play prerolls, while others allow the viewer to skip or opt out of viewing the preroll.
Prerolls are common in video on demand, where the main content is not live. They are also used in live streaming before the event has started, where they can provide information about the event, showcase sponsors and advertisers, or offer a teaser of what is to come.
It’s worth noting that prerolls can be a good opportunity to monetize content, but keep in mind that a preroll that is too long, or not relevant to the audience, can create a negative experience. It’s important to find the right balance between monetizing the content and providing a good user experience.
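The player-side behaviour described above can be sketched as a simple playback queue. The function and field names here are hypothetical:

```python
def build_playback_queue(main_url, preroll_url=None, skippable=True):
    """Return the ordered list of items a player would play:
    an optional (possibly skippable) preroll, then the main content."""
    queue = []
    if preroll_url:
        queue.append({"url": preroll_url, "type": "preroll", "skippable": skippable})
    queue.append({"url": main_url, "type": "content", "skippable": False})
    return queue
```

A real player would typically get the preroll URL from the manifest or an ad server and honour its own configuration for skippability.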
QoE (Quality of Experience)
In video streaming, Quality of Experience (QoE) is all about the end user and how they perceive the quality of a particular video service. This depends on several factors that impact the user’s experience, including buffering, frame rate, bitrate, and resolution.
In contrast to QoS (Quality of Service), QoE is subjective and therefore can be individual to the user, being influenced by factors such as:
- The device being used for playback
- Network conditions, i.e. bandwidth and latency
- Video compression and encoding settings
- The player, its design/UI and overall performance
- Accessibility features, such as captioning
It’s important for video streaming providers and content creators to measure and optimise QoE as it can have a significant impact on user engagement, retention and ultimately revenue.
There are various techniques and metrics used to measure QoE, including buffering rate, video freeze rate, start-up time, and direct user feedback. Once problem areas have been identified, the video quality and overall user experience can be optimised.
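As a sketch of how such metrics might be derived from player events (the event names and data format are hypothetical):

```python
def qoe_metrics(events):
    """Compute start-up time and rebuffer ratio from a time-ordered list of
    (timestamp_seconds, kind) player events, where kind is one of
    'play_request', 'first_frame', 'stall_start', 'stall_end'."""
    by_kind = {"play_request": [], "first_frame": [],
               "stall_start": [], "stall_end": []}
    for t, kind in events:
        by_kind[kind].append(t)
    # Start-up time: from the user pressing play to the first rendered frame
    startup = by_kind["first_frame"][0] - by_kind["play_request"][0]
    # Rebuffer ratio: total stalled time over the whole session duration
    stalled = sum(end - start
                  for start, end in zip(by_kind["stall_start"], by_kind["stall_end"]))
    session = events[-1][0] - events[0][0]
    return {"startup_s": startup, "rebuffer_ratio": stalled / session}
```

Real QoE pipelines aggregate such per-session numbers across many viewers before drawing conclusions.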
QoS (Quality of Service)
Quality of Service (QoS) in video streaming refers to the ability of a network to deliver consistent and predictable performance for video traffic. It is typically measured in terms of metrics such as packet loss, jitter, and throughput.
QoS is an objective measure of the network’s performance, and it differs from QoE (Quality of Experience), the subjective measure that takes into account the overall quality of a video service as perceived by the end user.
QoS is important in video streaming, as the real-time nature of video requires low latency, minimal packet loss and jitter, and sufficient bandwidth to avoid buffering. With video streaming, QoS concerns are not just about providing a certain amount of bandwidth, it’s also about making sure that video packets get priority over other traffic, and that video packets don’t get lost or delayed in the network.
To provide QoS for video streaming, network administrators use techniques such as traffic shaping, prioritization, and buffering, which optimize the video traffic and deliver predictable, consistent performance.
Optimizing QoS helps provide a better user experience and reduces buffering and other issues that can cause a poor quality of experience.
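A minimal sketch of two of these metrics, packet loss and jitter, computed from received packets. The data format is hypothetical, and real RTP jitter (RFC 3550) uses a smoothed estimate rather than a plain mean:

```python
def qos_stats(sent_seq, received):
    """sent_seq: sequence numbers that were transmitted.
    received: dict mapping sequence number -> one-way transit delay in seconds.
    Returns (loss_fraction, mean_delay_variation)."""
    loss = 1 - len(received) / len(sent_seq)
    delays = [received[s] for s in sorted(received)]
    # Jitter approximated as the mean absolute change in delay
    # between consecutively received packets
    diffs = [abs(b - a) for a, b in zip(delays, delays[1:])]
    jitter = sum(diffs) / len(diffs) if diffs else 0.0
    return loss, jitter
```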
RGB color model
The RGB color model stands for Red, Green, Blue; it is a color model used to represent colors digitally. It is based on the idea that all colors can be created by mixing different intensities of red, green, and blue light, with each color represented by a set of three numbers, one each for red, green, and blue.
The RGB color model is used in a wide range of digital devices, such as computer monitors, TVs, and digital cameras, as well as in video and image editing software. It is an additive color model, meaning colors are created by adding red, green, and blue light together; mixing all three at full intensity produces white. In common 8-bit representations, each component ranges from 0 to 255.
In video streaming, RGB defines the color of each pixel as it is captured and displayed on screen. However, most modern video codecs, such as H.264, H.265, and VP9, do not compress RGB directly; they convert the video to a YCbCr (YUV) representation, which separates brightness from color information and compresses more efficiently, and convert back to RGB for display.
The RGB color model, with sRGB being the most widely used RGB color space, is standard in computer graphics and general video production workflows. It is how screens ultimately display video, and it serves as a base color model in many video and image editing applications, as well as in color correction and color grading processes.
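Video pipelines commonly convert between RGB (capture and display) and YCbCr (compression). A sketch of the standard full-range BT.601 conversion, the matrix used by JPEG:

```python
def rgb_to_ycbcr(r, g, b):
    """Convert 8-bit RGB to full-range BT.601 YCbCr (as used in JPEG).
    Y carries brightness; Cb and Cr carry color-difference information."""
    y  =  0.299    * r + 0.587    * g + 0.114    * b
    cb = -0.168736 * r - 0.331264 * g + 0.5      * b + 128
    cr =  0.5      * r - 0.418688 * g - 0.081312 * b + 128
    return round(y), round(cb), round(cr)
```

White and black both map to neutral chroma (Cb = Cr = 128), which is part of what lets codecs subsample the chroma planes with little visible loss.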
RTMP (Real-Time Messaging Protocol) pass-through in video streaming refers to a technique used to transmit video data directly from an encoder to a streaming server, without any processing or modification of the video data in between. This method of streaming video is often used in live streaming scenarios, where the video data is transmitted in real-time and with minimal latency.
RTMP pass-through can be useful in cases where the video quality is already high, and no further processing is required, such as in professional live streaming setups. However, it also implies that the encoding, bitrate, and other settings must be well configured, as the video will not be modified or optimized by the server or other systems.
Rebuffering in video streaming refers to the process of a video player having to stop playback and reload video data due to a lack of data in the buffer. It happens when the player’s buffer runs out of video data and needs to be refilled in order to continue playing. This can be caused by a variety of factors such as insufficient bandwidth, network congestion, or a slow server response time.
Rebuffering is an interruption of the video playback, and it can be frustrating for viewers as it disrupts the viewing experience. The longer the rebuffering time, the greater the negative impact on the user’s experience. Long periods of rebuffering can cause viewers to abandon the video altogether, negatively affecting the video’s engagement, retention and revenue.
Rebuffering can be caused by a variety of factors, such as:
- Insufficient bandwidth
- Network congestion
- Slow server response time
- Limited server resources
- Limited client resources
- Quality of service limitations
In order to minimize rebuffering, streaming providers should use techniques such as Adaptive Bitrate Streaming, which adjusts the video quality to the viewer’s network conditions, and Multi-CDN, which helps to distribute the load across multiple networks. Additionally, buffer control can be used to adjust the buffer size, the bitrate, and the quality of the video to match the network conditions.
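The core of the ABR decision that keeps rebuffering low can be sketched as throughput-based rung selection (the names and the safety margin are illustrative):

```python
def select_rendition(ladder_kbps, measured_kbps, safety=0.8):
    """Pick the highest rendition whose bitrate fits within a safety margin
    of the measured throughput; fall back to the lowest rung otherwise."""
    usable = measured_kbps * safety
    candidates = [b for b in ladder_kbps if b <= usable]
    return max(candidates) if candidates else min(ladder_kbps)
```

Production ABR algorithms also factor in the current buffer level, not just throughput, to avoid oscillating between rungs.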
Rendition in video streaming refers to a specific version or variation of a video, created by encoding the same source video at different resolutions, bitrates, or other settings. These different versions of the video are known as “renditions,” and they are used in Adaptive Bitrate Streaming (ABR) to provide a better viewing experience for users with different network conditions or devices. In this way, it ensures that the viewer always receives the highest quality video that their network or device can handle.
In practice, creating multiple renditions involves encoding a video multiple times, at different resolutions and bitrates. This process is often automated, using software such as video encoding software or encoding as a service.
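A sketch of how a rendition ladder might be derived from a source video. The rung list is illustrative; real ladders are often tuned per title:

```python
def build_ladder(source_w, source_h, rungs):
    """rungs: list of (height, bitrate_kbps) pairs. Rungs taller than the
    source are dropped; widths preserve the source aspect ratio and are
    rounded to an even number, as most codecs require."""
    ladder = []
    for height, kbps in rungs:
        if height > source_h:
            continue  # never upscale past the source
        width = round(source_w * height / source_h / 2) * 2
        ladder.append({"width": width, "height": height, "bitrate_kbps": kbps})
    return ladder
```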
Resolution in video streaming refers to the number of pixels that make up the width and height of a video frame. It is typically measured in pixels (e.g. 1080p, 4K) and it is an important aspect of the overall video quality. Higher resolution videos have more pixels and therefore, more detail and sharpness, but also have a higher bitrate requirement and larger file size.
Resolution can have a significant impact on the video quality, as it determines the amount of detail and sharpness that the viewer will see. A higher resolution means more pixels, which leads to a more detailed and sharp image. However, higher resolution also increases the bitrate requirement and file size.
The most common video resolutions are:
- SD (Standard Definition) with a resolution of 720×480 pixels (NTSC) or 720×576 pixels (PAL)
- HD (High Definition) with a resolution of 1280×720 pixels (720p) or 1920×1080 pixels (1080p)
- 4K (Ultra High Definition) with a resolution of 3840×2160 pixels
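The jump in pixel count between these tiers is easy to underestimate; 4K carries exactly four times the pixels of 1080p:

```python
def pixel_count(width, height):
    """Total pixels per frame; uncompressed frame size scales linearly with this."""
    return width * height

ratio = pixel_count(3840, 2160) / pixel_count(1920, 1080)  # 4.0
```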
A slice in H.264 or H.265 video encoding refers to a group of coding blocks that are encoded together as a single unit. Both H.264 and H.265 are video compression standards that divide a video frame into smaller blocks for compression and encoding: H.264 uses macroblocks (16×16 pixels), while H.265 uses larger coding tree units (up to 64×64 pixels). A slice is a way of further dividing a frame into smaller, manageable pieces, allowing for more efficient processing and decoding of the video data.
Each slice contains a number of these blocks, and each slice can be independently encoded and decoded. This allows for parallel processing of the slices, which can be useful for high-resolution or high-motion video, and it also lets a decoder resynchronize at the next slice after a transmission error. Slicing in H.264 and H.265 can be controlled by parameters such as slice size, slice type, and slice structure, which allow for more efficient use of resources.
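As a sketch of the arithmetic involved (H.264 macroblock geometry, with a hypothetical even split of blocks across slices):

```python
import math

def macroblocks_per_frame(width, height, block=16):
    """Number of 16x16 macroblocks covering a frame (H.264; partial
    blocks at the edges still count as full blocks)."""
    return math.ceil(width / block) * math.ceil(height / block)

def split_into_slices(n_blocks, n_slices):
    """Divide the blocks as evenly as possible across slices."""
    base, extra = divmod(n_blocks, n_slices)
    return [base + (1 if i < extra else 0) for i in range(n_slices)]
```

Real encoders may instead bound slices by byte size or lay them out for specific hardware decoders, but the partitioning idea is the same.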
Splicing in video streaming refers to the process of seamlessly merging multiple video and/or audio segments together to create a single, continuous stream. This is often used in live streaming to switch between different camera angles, insert commercial breaks, or to switch between pre-recorded content and live content.
The process of splicing allows for the creation of a dynamic and engaging viewing experience, while also allowing for the seamless insertion of ads or other content. Splicing can be done server-side, by using specialized software that can combine the different video and audio segments together, or client-side, using a media player that is able to switch between different video sources on the fly.
Splicing can also be done in post-production, where a video editor combines different videos, audio, and images to create the final desired result.
Stitching in video streaming refers to the process of combining multiple video streams, images, or video segments together to create a single, seamless panoramic or 360-degree video. This is often used in live streaming of VR or AR events and experiences, as well as in the creation of immersive videos.
The process of stitching involves aligning and blending multiple video streams, images or video segments together to create a single, seamless image. The result is a 360-degree video that allows the viewer to look around and experience the environment in all directions. This is done by using specialized software that analyzes the different video streams, images or video segments and then aligns and blends them together.
Stitching allows for the creation of immersive video experiences and gives viewers the ability to look around and explore the environment in all directions. It is commonly used in VR, AR, and live-streamed events.
Transcoding in video streaming refers to the process of converting a video from one format, codec, or bitrate to another in order to make it compatible with different devices or playback environments. It is often used in live streaming and video-on-demand (VOD) services to create multiple versions of a video with different bitrates, resolutions, and codecs, allowing the video to be played on a wide range of devices and networks with varying capabilities.
The process of transcoding involves decoding the original video, making any desired adjustments to the video’s format, codec, or bitrate, and then re-encoding the video in the desired format. The resulting transcoded video can then be streamed or downloaded to a device for playback. Transcoding can also include adding or removing captions, watermarks, or other metadata.
Transcoding allows for a broader reach of video content to different devices and network types, allowing for better compatibility and better viewing experiences for the end-users.
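As an illustration, here is a helper that assembles an ffmpeg command line for one H.264/AAC rendition. It assumes an ffmpeg build with libx264 available, and the helper itself is hypothetical:

```python
def ffmpeg_transcode_args(src, dst, height, video_kbps, audio_kbps=128):
    """Build (but do not run) an ffmpeg invocation that decodes `src` and
    re-encodes it as H.264 video and AAC audio at the given height."""
    return [
        "ffmpeg", "-i", src,
        "-c:v", "libx264", "-b:v", f"{video_kbps}k",
        "-vf", f"scale=-2:{height}",  # -2 keeps the aspect ratio with an even width
        "-c:a", "aac", "-b:a", f"{audio_kbps}k",
        dst,
    ]
```

Running such a command once per rendition (for example 1080p, 720p, 480p) is one way to produce the versions used for adaptive delivery.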
UGC (User Generated Content)
User-generated content (UGC) in video streaming refers to video content that is created and shared by individuals, rather than professional producers or organizations. This type of content can include personal videos, live streams, vlogs, and other forms of self-expression.
UGC is a key feature of many online video platforms and social media sites, and allows users to share their own videos with a wider audience. It has become increasingly popular with the rise of smartphones and other mobile devices that make it easy to record, edit, and share videos online.
UGC can take many forms, from short personal videos to live streams and full-length films. It often reflects the interests and passions of the creator, and can be anything from funny or informative, to personal or political.
VBR (Variable Bitrate)
Variable Bitrate (VBR) in video streaming refers to a method of encoding video where the bitrate is dynamically adjusted throughout the video to maintain a consistent level of video quality. This is in contrast to constant bitrate (CBR) encoding, where the bitrate remains constant throughout the video.
When encoding a video using VBR, the encoder will adjust the bitrate based on the complexity of the video content. For example, during complex or fast-moving scenes, the encoder will increase the bitrate to ensure that the video quality remains high. During simpler or static scenes, the encoder will decrease the bitrate to reduce the file size.
This results in a video file that is smaller and more consistent in quality than a comparable CBR encode. VBR is useful when the goal is to maintain high video quality across scenes of varying complexity: the file will typically be smaller than a high-bitrate CBR video while delivering better quality than a low-bitrate one.
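The allocation idea can be sketched as distributing an average bitrate in proportion to scene complexity (the proportional model is illustrative; real rate control is far more involved):

```python
def vbr_allocate(scene_complexities, avg_kbps):
    """Give each scene a bitrate proportional to its complexity while
    keeping the mean bitrate across scenes equal to avg_kbps."""
    mean_c = sum(scene_complexities) / len(scene_complexities)
    return [avg_kbps * c / mean_c for c in scene_complexities]
```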
VP8 is a video compression format and codec developed by On2 Technologies, which was later acquired by Google. It was first released in 2010 as an open-source alternative to H.264, the dominant video codec at the time. VP8 is optimized for online video and is designed to deliver high-quality video at low bitrates.
VP8 is also part of the WebM project, which provides an open-source container format for VP8 video and Vorbis audio. The WebM format is designed for use with HTML5 video, and is supported by many major web browsers, including Chrome, Firefox, and Opera.
VP9 is a video compression format and codec developed by Google as the successor to VP8. It was first released in 2013 and is an open-source alternative to H.264 and H.265. VP9 is optimized for online video and is designed to deliver high-quality video at low bitrates, while also being more efficient in terms of computation and power usage compared to its predecessor.
VP9 is also part of the WebM project, which provides an open-source container format for VP9 video with Vorbis or Opus audio. The WebM format is designed for use with HTML5 video, and is supported by many major web browsers, including Chrome, Firefox, and Opera.
VVC (Versatile Video Coding)
VVC (Versatile Video Coding), also known as H.266, is the latest video compression standard developed by the Joint Video Experts Team (JVET) and is considered the next-generation video codec after H.265/HEVC. It aims to provide an even more efficient video compression method that can reduce the bitrate needed to achieve the same level of visual quality. The standard was ratified in 2020.
VVC builds on the efficiency of H.265/HEVC by introducing new coding tools, such as more flexible block partitioning (a quadtree with a nested multi-type tree), affine motion compensation, and an adaptive loop filter, among others. These tools allow for a more granular coding of video data, which in turn reduces the bitrate needed to maintain the same level of visual quality.
VVC also introduces several features that improve encoding quality and flexibility, such as support for higher bit depths, more efficient coding of inter-frame dependencies, and dedicated tools for screen content and 360-degree video.
It is expected that VVC will be widely used in streaming video services, reducing the overall storage and bandwidth costs for service providers and improving the video quality for end-users.
VoD (Video on Demand)
VoD (Video on Demand) is a type of streaming service that allows users to watch video content whenever they want, rather than at a specific broadcast time. With VoD, users can browse and select from a catalog of pre-recorded content, such as movies and TV shows, and start watching immediately.
VoD is often delivered via the internet and can be accessed on a variety of devices, including smartphones, tablets, smart TVs, and streaming devices like Roku and Amazon Fire TV. There are different types of VoD, including transactional (users pay for each piece of content they watch) and subscription-based (users pay a monthly fee for access to a library of content).
VoD services like Netflix, Amazon Prime Video, and Hulu have become increasingly popular in recent years, as more and more people look to cut the cord on traditional cable and satellite TV subscriptions. In addition, more and more broadcasters, television networks, and movie studios are also entering the VoD space and offering their own streaming services.
Vorbis is an open-source, royalty-free audio compression codec that is designed for efficient streaming and high-quality digital audio playback. It is similar to other popular audio codecs such as MP3 and AAC, but it is not encumbered by patent or licensing restrictions.
Vorbis is a lossy, transform-based codec built around the modified discrete cosine transform (MDCT), which allows it to achieve high compression ratios while still maintaining good audio quality. Vorbis was developed as part of the Ogg project, an open multimedia container format that is designed to provide efficient streaming of video and audio over the internet.
Vorbis has been used in streaming services like Spotify and for online radio, and it is one of the codecs commonly used for web audio via the Ogg and WebM containers (the HTML5 specification deliberately does not mandate a particular codec). Vorbis is supported by a wide range of software and hardware players, including the popular VLC media player and mobile platforms such as Android.
WebM is a video file format that is designed for web-based video playback. It is an open-source and royalty-free format that is based on the Matroska container format (MKV) and the VP8 and VP9 video codecs. WebM was developed by Google and is widely supported by modern web browsers and a variety of other software and hardware players.
One of the main advantages of WebM is that it is designed to be highly efficient, with small file sizes and fast decoding times. This makes it well-suited for streaming over the internet and for use on mobile devices. Additionally, because it is open-source and royalty-free, WebM is a more cost-effective option for companies and organizations that need to deliver video over the web.
WebM files typically have the file extension .webm and are often used for video on demand, video streaming and for video chat applications.
WebRTC (Web Real-Time Communication) is a collection of technologies and standards that allow web browsers and mobile apps to establish real-time communication sessions for voice, video, and data transfer. It enables applications such as video conferencing, file sharing, and live streaming without the need for plug-ins or additional software.
WebRTC is based on a set of open-source libraries and APIs that provide real-time communication capabilities. It allows browsers to access the microphone and camera of a device and exchange audio, video, and data streams directly between browsers or mobile apps in a peer-to-peer fashion: a signaling server is still needed to set up the connection, but the media itself does not have to pass through a central server. This allows for low latency and high-quality communication even over unreliable networks.
WebRTC mandates support for the VP8 and H.264 video codecs and the Opus and G.711 audio codecs; many implementations also support VP9 and AV1. WebRTC technologies are supported by most modern web browsers, and WebRTC can be used to create various forms of interactive video streaming and communication, such as live streaming, web conferencing, video calls, and more.
A webhook is a way for an application to provide other applications with real-time information. Essentially, a webhook is a way for one application to send a message or information to another application automatically, without the need for the receiving application to periodically check for updates.
In the context of video streaming, webhooks can be used for a variety of tasks, such as triggering events in response to a video event (e.g. video start, video end, or video encoding completed). These events can notify other systems about the status of the video and trigger actions like sending an email, creating a new entry in a database, or updating the user interface.
Webhooks are typically used in conjunction with APIs. An API allows other applications to interact with a system, while webhooks allow that system to actively notify other applications of new or updated data.
Webhooks can be extremely useful in video streaming scenarios, where they can automate tasks such as notifying someone that a video is ready, flagging a video that is too long, or alerting when a video cannot be played.
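Webhook payloads are often signed so the receiver can verify who sent them. A sketch of that common HMAC pattern using only the standard library; the event fields and shared secret are hypothetical, and real providers differ in header names and signature schemes:

```python
import hashlib
import hmac
import json

def sign_webhook(payload, secret):
    """Sender side: serialize the payload and compute an HMAC-SHA256
    signature to ship alongside the request body."""
    body = json.dumps(payload, separators=(",", ":")).encode()
    signature = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return body, signature

def verify_webhook(body, signature, secret):
    """Receiver side: recompute the signature and compare in constant time."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)
```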