WebVTT-based thumbnails

Overview

This format consists of WebVTT files that reference JPEG/PNG images or sprites and map each referenced image, or region of a sprite, to a specific time range in the media. The format is independent of the streaming protocol (HLS or DASH).

Specification

WebVTT is a text-based W3C format used with HTML5 video, most commonly for subtitles and closed captions. WebVTT-based thumbnails are a de facto standard built on WebVTT time-aligned metadata and the spatial dimension of media fragments (the media being JPEG/PNG thumbnail files or sprites).

A thumbnail VTT file starts with the text WEBVTT, followed by a list of time-aligned thumbnail items (cues), each including:

  • The playback time range for the thumbnail. Both timestamps need to be in HH:MM:SS.MMM format (hours, minutes, seconds, milliseconds).
  • The URL of the thumbnail image for this time range. The URL can be absolute, relative, or root-relative; a relative URL is resolved against the location of the VTT file. Below is an example of a VTT file using single thumbnail images.
WEBVTT

00:00:00.000 --> 00:00:05.000
thumbnails/128p/single/thumbnail_01.jpg

00:00:05.000 --> 00:00:10.000
thumbnails/128p/single/thumbnail_02.jpg

00:00:10.000 --> 00:00:15.000
thumbnails/128p/single/thumbnail_03.jpg
  • Or, optimally, the URL of a thumbnail sprite (multiple thumbnails stitched together into a single image), with each individual thumbnail referenced by appending its coordinates to the sprite URL as a spatial media fragment (#xywh). Below is an example of a VTT file using a thumbnail sprite.
WEBVTT

00:00:00.000 --> 00:00:05.000
f08e80da-bf1d-4e3d-8899-f0f6155f6efa.jpg#xywh=0,0,120,67

00:00:05.000 --> 00:00:10.000
f08e80da-bf1d-4e3d-8899-f0f6155f6efa.jpg#xywh=120,0,120,67

00:00:10.000 --> 00:00:15.000
f08e80da-bf1d-4e3d-8899-f0f6155f6efa.jpg#xywh=240,0,120,67
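
The file structure described above can be read with a few lines of code. The following is a minimal sketch, not the parser used by any player SDK: the names ThumbnailCue and parse_thumbnail_vtt are hypothetical, and the host cdn.example.com in the usage note is a placeholder. It assumes timestamps always include the hours field, and it tolerates the image URL appearing either on the line after the time range or on the same line.

```python
import re
from dataclasses import dataclass
from urllib.parse import urljoin

# "HH:MM:SS.mmm --> HH:MM:SS.mmm", optionally followed by text on the same line.
TIMING = re.compile(
    r"^(\d{2,}):(\d{2}):(\d{2})\.(\d{3})\s+-->\s+"
    r"(\d{2,}):(\d{2}):(\d{2})\.(\d{3})\s*(.*)$"
)


@dataclass
class ThumbnailCue:  # hypothetical name, for illustration only
    start: float  # seconds
    end: float    # seconds
    url: str      # resolved URL; may still carry a #xywh fragment


def to_seconds(hours, minutes, seconds, millis):
    """Convert the four timestamp fields to seconds."""
    return int(hours) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000


def parse_thumbnail_vtt(text, vtt_url):
    """Parse thumbnail VTT text; relative image URLs are resolved against vtt_url."""
    lines = [line.strip() for line in text.lstrip("\ufeff").splitlines()]
    if not lines or not lines[0].startswith("WEBVTT"):
        raise ValueError("missing WEBVTT header")
    cues = []
    i = 1
    while i < len(lines):
        match = TIMING.match(lines[i])
        i += 1
        if match is None:
            continue  # blank lines, cue identifiers, notes
        fields = match.groups()
        payload = fields[8]
        # The image URL normally sits on the line after the time range.
        if not payload and i < len(lines) and not TIMING.match(lines[i]):
            payload = lines[i]
            i += 1
        if payload:
            cues.append(
                ThumbnailCue(
                    start=to_seconds(*fields[0:4]),
                    end=to_seconds(*fields[4:8]),
                    url=urljoin(vtt_url, payload),
                )
            )
    return cues
```

Calling parse_thumbnail_vtt(text, "https://cdn.example.com/video/thumbs.vtt") on the single-image example would resolve the first entry to https://cdn.example.com/video/thumbnails/128p/single/thumbnail_01.jpg, while absolute URLs pass through unchanged.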

In the sprite example above, the coordinates #xywh=0,0,120,67 select the thumbnail at the top-left corner of the sprite, 120 pixels wide and 67 pixels high. The #xywh values represent the following:

x : x-axis coordinate of the thumbnail's top-left corner, in pixels
y : y-axis coordinate of the thumbnail's top-left corner, in pixels
w : Width in pixels
h : Height in pixels
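
To illustrate how these four values are used, here is a minimal sketch that splits a sprite URL into the image URL and its pixel region. The names Region, split_sprite_url and crop_box are hypothetical. The sketch assumes pixel units (with or without the optional pixel: prefix) and rejects anything else.

```python
from typing import NamedTuple, Optional, Tuple
from urllib.parse import urldefrag


class Region(NamedTuple):  # hypothetical name, for illustration only
    x: int
    y: int
    width: int
    height: int


def split_sprite_url(url: str) -> Tuple[str, Optional[Region]]:
    """Return (image_url, region); region is None without an #xywh fragment."""
    image_url, fragment = urldefrag(url)
    if not fragment.startswith("xywh="):
        return url, None
    value = fragment[len("xywh="):]
    if value.startswith("pixel:"):
        value = value[len("pixel:"):]
    parts = value.split(",")
    if len(parts) != 4:
        raise ValueError(f"expected four xywh values, got {fragment!r}")
    x, y, width, height = (int(part) for part in parts)  # non-integers raise ValueError
    return image_url, Region(x, y, width, height)


def crop_box(region: Region) -> Tuple[int, int, int, int]:
    """Convert x/y/width/height to a (left, top, right, bottom) rectangle."""
    return (region.x, region.y, region.x + region.width, region.y + region.height)
```

For the second entry of the sprite example, the region is x=120, y=0, width=120, height=67, which corresponds to the rectangle (120, 0, 240, 67) in the sprite image. A UI would draw only that rectangle, for instance by offsetting the sprite inside a 120 by 67 pixel container.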

Thumbnail Sprite example

Playback Support

This format is independent of the streaming protocol (HLS/DASH), which means the VTT URL is not embedded in the DASH MPD file or the HLS M3U8 playlist files. Instead, the thumbnail VTT file URL is provided to the player SDK through the source configuration API of the respective player SDK platform.

The player parses the VTT file, downloads the images, and exposes the image URL and timing information via the getThumbnail API. The default Bitmovin player UI uses this API to render the thumbnail image on the progress timeline when the user seeks or scrubs forward and backward. For a custom UI, applications can use this API to get the thumbnail information and render the thumbnails themselves. The API on the respective SDK platform is listed below.
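
For a custom UI, the core of rendering a thumbnail while scrubbing is finding the item whose time range contains the scrub position. The sketch below shows one way to do that lookup with a binary search. It is not the player's getThumbnail implementation; the names Cue and ThumbnailTrack are hypothetical, and it assumes the time ranges do not overlap.

```python
from bisect import bisect_right
from typing import List, NamedTuple, Optional


class Cue(NamedTuple):  # hypothetical name, for illustration only
    start: float  # seconds
    end: float    # seconds
    url: str


class ThumbnailTrack:
    """Looks up the thumbnail cue for a playback position."""

    def __init__(self, cues: List[Cue]):
        # Sort once so each lookup can use binary search.
        self._cues = sorted(cues, key=lambda cue: cue.start)
        self._starts = [cue.start for cue in self._cues]

    def find(self, position: float) -> Optional[Cue]:
        """Return the cue with start <= position < end, or None."""
        index = bisect_right(self._starts, position) - 1
        if index < 0:
            return None
        cue = self._cues[index]
        return cue if position < cue.end else None


track = ThumbnailTrack([
    Cue(0.0, 5.0, "thumbnails/128p/single/thumbnail_01.jpg"),
    Cue(5.0, 10.0, "thumbnails/128p/single/thumbnail_02.jpg"),
    Cue(10.0, 15.0, "thumbnails/128p/single/thumbnail_03.jpg"),
])
print(track.find(7.2).url)  # thumbnails/128p/single/thumbnail_02.jpg
```

Treating each range as start-inclusive and end-exclusive means a position of exactly 5.0 seconds maps to the second thumbnail rather than matching two items.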

Demo/Sample

Thumbnail Seeking Sample

Sample WebVTT Thumbnail