This technical article describes the essential components and features of a commercial OTT solution. While focusing on the OTT Client, it also covers the challenges involved in designing, implementing and running a full-featured client.
The television industry has so far been dominated by providers offering TV in “fixed” mode, wherein users had to subscribe to a cable TV provider (Comcast, AT&T) or a dish TV provider (DISH, DirecTV). In their basic form, these solutions deliver content to the “fixed” place where the subscription is taken (barring some recent possibilities of “place-shifting”). In this model, internet and TV are also bundled together, as they come from the same vendor.
A commercial OTT solution, on the other hand, offers TV content anywhere and on multiple devices over the regular public internet. This creates enormous opportunities now that internet access is widely available and mobile devices (tablets and phones) are increasingly being used to watch serious TV. This model gives a more flexible viewing option to subscribers who are mobile or are unwilling to take on the overhead of a “fixed” TV service.
The following figure depicts various components of an OTT System.
OTT Clients, the most important part of the OTT system, stream video content from Servers. Typical commercial-grade clients are supported on multiple platforms and devices, ranging from TVs and set-top boxes to handheld devices and gaming consoles. These clients also host the techniques and algorithms needed to stream effectively over the public internet.
Encoders are another significant part of the OTT system. They encode content in a format understood by OTT Clients. In most cases, encode time and encode quality are the two goals that encoders try to balance. For live streams in particular, encode time becomes highly critical.
Encoders also generate certain information about content being encoded (like Audio/Video codecs supported, size hints, encryption information etc). OTT clients make use of this information to correctly decode and decrypt the content being played.
CDNs (Content Delivery Networks) provide caching services that make data available to OTT Clients efficiently. Using CDNs reduces the load on Content Servers and helps scale the OTT service.
Content Management Servers maintain Schedule and Program information for the content made available to OTT Clients. As a simple example, program schedule for a live channel can be maintained by Content Management Servers. OTT Clients make use of this information to present a program schedule (EPG) to the end user and also determine the Live position of a program while playing Live TV.
User Management Servers are used for user authorization and can allow or disallow a set of services or content based on user subscriptions.
DRM Servers host key servers which are used by Encoders/OTT Client to encrypt and decrypt the content streams, respectively.
Analytics Servers collect data from OTT Clients that can be used to monitor the quality and performance of service. This data can also be subsequently used to improve the OTT service. Data related to viewership (time spent on various channels etc) can also be used as input for Business intelligence.
The following figure depicts various components of an OTT Client.
The Application part of an OTT Client primarily consists of the User Interface, login-based User Authentication, and the EPG (Electronic Program Guide) that users interact with.
These components are responsible for presenting access to only authorized content to the user as per the subscription.
The EPG is an important part of the OTT Client: it not only gives the end user structured access to content, but also imposes a time schedule and structure onto an otherwise continuous stream of video produced by the encoder. This is especially true for live channels. For example, ESPN gives a video feed to encoders 24 x 7, 365 days a year. The content itself is hard for the end user to interpret and use unless a time schedule is imposed on this ongoing stream. The EPG module does this work based on information provided by the Content Management Server. Essentially, the EPG translates an ongoing stream into discrete shows.
The Player provides an interface to the Application for setting various configuration parameters and also indicates key state changes to the Application.
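As a minimal sketch of how a schedule turns a continuous stream into discrete shows, the snippet below looks up the program airing at a given time and its Live position. The `Program` fields, the time unit (seconds) and the sample schedule are assumptions made for illustration; they are not the format used by any particular Content Management Server.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Program:
    title: str
    start: int      # start time in seconds (assumed unit)
    duration: int   # length in seconds


def program_at(schedule: List[Program], now: int) -> Optional[Program]:
    """Return the program airing at `now`, or None if nothing is scheduled."""
    starts = [p.start for p in schedule]      # schedule must be sorted by start
    idx = bisect_right(starts, now) - 1       # last program starting at or before now
    if idx < 0:
        return None
    prog = schedule[idx]
    return prog if now < prog.start + prog.duration else None


def live_offset(prog: Program, now: int) -> int:
    """Seconds elapsed since the program started (the Live position)."""
    return now - prog.start


# Hypothetical schedule for one live channel.
schedule = [
    Program("Morning Show", 0, 3600),
    Program("Match Highlights", 3600, 1800),
    Program("Evening News", 5400, 1800),
]
current = program_at(schedule, 4000)
```

Here `current` is the "Match Highlights" entry and `live_offset(current, 4000)` gives the position within that show at which Live playback would begin.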
Typical interfaces provided by Player:
Typical information indicated by Player to Application
OTT Player is the core engine responsible for downloading, decrypting and decoding the video content.
Schedule Handler is responsible for maintaining and triggering the right video clip to play as per channel, current time (in case of Live) or User selected clip (in case of time shifted content). It instantiates Metadata Handler and Content Downloader to start getting relevant metadata and video stream from Servers. It also maintains the overall player state machine.
This module is also responsible for dynamic ad replacement as marked by Schedule Metadata.
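The player state machine mentioned above could be sketched as follows. The article does not enumerate the states, so the state names, the allowed transitions and the `on_change` callback (standing in for the notification the Player sends to the Application) are all hypothetical.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    LOADING = auto()     # fetching metadata and the first streamlets
    PLAYING = auto()
    PAUSED = auto()
    STALLED = auto()     # buffer ran dry, waiting for more content
    AD_BREAK = auto()    # playing a dynamically replaced ad


# Allowed transitions (hypothetical; a real player will have more).
TRANSITIONS = {
    State.IDLE: {State.LOADING},
    State.LOADING: {State.PLAYING, State.IDLE},
    State.PLAYING: {State.PAUSED, State.STALLED, State.AD_BREAK, State.IDLE},
    State.PAUSED: {State.PLAYING, State.IDLE},
    State.STALLED: {State.PLAYING, State.IDLE},
    State.AD_BREAK: {State.PLAYING, State.IDLE},
}


class PlayerStateMachine:
    def __init__(self, on_change=None):
        self.state = State.IDLE
        self._on_change = on_change   # callback used to notify the Application

    def transition(self, new_state: State) -> bool:
        """Move to new_state if the transition is allowed; return True on success."""
        if new_state not in TRANSITIONS[self.state]:
            return False
        old, self.state = self.state, new_state
        if self._on_change:
            self._on_change(old, new_state)
        return True


events = []
sm = PlayerStateMachine(on_change=lambda old, new: events.append((old.name, new.name)))
sm.transition(State.LOADING)
sm.transition(State.PLAYING)
sm.transition(State.LOADING)   # rejected: not an allowed transition from PLAYING
```

Keeping the transition table as data makes it easy to reject invalid requests in one place and to report every accepted change to the Application.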
Metadata handler is responsible for getting the relevant Channel/Program metadata and Stream Metadata from Servers. In case of Live stream, this metadata needs to be refreshed periodically as stream is dynamically being encoded and made available at the Server.
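A minimal sketch of the periodic refresh, assuming the metadata is fetched through an injected `fetch` callable and that a fixed refresh interval is acceptable. The class name, the 4-second default and the `latest_streamlet` field are illustrative assumptions, not values from any real service.

```python
from typing import Callable, Dict, Optional


class MetadataHandler:
    """Caches stream metadata and refreshes it periodically for Live streams."""

    def __init__(self, fetch: Callable[[], Dict], is_live: bool,
                 refresh_interval: float = 4.0):
        self._fetch = fetch                  # hypothetical call to the Server
        self._is_live = is_live
        self._interval = refresh_interval    # seconds; assumed value
        self._cached: Optional[Dict] = None
        self._fetched_at = 0.0

    def get(self, now: float) -> Dict:
        """Return metadata, refetching if it is missing or stale (Live only)."""
        stale = self._is_live and (now - self._fetched_at) >= self._interval
        if self._cached is None or stale:
            self._cached = self._fetch()
            self._fetched_at = now
        return self._cached


calls = []


def fake_fetch() -> Dict:
    """Stand-in for a Server request; counts how often it is called."""
    calls.append(1)
    return {"latest_streamlet": len(calls)}


live = MetadataHandler(fake_fetch, is_live=True, refresh_interval=4.0)
live.get(now=0.0)   # first fetch
live.get(now=2.0)   # still fresh, served from cache
live.get(now=5.0)   # stale, refetched
```

Passing the current time in explicitly keeps the sketch deterministic. A real handler would more likely run on a timer and handle fetch failures.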
Channel Metadata has following typical information:
Stream Metadata has following typical information:
The Network Measurements module measures network-related information and provides it to the Content Downloader. The Content Downloader relies on this information to choose the best algorithms and to download the best possible profile (bitrate) under the given network conditions. Typical information provided as part of Network Measurements includes download bandwidth estimation and connect and request latency.
The Content Downloader, the most critical part of the Player, determines video playback quality and hence the user experience. Its goal is to download the best (highest) possible bitrate video without stalling. If the algorithms are too optimistic, the video may stall because there is not enough content to play in the given time. If the algorithms are too pessimistic, a low-bitrate video is played, spoiling the user experience. Fine-tuning the Content Downloader is therefore the most challenging part of the Player. It uses information provided by Network Measurements, together with heuristics, to pick the best bitrate streamlet at runtime, a technique popularly called Adaptive Streaming.
The Content Decryptor is responsible for decrypting the downloaded video content and passing it to the Device Integration layer for rendering. On certain platforms, decryption may be deferred to the rendering mechanism, such as the HLS Player on iOS and Samsung. In such cases the Content Decryptor passes relevant information, like the Control Word and IV (Initialization Vector), to the HLS Player, which uses it to decrypt the content before displaying it.
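One common way to produce a download bandwidth estimate is an exponentially weighted moving average over completed downloads. The sketch below assumes that approach. The article does not say which estimator is used, and the class name and the smoothing factor `alpha` are illustrative choices.

```python
from typing import Optional


class BandwidthEstimator:
    """Exponentially weighted moving average of download throughput."""

    def __init__(self, alpha: float = 0.3):
        self.alpha = alpha                            # weight of the newest sample (assumed)
        self.estimate_bps: Optional[float] = None     # no estimate until the first sample

    def add_sample(self, num_bytes: int, seconds: float) -> float:
        """Record one completed download and return the updated estimate."""
        if seconds <= 0:
            raise ValueError("download duration must be positive")
        sample = num_bytes * 8 / seconds              # bits per second
        if self.estimate_bps is None:
            self.estimate_bps = sample
        else:
            self.estimate_bps = (self.alpha * sample
                                 + (1 - self.alpha) * self.estimate_bps)
        return self.estimate_bps


est = BandwidthEstimator(alpha=0.5)
est.add_sample(1_000_000, 2.0)   # 1 MB in 2 s
est.add_sample(1_000_000, 4.0)   # the next download is slower
```

A larger `alpha` reacts faster to changes in the network but is noisier, while a smaller one is smoother but slower to notice a drop in bandwidth.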
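The trade-off between optimistic and pessimistic selection can be illustrated with a simple heuristic: pick the highest profile that fits within a safety margin of the estimated bandwidth, and be more conservative when the playback buffer is low. This is only a sketch. The function name, the `safety` and `low_buffer` thresholds and the sample profile ladder are assumptions, and production players use considerably more elaborate logic.

```python
from typing import List


def select_bitrate(profiles_bps: List[int], bandwidth_bps: float,
                   buffer_seconds: float, safety: float = 0.8,
                   low_buffer: float = 5.0) -> int:
    """Return the bitrate (bits per second) of the profile to download next."""
    budget = bandwidth_bps * safety       # leave headroom for bandwidth variation
    if buffer_seconds < low_buffer:
        budget *= 0.5                     # extra caution when close to stalling
    candidates = [p for p in sorted(profiles_bps) if p <= budget]
    # Fall back to the lowest profile if nothing fits the budget.
    return candidates[-1] if candidates else min(profiles_bps)


# Hypothetical profile ladder, in bits per second.
profiles = [400_000, 1_200_000, 2_500_000, 5_000_000]
healthy = select_bitrate(profiles, bandwidth_bps=4_000_000, buffer_seconds=20.0)
draining = select_bitrate(profiles, bandwidth_bps=4_000_000, buffer_seconds=2.0)
poor = select_bitrate(profiles, bandwidth_bps=300_000, buffer_seconds=20.0)
```

With the same estimated bandwidth, the nearly empty buffer in the second call leads to a lower profile than the first, which is the kind of behavior that helps avoid a stall.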
The Device Abstraction Layer (DAL), which is typically different for each platform, is responsible for the actual rendering of video on that platform. Some platforms, like iOS and Samsung, may play an MPEG Transport Stream as is, while others, like Android and PC, may need audio and video frames separated before they are rendered. The DAL is responsible for converting downloaded video to a format suitable for rendering on the target device.
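As a rough illustration of the first step in separating audio from video, the sketch below groups MPEG Transport Stream packets by PID (188-byte packets, sync byte 0x47, 13-bit PID in the header). It stops short of reassembling frames, which a real DAL would also have to do. The PID values and the `make_packet` helper are hypothetical and exist only to build demo data. In a real stream the audio and video PIDs are discovered from the program tables.

```python
from collections import defaultdict
from typing import Dict, List

TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47


def split_by_pid(ts_data: bytes) -> Dict[int, List[bytes]]:
    """Group 188-byte transport stream packets by their 13-bit PID."""
    streams: Dict[int, List[bytes]] = defaultdict(list)
    for off in range(0, len(ts_data) - TS_PACKET_SIZE + 1, TS_PACKET_SIZE):
        pkt = ts_data[off:off + TS_PACKET_SIZE]
        if pkt[0] != SYNC_BYTE:
            raise ValueError(f"lost sync at offset {off}")
        pid = ((pkt[1] & 0x1F) << 8) | pkt[2]   # low 5 bits of byte 1 + byte 2
        streams[pid].append(pkt)
    return dict(streams)


def make_packet(pid: int) -> bytes:
    """Build a dummy packet for the demo (payload is all zeros)."""
    header = bytes([SYNC_BYTE, (pid >> 8) & 0x1F, pid & 0xFF, 0x10])
    return header + bytes(TS_PACKET_SIZE - len(header))


VIDEO_PID, AUDIO_PID = 0x100, 0x101   # hypothetical PIDs
data = make_packet(VIDEO_PID) + make_packet(AUDIO_PID) + make_packet(VIDEO_PID)
streams = split_by_pid(data)
```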
Some popular rendering mechanisms are:
The video clock that drives the player is typically derived from the renderer, which makes the DAL a highly time-critical module.
The Stats Poster is responsible for reporting various statistics to Analytics Servers. Stats can be used for:
This figure illustrates typical interactions during initialization between OTT Client and Middleware/Backend.
The following figure illustrates typical interactions between OTT Client and Middleware/Backend while playing VoD content.
The following figure illustrates typical interactions between OTT Client and Middleware/Backend while playing Live content. The basic interactions are the same as when playing On Demand content, but Channel and Stream information/metadata has to be refreshed periodically, as content is encoded on the backend in real time.