Real-Time Streaming Protocol (RTSP) was one of the favorite video technologies in the streaming world before RTMP, which is no longer widely supported, and before the HTML5-based protocols that are the current breakthrough technologies in streaming.
RTSP is still one of the most preferred protocols for IP cameras. It also remains the standard in many surveillance and closed-circuit television (CCTV) architectures.
In this blog post, we will dive into RTSP in detail. You’ll find out what RTSP is and how it works. You will also be able to compare RTSP with other protocols and find the best alternative for your use case.
Table of Contents
- What is a streaming protocol?
- What is RTSP?
- RTSP Technical Specifications
- The History of RTSP Streaming
- How Does RTSP Work?
- RTSP Requests
- Alternative to RTSP for First-Mile Delivery – Ingesting
- RTSP and IP Camera
- RTSP Streaming With Ant Media
What is a streaming protocol?
So, what is a streaming protocol? A streaming protocol is a standardized method of transmitting video or audio streaming content between devices over the internet.
A video streaming protocol sends “chunks” of video or audio content from one device to another. Converting these chunks back into playable content on the player device is called “reassembly.”
For a successful process, the end device must support the protocol used by the sender. Otherwise, it will not be possible to play the broadcast.
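The chunk-and-reassemble idea can be illustrated with a minimal Python sketch. It is not tied to any real protocol: the function names, the chunk size, and the sample payload are assumptions made up for illustration, and real protocols add headers, timing, and loss handling on top of this.

```python
def split_into_chunks(payload: bytes, chunk_size: int) -> list[tuple[int, bytes]]:
    """Split a payload into (sequence_number, chunk) pairs."""
    return [
        (seq, payload[offset:offset + chunk_size])
        for seq, offset in enumerate(range(0, len(payload), chunk_size))
    ]


def reassemble(chunks: list[tuple[int, bytes]]) -> bytes:
    """Rebuild the payload by ordering the chunks by sequence number."""
    ordered = sorted(chunks, key=lambda item: item[0])
    return b"".join(chunk for _, chunk in ordered)


# Hypothetical payload standing in for encoded media.
media = b"frame-1|frame-2|frame-3|frame-4"
chunks = split_into_chunks(media, chunk_size=8)
chunks.reverse()  # simulate chunks arriving out of order
restored = reassemble(chunks)
```

The sequence numbers are what let the receiver restore the original order, which is the role that packet sequence numbers play in real streaming protocols.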
What is RTSP?
The Internet Engineering Task Force (IETF) explains RTSP like this:
“The Real-Time Streaming Protocol (RTSP) establishes and controls either a single or several time-synchronized streams of continuous media such as audio and video. It does not typically deliver the continuous streams itself, although interleaving of the continuous media stream with the control stream is possible. In other words, RTSP acts as a ‘network remote control’ for multimedia servers.”
When a user starts a video stream from an IP camera using RTSP, the device sends an RTSP request to the streaming server. After the setup between IP camera and server is completed, video and audio data can be transmitted using RTP.
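Since the media itself travels over RTP, it can help to see what an RTP packet looks like. The sketch below parses the 12-byte fixed RTP header (version, flags, payload type, sequence number, timestamp, SSRC). The class and function names are my own, and the sample packet is hand-built for illustration rather than captured from a real camera.

```python
import struct
from dataclasses import dataclass


@dataclass
class RtpHeader:
    version: int
    padding: bool
    extension: bool
    csrc_count: int
    marker: bool
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Parse the 12-byte fixed header at the start of an RTP packet."""
    if len(packet) < 12:
        raise ValueError("RTP packet is shorter than the 12-byte fixed header")
    # Network byte order: two flag bytes, 16-bit sequence, 32-bit timestamp, 32-bit SSRC.
    first, second, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return RtpHeader(
        version=first >> 6,
        padding=bool(first & 0x20),
        extension=bool(first & 0x10),
        csrc_count=first & 0x0F,
        marker=bool(second & 0x80),
        payload_type=second & 0x7F,
        sequence_number=seq,
        timestamp=timestamp,
        ssrc=ssrc,
    )


# Hand-built sample: version 2, dynamic payload type 96, hypothetical SSRC.
sample = struct.pack("!BBHII", 0x80, 96, 1, 90000, 0x1234ABCD) + b"payload"
header = parse_rtp_header(sample)
```

The sequence number and timestamp are what a player uses to reorder packets and schedule playback; RTSP itself never touches these fields.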
RTSP Technical Specifications
- Audio Codecs: AAC, AAC-LC, HE-AAC+ v1 & v2, MP3, Speex, Opus, Vorbis
- Video Codecs: H.265 (preview), H.264, VP9, VP8
- Playback Compatibility: Not widely supported and rarely used for playback (QuickTime Player and other RTSP/RTP-compliant players, VideoLAN VLC media player, 3GPP-compatible mobile devices)
- Benefits: Low-latency and ubiquitous in IP cameras
- Drawbacks: Not optimized for quality of experience and scalability
- Latency: 2 seconds
- Variant Formats: The entire stack of RTP, RTCP (RTP Control Protocol), and RTSP is often referred to as RTSP
The History of RTSP Streaming
RTSP streaming has been around for quite a long time. A partnership between RealNetworks, Netscape, and Columbia University developed and delivered the first version of the protocol in 1996-97. RTSP grew out of hands-on streaming experience with RealNetworks’ RealAudio and Netscape’s LiveMedia. Its main purpose is “VCR-like control” over media streams: the ability to play, pause, rewind, and otherwise direct the viewing experience. It was pretty cool in the late ’90s, even if it doesn’t sound interesting right now.
RTSP was standardized in 1998 as RFC 2326 and immediately became useful as a way for users to play audio and video directly from the internet without downloading the files to their device first. People really liked it!
It was built on existing standards of the time, resembling HTTP in operation (therefore easily compatible with existing HTTP networks), and was able to use SDP (Session Description Protocol) for multimedia communication sessions.
It is an application-layer protocol that communicates with a media server to create a session and send commands such as “Pause” and “Play” rather than transmitting the actual streaming data. Traditionally, most RTSP servers use RTP (Real-Time Transport Protocol) and RTCP (RTP Control Protocol) to transmit media streams.
As I said above, RTSP was once one of the leading technologies for internet audio and video streaming. Over time, HTTP-based streaming technologies and adaptive bitrate streaming solutions began to eclipse older technologies such as RTSP and RTMP (R.I.P). Original authors Anup Rao, Rob Lanphier, and others proposed RTSP version 2.0 in 2016, with updates intended to shorten round-trip communication with the media server and address some issues with network address translation (NAT).
RTSP nevertheless remains the protocol of choice for IP cameras, which are used in most surveillance, CCTV, and video conferencing systems, all of which might serve as a source for a live broadcast.
How Does RTSP Work?
RTSP is conceptually similar to HTTP in function and was easily compatible with existing HTTP networks when it was first developed.
It was described as a “network remote control” for media servers. It was designed to control the streams without downloading any files. When a video stream is started, a device using the protocol sends an RTSP request to the media server that initiates the setup process.
RTSP also supports several control request operations (also known as “commands”) such as play, pause, and setup. (I will give you some example requests in the following section.) A session typically begins with an “OPTIONS” request, through which the client learns which commands the server accepts. After that, a user can watch or stop the stream. RTSP maintains an end-to-end TCP connection for its control messages and achieves high throughput over this stable connection without requiring any local download or caching.
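To make the OPTIONS exchange concrete, here is a minimal Python sketch that parses an RTSP response and reads the list of supported commands from its Public header. The helper names are my own, and the response text is a hand-written example of what a server might return, not output from a real device.

```python
def parse_rtsp_response(raw: str) -> tuple[int, dict[str, str]]:
    """Return the status code and headers (lower-cased names) of an RTSP response."""
    head = raw.split("\r\n\r\n", 1)[0]
    status_line, *header_lines = head.split("\r\n")
    _version, code, _reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(code), headers


def supported_methods(headers: dict[str, str]) -> list[str]:
    """Extract the methods a server advertises in its Public header."""
    return [m.strip() for m in headers.get("public", "").split(",") if m.strip()]


# Hand-written example of a reply to an OPTIONS request.
response = (
    "RTSP/1.0 200 OK\r\n"
    "CSeq: 1\r\n"
    "Public: OPTIONS, DESCRIBE, SETUP, PLAY, PAUSE, TEARDOWN\r\n"
    "\r\n"
)
status, headers = parse_rtsp_response(response)
methods = supported_methods(headers)
```

Note how closely the response resembles HTTP: a status line, headers, and a blank line. That resemblance is why RTSP is usually described as HTTP-like.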
RTSP itself does not provide content encryption or retransmission of lost packets: it connects to a dedicated streaming server and relies on RTP to transmit the actual media. These limitations, along with scaling problems, led to a drop in overall RTSP usage.
RTSP Requests
When negotiating and controlling media streams, RTSP uses the following commands, which are usually sent from the client to the server:
- Options: This request determines what other types of requests the media server will accept.
- Describe: A description request identifies the URL and type of data.
- Announce: The announce method describes the presentation when sent from the client to the server and updates the description when sent from server to client.
- Setup: Setup requests specify how a media stream must be transported before a play request is sent.
- Play: A play request starts the media transmission by telling the server to start sending the data.
- Pause: Pause requests temporarily halt the stream delivery.
- Record: A record request initiates a media recording.
- Teardown: This request terminates the session entirely and stops all media streams.
- Redirect: Redirect requests inform the client that it must connect to another server by providing a new URL for the client to issue requests to.
There are also other types of RTSP requests such as “get parameter,” “set parameter,” and “embedded (interleaved) binary data.” You can find more information here.
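The requests above are plain-text messages, much like HTTP. The following Python sketch builds a typical sequence (OPTIONS, DESCRIBE, SETUP, PLAY, TEARDOWN) as raw bytes without sending anything over the network. The camera URL, the track path, the client ports, and the session ID are all hypothetical placeholders; in a real session the track URL comes from the DESCRIBE response and the session ID from the SETUP response.

```python
from typing import Optional


def build_rtsp_request(method: str, url: str, cseq: int,
                       headers: Optional[dict[str, str]] = None) -> bytes:
    """Serialize one RTSP/1.0 request: request line, CSeq, extra headers, blank line."""
    lines = [f"{method} {url} RTSP/1.0", f"CSeq: {cseq}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


CAMERA_URL = "rtsp://camera.example.com/stream1"  # hypothetical address
SESSION_ID = "12345678"  # hypothetical; normally returned by the server after SETUP

session = [
    build_rtsp_request("OPTIONS", CAMERA_URL, 1),
    build_rtsp_request("DESCRIBE", CAMERA_URL, 2, {"Accept": "application/sdp"}),
    build_rtsp_request("SETUP", CAMERA_URL + "/trackID=0", 3,
                       {"Transport": "RTP/AVP;unicast;client_port=5000-5001"}),
    build_rtsp_request("PLAY", CAMERA_URL, 4,
                       {"Session": SESSION_ID, "Range": "npt=0-"}),
    build_rtsp_request("TEARDOWN", CAMERA_URL, 5, {"Session": SESSION_ID}),
]

for request in session:
    print(request.decode("ascii"))
```

Each request carries an increasing CSeq number so that the client can match responses to requests.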
Alternative to RTSP for First-Mile Delivery – Ingesting
Now let’s switch gears and look at the protocols that can be alternatives to RTSP. The critical point here is that each protocol has its own purpose, features, and way of working, so the “best streaming protocol” depends entirely on the usage scenario. After this section, you will be able to choose the best alternative to RTSP for your needs and use case.
RTSP vs RTMP
RTMP stands for Real-Time Messaging Protocol. This TCP-based technology was developed by Macromedia for streaming audio, video, and data over the internet between a Flash player and a server. Macromedia was purchased by its rival Adobe on December 3, 2005. RTMP was once the most popular live-streaming protocol. Today it can still be used for first-mile delivery (ingest) but is no longer a practical choice for last-mile delivery (playback).
RTMP Streaming Protocol Technical Specifications
- Audio Codecs: AAC, AAC-LC, HE-AAC+ v1 & v2, MP3, Speex
- Video Codecs: H.264, VP8, VP6, Sorenson Spark®, Screen Video v1 & v2
- Playback Compatibility: Not widely supported anymore. Limited to Flash Player, Adobe AIR, and RTMP-compatible players; no longer accepted by iOS, Android, most browsers, and most embeddable players
- Benefits: Low-latency and minimal buffering
- Drawbacks: Not optimized for quality of experience or scalability
- Latency: 5 seconds
- Variant Formats: RTMPT (tunneled through HTTP), RTMPE (encrypted), RTMPTE (tunneled and encrypted), RTMPS (encrypted over SSL), RTMFP (layered over UDP instead of TCP)
RTSP vs WebRTC
WebRTC stands for Web Real-Time Communication. It is a powerful and highly disruptive streaming technology.
WebRTC is HTML5 compatible, and you can use it to add real-time media communication directly between browsers and devices, without installing any plugins in the browser. WebRTC is supported by all major modern browsers, including Safari, Google Chrome, Firefox, and Opera.
Thanks to WebRTC video streaming, you can embed real-time video directly into your browser-based solution and create an engaging, interactive streaming experience for your audience without worrying about delay. WebRTC video streaming is changing the way audiences engage.
- Ultra-Low Latency Video Streaming – Latency is 0.5 seconds
- Platform and device independence
- Advanced voice and video quality
- Secure voice and video
- Easy to scale
- Adaptive to network conditions
- WebRTC Data Channels
RTSP vs HLS
HLS stands for HTTP Live Streaming. HLS is an adaptive HTTP-based protocol used for transporting video and audio data/content from media servers to the end-user’s device.
HLS was created by Apple in 2009. Apple announced HLS at about the same time as the legendary iPhone 3GS. Earlier iPhone generations had live streaming playback problems, and Apple wanted to fix them with HLS.
Features of HLS video streaming protocol
- Closed captions
- Fast forward and rewind
- Alternate audio and video
- Fallback alternatives
- Timed metadata
- Ad insertion
- Content protection
HLS Technical Specifications
- Audio Codecs: AAC-LC, HE-AAC+ v1 & v2, xHE-AAC, Apple Lossless, FLAC
- Video Codecs: H.265, H.264
- Playback Compatibility: Originally created for iOS devices, HLS is now supported by Google Chrome browsers; Android, Linux, Microsoft, and macOS devices; several set-top boxes, smart TVs, and other players. It is now effectively a universal protocol.
- Benefits: Supports adaptive bitrate, reliable, and widely supported.
- Drawbacks: Video quality and viewer experience are prioritized over latency.
- Latency: Standard HLS typically has 5-20 seconds of latency, but the Low-Latency HLS extension, now incorporated as a feature set of HLS, promises sub-2-second latency.
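Adaptive bitrate in HLS works by listing several renditions of the same stream in a master playlist and letting the player pick one that fits the available bandwidth. The Python sketch below shows that selection step in a simplified form. The playlist contents, the rendition paths, and the selection rule (highest bandwidth that fits, otherwise the lowest) are assumptions for illustration; real players use more elaborate heuristics.

```python
import re

# Match the BANDWIDTH attribute but not AVERAGE-BANDWIDTH.
BANDWIDTH_RE = re.compile(r"(?:^|,)BANDWIDTH=(\d+)")

# Hypothetical master playlist with three renditions.
MASTER_PLAYLIST = """#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=800000,RESOLUTION=640x360
low/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=2500000,RESOLUTION=1280x720
mid/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=5000000,RESOLUTION=1920x1080
high/index.m3u8
"""


def parse_variants(master_playlist: str) -> list[tuple[int, str]]:
    """Return (bandwidth_bps, uri) pairs from a master playlist."""
    variants = []
    pending_bandwidth = None
    for line in master_playlist.splitlines():
        line = line.strip()
        if line.startswith("#EXT-X-STREAM-INF:"):
            match = BANDWIDTH_RE.search(line.split(":", 1)[1])
            pending_bandwidth = int(match.group(1)) if match else None
        elif line and not line.startswith("#") and pending_bandwidth is not None:
            # The first URI line after a STREAM-INF tag belongs to that variant.
            variants.append((pending_bandwidth, line))
            pending_bandwidth = None
    return variants


def pick_variant(variants: list[tuple[int, str]], available_bps: int) -> tuple[int, str]:
    """Pick the highest-bandwidth variant that fits, or the lowest one as a fallback."""
    affordable = [v for v in variants if v[0] <= available_bps]
    if affordable:
        return max(affordable, key=lambda v: v[0])
    return min(variants, key=lambda v: v[0])


variants = parse_variants(MASTER_PLAYLIST)
choice = pick_variant(variants, available_bps=3_000_000)
```

With roughly 3 Mbps available, this rule selects the 720p rendition; on a much slower link it falls back to the lowest one instead of stalling.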
RTSP vs CMAF
Common Media Application Format (CMAF) is a relatively new format that simplifies the delivery of HTTP-based streaming media. It is an emerging standard that helps reduce cost and complexity and provides latency of around 3-5 seconds.
As RTMP declined, HTTP-based technologies for adaptive bitrate streaming emerged. However, different streaming standards require different file containers. For example, MPEG-DASH uses .mp4 containers, while HLS streams are traditionally delivered in .ts format.
Therefore, every broadcaster who wants to reach a wider audience must encode and store the same video file twice, because each container format produces a completely different set of files.
These two versions of the same video stream must be created either in advance or on the fly. Both approaches incur additional storage and processing costs.
Apple and Microsoft proposed that the Moving Picture Experts Group (MPEG) create a new uniform standard called Common Media Application Format (CMAF) to reduce complexity when transmitting video online.
Let’s look at what Akamai said about this:
“These same files, although representing the same content, cost twice as much to package, twice as much to store on origin, and compete with each other on Akamai edge caches for space, thereby reducing the efficiency with which they can be delivered.”
The importance of CMAF comes into play here. As a standard streaming format across all platforms, it allows a single approach to encoding, packaging, and storage. Hence, Common Media Application Format makes the video streaming process much cheaper and less complicated.
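To put rough numbers on the duplication cost, here is a small back-of-the-envelope calculation in Python. The two-hour duration and the three-step bitrate ladder are hypothetical values chosen for illustration, and the estimate ignores container overhead, so treat the result as an order of magnitude rather than a measurement.

```python
def storage_gb(duration_hours: float, bitrates_mbps: list[float], packagings: int) -> float:
    """Estimate storage in GB for a bitrate ladder stored in N packaging formats."""
    seconds = duration_hours * 3600
    # Mbps * seconds gives megabits; / 8 gives megabytes; / 1000 gives gigabytes.
    per_packaging = sum(rate * seconds / 8 / 1000 for rate in bitrates_mbps)
    return per_packaging * packagings


ladder_mbps = [1.0, 2.5, 5.0]  # hypothetical renditions
separate_formats = storage_gb(2, ladder_mbps, packagings=2)  # e.g. .ts and .mp4 copies
single_cmaf = storage_gb(2, ladder_mbps, packagings=1)

print(f"Two packagings: {separate_formats:.2f} GB")
print(f"One CMAF packaging: {single_cmaf:.2f} GB")
```

Whatever the actual numbers are, the ratio stays the same: storing one set of files instead of two halves the origin storage for the same content.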
Advantages of CMAF Streaming
CMAF streaming technology is one of the easiest ways to reduce both the latency and the complexity of streaming. CMAF streaming helps with:
- Cutting costs
- Minimizing workflow complexity
- Reducing latency
RTSP and IP Camera
Most IP cameras use RTSP to deliver their video to the media server. IP cameras are used especially for surveillance, but they also work great when you want to live stream from a fixed location. One of the great things about IP cameras is that they don’t need an extra encoder. When pairing IP cameras with a server, RTSP easily does the job for you.
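An IP camera is usually addressed with an rtsp:// URL. The Python sketch below uses the standard library to split such a URL into the pieces a client needs to connect. The address, credentials, and stream path are hypothetical, since every camera vendor uses its own path scheme; port 554 is the standard default RTSP port.

```python
from urllib.parse import urlparse

DEFAULT_RTSP_PORT = 554


def describe_camera_url(url: str) -> dict:
    """Split an rtsp:// URL into host, port, username, and stream path."""
    parts = urlparse(url)
    if parts.scheme != "rtsp":
        raise ValueError(f"not an RTSP URL: {url}")
    return {
        "host": parts.hostname,
        "port": parts.port or DEFAULT_RTSP_PORT,  # fall back to the default port
        "username": parts.username,
        "path": parts.path or "/",
    }


# Hypothetical camera on a local network.
camera = describe_camera_url("rtsp://admin:secret@192.168.1.10/stream1")
```

A media server or player would open a TCP connection to the resulting host and port and then start the RTSP request sequence against the stream path.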
RTSP Streaming With Ant Media
Ant Media provides ready-to-use, highly scalable real-time video streaming solutions for live video streaming needs. Based on customer requirements and preferences, it enables a live video streaming solution to be deployed easily and quickly on-premises or on public cloud networks such as Alibaba Cloud, AWS, and Azure.
Ant Media Server supports most of the common media streaming protocols like RTMP, HLS, DASH, WebRTC, and of course RTSP. Actually, Ant Media Server is one of the best media servers available in the market that can serve different streaming needs. Ant Media Server provides all of the features listed above.