The Real-time Transport Protocol (RTP) is a network protocol for delivering audio and video over IP networks. RTP is widely used in communication and entertainment systems that involve streaming media, such as telephony, video teleconferencing applications including WebRTC, television services, and web-based push-to-talk features.
RTP usually runs over the User Datagram Protocol (UDP). RTP is used in conjunction with the RTP Control Protocol (RTCP). While RTP carries the media streams (e.g., audio and video), RTCP is used to monitor transmission statistics and quality of service (QoS) and helps synchronize multiple streams. RTP is one of the technical foundations of Voice over IP and in this context is often used in conjunction with a signaling protocol such as the Session Initiation Protocol (SIP), which establishes connections across the network.
RTP was developed by the Audio-Video Transport Working Group of the Internet Engineering Task Force (IETF). It was first published in 1996 as RFC 1889, which was superseded by RFC 3550 in 2003.
Overview
RTP is designed for end-to-end, real-time transfer of streaming media. The protocol provides facilities for jitter compensation and for detection of packet loss and out-of-order delivery, both of which are common during transmission on an IP network. RTP allows data transfer to multiple destinations through IP multicast. RTP is regarded as the primary standard for audio/video transport in IP networks and is used with an associated profile and payload format. The design of RTP is based on an architectural principle known as application-layer framing, in which protocol functions are implemented in the application rather than in the operating system's protocol stack.
Real-time multimedia streaming applications require timely delivery of information and can often tolerate some packet loss to achieve this goal. For example, the loss of a packet in an audio application may result in the loss of a fraction of a second of audio data, which can be made less noticeable with suitable error concealment algorithms. The Transmission Control Protocol (TCP), although standardized for RTP use, is not normally used in RTP applications because TCP favors reliability over timeliness. Instead, the majority of RTP implementations are built on the User Datagram Protocol (UDP). Other transport protocols specifically designed for multimedia sessions are SCTP and DCCP, although, as of 2012, they were not in widespread use.
RTP was developed by the Audio/Video Transport working group of the IETF standards organization. RTP is used in conjunction with other protocols such as H.323 and RTSP. The RTP standard defines a pair of protocols: RTP and RTCP. RTP is used for the transfer of multimedia data, and RTCP is used to periodically send control information and QoS parameters.
The RTP specification describes two sub-protocols, RTP and RTCP. The data transfer protocol, RTP, facilitates the transfer of real-time data. The information provided by this protocol includes timestamps (for synchronization), sequence numbers (for detecting packet loss and reordering), and the payload format, which indicates the encoded format of the data. The control protocol, RTCP, is used for quality of service (QoS) feedback and for synchronization between media streams. The bandwidth of RTCP traffic is small compared to that of RTP, typically around 5%.
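As an illustration of how a receiver might use sequence numbers to detect loss and reordering, the following is a minimal sketch, not a reference implementation. The function names `seq_delta` and `classify` are hypothetical, and the sketch assumes the 16-bit sequence number space described in the packet header section, so it has to handle wraparound from 65535 to 0.

```python
def seq_delta(prev: int, cur: int) -> int:
    """Signed distance from prev to cur in the 16-bit sequence space.

    A small positive result means cur is ahead of prev, even across the
    wraparound from 65535 to 0; a negative result means cur is behind.
    """
    return ((cur - prev + 0x8000) & 0xFFFF) - 0x8000


def classify(seqs):
    """Yield (seq, status) for each received sequence number.

    status is "first", "in-order", "gap:N" (N packets were skipped before
    this one and may be lost or merely late), or "late" (a duplicate or
    reordered packet). These labels are assumptions made for this sketch.
    """
    highest = None
    for seq in seqs:
        if highest is None:
            highest = seq
            yield seq, "first"
            continue
        d = seq_delta(highest, seq)
        if d == 1:
            status = "in-order"
        elif d > 1:
            status = f"gap:{d - 1}"
        else:
            status = "late"
        if d > 0:
            highest = seq  # only advance on packets ahead of the newest one
        yield seq, status
```

What the application does with a reported gap (conceal, wait, or ignore) is left to the application; the sketch only reports what the sequence numbers reveal.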
RTP sessions are typically initiated between communicating peers using a signaling protocol, such as H.323, the Session Initiation Protocol (SIP), RTSP, or Jingle (XMPP). These protocols may use the Session Description Protocol to negotiate the parameters for the sessions.
An RTP session is established for each multimedia stream. A session consists of an IP address with a pair of ports for RTP and RTCP. For example, audio and video streams use separate RTP sessions, enabling a receiver to selectively receive components of a particular stream.
The specification recommends that RTP port numbers be even and that each associated RTCP port be the next higher odd number. However, a single port is used for RTP and RTCP in applications that multiplex the protocols. RTP and RTCP typically use unprivileged UDP ports (1024 to 65535), but may also use other transport protocols, most notably SCTP and DCCP, because the protocol design is transport independent.
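The port convention can be sketched as a small helper. This is an illustrative assumption of how an application might apply the recommendation, not part of any standard API; the name `rtcp_port_for` and the `mux` flag are hypothetical, and rejecting odd or privileged ports is a choice made for this sketch rather than a requirement of the specification.

```python
def rtcp_port_for(rtp_port: int, mux: bool = False) -> int:
    """Return the RTCP port that pairs with the given RTP port.

    mux=True models applications that multiplex RTP and RTCP on one port.
    """
    # The text says unprivileged ports are typical, not mandatory;
    # this sketch simply refuses anything outside that range.
    if not 1024 <= rtp_port <= 65535:
        raise ValueError("expected an unprivileged port (1024-65535)")
    if mux:
        return rtp_port
    if rtp_port % 2 != 0:
        raise ValueError("RTP port is recommended to be even")
    return rtp_port + 1  # next higher odd number
```

For example, `rtcp_port_for(5004)` yields 5005, while `rtcp_port_for(5004, mux=True)` yields 5004.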
Profiles and payload formats
One of the design considerations for RTP is to carry a range of multimedia formats and to allow new formats without revising the RTP standard. To this end, the information required by a specific application of the protocol is not included in the generic RTP header; it is instead provided through separate RTP profiles and associated payload formats. For each class of application (e.g., audio, video), RTP defines a profile and one or more associated payload formats. A complete specification of RTP for a particular application usage requires both profile and payload format specifications.
A profile defines the codecs used to encode the payload data and their mapping to payload format codes in the Payload Type (PT) field of the RTP header. Each profile is accompanied by several payload format specifications, each of which describes the transport of particular encoded data. Audio payload formats include G.711, G.723, G.726, G.729, GSM, QCELP, MP3, and DTMF, and video payload formats include H.261, H.263, H.264, and MPEG-1/MPEG-2. The mapping of MPEG-4 audio/video streams to RTP packets is specified in RFC 3016, and H.263 video payloads are described in RFC 2429.
Examples of RTP profiles include:
- The RTP profile for audio and video conferences with minimal control (RFC 3551) defines a set of static payload type assignments and a mechanism for mapping between a payload format and a payload type identifier (in the header) using the Session Description Protocol (SDP).
- The Secure Real-time Transport Protocol (SRTP) (RFC 3711) defines an RTP profile that provides cryptographic services for the transfer of payload data.
- The experimental Control Data Profile for RTP (RTP/CDP) for machine-to-machine communications.
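To make the mapping between payload type identifiers and payload formats concrete, here is a minimal sketch. It lists only a handful of the static assignments from RFC 3551, and it assumes that dynamic mappings arrive as SDP `a=rtpmap:` attribute lines; the names `STATIC_PAYLOAD_TYPES`, `parse_rtpmap`, and `resolve` are hypothetical helpers introduced for illustration.

```python
# A partial excerpt of static assignments (payload type -> (encoding, clock rate in Hz)).
STATIC_PAYLOAD_TYPES = {
    0: ("PCMU", 8000),    # G.711 mu-law
    3: ("GSM", 8000),
    4: ("G723", 8000),
    8: ("PCMA", 8000),    # G.711 A-law
    18: ("G729", 8000),
    31: ("H261", 90000),
    34: ("H263", 90000),
}


def parse_rtpmap(line: str):
    """Parse an SDP line of the form 'a=rtpmap:<pt> <encoding>/<clock rate>'.

    Returns (payload_type, (encoding, clock_rate)). Any trailing
    '/<channels>' part is ignored in this sketch.
    """
    prefix = "a=rtpmap:"
    if not line.startswith(prefix):
        raise ValueError("not an rtpmap attribute")
    pt_text, _, fmt = line[len(prefix):].strip().partition(" ")
    parts = fmt.split("/")
    return int(pt_text), (parts[0], int(parts[1]))


def resolve(pt: int, dynamic: dict):
    """Look up a payload type, preferring mappings negotiated via SDP."""
    if pt in dynamic:
        return dynamic[pt]
    return STATIC_PAYLOAD_TYPES.get(pt)  # None if unknown to this sketch
```

A receiver could, for instance, build `dynamic = dict([parse_rtpmap("a=rtpmap:96 H264/90000")])` during session setup and then call `resolve(96, dynamic)` for each incoming packet.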
Packet header
RTP packets are created at the application layer and handed to the transport layer for delivery. Each unit of RTP media data created by an application begins with the RTP packet header.
The RTP header has a minimum size of 12 bytes. After the header, optional header extensions may be present. This is followed by the RTP payload, the format of which is determined by the particular class of application. The fields in the header are as follows:
- Version: (2 bits) Indicates the version of the protocol. The current version is 2.
- P (Padding): (1 bit) Indicates whether there are extra padding bytes at the end of the RTP packet. Padding may be used to fill up a block of a certain size, for example as required by an encryption algorithm. The last byte of the padding contains the number of padding bytes that were added (including itself).
- X (Extension): (1 bit) Indicates the presence of an extension header between the fixed header and the payload data. The extension header is application or profile specific.
- CC (CSRC count): (4 bits) Contains the number of CSRC identifiers (defined below) that follow the fixed header.
- M (Marker): (1 bit) Used at the application level in a profile-specific manner. If it is set, the current data has some special relevance for the application.
- PT (Payload type): (7 bits) Indicates the format of the payload and thus determines its interpretation by the application. Values are specified by an RTP profile. For example, see the RTP profile for audio and video conferences with minimal control (RFC 3551).
- Sequence number: (16 bits) The sequence number is incremented by one for each RTP data packet sent and is used by the receiver to detect packet loss and to restore the packet order. RTP does not specify any action on packet loss; it is left to the application to take appropriate action. For example, a video application may play the last known frame in place of the missing frame. According to RFC 3550, the initial value of the sequence number should be random to make known-plaintext attacks on encryption more difficult. RTP provides no guarantee of delivery, but the presence of sequence numbers makes it possible to detect missing packets.
- Timestamp: (32 bits) Used by the receiver to play back the received samples at the appropriate time and interval. When several media streams are present, the timestamps may be independent in each stream. The granularity of the timing is application specific. For example, an audio application that samples data once every 125 µs (8 kHz, a common sample rate in digital telephony) would use that value as its clock resolution. Video streams typically use a 90 kHz clock. The clock granularity is one of the details specified in the RTP profile for an application.
- SSRC: (32 bits) The synchronization source identifier uniquely identifies the source of a stream. The synchronization sources within the same RTP session are unique.
- CSRC: (32 bits each, the number of entries is indicated by the CSRC count field) Contributing source identifiers list the sources that contributed to a stream generated from multiple sources.
- Header extension: (optional, presence indicated by the Extension field) The first 32-bit word contains a profile-specific identifier (16 bits) and a length specifier (16 bits) that gives the length of the extension (EHL = extension header length) in 32-bit units, excluding the 32 bits of the extension header itself.
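The field layout above can be illustrated with a short parser. This is a minimal sketch based only on the field sizes listed in this section, using Python's standard `struct` module; the `RtpHeader` class and `parse_rtp_header` function are hypothetical names, and the sketch assumes a complete packet in network byte order. It does not strip padding or interpret the extension contents.

```python
import struct
from dataclasses import dataclass


@dataclass
class RtpHeader:
    version: int
    padding: bool
    extension: bool
    marker: bool
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int
    csrc: list
    header_length: int  # byte offset at which the payload starts


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Decode the fixed header, CSRC list, and extension length of one packet."""
    if len(packet) < 12:
        raise ValueError("shorter than the 12-byte fixed header")
    b0, b1, seq, ts, ssrc = struct.unpack_from("!BBHII", packet, 0)
    version = b0 >> 6                 # top 2 bits
    if version != 2:
        raise ValueError("unsupported RTP version")
    cc = b0 & 0x0F                    # CSRC count, low 4 bits
    offset = 12 + 4 * cc
    if len(packet) < offset:
        raise ValueError("truncated CSRC list")
    csrc = list(struct.unpack_from(f"!{cc}I", packet, 12))
    extension = bool(b0 & 0x10)
    if extension:
        if len(packet) < offset + 4:
            raise ValueError("truncated extension header")
        # 16-bit profile-specific identifier, then length in 32-bit words
        _profile_id, ext_words = struct.unpack_from("!HH", packet, offset)
        offset += 4 + 4 * ext_words
        if len(packet) < offset:
            raise ValueError("truncated extension data")
    return RtpHeader(
        version=version,
        padding=bool(b0 & 0x20),
        extension=extension,
        marker=bool(b1 & 0x80),
        payload_type=b1 & 0x7F,
        sequence_number=seq,
        timestamp=ts,
        ssrc=ssrc,
        csrc=csrc,
        header_length=offset,
    )
```

With a header in hand, the payload is `packet[header.header_length:]`; when the padding bit is set, the final byte of the packet gives the number of trailing bytes to discard.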
System operation
A functional network-based system requires other protocols and standards used in conjunction with RTP. Protocols such as SIP, Jingle, RTSP, H.225, and H.245 are used for session initiation, control, and termination. Other standards, such as H.264, MPEG, and H.263, are used to encode the payload data as specified by the applicable RTP profile.
Source of the article : Wikipedia