SIP Requests and Responses

Based on the SIP specifications, there are a bunch of request methods for the Request-Line and response codes for the Status-Line. At the bottom of this post, I put together a list of request methods and response codes. This post also walks through the major request/response operations with examples.


INVITE ( / 200 OK / ACK ) operation


When a user agent wants to initiate a session, it starts with an INVITE request. Unlike other requests, INVITE needs a three-way handshake: INVITE / 200 OK / ACK. There is no separate client transaction for the ACK to a 200 OK, which means the TU passes the ACK directly to the transport layer for transmission and does not wait for a response to it.

Other methods are expected to complete rapidly, so non-INVITE transactions use a two-way handshake. The INVITE transaction is different because it can take a while to receive the final response, 200 OK. Normally human input is required on the callee's endpoint, so 1xx provisional responses such as 100 Trying and 180 Ringing can be sent before the callee sends the final response.

When the client receives a 1xx response, it stops retransmitting the INVITE and waits for the final response. If a final response such as 3xx, 4xx, 5xx, or 6xx is received instead, the caller can retry the INVITE.



In the example above, the caller initiates a call with INVITE, which alerts the callee's device, and a ringback tone can be heard on the caller's side. When the callee answers the call, a 200 OK final response is sent to the caller, and the caller sends ACK as the final message of this three-way INVITE operation. The CSeq (sequence number and method) in the ACK combines the sequence number of the original INVITE with ACK as the method.
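The CSeq rule for ACK can be sketched in a few lines of Python. This is only an illustration: the function name `ack_cseq` is made up for this post, and it assumes the CSeq value is given as plain text such as "314159 INVITE".

```python
def ack_cseq(invite_cseq: str) -> str:
    """Build the CSeq value of an ACK from the CSeq value of its INVITE."""
    number, method = invite_cseq.split()
    if method != "INVITE":
        # ACK only exists as the third leg of the INVITE handshake.
        raise ValueError("ACK is only sent for INVITE")
    # Same sequence number as the INVITE, but the method becomes ACK.
    return f"{number} ACK"


print(ack_cseq("314159 INVITE"))  # 314159 ACK
```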


PRACK ( and UPDATE) operation


The PRACK method (RFC 3262) is used by clients to acknowledge a provisional response that was sent reliably. In the flows below, this happens while the callee's network reserves resources for call establishment. Before any PRACK-related operation, the INVITE request must include a Supported header field with the option tag "100rel", and the non-100 provisional responses to it must include a Require header field containing "100rel".

Once the PRACK operation is negotiated, the UPDATE method can be used to update the SDP description on both the client and server sides. Depending on how SDP preconditions are negotiated between the caller and callee, there are 2 different status types: End-to-End Status Type and Segmented Status Type.

- End-to-End Status Type: Reservation gets started on both sides for further SDP negotiation with UPDATE methods
- Segmented Status Type: Caller's resource reservation is done before INVITE request

Here is an example with the PRACK operation for the first case. An INVITE request with Supported: 100rel and SDP (a=curr:qos e2e none) is sent to the callee, and the callee's network resource reservation can then get started with a 183 Session Progress response. Since the callee's reservation started first, ensuring that the callee's sending resource is reserved, the callee asks for confirmation of the caller's sending resource (a=conf:qos e2e recv). After the caller receives this, the caller's sending resource reservation gets started under the PRACK operation. When reservation is done, an UPDATE request is sent with its modified SDP (a=curr:qos e2e send). Lastly the callee sends a 200 OK response with its modified SDP (a=curr:qos e2e sendrecv), which meets the originally desired status.



And here is another example, for the Segmented Status Type. In this case the caller's network resource reservation can be started before the caller sends INVITE. Since the INVITE includes SDP describing the reserved resources (a=curr:qos local sendrecv, a=curr:qos remote none), the callee should reserve resources after receiving the request. Once reservation is done, 180 Ringing with the modified SDP (a=curr:qos local sendrecv) is sent to the caller. Since the SDP modification is then complete, PRACK and the rest of the INVITE operation proceed without an UPDATE operation.



Let's clarify each SDP attribute for these 2 types with the comparison table below.




REGISTER operation


A user agent must register before making or receiving calls through SIP. From a protocol message perspective, the REGISTER operation is quite simple, and its use of header fields is a bit different from other requests. The To and From header fields have the same contents because the purpose of a REGISTER request is not to call another party but to register the sender itself.




When REGISTER is shown in a Wireshark log, it looks pretty simple and the 200 OK response says 1 binding. The SIP message alone does not explain what this means; you need to understand how the server side works based on RFC 3261.



Here is another example, which I drew based on my understanding of Figure 2: REGISTER example in RFC 3261. When a UA sends a REGISTER request to the registrar server, the To and Contact header fields play a critical role on the server side. The registrar asks the location server to store the user's address-of-record (user and domain name) from the To header field together with the Contact address; this association is called a binding on the location server. The registrar and location servers can be co-located or separate. The picture below shows the logical operation of handling a REGISTER request through the network. Once a UA is registered on a location server, another UA can reach that location server via proxy servers in order to call the first UA.
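To make the idea of a binding concrete, here is a minimal in-memory sketch in Python. The class `LocationService` and the function `handle_register` are hypothetical names of my own, not anything defined by RFC 3261, and the header values are assumed to be bare URIs already extracted from the message.

```python
from collections import defaultdict


class LocationService:
    """Toy location server: maps an address-of-record to its contact addresses."""

    def __init__(self):
        self._bindings = defaultdict(set)

    def add_binding(self, aor: str, contact: str) -> int:
        """Store one binding and return how many the AOR now has."""
        self._bindings[aor].add(contact)
        return len(self._bindings[aor])

    def lookup(self, aor: str) -> list:
        """Return the contacts a proxy could use to reach this AOR."""
        return sorted(self._bindings.get(aor, set()))


def handle_register(location: LocationService, headers: dict) -> tuple:
    """Registrar side: take the AOR from To and the address from Contact."""
    count = location.add_binding(headers["To"], headers["Contact"])
    return 200, count  # status code and number of bindings


location = LocationService()
print(handle_register(location, {"To": "sip:bob@biloxi.com",
                                 "Contact": "sip:bob@192.0.2.4"}))
print(location.lookup("sip:bob@biloxi.com"))
```

A real registrar also handles expiry, authentication, and removal of bindings, which this sketch leaves out.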



UAs can determine the address to which to send a REGISTER request in 3 ways: 1) by configuration on the phone, 2) using the address-of-record, 3) multicast. If no registrar address is configured on the device, the UA should use the host part of the address-of-record as the Request-URI.



Authorization/authentication can be required during registration or a call. Here is an example with the REGISTER operation. The UAC originates a REGISTER request with an Authorization header field that does not carry credentials: no nonce and no response values. The UAS can challenge the originator to provide credentials by rejecting the request with a 401 Unauthorized status code. The WWW-Authenticate header field in the response contains the authentication algorithm and parameters such as nonce, qop, and opaque. Once the UAC receives this, it sends the same request again with more parameters in the Authorization header field, including the generated response value. Finally the UAS returns a 200 OK response after adding the binding.
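How the "response" value is generated can be sketched as follows. This assumes the MD5 digest algorithm from RFC 2617, which is what SIP digest authentication builds on; the function names and all the credentials in the usage lines are made up for illustration.

```python
import hashlib


def _md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, uri, nonce,
                    qop=None, nc=None, cnonce=None):
    """Compute the digest 'response' value for an Authorization header."""
    ha1 = _md5_hex(f"{username}:{realm}:{password}")  # who is asking
    ha2 = _md5_hex(f"{method}:{uri}")                 # what is being asked
    if qop == "auth":
        # With qop, the nonce count and client nonce are mixed in as well.
        return _md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:{qop}:{ha2}")
    return _md5_hex(f"{ha1}:{nonce}:{ha2}")


# Hypothetical values, as if taken from a 401 challenge to a REGISTER.
print(digest_response("bob", "biloxi.com", "secret", "REGISTER",
                      "sip:biloxi.com", "abc123nonce",
                      qop="auth", nc="00000001", cnonce="0a4f113b"))
```

The UAC puts the returned hex string into the response parameter of the Authorization header, and the server repeats the same computation to verify it.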


SUBSCRIBE / NOTIFY operation


UAs use the SUBSCRIBE operation to request the current state and state updates from a remote node, for example when they would like to subscribe to resource or call state for various resources in the network. The server on the network side sends a notification when that state changes.

The SUBSCRIBE method includes an Expires header with a number and an Event header with the event package name. In the registration procedure, an Event header with the "reg" tag is used in the SUBSCRIBE request and a 200 OK response is sent. This SUBSCRIBE uses a new Call-ID and a new tag, different from those used in the previous dialog. If the server side cannot understand the Event header, a non-2xx response such as 489 Bad Event or 406 Not Acceptable is sent.

After the resource is reserved and its state refreshed on the network side, a NOTIFY request is sent to the UA. It includes an Event header with the "reg" tag as well as a Subscription-State header with the state (active, pending, or terminated) and an expires parameter.





Lastly, here is the list for SIP Requests and Responses






SIP Structure and Header Fields

SIP seems to use pretty simple terms, but the terms pile up like a snowball until it gets confusing. RFC3261 defines the SIP protocol structure, which does not map onto the OSI layers.


SIP Protocol Structure





According to RFC3261 5 Structure of the Protocol, the SIP protocol layers are as follows: Syntax and Encoding, Transport layer, Transaction layer, and Transaction User.


SIP - Syntax and Encoding

Encoding is specified using an augmented Backus-Naur Form (BNF) grammar and applies to messages as received from the transport layer.

SIP - Transport Layer

This layer defines how a client transport sends requests and receives responses, and how a server transport handles the actual reception of requests and transmission of responses over the network. It is also responsible for managing persistent connections for transport protocols like TCP and SCTP. The open connections are shared between the client and server transport functions. These connections are indexed by the tuple formed from address, port, and transport protocol.

The OSI model also defines a transport layer (UDP, TCP, and so on), so the term "port" can be confusing here. RFC3261 calls UDP, TCP, and SCTP "transport protocols", and each has its own ports. The "SIP transport layer" sits on top of the transport protocol and also deals with a logical notion of port: source and receive ports. Since the source port is often ephemeral, and the destination side cannot know whether it is ephemeral or not, 2 different connections are frequently in use: one for transactions initiated in each direction.

SIP - Transaction Layer

This layer has a client side and a server side (specifically called the client transaction and the server transaction). Both are logical functions that are embedded in any number of elements. A UAC, a UAS, and a stateful proxy have a transaction layer; a stateless proxy does not.

The client transaction sends requests and the server transaction sends responses over the network. The client transaction receives requests from the TU (Transaction User) and delivers them to a server through the network. It also receives responses and delivers them to the TU. The server transaction receives requests from the transport layer and delivers them to the TU. It also accepts responses from the TU and delivers them to the transport layer for transmission over the network.

SIP - Transaction User

Transaction Users (TUs) are SIP entities, including the UAC core, UAS core, proxy core, and registrar core. There is no TU in a stateless proxy. When a TU wishes to send a request, it creates a client transaction instance and passes it the request along with the destination info (destination IP address, port, and transport). When a client cancels a transaction, it requests that the server stop further processing and revert to the initial state. This is done with a CANCEL request.

In short, Syntax and Encoding is just a function and does not have any state machine. If we leave it aside, we get the clearer pictures below.



















Even though the SIP protocol layers are described in RFC3261, several other terms are still confusing, as follows.

Message (with Method) and Message Body (with SDP)

A message is data sent between SIP elements as part of the protocol. SIP messages are either requests or responses. The method is the primary function that a request is meant to invoke on a server, for example INVITE and BYE. A SIP transaction is always initiated by a request, and appropriate responses are expected in return.

A SIP message contains a message body if necessary, and the Session Description Protocol (SDP) is one of the most common bodies. It contains the session name, purpose, the media comprising the session, bandwidth information, and so on. Based on the RFC specifications, a SIP sequence can be understood in terms of SDP offer and answer.

The direction of SIP messages refers to who sends the request, while the direction of an SDP session description refers to who sends the offer first. More detailed SDP headers are listed at the bottom of this post.

Client and Server

A client is any network element that sends SIP requests and receives SIP responses. Clients may or may not interact directly with a human user. User agent clients and proxies are clients. A server is any network element that receives SIP requests in order to service them and sends back responses to those requests. Examples are proxies, user agent servers, redirect servers, and registrars.

(Back-to-Back) User Agent, UAC, and UAS

A back-to-back user agent (B2BUA) is a logical entity that receives a request and processes it as a user agent server (UAS). To determine how the request should be answered, it also acts as a user agent client (UAC) and generates requests.

Upstream and Downstream

Downstream is the direction of request messages, from the client side to the server side. We might think of client -> proxy -> proxy .. proxy -> proxy -> server. Upstream, in contrast, is the direction in which responses flow, from the server side to the client side. We can simply say server -> proxy -> proxy .. proxy -> proxy -> client.

Call, Dialog, and Session

A dialog is a peer-to-peer SIP relationship between 2 UAs that persists for some time. It can be established by a 2xx response to an INVITE request and is identified by a Call-ID, a local tag, and a remote tag. Call is an informal term. A session is a multimedia connection: a set of senders and receivers and the data streams flowing from senders to receivers.
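The dialog identifier can be sketched like this. The point to notice is that "local" and "remote" are relative: the UAC treats the From tag as local, while the UAS treats the To tag as local. The type and function names are hypothetical, and the values are only illustrative.

```python
from typing import NamedTuple


class DialogId(NamedTuple):
    call_id: str
    local_tag: str
    remote_tag: str


def dialog_id_for_uac(call_id: str, from_tag: str, to_tag: str) -> DialogId:
    # The caller generated the From tag, so that is its local tag.
    return DialogId(call_id, local_tag=from_tag, remote_tag=to_tag)


def dialog_id_for_uas(call_id: str, from_tag: str, to_tag: str) -> DialogId:
    # The callee generated the To tag, so that is its local tag.
    return DialogId(call_id, local_tag=to_tag, remote_tag=from_tag)


alice = dialog_id_for_uac("a84b4c76e66710", "1928301774", "a6c85cf")
bob = dialog_id_for_uas("a84b4c76e66710", "1928301774", "a6c85cf")
print(alice)
print(bob)
```

Both sides describe the same dialog, just seen from opposite ends.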

Provisional responses and Final responses, and SIP Transaction

A final response terminates a SIP transaction; all 2xx, 3xx, 4xx, 5xx, and 6xx responses are final. A provisional response is used by a server to indicate progress but does not terminate a SIP transaction; all 1xx responses are provisional. A SIP transaction occurs between a client and a server. It comprises all messages from the first request sent from the client to the server up to a final (non-1xx) response sent from the server to the client. If the final response to an INVITE request is non-2xx, the transaction also includes the ACK to that response.
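The rules in this paragraph are simple enough to write down as code. This is a small sketch with function names of my own choosing.

```python
def classify_response(status_code: int) -> str:
    """Return 'provisional' for 1xx and 'final' for 2xx through 6xx."""
    if not 100 <= status_code <= 699:
        raise ValueError(f"not a SIP status code: {status_code}")
    return "provisional" if status_code < 200 else "final"


def ack_belongs_to_invite_transaction(final_status: int) -> bool:
    """For INVITE, the ACK is part of the transaction only for non-2xx finals."""
    if classify_response(final_status) != "final":
        raise ValueError("ACK is only sent for a final response")
    return final_status >= 300


print(classify_response(180))                  # provisional
print(classify_response(200))                  # final
print(ack_belongs_to_invite_transaction(486))  # True
print(ack_belongs_to_invite_transaction(200))  # False
```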

Loose Routing and Strict Routing

A loose routing proxy keeps the present Request-URI in requests as it is and adds a Record-Route header field. In this Record-Route header, the "lr" parameter indicates loose routing. Each loose routing proxy adds its own Record-Route header. See RFC3261 for details. Unlike loose routing, strict routing, based on RFC2543, replaces the present Request-URI with the next hop destination. Generally loose routing proxies are preferred.



Now, based on this understanding of the terms and structure, each UA and proxy server may be arranged as below. In this picture, a request travels along the blue arrows and a response comes back to Alice's phone along the brown ones.

SIP Header Fields


Here is the SIP header field list based on RFC 3261 and RFC 2543. The table at the bottom of this post shows most kinds of SIP header fields, though the table alone is not sufficient to understand those headers clearly. With more pictures and flowcharts, we need to clarify some frequently used headers from a practical angle.


CSeq and Call-ID headers vs. SIP transactions and Call (or Dialog)


Generally a SIP transaction begins with a SIP request sent from client to server and includes any responses returned from server to client. There are a few exceptions like ACK, which does not need a response. The CSeq header (sequence number + method) is used to trace these SIP transactions.

A caller originates a call to a callee, and this whole procedure is known formally as a dialog or informally as a call. A single dialog or call contains several transactions. Call-ID is used to trace a call or dialog.



In the picture above, the INVITE and BYE transactions can be traced by the CSeq method. The ACK is completed without a response. The INVITE transaction in particular has several provisional responses and a 2xx final response. The ACK has the same sequence number as the INVITE with ACK as the method, which means the INVITE exchange absorbs the ACK.
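This way of tracing can be sketched as a small grouping function: Call-ID picks the call, and CSeq picks the transaction inside it. The dictionary layout with "Call-ID", "CSeq", and "summary" keys is an assumption for this example, and filing the ACK under its INVITE follows the way this post describes it rather than any formal rule.

```python
from collections import defaultdict


def group_by_call_and_cseq(messages):
    """Group message summaries by Call-ID, then by (CSeq number, method)."""
    calls = defaultdict(lambda: defaultdict(list))
    for msg in messages:
        number, method = msg["CSeq"].split()
        # The ACK shares the INVITE's sequence number, so file it with the INVITE.
        key = (int(number), "INVITE" if method == "ACK" else method)
        calls[msg["Call-ID"]][key].append(msg["summary"])
    return calls


trace = [
    {"Call-ID": "c1", "CSeq": "1 INVITE", "summary": "INVITE"},
    {"Call-ID": "c1", "CSeq": "1 INVITE", "summary": "180 Ringing"},
    {"Call-ID": "c1", "CSeq": "1 INVITE", "summary": "200 OK"},
    {"Call-ID": "c1", "CSeq": "1 ACK", "summary": "ACK"},
    {"Call-ID": "c1", "CSeq": "2 BYE", "summary": "BYE"},
    {"Call-ID": "c1", "CSeq": "2 BYE", "summary": "200 OK"},
]
for call_id, transactions in group_by_call_and_cseq(trace).items():
    for key, summaries in transactions.items():
        print(call_id, key, summaries)
```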


From, To, and Via headers vs. Contact header and Source/Destination IPs


The Source is where a packet is sent from and the Destination is where it is heading. The source and destination change with every packet in Wireshark logs, and they are identified by IP addresses and TCP/UDP port numbers.

When a SIP request is initially sent from a caller to a callee, the caller's SIP-URI with a new tag is placed in the From header and the callee's SIP-URI in the To header. The SIP-URIs in From and To are not changed in the responses to that request. Once the request arrives at the callee, the callee adds its own new tag to the To header in its responses.



The Via header is also set according to who sends the request. So the From, Via, and To header fields follow whoever sent the initial SIP request of the call, which means these 3 pieces of information are copied unchanged into every response to that request.

Besides these three headers, the Contact header field is inserted by whoever sends the SIP message, whether request or response. On top of that, the Contact header field carries more specific information about the sender, such as the transport protocol (TCP or UDP) and port number. It is usually composed of a username at an FQDN (fully qualified domain name). While an FQDN is preferred, many end systems do not have registered domain names, so IP addresses are permitted.




So the Contact header and the source/destination IP addresses change with each message direction (both request and response), while the From, Via, and To headers of a request come back unchanged in its responses. The table above is filled in to help us understand this.


Via and Max-Forwards headers vs. Hops


The Via header field indicates the transport used for the transaction and identifies the location where the response is to be sent. It must contain SIP/2.0 and a "branch" parameter. The branch parameter identifies the transaction created by that request and is used by both the client and the server. When a server receives the request, it can add a "received" or "rport" value to the top Via header field, and that value is returned in the response. When the request is forwarded to another element (called a hop), an additional Via header field is added on top of the existing ones while Max-Forwards decreases by 1. When the response travels back to the originator of the request, each hop removes its own top Via.
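The way Via grows on the way out and shrinks on the way back can be sketched with a plain list used as a stack. The function names, host names, and branch values below are all made up for illustration.

```python
def forward_request(via_stack, max_forwards, own_via):
    """One hop forwards a request: push a Via on top, decrement Max-Forwards."""
    if max_forwards <= 0:
        # A request whose Max-Forwards reached 0 must not be forwarded.
        raise ValueError("Max-Forwards exhausted: respond 483 Too Many Hops")
    return [own_via] + list(via_stack), max_forwards - 1


def forward_response(via_stack, own_via):
    """The same hop handles the response: pop its own Via from the top."""
    if not via_stack or via_stack[0] != own_via:
        raise ValueError("top Via is not ours: discard the response")
    return list(via_stack[1:])


client_via = "SIP/2.0/UDP pc33.atlanta.com;branch=z9hG4bKclient"
proxy_via = "SIP/2.0/UDP proxy.example.com;branch=z9hG4bKproxy"

vias, max_forwards = forward_request([client_via], 70, proxy_via)
print(vias, max_forwards)                 # two Via entries, Max-Forwards 69
print(forward_response(vias, proxy_via))  # only the client's Via is left
```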



Here is a more practical example in Wireshark. An IMS client device under test is connected to a network simulator, which includes an IMS proxy server as well as a virtual UA. Even though the IMS proxy server and the virtual UA share the same IP address, their functions are totally separate.

Let's assume your device has IP address 192.168.1.1 and both the proxy server and the virtual UA have IP address 192.168.1.2. This is the case when your device is calling the virtual UA.



No   Time      Source          Destination    Protocol
532  35.87..  192.168.1.1  192.168.1.2  SIP/SDP  Request: INVITE sip:0123456789@anritsu-cscf.com |
Internet Protocol Version 4, Src: 192.168.1.1 (192.168.1.1), Dst: 192.168.1.2 (192.168.1.2)
User Datagram Protocol, Src Port: 45990 (45990), Dst Port: sip (5060)
Session Initiation Protocol (INVITE)
    Request-Line: INVITE sip:0123456789@anritsu-cscf.com SIP/2.0
    Message Header
        Via: SIP/2.0/UDP 192.168.1.1:45990;branch=z9hG4bK..9191;rport
        Max-Forwards: 70



The IP addresses and UDP ports for both the client and server sides can be read from the Wireshark log. The UDP source port is 45990 and the destination port is 5060 in this case. When the request is sent along the Via path, the transport layer port on the requesting side in your client is 45990. When the Via header field is prepared, "rport" (response port) is sent without a port number to the server side in the proxy server.

No      Time   Source          Destination    Protocol 
533  35.88..  192.168.1.2  192.168.1.1  SIP  Status: 100 Trying | 
Internet Protocol Version 4, Src: 192.168.1.2 (192.168.1.2), Dst: 192.168.1.1 (192.168.1.1)
User Datagram Protocol, Src Port: sip (5060), Dst Port: 45990 (45990)
Session Initiation Protocol (100)
    Status-Line: SIP/2.0 100 Trying
    Message Header
        Via: SIP/2.0/UDP 192.168.1.1:45990;branch=z9hG4bK..9191;rport=45990
        Max-Forwards: 70



In this response, the UDP source port is 5060 and the destination port is 45990. When the server side sends the response with the Via header, the response port number, 45990, is added to "rport".

We can think of another example in which your device receives a call from the virtual UA. In this case, your device has IP address 192.168.1.11 and both the proxy server and the virtual UA have IP address 192.168.1.12.



No      Time           Source                Destination           Protocol
27  7.59..  192.168.1.12  192.168.1.11  SIP/SDP  Request: INVITE sip:+11234567890@192.168.1.11 |
Internet Protocol Version 4, Src: 192.168.1.12 (192.168.1.12), Dst: 192.168.1.11 (192.168.1.11)
User Datagram Protocol, Src Port: 65150 (65150), Dst Port: sip (5060)
Session Initiation Protocol (INVITE)
    Request-Line: INVITE sip:+11234567890@192.168.1.11 SIP/2.0
    Message Header
        Via: SIP/2.0/UDP 192.168.1.12:65150;branch=z9hG4bK..0186;rport;transport=udp
        Via: SIP/2.0/UDP 192.168.1.12:65144;branch=z9hG4bK..375a;rport=65146
        Max-Forwards: 69



There are 2 Via header fields. For the top Via header, the UDP source port is 65150 and the destination port is 5060. When the request is sent along the Via path, the transport layer port on the client side in the proxy server is 65150. In that Via header field, "rport" is sent without a port number to the server side in your UA.

The lower Via header field is generated by the virtual UA. Its UDP source port cannot be found in the Wireshark log, but the port in the Via is 65144. When traffic passes through a NAT, the IP address and transport port are bound to different ones, as in RFC3581 6 Example. The server side in the proxy server saw the request arrive from port 65146 of the client, so rport with 65146 was added to this Via header field, and it is carried back in the response.

No Time    Source            Destination      Protocol
28  7.64..  192.168.1.11  192.168.1.12  SIP  Status: 100 Trying |
Internet Protocol Version 4, Src: 192.168.1.11 (192.168.1.11), Dst: 192.168.1.12 (192.168.1.12)
User Datagram Protocol, Src Port: sip (5060), Dst Port: 65150 (65150)
Session Initiation Protocol (100)
    Status-Line: SIP/2.0 100 Trying
    Message Header
        v: SIP/2.0/UDP 192.168.1.12:65150;branch=z9hG4bK..0186;rport=65150;received=192.168.1.12;transport=udp
        v: SIP/2.0/UDP 192.168.1.12:65144;branch=z9hG4bK..375a;rport=65146


In the response, the UDP source port is 5060 and the destination port is 65150. When the response is sent from the server side in your device, rport with port number 65150 is inserted in the Via header field.
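What the server side does with an empty "rport" can be sketched as below, following my reading of RFC3581: fill in the source port the request actually came from, and add a "received" parameter with the source IP. The function name and the branch value are hypothetical. Note that the first trace in this post shows only the rport value being filled in, so real implementations may differ.

```python
def stamp_top_via(via: str, src_ip: str, src_port: int) -> str:
    """Fill in rport/received on the top Via using the packet's real source."""
    sent_by, *params = via.split(";")
    stamped = [sent_by]
    client_asked = False
    for param in params:
        if param == "rport":  # empty rport means the client asks for it
            stamped.append(f"rport={src_port}")
            client_asked = True
        else:
            stamped.append(param)
    if client_asked:
        stamped.append(f"received={src_ip}")
    return ";".join(stamped)


print(stamp_top_via("SIP/2.0/UDP 192.168.1.1:45990;branch=z9hG4bKexample;rport",
                    "192.168.1.1", 45990))
```

The response is then sent back to the received address and rport port, which is what lets it pass back through a NAT.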


Route and Record-Route headers vs. Via header


When a request arrives at a proxy, the proxy can add a Record-Route header field. Here is an example: Record-Route: <sip:p1.example.com;lr>. The "lr" parameter indicates a loose routing proxy. When the request reaches its destination with all the added Record-Route header fields, the destination UA copies them into the 200 OK response with SDP. Unlike Via headers, the copied Record-Route headers are not removed by each proxy while the 200 OK is delivered; they are delivered to the caller.
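What each side then does with the Record-Route list can be sketched as follows, based on my understanding of the dialog rules in RFC3261: the caller reverses the list to build its route set, while the callee keeps the order as received. The function names and proxy names are illustrative.

```python
def route_set_for_caller(record_route):
    """UAC: take the Record-Route list from the 2xx in reverse order."""
    return list(reversed(record_route))


def route_set_for_callee(record_route):
    """UAS: take the Record-Route list from the request in the same order."""
    return list(record_route)


# The request passed p1 first and p2 second; each proxy inserted its URI
# on top, so the callee sees p2 listed first.
record_route = ["<sip:p2.example.com;lr>", "<sip:p1.example.com;lr>"]

print(route_set_for_caller(record_route))  # p1 first: nearest to the caller
print(route_set_for_callee(record_route))  # p2 first: nearest to the callee
```

Either way, the first entry of a route set is the proxy closest to the side that owns it, so later requests in the dialog retrace the recorded path.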

Here is an example based on RFC3261 16.12 Summary of Proxy Route Processing.




Branch in Via header, Tag in From/To headers, and Call-ID header




Supported and Require headers 


The Supported header field lists the extensions that a UAC supports, which prevents servers from insisting on non-standard or vendor-specific features. RFC 3262 does not allow reliable provisional responses for any method but INVITE; within that limit, a Supported header field with the "100rel" option tag is what enables the mechanism. As a matter of fact, every INVITE should include a Supported header listing the "100rel" option tag.

With the Require header field, a UAC tells servers which options the client expects the server to support in order to properly process the request. If a server does not understand an option, it returns a 420 Bad Extension status code. A Require header with the "100rel" option tag must not be used in any request but INVITE.

Example:
C->S: INVITE sip:watson@bell-telephone.com SIP/2.0
           Require: com.example.billing
           Payment: sheep_skins, conch_shells
S->C: SIP/2.0 420 Bad Extension
           Unsupported: com.example.billing
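The server-side check behind this exchange can be sketched like this. The set of supported option tags and the function name are assumptions for the example; a real server would build the set from the extensions it actually implements.

```python
SUPPORTED_OPTIONS = {"100rel", "path"}  # hypothetical capabilities of this server


def check_require(require_header: str, supported=SUPPORTED_OPTIONS):
    """Return (420, headers) if any required option is unknown, else None."""
    tags = [tag.strip() for tag in require_header.split(",") if tag.strip()]
    unsupported = [tag for tag in tags if tag not in supported]
    if unsupported:
        # List exactly the option tags that could not be honored.
        return 420, {"Unsupported": ", ".join(unsupported)}
    return None


print(check_require("com.example.billing"))
print(check_require("100rel"))
```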


path option-tag with Path header and with Route/Record-Route header fields 


If Path is used together with the Route and Record-Route header fields, the UA should include the "path" option-tag in its Supported header field. The Path header field is a SIP extension header field with syntax very similar to the Record-Route header field. The Path header is used in conjunction with REGISTER requests and their 200 responses, whereas the Record-Route header is inserted into INVITE requests. The path vector established by Path is then applied to future dialogs.

A Path header field may be inserted into a REGISTER by any SIP node traversed by the request. The registrar reflects the accumulated Path back into the responses toward the originating UA. The originating UA is therefore informed of the nodes included in its registered path vector and may use that information in other capacities.

The home proxy rewrites the Request-URI of the incoming request with the registered contact and forwards the request. The home proxy also copies the stored path vector associated with that contact in the registrar database into the Route header field of the outgoing request.

Here is an example from RFC3327 for the REGISTER operation. UA1 sends a REGISTER request to the Registrar through its default outbound proxy P1, an intermediate proxy P2, and P3, the firewall proxy for the home domain. Since P1 must be traversed by home-network requests targeted to UA1 and P3 must be traversed by requests from the Registrar, P1 and P3 add themselves to the Path header fields of the REGISTER request and P2 does not.



This example shows how an INVITE transaction originates from UA2 after UA1 has registered with the Registrar above. The Registrar inserts a preloaded Route toward UA1 and retargets the request by replacing the Request-URI with the registered Contact. Since the Registrar does not want to stay on the route, it does not add Record-Route to the outgoing INVITE request.




SDP media level attributes


The SDP (Session Description Protocol) description is transferred while SIP preconditions are negotiated based on RFC 3312. Below is a list of what the media level attributes consist of.



Here is an example of how each type and tag maps to the media level attributes.

m=audio 20000 RTP/AVP 0
a=curr:qos local none
a=curr:qos remote none
a=des:qos mandatory local sendrecv
a=des:qos mandatory remote sendrecv
a=curr:qos e2e none
a=des:qos optional e2e sendrecv

There are many attribute tags; let's clarify some of them from a direction perspective: "send", "recv", "local", and "remote". In SIP all messages are divided into requests and responses, while in SDP all descriptions are divided into "offer" and "answer". Each request or response can carry either an offer or an answer. In an offer, "send" is the direction offerer -> answerer and "local" is the offerer's access network. In an answer, "send" is the direction answerer -> offerer and "local" is the answerer's access network.
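To see how the a=curr and a=des lines relate, here is a rough sketch that parses them and checks whether every mandatory desired status is already covered by the current status. The function names are my own, and the parsing assumes well-formed lines like the ones in the example above.

```python
from collections import defaultdict

# Each direction tag expressed as the set of directions it covers.
DIRECTIONS = {"none": set(), "send": {"send"}, "recv": {"recv"},
              "sendrecv": {"send", "recv"}}


def parse_preconditions(sdp: str):
    """Collect current status per status type and the list of desired statuses."""
    current = defaultdict(set)
    desired = []
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("a=curr:"):
            _ptype, status, direction = line[len("a=curr:"):].split()
            current[status] |= DIRECTIONS[direction]
        elif line.startswith("a=des:"):
            _ptype, strength, status, direction = line[len("a=des:"):].split()
            desired.append((strength, status, DIRECTIONS[direction]))
    return current, desired


def mandatory_preconditions_met(sdp: str) -> bool:
    """True when current status covers every mandatory desired status."""
    current, desired = parse_preconditions(sdp)
    return all(wanted <= current[status]
               for strength, status, wanted in desired
               if strength == "mandatory")


offer = """m=audio 20000 RTP/AVP 0
a=curr:qos local none
a=curr:qos remote none
a=des:qos mandatory local sendrecv
a=des:qos mandatory remote sendrecv
"""
print(mandatory_preconditions_met(offer))  # False: nothing is reserved yet
```

Once both a=curr lines reach sendrecv, the same check returns True, which is the point at which session establishment can continue.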



Here is the SIP header field list.



Here is the SDP header field list.




SIP Overview

While following IMS over LTE technology, studying SIP from the RFC specifications seems necessary to understand how IMS works in detail. It could be quite different from the 3GPP specifications, but it is worth going through some of them.

Let me get started with one of the most popular examples, the trapezoid in RFC3261. Figure 1 on the right-hand side comes from 4 Overview of Operation in RFC3261. The one on the left-hand side is drawn by myself based on 24.2 Session Setup, which describes the full SIP messages for Figure 1. As you can see, these 2 trapezoids look quite different, but I would rather follow 24.2 Session Setup with my chart from now on.



Go through SIP Dialog


Since Alice's softphone does not know the location of Bob's SIP phone or the SIP servers in the biloxi.com domain, the softphone sends INVITE to the SIP server that serves Alice's domain, atlanta.com. The address of the atlanta.com SIP server could have been configured in Alice's softphone or discovered by DHCP. The atlanta.com proxy server receives the INVITE request, processes it, and sends a 100 Trying response back to Alice's softphone. This 100 Trying response tells the caller that the INVITE has been received and the atlanta.com proxy is working on her behalf to route the INVITE to the destination.



The atlanta.com proxy server locates the proxy server for the biloxi.com domain through a DNS lookup, which is not in the chart. As a result, the IP address of the biloxi.com proxy server is obtained and the atlanta.com proxy forwards the INVITE request. The biloxi.com proxy server receives the INVITE and responds with 100 Trying back to atlanta.com to indicate that it was received. Generally the callee's proxy server (biloxi.com in this case) consults a database to locate the callee.



As soon as Bob's SIP phone receives the INVITE, it alerts Bob to the incoming call from Alice so that Bob can decide whether to answer. Bob's phone rings while sending a 180 Ringing response, which is routed back through the 2 proxies.


Bob decides to answer the call in this example. When he picks up the handset, a 200 OK response is sent to indicate that the call has been answered. This 200 OK contains the SDP media description of the type of session that Bob is willing to establish with Alice. Then the caller sends an ACK in response to the 200 OK, and this ACK is routed directly to the callee. After this, we say the media session has been established and real media can be transmitted.


At the end of the call, Bob hangs up first, which generates a BYE message. This BYE is routed directly to the caller, Alice, bypassing the proxies. Alice confirms receipt of the BYE with a 200 OK response, which terminates the session and the BYE transaction.

To wrap up, narrowing down some related header fields would be helpful. Here is the final detailed SIP message flow on the trapezoid below.




# References


  • RFC3261 4 Overview of Operation
  • RFC3261 24.2 Session Setup