Overview
IMS, the IP Multimedia Subsystem, is the session-layer architecture standardised by 3GPP that enables voice, video, and messaging services over all-IP packet networks. It is the technology that makes VoLTE (Voice over LTE) and VoNR (Voice over New Radio) possible, replacing the circuit-switched voice infrastructure of 2G and 3G with a packet-based equivalent built on the Session Initiation Protocol (SIP) and the Diameter protocol.
IMS is not a single node; it is a collection of logical functions. The core signalling functions are the Call Session Control Functions (CSCFs), which handle SIP registration, call routing, and session negotiation. The Home Subscriber Server (HSS) provides IMS subscription data and authentication. The Application Servers (AS) host supplementary services. The Policy and Charging Rules Function (PCRF) and Policy and Charging Enforcement Function (PCEF) manage dedicated bearer establishment for the voice media path.
The original IMS specification dates to 3GPP Release 5 (2002), where it was conceived as an overlay on GPRS. In practice, IMS remained lightly deployed until LTE made it operationally necessary: LTE has no circuit-switched domain, so any operator that wants to offer voice on LTE must use IMS. GSMA IR.92 defines the Minimum IMS Profile for Voice and SMS, the baseline interoperability profile that made VoLTE viable for mass deployment in the 2010s.
In 5G, IMS continues in the same role. Voice over NR uses the same IMS architecture, carried over the 5G core in standalone (SA) deployments; in non-standalone deployments, voice remains on the EPC as VoLTE. The IMS core itself is largely unchanged: the 5G network provides the bearer that IMS rides on, while the SIP signalling stack above it stays essentially the same.
How it works
IMS call signalling uses SIP extended with 3GPP-specific headers and procedures. The media path uses RTP, with Quality of Service guaranteed by a dedicated EPS bearer (in 4G) or a QoS Flow (in 5G) negotiated through the PCRF/PCF at the time of call setup.
Registration
Before a subscriber can make or receive IMS calls, their device must register with the IMS network. The UE sends a SIP REGISTER to the Proxy-CSCF (P-CSCF), which is the entry point into the IMS. The P-CSCF forwards the REGISTER to the Interrogating-CSCF (I-CSCF), which queries the HSS via Diameter (Cx interface) to discover which Serving-CSCF (S-CSCF) is assigned to the subscriber. The I-CSCF forwards the REGISTER to the S-CSCF. The S-CSCF authenticates the subscriber using IMS AKA: it requests authentication vectors from the HSS via Cx and challenges the UE with a 401 Unauthorized response containing a RAND and AUTN. The UE computes its response using the ISIM credentials and re-sends the REGISTER. If authentication succeeds, the S-CSCF sends a 200 OK and downloads the subscriber's Initial Filter Criteria (iFC) from the HSS, which determine which Application Servers will be invoked for subsequent requests.
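The challenge/response exchange above can be sketched as a toy model. This is only an illustration of the message order (REGISTER, 401 with RAND and AUTN, REGISTER with response, 200 OK). The class names `ToyUE` and `ToySCSCF` and the helper `_toy_f` are hypothetical, and HMAC-SHA256 stands in for the real AKA functions, which this sketch does not implement.

```python
import hashlib
import hmac
import os
from typing import Dict, Optional, Tuple


def _toy_f(key: bytes, label: bytes, rand: bytes, size: int) -> bytes:
    # Hypothetical stand-in for the AKA functions; NOT the real algorithms.
    return hmac.new(key, label + rand, hashlib.sha256).digest()[:size]


class ToyUE:
    """Holds the shared key that would live in the ISIM."""

    def __init__(self, impi: str, key: bytes) -> None:
        self.impi = impi
        self.key = key

    def answer(self, rand: bytes, autn: bytes) -> bytes:
        # Check AUTN first so the UE also authenticates the network.
        if not hmac.compare_digest(autn, _toy_f(self.key, b"AUTN", rand, 16)):
            raise ValueError("network authentication failed")
        return _toy_f(self.key, b"RES", rand, 8)


class ToySCSCF:
    """Registrar side; the key table stands in for vectors fetched over Cx."""

    def __init__(self, hss_keys: Dict[str, bytes]) -> None:
        self.hss_keys = hss_keys
        self.pending: Dict[str, Tuple[bytes, bytes]] = {}
        self.registered = set()

    def on_register(self, impi: str, response: Optional[bytes] = None):
        key = self.hss_keys.get(impi)
        if key is None:
            return 403, None
        if response is None or impi not in self.pending:
            # First REGISTER: issue a challenge (401 with RAND and AUTN).
            rand = os.urandom(16)
            self.pending[impi] = (rand, _toy_f(key, b"RES", rand, 8))
            return 401, (rand, _toy_f(key, b"AUTN", rand, 16))
        # Second REGISTER: compare the UE response with the expected value.
        _rand, expected = self.pending.pop(impi)
        if hmac.compare_digest(response, expected):
            self.registered.add(impi)
            return 200, None
        return 403, None
```

A caller would invoke `on_register(impi)` once to obtain the challenge, pass it to `ToyUE.answer`, and call `on_register(impi, response)` again to complete the registration.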
Mobile-originating call
The UE sends a SIP INVITE to the P-CSCF. The P-CSCF forwards it to the S-CSCF. The S-CSCF applies the subscriber's iFC to determine which Application Servers should handle the session, then routes the INVITE toward the called party, either directly to the terminating network or, for PSTN-bound calls, via a Breakout Gateway Control Function (BGCF). In parallel, the P-CSCF asks the PCRF over Rx to authorise the media, and the PCRF instructs the PCEF over Gx to set up a dedicated EPS bearer for the voice media stream with the QoS necessary for real-time audio.
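As a rough illustration of how the S-CSCF might walk the iFC, the sketch below orders criteria by priority and returns the Application Servers whose trigger matches. The `FilterCriterion` fields and the AS URIs are simplified assumptions; real iFC trigger points are richer expressions than a method and session case, and the lower-number-first ordering is assumed here.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class FilterCriterion:
    priority: int       # assumed: lower value is evaluated first
    method: str         # SIP method the trigger matches, e.g. "INVITE"
    session_case: str   # "originating" or "terminating"
    as_uri: str         # Application Server to invoke over ISC


def select_application_servers(
    ifc: List[FilterCriterion], method: str, session_case: str
) -> List[str]:
    """Return AS URIs to invoke, in priority order, for one request."""
    ordered = sorted(ifc, key=lambda c: c.priority)
    return [
        c.as_uri
        for c in ordered
        if c.method == method and c.session_case == session_case
    ]


# Hypothetical subscriber profile, for illustration only.
profile = [
    FilterCriterion(20, "INVITE", "originating", "sip:mmtel-as.example.invalid"),
    FilterCriterion(10, "INVITE", "originating", "sip:barring-as.example.invalid"),
    FilterCriterion(10, "INVITE", "terminating", "sip:voicemail-as.example.invalid"),
]
```

With this profile, an originating INVITE would visit the barring AS before the telephony AS, and the terminating criterion would be ignored.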
Mobile-terminating call
An incoming call arrives at the IMS network as a SIP INVITE addressed to the called subscriber's public SIP URI (e.g., sip:+441234567890@operator.ims). The I-CSCF queries the HSS (Cx) to identify the subscriber's S-CSCF, then routes the INVITE to it. The S-CSCF applies the subscriber's terminating iFC and forwards the INVITE via the P-CSCF to the UE. If the subscriber is not IMS-registered, the S-CSCF can apply unregistered-state handling, such as breakout to the circuit-switched domain or delivery to voicemail, depending on the iFC configuration.
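The terminating decision can be pictured as two lookups. In this minimal sketch, plain dictionaries stand in for the HSS query over Cx and for the S-CSCF registration state; the function name, the returned labels, and the sample contact address are all hypothetical.

```python
from typing import Dict, Tuple


def route_terminating_invite(
    request_uri: str,
    hss_assignments: Dict[str, str],   # IMPU -> assigned S-CSCF (stand-in for Cx)
    registrations: Dict[str, str],     # IMPU -> registered contact at the S-CSCF
) -> Tuple[str, str]:
    """Decide what happens to an inbound INVITE for request_uri."""
    scscf = hss_assignments.get(request_uri)
    if scscf is None:
        # The I-CSCF found no subscriber for this public identity.
        return ("reject", "404 Not Found")
    contact = registrations.get(request_uri)
    if contact is None:
        # Known subscriber, not registered: unregistered-state handling.
        return ("unregistered-services", scscf)
    # Registered: the S-CSCF forwards toward the UE through the P-CSCF.
    return ("forward-via-p-cscf", contact)


impu = "sip:+441234567890@operator.ims"
hss = {impu: "sip:scscf1.operator.ims"}
```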
Media path
SIP handles the signalling: session setup, negotiation, and teardown. The actual voice media flows via RTP, negotiated via SDP in the SIP INVITE/200 OK exchange. The RTP packets leave the packet core over the SGi interface (Gi in GPRS) on their way between the UE and the far end. For calls within the same operator, the IMS network may terminate both legs and relay the media; for calls to the PSTN, a Media Gateway (MGW) converts between RTP and TDM circuits.
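To make the SDP negotiation concrete, the following sketch pulls the audio port, the offered RTP payload types, and their codec names out of an SDP body. The sample offer, its addresses, and the payload type numbers are invented for illustration, and the parser ignores many SDP features (multiple audio streams, port ranges, `a=fmtp` parameters).

```python
from typing import Dict, List, Optional


def parse_audio_offer(sdp: str) -> dict:
    """Extract the first audio m-line and its rtpmap attributes."""
    port: Optional[int] = None
    payload_types: List[int] = []
    codecs: Dict[int, str] = {}
    in_audio = False
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            fields = line[2:].split()
            # Only the first audio section is considered in this sketch.
            in_audio = fields[0] == "audio" and port is None
            if in_audio:
                port = int(fields[1])
                payload_types = [int(pt) for pt in fields[3:]]
        elif in_audio and line.startswith("a=rtpmap:"):
            pt, encoding = line[len("a=rtpmap:"):].split(" ", 1)
            codecs[int(pt)] = encoding
    return {"port": port, "payload_types": payload_types, "codecs": codecs}


SAMPLE_OFFER = "\r\n".join([
    "v=0",
    "o=- 0 0 IN IP4 192.0.2.10",
    "s=-",
    "c=IN IP4 192.0.2.10",
    "t=0 0",
    "m=audio 49152 RTP/AVP 96 97",
    "a=rtpmap:96 AMR-WB/16000/1",
    "a=rtpmap:97 telephone-event/16000",
])
```

The answerer would pick a subset of these payload types in its 200 OK, and both sides then send RTP to the negotiated address and port.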
Architecture role
IMS sits above the bearer network: it does not manage IP addresses, bearers, or mobility. These are provided by the 4G EPC (MME, SGW, PGW, PCRF) or 5G core (AMF, SMF, UPF, PCF). IMS rides on whatever bearer the access network provides, treating it as a generic IP pipe.
P-CSCF (Proxy-CSCF): The UE's SIP proxy. All SIP signalling from the UE passes through the P-CSCF. It applies IPsec to the Gm interface (UE to P-CSCF), validates SIP messages, provides SIP compression (SigComp), and interacts with the PCRF via the Rx interface to establish dedicated bearers for media. The UE discovers the P-CSCF address during LTE attach through the PCO (Protocol Configuration Options) exchanged in the PDN connectivity procedure.
S-CSCF (Serving-CSCF): The core SIP registrar and session controller. Maintains the subscriber's registration state, applies Initial Filter Criteria to route requests to Application Servers, and authenticates subscribers during registration using IMS AKA. All SIP sessions for a subscriber are anchored at their assigned S-CSCF for the duration of registration.
I-CSCF (Interrogating-CSCF): The entry point for inter-operator SIP traffic and the HSS query point for S-CSCF assignment. It acts as a topology-hiding proxy, concealing the internal IMS structure from external networks, and routes inbound SIP requests to the correct S-CSCF.
HSS (in IMS context): The HSS serves IMS as well as EPC. For IMS, it stores subscriber IMPIs (IMS Private Identities), IMPUs (IMS Public Identities), authentication data, and Initial Filter Criteria. The Cx interface (Diameter) connects the S-CSCF and I-CSCF to the HSS.
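As an illustration of what these identities look like, 3GPP TS 23.003 describes deriving a private identity and a temporary public identity from the IMSI when the UE has no ISIM. The sketch below follows that naming pattern; the function name is hypothetical, and the MNC length must be supplied by the caller because it cannot be inferred from the IMSI digits alone.

```python
def derive_ims_identities(imsi: str, mnc_length: int = 2) -> dict:
    """Build IMSI-derived IMS identities (home domain, IMPI, temporary IMPU)."""
    if not imsi.isdigit() or not 6 <= len(imsi) <= 15:
        raise ValueError("IMSI must be 6 to 15 decimal digits")
    if mnc_length not in (2, 3):
        raise ValueError("MNC length must be 2 or 3")
    mcc = imsi[:3]
    # The MNC label is always three digits, left-padded with zero.
    mnc = imsi[3:3 + mnc_length].zfill(3)
    domain = f"ims.mnc{mnc}.mcc{mcc}.3gppnetwork.org"
    return {
        "home_domain": domain,
        "impi": f"{imsi}@{domain}",
        "temporary_impu": f"sip:{imsi}@{domain}",
    }
```

For example, `derive_ims_identities("234150999999999")` yields an IMPI of `234150999999999@ims.mnc015.mcc234.3gppnetwork.org`. Identities stored on an ISIM are provisioned by the operator and need not follow this pattern.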
Application Servers: Host supplementary services: voicemail, call barring, call forwarding, conferencing, RCS (Rich Communication Services), and operator-specific value-added services. Application Servers interact with the S-CSCF via the ISC interface (SIP) and access subscriber profile data from the HSS via the Sh interface (Diameter).
Key interfaces
| Interface | Between | Protocol | Purpose |
|---|---|---|---|
| Gm | UE ↔ P-CSCF | SIP over IPsec | Subscriber SIP signalling, registration, call setup |
| Mw | CSCF ↔ CSCF | SIP | Inter-CSCF routing (P→S, I→S, S→I) |
| ISC | S-CSCF ↔ AS | SIP | Application Server triggering via Initial Filter Criteria |
| Cx | S-CSCF / I-CSCF ↔ HSS | Diameter | Registration, authentication vector fetch, subscriber data |
| Sh | AS ↔ HSS | Diameter | Application Server access to subscriber profile data |
| Rx | P-CSCF ↔ PCRF | Diameter | Media authorisation; triggers dedicated bearer for voice |
| Mg | CSCF ↔ MGCF | SIP | Interconnect to PSTN Media Gateway Control Function |
| Mi | S-CSCF ↔ BGCF | SIP | Breakout routing for PSTN-bound calls |
Security posture
IMS has a high threat level because it is exposed on two attack surfaces: the SIP interface toward external networks (roaming and inter-operator) and the Diameter interface toward the HSS. The SIP attack surface is broad: SIP is a text-based protocol with a large and intricate message parser, a complex state machine, and a registration mechanism that can be exploited to redirect a subscriber's session.
The primary IMS security control is IPsec on the Gm interface between the UE and P-CSCF, which prevents passive intercept of SIP signalling and provides mutual authentication at the IP layer. However, IPsec on Gm protects only the leg between the UE and the network edge. Inter-operator SIP traffic (S8 roaming, PSTN interconnect, RCS federation) has no equivalent protection unless bilateral TLS is configured, which leaves roaming IMS signalling exposed.
The media path, RTP, is a separate concern. Standard RTP is unencrypted. SRTP provides media encryption, but its deployment is inconsistent, and call recording infrastructure in some operator networks requires plaintext RTP access, creating pressure against SRTP deployment.
Attack surface
SIP registration hijacking
The SIP REGISTER procedure can be targeted by an attacker who can send SIP messages to the P-CSCF claiming to be the subscriber. If authentication is weak or bypassed (for example, via a previously stolen authentication vector), the attacker can register their own contact address, causing all subsequent calls to be delivered to them.
Impact: Subscriber impersonation; all incoming calls diverted to attacker. Difficulty: Medium. Requires authentication material (ISIM credentials or intercepted auth challenge/response).
SIP signalling denial of service
The SIP parser in CSCFs has historically been a large attack surface. Malformed SIP messages, excessively long headers, or message floods can crash or degrade the P-CSCF, disrupting voice service for all subscribers served by that node.
Impact: VoLTE outage for the affected P-CSCF pool; subscribers unable to make or receive calls. Difficulty: Low. Sending malformed SIP to the P-CSCF requires only IP connectivity.
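A structural pre-check in front of the SIP parser is one way to reject the malformed or oversized messages described above before they reach routing logic. The sketch below is a minimal example of such a check; the size limits are arbitrary assumptions, the function name is hypothetical, and a production SIP firewall would validate far more than this.

```python
from typing import List

MAX_MESSAGE_BYTES = 8192       # assumed limit, tune per deployment
MAX_HEADER_LINE_BYTES = 1024   # assumed limit, tune per deployment
REQUIRED_HEADERS = {"via", "from", "to", "call-id", "cseq", "max-forwards"}
COMPACT_FORMS = {"v": "via", "f": "from", "t": "to", "i": "call-id"}


def check_sip_request(raw: bytes) -> List[str]:
    """Return a list of structural problems; an empty list means the check passed."""
    problems: List[str] = []
    if len(raw) > MAX_MESSAGE_BYTES:
        problems.append("message too large")
    head, separator, _body = raw.partition(b"\r\n\r\n")
    if not separator:
        problems.append("missing end of headers")
    lines = head.split(b"\r\n")
    start = lines[0].split(b" ")
    if len(start) != 3 or start[2] != b"SIP/2.0":
        problems.append("malformed request line")
    seen = set()
    for line in lines[1:]:
        if len(line) > MAX_HEADER_LINE_BYTES:
            problems.append("header line too long")
        if line[:1] in (b" ", b"\t"):
            continue  # folded continuation of the previous header
        name, colon, _value = line.partition(b":")
        if not colon:
            problems.append("header without colon")
            continue
        key = name.strip().decode("ascii", "replace").lower()
        seen.add(COMPACT_FORMS.get(key, key))
    missing = REQUIRED_HEADERS - seen
    if missing:
        problems.append("missing headers: " + ", ".join(sorted(missing)))
    return problems
```

Running the check before full parsing keeps the expensive and historically fragile parsing code away from input that is obviously out of bounds.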
RTP media interception
Where SRTP is not deployed, the RTP media stream for an active call flows in plaintext over IP. An attacker with a man-in-the-middle position on the media path, which is possible in some roaming or interconnect scenarios, can record or inject audio.
Impact: Call content disclosure; audio injection. Difficulty: Medium. Requires network-level access to the media path.
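To show how little effort plaintext RTP demands of an on-path observer, the sketch below decodes the fixed 12-byte RTP header with the standard library alone. The function name and the sample packet values are made up for illustration; in plain RTP the codec payload follows the header unprotected.

```python
import struct


def parse_rtp_header(packet: bytes) -> dict:
    """Decode the fixed 12-byte RTP header at the start of a packet."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the fixed RTP header")
    byte0, byte1, sequence, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    version = byte0 >> 6
    if version != 2:
        raise ValueError("not an RTP version 2 packet")
    return {
        "version": version,
        "padding": bool(byte0 & 0x20),
        "extension": bool(byte0 & 0x10),
        "csrc_count": byte0 & 0x0F,
        "marker": bool(byte1 & 0x80),
        "payload_type": byte1 & 0x7F,
        "sequence": sequence,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


# Invented example packet: version 2, payload type 96, followed by dummy payload.
sample_packet = struct.pack("!BBHII", 0x80, 96, 1000, 160000, 0x1234ABCD) + b"\x00" * 32
```

The sequence number, timestamp, and SSRC recovered this way are also the fields an attacker would need to match when injecting audio into an existing stream.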
Mitigations
- IPsec on Gm interface: Enforce IPsec SA establishment between the UE and P-CSCF during IMS registration, per 3GPP TS 33.203. This protects the entire SIP signalling exchange from the UE to the network edge.
- SRTP for media: Negotiate DTLS-SRTP as the default media protection mechanism in SDP. This encrypts RTP media end-to-end between endpoints. Where lawful intercept requires plaintext media access, implement at the IMS level (AS-mediated) rather than by disabling SRTP.
- SIP message rate limiting at P-CSCF: Apply per-source-IP and per-subscriber rate limits on SIP REGISTER and INVITE messages. Enforce maximum SIP message size limits. Deploy a SIP application-layer firewall to validate message structure before routing.
- GRUU validation for registration: Use GRUU (Globally Routable User Agent URI) binding to tie registrations to specific UE instances, making it harder to hijack a registration from a different device.
- Cx interface access control: Restrict Diameter peers on the Cx interface to known S-CSCF and I-CSCF node addresses. An unauthorised node querying the HSS via Cx gains access to IMS subscriber profile data including iFC and authentication material.
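The per-source rate limiting mentioned in the mitigations could be built on a token bucket, sketched below. The class name, the rate and burst parameters, and the choice to key buckets on source IP and method are assumptions for illustration; the same structure would apply per subscriber by keying on the public identity instead.

```python
import time
from typing import Callable, Dict, Tuple


class SipRateLimiter:
    """Token bucket per (source, method) for REGISTER and INVITE."""

    LIMITED_METHODS = ("REGISTER", "INVITE")

    def __init__(
        self,
        rate_per_second: float,
        burst: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.rate = rate_per_second
        self.burst = burst
        self.clock = clock
        self.buckets: Dict[Tuple[str, str], Tuple[float, float]] = {}

    def allow(self, source_ip: str, method: str) -> bool:
        if method not in self.LIMITED_METHODS:
            return True
        key = (source_ip, method)
        now = self.clock()
        tokens, last = self.buckets.get(key, (self.burst, now))
        # Refill in proportion to elapsed time, capped at the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= 1.0:
            self.buckets[key] = (tokens - 1.0, now)
            return True
        self.buckets[key] = (tokens, now)
        return False
```

Injecting the clock keeps the limiter testable: a test can advance a fake clock instead of sleeping. A real deployment would also need to expire idle buckets so the table does not grow without bound under a spoofed-source flood.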
Spec references
- 3GPP TS 23.228: IP Multimedia Subsystem Stage 2. The normative IMS architecture specification. Section 4 defines the functional entities (P-CSCF, S-CSCF, I-CSCF, HSS, AS) and Section 5 defines the reference points between them.
- 3GPP TS 24.229: IP multimedia call control based on SIP. The normative SIP protocol specification for IMS. Sections 5 and 6 define the registration and call control procedures that all IMS nodes must implement.
- 3GPP TS 26.114: IMS Multimedia Telephony; Media handling. Defines the codec, packetisation, and SRTP requirements for VoLTE media, including the AMR-WB codec mandate and jitter buffer requirements.
- GSMA IR.92: IMS Profile for Voice and SMS. The interoperability baseline for VoLTE. Sections 2 and 4 define the minimum feature set and codec requirements that all IR.92-compliant VoLTE deployments must support.
Related topics
IMS is built on SIP for session control and RTP for media. The HSS provides subscriber data to the CSCF nodes via the Cx Diameter interface. QoS for the voice bearer is established via the PCRF using the Rx Diameter interface.
In 4G networks, IMS rides on the 4G EPC bearer infrastructure. In 5G, it rides on the 5G SA packet core. For legacy interworking, the IMS Media Gateway interworks with the MSC for PSTN calls. VoLTE roaming uses the roaming architecture for both the bearer and the IMS SIP interconnect.