UMTS Troubleshooting Guidelines
Date: 2009-26-06
UMT/IRC/INF/026976 V01/EN UMTS RF Troubleshooting Guidelines Irfan Mahmood 26th June 2009 1.02
Table of Contents
1. GLOSSARY OF TERMS AND ABBREVIATIONS
2. REFERENCES
3. ABOUT THIS DOCUMENT
   3.1. INTRODUCTION
   3.2. CONTENT
   3.3. HOW TO READ THIS GUIDE
   3.4. UTRAN/CN RELEASE AND VENDOR DEPENDENCY
   3.5. INTENDED AUDIENCE
   3.6. DISCLAIMER - WHAT IS NOT COVERED
4. DESCRIPTION OF THE OPTIMISATION PROCESS
5. CALL SETUP
   5.1. CALL SETUP RRC CONNECTION ESTABLISHMENT
   5.2. CALL SETUP FAILURES DURING THE CALL SETUP PHASE
   5.3. CALL SETUP CORE NETWORK FAILURES
   5.4. CALL SETUP RAB ESTABLISHMENT
6. CALL RELIABILITY (RETAINABILITY)
   6.1. CALL RELIABILITY RADIO LINK FAILURE (RLF)
   6.2. CALL RELIABILITY DROP OF THE RAB
   6.3. CALL RELIABILITY DROP OF RRC CONNECTION AFTER CALL SETUP
   6.4. CALL RELIABILITY RF PLANNING RELATED ISSUES
   6.5. CALL RELIABILITY CONGESTION CONTROL
   6.6. CALL RELIABILITY FAILURES IN URA_PCH/CELL_PCH MODE
   6.7. CALL RELIABILITY FAILURES IN CELL_FACH MODE
   6.8. CALL RELIABILITY HARDWARE AND NETWORK INTERFACE OUTAGES
   6.9. CALL RELIABILITY INTRA FREQUENCY SOFT/ER HANDOVER
   6.10. CALL RELIABILITY IRAT HANDOVER
   6.11. CALL RELIABILITY CELL CHANGE ORDER FROM UTRAN
   6.12. CALL RELIABILITY INTER FREQUENCY HANDOVER
   6.13. CALL RELIABILITY FAILURES ON THE TRANSPORT NETWORK
   6.14. CALL RELIABILITY FAILURES ON RLC
   6.15. CALL RELIABILITY HSDPA
   6.16. CALL RELIABILITY HSUPA/EDCH
   6.17. CALL RELIABILITY MISCELLANEOUS FAILURES
7. CALL QUALITY
   7.1. CALL QUALITY - BLOCK ERROR RATE (BLER)
   7.2. CALL QUALITY QUALITY OF SERVICE (QOS)
APPENDIX
   A. MEASUREMENT DEFINITION
   B. TIME SYNCHRONISATION OF MEASUREMENT TRACES
Change Record
This table details the changes made to the document since the last version.

Date | Changes
28 February 2009 | Updated draft after ONE team review for UA6 converged UTRAN
26th June 2009 | Updated final version after SDT/ONE/SBG teams review for UA6 converged UTRAN
EDCH | Enhanced DCH
ETSI | European Telecommunication Standard Institute
FACH | Forward Access Channel
FDD | Frequency Division Duplex
FM | Fault Management
FP | Framing Protocol
FSN | First SN
FTP | File Transfer Protocol
GGSN | Gateway GPRS Support Node
GMM | GPRS MM
GPRS | General Packet Radio Services
GPS | Global Positioning System
GSM | Global System for Mobile Communication
HCS | Hierarchical Cell Structure
HLR | Home Location Register
HHO | Hard Handover
HO | Handover
H-PLMN | Home PLMN
HSDPA | High Speed Downlink Packet Access
HS-DSCH | High Speed Downlink Shared Channel
HSUPA | High Speed Uplink Packet Access
HTTP | Hyper Text Transfer Protocol
H-USDPA | High Speed Downlink Packet Access
HW | Hardware
IE | Information Element
ICMP | Internet Control Message Protocol
IMCTA | Intelligent Multi-Carrier Traffic Allocation
IP | Internet Protocol
IRAT | Inter Radio Access Technology
IRM | Intelligent Rate Matching
KPI | Key Performance Indicator
LA | Location Area
LWS | Lucent Worldwide Services
MAC | Medium Access Control
MAC-hs | Medium Access Control high speed
MAHO | Mobile Assisted HO
MIB | Master Information Block
MM | Mobility Management
MMS | Multi Media SMS
MO | Mobile Originating
MOS | Mean Opinion Score
MSC | Mobile Switching Centre
MSS | Maximum Segment Size
MNC | Mobile Network Code
MT | Mobile Terminating
NACK | Negative ACK
NAS | Non access stratum
NBAP | NodeB Application Part
NTP | Network Time Protocol
OAM | Operation and Maintenance
OMC-U | Operations and Maintenance Centre UMTS
PCPICH | Primary CPICH
PC | Power Control
PCH | Paging Channel
PDCP | Packet Data Convergence Protocol
PDP | Packet Data Protocol
PDU | Protocol Data Unit
PHY | Physical Layer
PICH | Paging Indication Channel
PLMN | Public Land Mobile Network
PM | Performance Measurement
PPP | Point to Point Protocol
PS | Packet Switched
PSC | Primary Scrambling Code
QE | Quality Estimate
QoS | Quality of Service
RA | Routing Area
RAB | Radio Access Bearer
RACH | Random Access Channel
RAN | Radio Access Network
RANAP | Radio Access Network Application Part
RB | Radio Bearer
RL | Radio Link
RLC | Radio Link Control
RLF | Radio Link Failure
RF | Radio Frequency
RNC | Radio Network Controller
RNSAP | Radio Network Subsystem Application Part
RRC | Radio Resource Control
RRM | Radio Resource Management
RSSI | Received Signal Strength Indicator
RSCP | Received Signal Code Power
RTP | Real Time Protocol
RTT | Round Trip Time
RXLEV | Receive Level (GSM)
SACK | Selective ACKs
SBG | Services Business Group
SC | Scrambling Code
SCCPCH | Secondary CCPCH
SCH | Synchronization Channel
SDU | Service Data Unit
SGSN | Serving GPRS Support Node
SHO | Soft/softer Handover
SIB | System Information Broadcast
SIM | Subscriber Identity Module
SIR | Signal to Interference Ratio
SM | Session Management
SMS | Short Message Service
SN | Sequence Number
SRB | Signalling Radio Bearer
SRNC | Serving RNC
TB | Transport Block
TBS | Transport Block Size
TCP | Transmission Control Protocol
TFCI | Transport Format Combination Indicator
TGPS | Transmission Gap Pattern Sequence
TM | Transparent Mode
TPC | Transmit Power Control
TSSI | Transmitted Signal Strength Indicator
TX | Transmitted
UDP | User Datagram Protocol
UE | User Equipment (mobile station)
UL | Uplink
UM | Unacknowledged Mode
UMTS | Universal Mobile Telecommunication Standard
URA | UTRAN Registration Area
U-SIM | UMTS Subscriber Identity Module
UTRAN | UMTS Terrestrial Radio Access Network
VT | Video Telephony
2. References
[1] TS 23122 NAS Functions related to Mobile Station (MS) in idle mode
[2] TS 11.11 Specification of the SIM ME interface
[3] TS 25304 UE Procedures in Idle Mode and Procedures for Cell Reselection in Connected Mode
[4] GSM 03.22 Functions related to Mobile Station in idle mode and group receive mode
[5] TS 24008 Mobile radio interface layer 3 specification; Core Network Protocols Stage 3
[6] TS 25331 RRC Protocol Specification
[7] TS 25433 UTRAN Iub Interface NBAP Signalling
[8] TS 24007 Mobile radio interface signalling layer 3 specification; general aspects
[9] TS 25413 UTRAN Iu Interface RANAP Signalling
[10] TS 25423 UTRAN Iur Interface RNSAP Signalling
[11] TS 25214 Physical layer procedures (FDD)
[12] TS 25922 Radio resource management strategies
[13] TS 25201 User Equipment (UE) Radio transmission and reception (FDD)
[14] TS 25306 UE Radio Access Capabilities
[15] TS 34121 Terminal conformance specification; Radio transmission and reception (FDD)
[16] HSxPA Parameter User Guide for UA6.0, https://wcdma-ll.app.alcatel-lucent.com/livelink/livelink.exe?func=ll&objId=42687602&objAction=browse&sort=name&viewType=1
[17] UMTS Parameter User Guide for UA6.0, https://wcdma-ll.app.alcatel-lucent.com/livelink/livelink.exe?func=ll&objId=41590109&objAction=browse&sort=name&viewType=1
[18] Actix, http://www.actix.com
[19] Wireshark, documentation and download at http://www.wireshark.org/
[20] tcptrace, documentation and download at www.tcptrace.org
[21] Tardis2000, www.kaska.demon.co.uk/tardis.htm
[22] TS 25322 RLC protocol specification
[23] TS 21905 Vocabulary for 3GPP Specifications
[24] Cygwin, available at http://www.cygwin.com/
[25] DR TCP, available at http://www2.kansas.net/drtcp.asp
[26] TS 25323 Packet Data Convergence Protocol (PDCP) Specification
[27] Alcatel-Lucent 9300 W-CDMA product family counters dictionary RNC/NodeB Counters, NN20500-028PX: https://wcdma-ll.app.alcatel-lucent.com/livelink/livelink.exe?func=ll&objId=44811811&objAction=browse&sort=name&viewType=3
[28] TR 26975 Performance characterisation of the AMR speech codec Report
[29] Performance monitoring guidelines for UA06, https://wcdma-ll.app.alcatel-lucent.com/livelink/livelink.exe?func=ll&objId=43991232&objAction=browse&sort=name&viewType=1
[30] Wireless Quality Analysis (WQA) Tool, https://wcdma-ll.app.alcatel-lucent.com/livelink/livelink.exe?func=ll&objId=37186755&objAction=browse&sort=name&viewType=3
[31] Feature strategy and monitoring document, 33821 & 34700, PS RRC reestablishment UA06 feature, https://wcdma-ll.app.alcatel-lucent.com/livelink/livelink.exe?func=ll&objId=41699995&objAction=browse&sort=name&viewType=1
[32] ITU-T J.144 Objective perceptual video quality measurement techniques for digital cable television in the presence of a full reference
[33] RF Optimisation and Analysis Tool Suite, http://navigator.web.lucent.com/
[34] EDCH Settings cookbook - UA5.x and UA6, https://wcdma-ll.app.alcatel-lucent.com/livelink/livelink.exe?func=ll&objid=35999228
[35] Technical Card How to reach HSxPA Highest Throughput, U-TC-19, http://frctfd0f06660.ad2.ad.alcatel.com/GPS_Tools/tcardsUMTS/list.php?idP=10&idR=14&idT=T
[36] RNC9370 Call trace (CT) User Guide, https://wcdma-ll.app.alcatel-lucent.com/livelink/livelink.exe?func=ll&objid=37184418&objAction=browse&sort=name
[37] Iub Engineering Guidelines, https://wcdma-ll.app.alcatel-lucent.com/livelink/livelink.exe?func=ll&objId=51388411&objAction=browse&sort=name&viewType=1
Furthermore, this guideline cross-correlates the observed occurrences with the corresponding UTRAN parameters, PM counters and KPIs of the ALU UTRAN and/or CN, and gives references. All configuration parameters are given in the format OAM Object.Parameter Name to make them easier to find.
3.2. Content
There are five main chapters in this document:
- Chapter About this document provides an introduction to and an overview of the UMTS RF Troubleshooting Guideline.
- Chapter Description of the optimisation process provides a short overview of the UMTS optimisation process as covered by the UMTS RF Troubleshooting Guideline.
- Chapter Call setup lists all problems that might occur in the call establishment phase.
- Chapter Call reliability describes failures and problems that might occur after call establishment; examples are dropped calls, radio link failures and handover problems.
- Chapter Call quality deals with quality problems as perceived by the UMTS subscriber.
3.3. How to read this guide
3.4. UTRAN/CN release and vendor dependency
3.5. Intended audience
This document is directed at system engineers, network planners, RF optimisation engineers and all SBG engineers who are optimising and troubleshooting ALU 9370 based UMTS networks. It assumes that the reader has a good understanding of UMTS call processing and is familiar with the various troubleshooting and monitoring tools that are available, such as RFO, LDAT3G, CT (g, b or n), WQA, SPO in the US and NPO in the global market.
3.6. Disclaimer - what is not covered
The prerequisites before starting a performance verification and optimisation are that:
- The FM analysis shows no severe alarms that might influence the performance measurements as retrieved from the PM statistics or drive test data
- The RF design audit and optimisation has been finished for the region to be optimised
If one or both prerequisites are not fulfilled, starting the performance investigation and troubleshooting does not make much sense. For troubleshooting and optimising new clusters, drive test and interface traces are more relevant than PM counters, which may be skewed by the small number of users.
5. Call setup
One important user perception of a UMTS network is the success of setting up a UMTS call. This section describes all kinds of failures and problems that might occur during the call establishment phase. The different phases of the call setup are covered step by step in the following subsections of this chapter.
5.1. Call setup RRC connection establishment
The whole procedure is visualised in Figure 2 below and will be explained in detail in the following subsections:
Figure 2: PLMN (re-)selection and cell (re-)selection process

If the UE is in CELL_FACH or URA_PCH, the UE also performs cell reselections; possible failures in that case are covered in the subsections on failures on the RACH (subsection 5.1.3) and the FACH (subsection 5.1.6). In the following it is assumed that the UE is in idle mode.

The NAS part is described in 2 and depends mainly on the information stored on the U-SIM 2. After power-on the UE starts the initial cell search procedure and tries to decode the network information broadcast by the 2G or 3G cells on the
BCCH. The UE either selects the best suitable cell (in terms of the cell selection criteria, see below) of its H-PLMN and starts the location registration procedure or, when the H-PLMN is not available, selects a non-forbidden PLMN, camps on the best suitable cell and starts the location registration procedure. If there is no suitable cell of a non-forbidden network (no roaming agreement, lack of coverage, SIM locked in the HLR etc.), the mobile enters the Limited Service state. In this state the UE is only allowed to initiate emergency calls, provided it detects any PLMN coverage.

The AS part is defined in 2 for UMTS and in 2 for GSM. The optimisation approach is to ensure that the UE camps on the best suitable cell (in terms of RF conditions, traffic distribution assumptions etc.) to set up a call. The process can be configured by OAM parameters as explained below.

If ACB is used, the UE selects a non-barred cell based either on cell information stored on the U-SIM or on the result of the initial cell search. The prerequisite for cell selection (and also cell reselection) is that the following criteria are fulfilled:

For UMTS:
Squal = Qqualmeas - Qqualmin > 0 AND Srxlev = Qrxlevmeas - Qrxlevmin - Pcompensation > 0

For GSM:
Srxlev = Qrxlevmeas - Qrxlevmin - Pcompensation > 0

The different terms in the formulas are defined as follows:
- Qqualmeas is the measured cell quality value: the quality of the received signal expressed in CPICH Ec/N0 (dB) for FDD cells. Not applicable for TDD cells or GSM cells.
- Qrxlevmeas is the cell RX level value: the received signal level, i.e. CPICH RSCP for FDD cells (dBm), P-CCPCH RSCP for TDD cells (dBm) and RXLEV for GSM cells (dBm).
- Pcompensation is defined as Max(UE_TXPWR_MAX_RACH - P_MAX, 0) (UMTS) or Max(MS_TXPWR_MAX_CCH - P, 0) (GSM).
- UE_TXPWR_MAX_RACH is the maximum allowed power for the RACH and P_MAX is the maximum power for the given mobile power class.
The different OAM parameters of the formula above are listed in Table 1 below:

Parameter | Description
CellSelectionInfo.qQualMin | Minimum required quality level in the cell (dB). Not applicable for TDD cells or GSM cells; broadcast via SIB3 and SIB4
CellSelectionInfo.qRxLevMin | Minimum required RX level in the cell (dBm); broadcast via SIB3 and SIB4
PowerConfClass.sibMaxAllowedUlTxPowerOnRach | Maximum allowed UE Tx power (dBm); broadcast on SIB3 and SIB4

Table 1: Parameters used for cell selection

The current formulas can only be used if HCS is not deployed, i.e. FDDCell.isHcsUsed = False.
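
The UMTS cell selection criteria above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the class name, function names and the numeric values in the usage note are assumptions chosen for the example and are not taken from any ALU tool or recommended setting.

```python
from dataclasses import dataclass


@dataclass
class FddCellMeasurement:
    """Hypothetical container for one UE measurement of an FDD cell."""
    q_qual_meas_db: float    # measured CPICH Ec/N0 in dB
    q_rxlev_meas_dbm: float  # measured CPICH RSCP in dBm


def p_compensation(ue_txpwr_max_rach_dbm: float, p_max_dbm: float) -> float:
    """Pcompensation = Max(UE_TXPWR_MAX_RACH - P_MAX, 0) for UMTS."""
    return max(ue_txpwr_max_rach_dbm - p_max_dbm, 0.0)


def is_suitable_umts(meas: FddCellMeasurement,
                     q_qual_min_db: float,
                     q_rxlev_min_dbm: float,
                     ue_txpwr_max_rach_dbm: float,
                     p_max_dbm: float) -> bool:
    """Return True if both Squal > 0 and Srxlev > 0 (non-HCS case)."""
    squal = meas.q_qual_meas_db - q_qual_min_db
    srxlev = (meas.q_rxlev_meas_dbm - q_rxlev_min_dbm
              - p_compensation(ue_txpwr_max_rach_dbm, p_max_dbm))
    return squal > 0 and srxlev > 0
```

For example, with illustrative values qQualMin = -16 dB, qRxLevMin = -115 dBm and a RACH power limit of 24 dBm, a UE with P_MAX = 21 dBm measuring Ec/N0 = -10 dB and RSCP = -114 dBm would fail the Srxlev criterion (-114 + 115 - 3 = -2), although the same measurement would pass for a UE with P_MAX = 24 dBm.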
Furthermore, while camping the UE shall start to perform inter-RAT measurements if Squal <= SSearchRAT, and not otherwise. SSearchRAT is a configurable UMTS parameter broadcast on SIB3/SIB4. Note, however, that to avoid ping-ponging between UMTS and GSM the following condition should be fulfilled:

FDD_Qmin > Qqualmin + SsearchRAT

FDD_Qmin defines the minimum UMTS quality before the UE can reselect from the GSM to the UMTS layer. If the above condition is not satisfied, a UE will move from GSM to UMTS and immediately start monitoring neighbouring GSM cells again, which is undesirable. Furthermore, frequent reselections between UMTS and GSM can cause mobile terminating call failures if the PLMN pages the UE on the current network while the UE is in the process of registering with the other network.

The criterion for UMTS inter-frequency measurements is defined in a similar way; for this the parameter Sintersearch is used, which is broadcast on SIB3/SIB4.

The UE can only reselect one of the 2G or 3G cells that are defined in the reselection list broadcast via SIB11/SIB12 on the BCCH. For cell reselection the target cell has to fulfil the same criteria as specified for the cell selection case. The UE ranks the cells according to the cell ranking criteria Rs (serving cell) and Rn (neighbour cell). The UE will reselect the best GSM or UMTS cell of the ranking list if at least Treselection (UMTS parameter) has elapsed when camping on the cell. For UMTS networks without HCS the following formulas are used (both for GSM and UMTS neighbouring cells):

Rs = Qmeas,s + Qhysts
Rn = Qmeas,n - Qoffsets,n

For UMTS, Qmeas is based on either RSCP or Ec/No measurements of the serving/neighbour cell, depending on whether the first or second ranking is being performed, respectively. Qhysts is a hysteresis to avoid ping-pong effects; Qoffsets,n is an offset defined per neighbour. The reselection process using the mentioned parameters (Qoffsets,n = 0) is visualised in Figure 3 below:
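
Separately from the figure, the ranking rule can be sketched in Python. The sketch assumes the usual reading of the reselection rule, namely that the neighbour must stay better ranked than the serving cell for the whole Treselection interval; the function names and the sample format are hypothetical and chosen only for illustration.

```python
def rank_serving(q_meas_s: float, q_hyst_s: float) -> float:
    """Rs = Qmeas,s + Qhysts."""
    return q_meas_s + q_hyst_s


def rank_neighbour(q_meas_n: float, q_offset_sn: float) -> float:
    """Rn = Qmeas,n - Qoffsets,n."""
    return q_meas_n - q_offset_sn


def should_reselect(samples, q_hyst_s, q_offset_sn, t_reselection_s):
    """Decide whether the neighbour cell triggers a reselection.

    samples: list of (time_s, q_meas_serving, q_meas_neighbour) tuples in
    ascending time order (assumed format). Returns True once Rn > Rs has
    held continuously for at least t_reselection_s seconds.
    """
    better_since = None
    for time_s, q_meas_s, q_meas_n in samples:
        if rank_neighbour(q_meas_n, q_offset_sn) > rank_serving(q_meas_s, q_hyst_s):
            if better_since is None:
                better_since = time_s      # neighbour just became better ranked
            if time_s - better_since >= t_reselection_s:
                return True
        else:
            better_since = None            # condition broken, restart the timer
    return False
```

With a hysteresis of 2 dB the neighbour has to exceed the serving cell by more than 2 dB before the timer even starts, which is how Qhysts and Treselection together damp ping-pong reselections.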
Table 2 below lists the main parameters configuring the cell reselection process if no HCS is used:

Parameter | Description
CellSelectionInfo.tReselection | Time hysteresis for the cell reselection
CellSelectionInfo.sSearchRatGsm | UMTS parameter broadcast via SIB3/SIB4 defining whether or not to start inter-RAT measurements (setting of SSearchRAT)
CellSelectionInfo.sInterSearch | UMTS parameter broadcast via SIB3/SIB4 defining whether or not to start UMTS inter-frequency measurements (setting of Sintersearch)
CellSelectionInfo.qHyst1, CellSelectionInfo.qHyst2 | Hysteresis to avoid ping-pong effects (RSCP and Ec/No specific, respectively)
UmtsNeighbouringRelation.qOffset1sn or GsmNeighbouringCell.qOffset1sn | UMTS parameter broadcast via SIB11/SIB12 defining an offset on a per neighbour basis
UmtsNeighbouringRelation.qOffset2sn | UMTS parameter broadcast via SIB11/SIB12 defining an offset on a per UMTS FDD neighbour basis

Table 2: Most important parameters used for cell reselection, non HCS

The Location Registration procedure is initiated by the UE by sending MM/GMM Direct Transfer messages. For these kinds of failures see subsection 5.3.1. The cell selection and reselection process and its translations are covered in more detail in 2.

5.1.1.2. Failure symptoms, identification and fixes for improvement
A failure of the PLMN selection/reselection during a drive test can easily be identified when the screen of the drive test mobile shows Limited Service and the MNC of the selected cell is different from that of the H-PLMN. The root cause might be a network outage due to the NodeB, the RNC or a particular network interface like Iub or Iu (see also subsections 6.4.5 and 6.8), or the test van having been driven out of the coverage footprint of the (GSM and UMTS) network. In the latter case the drive test route should be checked.

When the PM counters of the CN show a high rejection rate due to missing national roaming, this may be caused by an interface problem to, or an outage in, the roaming networks, be it UMTS or GSM.

Another problem might be ACB on one or several of the surrounding GSM and/or UMTS cells. Information regarding Access Class Barring is broadcast via SIB3 or SIB4 2. ACB is used during the integration of cells.

Common problems of the cell selection/reselection procedure are non-optimised configurations of the UTRAN parameters shown in Table 1 and Table 2. As a consequence the call will be set up on a non-optimal cell or a non-optimal RAN, so the call setup might fail during the RACH procedure (subsection 5.1.3), the paging procedure (subsection 5.1.2) or the call setup procedure (subsection 5.2). A consistency check of the parameters listed in Table 1 and Table 2 might help to find parameter misconfigurations. Parameter Qoffsets,n, which is used for optimisation on a per-cell basis, should be reviewed.
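
One part of such a consistency check, the ping-pong condition FDD_Qmin > Qqualmin + SsearchRAT, could be automated along the following lines. This is a minimal sketch: the dictionary keys and the input layout are hypothetical and do not correspond to any real OAM export format.

```python
def find_ping_pong_risks(cells):
    """Return the names of cells violating FDD_Qmin > Qqualmin + SsearchRAT.

    cells: iterable of dicts with the assumed keys
      'name'            - cell identifier
      'fdd_qmin_db'     - FDD_Qmin configured on the GSM side (dB)
      'q_qual_min_db'   - qQualMin of the UMTS cell (dB)
      's_search_rat_db' - sSearchRatGsm of the UMTS cell (dB)
    """
    risky = []
    for cell in cells:
        threshold = cell["q_qual_min_db"] + cell["s_search_rat_db"]
        if not cell["fdd_qmin_db"] > threshold:
            risky.append(cell["name"])
    return risky
```

A cell with FDD_Qmin = -14 dB, qQualMin = -16 dB and sSearchRatGsm = 4 dB would be flagged, since -14 is not greater than -12; a UE reselecting to that cell from GSM could immediately start measuring GSM neighbours again.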
In case of poor 3G coverage and a low call setup success rate, the parameter SSearchRAT might be set to a lower value so that the UE starts inter-RAT measurements earlier. The cell offsets for the GSM cells can also be adapted to prefer call setup on the 2G layer.

Another problem arises when different LA codes are defined for the GSM and UMTS networks and the inter-RAT reselection criterion is met. This is in particular the case for subscribers inside a building where the UMTS coverage is weaker than the GSM coverage, but the preference is on the UMTS network. It is therefore recommended to assign the same LA codes to GSM and UMTS cells that provide coverage to the same area, to avoid LAU ping-pong.

Table 3 below lists the identification techniques for PLMN/cell (re-)selection failures in drive test traces and scanner measurements:

Problem | Trace | Trigger
Wrong PLMN selected | Uu | Any occurrence where the MNC of the cell the UE is camping on is different from the MNC of the H-PLMN
ACB | Uu | Any occurrence of IE Access Class Barred = TRUE in SIB3/SIB4
Call setup on non-optimal cell | Uu, 3G scanner | The call is set up via the RRCConnectionSetup message on a cell that is not among the x best cells listed by the 3G scanner within a y dB window
Call setup on non-optimal RAN technology | Uu, 2G/3G scanner | The RXLEV of the best measured 2G cell is within an x dB window (or even better) for y seconds compared to the RSCP of the cell the UE is camping on when sending the RRC Connection Request or Cell Update message on the RACH
Ping-pong LU between 2G / 3G | Uu | There are two consecutive LUs between 2G and 3G within x seconds and the LA codes of the cells are different

Table 3: Identification of PLMN/cell (re-)selection failures in traces

Cell selection and reselection failures cannot be detected via PMs because the process runs within the UE. Failures during the Location Registration procedure are identified via CN PMs and are covered in subsection 5.3.1.
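
As an illustration of how one of the triggers in Table 3 could be evaluated on a decoded trace, the following Python sketch flags ping-pong location updates between 2G and 3G. The event class and its fields are assumptions made for the example; real drive test tools export their own formats.

```python
from dataclasses import dataclass


@dataclass
class LocationUpdate:
    """Hypothetical decoded LU event from a Uu trace."""
    time_s: float  # timestamp in seconds
    rat: str       # "2G" or "3G"
    lac: int       # location area code of the cell


def find_ping_pong_lus(events, window_s):
    """Return pairs of consecutive LUs that change RAT and LA code
    within window_s seconds. events must be sorted by time."""
    hits = []
    for prev, cur in zip(events, events[1:]):
        changed_rat = prev.rat != cur.rat
        changed_lac = prev.lac != cur.lac
        if changed_rat and changed_lac and cur.time_s - prev.time_s <= window_s:
            hits.append((prev, cur))
    return hits
```

The window corresponds to the "x seconds" of the trigger and has to be chosen by the analyst; no value is implied here.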
5.1.2. Failures on the AICH, PICH and PCH

5.1.2.1. Concept

The UTRAN might initiate the paging procedure because of the following events:
- The UTRAN receives a paging request from the CN via RANAP
- The UE has an established PDP context, but the UE is in URA_PCH or CELL_PCH mode and downlink PS data are scheduled to be delivered

If the UE is in idle, URA_PCH or CELL_PCH mode and receives a Paging Indication on the PICH from the NodeB, the UE starts to monitor the PCH to receive the paging (Paging Type 1). If the UE is in connected mode and is paged, the UTRAN sends the paging via the DCCH (Paging Type 2). The CN might repeat the paging process if the UE has not answered within a certain time period. In addition the RNC might trigger the
repetition of the UE paging in the UTRAN. The repetition timers of the RNC and CN have to be set accordingly.

In the following it is assumed that the UE is not in connected mode, so it has received a Paging Type 1. After the UE has successfully decoded the paging on the PCH, it sends a RACH Preamble using the open loop power control algorithm. When the NodeB receives the RACH Preamble, it answers by sending an indication on the AICH. The UE answers the reception of the AICH by sending an RRC Connection Request/Cell Update/URA Update message on the RACH (the so-called RACH Message Part). Upon successful decoding the NodeB forwards the RACH Message Part to the RNC. RACH failures are covered in subsection 5.1.3. In the successful case the RNC sends back the RRC Connection Setup/Cell Update Confirm/URA Update Confirm message on the FACH. FACH failures are covered in subsection 5.1.6.

5.1.2.2. Failure symptoms, identification and fixes for improvement

Failures on the PCH, PICH and AICH are most likely due to:
- Non-optimal power settings of the PICH, AICH or PCH
- Poor radio conditions in terms of low RSCP or Ec/No because of e.g. pilot pollution (subsection 6.4.1), poor RF coverage (subsection 6.4.5), camping on a non-optimal cell (see subsection 5.1.1) etc.
- Congestion on the PCH
- The UTRAN sending paging to an incorrect URA, a mismatch in the paging DRX cycle coefficient (sent in SIB1, RRC connection setup and RB reconfiguration messages), or UE timing issues in locking onto the PCH (i.e. the UE is still asleep when the paging is sent), especially if there is no response from the UE while RF conditions are good
Table 4 below lists the main UTRAN parameters configuring the PICH, PCH and AICH:

Parameter | Description
PCH.pichPowerRelativeToPcpich | UTRAN parameter defining the power settings of the PICH
SCCPCH.SccpchPowerRelativeToPcpich | UTRAN parameter defining the power settings of the SCCPCH
RACH.aichPowerRelativeToPcpich | UTRAN parameter defining the power settings of the AICH
xxPagingTimer (1) | Timeout when the RNC will repeat the paging
PCH.nrOfPagingRepetition | Number of Type 1 paging repetitions sent by the RNC provided isPagingRepetitionAllowed = True

(1) Note this is a static MIB parameter and is not visible via OAM (e.g. WiPS)

Table 4: Parameters used for configuring the PICH, AICH and PCH

The paging itself is sent on the PCH that is a PHY channel on Uu. The drive test equipment can record paging requests. However, analysing drive test logs is not
a good way to investigate paging problems, because paging that is not received by the UE can only be detected via parallel Iub tracing. A better approach for analysing call setup problems due to paging failures is to use the PM counters of the UTRAN.

If the UE is in URA_PCH or CELL_PCH mode, the RRC connection is maintained via the common physical channels (subsection 6.6). When the UE cannot be reached via paging, the UTRAN may decide to drop the RRC connection.

Figure 4: Dropped RRC connection due to unsuccessful paging (counter shown in the figure: VS.IuReleaseReq.PS.UtranPageFail)

One way of lowering the paging load is to separate the FACH and PCH onto two SCCPCHs by introducing an additional SCCPCH channel. Creating smaller Location Areas / Routing Areas will also lower the paging load.

Failures on the AICH or PICH (PHY channels with no corresponding transport channels) can be detected using advanced UE log collection. In such cases the UE repeats the RACH preamble and there is no AICH reply even after the maximum number of preambles has been exhausted. If, on the other hand, the AICH reports a NACK, the power settings for the AICH are adequate but there could be a UL RSSI issue, which leads to the maximum number of preambles being transmitted while a NACK is still received on the AICH.
Note that a PHY ACK on the AICH can only be recorded and analysed by certain UE tools like Qualcomm QXDM and QCAT, respectively. In addition, normal RF optimisation for areas with low Ec/No will improve the situation, and a power increase can also help if no AICH response is seen on PHY.

Table 5 below lists how failures on the PICH/AICH/PCH can be identified in network traces:

Problem: RRC drop due to unsuccessful paging; Unsuccessful paging
Trace: CT

Table 6 below lists the identification possibilities using KPIs/Counters retrieved by the UTRAN PM system.

PM system | Counter / KPI | KPI Name / Description
UtranCell | (VS.NbrCellUpdates.PagingResponse / (VS.IuReleaseReqPS.UtranPageFail + VS.NbrCellUpdates.PagingResponse)) | UTRAN paging response success rate
RNC | VS.UnhandledPagingRequests.OverloadControls | This measurement provides the number of paging attempts discarded by the RNC TPU due to processor load for CS and PS calls
RNC | VS.ReceivedPagingRequest | This measurement provides the number of paging attempts received by the RNC
UtranCell | VS.CommonMacDownlinkPcchSdu | Provides the channel occupancy for the PCH channel

Table 6: PM KPIs/Counters for PICH/PCH/AICH failures
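
The paging response success rate KPI is a simple ratio of two counters. The Python sketch below shows the calculation; the function and argument names are illustrative only, and returning None for an empty measurement period is a choice made for this example rather than documented PM system behaviour.

```python
def paging_response_success_rate(paging_responses, utran_page_fail):
    """Compute the UTRAN paging response success rate.

    paging_responses: value of VS.NbrCellUpdates.PagingResponse
    utran_page_fail:  value of VS.IuReleaseReqPS.UtranPageFail
    Returns a ratio between 0 and 1, or None if both counters are zero.
    """
    total = paging_responses + utran_page_fail
    if total == 0:
        return None  # no paging activity in the period, KPI undefined
    return paging_responses / total
```

For instance, 950 paging responses and 50 paging-related Iu releases in a period give a success rate of 0.95.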
5.1.3. Random Access Procedure

5.1.3.1. Concept

The RACH Access Procedure is used when attaching to the network, setting up a call, answering a page or performing an LA Update/RA Update. The RACH procedure has been performed successfully when the RACH Message Part is received by the RNC after successful decoding at the NodeB.

The RACH is transmitted on the PHY in two separate parts. First a certain number of RACH Preambles are sent. The power of the first RACH Preamble is relatively low and is calculated using Open Loop Power Control. Each of the following RACH Preambles is transmitted with an increased power level until an ACK is received on the AICH. Then the UE transmits the RRC Connection Request (Cell Update, URA Update) message in the RACH Message Part. Figure 5 below illustrates the transmission of several RACH Preambles in different Ramping Cycles and, only after the reception of an ACK on the AICH, the transmission of the RACH message part:
Page 22 of 108
Document name:
Date:
2009-26-06
Figure 5: RACH procedure with RACH Preambles and Message Part When the UE sends the RRC Connection Request message for the first time, it resets its internal counter V300 to 1 and stars its internal guard timer T300 (taken from UTRAN parameter t300); if the UE has already sent one or several RRC Connection Request messages before, counter V300 is incremented by one and guard timer T300 is restarted. Upon reception of the RRC Connection Request message at the RNC, PM counter RRC.AttConnEstab.<per establishment cause> is incremented by one2. Upon expiry of timer T300 the UE may start again by sending RACH Preambles depending on the status of counter V300. If V300 <= N300 (configured by UTRAN parameter n300), the UE increments V300 by one, resets T300 and sends the RACH Preamble again. If V300 > N300, the UE stops sending on the RACH and stays in idle mode 2. For the Cell Update and URA Update procedure N302 and T302 are used (from network broadcasted parameters n302 and t302). Figure 6 below is showing as an example the Cell Update procedure:
<per establishment cause> is a placeholder for e.g. OrigConvCall, OrigStrmCall etc. A full list is available in 2.
Page 23 of 108
Document name:
Date:
2009-26-06
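The V300/N300 rule described above can be sketched as follows. This is only an illustration of the counting logic as stated in this subsection: the timer T300 is not modelled in real time, and the function name and the answered_on_attempt argument are hypothetical.

```python
from typing import Optional, Tuple


def rrc_connection_request_attempts(
    n300: int, answered_on_attempt: Optional[int] = None
) -> Tuple[int, bool]:
    """Return (number of RRC Connection Request transmissions, connected?).

    answered_on_attempt is the 1-based attempt that is answered before T300
    expires, or None if the network never answers (hypothetical test input).
    """
    v300 = 1        # first transmission: V300 is reset to 1 and T300 is started
    attempts = 1
    while True:
        if answered_on_attempt is not None and attempts == answered_on_attempt:
            return attempts, True      # answer received, T300 is stopped
        # T300 expired without an answer
        if v300 <= n300:
            v300 += 1                  # increment V300, restart T300, send again
            attempts += 1
        else:
            return attempts, False     # V300 > N300: UE stops and stays in idle mode


never_answered = rrc_connection_request_attempts(n300=3)
answered_second = rrc_connection_request_attempts(n300=3, answered_on_attempt=2)
```

With the rule as written, an unanswered UE transmits the message N300 + 1 times in total before giving up, which is worth keeping in mind when comparing RRC.AttConnEstab counts with the number of user call attempts.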
Failures in the RACH procedure occur if either the RACH Preamble or the RACH Message Part cannot be decoded. Possible reasons for these decoding problems are:
- Non-optimal RACH power settings
- Non-optimal RACH counter/timer settings
- RACH congestion
- Non-optimal setting of the RACH search window (see footnote 3)
- Poor radio conditions in terms of low RSCP or Ec/No because of e.g. pilot pollution (subsection 6.4.1), poor RF coverage (subsection 6.4.5), camping on a non-optimal cell (see subsection 5.1.1) etc.
In the following only the RACH-specific issues are covered; for the other (common) RF issues see the corresponding subsections. Table 7 below lists the main UTRAN parameters configuring the RACH:

Parameter | Description
RACH.constantValue | Used by the UE to calculate the Initial Preamble Power
RACH.powerOffsetPo | Determines the power increment between two successive RACH Preambles
RACH.preambleRetransMax | Determines the maximum number of preambles allowed within one Power Ramping Cycle
RACH.preambleThreshold | The threshold for preamble detection. The ratio between received preamble power during the preamble period and interference level shall be above this threshold in order to be acknowledged. This parameter is ignored by OneBTS, as it uses its internal value.
PowerConfClass.sibMaxAllowedUlTxPowerOnRach | Defines the maximum allowed power the UE may use when accessing the cell on PRACH in idle mode
RACHTxParameters.mMax |
UeTimerCstIdelMode.t300 | UE guard timer that supervises the RRC Connection Setup procedure while the UE is waiting for the RRC Connection Setup message
UeTimerCstIdelMode.n300 | Defines the number of times the UE is allowed to send the same RRC Connection Request message
UeTimerCstConnectedMode.t302 | UE guard timer that supervises the Cell/URA Update procedure while the UE is waiting for the Cell Update Confirm/URA Update Confirm message
UeTimerCstConnectedMode.n302 | Defines the number of times the UE is allowed to send the same Cell Update/URA Update message
Table 7: Parameters used for configuring the RACH

Footnote 3: Static NodeB tunable parameters for OneBTS and Class 0 parameter BTSCell.cellSize in iBTS.

5.1.3.2. Failure symptoms, identification and fixes for improvement
The RACH Preambles may only be recorded in internal UE or NodeB traces, not by normal drive test tools. In most cases only limited statistics about the PHY and MAC procedure of the RACH are listed in the drive test logs, e.g. number of RACH Preambles sent, last transmitted power etc. (see footnote 4). The RACH performance can be improved by changing the power settings and/or the timers/counters listed in Table 7. Also refer to the suggestions in section 5.1.2.2 in this regard. Table 8 below lists the identification possibilities for network and UE traces; Table 9 below lists the identification possibilities using KPIs retrieved by the UTRAN PM system.

Problem | Trace | Trigger
RACH message lost | Uu and CT | Cross-correlation Uu/CT trace: the RACH Message Part (RRC Connection Request, Cell Update or URA Update) is recorded on the Uu, but not captured in CT traces.

Counter / KPI
(VS.CommonMacUplinkCcchSdu / (VS.CommonRlcCcchDiscardSdu + VS.CommonMacUplinkCcchSdu))
VS.CommonMacUplinkDcchOverRachSdu + VS.CommonMacUplinkDtchOverRachSdu
VS.NbrCellUpdates.<causes>
PM system
UtranCell UtranCell
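To judge the effect of a change to the power parameters of Table 7, the preamble powers of one ramping cycle can be estimated with a sketch like the one below. It assumes the usual open-loop estimate (primary CPICH Tx power minus measured CPICH RSCP plus UL interference plus constant value) and a simple cap at the maximum allowed UL Tx power. The function and argument names are illustrative, the sample values are invented, and real UE behaviour near the power limit is more involved.

```python
from typing import List


def preamble_powers_dbm(
    cpich_tx_power_dbm: float,
    cpich_rscp_dbm: float,
    ul_interference_dbm: float,
    constant_value_db: float,
    power_ramp_step_db: float,
    preamble_retrans_max: int,
    max_ul_tx_power_dbm: float,
) -> List[float]:
    """Return the Tx power of each preamble in one ramping cycle (assumed model)."""
    # Open-loop estimate of the first preamble power (assumption, see lead-in).
    initial = (cpich_tx_power_dbm - cpich_rscp_dbm
               + ul_interference_dbm + constant_value_db)
    # Each further preamble is one ramp step higher, capped at the allowed maximum.
    return [
        min(initial + i * power_ramp_step_db, max_ul_tx_power_dbm)
        for i in range(preamble_retrans_max)
    ]


# Invented example values: 33 dBm CPICH, -95 dBm RSCP, -102 dBm UL interference,
# constant value -24 dB, 2 dB ramp step, 4 preambles per cycle, 24 dBm cap.
cycle = preamble_powers_dbm(33, -95, -102, -24, 2, 4, 24)
```

In this example the cycle starts at 2 dBm and ends at 8 dBm, so a UE at this path loss never comes close to the cap; a lower RSCP or a larger constant value shifts the whole cycle upwards.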
Footnote 4: It might be that an RRCConnectionRequest message is listed in the drive test logs although the RACH Message Part is never transmitted via the air interface, because the RACH Preamble has already failed. The higher layer (RRC) initiates the transmission of the RACH message. In case of a lower layer failure to deliver the preamble it is up to the higher layer to re-initiate the whole RACH procedure (meaning that another RACH message would be listed in the RRC decoding).

5.1.4. Call Admission Control (CAC)
5.1.4.1. Concept
The Call Admission Control (CAC) procedure admits or denies the establishment of the RRC connection upon RACH access to avoid an overload of the UMTS system. The CAC thresholds can be defined based on uplink noise rise and downlink load separately. The CAC algorithms and the corresponding parameters are described in detail in 2. The CAC is started after the RNC receives the RRC Connection Request message on RACH; the RNC executes UL and DL CAC before setting up the first RL on NBAP for the initial SRB channel (see Figure 7 below):
Figure 7: CAC executed after reception of RACH Message Part (figure annotation: RRC.FailConnEstab.Cong.Sum)

If the defined thresholds for CAC are exceeded, the RRC connection establishment request is denied and an RRC Connection Reject message with cause Congestion is sent back to the UE. The only optimisation approach in case of CAC rejections is to optimise the RF environment in terms of pilot pollution, neighbour list optimisation etc. In addition it should be verified that the CAC thresholds are set correctly and that the power control settings do not result in consuming too many resources to support a call. Table 10 below lists the main parameters configuring CAC before the RRC Connection Setup is sent back from the UTRAN:

Parameter | Description
CacConfClass.maxUlInterferenceLevel | Specifies the threshold for UL call admission of an RRC connection request received on RACH.
PowerPartConfClass.callAdmissionRatio | Specifies the threshold for DL call admission of a first RL setup in response to RRC connection request received
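The admission decision described above can be pictured with the following sketch. It is a simplified assumption, not the RNC algorithm: the units (dBm for UL interference, a 0..1 fraction for DL load), the use of a strict "greater than" comparison and all names are hypothetical, and the real CAC described in 2 considers more inputs.

```python
from typing import Optional, Tuple


def initial_cac(
    ul_interference_dbm: float,
    dl_load_ratio: float,
    max_ul_interference_level_dbm: float,
    call_admission_ratio: float,
) -> Tuple[bool, Optional[str]]:
    """Return (admitted, failing direction) for an RRC Connection Request."""
    if ul_interference_dbm > max_ul_interference_level_dbm:
        return False, "UL"   # UL threshold exceeded: reject with cause Congestion
    if dl_load_ratio > call_admission_ratio:
        return False, "DL"   # DL threshold exceeded: reject with cause Congestion
    return True, None        # both checks passed: proceed with the first RL setup


# Invented values: -100 dBm UL interference and 50 % DL load against
# thresholds of -95 dBm and 85 %.
decision = initial_cac(-100.0, 0.50, -95.0, 0.85)
```

Checking UL and DL separately mirrors the statement that the thresholds are defined separately for uplink noise rise and downlink load, and it keeps the failing direction available for later analysis.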
5.1.4.2. Failure symptoms, identification and fixes for improvement
CAC failures can only be identified reliably via PM counters or internal call trace/UE logs.

Problem | Trace | Trigger
 | Uu or CT | The UE sends an RRC Connection Request message and the RNC replies with an RRC Connection Reject message with cause Congestion.
Table 11: Identification of RRC Connection Reject due to Congestion

For CAC related PM KPIs see 2; the main PM counters are given below:

PM system | Counter / KPI | Name / Description
UtranCell | RRC.FailConnEstab.Cong.Sum | Number of RRC connection rejects sent with cause Congestion
UtranCell | RRC.FailConnEstab.DLPowRsrc | Number of failed RRC connections due to lack of DL power
UtranCell | VS.RadioLinkFirstSetupFailure.RrmRefusal | First RL setup failure caused by rejection due to lack of resources
Table 12: PM Counter for CAC failures

5.1.5. Radio Link Setup
5.1.5.1. Concept
The Radio Link Setup procedure is initiated in two cases:
- During the call establishment phase after the CAC is granted, when the RNC requests the NodeB to allocate resources through the NBAP Radio Link Setup message
- In case of soft handover, when allocating resources on a new NodeB
Note that after the Radio Link Setup on NBAP the RNC should initiate the establishment of the AAL2 bearer over the Iub interface using ALCAP (ALCAP Establishment Request and ALCAP Establishment Confirm). Problems on ALCAP could be due to ATM configuration and are outside the scope of this document. ATM synchronisation problems are not expected at this stage of the call because of the already successful NBAP procedure. The same is valid for the synchronisation between NodeB and RNC via the DCH-FP over AAL5 bearer.
Figure 8: Initial RRC Setup Steps after successful CAC

5.1.5.2. Failure symptoms, identification and fixes for improvement
The NBAP First Radio Link Setup procedure may fail, in which case the NodeB sends back the Radio Link Setup Failure message. According to 2 the failure causes can be classified as follows:
- Radio Network Layer Cause
- Transport Layer Cause
- Protocol Cause
- Miscellaneous Cause
Each category has many subcauses such as Transport Resources unavailable, NodeB Resources unavailable, DL Radio resources unavailable, Semantic error etc.; 3GPP has defined a variety of failure causes. A major reason for a NodeB resources problem can be a UCU/CEM capacity shortage, while a transport resources issue can point to a backhaul bandwidth limitation. The RNC can also cancel an ongoing NBAP procedure if nbap_TimerInMsec expires. This is a static parameter in 9370, and all such parameters need a MIB patch if a change is required.

Table 13 below lists the identification possibilities for network interface traces; Table 14 lists the identification possibilities using KPIs retrieved by the UTRAN PM system. CT traces are mandatory for the identification of failures during the Radio Link Setup procedure, because on Uu the RRC Connection Reject message is available with only two possible failure causes (congestion and unspecified), see also subsection 5.1.4.
Problem
Radio Link Setup I
Trace
Uu and CT CT
Trigger
Cross-correlation Uu/CT trace: Any occurrence of the NBAP Radio Link Setup Failure message in CT and RRC Connection Reject with cause unspecified or congestion after that Any occurrence of the NBAP Radio Link Setup Failure message in CT
Counter / KPI
(RRC.FailConnEstab.Unspec / RRC.AttConnEstab.<sum>)
UtranCell UtranCell
UtranCell RNC
VS.IurDrncRadioLinkSetupSuccess))
Iur
Table 14: PM KPIs for Radio Link Setup procedure

5.1.6. Call setup failures on the FACH
5.1.6.1. Concept
This subsection covers only call setup related failures on FACH; for failures in CELL_FACH mode see subsection 6.7. It is assumed that the RACH Message Part has been successfully received, the CAC has been granted and the RLs are established. In this case the RNC sends back either the RRC Connection Setup, Cell Update Confirm or URA Update Confirm message on FACH (successful case). Here only the RRC connection procedure is discussed.

The RNC sends the FACH message, resets its internal counter V351 and starts its guard timer T351. When the RNC receives the answer from the UE (i.e. RRC Connection Setup Complete) before T351 expires, the RNC stops T351. If the RNC does not receive the message before T351 expires, the RNC may resend the FACH message depending on the status of V351. If V351 <= N351 (maximum number of retries), the RNC increments V351 by one, resets timer T351 and sends the FACH message again. If V351 > N351, the RNC stops sending FACH messages to the UE and releases the reserved resources on NBAP and ALCAP. This UE context release is initiated by timer T352. Note that the RNC will not send any failure message on the Uu. The whole procedure is visualised in Figure 9 below:
<cause> include RRM refusal, INode refusal, timeout, RL setup failure, Iub congestion, lack of Iub CID, lack of CEM L1 resources, lack of Iub bandwidth.
Figure 9: Failures on FACH during RRC connection phase (figure annotation: RRC.FailConnEstab.TimeoutRepeat + RRC.FailConnEstab.Reselect)

Table 15 below lists the parameters configuring the FACH:

Parameter | Description
FACH.fachTrbPowerOffset | UTRAN parameter defining the power settings of the FACH data part
FACH.fachSrbPowerOffset | UTRAN parameter defining the power settings of the FACH control part
CallAccessPerformanceConf.t351 | UTRAN timer to repeat RRC connection setup upon expiry
CallAccessPerformanceConf.n351 |
CallAccessPerformanceConf.t352 |
5.1.6.2. Failure symptoms, identification and fixes for improvement
Possible reasons for call setup failures on the FACH are:
- Non-optimal UTRAN parameter settings (e.g. FACH signalling and traffic power)
- The FACH message is not successfully decoded due to poor FACH coverage. This can be improved by enabling the RRC connection quick repeat and FACH power adjustment feature by setting CallAccessPerformanceConf.isQuickRepeatAllowed = TRUE
- Call setup not done on an optimal cell (subsection 5.1.1)
- The message on the FACH is successfully decoded by the UE, but afterwards the RNC cannot successfully decode the answer sent by the UE (the UE is already in CELL_DCH mode, see also subsection 5.2)
- A rogue UE keeps retrying to set up an RRC connection via RACH but does not respond to the RRC Connection Setup from the RNC. No IMSI is assigned at this stage, so it is difficult to pinpoint the UE. One can use the propagation delay IE in the RRC connection setup request and the LAC/RAC to validate whether a single UE is responsible.
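The rogue UE check in the last item can be approximated by grouping the unanswered RRC connection requests of one cell by propagation delay and LAC/RAC. The sketch below assumes the requests have already been extracted from the call trace into dictionaries; the field names, the min_share threshold and the sample data are hypothetical.

```python
from collections import Counter
from typing import List, Optional, Tuple


def suspect_single_ue(
    requests: List[dict], min_share: float = 0.8
) -> Optional[Tuple[int, int, int]]:
    """Return the dominant (propagation_delay, lac, rac) key, or None.

    A key is reported only if it accounts for at least min_share of all
    unanswered requests, hinting that one UE may be responsible.
    """
    if not requests:
        return None
    counts = Counter(
        (r["propagation_delay"], r["lac"], r["rac"]) for r in requests
    )
    key, hits = counts.most_common(1)[0]
    return key if hits / len(requests) >= min_share else None


# Invented trace extract: nine requests share one signature, one differs.
unanswered = [{"propagation_delay": 7, "lac": 1201, "rac": 3}] * 9
unanswered.append({"propagation_delay": 2, "lac": 1201, "rac": 3})
suspect = suspect_single_ue(unanswered)
```

A match is only an indication, since several UEs at a similar distance share the same propagation delay; it narrows down where to look rather than identifying the device.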
Failures on the FACH can be indicated by UTRAN PM statistics, Iub and Uu traces. On Uu, FACH failures cannot be observed directly because no corresponding failure message is sent. Table 16 below lists the identification of FACH failures using call trace and UE logs, Table 17 the corresponding PM KPIs:

Problem | Trace | Trigger
Lost FACH SRB message | Uu and CT | Cross-correlation Uu/CT trace: one or more FACH messages are recorded in CT, but not on the Uu interface
FACH Failure | Uu and CT | Occurrence of Cell Update messages (repeated Cell Update Confirms ignored by the UE as seen in CT), then RRC Connection Release message with specified cause other than normal event sent back by the RNC on Uu

Counter / KPI
((RRC.FailConnEstab.TimeoutRepeat + RRC.FailConnEstab.Reselect) / RRC.AttConnEstab.<sum>)
VS.CommonMacDownlinkDcchOverFachSdu + VS.CommonMacDownlinkDtchOverFachSdu
5.2. Call setup failures during the call setup phase
- The UE is configured to report the measurements of more than one NodeB by activation of the Event 1A measurement on SIB 11
- Measurements from more than the current cell are reported
- The RNC then directs the UE to soft/softer HO through the ASU procedure
The table below lists the parameters that are important for the call setup phase:

Parameter | Description
HoConfClass.Event1AHoConfInSIB11 | Object that contains the event 1A related parameters (reporting range, time to trigger, hysteresis etc.) broadcast on SIB 11
FDDCell.isSib11MeasReportingAllowed | Activates the event 1A measurement to be broadcast on SIB 11
Table 18: Parameters important for the call setup phase

For more details about the translations see 2. If the call is set up in an area where several NodeBs provide marginal coverage and it is not possible to add the radio legs quickly, there is a high likelihood that the call will fail before the RAB is actually established. The above feature tries to minimise the wait for the reception of the Measurement Control message and helps to avoid a call drop in such conditions.

5.2.2. Failure symptoms, identification and fixes for improvement
The RRC connection might drop at this early stage for the following reasons:
- Non-optimal handover parameters configuring the call setup in soft/softer handover mode
- Non-optimal power settings
- Poor radio conditions in terms of low RSCP or Ec/No because of e.g. pilot pollution (subsection 6.4.1), poor RF coverage (subsection 6.4.5), camping on a non-optimal cell resulting in a non-optimal reselection list (see subsection 5.1.1) etc.
No specific PM counters are available to identify issues during the call setup phase, because at this point the UE is already in CELL_DCH mode and a drop of the RRC connection cannot be differentiated from an RRC drop that occurs at a later stage of the call. The drop might also occur only a very short time later, although the root cause of the failure is one of the issues mentioned above. Nevertheless it is possible to identify issues in UE traces as listed in Table 19 below:

Problem | Trace | Trigger
Call setup on a non-optimal cell | Uu, 3G scanner | The call is set up via the RRCConnectionSetup message on a cell and at the same time the 3G scanner is reporting at least x cells that are within a y dB window compared to the best measured cell.
Not best cells in AS at call setup | Uu, 3G scanner | The number of cells in the Active Set is smaller than the max AS size, but one neighbouring cell is within an x dB window compared to the Ec/No of the best cell in the Active Set
Drop of RRC connection at call setup | Uu | The call is dropped within x seconds after sending the RRC Connection Request
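The first trigger above (at least x cells within a y dB window of the best measured cell) can be evaluated on scanner data roughly as follows. The sketch assumes one scanner sample given as a mapping from a cell label to Ec/No in dB and counts only cells other than the best one; both the data layout and that counting convention are assumptions.

```python
from typing import Dict, List


def cells_within_window(ecno_db: Dict[str, float], window_db: float) -> List[str]:
    """Return the cells (excluding the best one) within window_db of the best cell."""
    if not ecno_db:
        return []
    best_cell = max(ecno_db, key=ecno_db.get)
    best = ecno_db[best_cell]
    return sorted(
        cell for cell, value in ecno_db.items()
        if cell != best_cell and best - value <= window_db
    )


def trigger_fires(ecno_db: Dict[str, float], x_cells: int, y_db: float) -> bool:
    """True if at least x_cells other cells lie within y_db of the best cell."""
    return len(cells_within_window(ecno_db, y_db)) >= x_cells


# Invented scanner sample taken at the time of the RRCConnectionSetup message.
scan = {"A": -6.0, "B": -7.5, "C": -8.0, "D": -14.0}
close_cells = cells_within_window(scan, 3.0)
```

The values of x and y are left open in Table 19 and have to be chosen per network; the same helper can be reused for the Active Set trigger by passing only the Active Set and monitored neighbour cells.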
5.3. Call setup Core Network failures
The three protocols are sublayer protocols of the Connection Management (CM); these protocols are specified in 2 and 2. CM failure causes such as the CM Service Reject Cause are mapped to the Reject Cause of the Mobility Management IE 2. Note that (almost) none of the failures in this subsection is UTRAN related, because Direct Transfer messages are transparent to the UTRAN (see footnote 6). Any of the failures can easily be detected by the corresponding failure messages. Because the protocols are transparent to the UTRAN, all PM KPIs are defined within the CN entities, e.g. on an SGSN/GGSN or 3G-MSC basis.
5.3.1. Mobility Management failures
5.3.1.1. Concept
The main function of mobility management is to support the mobility of user terminals, such as informing the network of the terminal's present location and providing user identity confidentiality. A mobility management context in the SGSN or 3G-MSC is a prerequisite for the initialisation of voice, data or VT services.

5.3.1.2. Failure symptoms, identification and fixes for improvement
For the root cause analysis please review the timer settings supervising the mobility management protocols as specified in 2 chapter 11.2. The settings of these timers are specified and not configurable. In addition, Mobility Management failures might be due to a missing roaming agreement, a locked SIM card, or CN problems such as authentication not being possible because the HLR database is inaccessible. The failure messages are retrieved from 2 chapter 9.2 (MM/CM) and 9.4 (GMM). Table 20 below lists the Mobility Management failures as they can be retrieved from UE or call traces:

Problem | Trace | Trigger
MM Authentication Reject | Uu or CT | Any occurrence of an MM Authentication reject message sent by the CN, e.g. because of not-allowed national/international roaming
CM Service Reject | Uu or CT | Any occurrence of a CM Service reject message sent by the CN; the reject cause gives an indication of the failure that occurred.
CM Service Abort | Uu or CT | Any occurrence of a CM Service abort message sent by the UE. This message is sent by the mobile station to the network to request the abortion of the first MM connection establishment in progress and the release of the RRC connection.
MM Abort | Uu or CT | Any occurrence of an MM Abort message sent by the CN. This message is sent by the network to the mobile station to initiate the abortion of all MM connections and to indicate the reason for the abortion. The rejection cause gives an indication of the failure that occurred.
MM Location Updating Reject | Uu or CT | Any occurrence of an MM Location updating reject message sent by the CN. The specified rejection cause indicates the reason for the failure, e.g. IMSI unknown in the HLR, illegal MS/ME, roaming not allowed etc.
GMM Attach Reject | Uu or CT | Any occurrence of a GMM Attach Reject message sent by the CN. The specified rejection cause indicates the reason for the failure, e.g. protocol error, wrong or incorrect IE format etc.
GMM Authentication and Ciphering Failure | Uu or CT | Any occurrence of a GMM Authentication and Ciphering Failure message sent by the UE. The specified rejection cause indicates the reason for the failure, e.g. a sync failure.
GMM Authentication and Ciphering Reject | Uu or CT | Any occurrence of a GMM Authentication and Ciphering Reject message sent by the CN.
GMM Routing Area Update Reject | Uu or CT | Any occurrence of a GMM Routing area update reject message sent by the CN. The specified rejection cause indicates the reason for the failure, e.g. protocol error, wrong or incorrect IE format etc.
GMM Service Reject | Uu or CT | Any occurrence of a GMM Service reject message sent by the CN
Table 20: Identification of Mobility Management failures in interface traces

For a listing of the PM KPIs of the Mobility Management, refer to the PM counters documentation of the 3G-MSC and SGSN from the applicable vendor.

Footnote 6: Exception: due to a bad RF environment the Direct Transfer messages might not be delivered to the other entity, because the RLC layer is not able to deliver the corresponding message even after RLC retransmissions, RLC resets etc. It is up to the corresponding higher layer (e.g. CC, GMM, MM or SM) to react accordingly to the discarded message.
5.3.2. Call Control failures
5.3.2.1. Concept
This subsection describes failures on the Call Control (CC) protocol. The CC protocol is responsible for CS call establishment and clearing procedures, call information phase procedures etc. CC procedures can only be performed if an MM context has been established between the UE and the CN (subsection 5.3.1).

5.3.2.2. Failure symptoms, identification and fixes for improvement
Table 21 below lists the CC failures as they can be retrieved from various traces 2; note that the specified cause might depend on the 3G-MSC/UE vendors:

Problem | Trace | Trigger
Abnormal Disconnect CC | Uu or CT | Any occurrence of a CC Disconnect message (either UE or CN initiated) with specified cause other than normal event
Abnormal Release CC | Uu or CT | Any occurrence of a CC Release / Release Complete message (either UE or CN initiated) with specified cause other than normal event
Table 21: Identification of CC failures in interface traces

For a listing of the PM KPIs of the CC failures as they can be retrieved by the PM system of the 3G-MSC, refer to the PM counters documentation from the applicable CN vendor. Depending on the specified failure cause, the failure might be due to missing resources (e.g. requested circuit/channel not available), a drive test configuration issue (e.g. User busy) or a protocol failure. For the root cause analysis please check the timer settings supervising the CC protocol in 2 chapter 11.3. The settings of these timers are not configurable.
5.3.3. Session Management failures
5.3.3.1. Concept
The main function of SM is to support the PDP context handling of the PS services. The SM comprises procedures for identified PDP context activation, deactivation and modification. SM procedures for identified access can only be performed if a GMM context has been established between the UE and the CN (subsection 5.3.1).

5.3.3.2. Failure symptoms, identification and fixes for improvement
The failure messages are retrieved from 2. Table 22 below lists the SM failures as they can be retrieved from either UE logs or UTRAN call trace:

Problem | Trace | Trigger
SM Activate PDP Context Reject | Uu or CT | Any occurrence of an SM Activate PDP Context Reject message sent by the CN. The specified rejection cause gives an indication of the type of failure, e.g. protocol error, missing or faulty APN, lack of resources etc.
SM Activate Secondary PDP Context Reject | Uu or CT | Any occurrence of an SM Activate Secondary PDP Context Reject message sent by the CN. The specified rejection cause gives an indication of the type of failure, e.g. protocol error, missing or faulty APN, lack of resources etc.
SM Request PDP Context Activation Reject | Uu or CT | Any occurrence of an SM Request PDP Context Activation Reject message sent by the UE. The specified rejection cause gives an indication of the type of failure, e.g. protocol error, feature not supported, lack of resources etc.
SM Modify PDP Context Reject | Uu or CT | Any occurrence of an SM Modify PDP Context Reject message sent by the CN. The specified rejection cause gives an indication of the type of failure, e.g. protocol error, service option not supported, lack of resources etc.
Table 22: Identification of SM failures in interface traces

Again, for a listing of the PM KPIs of the SM failures as they can be retrieved by the PM system of the GGSN, refer to the PM counters documentation from the applicable CN vendor.

The most common SM failures are PDP Context activation failures due to a wrong or missing APN, or because the user is not allowed to subscribe to PS services. This is also a typical configuration issue of the drive test equipment. For the root cause analysis please review the timer settings supervising the SM protocol in 2 chapter 11.2.3. The settings of these timers are specified and not configurable.
5.4.
Figure 10: RAB establishment procedure

The RAB establishment procedure is always initiated by the RANAP RAB Assignment Request and terminated by the RAB Assignment Response. The failures and failure causes of the RAB establishment are specified in 2. In the ALU UTRAN, an RRC RB reconfiguration for a change of RLC settings is also performed straight after the RNC receives the RB Setup Complete, and the UTRAN only sends back the RAB Assignment Response upon completion of this reconfiguration. This is the case if an SRB with a high data rate is used to set up the call and is later changed to a lower rate, and the two use different RLC settings. If a low rate SRB channel is used throughout, or the same RLC settings are employed for the two SRBs, this extra reconfiguration does not take place and the call flow as per Figure 10 applies. Table 23 below lists how to identify failures of the RAB establishment procedure in network call trace:
Problem | Trace | Trigger
RAB establishment failure | CT | Any occurrence of a RAB Assignment Response with specified failure cause according to 3GPP (see footnote 7)
Table 23: Identification of RAB establishment failures in traces

In the following subsections possible root causes for an unsuccessful RAB establishment are discussed in detail.
5.4.1. Intelligent Rate Matching CAC (iRM CAC)
5.4.1.1. Concept
iRM CAC is used to prevent overload of the system due to high call load when new resources or an increase in resources are requested. If Fair Sharing (feature 33694) is enabled, GBR calls on HSDPA are taken into account in addition to R99 traffic in the DL power and code cell colour calculations. This check takes place:
- During the RAB establishment, after the RNC receives the RAB Assignment Request on RANAP
- During the transition from CELL_FACH/URA_PCH to CELL_DCH mode (see also subsection 6.6), after the RNC receives the corresponding RACH messages
- When a data rate increase is triggered (see also subsection 7.2.3), after the RNC measures the DL BO/throughput increase or receives a UL BO RRC Measurement Report from the UE
- When a data rate change is triggered because the downlink dedicated Tx code power crosses certain thresholds for a R99 PS call

Cell colour thresholds can be defined for UL (noise rise, CEM) and DL (power, code, CEM) loads separately, and options exist to disregard some of these resources in the overall cell colour calculation. If iRM CAC grants the requested service, the call handling proceeds as specified (depending on the phase of the call); otherwise the call handling is as follows:
- During the RAB establishment the RNC sends a RAB Assignment Response message on RANAP with specified cause No resource available under the miscellaneous class. On Uu the following messages/outcomes indicate that CAC has not granted the requested service:
  o The assigned PS RB is smaller than the default one or the one requested in the PDP Context Activation message (see footnote 8); the default PS RB is configurable
    OR the PDP Context Activation is rejected with an appropriate specified cause such as QoS not accepted or Insufficient resources
  o The voice call receives a CC Disconnect message with specified cause resource unavailable
  o The RNC sends the UE back to idle mode with the RRC Connection Release message and specified cause congestion
    OR
  o The RNC sends back to the UE either a Cell Update Confirm or a URA Update Confirm message, but the RRC State Indicator is set to CELL_FACH/URA_PCH.
- In case of a throughput or BO measurement based data rate increase: the internal RNC BO or throughput measurement or the UE RRC Measurement is simply ignored, so the UTRAN keeps the current RB data rate

Footnote 7: There are a huge number of failure causes, but not all are related to RAB assignment failure.
Footnote 8: The requested QoS profile in the PDP Context Activation message might be ignored and only a default one is assigned.
Not granting the requested service by iRM CAC indicates either high cell loading (reflected by a Red cell colour) or an area of high interference. The approach in case of interference is to optimise the RF environment in terms of reducing pilot pollution, improving RF coverage, neighbour list optimisation etc. Features like iMCTA CAC can be enabled to divert newly set up calls by triggering HHO to a different 3G or 2G layer in case these experience CAC. The iMCTA service can even be configured to offload ongoing calls to other 3G carriers or to 2G once the originating cell colour turns Red while the target 2G/3G cell is Green. One should confirm that the various resource thresholds for changing the cell colour are not set so that they trigger an early transition to Red. Example thresholds are given below; please refer to 2 volume 5 for a detailed discussion of iRM CAC and colour thresholds.

Parameter | Description
IrmOnCellColourParameters.yellow2RedPLCThreshold | Threshold for DL power
IrmOnCellColourParameters.yellow2RedCLCThreshold | Threshold for DL code
DlIrmCEMParameters.yellow2RedDlCEMThresold | Threshold for DL CEM/UCU usage
UlIrmRadioLoadParameters.yellow2RedUlRadioLoadThreshold | Threshold for UL noise level
UlIrmCEMParameters.yellow2RedUlCEMThresold | Threshold for UL CEM/UCU usage
IrmIubTransportLoadParameter.yellow2RedDlTLCThresold | Threshold for Iub usage
Table 24: Cell colour threshold (Yellow to Red) for various resources

5.4.1.2. Failure symptoms, identification and fixes for improvement
Table 25 lists the identification techniques in traces for the case that CAC does not grant the requested service:

Problem | Trace | Trigger
CAC RAB not granted on Iu | CT | Any occurrence of a RAB Assignment Response message on RANAP with specified cause No resource available
CAC RAB not granted on Iu and Uu | CT and Uu | Cross-correlation Uu/CT trace: Any occurrence of a RAB Assignment Response message on RANAP with specified cause No resource available
CAC RAB PS not granted | CT or Uu | Any occurrence of a SM Activate PDP Context Reject message sent by the CN to the UE with specified cause Insufficient resources
CAC RB Setup PS | Uu | On Uu, in the RRC RB Setup message the IE Spreading Factor is larger than the default one and a PDP Context Activation message was sent within the last x seconds with the requested bit rate in the DL higher than the granted one
CAC RB Setup VT | Uu | The VT call has been requested, the called entity is also a UE with VT capabilities, but a voice RB is set up
CAC Release RRC | Uu | Any occurrence of an RRC Cell Update/URA Update message following within x seconds an RRC Connection Release message with specified cause congestion while the UE is in either CELL_PCH or URA_PCH mode
 | Uu | The UE sends a CC Setup message and within x seconds gets a CC Disconnect with cause resource unavailable
 | Uu | The UE sends a Cell Update/URA Update message and the RNC sends back within x seconds a Cell Update Confirm/URA Update Confirm message with RRC State Indicator set to CELL_PCH/URA_PCH
Table 25: Identification of iRM CAC rejections in interface traces

For iRM CAC related PM counters see 2; a summarised version is shown below. Note that <cause> can be UL interference, DL code starvation or DL power. There are also counters that track how long the cell colour was red or yellow for UL and DL.

PM system | Counter / KPI | Name / Description
UtranCell | RAB.FailEstab.PS.<cause> | Number of RAB Establishment Failures due to a given cause for PS domain
UtranCell | RAB.FailEstab.CS.<cause> | Number of RAB Establishment Failures due to a given cause for CS domain
UtranCell | VS.IRMTimeCellRadioColorRed, Yellow | Counter that tracks the percentage of time that a particular cell is considered red or yellow by iRM
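
Several triggers in Table 25 have the form "message A, followed within x seconds by message B with a given cause". The following is a minimal sketch of such a time-window correlation, assuming the trace has already been decoded into simple records. The `TraceEvent` structure, its field names and the function `find_followups` are hypothetical and not part of any trace tool; the example uses the CC Setup / CC Disconnect trigger.

```python
from dataclasses import dataclass


@dataclass
class TraceEvent:
    t: float          # timestamp in seconds (assumed already decoded)
    message: str      # decoded message name
    cause: str = ""   # decoded cause value, empty if the message has none


def find_followups(events, first, second, cause, window_s):
    """Return (first, second) pairs where `second` with `cause`
    follows `first` within `window_s` seconds."""
    hits = []
    ordered = sorted(events, key=lambda e: e.t)
    for i, ev in enumerate(ordered):
        if ev.message != first:
            continue
        for later in ordered[i + 1:]:
            if later.t - ev.t > window_s:
                break  # outside the observation window
            if later.message == second and later.cause == cause:
                hits.append((ev, later))
                break
    return hits


trace = [
    TraceEvent(0.0, "CC Setup"),
    TraceEvent(1.2, "CC Disconnect", "resource unavailable"),
    TraceEvent(10.0, "CC Setup"),
    TraceEvent(25.0, "CC Disconnect", "resource unavailable"),
]
hits = find_followups(trace, "CC Setup", "CC Disconnect",
                      "resource unavailable", window_s=5.0)
print(len(hits))
```

The window length x is left as a parameter because the guideline does not fix a value; it has to be chosen per network and per trigger.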
5.4.2. Radio Link Reconfiguration
5.4.2.1. Concept
After iRM CAC has taken place, the RLs on the Iub have to be reconfigured using the Radio Link Reconfiguration procedure on NBAP. The flowchart can be seen in Figure 10. The RNC tries to allocate resources on the Iub by sending a Radio Link Reconfiguration Prepare message on NBAP. The NodeB answers by sending either a Radio Link Reconfiguration Ready (successful case) or a Radio Link Reconfiguration Failure (unsuccessful case). In the successful case the RNC then sends a Radio Link Reconfiguration Commit to the NodeB, which orders the NodeB to switch to the new configuration for the Radio Link(s) at a given activation CFN. The whole procedure is described in 2.
5.4.2.2.
For the failure analysis, please refer to subsection 5.1.5.2, as the same failure causes are used in both cases. Table 27 below lists the identification triggers for network traces, Table 28 the corresponding UTRAN KPIs.

Problem | Trace | Trigger
Radio Link Reconfiguration failure | CT | Any occurrence of the NBAP Radio Link Reconfiguration Failure message on Iub x seconds after there was a Radio Link Reconfiguration Prepare on NBAP

PM system | Counter / KPI
UtranCell | (VS.RadioLinkReconfigurationPrepareSuccess / (VS.RadioLinkReconfigurationPrepareUnsuccess.<sum> + VS.RadioLinkReconfigurationPrepareSuccess))
UtranCell | VS.RadioLinkReconfigurationPrepareUnsuccess.<cause> (footnote 9)
 | VS.RadioLinkReconfigurationCancel
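
The success-rate KPI above is a plain ratio of the success counter to all Prepare outcomes. A minimal sketch of the arithmetic follows, assuming the per-cause failure counters have been exported into a dictionary. The cause labels and the sample values are invented for illustration and are not real screening names or measurements.

```python
def rl_reconfig_prepare_success_rate(success, unsuccess_by_cause):
    """Success / (sum of Unsuccess over all causes + Success).

    Returns None when there were no attempts, because the ratio is
    undefined in that case.
    """
    attempts = success + sum(unsuccess_by_cause.values())
    if attempts == 0:
        return None
    return success / attempts


# Hypothetical counter values for one cell and one reporting period.
unsuccess = {"RrmRefusal": 10, "NbapTimeout": 5, "IubCongestion": 15}
rate = rl_reconfig_prepare_success_rate(970, unsuccess)
print(f"{rate:.2%}")
```

Keeping the per-cause counters separate until the final sum makes it easy to report the dominant failure cause alongside the overall rate.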
5.4.3. Radio Bearer Establishment
5.4.3.1. Concept
Once the required resources have been successfully reconfigured in the NodeB, the RNC sends the Radio Bearer Setup message to the UE, which sends back the Radio Bearer Setup Complete message upon successfully allocating resources for the new RB. The Radio Bearer Establishment procedure may fail for different reasons (see below); in that case the UE sends back a Radio Bearer Setup Failure message to the RNC.

When a physical dedicated channel establishment is initiated by the UE, the UE shall start timer T312 and wait for N312 successive in-sync indications. On receiving N312 successive in-sync indications, the physical channel is considered established and timer T312 is stopped and reset. If timer T312 expires before the physical channel is established, the UE shall consider this a physical channel establishment failure. The whole procedure is explained in 2. Table 29 below lists the parameters for the RB Establishment:

Parameter | Description
UeTimerCstConnectedMode.t312 | UTRAN parameter configuring timer T312
UeTimerCstConnectedMode

Footnote 9:
<cause> include RRM refusal, INode refusal, NBAP timeout, RL reconfiguration failure, Iub congestion, lack of Iub CID, lack of CEM L1 resources, lack of Iub bandwidth and NodeB out of order
5.4.3.2.
If the UE sends back the Radio Bearer Setup Failure message to the RNC, the Radio Bearer Establishment procedure fails. The main reasons for the failure can be subdivided as follows:
- Physical Channel Failure (i.e. T312 expiry)
- Unsupported or invalid configuration in the UE
- Code starvation (the required channelisation code is no longer available from the code tree)
- Protocol Error

In general, the physical channel failure occurs when synchronisation between UE and NodeB is lost. This is mainly caused by poor RF conditions; see also subsections 6.1 and 6.4 for details. The other causes are expected to occur infrequently and are in general not related to RF issues. The causes of the Radio Bearer Setup Failure message are listed in chapter 10.3.3.13 in 2. Again, it is up to the UE vendor which cause from this list is chosen for the particular failure that has occurred. Table 30 lists the identification techniques in traces, Table 31 the corresponding PM KPIs for failures in the Radio Bearer Setup procedure:

Problem | Trace | Trigger
RB setup failure | Uu or CT | Any occurrence of the RRC Radio Bearer Setup Failure message

Counter / KPI (footnote 10)
(RAB.FailEstab.CS.RBSetupFail / RAB.AttEstab.CS)
(RAB.FailEstab.PS.RBSetupFail / RAB.AttEstab.PS)
6.1.
Figure 11 below shows the call handling of the RAB release in case of a dropped call:
Figure 11: RAB release call flow

RLF and RL Restore in the DL: The RLF procedure in the DL is supervised on RRC on the UE side. In CELL_DCH state, the UE starts timer T313 after receiving N313 consecutive out-of-sync indications for the established DPCCH physical channel. The UE stops and resets timer T313 upon receiving N315 successive in-sync indications. If T313 expires, the RRC connection is dropped and the UE goes to idle mode. In idle mode the UE selects a suitable cell according to the cell reselection criteria and initiates a Cell Update procedure with specified cause radio link failure (chapter 8.5.6 in 2). Subsequently, while the UE is in idle mode, the UTRAN triggers the RLF in the UL of its own accord. Figure 12 below shows the transitions between the different states; the initial state of an RL is defined as the state when a new RL is to be set up:
Figure 12: Transitions between different states

Table 32 below lists the parameters that configure the RLF and RL Restore procedure:

Parameter | Direction | Description
SynchronisationConfiguration.tRLFailure | UL | Defines the setting of T_RLFAILURE
SynchronisationConfiguration.noOutSyncInd | UL | Defines the setting of N_OUTSYNC_IND
SynchronisationConfiguration.noInSyncInd | UL | Defines the setting of N_INSYNC_IND
RadioAccessService.RlRestoreTimer | UL | Configures the guard timer to allow time for radio link restore to occur for the first RL when setting up the call. On expiry the RNC releases the call.
RadioAccessService.rlRestoreTimerAfterRlFailure | UL | Configures the guard timer T_RL_RESYNC to allow time for the normal operation of the handover and power control algorithm to delete a radio link affected by a loss of synchronization, or for re-synchronization to occur when the radio link is one of several associated with a UE connection.
UeTimerCstConnectedMode.t313 | DL | Defines the setting of T313
UeTimerCstConnectedMode.n313 | DL |
UeTimerCstConnectedMode.t314 | DL |
UeTimerCstConnectedMode.t315 | DL |
UeTimerCstConnectedMode.n315 | DL |
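
The DL radio link supervision described above (T313 started after N313 consecutive out-of-sync indications, stopped after N315 successive in-sync indications) can be sketched as a small state machine. This is an illustration only: it assumes one synchronisation indication per radio frame and expresses T313 in frames rather than seconds, and the function name is hypothetical.

```python
def dl_radio_link_failed(indications, n313, n315, t313_frames):
    """Return the frame index at which T313 expires, or None.

    indications: one bool per radio frame, True = in-sync (assumption).
    """
    out_run = 0    # consecutive out-of-sync indications
    in_run = 0     # successive in-sync indications
    timer = None   # frames left on T313, None while the timer is stopped
    for frame, in_sync in enumerate(indications):
        if in_sync:
            in_run += 1
            out_run = 0
        else:
            out_run += 1
            in_run = 0
        if timer is None:
            if out_run >= n313:
                timer = t313_frames      # start T313
        elif in_run >= n315:
            timer = None                 # stop and reset T313
        else:
            timer -= 1
            if timer <= 0:
                return frame             # T313 expired: DL radio link failure
    return None


lost = [True] * 2 + [False] * 10
recovered = [False] * 3 + [True] * 2 + [False] * 2 + [True] * 5
print(dl_radio_link_failed(lost, n313=3, n315=2, t313_frames=4))
print(dl_radio_link_failed(recovered, n313=3, n315=2, t313_frames=4))
```

Replaying logged sync indications through such a model with different n313/t313 settings gives a rough feel for how parameter changes would move the point at which a call is declared dropped.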
6.1.2. Failure symptoms, identification and fixes for improvement
A variety of causes are responsible for RLFs that may result in dropped calls:
- Pilot pollution and around-the-corner effect (subsections 6.4.2 & 6.4.3)
- Weaknesses in the neighbour planning (subsection 6.4.4)
- Problems during (or before) the call establishment phase (section 5)
- Problems with the RF coverage (subsection 6.4.5)
- Problems with the SC plan (subsection 6.4.6)
An RLF in the UL that causes the removal of a radio leg can be identified directly in UTRAN traces if no Measurement Report of type 1b/1c was sent previously and an NBAP Radio Link Failure Indication is received with cause Synchronisation failure. If an RLF with any other cause is received, the RNC should delete the RL straight away without waiting for the RL restore timer. A UL RLF can also be due to DL RLC disruption detected at the UE, which turns off its PA and later sends a Cell Update with cause RLC unrecoverable error. Identifying a dropped call due to RLF in the UL with Uu traces only is difficult because the RRC Connection Release message sent by the RNC does not have a unique cause id. For a reliable identification additional Iub/UTRAN tracing is required. Dropped calls due to RLF in the DL can be easily identified in UE logs or network traces by the Cell Update message sent by the UE. There might be an optional failure cause specified. Other cell update failures are covered in subsections 6.3 and 6.14.2. Table 33 below lists the identification possibilities using UE and network traces.

Problem | Trace | Trigger
Dropped call due to RLF in the DL on Uu | Uu | Any occurrence of an RRC Cell Update message with specified cell update cause (not failure cause) radio link failure. Note that the dropped call is the previous call and not the current one! There might be an optional failure cause specified.
RLF and RL Restore in CT and Uu | CT and Uu | Cross-correlation of Uu/CT traces: Any occurrence of a Radio Link Failure Indication on NBAP with the cause Synchronisation Failure and after x seconds a Radio Link Restore Indication on NBAP
RLF and RL Deletion in CT and Uu | CT and Uu | Cross-correlation of Uu/CT traces: Any occurrence of a Radio Link Failure Indication on NBAP with the cause Synchronisation Failure and after x seconds a Radio Link Deletion on NBAP, and the number of radio legs is more than one
RLF and dropped call in CT and Uu | CT and Uu | Cross-correlation of Uu/CT traces: Any occurrence of a Radio Link Failure Indication on NBAP with the cause Synchronisation Failure and after x seconds a Radio Link Deletion on NBAP, and the number of radio legs is equal to one
UL RLF and leg removal on Uu | Uu | Any occurrence of an Active Set Update containing any entries in the group RemovalInformationList where there was no Measurement Report within x seconds before, either with specified event id 1b/1c or without any specified event id (footnote 11)
High UE Tx power | Uu | Any occurrence where the UE is transmitting with maximum allowed power for x seconds
High DL BLER | Uu | Any occurrence where the UE is reporting a BLER higher than x % for y seconds

11 To be noted: the group eventResults containing the IE eventID is optional, for example when periodic reporting is enabled.
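
The three NBAP cross-correlation triggers of Table 33 differ only in the message that follows the Radio Link Failure Indication and in the number of radio legs at that time. A minimal sketch of that decision is shown below, assuming the follow-up message inside the x-second window and the leg count have already been extracted from the trace. The function name and the returned labels are illustrative.

```python
def classify_rlf(followup, radio_legs):
    """Classify what happened after an NBAP Radio Link Failure Indication
    with cause Synchronisation Failure.

    followup:   NBAP message seen within the observation window
    radio_legs: number of radio legs of the UE connection at that time
    """
    if followup == "Radio Link Restore Indication":
        return "RLF and RL Restore"
    if followup == "Radio Link Deletion":
        if radio_legs > 1:
            return "RLF and RL Deletion"    # one leg removed, call continues
        if radio_legs == 1:
            return "RLF and dropped call"   # last leg removed
    return "unclassified"


print(classify_rlf("Radio Link Deletion", 1))
```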
Table 34 below lists the identification possibilities using KPIs retrieved by the UTRAN PM system. Refer to Figure 13, which shows at what point during the call flow the PM counters are updated.

PM system | Counter / KPI | Name / Description
UtranCell | (VS.RAB.Drop.CS.Cause.DL_RLF / ((RAB.AttEstab.CSV.RelocIratHO - RAB.FailEstab.CSV.RelocIratHO) + RAB.SuccEstab.CS)) |
UtranCell | (VS.RAB.Drop.CS.Cause.UL_RLF / ((RAB.AttEstab.CSV.RelocIratHO - RAB.FailEstab.CSV.RelocIratHO) + RAB.SuccEstab.CS)) |
UtranCell | (VS.RAB.Drop.PS.Cause.DL_RLF / RAB.SuccEstab.PS.<sum>) | PS RAB Drop Rate due to DL RLF
UtranCell | (VS.RAB.Drop.PS.Cause.UL_RLF / RAB.SuccEstab.PS.<sum>) | PS RAB Drop Rate due to UL RLF
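
The KPI formulas above differ between the domains: the CS denominator adds the successful incoming IRAT relocations to the successful RAB establishments, while the PS denominator is the sum over all screenings. A minimal sketch of both calculations follows. The function names, the screening labels and the sample values are assumptions made for illustration.

```python
def cs_rab_drop_rate(drops, att_reloc_irat, fail_reloc_irat, succ_estab):
    """drops / ((AttEstab.RelocIratHO - FailEstab.RelocIratHO) + SuccEstab)."""
    denominator = (att_reloc_irat - fail_reloc_irat) + succ_estab
    return drops / denominator if denominator else None


def ps_rab_drop_rate(drops, succ_estab_by_screening):
    """drops / sum of RAB.SuccEstab.PS over all screenings."""
    denominator = sum(succ_estab_by_screening.values())
    return drops / denominator if denominator else None


# Hypothetical counter values for one cell and one reporting period.
cs_rate = cs_rab_drop_rate(drops=12, att_reloc_irat=50,
                           fail_reloc_irat=10, succ_estab=1160)
ps_rate = ps_rab_drop_rate(drops=30, succ_estab_by_screening={"a": 1000, "b": 500})
print(cs_rate, ps_rate)
```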
6.2.
For the reasons of these failures please refer to the corresponding sections. Note that in Figure 13, T_RL_RESYNCH is shown as the radio link failure resynchronisation response timer.

RAB drop due to CN reasons
RAB drops that are not caused within the UTRAN can be identified by an Iu Release Command message on RANAP whose specified cause is other than Release due to UTRAN generated reason and normal-release. The specified cause is CN vendor dependent.
Figure 13: Drop of the RAB due to RLF on single RLS

6.2.2. Failure symptoms, identification and fixes for improvement
Table 35 shows the identification techniques in call trace and UE logs:

Problem | Trace | Trigger
RAB drop due to UTRAN reasons on Iu | CT | Any occurrence of an Iu Release Request message with cause Release due to UTRAN generated reason on Iu
RAB drop due to UTRAN reasons on Iu and Uu | CT and Uu | Cross-correlation CT and Uu: Any occurrence of an Iu Release Request message with cause Release due to UTRAN generated reason on Iu
RAB drop due to CN reasons on Iu | CT | Any occurrence of an Iu Release Command message with cause other than Release due to UTRAN generated reason or normal-release on Iu
RAB drop due to CN reasons on Iu and Uu | CT and Uu | Cross-correlation CT and Uu: Any occurrence of an Iu Release Command message with cause other than Release due to UTRAN generated reason or normal-release on Iu
Table 35: Identification of RAB drops in network interface traces

The PM KPIs describing RAB drops are listed in Table 36. They are differentiated as:
- CS/PS RAB drops
- Reason (due to UE inactivity, due to DL power, due to Inter-frequency HHO, UE Poor Quality Minimum Rate, SRNS Relocation, ...)
- RNC level and UtranCell level

PM system | Counter / KPI | Name / Description
UtranCell | VS.RAB.Drop.CSV.UESigConnRel |
UtranCell | VS.RAB.Drop.CS.Cause.DL_RLF |
UtranCell | VS.RAB.Drop.CS.Cause.UL_RLF |
UtranCell | VS.RAB.Drop.CS.RelocUEInvol |
UtranCell | VS.RAB.Drop.CS.InterFreqHHO |
UtranCell | VS.RAB.Drop.CN.Init.CSV |
UtranCell | VS.RAB.Drop.CS.CodecChange |
UtranCell | VS.RAB.Drop.PS.UESigConnRel | Dropped PS RAB connections due to UE initiated Signalling Connection Release
UtranCell | VS.RAB.Drop.PS.Reloc.UEInvol | Dropped RAB connections caused by SRNS relocation for the PS domain
UtranCell | VS.RAB.Drop.PS.InterFreqHHO | Dropped PS RAB connections due to unrecoverable failures at inter-frequency hard handover
UtranCell | VS.RAB.Drop.CN.Init.PS.CellDCH.<causes> | CN (core network) initiated dropped PS RAB connections for UEs in Cell_DCH state per transport channel type
UtranCell | VS.RAB.Drop.PS.CsIratHo | Dropped PS RAB connections due to successful CS IRAT HO
6.3.
Note that the IE AM_RLC error indication in the Cell Update/URA Update specifies whether an error occurred on the RLC or not. If this IE is set to TRUE, the RLC in the UE has detected a failure on one of its AM RLC entities that has not been resolved by, e.g., resetting the RLC 2. For more details regarding failures on the RLC see subsection 6.14. If there is an RRC Connection Release message with cause congestion, the reason might be either iRM CAC (subsection 5.4.1) or Congestion Control (subsection 6.5).

ALU supports RRC connection re-establishment for PS, CS and simbearer services, whereby on detection of the RLF or RLC error the UE sends a cell update with the corresponding cause; consequently the old radio links are deleted and new radio links are established by the RNC. This procedure fails if the UE does not send the cell update, a RANAP procedure has started or a NAS message is received to be forwarded to the UE. The procedure will also not occur if all the radio legs are on the Drift RNC, a RANAP procedure is in progress or the UE indicates that the T314 or T315 timer has expired. Further information on activation, configuration and monitoring of this feature set can be obtained from 2.

Parameter | Description
RadioAccessService.isPsRrcReestablishAllowed | Activation flag for PS call reestablishment feature
RadioAccessService.isCSRrcReestablishAllowed | Activation flag for CS call reestablishment feature
RadioAccessService.isPSRrcReestablishforICFailureAllowed | Activation flag for PS call reestablishment for invalid configuration failure scenario while doing UL data rate change for HSDPA call
RadioAccessService.rrcReestCSMaxAllowedTimer | Timer started at the RNC on reception of NBAP RLF for a CS call; if no cell update is received from the UE within this time, the call is dropped
 | Timer started at the RNC on reception of NBAP RLF for a PS call; if no cell update is received from the UE within this time, the call is dropped
 | Quality threshold for deciding if PS reestablishment should take place, based on the EcNo reported by the UE in the cell update
 | Quality threshold for deciding if CS reestablishment should take place, based on the EcNo reported by the UE in the cell update
[Figure: RRC connection re-establishment call flows between UE, Node B, RNC and CN]

Call flow starting with the Cell Update from the UE:
1) Cell Update (Cause Radio Link Failure)
2) Radio Link Deletion Request
3) Radio Link Deletion Response
4) ALCAP Release
5) Radio Link Setup
6) Radio Link Setup Response
7) ALCAP & FP Synch
8) Cell Update Confirm
9) Radio Bearer Reconfiguration Complete
10) UE Measurements
Notes in the figure: new radio links based upon measured Ec/Io; UE moved back to Cell DCH.

Call flow starting with the Radio Link Failure Indication:
1) Radio Link Failure Indication
2) Radio Link Deletion Request
3) Radio Link Deletion Response
4) ALCAP Release
5) Cell Update (Cause Radio Link Failure)
6) Radio Link Setup
7) Radio Link Setup Response
8) ALCAP & FP Synch
9) Cell Update Confirm
10) Radio Bearer Reconfiguration Complete
11) UE Measurements
Notes in the figure: timer T_RL_RESYNCH; new radio links based upon measured Ec/Io; UE moved back to Cell DCH.
6.3.2. Failure symptoms, identification and fixes for improvement
Table 38 and Table 39 below list the identification of dropped RRC connections and the corresponding PM KPIs respectively:

Problem | Trace | Trigger
Drop of RRC connection I | Uu | Any occurrence of an RRC Connection Release message on Uu with specified cause unspecified or pre-emptive release
Drop of RRC connection II | Uu | Any occurrence of an RRC Connection Request message on Uu with establishment cause Call re-establishment
Drop of RRC connection III | Uu | The UE simply goes to idle mode without dropping the call in a regular way. There are no RRC/Direct Transfer messages indicating a regular/irregular call termination within x ms. The UE starts monitoring the BCCH and might perform a cell re-selection following a Cell Update with cause RLF or RLC unrecoverable error (see also Table 33).
 | Uu | The RNC sent a Cell Update Confirm but the UE did not respond with an RB Reconfiguration Complete within x seconds, indicating failure of the re-establishment
Counter / KPI
(VS.RrcReEstablishmentSuccess.<sum> / VS.RrcReEstablishmentAttempt.<sum>)
VS.RrcConnectionRelease.ReestablishmentReject

PM system: UtranCell
VS.RrcReEstablishmentAttempt.PS_Other
VS.RrcReEstablishmentAttempt.PSULRLFail
VS.RrcReEstablishmentAttempt.PSDLRLFail
VS.RrcReEstablishmentAttempt.PSULRlcUnrecoverErr
VS.RrcReEstablishmentAttempt.PSDLRlcUnrecoverErr
VS.RrcReEstablishmentAttempt.PSInvCfgFail
VS.RrcReEstablishmentAttempt.CS_Other
VS.RrcReEstablishmentAttempt.CSULRLFail
VS.RrcReEstablishmentAttempt.CSDLRLFail

PM system: UtranCell
Name / Description: Number of successes for RRC connection re-establishment procedure for different reestablishment types.
VS.RrcReEstablishmentSuccess.PS_Other
VS.RrcReEstablishmentSuccess.PSULRLFail
VS.RrcReEstablishmentSuccess.PSDLRLFail
VS.RrcReEstablishmentSuccess.PSULRlcUnrecoverErr
VS.RrcReEstablishmentSuccess.PSDLRlcUnrecoverErr
VS.RrcReEstablishmentSuccess.PSInvCfgFail
VS.RrcReEstablishmentSuccess.CS_Other
VS.RrcReEstablishmentSuccess.CSULRLFail
VS.RrcReEstablishmentSuccess.CSDLRLFail
6.4.
This is a typical issue for RF optimisation and can be detected via Uu interface traces and 2G/3G scanner measurements of the PHY layer. In addition, the number of cells in the active set is a good metric of handover zone definition within the UMTS network. Table 40 lists identification techniques in drive test and scanner measurement data, while Table 41 gives a way to identify areas with multiple pilot overlap at sector level:

Problem | Trace | Trigger
Pilot pollution I | UE or 3G scanner | There are more than x cells with a measured Ec/No within y dB of the best measured Ec/No
Pilot pollution II | UE or 3G scanner | The aggregate Ec/No of the cells in the active set is below x dB while the measured RSCP is above y dBm for z ms
High number of cells in active set | Uu | The active set size is > 1 in more than x % of all measured samples (footnote 13)
Overshooting | UE or 3G scanner | Ec/No of a site y km away is within x dB of the best measured Ec/No
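
The first pilot pollution trigger above compares every measured pilot against the best one. A minimal sketch of that check on a single measurement sample is shown below. It assumes that the best server itself is included in the count, which the guideline does not state explicitly, and the function name and threshold values are illustrative only.

```python
def is_pilot_polluted(ecno_db, max_cells, window_db):
    """True if more than `max_cells` pilots lie within `window_db`
    of the best measured Ec/No (best pilot included in the count)."""
    if not ecno_db:
        return False
    best = max(ecno_db)
    close = [value for value in ecno_db if best - value <= window_db]
    return len(close) > max_cells


sample = [-8.0, -9.0, -10.0, -11.0, -18.0]   # Ec/No in dB, one value per pilot
print(is_pilot_polluted(sample, max_cells=3, window_db=5.0))
```

Run over all scanner samples of a drive test, the share of polluted samples per area gives a simple ranking of where local RF optimisation is most needed.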
Counter / KPI (PM system: UtranCell)
VS.UeWithNRadioLinksEstCellsBts.<causes>

(
  ( VS.UeWithNRadioLinksEstCellsBts.N1Rl * 1 )
  + ( ( VS.UeWithNRadioLinksEstCellsBts.N2RL1Rc1SBts + VS.UeWithNRadioLinksEstCellsBts.N2RL1Rc1ABts ) * 2 )
  + ( ( VS.UeWithNRadioLinksEstCellsBts.N3RL1Rc2SBts + VS.UeWithNRadioLinksEstCellsBts.N3RL1Rc1SBts1ABts + VS.UeWithNRadioLinksEstCellsBts.N3RL1Rc2ABts ) * 3 )
  + ( ( VS.UeWithNRadioLinksEstCellsBts.N4RL1Rc2SBts1ABts + VS.UeWithNRadioLinksEstCellsBts.N4RL1Rc1SBts2ABts + VS.UeWithNRadioLinksEstCellsBts.N4RL1Rc3ABts ) * 4 )
  + ( ( VS.UeWithNRadioLinksEstCellsBts.N5RL1Rc2SBts2ABts + VS.UeWithNRadioLinksEstCellsBts.N5RL1Rc1SBts3ABts + VS.UeWithNRadioLinksEstCellsBts.N5RL1Rc4ABts ) * 5 )
  + ( ( VS.UeWithNRadioLinksEstCellsBts.N6RL1Rc2SBts3ABts + VS.UeWithNRadioLinksEstCellsBts.N6RL1Rc1SBts4ABts + VS.UeWithNRadioLinksEstCellsBts.N6RL1Rc5ABts ) * 6 )
) / VS.UeWithNRadioLinksEstCellsBts.<sum>
Table 41: PM Counter for estimation of soft handover zone

13 This is not really a problem to be identified in a trace; it is rather an indication of generally non-optimal RF conditions.

6.4.3. Around-the-corner-effect
6.4.3.1. Concept
The around-the-corner effect is quite often encountered in a dense urban environment. It describes a moving UE for which the receive level of the cells in the active set decreases dramatically (in terms of Ec/No and RSCP) while the receive level of cells in the monitored or detected set suddenly increases. The root cause is shadowing by buildings or other obstructions. As a consequence, the quality of the call will drop if the UE is not fast enough to adapt (via Active Set Update) to the new RF conditions. Figure 16 shows the effect in a dense urban environment:
Figure 16: Around-the-corner problem (legend: Active Set Pilot, Interfering Pilot)

To overcome the around-the-corner problem, local optimisation of the RF environment is required. In addition, the RF planner has to ensure that the parameters configuring the handover procedure make it fast enough (subsection 6.9). If a drop does happen, the RRC connection re-establishment feature will be able to recover the call, provided it is not over the Iur.

6.4.3.2. Failure symptoms, identification and fixes for improvement
The around-the-corner effect can be detected via UE traces when analysing the PHY layer; Table 42 summarises the triggers in UE traces:

Problem | Trace | Trigger
Around-the-corner effect I | Uu | Sudden drop/increase of the Ec/No of cells in the active set by x dB for at least the next y ms; the average aggregate Ec/No is below z dB
Around-the-corner effect II | Uu | Sudden drop/increase of the RSCP of cells in the active set by x dB for at least the next y ms; the average aggregate RSCP is below z dBm
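
The triggers above combine a sudden level change with a minimum duration and an average level. A minimal sketch for the drop case is given below, assuming a time-ordered series of aggregate active-set Ec/No samples. The function name, the sample layout and the thresholds are illustrative assumptions, and the increase case would need the mirrored comparison.

```python
def around_the_corner(samples, drop_db, hold_ms, floor_db):
    """Return the time (ms) of the first sudden drop, or None.

    samples: time-ordered list of (t_ms, level), e.g. aggregate Ec/No in dB.
    A hit needs a drop of at least drop_db between two consecutive samples
    that persists for hold_ms, with the average over that period below floor_db.
    """
    for i in range(1, len(samples)):
        start_ms = samples[i][0]
        level = samples[i - 1][1] - drop_db   # level reached after the drop
        window = []
        for t_ms, value in samples[i:]:
            if value > level:
                break                         # recovered before hold_ms elapsed
            window.append(value)
            if t_ms - start_ms >= hold_ms:
                if sum(window) / len(window) < floor_db:
                    return start_ms
                break
    return None


ecno = [(0, -6.0), (100, -6.0), (200, -15.0), (300, -16.0), (400, -15.0), (500, -16.0)]
print(around_the_corner(ecno, drop_db=6.0, hold_ms=200, floor_db=-12.0))
```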
6.4.4. Non-optimal neighbour definitions
6.4.4.1. Concept
One of the essential tasks of RF planning is neighbour list assignment. When the neighbour lists are not well defined, the UE might not be on an optimal cell (or set of cells) and the call is at risk of dropping. The following neighbour lists exist in the OAM:
- 3G-3G soft/softer MAHO list
- 3G-2G neighbour MAHO list
- 3G-2G neighbour DAHO/blind HO list
- 2G-3G neighbour list
The parameters configuring the intra-frequency soft/softer HO are listed in subsection 6.9; 2G IRAT parameter settings are covered in subsection 6.10. This subsection focuses on the integrity of the neighbour list definitions themselves. To maintain the integrity of the different HO lists, a database system with the following tables is required:
- Table keeping site specific information of the UMTS cells
  o Site id (for identification of co-located 2G/3G cells)
  o Sector id (to check if a 2G cell is identical, resulting in an identical coverage footprint for a possible DAHO/blind HO definition)
  o Userlabel
  o Flag borderCellToGSM
- Table keeping site specific information of the GSM cells
  o Site id (for identification of co-located 2G/3G cells)
  o Sector id (to check if a 3G cell is identical, resulting in an identical coverage footprint for a possible DAHO/blind HO definition)
  o BCCH frequency
- Different neighbour lists including
  o Priority flag for 3G-3G HO definition in case Type 1 is the selected NL selection algorithm (see also subsection 6.9 for details)
  o Distance between the two cells

With this kind of information the following database queries might be defined:
- Check for symmetry or reciprocity
- Check for missing co-located neighbour definition (3G-3G, 3G-2G, 2G-3G)
- Check for right Priority flag
- Check for missing DAHO/blind HO definitions
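
The symmetry (reciprocity) check listed above can be sketched outside a database as well. The example below assumes the neighbour relations have been exported as a mapping from each cell to the set of cells it lists as neighbours. The cell names and the function name are invented for illustration.

```python
def missing_reciprocal(neighbours):
    """Return sorted (a, b) pairs where cell a lists b as a neighbour
    but b does not list a.

    neighbours: dict mapping a cell id to the set of its neighbour cell ids.
    """
    missing = []
    for cell, listed in neighbours.items():
        for other in listed:
            if cell not in neighbours.get(other, set()):
                missing.append((cell, other))
    return sorted(missing)


nl = {
    "A": {"B", "C"},
    "B": {"A"},
    "C": set(),        # C does not list A: asymmetric definition
}
print(missing_reciprocal(nl))
```

Not every asymmetric relation is an error (one-way definitions can be intentional), so the output is best treated as a review list rather than an automatic fix list.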
Figure 17: Neighbour list checking using MS Access

RF drive data analysis tools like LDAT 2 have a missing neighbour list analysis feature that can be used to debug an existing network as well as to suggest NLs for technology overlay deployment based on the GSM HO matrix:
6.4.4.2.
The following methods can be used to detect/fix a non-optimal neighbour list assignment:
- Cross-correlation measurements of Uu drive test logs with 2G/3G scanner
  o Missing 3G-3G neighbour definition: the measured RSSI is relatively high, but the RSCP of the cells in the active set is relatively low
  o Missing 3G-2G neighbour definition: the UE measured RSSI is relatively low and the GSM coverage footprint is relatively strong as measured by the 2G scanner
  o Missing 2G-3G neighbour definition: the UE is staying in 2G although there is sufficient 3G coverage as indicated by the RSSI measurements of the 3G scanner
  o Analysis of the UE Measurement Reports: the UE might report cells of the detected set but these cells are not defined in the Compound NLA (see also subsection 6.9)
- RF prediction tool analysis like LDAT3G
- CTn traces 2, which specifically capture 3G-3G and 3G-2G mobility data. For example, the IRAT HO matrix can be derived from these and analysed using the WQA tool 2 with the focus on
  o Deletion of unnecessary handover definitions
  o Investigation of high amount of HO failures
In a similar way the intra-frequency HO matrix can be derived and analysed using CTn and WQA to discover possible missing neighbour list entries or overshooting sectors. Table 43 below lists the identification possibilities for network interface traces:

Problem | Trace | Trigger
Missing 3G/3G neighbour definition | Uu, 3G scanner | Any occurrence where the measured RSSI (retrieved by 3G scanner) is within an x dB window compared with the measured aggregate RSCP of the cells in the active set (measured by the UE) for y seconds; at the time of the measurement the UE is in 3G
 | Uu, 2G scanner | The measured RXLEV of the best 2G cell (measured by the 2G scanner) is within an x dB window compared to the measured aggregate RSCP of the cells in the active set (measured by the UE) for y seconds; at the time of the measurement the UE is in 3G
 | Uu, 3G scanner | Any occurrence where the measured RSSI (retrieved by 3G scanner) is within an x dB window compared with the measured RXLEV of the 2G serving cell (measured by the UE) for y seconds; at the time of the measurement the UE is in 2G
6.4.5. Poor RF coverage
6.4.5.1. Concept
Especially in the early days of 3G there will be many areas with poor RF coverage. Even after the integration of the sites, cell breathing (especially in the busy hour) may leave the Ec/No insufficient to guarantee RF coverage for some services (such as 384 kbit/s). When this happens, either the radio bearer has to be reconfigured due to an increasing Tx code power in the DL when using a PS R99 data service (subsection 6.17.1), or an HHO towards a 3G or 2G cell has to be triggered to rescue the call using the iMCTA Alarm mechanism. Subsection 6.7.1 describes a drop of the RRC connection for a mobile in CELL_FACH mode; subsection 6.6 describes a similar scenario for a UE in CELL_PCH/URA_PCH mode.

6.4.5.2. Failure symptoms, identification and fixes for improvement
- Low receive level in terms of RSSI (means low measured RSCP values for all pilots in the active set)
- High NodeB TX power (probably also high UE TX power)
One root cause for low RF coverage might be a NodeB outage; this has to be crosschecked with the Alarm data (see also subsection 6.8). Table 44 below lists identification triggers for low RF coverage in various traces:

Problem | Trace | Trigger
Low RF coverage I | 3G scanner or Uu | Measured RSSI of the 3G cells is below x dBm for y seconds
Low RF coverage II | 3G scanner, Uu | Measured aggregate RSCP of the cells in the active set is below x dBm for y seconds and there is no RSCP of a 3G cell measured by the 3G scanner better than z dB compared to the aggregate RSCP
Low RF coverage III | Uu, CT | The reported NodeB TX power is above y dBm for x seconds and the measured RSCP of that NodeB is below z dBm
Low Ec/No | Uu | Measured aggregate Ec/No of the cells in the active set is below x dB for y seconds
Table 44: Identification of low RF coverage in network interface traces

The ALU PM system allows the monitoring of Ec/Io, RSCP and CQI values reported by the UE. Knowing the cell selection criteria (UPUG default Qqualmin = -16 dB and Qrxlevmin = -115 dBm) and the design coverage probability of the 3G network, one can roughly estimate whether coverage holes exist at sector level.

PM system | Counter / KPI
UtranCell | VS.IrmcacDistributionEcNO.<screenings>
UtranCell | VS.IrmcacDistributionRscp.<screenings>
UtranCell | VS.HsdpaReceivedCQI.<screenings>
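
The rough coverage estimate described above can be illustrated as follows: compute the share of reported (Ec/No, RSCP) samples that satisfy both cell selection thresholds and compare it with the design coverage probability. This is a sketch under simplifying assumptions: it works on individual samples rather than on the binned PM distributions, it ignores the power compensation term of the cell selection criterion, and the 95 % design probability and the sample values are invented.

```python
Q_QUALMIN_DB = -16.0      # default Qqualmin quoted in the text
Q_RXLEVMIN_DBM = -115.0   # default Qrxlevmin quoted in the text


def coverage_fraction(samples):
    """Fraction of (ecno_db, rscp_dbm) samples above both thresholds."""
    if not samples:
        return None
    covered = sum(1 for ecno, rscp in samples
                  if ecno > Q_QUALMIN_DB and rscp > Q_RXLEVMIN_DBM)
    return covered / len(samples)


def possible_coverage_hole(samples, design_probability):
    """True if the measured coverage share is below the design target."""
    fraction = coverage_fraction(samples)
    return fraction is not None and fraction < design_probability


reports = [(-10.0, -90.0), (-12.0, -100.0), (-18.0, -110.0), (-14.0, -118.0)]
print(coverage_fraction(reports), possible_coverage_hole(reports, 0.95))
```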
6.4.6. Poor PSC plan
The PSC is used for cell identification during the initial cell search and when measuring the neighbour cells in idle and connected mode. If proper rules are not followed, the UE may experience failures in the neighbour list measurements, or, where the coverage areas of two NodeBs sharing the same PSC overlap, interference and synchronisation issues will occur. This is the case if an overshooting site has the same PSC as one of the cells in the active set, causing co-pilot interference, or if neighbours of two active set cells share the same PSC, creating NL ambiguity. It is hardly possible to identify PSC issues in drive test data because the measured low Ec/No values or even RLF can also be the result of pilot pollution or the around-the-corner effect (subsection 6.1 and 6.4.1). WQA should permit detection of PSC duplication using CTn traces.

The following counters can also be used to detect cells that may contain ambiguous PSC entries after neighbour list compounding. For intra-frequency neighbours the PSC entries must be unique, while for inter-frequency entries the ARFCN and PSC combination should be unique; this may be the case for first-tier neighbours. However, it may not be the case when neighbour list compounding is used, as several neighbour lists are combined to create the final neighbour list.

PM system | Counter / KPI
UtranCell | VS.AggregateCellListAmbiguousCellIntraFreq
UtranCell | VS.AggrCellListAmbigCellInterFreq
Table 46: Count of NL sent with at least one ambiguous cell at RRC
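
The ambiguity that these counters report can also be checked offline on planned neighbour lists. The sketch below compounds several neighbour lists and reports every (ARFCN, PSC) combination that maps to more than one cell. The tuple layout, the cell ids and the function name are assumptions for illustration; the actual compounding algorithm of the RNC is not modelled.

```python
from collections import defaultdict


def ambiguous_entries(neighbour_lists):
    """Return {(arfcn, psc): [cell ids]} for combinations used by
    more than one distinct cell across the compounded lists.

    neighbour_lists: iterable of lists of (cell_id, arfcn, psc) tuples.
    """
    cells_by_key = defaultdict(set)
    for neighbour_list in neighbour_lists:
        for cell_id, arfcn, psc in neighbour_list:
            cells_by_key[(arfcn, psc)].add(cell_id)
    return {key: sorted(cells)
            for key, cells in cells_by_key.items() if len(cells) > 1}


nl_cell_a = [("c1", 10562, 100), ("c2", 10562, 101)]
nl_cell_b = [("c3", 10562, 100), ("c2", 10562, 101), ("c4", 10587, 100)]
print(ambiguous_entries([nl_cell_a, nl_cell_b]))
```

Because the key includes the ARFCN, the same function covers the intra-frequency rule (unique PSC on one carrier) and the inter-frequency rule (unique ARFCN and PSC combination).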
6.5.
- Transition IrmPreemptionCacParams.numUsersDowngrade of the top ranked users connected to PS data services to a lower bit rate (e.g. from 384 kbit/s to 128 kbit/s)
- Transfer the rest of the PS data users to another state, e.g. from CELL_DCH to CELL_FACH or idle, depending upon the setting of RadioAccessService.congestionDowngradeReleaseTarget, if the initial data rate is 8k/8k
The lowering of the PS data rate is done using the RB Reconfiguration procedure (subsection 6.17.1). The state transfer is done by the RRC Connection Release procedure (transfer to idle mode, RAB is released) or by the RB Reconfiguration procedure (transfer to CELL_FACH, RAB is set to inactive); in both cases the PDP context is retained. The initiation of Congestion Control indicates high interference in the RF environment.

6.5.2. Failure symptoms, identification and fixes for improvement
Table 47 lists the identification techniques in traces in case of Congestion Control; the relevant PM KPIs are listed in Table 48:

Problem | Trace | Trigger
Congestion Control RRC PS data reduction DL | Uu, TCP/IP trace in or after CN | Cross-correlation of interface traces on Uu and TCP/IP in or after the CN side: Any occurrence when either the PS data rate is reduced or the UE is transferred from CELL_DCH to CELL_FACH / CELL_PCH / URA_PCH mode and at the same time there is still data in the RLC buffer of the RNC as measured in Wireshark

PM system | Counter / KPI
UtranCell | VS.DataRateAtt.Dec.CongDowngrade.DL
UtranCell | VS.DataRateAtt.Dec.CongDowngrade.UL
6.6.
always a RAB associated with the RRC connection, but the RAB is marked (inside the RNC) as inactive. When data is received from the CN side, the RLC buffer in the RNC belonging to the RAB queues the data and the RNC initiates a state transition of the UE to deliver the DL data. For TCP applications this is appropriate because TCP traffic always starts with the Slow Start procedure, but for UDP or RTP this might result in lost data frames. If the UE RLC buffer fills up rapidly, the UE might indicate this to the RNC by sending a Cell Update with cause Uplink data transmission on RACH. The ALU UTRAN then initiates a state transition to CELL_FACH by sending back a Cell Update Confirm. According to 2 the UE has to monitor the PICH and PCH, perform periodic URA/PCH updates and perform cell reselections while in the URA_PCH or CELL_PCH state. It might be that URA_PCH/CELL_PCH mode is not used. In that case, when the inactivity timer T1 elapses for a PS call, the RRC resources are released while the PDP context is maintained; the UE is sent to idle mode and the associated RAB is removed. The advantage of the URA_PCH/CELL_PCH mode compared to idle mode is that re-establishment is faster because the RAB and the RRC connection do not need to be re-established. The disadvantage is that the RNC still has to maintain some (very low) UTRAN resources. Figure 19 below shows the transition phases between the different UE states:
Figure 19: Transition phases between the different UE states
6.6.2. Failure symptoms, identification and fixes for improvement
Failures and dropped RRC connections when the UE is in URA_PCH or CELL_PCH mode might occur in the cell selection/reselection process (subsection 5.1.1) or due to periodic URA updates (subsection 5.3.1). For call admission (iRM CAC) failures see subsection 5.4.1. Failures due to PCH/AICH/PICH or the RACH procedure might lead to a drop of the RRC connection as described in subsection 5.1.2. In this case the RAB will be removed. If RNC traces show that a Cell Update was sent in good RF conditions (better than -10 dB) but the UE repeatedly did not respond to the Cell Update Confirm, this can be either a UE or an RNC bug. The bug probably causes the UE to ignore the Cell Update Confirm, so after sending the Cell Update five times the UE drops the call and establishes a new PS call. This results in the CN sending an Iu Release Command for the old call to the RNC. The following table shows the PM counters useful for monitoring the URA_PCH transitions and their success rate:
PM system | Counter / KPI
UtranCell | VS.UEStateTransAtt.UraPCH.CellDCH, VS.UEStateTransSucc.UraPCH.CellDCH
UtranCell | VS.UEStateTransAtt.UraPCH_CellDCH.DCH_HSDSCH, VS.UEStateTransFail.UraPCH_CellDCH.DCH_HSDSCH
Table 49: PM Counters for URA_PCH Transitions
Failures due to the RB Reconfiguration procedure are described in subsection 6.17.1.
6.7.
Figure 22: Parameters governing various UE states controlled through AO
The RNC may decide to release the RRC connection due to extended data inactivity, especially if URA_PCH is disabled. In this case the RNC sends an RRC Connection Release message on FACH and the UE sends back an RRC Connection Release Complete message on RACH before moving to idle mode. In parallel the RAB is released on Iu with cause user-inactivity.
6.7.2. Failure symptoms, identification and fixes for improvement
A drop of the RRC connection might occur if the UE leaves the RF coverage area. Upon selecting a cell again, the UE has to inform the UTRAN by sending a Cell Update message with cause Re-enter service area. This happens when the UE cannot find a suitable cell to camp on for at least 4 seconds. In the meantime the UTRAN might already have dropped the RRC connection if it had tried and failed to send PS data in the DL. It is recommended to make the SRB and TRB on FACH/RACH robust by setting long RLC timers, so as to avoid a call drop during a short disruption in data transfer. Since FACH is a slow channel without a dedicated link, it can also suffer from contention between different users. As a rule of thumb, the RLC UL/DL timeout should be >15 sec for TRB_FACH/RACH and, for SRB_FACH/RACH, greater than the timeout for DCCH_3.4k. The following failures might occur for UEs in CELL_FACH mode or during the transition from/to CELL_FACH mode:
- Failures related to the cell selection / reselection (subsection 5.1.1)
- Failures related to the Random Access Procedure (subsection 5.1.3)
- Failures related to the FACH (subsection 5.1.6)
- Failures related to the setup of the RL on NBAP (subsection 5.1.5)
- Failures related to the Radio Bearer Reconfiguration procedure on RRC (subsection 6.17.1)
Table 50 lists failures for UEs in CELL_FACH mode and how to identify them in traces:
Problem | Trace | Trigger
Dropped call in CELL_FACH | Uu | Any occurrence when the RRC connection dropped while the UE was in CELL_FACH state
Table 50: Failure identification in traces if the UE is in CELL_FACH mode
Many PM counters are available that count the number of attempts and failures for the state transitions; see 2 for details.
6.8.
Trace | Trigger
Iub | Missing STAT PDUs on AAL5 for more than 5 seconds
Iub | Any occurrence of an AuditRequiredInformation on NBAP
Iu | Missing STAT PDUs on AAL5 for more than 5 seconds
Iu | Any occurrence of a Reset on RANAP
Table 51: Identification of outages in network interface traces
The transport engineering guidelines 2 may also contain useful information about transport outages.
6.9.
triggered rather than periodically. All intra-frequency measurement reporting events (1a to 1j) are defined in 2. According to 2 the soft/softer HO procedure consists of the following steps:
- Cell search and measurements of cells in the active set and the monitored set
- Reporting of measurement results by the UE (RRC Measurement Report message including the specified event id)
- SHO decision
- Allocation/release/change of network resources on NBAP
- Execution of the HO (RRC Active Set Update message) by the RNC
- If necessary, execution of the RNS relocation procedure (subsection 6.17.2)
- Active Set Update Complete message on RRC from the UE (successful case)
- RNC updates the measurement parameters, including the cells belonging to the new monitored set and other measurement parameters, via the RRC Measurement Control message
The different steps are configurable using UTRAN RRM parameters. As an example, Figure 23 below visualises HO parameters such as the time to trigger (T) and the HO hysteresis for the Measurement Report events 1a, 1b and 1c:
Figure 23: HO parameter for event 1a, 1b and 1c
The call handling depends on the type of event; as an example, Figure 24 below shows a flowchart for an intra-RNC Active Set Update procedure for event 1a (the grey box contains the RL deletion in case of event 1c):
Figure 24: Call handling flowchart of Active Set Update event 1a (event 1c)
6.9.2. Failure symptoms, identification and fixes for improvement
There are many different reasons why the HO procedure might fail or not be executed in an optimal manner:
- Measurement problems of the cells in the active and monitored set. These failures are most likely due to RF planning issues like non-optimal neighbour definitions, pilot pollution, a weak PSC plan etc. (see subsection 6.4 for details)
- Misconfiguration of the UTRAN parameters that set up the filtering, timing and SHO algorithm
- Problems with the allocation of network resources on NBAP: Radio Link Setup procedure in case no RL exists to the particular (new) NodeB (subsection 5.1.5) and Radio Link Addition procedure in case there is already a RL to the NodeB
- Problems during the RNS relocation procedure, which are covered in subsection 6.17.2
- Failures during the release of network resources on NBAP (e.g. event 1c); these failures should occur very rarely (subsection 6.17.3)
- Measurement Control Failure message (e.g. the UTRAN instructs the UE to perform a measurement that is not supported by the UE)
- RRC Active Set Update Failure message from the UE in case of
  o Unsupported or invalid configuration
  o Incompatible simultaneous reconfiguration
  o Invalid Active Set Update message
  o UE in non CELL_DCH state to receive that message
  o Protocol error
  o Physical channel error
The filtering, timing and SHO algorithm are configurable by UTRAN parameters. Especially in dense urban environments these parameters have to be optimised, e.g. to react faster to the around-the-corner effect, or in areas with weak coverage (3G border areas) to trigger the 3G-2G HO quickly. Table 52 below summarises how to identify these issues in network interface traces. Note that a handover delay can be confused with missing RRC messages (check the event id of the Measurement Report against the removal/addition list of the ASU message). As a general point, LDAT3G allows the delay between two RRC messages to be quantified using the UDR Time difference option under the Report menu. Long handover delays can result in dropped calls and in a degradation of the overall UMTS RF conditions. The ALU RNC has blocking phases, which means that an ongoing procedure such as RB Reconfiguration may cause the SHO to be blocked. Enabling RadioAccessService.shoAfterBlockingPhaseEnable ensures that all received reports are queued for processing once blocking ends.
Problem | Trace | Trigger
Intra-Frequency Handover Delay | Uu | Any occurrence where the UE sends a Measurement Report 1x and the RNC does not reply with an Active Set Update message within y seconds
Active Set Update Failure | Uu | Any occurrence where the UE sends an Active Set Update Failure message
Long delay of Measurement Control message after Active Set Update Complete for event 1x | Uu | Any occurrence where the RNC does not send the Measurement Control message within y seconds after the UE has sent the Active Set Update Complete message and the event ID of the last Measurement Report was event 1x (see footnote 14)
Dropped call during event 1x | Uu | Any occurrence of a dropped call within y seconds after the RNC has sent the Active Set Update message and the event ID of the last Measurement Report was event 1x
 | Uu, 3G scanner | There is one (or more) intra-frequency cell measured by the 3G scanner that is not in the active set, its Ec/No is for x seconds better than y dB compared to the best cell in the active set, and the UE does not send a Measurement Report with id 1a or 1c within that time period
Ping-pong HO | Uu | Whenever a cell is added to the active set (event 1a), it is removed again within x seconds (event 1b or 1c), or vice versa
14 In case of e.g. periodic reporting, an update via the Measurement Control message is not required
Table 52: Identification handover issues in traces
PM KPIs related to the intra-frequency handover process are available in 2.
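Two of the Table 52 triggers, the handover delay and the ping-pong HO, lend themselves to a simple scan over a decoded Uu message log. The sketch below only illustrates the idea: the tuple layouts, the message names and the thresholds are assumptions, not the export format of LDAT3G or any other tool, and the values of x and y have to be chosen per network.

```python
def delayed_handovers(log, max_delay_s, events=("1a", "1b", "1c")):
    """Return timestamps of Measurement Reports (event 1x) that are not
    followed by an Active Set Update within max_delay_s seconds.

    log: time-ordered list of (timestamp_s, message_name, detail), where
    detail carries the event id for Measurement Reports (assumed layout).
    """
    late = []
    for i, (t, name, detail) in enumerate(log):
        if name != "MeasurementReport" or detail not in events:
            continue
        answered = any(
            other == "ActiveSetUpdate" and t2 - t <= max_delay_s
            for t2, other, _ in log[i + 1:]
        )
        if not answered:
            late.append(t)
    return late


def ping_pong(changes, window_s):
    """Return (psc, t_first, t_second) whenever a cell is added and then
    removed (or removed and then added) within window_s seconds.

    changes: time-ordered list of (timestamp_s, "add" | "remove", psc).
    """
    last = {}   # psc -> (timestamp, action) of the most recent change
    hits = []
    for t, action, psc in changes:
        prev = last.get(psc)
        if prev and prev[1] != action and t - prev[0] <= window_s:
            hits.append((psc, prev[0], t))
        last[psc] = (t, action)
    return hits


uu_log = [
    (0.0, "MeasurementReport", "1a"),
    (0.3, "ActiveSetUpdate", ""),
    (5.0, "MeasurementReport", "1b"),
    (9.0, "ActiveSetUpdate", ""),
]
print(delayed_handovers(uu_log, max_delay_s=2.0))
print(ping_pong([(1.0, "add", 120), (2.5, "remove", 120)], window_s=3.0))
```

As noted in the text, a reported delay should be cross-checked against the addition/removal list of the Active Set Update, since a missing RRC message in the log produces the same pattern.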
Figure 25: Flow chart of successful UMTS to GSM voice handover
6.10.2. Failure symptoms, identification and fixes for improvement (UMTS->GSM)
A UMTS to GSM handover failure may occur during one of the following phases:
- Relocation procedure failures (subsection 6.17.2 / phase 1 in the figure)
- Handover procedure failures in the GSM network (phase 2 in the figure)
- Release procedure failures (subsection 6.17.3 / phase 3 in the figure)
Upon successful completion of the relocation procedure, the SRNC sends the Handover From UTRAN Command, including the GSM Handover Command, to the UE. If the UE fails to complete the requested handover, the SRNC receives a Handover From UTRAN Command Failure message from the UE. According to 2 the failure causes specified within this message can be subdivided as follows:
- Physical channel failure
- Unacceptable configuration
- Protocol error
The first cause refers to the case where synchronisation between the UE and the 2G base station is lost. This is mainly caused by poor RF conditions, especially if the coverage of the co-located 2G site/neighbour is not as good as that of the 3G network, or if the BSIC of the 2G neighbour reported by the UE is ambiguous and the CN prepared a different 2G base station. This problem can also occur due to incorrect provisioning in the 2G network. The last two causes are expected to occur seldom and in general are not related to RF issues. The IRAT HO can be configured with the parameters described in 2. In case of a high failure rate during the IRAT handover procedure, it should be checked whether the HO has to be triggered earlier, under better 2G and 3G conditions. However, this may increase the proportion of CS calls going over to 2G, which may be against customer expectations. Table 53 below lists the identification triggers for IRAT HO problems in traces:
Problem | Trace | Trigger
Delayed IRAT HO after UE report | Uu | Any occurrence of a periodic Measurement Report sent by the UE without a Handover From UTRAN Command within x seconds
Handover From UTRAN Command Failure | Uu | Any occurrence of a Handover From UTRAN Command Failure message sent by the UE
RRC drop in compressed mode | Uu | Any occurrence of a drop of the RRC connection when the UE was in compressed mode
Counter / KPI
(IRATHO.SuccOutCS.<sum> / IRATHO.AttRelocPrepOutCS)
VS.AggrCellListAmbigCellInterRAT
Table 54: PM KPI for outgoing CS IRAT success rate
For counters dealing with the preparation phase during IRAT HO, refer to section 6.17.2.
6.10.3.
The IRAT handover from GSM to UMTS allows the operator to make use of the 3G coverage in case of GSM network overload, or simply to maximise the usage of the UMTS network. However, the HO is initiated by the GSM network and is hence not discussed any further. This HO is limited to CS calls; in case of a combined CS/PS call the UE is required to set up the PS part of the call upon successful completion of the CS handover. The following figure shows the HO execution signalling flow, which starts with the RNC receiving a Relocation Request from the 3G MSC and ends when the RNC sends back Relocation Complete after receiving the Handover to UTRAN Complete RRC message from the UE. From the UTRAN perspective, RadioAccessService.is2gto3gCSHandoverAllowedWithinRNC is used to ensure that the RNC accepts the incoming relocation procedure involving an SCCP connection initiated by the CN.
6.10.4. Failure symptoms, identification and fixes for improvement (CS GSM->UMTS)
Some of the main reasons why the GSM to UMTS handover procedure may fail are as follows:
- The GSM to UMTS handover feature is not enabled in the UTRAN target cell
- The UE does not support the target cell frequency band
- The requested radio resources cannot be established, e.g. radio link setup fails on Iub or the ALCAP Iu transport bearer cannot be established
- The RNC does not receive a HANDOVER TO UTRAN COMPLETE message from the UE, because the UE has received an invalid HANDOVER TO UTRAN COMMAND message or does not support the configuration included in the message. In this case the timer expires
- The MSC cancels the relocation by releasing the Iu connection
PM KPIs related to the IRAT handover process are detailed in 2; an example is shown below:
PM system | Counter / KPI
UtranCell | ((VS.IuRelocationRequests.Cs2Gto3GRelocation - VS.IuRelocationRequestFailuresCs.2Gto3G.<sum>) / VS.IuRelocationRequests.Cs2Gto3GRelocation)
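The incoming 2G to 3G KPI above can be evaluated from exported counter values as follows. This is a sketch under two assumptions: the KPI is read as (requests minus summed failures) divided by requests, and the failure cause names used in the example are hypothetical placeholders for the screenings that <sum> aggregates.

```python
def relocation_success_rate(requests, failures_by_cause):
    """Return (requests - sum of failures) / requests.

    requests: value of VS.IuRelocationRequests.Cs2Gto3GRelocation
    failures_by_cause: mapping cause -> count, summed as '<sum>'
    Returns None when no relocation request was counted, so that an
    idle cell is not reported as 0 % or 100 % success.
    """
    if requests <= 0:
        return None
    failures = sum(failures_by_cause.values())
    return (requests - failures) / requests


# Hypothetical hourly values for one UtranCell
rate = relocation_success_rate(200, {"cause_a": 6, "cause_b": 4})
print(f"2G->3G CS relocation success rate: {rate:.1%}")
```

Returning None for an empty period keeps low-traffic cells from distorting a ranking of worst cells.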
Figure 27: Flow chart of successful UMTS to GSM PS handover
The SRNS context transfer procedure is not fully supported by the ALU source RNC (i.e. the messaging is supported but the PDU counters are not transferred). Furthermore, data forwarding is not supported by the ALU SRNC. Therefore, some packets will be lost during the handover. End-to-end reliability is expected to be provided by the end-to-end transport layer (e.g. TCP).
6.11.2. Failure symptoms, identification and fixes for improvement
If the UE cannot successfully complete the procedure and T309 expires, the UE will
- re-establish the UTRAN physical channel(s) used at the time of reception of the Cell Change Order from UTRAN, transmit the Cell Change Order from UTRAN Failure message and set the IE "Inter-RAT change failure" to "physical channel failure", OR
- when not successful in re-establishing the UTRA channels, perform a cell update procedure with cause "Radio link failure"
Table 56 below lists the parameter for the cell change order from UTRAN procedure:
Parameter | Description
UeTimerCstConnectedMode.t309 | Timer starts upon reception of the CELL CHANGE ORDER FROM UTRAN message and stops when a successful establishment is made in the new 2G cell.
Table 56: Parameter used for configuring the cell change order from UTRAN
Table 57 below lists how failures of the cell change order from UTRAN procedure can be identified in interface traces:
Problem | Trace | Trigger
Cell Change Order from UTRAN I | Uu | Any occurrence of the RRC message CellChangeOrderFromUTRANFailure
Cell Change Order from UTRAN II | Uu | Any occurrence of the RRC message CellChangeOrderFromUTRAN followed within x seconds by a cell update message with cause "Radio link failure"
Table 57: Identification of cell change order from UTRAN failures in traces
PM KPIs related to the process are available in 2; an example is given below.
PM system | Counter / KPI
UtranCell | (IRATHO.SuccOutPSUTRAN.<sum> / VS.RrcCellChgOrderUtranCmd.<sum>)
- Detection of the need for inter-frequency HO
- HO algorithm selection and measurement report setup
- Measurement event report reception and HO execution
- If necessary, execution of the RNS relocation procedure (subsection 6.17.2)
The DAHO (or blind) algorithm is only used when handing over from a micro to a macro site. Otherwise MAHO is recommended for most scenarios. Irrespective of the reason for initiation, the call flow follows a slightly different sequence depending on whether the HO is inter- or intra-NodeB and inter- or intra-RNC. Furthermore, the RB Reconfiguration message is used to perform this HHO.
6.12.2. Failure symptoms, identification and fixes for improvement
The reasons for inter-frequency HO failures are similar to the ones that may be encountered during intra-frequency or IRAT HO, as the constituent procedures are the same. However, some salient failure mechanisms are:
- The target NodeB is unable to allocate the requested resources. It then returns an NBAP Radio Link Addition Failure or Radio Link Setup Failure message to the SRNC (section 5.1.5).
- The UE may not be able to apply the new configuration and returns a Radio Bearer Reconfiguration Failure. The newly allocated resources on the target cell are released by the RNC by means of the NBAP Radio Link Deletion procedure. The call continues on the current configuration.
- The Inter-Frequency Handover Timer Imcta.measurementGuardTimerFdd expires and no neighbouring cell is found to be a suitable candidate for HHO (no inter-frequency cell has been reported or the reported cells are not found to be eligible). Compressed Mode is then reactivated if the measurements were originally triggered by Event 2d (i.e. the trigger was Alarm) and no Event 2f for the same quantity (i.e. Ec/No or RSCP) has been received. However, if the trigger was CAC or Service, the UE remains on the original frequency without invoking CM again.
The user plane interruption is likely to be longer for the UL, as DL data is sent on both the old and the new RL while UL data is only sent on the old RL until either it fails or the new RL is restored. Table 59 shows some failures that can be identified using network traces:
Problem | Trace | Trigger
Inter Frequency HO Delay | Uu | Any occurrence where the UE sends an inter-frequency Measurement Report and the RNC does not reply with an RB Reconfiguration message immediately
 | Uu and CT | The RNC sends an RB Reconfiguration message but the UE does not respond with either a complete or a failure message within the RNC static timer T361. This is followed by the RNC initiating the Iu release procedure with cause Unspecified.
Table 59: Identification of inter Freq HO failures from traces UTRAN Some important KPIs/Counters pegged during this process are given below:
PM system | Counter / KPI
UtranCell | (HHO.SuccOutInterFreq / HHO.AttOutInterFreq)
UtranCell | (VS.IncomInterFreqHoSuc.<trigger> / VS.IncomInterFreqHoAtt.<trigger>) (see footnote 15)
15 Triggers include Rescue (Alarm due to 2d/2f), Service (due to Red cell colour) and NoRsrcAvailCacFailure (no resources available)
- TM data transfer
  o No protocol overhead added; transparent to the RLC
  o Used for signalling SRB (e.g. broadcast SRB on BCCH, paging SRBs on PCH), voice services and CS data
- UM data transfer
  o Buffer control of RLC SDUs for smoothing data rate variations introduced by burst-traffic sources (e.g. TCP flow control) and lower layer variations
  o Segmentation, concatenation and padding into RLC PDUs. Each PDU is transferred as one physical layer TB.
  o Reassembly of PHY data from TB into RLC PDUs and RLC SDUs
  o Used for fast signalling (e.g. SRB1 on DCCH)
- AM data transfer
  o UM data transfer features plus
  o Error control feedback, retransmission of erroneous or lost PDUs and in-sequence delivery of RLC PDUs by ARQ
  o Used for signalling (SRB 2-4) and PS data services
There is one pair of AM RLC entities per RB. In the following, TM is not considered any further because it has no performance impact due to RLC. Figure 28 below shows the UMTS user plane protocol stack for a TCP/IP data application:
Figure 28: UMTS protocol stack of the user plane for a TCP/IP application
TCP has its own flow control and ARQ algorithms, so the OAM parameters of the RLC have to be adapted to interwork with TCP in an optimal way. Because the TCP settings may differ on each client PC (and on the corresponding server in the Internet or corporate business network), a reference client-server system should be defined and used to optimise the RLC settings.
An RLC PDU for a PS RB has a size of 42 bytes (see footnote 16), i.e. 40 bytes payload and 2 bytes header, which is relatively small compared to a TCP/IP packet size of around 1000 bytes (see footnote 17). As a consequence, a retransmission on RLC resends a relatively small amount of data compared to a retransmission on the TCP/IP layer. Furthermore, if a data PDU is not completely filled with data of one SDU, concatenation and/or padding is applied. For each TB set the PHY performs a CRC check; in the UL the NodeB adds the CRCI to each TB set (see also subsection 7.1.2.1). Furthermore, the physical frames on Iub are protected by additional CRCs. If either CRC fails, the lower layer discards the whole frame on Iub or the whole TB set. It is up to the RLC to react to lost data and possibly initiate a retransmission.
RLC ARQ mechanism
For identification each PDU has (for DL and UL and per RLC entity separately) an increasing SN (0, ..., 4095 for AM; 0, ..., 127 for UM). At the TX the data PDUs are stored in a retransmission buffer when they are submitted to the MAC and PHY layer, so that a NACKed data PDU can be retransmitted quickly. ARQ uses the following mechanisms:
- Status reporting on the RX: the RX sends a status report in so-called STATUS PDUs containing a detailed list of received and missing PDUs. STATUS PDUs have priority over retransmitted data. They can be sent periodically or unsolicited, e.g. after loss detection
- Polling from TX: the TX can request a status report by setting a poll bit in the RLC PDU header, forcing the acknowledgement of previous PDUs by the RX
- Window mechanism: a sliding window allows the TX to transmit new PDUs while waiting for the ACKs, up to the end of the window size
- SDU discard function: when the delivery of an SDU cannot be managed, e.g. because of repeated errors, the transmission of the SDU is stopped and the SDU is discarded on both the TX and RX side
Data PDUs carrying poll requests and status or other control PDUs require a special ACK and are protected by timers:
- When timer-protected PDUs are not acknowledged before the timer elapses, these PDUs are retransmitted
- If timer-protected PDUs are retransmitted and still no ACK is received
  o If data PDU retransmission did not succeed, go either to SDU discard or to RLC reset of the RLC connection between the two entities
  o If SDU discard does not succeed, go to RLC reset of the RLC connection between the two entities
  o If RLC reset does not succeed, signal an unrecoverable error to higher layers. In this case the RRC connection might be dropped and the UE performs a Cell Update with the IE AM_RLC error indication set to TRUE (subsection 6.3.1)
16 Size of signalling SRB is 16 bytes plus 2 bytes header
17 The size of the TCP/IP packet depends on the MSS negotiated for each TCP session during connection setup. In addition, the IP packet might be further segmented by an Internet server
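The size relation between RLC PDUs and TCP/IP packets quoted above can be made concrete with a short calculation. The sketch below uses the 40-byte payload and the AM sequence number range from the text; it deliberately ignores length indicators and concatenation, so the PDU count is an approximation, and the function names are chosen for illustration only.

```python
PDU_PAYLOAD_BYTES = 40   # payload per PS RB PDU, as quoted in the text
PDU_HEADER_BYTES = 2     # header per PDU, as quoted in the text
AM_SN_MODULUS = 4096     # AM sequence numbers run from 0 to 4095


def pdus_per_sdu(sdu_bytes, payload=PDU_PAYLOAD_BYTES):
    """Number of PDUs needed to carry one SDU (ceiling division)."""
    return (sdu_bytes + payload - 1) // payload


def next_sn(sn, count=1, modulus=AM_SN_MODULUS):
    """Sequence number reached after sending `count` further PDUs."""
    return (sn + count) % modulus


n = pdus_per_sdu(1000)                       # one ~1000 byte TCP/IP packet
overhead = PDU_HEADER_BYTES / (PDU_PAYLOAD_BYTES + PDU_HEADER_BYTES)
print(n, f"{overhead:.1%}", next_sn(4090, n))
```

A packet of about 1000 bytes therefore spans 25 PDUs, so a single lost PDU costs one 42-byte RLC retransmission instead of a full TCP segment, and the sequence number wraps around after 4096 PDUs.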
Parameters configuring the RLC are available in 2, along with features that can improve RLC performance. Problems on the RLC might be due to:
- RF related issues like pilot pollution or incorrect neighbour definitions
- Lower layer problems on the Iub: ATM cell discarding occurs, causing packets to be lost
- Forced decrease of the data rate due to congestion control, resulting in SDU discarding in the RNC
- UE or RNC software bugs where the behaviour does not follow the 3GPP standard
6.14.2. Failure symptoms, identification and fixes for improvement
Retransmission on the RLC layer can easily be identified by a not-in-sequence delivery of RLC PDUs on Iub; this information is normally not available in Uu traces. The RX acknowledges in its status reports all PDUs with a SN < LSN. However, CTg or CTb with RLC information can also be used in this investigation, especially if explicit Iub tracing is not possible or allowed. For better identification on Iub, the particular call has to be extracted so that its RLC PDUs are not mixed up with those of other calls. In addition, special ASCII files downloaded via FTP can be used to easily identify retransmissions (only possible when PPP and PDCP compression techniques as well as ciphering are disabled, see also subsection 7.2.3). These limitations do not apply to the CT tracing feature, as the RNC records each call separately. Another (but quite complicated) possibility is the analysis of the BITMAP in the status reports of the RX. The BITMAP gives the TX an indication of which PDUs have been successfully received and which have not, starting from the First SN (number of octets determined by LENGTH) 2. A dropped (CS or PS) call due to an RLC error can easily be identified by a Cell Update message in the UE log with cell update cause RLC unrecoverable error. Note that after an RLC error on SRB2-4 (represented by the cell update IE AM_RLC error indication = True for RB2-4) a CS voice call cannot be reconnected; this always results in a dropped call as per 3GPP 2. The SDU discard function removes the RLC PDU from the buffer on the transmitter side when the transmission of the RLC PDU does not succeed for a long time. Hence it avoids buffer overflow. There are several alternative operation modes of the RLC SDU discard function; which one to use is given by the QoS requirements of the Radio Access Bearer. Table 61 lists problems that can be detected in interface traces and Table 62 the corresponding KPIs in the PM system:
Problem | Trace | Trigger
RLC Resets | Iub | Any occurrence of RLC Resets in Iub traces
RLC retransmission | Iub | Any occurrence of retransmission of RLC PDUs per RLC session
SDU discard with explicit signalling | Iub | Any occurrence of a Move Receiving Window (MRW) command indicating a SDU discard and/or a MRW-ACK
Dropped call due to RLC error | Uu | Any occurrence of a RRC Cell Update message with specified cell update cause (not failure cause) RLC unrecoverable error. The IE AM_RLC error might be set to True depending on whether the error occurred on SRB/RB2-4 or on TRB/RB5 and above.
PM system | Counter / KPI
UtranCell | VS.NbrCellUpdates.RlcUnrecoverableError
UtranCell |
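The not-in-sequence criterion for RLC retransmissions can be applied mechanically once the AM sequence numbers of one call, one RLC entity and one direction have been extracted from the Iub or CT trace. The following sketch assumes such a pre-filtered list of SNs in capture order; the function name and the half-window rule for handling the wrap-around at 4096 are illustrative choices, not part of any trace tool.

```python
AM_SN_MODULUS = 4096  # AM sequence numbers run from 0 to 4095


def find_retransmissions(sns, modulus=AM_SN_MODULUS):
    """Return the indices of PDUs that arrive out of sequence.

    A PDU counts as a retransmission when its SN repeats or lies behind
    the highest SN seen so far. 'Behind' is evaluated modulo the SN
    range so that the wrap from 4095 to 0 is not reported.
    """
    if not sns:
        return []
    half = modulus // 2
    highest = sns[0]
    hits = []
    for i, sn in enumerate(sns[1:], start=1):
        ahead = (sn - highest) % modulus
        if ahead == 0 or ahead >= half:
            hits.append(i)        # repeated or older SN
        else:
            highest = sn          # normal in-sequence progress
    return hits


captured = [4093, 4094, 4095, 0, 1, 4094, 2, 3, 1]
print(find_retransmissions(captured))
```

The ratio of flagged PDUs to all PDUs of the session gives a rough retransmission rate that can be compared between cells or before and after a parameter change.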
Figure 29 below visualises the changes in the UMTS protocol stack required to support HSDPA:
[Figure: protocol stack diagram covering UE, Uu, NodeB, Iub, RNC, Iu-PS, SGSN, Gn and GGSN; legend: Control Plane, User Plane, Transport Plane, Common]
Figure 29: HSDPA protocol stack enhancements
The following subsections describe different aspects of HSDPA data calls.
6.15.2. Mobility aspects of HSDPA
6.15.2.1. Concept
- For the UL the mobility procedures are largely the same as for PS calls over DCH (e.g. soft/softer HO triggered via events 1a, 1b and 1c)
- For the DL the HS-DSCH for a given UE belongs to only one of the radio links, on one sector of the NodeB where the UL is connected. As a consequence only Hard Handovers (Cell Changes) are triggered, based on the reception of Event 1d.
The RNC forwards the DL application data to the NodeB, from the MAC layer to the new MAC-hs layer that schedules the data for delivery. In case of a Hard Handover the NodeB discards data that has not been transmitted yet. In this case it is up to the higher layer protocols (RLC or TCP) to retransmit the lost data. As a consequence, too many serving HS-DSCH Cell Changes within a short period of time (ping-pong handovers) may reduce the throughput. A typical scenario might look as follows:
- UE is connected to NodeB A; NodeB B is becoming stronger and stronger
- UE sends Measurement Report with Event 1a
- RNC adds NodeB B to the active set via the Active Set Update procedure
- UE sends Measurement Report with Event 1d
- RNC triggers Hard Handover via the Radio Bearer Reconfiguration procedure
- UE sends Measurement Report with Event 1b to remove NodeB A from the active set
The HSDPA cell change should not be performed too late, when the UE has already moved far into the area of another cell where it could have better throughput. Equally, HSDPA Hard Handovers should not be executed too early, so that the UE immediately changes back to the previous cell if the radio conditions vary (ping-pong effect).
If the new primary cell does not support HSDPA or suffers from CAC failure, the HS-DSCH RB(s) is (are) reconfigured to DCH (if iRM CAC is successful). In case of CAC failure for the DCH, the PS RAB(s) is (are) released. For parameters configuring HSDPA see 2. As for DCH calls, iMCTA CAC can also trigger (if DCH fallback is not enabled or fails) a transfer of the call to another inter-frequency 3G or inter-system 2G cell.
6.15.2.2. Failure symptoms, identification and fixes for improvement
HSDPA performance degradations due to mobility issues are best observed by analysing drive test data. It is very important to trigger the handover at the right time: if it is too late, the UE may be served by a NodeB that is much worse than the best cell in the active set; if it is too early, frequent handovers can hurt throughput as well. Furthermore, non-optimal handover settings might cause unnecessary transitions from HS-DSCH to DCH if there are too many HSDPA users; as a result the benefits of HSDPA will not be available to an HS-capable UE. Finally, during the Hard Handover there might be major transmission gaps including TCP retransmissions. The reason might be synchronisation problems or non-optimal timing during the handover procedure, e.g. the time at which the RNC stops forwarding data towards the old NodeB. This problem can easily be detected by correlating RRC with TCP/IP data. Figure 30 below shows an example cross-correlated by Actix 2; the upper left part of the picture shows the RRC protocol, and the lower left part shows the TCP SQN recorded at the client side by Wireshark (previously Ethereal):
Figure 30: Hard handover problems identified by cross-correlated RRC and TCP data

Table 63 below lists the identification techniques for HSDPA mobility problems:

Problem | Trace | Trigger
HSDPA ping-pong | Uu | There are two consecutive Radio Bearer Reconfiguration procedures within x seconds
Transmission gap during HO in HSDPA call | Uu, TCP | Cross-correlation of Uu and TCP traces: during a Radio Bearer Reconfiguration procedure there is a transmission gap on the TCP layer in the DL for x seconds
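
The ping-pong trigger in the table above (two consecutive Radio Bearer Reconfiguration procedures within x seconds) could be checked on an exported Uu trace roughly as follows. This is a minimal sketch: the `RrcEvent` record, the message label and the function name are assumptions made for illustration and do not correspond to any particular trace tool export format.

```python
from dataclasses import dataclass


@dataclass
class RrcEvent:
    timestamp_s: float  # time of the message in the Uu trace
    message: str        # RRC message name as exported by the trace tool (assumed)


# Assumed label for the RRC message of interest; adapt to the real export.
RB_RECONFIG = "RadioBearerReconfiguration"


def find_ping_pong(events, window_s):
    """Return (t1, t2) pairs of consecutive RB Reconfigurations at most window_s apart."""
    times = sorted(e.timestamp_s for e in events if e.message == RB_RECONFIG)
    return [(a, b) for a, b in zip(times, times[1:]) if b - a <= window_s]


if __name__ == "__main__":
    trace = [
        RrcEvent(0.0, RB_RECONFIG),
        RrcEvent(1.0, "MeasurementReport"),
        RrcEvent(3.0, RB_RECONFIG),
        RrcEvent(20.0, RB_RECONFIG),
    ]
    print(find_ping_pong(trace, window_s=5.0))
```

The window length x is left as a parameter because the guideline does not fix a value; it would normally be tuned per network.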
6.15.3. RF related issues
RF related issues on the air interface are one of the main reasons for throughput degradation of HSDPA calls. The optimisation has to be done on a per-cell basis using UE drive test data. The following subsections summarise the most important measures. Because there is no gain from soft/softer HO in the downlink, a UE in HSDPA mode is more sensitive to pilot pollution (subsection 6.4.2).
6.15.3.1. RF related issues - CQI
The DL quality of the HSDPA shared channel is reflected by the channel quality indicator (CQI) that the UE sends back to the NodeB on the UL HS-DPCCH. The CQI ranges from 0 to 30, with greater values indicating better quality. It is based on instantaneous measurements of the RF conditions. Based on the reported CQI values, the NodeB decides which Transport Format Resource Combination
(TFRC) can be transmitted given a certain transmit power and an expected error rate, which directly impacts the expected throughput. 3GPP 2 defines the meaning of the reported CQI values for each UE category. In 2 requirements for the accuracy of the channel quality measurements are given. For the purpose of CQI reporting the UE shall assume a total received HS-PDSCH power of $P_{HSPDSCH} = P_{CPICH} + \Gamma + \Delta$ (in dB), where the total received power is evenly distributed among the HS-PDSCH codes that correspond to the reported CQI. The measurement power offset $\Gamma$ is signalled by the RNC and the reference power adjustment $\Delta$ is given for each UE category in 2. $P_{CPICH}$ is the transmit power of the Primary CPICH. It should be noted that the 3GPP specification does not demand that $P_{CPICH} + \Gamma$ is equal to the total available HSDPA power. The aim of analysing CQI is to understand, in the event of throughput degradation, whether the scheduler is utilising the air interface efficiently, by cross-correlating CQI with the codes, modulation and TB used to achieve a target HARQ BLER. Such information is readily available in UE logs. Figure 31 below shows the distribution of throughput versus CQI; the test was stationary, the cell was unloaded and the application was an FTP download via TCP/IP:
[Figure: HSDPA throughput (y-axis) versus CQI (x-axis)]
Figure 31: HSDPA - throughput versus CQI for TCP download
Note: when the CQI exceeds 15 no obvious throughput improvement is observed any more, because the UE capability of 12 is in this case limiting the maximum TBS (see also subsection 6.15.4).
6.15.3.2. RF related issues - Ec/No
For the same test case as described in the previous subsection, the HSDPA throughput versus Ec/No was analysed. Again a strong correlation between both measures was recorded, as visualised in Figure 32:
[Figure: HSDPA throughput (y-axis, 0 to 1800) versus Ec/No [dB] (x-axis, -20 to -4)]
Figure 32: HSDPA - throughput versus Ec/No for TCP download
To be noted: excluding single measurement samples, the Ec/No never exceeds around -6 dB because the No term includes the HSDPA traffic of the user. Furthermore, for Ec/No values exceeding around -8 dB no throughput improvement could be observed, indicating UE limitations.
6.15.3.3. RF related issues - other optimisation problems
For any other optimisation problems such as neighbour list planning, access parameters or power control settings, please refer to the corresponding subsections of this guideline.
6.15.4. UE limitations
HSDPA capable terminals have peak data rates ranging from 1.2 Mbit/s to 14.4 Mbit/s at the physical layer, see also 2 and 2. Depending on the terminal type, different maximum numbers of HS-DSCH codes, different maximum TBS and different modulation schemes are supported. As a consequence the maximum achievable throughput is terminal dependent and should be taken into consideration when analysing HSDPA UE traces, especially in good RF conditions.
6.15.5. Capacity issues
Because the HS-DSCH is a shared channel the throughput of one UE highly depends on the overall HSDPA traffic in the particular NodeB. Two cases can be differentiated:
6.15.5.1. Capacity issues - sharing of the bandwidth
When sharing the HSDPA bandwidth with other users the application throughput will not be optimal because:
- The bandwidth provided by the HS-DSCH is limited
- The bandwidth on the backhaul transport network is limited
Such capacity issues can be detected:
- Indirectly, by executing UE performance tests during the busy hour and comparing them with the non-busy hour; another good test method might be static automatic tests over a day
- By evaluating PM counter statistics to examine how frequently UEs of a certain category are scheduled and how often multiple users are scheduled per TTI
- By evaluating Iub traces with regard to the HS-DSCH FP flow control and congestion management, especially if the number of E1/T1 links is low
6.15.5.2. Capacity issues - HSDPA call cannot be established on a particular NodeB
Failed establishment of an HSDPA call on a NodeB can be due to the following limits and is easily identifiable in the CT as HsdpaCacFailure.
Hard limits: During call setup, HS-DSCH serving cell change via hard handover and transition from URA_PCH/CELL_FACH to CELL_DCH with HSDPA, the number of active HSDPA users is checked on a cell level against the parameter hsdpaCellClass.maximumNumberOfUsers. HSDPA hardware and processing resources are limited in the NodeB; for more details see 2. For ALU UA6.0x the UCU-III/xCEM hardware limitation due to UL (and default parameter setting) is 32, although more than 64 users have been shown to be supported per sector-carrier under lab conditions and with very low UL data rates of 8 kbps.
Soft limits: Each time a UE tries to establish an HSDPA call on a new NodeB via a RadioBearerReconfiguration procedure, iRM CAC also checks the soft limitations for the associated DCH. For ALU UTRAN the corresponding parameters and algorithm configuring iRM CAC are explained in 2. In case Fair-sharing (33694) is turned on, the CAC will depend upon the resources reserved for the existing HSDPA users and can trigger a failure even with fewer users per cell than stated above. So this depends upon the QoS requirements of the existing users and can change with the dynamics of the traffic mix.
HSDPA related PM counters are available in 2.
Figure 33: HSUPA changes done to the Protocol Stack
The following subsections describe different aspects of an HSUPA data call.
6.16.2. Mobility aspects of HSUPA
6.16.2.1. Concept
The mobility aspect of an HSUPA user is as follows:
- In general the mobility procedures are the same as for PS calls over DCH (e.g. soft/softer HO triggered via events 1a, 1b and 1c). However, for networks supporting E-DCH macro-diversity, event 1j is also configured. This event is used to keep the DCH active set and the E-DCH active set consistent when certain active set updates take place 2.
- However, one of the radio links acts as the serving cell, which is selected to be the same as for HSDPA in the DL.
- In HSUPA the serving cell is responsible for issuing absolute serving grants (AG) for the UE to send data. As such, this cell change only involves changing the physical channels E-AGCH/E-RGCH to accommodate the new role of the cell.
- The support of soft/softer HO means that the possibility of performance degradation is much lower than for HSDPA.
UA6.0 only supports HSUPA over the Iur boundary if feature 30744 is enabled on both SRNC and DRNC and both are RNC9370. So in markets where this scenario is not applicable (like the USA), if the primary radio leg goes over to the drift RNC, the HSDPA/E-DCH call will be reconfigured to the HSDPA/DCH state with a maximum data rate controlled by DchRateCapping.maxUlRateHsdpaAndEdchToHsdpaAndDch. A timer supervises the reconfiguration back to the HSDPA/E-DCH state (only possible on SRNS relocation or when all radio legs hand over back to the SRNC); an optimum value should avoid ping-ponging between DCH and E-DCH states in case the call stays around the Iur boundary. However, reconfiguration to DCH can also occur if cells are involved which do not support E-DCH, if cells are fully loaded with the maximum allowed number of E-DCH users, or if the UTRAN wants to activate compressed mode on the UE.
6.16.2.2.
Depending upon the initial E-DCH throughput, the new DCH bearer throughput will be lower at application level. If only some of the radio legs go back to the SRNC, there is a possibility that the bearer will never be reconfigured back to E-DCH. However, such a situation will only occur if the user moves only along the Iur boundary.

Problem | Trace | Trigger
HSUPA ping-pong along Iur | Uu | There are consecutive Radio Bearer Reconfiguration procedures within x seconds, i.e. frequent state changes between E-DCH and DCH
Reduction in throughput during HO along Iur | Uu | There is no subsequent Radio Bearer Reconfiguration procedure observed after the initial procedure that configured the UL to DCH
Table 64: HSUPA HO related issues involving Iur

Some relevant KPIs/counters that deal with the handover and call reliability aspects of HSUPA are given below:

PM system | Counter/KPI
UtranCell | VS.EdchCellDeletion.RadioLinkFail
UtranCell | VS.RAB.Drop.PS.CellDCH.EDCH_HSDSCH
UtranCell | VS.RAB.Drop.CN.Init.PS.CellDCH.EDCH_HSDSCH
6.16.1.
The scheduling mechanism for E-DCH involves UEs sending scheduling requests, which are assigned resources by the MAC-e entity upon evaluation of a set of criteria. The scheduling grant takes the form of an absolute grant (giving the maximum uplink power that can be transmitted) or a relative grant (stipulating change/no-change in power with respect to the previous TTI). However, in case of overload (on Uu or Iub) the scheduler will not honour the request and would most likely start downgrading the served and non-served UEs through absolute and relative grants respectively. Hence it is important to ensure that the UL target load and the Iub links are set up correctly to give the desired cell throughput. The scheduler is also responsible for the hybrid ARQ, which ensures error-free delivery and avoids retransmissions at higher layers, reducing delay. Furthermore, the UL E-DPCCH contains a happy bit that shows whether the UE is satisfied with the current grant. This can act as an indicator of how fairly each UE is being scheduled.
Under bad RF conditions the UE is likely to be transmitting at high power to reach the NodeB and hence will not have sufficient power available to send the data resulting in loss of throughput.
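
The effect can be illustrated with a simple power headroom calculation: the headroom is taken here as the difference, in dB, between the UE maximum output power and the power already spent on the DPCCH. The sketch below assumes a class 3 UE (+24 dBm) as the default; the function names are hypothetical and the numbers are examples only, not measured values.

```python
def power_headroom_db(dpcch_power_dbm, max_power_dbm=24.0):
    """Headroom in dB left above the DPCCH power (default: class 3 UE, +24 dBm)."""
    return max_power_dbm - dpcch_power_dbm


def db_to_ratio(value_db):
    """Convert a dB value into a linear power ratio."""
    return 10 ** (value_db / 10.0)


if __name__ == "__main__":
    # Good RF: DPCCH at 0 dBm leaves 24 dB of headroom for data.
    # Bad RF: DPCCH at 20 dBm leaves only 4 dB.
    for dpcch_dbm in (0.0, 20.0):
        headroom = power_headroom_db(dpcch_dbm)
        print(dpcch_dbm, headroom, round(db_to_ratio(headroom), 1))
```

The smaller the headroom, the less power remains for the E-DPDCH and the lower the transport block size the UE can select.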
6.16.2. UE Limitations
HSUPA capable terminals have peak data rates ranging from 0.7 Mbit/s to 5.7 Mbit/s at physical layer, see also 2 and 2. Depending on the terminal type, various options for maximum number of UL codes, minimum SF and TTI durations are supported. As a consequence the maximum achievable throughput is terminal dependent and should be taken into consideration when analysing HSUPA UE traces.
6.16.3. Capacity issues
Because the E-DPDCH is a shared channel the throughput of one UE highly depends on the overall HSUPA traffic in the particular NodeB. Two cases can be differentiated:
6.16.3.1. Capacity issues - sharing of the bandwidth
When sharing the HSUPA bandwidth with other users the application throughput will not be optimal because:
- The bandwidth provided by the E-DPDCH is limited, see Figure 34
- The bandwidth on the backhaul transport network is limited
These kinds of capacity issues can be detected in a similar way to what has been described for HSDPA.
Figure 34: User versus Cell throughput variation with increase in users
6.16.3.2. Capacity issues - HSUPA call cannot be established on a particular NodeB
During call setup, E-DCH serving cell change and transition from URA_PCH/CELL_FACH/CELL_DCH to CELL_DCH with E-DCH, the number of active HSUPA users is checked on a cell level against the parameters BTSEquipment.edchMaxNumberUserEbbu and BTSEquipment.edchMaxNumberUserNodeB. HSUPA capacity is also heavily dependent on hardware and processing resources, which are limited in the NodeB; for more details see [16]. Some relevant KPIs/counters that deal with NodeB capacity aspects of HSUPA are given below. A full set of HSUPA related PM counters is available in [28].

PM system | Counter/KPI
UtranCell | VS.EdchIubTnlCongestIndc.Reserved
UtranCell | VS.EdchIubTnlCongestIndc.DelayBuildUp
UtranCell | VS.EdchIubTnlCongestIndc.FrameLoss
UtranCell | VS.EdchIubTnlCongestIndc.NoCongestion
UtranCell | VS.EdchIuRelAbnormal.CACReject
NodeB | VS.eDCHBLReductionFactor.IuBFactor
In case RNC requests the UE to change the DL R99 data rate due to high DL TX code power reported by the NodeB (iRM scheduling)
In case of a change of the R99 UL/DL data rate or of the HSDPA best cell, first a synchronised Radio Link Reconfiguration is executed on NBAP, followed by changes of the ATM resources on the Iub via ALCAP procedures. The RNC then sends an RB Reconfiguration message on RRC; in case of a failure the UE sends back an RB Reconfiguration Failure message.
6.17.1.2. Failure symptoms, identification and fixes for improvement
One reason for a failure in this procedure is that the UE does not support the requested new configuration. Failure can also occur due to an un-optimised activation CFN in case of SRLR, and due to physical channel failure while performing inter-frequency handover. Also, during DCH to/from FACH transitions the RNC is not able to listen to both FACH and DCH channels, which makes it vulnerable, especially if the UE experiences a failure and remains in one state while the RNC changes the state, causing either RB reconfiguration timeout or RLC disruption depending upon which is smaller. Table 67 and Table 68 list the identification of RB Reconfiguration failures in traces and in the PM system:

Problem | Trace | Trigger
RB Reconfiguration failure | Uu or CT | Any occurrence of the RRC message RB Reconfiguration Failure
Counter / KPI
(VS.RRC.RBReconfigSucc / VS.RRC.RBReconfigAtt)
VS.RadioBearerReconfigurationUnsuccess.Timeout
VS.RadioBearerReconfigFailure.<cause> (UtranCell, see footnote 18)

Footnote 18: Causes include no DL code resources, no DL power resources, unspecified, CAC RNC processing resources, lack of bandwidth on Iu, Iur and Iub

6.17.2. Relocation failures
6.17.2.1. Concept
- IRAT-HO (subsection 6.10 for details)
- Inter-RNC HO (SRNS relocation (UE not involved))
- In case of a Cell Update on a new RNC
The procedure is described in 2. The SRNC sends a Relocation Required message on RANAP. The CN sends back the Relocation Command message (successful case) or Relocation Preparation Failure (unsuccessful case).
6.17.2.2. Failure symptoms, identification and fixes for improvement
Failures of the relocation procedure occur most likely during the IRAT-HO, which is described here. A failure is detected during the RANAP Relocation Preparation procedure due to the following causes:
- Timer TRELOCprep (5 sec) expiry at the SRNC
- Relocation Preparation Failure
In the first case the SRNC initiates the Relocation Cancel procedure on the Iu interface. This procedure enables the CN to initiate the release of the resources allocated in the GSM network during the Relocation Preparation procedure. The SRNC considers the UMTS to GSM handover as not possible at this point in time and keeps the existing radio connections established. This means that the existing Iu signalling connection can still be used for the call. In the second case, upon receiving a Relocation Preparation Failure message from the 3G CN, the SRNC still maintains the call. If the failure cause specified within the message is "Relocation Failure in Target CN/RNC or Target System" or "Relocation not supported in Target RNC or Target System", the SRNC repeats the Relocation Preparation procedure with the next suitable cell from the list of potential GSM target cells; otherwise the SRNC considers the UMTS to GSM handover as not possible at this point in time. Table 69 lists methods of identifying relocation problems in the call trace:

Problem | Trace | Trigger
Relocation Preparation Failure | CT | Any occurrence of the RANAP message Relocation Preparation Failure
Relocation Cancel | CT | Any occurrence of the RANAP message Relocation Cancel
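
The SRNC behaviour described above for the two failure cases can be summarised as a small decision function. This is only an illustrative sketch of the described logic, not RNC code: the function name, the outcome labels and the returned action strings are hypothetical, while the two cause strings are taken from the text.

```python
# Failure causes for which the text says the SRNC retries with the next GSM target cell.
RETRY_CAUSES = {
    "Relocation Failure in Target CN/RNC or Target System",
    "Relocation not supported in Target RNC or Target System",
}


def srnc_next_action(outcome, cause=None, remaining_targets=()):
    """Return the action described in the text for a failed Relocation Preparation.

    outcome: "timer_expiry" (TRELOCprep expired) or "preparation_failure"
             (hypothetical labels used only in this sketch).
    """
    if outcome == "timer_expiry":
        return "send Relocation Cancel, keep call on UMTS"
    if outcome == "preparation_failure":
        if cause in RETRY_CAUSES and remaining_targets:
            return "retry Relocation Preparation towards " + remaining_targets[0]
        return "keep call on UMTS, handover not possible for now"
    raise ValueError("unknown outcome: " + str(outcome))


if __name__ == "__main__":
    print(srnc_next_action("timer_expiry"))
    print(srnc_next_action(
        "preparation_failure",
        "Relocation not supported in Target RNC or Target System",
        ["GSM cell B", "GSM cell C"],
    ))
```

In both branches the call itself is maintained, which is why these events appear in the call trace as Relocation Cancel or Relocation Preparation Failure rather than as drops.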
Figure 35: Call Flow IRAT HO relocation cancellation due to timer expiry
The tables below list the PM KPIs describing relocation failure and success:
PM system | Counter / KPI
UtranCell | VS.RAB.Drop.CS.RelocUEInvol / RAB.SuccEstab.CS
UtranCell | VS.RAB.Drop.PS.RelocUEInvol / RAB.SuccEstab.PS.Sum
Table 70: Drop rate due to Failure in SRNS relocation UE involved

PM system | Counter / KPI
UtranCell | (IRATHO.AttRelocPrepOutCS - IRATHO.FailRelocPrepOutCS.<sum>) / IRATHO.AttRelocPrepOutCS
UtranCell | IRATHO.FailRelocPrepOutCS.TRELOCprep_exp / IRATHO.AttRelocPrepOutCS
Table 71: PM KPIs for IRAT-HO relocation failure and success rates
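
As a worked example of the Table 71 KPIs, the sketch below computes the relocation preparation success rate as attempts minus the sum of all failure counters, divided by attempts, and the share of attempts that failed on TRELOCprep expiry. The function name and the dictionary layout are assumptions for illustration; only the counter names come from the table, and the cause key "other" is a placeholder.

```python
def irat_reloc_prep_kpis(attempts, failures_by_cause):
    """Compute IRAT-HO relocation preparation KPIs for one cell.

    attempts: value of IRATHO.AttRelocPrepOutCS
    failures_by_cause: dict mapping a cause suffix to IRATHO.FailRelocPrepOutCS.<cause>
    Returns None when there were no attempts (KPI undefined).
    """
    if attempts <= 0:
        return None
    total_failures = sum(failures_by_cause.values())
    return {
        "success_rate": (attempts - total_failures) / attempts,
        "timer_expiry_rate": failures_by_cause.get("TRELOCprep_exp", 0) / attempts,
    }


if __name__ == "__main__":
    # Example values only, not measured data.
    print(irat_reloc_prep_kpis(200, {"TRELOCprep_exp": 10, "other": 30}))
```

Returning None for a cell without attempts avoids a division by zero and keeps idle cells out of worst-cell rankings.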
6.17.3.
The release of the RAB and the RL is used not only when terminating the voice or data call, but also when doing an IRAT HO from 3G to 2G. In general, failures are not expected to occur at this stage. The call handling is shown in Figure 11; the normal release procedure is identical to this call handling, the only exception being that it is not initiated by an Iu Release Request.
7. Call quality
This section investigates those aspects that have a direct influence on the user-perceived call quality. The first part discusses the BLER in the DL and UL. The second part gives a definition of the Quality of Service (QoS) parameters for the different types of services like voice, data and VT, together with a description of performance weaknesses and how to overcome them.
7.1.
In the following subsections the DL and UL BLER analysis is described in more detail.
7.1.1. DL Block Error Rate (BLER) analysis
7.1.1.1. Concept
The DL closed loop power control is in charge of keeping the DL BLER in a predefined range. It can be split into two loops: outer and inner loop. Figure 36 below shows the principle of the DL PC:
Figure 36: Downlink power control principle
DL outer loop PC: The RNC sends a target value for the BLER to the UE on the DCCH. This value should guarantee optimal performance for the (voice or data) service based on the requested QoS parameters. The DL outer loop PC in the UE defines a SIR target based on the BLER. The control loop runs autonomously in the UE with a maximum speed of 100 Hz. The method of setting the SIR target in order to provide the requested BLER is not specified in the 3GPP standard. However, some UE performance requirements in given RF conditions are specified in 2. When the UE is in compressed mode, higher SIR target values will be defined, as there is no power control during transmission gaps.
DL inner loop PC: The purpose of the inner loop PC is fast adaptation of the NodeB transmit power in order to achieve the targeted SIR for the considered downlink radio channel. Because of the speed of the control loop (up to 1500 Hz) the only elements involved in the inner loop power control are the UE and the NodeB. The TPC pattern that the UE sends to the NodeB is based on the comparison of the SIR estimate with the SIR target. However, the NodeB transmit power is limited by parameters given by the RNC on NBAP.
7.1.1.2. Failure symptoms, identification and fixes for improvement
The DL BLER is reported by any drive test system in Uu traces, while the DL Tx code power can be captured using a NodeB logging tool like OTCell or CDM/x, or even with an RNC-initiated Call Trace if it is configured with NbapDedicatedMeasurements. Table 72 lists the triggers in these traces:

Problem | Trace | Trigger
High DL BLER in Uu | Uu | DL BLER higher than x % for more than y seconds
NodeB Tx Pwr via CT | CT | NodeB transmit power for service x exceeds z dBm for more than y seconds
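
Both triggers in Table 72 have the same shape: a measured value stays above a threshold for more than a given time. A minimal sketch of such a detector is shown below; the function name and the sample layout (time-ordered pairs of timestamp and value) are assumptions, and the thresholds x, y and z remain parameters because the guideline does not fix them.

```python
def sustained_exceedance(samples, threshold, min_duration_s):
    """Find intervals where the value stays above threshold for more than min_duration_s.

    samples: list of (timestamp_s, value) pairs sorted by time,
             e.g. DL BLER in % or NodeB Tx code power in dBm.
    Returns a list of (start_s, end_s) intervals.
    """
    intervals = []
    start = None
    last = None
    for timestamp, value in samples:
        if value > threshold:
            if start is None:
                start = timestamp
            last = timestamp
        else:
            if start is not None and last - start > min_duration_s:
                intervals.append((start, last))
            start = None
    # Close a run that is still open at the end of the trace.
    if start is not None and last - start > min_duration_s:
        intervals.append((start, last))
    return intervals


if __name__ == "__main__":
    # Example DL BLER samples in %, one per second (illustrative values only).
    bler = [(0, 1.0), (1, 3.0), (2, 4.0), (3, 5.0), (4, 1.0), (5, 6.0), (6, 1.0)]
    print(sustained_exceedance(bler, threshold=2.0, min_duration_s=1.5))
```

Single outliers, such as the sample at 5 s in the example, are ignored because they do not last longer than the minimum duration.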
7.1.2. UL Block Error Rate (BLER) analysis
7.1.2.1. Concept
The UL closed loop power control is in charge of keeping the UL BLER in a predefined range. It can be split into two loops: outer and inner loop.
UL outer loop PC: The UL outer loop PC is located at the RNC and is responsible for updating the UL SIR target so that the UL BLER ensures the QoS of the requested (voice or data) service. The RNC provides the NodeB with the updated SIR target via the DCH FP on the Iub. The control loop runs in the RNC with a speed of up to 100 Hz. For updating the SIR target the RNC takes into account not only the measured BLER, but also the reported Quality Estimates (QE) provided by the NodeB. In case Power Control Enhancements (DynamicParameterPerDch.qeThresholdForUlOlpc = 255) is not available, the RNC relies only on the CRCI reported by the NodeB. Figure 37 below visualises the principle:
Figure 37: UL outer loop power control
If the UE is in soft/softer HO mode and one particular NodeB has more than one leg, the frame selection is done in the NodeB (called micro-diversity). For frames coming from different NodeBs belonging to the same RNC, the RNC does the frame selection (termed macro-diversity). In case the NodeBs belong to different RNCs, the SRNC does the frame selection; the data is provided via the Iur interface. For each UL TB set the NodeB performs a CRC check on the PHY layer and adds a CRCI to the DCH-FP frame. In addition the NodeB can also estimate the quality of the link and send it to the SRNC in the QE field of the same frame. The QE value ranges from 0 to 255 (small QEs indicate good quality) and can be based
upon physical or transport channel BER. QE can also be used by the OLPC in the SRNC if 0 < DynamicParameterPerDch.qeThresholdForUlOlpc < 255.
UL inner loop PC: The UL inner loop PC adjusts the transmit power of the UE in order to achieve the SIR target provided by the SRNC. All NodeBs involved in the particular call send TPC commands at a rate of up to 1500 Hz. The TPC commands of different NodeBs can differ from one another. In this case, if at least one of the NodeBs is sending a power-down command, the UE will lower its transmit power by the defined power-down step. In case there is no TPC at all, the transmit power of the UE remains unchanged. More information, including parameters, can be found in 2.
7.1.2.2. Failure symptoms, identification and fixes for improvement
Cells suffering from high UL BLER can be easily identified using data from the PM system. During drive testing, high UL BLER can be identified by using the IMSI based call trace (CTb) in parallel with tracking the PMs retrieved by the RNC. High UL BLER might cause an RLF in the UL and/or the drop of the call (see also subsection 6.1). A high UL BLER at RNS level may indicate inappropriate provisioning of TFCI for the various R99 UL services. For example, if TFI0 = 0x81 is chosen instead of 1x0, this would disable the reporting of CRC from the UE per TTI and hence impact the OLPC performance if silent mode is activated on the NodeB via the activation flags NodeB.isSilentModeAllowed and UlRbSetConf.ulFpMode = ulFpModeSilent. Table 73 and Table 74 list the triggers in interface traces and the corresponding PM KPIs:

Problem | Trace | Trigger
High UL BLER | CT | UL BLER higher than x % for more than y seconds
High UE power reached | Uu | Any occurrence where the UE is sending with at least y dBm UE power for more than x seconds (see footnote 19)
Bad CRCI | Iub | More than x % of the CRCIs within y seconds have a CRCI equal to 1
Bad QE | Iub | More than x % of the QEs within y seconds have a QE of more than z
SIR exceeded target | Iub | The SIR target for service x is exceeding value y
 | Iub | Any occurrence where the UL SIR target is not updated for more than x seconds. This is an indication of failure in the UL that might lead to an UL RLF
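
The "Bad CRCI" trigger above (more than x % of the CRCIs within y seconds equal to 1) can be evaluated with a sliding time window over decoded Iub DCH-FP frames. The following is a sketch under assumptions: the function name and the input layout (time-ordered pairs of timestamp and CRCI value, with 1 meaning CRC error) are hypothetical and not tied to a specific protocol analyser.

```python
from collections import deque


def bad_crci_alarms(frames, window_s, max_bad_pct):
    """Return timestamps at which the share of bad CRCIs in the trailing window is too high.

    frames: iterable of (timestamp_s, crci) sorted by time, crci is 0 (good) or 1 (bad).
    window_s: length y of the trailing window in seconds.
    max_bad_pct: threshold x in percent.
    """
    window = deque()
    bad = 0
    alarms = []
    for timestamp, crci in frames:
        window.append((timestamp, crci))
        bad += crci
        # Drop frames that are at least window_s older than the current frame.
        while window[0][0] <= timestamp - window_s:
            _, old_crci = window.popleft()
            bad -= old_crci
        if 100.0 * bad / len(window) > max_bad_pct:
            alarms.append(timestamp)
    return alarms


if __name__ == "__main__":
    # Illustrative frames, one per second; real traces carry one frame per TTI.
    frames = [(0, 0), (1, 0), (2, 1), (3, 1), (4, 0), (5, 0), (6, 0)]
    print(bad_crci_alarms(frames, window_s=3, max_bad_pct=50.0))
```

The same window logic could be reused for the "Bad QE" trigger by counting frames whose QE exceeds a threshold instead of summing CRCI values.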
Counter / KPI
(VS.DdUlAmrABtBadFrm / (VS.DdUlAmrABtGoodFrm + VS.DdUlAmrABtBadFrm))
(VS.DedicatedUplinkBadPdus.<sum> / (VS.DedicatedUplinkPduRlc + VS.DedicatedUplinkBadPdus))

Footnote 19: Note that according to the 3GPP specification there are four power classes defined (power class 1 to 4) with maximum output power +33 dBm, +27 dBm, +24 dBm and +21 dBm. The most common mobiles on the market are class 3 (+24 dBm).
7.2.
7.2.1. QoS general
In this subsection general QoS KPIs are listed that are not linked to a particular service like voice, data or VT. Monitoring these or similar KPIs can act as a trigger point for identifying non-optimal performance.

KPI
- No network [%]
- Attach failure [%]
- Attach setup time [s]
- Location update success rate [%]
- SMS failure rate [%]
- MMS failure rate [%]
- SMS delivery time [s]
- MMS delivery time [s]
Table 75: General QoS KPI measured on application level

In ALU UA6.0, QoS parameters like TC, ARP and THP given in the RANAP RAB Assignment Request message can be used for the OLS differentiation in various features like power control, iRM CAC, AO and iRM pre-emption, see 2 for details.
7.2.2. QoS voice service
Because UMTS UL and DL links are uncorrelated due to different frequencies and reception paths, it is necessary to measure the UL and DL voice quality separately. The voice quality equipment compares the received voice samples with the transmitted voice samples. In that way the evaluation software can classify the voice quality for both directions independently. Table 77 below gives the QoS KPIs for voice services. For the voice quality evaluation the Mean Opinion Score (MOS) is used. The MOS is defined by the ITU and ranges from 1 to 5; for details see ITU P.800 and ITU P.862. For further discussion of the MOS performance of various AMR codec rates see 2. Voice quality can be considered good when the MOS exceeds 3.0.
Voice quality degradations such as echo or voice delay are reflected by this measure.

Mean Opinion Score (MOS) | QoS value
Below 2.0 | Poor
2.0 to 3.0 | Fair
3.0 to 4.0 | Good
Above 4.0 | Excellent
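
When post-processing drive test results, the MOS classification above can be applied per speech sample with a small helper. The handling of the boundary values is an assumption, because the table does not say to which class exactly 3.0 or 4.0 belongs; here a MOS must exceed 3.0 to count as Good, in line with the statement that voice quality is good when the MOS exceeds 3.0. The function name is hypothetical.

```python
def classify_mos(mos):
    """Map a MOS value (1.0 to 5.0) to the QoS class of the table above.

    Boundary handling is an assumption: 2.0 and 3.0 count as Fair, 4.0 as Good.
    """
    if not 1.0 <= mos <= 5.0:
        raise ValueError("MOS must be between 1.0 and 5.0")
    if mos < 2.0:
        return "Poor"
    if mos <= 3.0:
        return "Fair"
    if mos <= 4.0:
        return "Good"
    return "Excellent"


if __name__ == "__main__":
    # Example MOS samples for one direction (illustrative values only).
    for sample in (1.5, 2.7, 3.4, 4.2):
        print(sample, classify_mos(sample))
```

Because UL and DL are measured separately, the helper would be applied to each direction independently.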
7.2.3. QoS data services
7.2.3.1. Concept
There are different metrics available defining the QoS of data services, like throughput, delay, jitter etc. In the PDP Context Activation Request message the UE can optionally request pre-defined QoS profiles as specified in 2. The CN can check the requested QoS profile against entries from the HLR. The CN makes these negotiated QoS parameters available to the UTRAN via the RAB Assignment Request 2. Dedicated and common UTRAN resources can be dynamically assigned depending on traffic measurements or load. The PS RB initially assigned at the beginning of a PDP session depends on the UTRAN configuration. The RB data rate can be dynamically changed (or the mobile can even be sent to idle mode/URA_PCH mode) depending on the data to be sent in the UL and/or DL. Depending on the status of the RLC buffer in the UE, the mobile might send a Measurement Report Event 4a (in case the buffer occupancy exceeds an absolute threshold), provided feature 34227.1 is enabled in UA6.0 via RadioAccessService.isBOTriggerForRbAdaptationAllowed. The RNC would then react to this Measurement Report by doing an RB reconfiguration (see subsections 5.4.1 and 6.17.1). Furthermore, a smaller RB can be assigned in case of overload estimations done by the RNC (subsection 6.5). Data rates assigned at various state transitions can also be capped thanks to feature 34227.3, enabled via RadioAccessService.isOamCappingOfDataAllowd. Another difference when describing the user-perceived QoS of PS data is that a drop of the RAB and RRC connection does not (necessarily) mean that the PDP
Context is removed from the GGSN or the FTP session drops. After the re-establishment of the RRC connection or the new establishment of the RAB, the FTP session can be resumed provided the session has not timed out. For the user the drop of the RRC connection and RAB is visible as stalling of the FTP transfer and low throughput rates. In case of real-time applications like video streaming or web radio, the drop will be noticed by the user if the buffer of the application is emptied and no new data is received. The application might restart with codecs requiring lower bandwidth to fill the internal buffer again. On the PPP link of the PS data session the TCP/IP header and data can be compressed, resulting in a throughput increase. For most Microsoft operating systems, compression is an available option in the PPP settings of the dial-up networking. In addition, the PDCP layer provides header compression for e.g. TCP, UDP, RTP and IP headers 2. Simple FTP download tests with 1 MB files in UMTS networks have shown that the throughput for zipped binary files is around 25% lower than for ASCII files.
7.2.3.2. Failure symptoms, identification and fixes for improvement
- UE state (Cell_DCH with HSPA, Cell_DCH, Cell_FACH or URA_PCH)
- Chosen RB rate (in case of R99 Cell_DCH)
- Reported failures of the transport network (subsection 6.13)
- Problems detected on the RLC layer, e.g. RLC retransmissions or RLC resets (subsection 6.14)
- Reported BLER in UL and/or DL (subsection 7.1)
- TCP configuration like TCP window size or MSS (see subsection 6.14.1)
- Retransmissions on the TCP layer
- PPP/PDCP compression used/not used
- Usage of zipped files/unzipped ASCII files

- First the end-to-end data performance should be investigated
- Then delay measurements should be done, indicating the source of the performance degradation (e.g. delay due to non-optimal RLC queue, retransmission on RLC etc.)
One example of a (graphical) analysis is shown in Figure 38 below. The throughput of an FTP transfer, measured by Wireshark 2 and visualised by tcptrace 2, is low. The root cause of the non-optimal performance is congestion control:
Figure 38: FTP performance degradation caused by Congestion Control
The FTP throughput is the gradient of the curve; in addition, TCP retransmissions caused by SDU discards on RLC are shown in the right part of the picture (see also subsection 6.14.1). It is possible to cross-correlate the UE traces with Wireshark traces recorded at the FTP server and also with RF data like Ec/No or Active Set Update messages recorded by the UE logging tools. In that way FTP performance degradations can be linked to handover problems, bad radio conditions in terms of Ec/No or neighbour definition problems. When the traces are recorded by different mechanisms, it is necessary to align the PC clocks by time synchronisation; otherwise tools like Actix or RFO can do event-based cross-correlation. Another example of an end-to-end analysis is shown in Figure 39 below; the picture visualises the delay of an ICMP ping between Internet server and PC client for UL and DL separately. The trace was recorded with Wireshark 2. Furthermore, by tracing on the Iub, Iu and Gn interfaces it is possible to make similar delay plots for the particular interfaces. This will reveal where the high delay peaks are coming from and will help further investigation.
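
The statement that the FTP throughput is the gradient of the TCP sequence number curve can be made concrete with a short calculation over (time, sequence number) samples exported from a packet trace. This is a simplified sketch: the function name and the input layout are assumptions, and it ignores sequence number wrap-around and retransmitted segments.

```python
def throughput_kbit_s(samples):
    """Throughput per interval as the slope of the TCP sequence number curve.

    samples: list of (timestamp_s, relative_seq_bytes) sorted by time.
    Returns a list of (interval_end_s, throughput_kbit_s).
    """
    rates = []
    for (t0, seq0), (t1, seq1) in zip(samples, samples[1:]):
        if t1 <= t0:
            continue  # skip duplicate or out-of-order timestamps
        rates.append((t1, (seq1 - seq0) * 8 / (t1 - t0) / 1000.0))
    return rates


if __name__ == "__main__":
    # Illustrative samples: a one second stall between t = 1 s and t = 2 s.
    curve = [(0.0, 0), (1.0, 125000), (2.0, 125000), (4.0, 375000)]
    for end_s, rate in throughput_kbit_s(curve):
        print(end_s, rate)
```

Intervals with zero slope correspond to transmission gaps, which is how stalls during a handover or after SDU discards show up in such a plot.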
Figure 39: end-to-end delay of an ICMP ping
For the same measurement the delay on the Gn interface was also measured, as shown in Figure 40 below. As expected the delay is very small and does not have a big impact on the overall delay. This trace was recorded using a Tektronix K12 protocol tracer.
Table 78 below lists the identification triggers in network interface traces:

| Problem | Trace | Trigger |
|---|---|---|
| TCP reset | TCP | Number of occurrences in which the RST flag in the TCP header is set. Statistic counted per TCP session |
| TCP retransmission | TCP | Number of occurrences of TCP retransmissions. Statistic counted per TCP session |
| TCP SACKs | TCP | Number of SACKs. Statistic counted per TCP session |

Table 78: Identification of QoS issues for data service

Table 79 below lists the data QoS KPIs for identifying non-optimal performance:

| KPI |
|---|
| PDP context activation failure [%] |
| PDP context activation time [s] |
| PDP context cut off rate [%] |
| FTP cut off rate [%] |
| FTP throughput [kbit/s] |
| Ping delay [s] |
| HTTP failures [%] |
| RB Assignment Success Rate [%] |

Table 79: QoS of data services KPIs

7.2.4. QoS VT service

For VT calls the QoS consists of voice and video quality. One tool that can assess the quality of the video samples as a MOS value is ALU's LVAT. Although there is an ITU standard that defines the framework of video quality measurement 2, it does not lay out the algorithm and calibration of the MOS, which therefore remain vendor proprietary. For the voice QoS parameters the metric of subsection 7.2.2 is used. Table 80 below lists the KPIs to retrieve the other QoS parameters for VT:

| KPI |
|---|
| Call completion success rate VT [%] |
| Blocked call rate VT [%] |
| Dropped calls VT [%] |
| Call setup success rate VT [%] |
Appendix
A. Measurement definition
A.1. Measurement definition voice

For voice services the UMTS UE in the drive test van should call an ISDN line in the PLMN; otherwise it is hard to distinguish whether the first or the second mobile is responsible for observed failures or voice quality degradations. This will help the RF planner to analyse the failure and propose additional network changes. One suggested voice test call sequence for the UMTS UE in the drive test van is as follows:

- Network attach
- Mobile Originating Call (MOC), duration 2 minutes, alternating speech sample from the UE to the PLMN and vice versa
- Network detach and pause of around 10 seconds
- Network attach
- Mobile Terminating Call (MTC), duration 2 minutes, alternating speech sample from the UE to the PLMN and vice versa
- Network detach and pause of around 10 seconds
The drive test kit should be capable of generating this measurement sequence automatically. In parallel, the RF conditions of the UE and the neighbouring cells should be recorded using the drive test tool along with a 3G and 2G scanner for parallel verification.

A.2. Measurement definition data

When doing KPI performance verification of data services, the FTP server should be directly connected to the GGSN to avoid any latency and delay caused by the Internet. For security reasons a special test APN should be used. The FTP throughput should be measured in motion and, in addition, stationary where there are hot spots inside the UMTS cluster, e.g. railway stations, big hotels or airports. It is recommended to do the testing via scripts; the advantage is repeatability, which eases comparison and analysis. Data scripts are supported by most drive test tools, but can also be written with tools like cygwin, which provides a full Linux command shell environment 2 (see footnote 20). The data test call sequence should be as follows:
- Network attach and PDP context activation
- FTP download of three times 5 MB file, 5 seconds pause in between
- Pause of 20 seconds
- FTP download of three times 5 MB file, 5 seconds pause in between
- Pause of 20 seconds
- FTP upload of three times 2 MB file, 5 seconds pause in between
- Network detach, PDP context deactivation and pause of around 10 seconds

Footnote 20: The original DOS FTP client should be used instead of the FTP client from cygwin (/usr/bin/ftp). This can be achieved by defining a variable called FTP_CMD = c:\winnt\system32\ftp.exe in the scripts.
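The data test call sequence can be expressed as a simple, repeatable plan that a script then executes step by step. The sketch below only builds the plan; the step names (`attach_and_activate_pdp`, `ftp_download` and so on) are hypothetical labels, and how each step is executed (drive test tool, FTP client, dial-up control) is deliberately left open.

```python
from typing import List, Optional, Tuple

# One step of the plan: (action name, argument). The argument is the file size
# in MB for transfers, the duration in seconds for pauses, otherwise None.
Step = Tuple[str, Optional[float]]


def ftp_block(action: str, size_mb: float, repeats: int = 3, gap_s: float = 5) -> List[Step]:
    """Repeat a transfer `repeats` times with a pause of `gap_s` seconds in between."""
    steps: List[Step] = []
    for i in range(repeats):
        steps.append((action, size_mb))
        if i < repeats - 1:
            steps.append(("pause", gap_s))
    return steps


def build_data_test_plan() -> List[Step]:
    """Build the data test call sequence listed above."""
    plan: List[Step] = [("attach_and_activate_pdp", None)]
    plan += ftp_block("ftp_download", 5)
    plan.append(("pause", 20))
    plan += ftp_block("ftp_download", 5)
    plan.append(("pause", 20))
    plan += ftp_block("ftp_upload", 2)
    plan.append(("detach_and_deactivate_pdp", None))
    plan.append(("pause", 10))
    return plan


plan = build_data_test_plan()
total_pause_s = sum(arg for action, arg in plan if action == "pause")
```

Keeping the plan as data separates the sequence definition from the tool that runs it, so the same sequence can be replayed unchanged in every drive test.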
For troubleshooting purposes it might be necessary to record traces with a TCP/IP protocol analyser such as Wireshark on both the UE side and the FTP server side. In parallel the RF conditions should be recorded. For measuring the maximum possible throughput on a radio link UDP shall be used, because TCP retransmissions might give an incorrect picture of the bandwidth capability. The TCP configuration of the client PC and the server should be comparable with the settings most commonly used by normal UMTS subscribers and in the Internet. The TCP window size of the sending entity should be large enough that the RLC queue in the RNC does not go into underrun. Table 81 below lists the default TCP/IP parameters that should be used during the testing:

| Entity | Feature | Setting | Short description |
|---|---|---|---|
| Client | SACK | Set to TRUE | SACK allows the receiver to inform the sender about all segments that are successfully received |
| Server | TCP window size | 64 kbyte | The TCP window is the amount of outstanding data a sender can send before it gets an acknowledgment from the receiving entity |
| | PDCP compression | | When doing root cause analysis the feature should be disabled |
| | PPP compression | | When doing root cause analysis the feature should be disabled |
| | Starting MSS | | The number of TCP/IP packets sent by the sending entity at the beginning. Further packets will be sent after reception of the first TCP ACK |
| Client | ICMP packet size | 40 byte | To measure the ICMP RTT an ICMP packet with a size of 40 byte (8 byte header plus 32 byte payload) should be sent |
| Client/server | MSS | 1460 byte | The MSS should be 1460 byte, resulting in an MTU of 1500 byte (= MSS + 20 byte TCP header + 20 byte IP header). The actual TCP/IP packet size used might be smaller if an Internet router fragments the packets |

Table 81: Default TCP/IP parameter settings used for testing

The TCP/IP settings can be verified using Wireshark. For Windows PCs the settings can be set in the registry or with the help of shareware tools like 2. For UNIX and Linux operating systems the settings can be set in the corresponding configuration files.

In case ciphering on RLC/MAC and data compression on PPP/PDCP are not used, specially prepared ASCII files shall be used. This eases the identification of each single packet in Wireshark, Iub or Iu traces to detect retransmissions on TCP or RLC. Note that no compression or ciphering is used on Iu, Gn and Gi, so the ASCII payload can be identified with the respective tracing equipment. The special ASCII files should contain only one (!) line, for example the following sequence:
umts000000000umts000000001umts000000002umts000000003umts000000004umts000000005umts000000006

In case PPP data compression is on, zipped data shall be used to avoid irregular throughput measurements. Finally, care should be taken that no other applications on the PC generate unnecessary network traffic such as keep-alive signals. Figure 41 below shows a snapshot of the Wireshark protocol analyser:
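The single-line ASCII test file described above can be generated with a short script. The following Python sketch assumes that the pattern is the string "umts" followed by a nine-digit running counter, as in the example sequence; the function names and the file size are illustrative assumptions.

```python
TOKEN_LEN = 13  # "umts" (4 characters) + 9-digit counter


def ascii_test_payload(size_bytes: int) -> str:
    """Return a single-line payload of exactly size_bytes characters."""
    count = -(-size_bytes // TOKEN_LEN)  # ceiling division: tokens needed
    return "".join(f"umts{i:09d}" for i in range(count))[:size_bytes]


def token_offset(token: str) -> int:
    """Return the byte offset in the file at which a token seen in a trace starts."""
    return int(token[4:]) * TOKEN_LEN


# Hypothetical 1 MB test file content (1 000 000 bytes, no line breaks).
payload = ascii_test_payload(1_000_000)
```

Because every token carries its own counter, a token seen in a Wireshark, Iub or Iu trace maps directly to a file offset, which makes repeated payload (a retransmission) easy to recognise. When writing the payload to disk, the file should be opened so that no line terminator is appended.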
A.3. Measurement definition VT

For VT one mobile should be located in the drive test van; the other mobile should be stationary, located close to a UMTS site outside the UMTS cluster under test. This will minimise possible failure causes for this second UE and help the RF planner with the root cause analysis. The measurement sequence should be the same as defined for voice calls, except that a network attach/detach is not necessary because this is service independent. So the full measurement sequence for VT should be as follows:

- Mobile Originating Call (MOC), duration 2 minutes, alternating speech sample from UE 1 to UE 2 and vice versa
- Pause of around 10 seconds
- Mobile Terminating Call (MTC), duration 2 minutes, alternating speech sample from UE 1 to UE 2 and vice versa
- Pause of around 10 seconds
[Figure: block diagram with the elements CN, Iu, RNC, NodeB and fading simulator]