
UFS Overview

Uploaded by

kelvinmacshop
Copyright
© © All Rights Reserved
UFS overview

Updated: Aug 03, 2023 80-NN752-1 Rev: E

UFS is a simple, high-performance mass storage specification used by mobile
applications for data transfer between host processing and memory devices. It is
designed for high throughput, low power consumption, and a reduced bus
interface.

Devices

UFS works with the following device types:

● External cards
● Embedded packages
● Future expansion devices such as I/O devices, cameras, and wireless devices

It supports data transfer for applications that run on mobile phones, mobile PCs,
digital cameras, portable media players, MP3 players, and any other device that
needs mass storage and external cards.

Performance

UFS performance is governed by the following High-Speed gear requirements:

● Mandatory – Gear 1 support (rate A: 1248 Mbps)
● Optional – Gear 2 support (rate A: 2496 Mbps)

Topology

Currently, only one device per controller is supported. The capability for multiple
devices on a single controller is planned for a future design.

Power supply

UFS uses three power supplies, including separate power supplies for the I/O and
the core:

● VCCQ – 1.2 V; logic, controller, and I/O
● VCCQ2 – 1.8 V; controller and I/O
● VCC – 1.8 V or 3.3 V; NVM

VCCQ and VCCQ2 have different signaling amplitudes as defined by MIPI M-PHY:

● VCCQ – 400 mVp/240 mVp (not terminated)
● VCCQ2 – 200 mVp/120 mVp (terminated)

Two signaling schemes are supported: Low-Speed mode with PWM signaling
and High-Speed Burst mode. Multiple gears are defined for both Low-Speed and
High-Speed modes. Signals use 8b10b line coding and have high reliability,
with a bit error rate (BER) below 10^-10.
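
The effect of 8b10b line coding on usable bandwidth can be sketched in a few lines of C. This is an illustration only: it assumes that the rate A figures quoted in the Performance section are per-lane line rates in Mbps, and the function and macro names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Rate A line rates from the Performance section, taken here as Mbps per lane. */
#define UFS_HS_G1_RATE_A_MBPS 1248u
#define UFS_HS_G2_RATE_A_MBPS 2496u

/*
 * With 8b10b coding, every 8 payload bits travel as 10 line bits, so the
 * payload rate is 8/10 of the line rate. The result is in kbps so that the
 * arithmetic stays in integers.
 */
static uint32_t payload_kbps_8b10b(uint32_t line_rate_mbps, uint32_t lanes)
{
    assert(lanes > 0);
    return line_rate_mbps * 1000u / 10u * 8u * lanes;
}
```

Under these assumptions, one Gear 1 lane carries at most 998.4 Mbps of payload before any protocol overhead is taken into account.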
UFS architecture overview


The following figure shows the UFS architecture.

Figure : Architecture diagram

Application layer
● Task manager – Handles command queue control tasks, for example, abort.
● UCS – Handles commands such as read and write. It implements the SCSI
command set as the baseline protocol and can add a UFS native command
set for extended UFS functionality.

Device manager

The device manager handles device operation and configuration tasks. Device
operation is the processing of commands, such as sleep, suspend, power down,
and other power specific operations. Device configuration is the process of
maintaining and storing descriptors that are used to query, read, and modify
device configurations.

The device manager uses two service access points (SAP):

● UDM SAP – Through this SAP, the UFS transfer protocol (UTP) layer
allows the device manager to handle device level operations and
configurations, for example, query requests for a descriptor.
● UIO SAP – The UFS interconnect (UIC) layer exposes itself through this
SAP to trigger a reset of the UIC layer and request a reset from the host.

UTP

The UTP layer generates UFS protocol information units (UPIU) to transfer
messages from the device manager or application layer. UPIUs are transferred
from the host-side UTP to the device‑side UTP.

UTP supports three SAPs:

● UDM_SAP – Communicates with the device manager
● UTP_CMD_SAP – Transports commands
● UTP_TM_SAP – Transports task management functions

UIC

The UIC layer is the lowest layer in the UFS architecture. It handles transport
tasks and comprises the MIPI Unified Protocol (UniPro) and MIPI M-PHY
sublayers.

UIC supports two interfaces:

● UIC_SAP – Transports UPIU from the UFS host to the UFS device
● UIO_SAP – Transports queries and control of device management

MIPI UniPro

The MIPI UniPro sublayer provides basic transfer capabilities to the UTP
layer. It is described in the MIPI UniPro layer section.

MIPI M-PHY

The MIPI M-PHY sublayer is the physical layer. Each lane within this layer
consists of a pair of differential signals: a Tx lane and an Rx lane. A device may
support multiple lanes, and the number of lanes is always a multiple of 2.

Name | Type | Description
REF_CLK | Input | Reference clock; relatively low-speed clock common to all UFS devices in the chain; used as a reference for the PLL in each device
DIN_t, DIN_c | Input | Downstream lane input; differential input true and complement signal pair
DOUT_t, DOUT_c | Output | Upstream lane output; differential output true and complement signal pair
RST_n | Input | Reset; UFS device hardware reset signal

The following figure shows the UFS system model.

Figure : UFS system model


UTP layer


The UTP layer uses SAM-5, a client-server model used as a general architecture
for task management. In this model, the client is the host and the server is the
UFS device. The target is a logical unit (LUN) located in the UFS device.

UPIUs are used to communicate task information between the host and the
target. A task is a command or a sequence of commands/requests that performs a
service. The host provides a task tag that is shared by all messages
belonging to the same task. LUNs service the tasks. Each LUN has a task queue
that may hold more than one task, but each LUN can only service one task at a
time. Different LUNs can service their tasks simultaneously.

A typical command sequence comprises:

● Command UPIU
● Data out UPIU (optional)
● Data in UPIU (optional)
● Response UPIU

Data pacing

When the host issues a write command, the device can pace the data out
phase by sending the ready to transfer UPIU to the host. This UPIU also
contains an embedded transfer context that can be used to initiate a DMA
transfer on a per-packet basis at the initiating device.

UPIU transaction code

Code | Description
NOP out | Acts as a ping from the host to the target; can be used to check for a connection path to the device.
NOP in | Sent by the target in response to a NOP out request.
Command | Originates in the host and is sent to a LUN within a target device; contains a command descriptor block (CDB) and command parameters. Represents the command phase.
Response | Originates in the target and is sent back to the host; contains a command-specific operation status and other response information. Represents the status phase.
Data out | Originates in the host and sends data from the host to the target device; represents the data out phase.
Data in | Originates in the target and sends data from the target back to the host; represents the data in phase.
Task management request (TMR) | Carries the SAM task management function from the host to the target; standard functions are defined by the SAM-5 specification. Additional functions are defined by UFS.
Task management response | Carries the SAM task management function response from the target back to the host.
Ready to transfer | Indicates that the target has sufficient buffer space and is ready to receive the next data out UPIU; the target can send multiple ready to transfer UPIUs if it has the buffer space to receive multiple data out UPIU packets. The maximum data buffer size is negotiated by the host and target during enumeration and configuration. This UPIU contains a DMA context and can be used to set up and trigger a DMA action within a host controller.
Query request | Originates in the host and is used to request descriptor information data from the target; this transaction is defined outside of the command and task management functions and is defined exclusively by UFS.
Query response | Originates in the target and sends descriptor information data back to the host; this transaction is defined outside of the command and task management functions and is defined exclusively by UFS.
Reject | Originates in the target and is sent to the host; generated when the target is unable to interpret or execute a UPIU. Indicates an error in the field values of the received UPIU.

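
The transaction types in the table above map naturally onto an enumeration. The sketch below is illustrative: the numeric values are not given in this document and are taken, as an assumption, from the JEDEC UFS specification, and all identifier names are hypothetical. In these values, bit 5 happens to be set for every UPIU that originates in the target, which the helper uses.

```c
#include <assert.h>
#include <stdint.h>

/* Transaction codes; values assumed from the JEDEC UFS specification. */
enum upiu_transaction_code {
    UPIU_NOP_OUT           = 0x00,
    UPIU_COMMAND           = 0x01,
    UPIU_DATA_OUT          = 0x02,
    UPIU_TASK_MGMT_REQ     = 0x04,
    UPIU_QUERY_REQ         = 0x16,
    UPIU_NOP_IN            = 0x20,
    UPIU_RESPONSE          = 0x21,
    UPIU_DATA_IN           = 0x22,
    UPIU_TASK_MGMT_RSP     = 0x24,
    UPIU_READY_TO_TRANSFER = 0x31,
    UPIU_QUERY_RSP         = 0x36,
    UPIU_REJECT            = 0x3f
};

/* Bit that is set in every target-originated code listed above. */
#define UPIU_FROM_TARGET_BIT 0x20u

/* Compile-time check that a request code and its response code pair up. */
static_assert((UPIU_QUERY_REQ | UPIU_FROM_TARGET_BIT) == UPIU_QUERY_RSP,
              "query request and response codes should differ only in bit 5");

/* Returns 1 if the UPIU travels from the target to the host, 0 otherwise. */
static int upiu_originates_in_target(uint8_t code)
{
    return (code & UPIU_FROM_TARGET_BIT) != 0;
}
```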
LUNs


LUNs are externally addressable, independent processing entities that process
SCSI commands and perform task management. A UFS device may have up to eight
independent LUNs. A LUN contains the following:

● Device server – A conceptual object that processes SCSI commands
● Task manager – A conceptual object that controls the sequencing of SCSI
commands and performs task management
● Tasks – A list or queue of tasks received from the host that need to be
processed

A UFS device also contains well-known LUNs. These LUNs support only a limited
number of commands, by which an application client can send a request or
receive a response that contains information for the entire device.

Figure : LUN

Well-known LUN | W-LUN | LUN field in UPIU | Supported commands
Report LUNs | 01h | 81h | Inquiry, request sense, test unit ready, start stop unit
UFS device | 50h | D0h | Inquiry, request sense, test unit ready, start stop unit
Boot | 30h | B0h | Inquiry, request sense, test unit ready, read (6), read (10), read (16)
RPMB | 44h | C4h | Inquiry, request sense, test unit ready, security in, security out
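
In every row of the table above, the LUN field in the UPIU equals the W-LUN value with bit 7 set (for example, 01h becomes 81h). A minimal sketch of that relationship follows; the identifier names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Bit 7 of the UPIU LUN field marks a well-known LUN. */
#define UPIU_WLUN_FLAG 0x80u

/* W-LUN values from the table above. */
enum ufs_wlun {
    WLUN_REPORT_LUNS = 0x01,
    WLUN_UFS_DEVICE  = 0x50,
    WLUN_BOOT        = 0x30,
    WLUN_RPMB        = 0x44
};

/* Builds the LUN field of a UPIU that addresses a well-known LUN. */
static uint8_t upiu_lun_field_for_wlun(uint8_t wlun)
{
    assert((wlun & UPIU_WLUN_FLAG) == 0);
    return (uint8_t)(wlun | UPIU_WLUN_FLAG);
}
```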

SCSI write command

The SCSI write command sequence comprises the command phase, data phase,
and response phase. A write command is initiated from the host using the
command UPIU. The device sends the ready to transfer UPIU when it is ready to
receive data. The host sends the data using the data out UPIU. The transfer
terminates when the device sends the response UPIU, which contains the status
of the write command.
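
The write sequence can be traced with a small simulation that counts the UPIUs exchanged in each direction. This is a sketch under simplifying assumptions that are not stated in this document: each ready to transfer UPIU grants a fixed number of bytes, and the host answers each one with exactly one data out UPIU. All names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

struct write_trace {
    uint32_t host_upius;   /* command and data out UPIUs sent by the host */
    uint32_t device_upius; /* ready to transfer and response UPIUs sent by the device */
};

/* Walks the command, data, and response phases of one write command. */
static struct write_trace simulate_write(uint32_t total_bytes, uint32_t bytes_per_rtt)
{
    struct write_trace t = { 0, 0 };
    uint32_t remaining = total_bytes;

    assert(bytes_per_rtt > 0);

    t.host_upius++;                 /* command phase: command UPIU */
    while (remaining > 0) {
        uint32_t chunk = remaining < bytes_per_rtt ? remaining : bytes_per_rtt;

        t.device_upius++;           /* device paces the host: ready to transfer UPIU */
        t.host_upius++;             /* host sends the granted data: data out UPIU */
        remaining -= chunk;
    }
    t.device_upius++;               /* response phase: response UPIU with status */
    return t;
}
```

For a 16 KB write with 4 KB granted per ready to transfer UPIU, the host and the device each send five UPIUs in this model.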

Figure : SCSI write sequence


SCSI read command

The SCSI read command sequence is initiated by the host, which sends a read
command with the command UPIU. The device then returns the requested data with
the data in UPIU and completes the sequence by sending the status with the
response UPIU.

Figure : SCSI read sequence

Logical block provisioning


Provisioning is a process that defines the relationship of the logical block
address space with physical memory resources. Provisioning creates and defines
LUNs and requires the following:

● Qualcomm Flash Image Loader (QFIL) – A tool required to provision the
device; see Qualcomm Flash Image Loader (QFIL) User Guide
(80-NN120-1) for steps to provision the device.
● Provisioning script – QFIL points to a provisioning script to provision the
device; the provisioning script defines the device configuration.
● Path to provisioning script – <target_build>\common\config\ufs\provision

Provisioning is a one-time process; the device cannot be reprovisioned
after being provisioned. Therefore, extreme care must be taken to ensure that
the parameters in the provisioning file meet the end product requirements.

MIPI UniPro layer


MIPI UniPro is a sublayer of the UIC layer that provides basic transfer
capabilities to the UTP layer. The data plane between UFS and UniPro uses the
UniPro transfer layer CPorts for data transfer. The control plane uses DME
service primitives. UniPro is composed of several layers defined in the MIPI
specification; however, from a UFS perspective, all the layers are considered one
black box.

Figure : MIPI unified protocol


UniPro/UFS transport protocol interface – data plane

UniPro provides CPorts as conceptual ports to applications located on layers
above UniPro. T_CO_SAP provides the core data transfer primitives.

Primitive | Description
T_CO_DATA.req (message fragment, EOM) | Issued by an application, this primitive tells UniPro to send a message.
T_CO_DATA.cnf_L (L4CPortResultCode) | Issued by UniPro, this primitive reports the result of the message transfer request.
T_CO_DATA.ind (message fragment, EOM, SOM, MsgStatus) | Issued by UniPro, this primitive delivers a received message to the service. EOM marks the last fragment of a message. SOM marks the start of a message.
T_CO_DATA.rsp_L () | Issued by the service, this primitive tells UniPro that the service is ready to receive more data.

UniPro/UFS control interface – control plane

UniPro is configured and controlled via the control plane using the DME
primitives.

Primitive | Description
DME_GET/DME_SET | Provides read/write access to all UniPro and M-PHY attributes of the local UniPort
DME_PEER_GET/DME_PEER_SET (optional) | Provides read/write access to all UniPro and M-PHY attributes of the peer UniPort
DME_POWERON/DME_POWEROFF (optional) | Powers up or down all UniPro layers (layers 1.5 through 4)
DME_HIBERNATE_ENTER/DME_HIBERNATE_EXIT | Puts the entire link in hibernation or wakes it up; affects the local and peer UniPorts (UniPro layers 1.5 through 4 and M-PHY)
DME_POWERMODE | Changes the power mode of one or both directions of the M-PHY link
DME_TEST_MODE | Sets the peer UniPro device on the link into a specific test mode
DME_LINKLOST | Indicates to the UniPro stack that the link has been lost
DME_ERROR | Indicates to the UniPro stack that an error condition has been encountered in one of the UniPro layers
DME_ENABLE | Enables the entire local UniPro stack (UniPro layers 1.5 through 4)
DME_RESET | Resets the entire local UniPro stack (UniPro layers 1.5 through 4)
DME_ENDPOINTRESET | Sends an end point reset request to a link end point
DME_LINKSTARTUP | Starts the link and informs about remote link startup

UFS controller


The UFS host controller is responsible for managing the interface between the
host software and the UFS device. It has a wrapper that contains the following:

● UFS UTP controller (QTI specific) – Responsible for UTP layer functionality
of the UFS controller
● UniPro controller (third party) – Responsible for UIC layer functionality of
the UFS controller

The following figure shows the UFS system-level block diagram.

Figure : UFS system-level block diagram


The UFS host controller has the following interfaces:

● AHB interface – Used for software access and for programming and
initializing the controller; it connects as an AHB slave on the config NOC of
the chip operating at 75 MHz.
● AXI master interface – Used for data transfer to and from the system
memory; it connects as an AXI master on the system NOC operating at
200 MHz.

The UFS host controller performs the following actions:

● Receives clocks from the global clock controller (GCC) of the chip.
● Maintains a dedicated interrupt output line that is connected to the apps QGIC.
● Connects to M-PHY via the standard Reference M-PHY Module Interface
(RMMI). M-PHY has a differential serial interface to the UFS device, which
works as a dual simplex Tx/Rx interface with two lanes in each direction.

UFS controller wrapper


The UTP controller implements a standard Host Controller Interface (HCI) for
hardware and software. For outbound transactions, the UTP controller:

● Receives commands from the software
● Generates outbound UPIU transactions
● Sends them to the UniPro controller via the CPort
For inbound transactions, the UTP controller:

● Receives inbound UPIU transactions from UniPro controller via the CPort
● Moves the received data to buffers in the system memory specified
through HCI

Figure : UFS controller wrapper


Operations and run‑time registers


Register | Description
Interrupt status (IS) | The IS register indicates pending interrupts that require service. It is symmetric to the IE register.
Interrupt enable (IE) | The IE register enables or disables the reporting of a corresponding interrupt to the host software. Each bit specifies an interrupt except for a few that are reserved. When a bit is set to 1 and the corresponding interrupt condition is active, an interrupt is generated. The IE register is symmetric to the IS register.
Host controller status (HCS) | The HCS register indicates the status of the host controller:
    ● Reports the error code for UTP layer errors.
    ● Indicates whether the host controller is ready to process TMRs (bit 2) and TRs (bit 1). These fields are cleared to 0 by the host controller if device presence is not detected or an error is detected within the host controller or device.
    ● Indicates whether a device is present. Bit 0 is set to 1 when a UFS device is attached to the controller and cleared to 0 when no UFS device is attached.
Host controller enable (HCE) | When bit 0 is set to 1 by the software, the host controller is enabled. This causes the host controller and UniPro controller to reset. After the host controller initialization process is completed, the host controller sets the HCE register to 1. When it is set to 0, the host controller shuts down the UIC layer (UniPro and PHY) and the attached device before disabling itself.
UTP TR register | The UTP TR list is an array that manages and contains up to 32 UTRDs.
UTP TMR register | The UTP TMR list is an array that manages and contains up to 8 UTMRDs.
List base address (UTRLBA/UTRLBAU/UTMRLBA/UTMRLBAU) |
    ● UTRLBA – Indicates the lower 32 bits of the base physical address for the UTP TR list.
    ● UTRLBAU – Indicates the upper 32 bits of the TR list base address; this base address is used when fetching commands for execution.
    ● UTMRLBA – Indicates the lower 32 bits of the base physical address for the UTP TMR list.
    ● UTMRLBAU – Indicates the upper 32 bits of the TMR list base address; this base address is used when fetching commands for execution.
List clear (UTRLCLR/UTMRLCLR) | Each bit in this register corresponds to a slot in the UTP TR or UTP TMR list, where bit 0 corresponds to request slot 0. A bit in this field is set to 0 by the host software to indicate to the host controller that a TR/TMR slot is cleared. The host controller frees up any resources associated with the request slot and sets the associated bit in UTRLDBR/UTMRLDBR to 0.
Run stop (UTRLRSR/UTMRLRSR) | When bit 0 is set to 1, the host controller processes this list. It continues processing the list until bit 0 is set to 0, at which point the host controller completes all the outstanding TR/TMRs in the list and then stops.
Doorbell register (UTRLDBR/UTMRLDBR) | Each bit in this register corresponds to a slot in the UTP TR or UTP TMR list, where bit 0 corresponds to request slot 0. A bit in this field is set to 1 by the host software to indicate to the host controller that a TR/TMR has been built in system memory for the associated TR/TMR slot and is ready for execution. When a TR/TMR is completed, with success or error, the corresponding bit is cleared to 0 by the host controller. The host controller always processes TR/TMRs in the order in which each request is submitted to the list. However, the execution of commands can be completed out of order.
HCI TR FIFO | The TR FIFO is logic that ensures that TRs are executed in a first-in-first-out manner. Each time a doorbell register bit is set, the bit index is recorded in the TR FIFO to maintain the sequence in which these bits are written.
HCI TMR FIFO | The TMR FIFO is logic that ensures that TMRs are executed in a first-in-first-out manner. Each time a doorbell register bit is set, the bit index is recorded in the TMR FIFO to maintain the sequence in which these bits are written.
IAG | The host controller supports interrupt aggregation (IAG), a process in which a single command completion interrupt is generated for a predefined number of command completions. IAG is only supported for the TR list; it is not supported for the TMR list. IAG is performed via the UTP transfer request interrupt aggregation control register (UTRIACR) using the following bits:
    ● Bit 31 – Interrupt aggregation enable/disable bit (IAEN); the interrupt aggregation mechanism is enabled or disabled based on this bit.
    ● Bit 20 – Interrupt aggregation status bit (IASB); this bit indicates to the host software whether any responses have been received and counted towards interrupt aggregation.
    ● Bits 8-12 – Interrupt aggregation counter threshold; these bits are used to configure the number of responses required to generate an interrupt.
    ● Bits 0-7 – Interrupt aggregation timeout value; these bits are used to configure the maximum time allowed between response arrival to the host controller and the generation of an interrupt.

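
The UTRIACR bit layout listed above can be expressed as a handful of masks. The sketch below only covers the four fields described in this document; other bits of the register are not modeled, and the macro and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* UTRIACR fields as listed in the register table. */
#define UTRIACR_IAEN          (1u << 31)   /* bit 31: aggregation enable */
#define UTRIACR_IASB          (1u << 20)   /* bit 20: aggregation status */
#define UTRIACR_IACTH_SHIFT   8            /* bits 8-12: counter threshold */
#define UTRIACR_IACTH_MAX     0x1Fu
#define UTRIACR_IATOVAL_MAX   0xFFu        /* bits 0-7: timeout value */

/* Builds a register value that enables aggregation with the given settings. */
static uint32_t utriacr_encode(uint32_t threshold, uint32_t timeout)
{
    assert(threshold <= UTRIACR_IACTH_MAX);
    assert(timeout <= UTRIACR_IATOVAL_MAX);
    return UTRIACR_IAEN | (threshold << UTRIACR_IACTH_SHIFT) | timeout;
}

/* Returns 1 if at least one response has been counted towards aggregation. */
static int utriacr_has_counted_responses(uint32_t reg)
{
    return (reg & UTRIACR_IASB) != 0;
}
```

With a threshold of 31 and a timeout value of 2, the encoded value is 0x80001F02.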


UFS host interface architecture


The UFS host driver allocates and uses TR descriptors and TMR descriptors to
communicate with the host controller hardware. The host controller can accept
up to 32 TR descriptors and 8 TMR descriptors in system memory simultaneously.

Each TR descriptor points to the command UPIU (sent from SCSI), allocates a
placeholder for the response UPIU, and contains a physical region descriptor
table (PRDT) that points to the data buffer to be transferred for a write
command or to the memory address at which data is received for a read command.

Figure : Host interface architecture diagram


Data structures


The data structures represent the different system memory descriptors that the
device driver uses to communicate with the host controller as defined by the
specification. The corresponding data structures used in the UFSHC driver are
also described.

UTP UTRD interface

The UTP UTRD list comprises 32 UTP UTRDs.

Figure : UTRD interface


struct utp_transfer_req_desc {
    /* DW 0-3 */
    struct request_desc_header header;

    /* DW 4-5 */
    __le32 command_desc_base_addr_lo;
    __le32 command_desc_base_addr_hi;

    /* DW 6 */
    __le16 response_upiu_length;
    __le16 response_upiu_offset;

    /* DW 7 */
    __le16 prd_table_length;
    __le16 prd_table_offset;
};
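
The DW comments in the structure imply a fixed 32-byte layout, which can be checked at compile time. The following sketch is a user-space approximation: it assumes that struct request_desc_header is four 32-bit words (matching "DW 0-3"), uses uint32_t and uint16_t as stand-ins for __le32 and __le16, and adds a hypothetical helper, utrd_list_bytes().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: the header is four 32-bit words, matching "DW 0-3". */
struct request_desc_header {
    uint32_t dword_0;
    uint32_t dword_1;
    uint32_t dword_2;
    uint32_t dword_3;
};

/* uint32_t/uint16_t stand in for the kernel's __le32/__le16 types. */
struct utp_transfer_req_desc {
    struct request_desc_header header;   /* DW 0-3 */
    uint32_t command_desc_base_addr_lo;  /* DW 4 */
    uint32_t command_desc_base_addr_hi;  /* DW 5 */
    uint16_t response_upiu_length;       /* DW 6 */
    uint16_t response_upiu_offset;
    uint16_t prd_table_length;           /* DW 7 */
    uint16_t prd_table_offset;
};

/* Eight double words in total, with each field at the offset its DW implies. */
static_assert(sizeof(struct utp_transfer_req_desc) == 8 * sizeof(uint32_t),
              "UTRD should occupy DW 0-7");
static_assert(offsetof(struct utp_transfer_req_desc, command_desc_base_addr_lo) == 16,
              "command descriptor address should start at DW 4");
static_assert(offsetof(struct utp_transfer_req_desc, response_upiu_length) == 24,
              "response UPIU length should start at DW 6");
static_assert(offsetof(struct utp_transfer_req_desc, prd_table_length) == 28,
              "PRDT length should start at DW 7");

/* Memory needed for a UTRD list with the given number of slots (at most 32). */
static size_t utrd_list_bytes(unsigned int nutrs)
{
    assert(nutrs <= 32);
    return nutrs * sizeof(struct utp_transfer_req_desc);
}
```

Under these assumptions, a full list of 32 UTRDs occupies 1024 bytes.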

UTP command descriptor

A UTP command descriptor contains the command UPIU, space for the response
UPIU, and the PRDT. The UTRD that references it stores the offset and length of
the response UPIU and of the PRDT.

Figure : UTP command descriptor

The TR region of the descriptor provides information on the command, for
example, NOP out or query request. The response region provides space for the
UPIU that the device returns in reply to the command. The PRDT is required only
for SCSI commands that require data transfer. UFSHCI supports three types of
commands: SCSI, native UFS, and device management.

struct utp_transfer_cmd_desc {
    u8 command_upiu[ALIGNED_UPIU_SIZE];
    u8 response_upiu[ALIGNED_UPIU_SIZE];
    struct ufshcd_sg_entry prd_table[SG_ALL];
};

UTP TMR descriptor (UTMRD)


Figure : UTMRD interface

struct utp_task_req_desc {
    /* DW 0-3 */
    struct request_desc_header header;

    /* DW 4-11 */
    __le32 task_req_upiu[TASK_REQ_UPIU_SIZE_DWORDS];

    /* DW 12-19 */
    __le32 task_rsp_upiu[TASK_RSP_UPIU_SIZE_DWORDS];
};

Device tree


The device tree is a data structure that contains properties describing the
hardware. The device tree allows the hardware properties to be referenced
without being hardcoded in the device driver. It makes the code generic and
allows for the use of different hardware without changing drivers. The UFS device
tree is located at kernel/arch/arm/boot/dts/qcom/<target name>.dtsi.

The first hardware described by the UFS device tree is the UFS PHY layer. UFS
PHY nodes describe the UFS PHY hardware macro. To bind the UFS PHY layer
with the UFS host controller, the controller node should contain a phandle
reference to UFS PHY node.

The following is a sample of the UFS properties in the device tree:

ufsphy1: ufsphy@0xfc597000 // Physical address of the UFS PHY hardware interface
    compatible = "qcom,ufsphy";
    reg = <0xfc597000 0x87c>;
    vdda-phy-supply = <&pma8084_l3>; // LDO3 is the power rail for the main PHY supply for the analog domain
    vdda-pll-supply = <&pma8084_l12>; // LDO12 is the power rail for the PHY PLL and Power_Gen block
    vdda-phy-max-microamp = <50000>; // This and vdda-pll-max-microamp specify the maximum load current that can be drawn from the PHY and PLL power supplies
    vdda-pll-max-microamp = <1000>;
    status = "disabled";

UFSHC nodes are defined to describe the on-chip UFS host controller. Each
instance of the controller should have its own node.

ufs1: ufshc@0xfc594000 // Start of the node; the unit address is the physical address of the UFS controller
    ufs-phy = <&ufsphy1>; // Points to the UFS PHY node
    vcc-supply = <&pma8084_l19>;
    vccq-supply = <&pma8084_l4>;
    vccq2-supply = <&pma8084_s4>; // The three properties above describe the power rails that supply vcc, vccq, and vccq2
    vcc-max-microamp = <460000>;
    vccq-max-microamp = <450000>;
    vccq2-max-microamp = <145000>; // Maximum load current that can be drawn from the vcc, vccq, and vccq2 supplies
    clock-names = "core_clk_src", "core_clk", "bus_clk", "iface_clk", "ref_clk"; // List of clock input name strings
    freq-table-hz = <100000000 200000000>, <0 0>, <0 0>, <0 0>, <0 0>; // Array of <min max> operating frequencies
    qcom,msm-bus,name = "ufs1";
    qcom,msm-bus,num-cases = <22>;
    qcom,msm-bus,num-paths = <2>;
    qcom,msm-bus,vectors-KBps =
        <95 512 0 0>, <1 650 0 0>, /* No vote */
        <95 512 922 0>, <1 650 1000 0>, /* PWM G1 */
        <95 512 1844 0>, <1 650 1000 0>, /* PWM G2 */
        SNIP
    qcom,bus-vector-names = "MIN",
        "PWM_G1_L1", "PWM_G2_L1", "PWM_G3_L1", "PWM_G4_L1",


UFS driver initialization


UFS driver initialization starts from the platform driver. This driver reads the
device tree, parses the different device elements, allocates memory for the host
bus adaptor (HBA), and then calls driver initialization using ufshcd_init().

The ufshcd_init() API performs the following tasks:

● Initializes clocks and vregs
● Reads the capabilities of the controller
● Allocates memory for various driver data structures
● Initializes the work queue, wait queue, mutexes, and IRQ
● Enables the host controller
● Maps vendor-specific callbacks
● Initializes the UniPro and M-PHY layers

The following figure illustrates the ufshcd_init() call flow.

Figure : ufshcd_init call flow


ufshcd_pltfrm_probe

Initialization of the code starts at ufshcd-pltfrm.c. The ufshcd_pltfrm_probe API
reads the ufs node from the device tree, reads the IRQ number, and then calls
APIs from ufshcd.c and ufs-msm.c to initialize the driver.

ufshcd_alloc_host
The ufshcd_alloc_host API allocates HBA resources. The ufs_hba data structure
defines the register addresses and points to the lower device driver, capabilities,
IRQ, voltage regulator information, and so on.

Field | Description
struct utp_transfer_req_desc *utrdl_base_addr; | UTRD base address
struct utp_transfer_cmd_desc *ucdl_base_addr; | UFS command descriptor base address
struct utp_task_req_desc *utmrdl_base_addr; | UTMRD base address
dma_addr_t ucdl_dma_addr; | UFS command descriptor DMA address
dma_addr_t utrdl_dma_addr; | UTRDL DMA address
struct ufshcd_lrb *lrb; | Local reference block
unsigned long lrb_in_use; | Bit flags that show which LRBs are in use
unsigned long outstanding_tasks; | Bits representing outstanding task requests
unsigned long outstanding_reqs; | Bits representing outstanding TRs
u32 capabilities; | Controller capabilities
int nutrs; | TR queue depth supported by controller
u32 ufs_version; | UFS version
struct ufs_hba_variant_ops *vops; | Pointer to variant-specific operations
void *priv; | Pointer to variant-specific private data
unsigned int irq; | IRQ number of the controller
bool is_irq_enabled; | Flag that indicates whether the IRQ is enabled or disabled
struct uic_command *active_uic_cmd; | Handle of active UIC command
u32 ufshcd_state; | Current state of the UFSHCD; can be operational, reset, or error
u32 eh_flags; | Error handling flags
u32 intr_mask; | Interrupt mask bits
u16 ee_ctrl_mask; | Exception event control mask
bool is_powered; | Flag to check if the HBA is powered on; set after clocks and vregs are initialized
struct work_struct eh_work; | Worker to handle UFS errors that require software attention
struct work_struct eeh_work; | Worker to handle exception events raised by the device
ufshcd_enable_auto_bkops() | Tracks whether bkops is enabled in the device
bool auto_bkops_enabled; | Flag to determine if bkops is enabled or disabled
struct ufs_dev_cmd dev_cmd; | Device management request data; can be of type NOP or query

get_variant_ops

get_variant_ops is a function that maps low-level functions to the UFSHCD
and returns a structure of type ufs_hba_variant_ops. The ufs_hba_variant_ops
pointers are QTI-specific function pointers defined in ufs-msm.c.

struct ufs_hba_variant_ops {
    const char *name;
    int (*init)(struct ufs_hba *);
    void (*exit)(struct ufs_hba *);
    int (*setup_clocks)(struct ufs_hba *, bool);
    int (*setup_regulators)(struct ufs_hba *, bool);
    int (*hce_enable_notify)(struct ufs_hba *, bool);
    int (*link_startup_notify)(struct ufs_hba *, bool);
    int (*pwr_change_notify)(struct ufs_hba *, bool,
                             struct ufs_pa_layer_attr *,
                             struct ufs_pa_layer_attr *);
    int (*suspend)(struct ufs_hba *, enum ufs_pm_op);
    int (*resume)(struct ufs_hba *, enum ufs_pm_op);
};
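
A variant driver fills such an operations table with its own callbacks, and the core driver calls through it. The sketch below shows the pattern with a reduced, hypothetical table: demo_variant_ops, the stub callbacks, and the wrapper are illustrative names and are not part of ufshcd.c or ufs-msm.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct ufs_hba; /* opaque in this sketch */

/* Reduced, hypothetical version of the ops table with two callbacks. */
struct demo_variant_ops {
    const char *name;
    int (*init)(struct ufs_hba *);
    int (*setup_clocks)(struct ufs_hba *, bool);
};

static bool demo_clocks_on; /* records what the stub callback was asked to do */

static int demo_init(struct ufs_hba *hba)
{
    (void)hba;
    return 0;
}

static int demo_setup_clocks(struct ufs_hba *hba, bool on)
{
    (void)hba;
    demo_clocks_on = on;
    return 0;
}

/* The variant driver publishes its callbacks through one constant table. */
static const struct demo_variant_ops demo_vops = {
    .name = "demo",
    .init = demo_init,
    .setup_clocks = demo_setup_clocks,
};

/* Core-side wrapper: a missing callback is treated as "nothing to do". */
static int demo_vops_setup_clocks(const struct demo_variant_ops *vops,
                                  struct ufs_hba *hba, bool on)
{
    assert(vops != NULL);
    if (vops->setup_clocks == NULL)
        return 0;
    return vops->setup_clocks(hba, on);
}
```

Checking each pointer before the call lets a variant leave optional callbacks unset without the core driver needing to know which ones are implemented.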

ufshcd_parse_clock_info
The ufshcd_parse_clock_info API reads the device tree for a list of possible
clocks and selects the name and value of the highest clock frequency.

ufshcd_parse_vcc_info

The ufshcd_parse_vcc_info API reads information from the device tree for VCC,
VCCQ, and VCCQ2 and sets it into the driver.

ufshcd_init

This is the driver initialization routine. It calls ufshcd_hba_init() to initialize
clocks and voltage regulators and then calls msm_ufs_init(), the init API of the
variant driver.

msm_ufs_init calls the following:

● msm_ufs_bus_register – Registers the bus
● msm_ufs_phy_enable_vreg – Enables the voltage regulator of the PHY
layer
● msm_ufs_enable_phy_ref_clk – Enables the reference clock for the M-PHY
layer

The following figure shows the TR and TMR descriptor lists that are created by
the initialization process.
Figure : TR and TMR descriptor lists

The following figure shows how each TR descriptor points to the command UPIU
sent from SCSI.

Figure : TR descriptor pointers


Queue commands


The SCSI host passes the command to the ufshcd_queuecommand() API. A local
reference block (LRB) structure is assembled from the cmd argument, a UPIU
structure is composed, and the driver rings the doorbell so that the host
controller executes the request.

The following figure illustrates the ufshcd_queuecommand() call flow.

Figure : ufshcd_queuecommand call flow


ufshcd_queuecommand()

The ufshcd_queuecommand() API is the main entry point for all SCSI
requests.

ufshcd_compose_upiu

The ufshcd_compose_upiu API takes the HBA as its first argument and the
configured LRB as its second argument. If the cmd type is SCSI, it prepares the:

● Request descriptor header
● UPIU SCSI command

If the command is from the device manager, it prepares the:

● Request descriptor header
● UPIU query request or UPIU NOP request

The following figure illustrates the ufshcd_compose_upiu process.

Figure : ufshcd_compose_upiu

ufshcd_map_sg

The ufshcd_map_sg API maps the scatter-gather list of the data buffer to the
PRDT.

Figure : ufshcd_map_sg

ufshcd_send_command

The ufshcd_send_command API rings the doorbell by setting the bit for the
request slot in the doorbell register.
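
The slot bookkeeping behind the doorbell can be modeled in plain C. This is a simplified sketch, not the driver code: struct demo_hba and both functions are hypothetical, and the doorbell register is represented by an ordinary variable rather than a memory-mapped register.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_NUTRS 32u /* TR list depth described in this document */

struct demo_hba {
    uint32_t utrldbr;          /* stand-in for the TR doorbell register */
    uint32_t outstanding_reqs; /* driver-side bitmap of issued TRs */
};

/* Returns the lowest slot with no outstanding request, or -1 if all are busy. */
static int demo_find_free_slot(const struct demo_hba *hba)
{
    for (unsigned int slot = 0; slot < DEMO_NUTRS; slot++) {
        if ((hba->outstanding_reqs & (1u << slot)) == 0)
            return (int)slot;
    }
    return -1;
}

/* Marks the slot as outstanding and sets its doorbell bit. */
static void demo_send_command(struct demo_hba *hba, unsigned int slot)
{
    assert(slot < DEMO_NUTRS);
    hba->outstanding_reqs |= 1u << slot;
    hba->utrldbr |= 1u << slot;
}
```

Keeping a driver-side bitmap next to the doorbell value is what later allows the completion path to tell which requests the controller has finished.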


UFS main IRQ


Figure : UFS main IRQ

ufshcd_intr

The ufshcd_intr API is the main interrupt handler. It reads the interrupt status
register and, if any interrupt bit is set, calls the interrupt status handler.
ufshcd_sl_intr

The ufshcd_sl_intr API is called to service the interrupt routine. The interrupt
status bits are passed from ufshcd_intr() as an arg and the status is checked for
any errors. If errors are found, the driver calls error handler
ufshcd_check_errors(). If the TR is completed, the driver calls
ufshcd_transfer_req_compl().
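
The dispatch decision made by the interrupt status handler can be sketched as a pure function of the status bits. The bit positions below are not given in this document and are assumptions based on the UFSHCI specification; the macro, enum, and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Interrupt status bits; positions are assumptions based on the UFSHCI specification. */
#define IS_UTRCS  (1u << 0)   /* UTP transfer request completion */
#define IS_UE     (1u << 2)   /* UIC error */
#define IS_UTMRCS (1u << 9)   /* UTP task management request completion */
#define IS_DFES   (1u << 11)  /* device fatal error */
#define IS_HCFES  (1u << 16)  /* host controller fatal error */
#define IS_SBFES  (1u << 17)  /* system bus fatal error */
#define IS_ERROR_MASK (IS_UE | IS_DFES | IS_HCFES | IS_SBFES)

/* Error bits and completion bits must not overlap. */
static_assert((IS_ERROR_MASK & (IS_UTRCS | IS_UTMRCS)) == 0,
              "error and completion bits should be disjoint");

enum sl_intr_action {
    ACT_CHECK_ERRORS   = 1 << 0, /* would call ufshcd_check_errors() */
    ACT_TRANSFER_COMPL = 1 << 1, /* would call ufshcd_transfer_req_compl() */
    ACT_TASK_COMPL     = 1 << 2  /* would call the task completion handler */
};

/* Returns the set of handlers that the given status bits would trigger. */
static unsigned int sl_intr_actions(uint32_t intr_status)
{
    unsigned int actions = 0;

    if (intr_status & IS_ERROR_MASK)
        actions |= ACT_CHECK_ERRORS;
    if (intr_status & IS_UTMRCS)
        actions |= ACT_TASK_COMPL;
    if (intr_status & IS_UTRCS)
        actions |= ACT_TRANSFER_COMPL;
    return actions;
}
```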

ufshcd_transfer_req_compl()

The ufshcd_transfer_req_compl() API resets the interrupt aggregation register
and reads the doorbell register. For each command that was sent and has
completed, it reads the status, unmaps the DMA for that command, and passes the
status to the SCSI layer.

If the status indicates an error, the sense data is read from the UPIU response
packet and passed to the SCSI layer, and the error handler is invoked.
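
The way completed commands can be derived from the doorbell value is sketched below. It is a simplified user-space model rather than the driver source: the function names are hypothetical, and it assumes that the controller clears a command's doorbell bit when that command completes, so any bit that is still outstanding on the driver side but already clear in the doorbell marks a completion.

```c
#include <assert.h>
#include <stdint.h>

/* Bits that are outstanding in the driver but cleared in the doorbell are completed. */
static uint32_t model_completed_reqs(uint32_t outstanding_reqs, uint32_t doorbell)
{
    /* The controller only clears bits the driver set, so doorbell is a subset. */
    assert((doorbell & ~outstanding_reqs) == 0);
    return outstanding_reqs ^ doorbell;
}

/*
 * Visit each completed tag, lowest first, and clear it from the outstanding
 * bitmap. Returns the number of completions handled. `complete` may be 0.
 */
static unsigned int model_transfer_req_compl(uint32_t *outstanding_reqs,
                                             uint32_t doorbell,
                                             void (*complete)(unsigned int tag))
{
    uint32_t completed = model_completed_reqs(*outstanding_reqs, doorbell);
    unsigned int handled = 0;

    for (unsigned int tag = 0; tag < 32; tag++) {
        if (completed & (UINT32_C(1) << tag)) {
            if (complete)
                complete(tag);  /* real driver: read status, unmap DMA, notify SCSI */
            *outstanding_reqs &= ~(UINT32_C(1) << tag);
            handled++;
        }
    }
    return handled;
}
```

With the values of the scsi_cmpl entry (5) in the Debugging section (doorbell = 0x0, outstanding_reqs = 0x1), the completed mask is 0x1, that is, tag 0.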
UFS task management


UFS task management handles TMRs issued by the application.

Figure : UFS task management


ufshcd_issue_tm_cmd

The ufshcd_issue_tm_cmd API issues task management commands to the
controller. It finds a free slot in the TMR list, configures the task management
UPIU, and rings the doorbell by setting the bit for that slot in the task
management doorbell register.
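
Finding a free slot can be sketched as a search for the lowest clear bit in a bitmap of slots in use. The helper below is a hypothetical user-space model, not the driver function; the bitmap representation and the name model_find_free_tm_slot are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_NO_FREE_SLOT (-1)

/*
 * Return the lowest free slot index in a TMR list with `nslots` entries,
 * or MODEL_NO_FREE_SLOT when every slot is in use.
 */
static int model_find_free_tm_slot(uint32_t slots_in_use, unsigned int nslots)
{
    assert(nslots >= 1 && nslots <= 32);

    for (unsigned int slot = 0; slot < nslots; slot++) {
        if (!(slots_in_use & (UINT32_C(1) << slot)))
            return (int)slot;
    }
    return MODEL_NO_FREE_SLOT;
}
```

For an 8-entry list with slots 0 to 2 busy (bitmap 0x07), the helper returns slot 3. When all eight slots are busy, the caller has to wait for a completion before it can issue another request.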

ufshcd_task_req_compl

The controller triggers an interrupt when it completes a task. The
ufshcd_task_req_compl API then checks the OCS and, if it indicates success,
reads the response from the response UPIU and passes it to the SCSI layer.
Driver error handling mechanism


The host controller driver has two work queues that execute two different types of
error handling.

● ufshcd_err_handler – Called when the controller encounters system-level
errors such as a UIC error or system bus error
● ufshcd_exception_event_handler – Called to service device exceptions

​ ufshcd_err_handler()


​ Figure : System exception flow




​ ufshcd_transfer_req_compl
The ufshcd_transfer_req_compl API reads the TR doorbell register and, for
each request completed, reads the transfer status of that request and
passes the information to the SCSI layer. It also unmaps the DMA memory
used for the command.

​ ufshcd_tmc_handler
The ufshcd_tmc_handler API invokes ufshcd_task_req_compl to read the
overall command status (OCS). If the OCS indicates success, it passes the
response value to the SCSI layer.

Host controller reset

If there are any errors in the UIC layer, or in any of the TMRs or TRs, the
host controller reset is called.

​ ufshcd_reset_and_restore
​ The ufshcd_reset_and_restore API resets the host controller and the UIC
layer if the controller encounters any errors. This API also resets the UFS
card for MSM8998 and later chipsets.
​ After the reset, the status of the controller is restored and it gets initialized
for new commands.

​ ufshcd_exception_event_handler


The ufshcd_exception_event_handler API handles exceptions raised by
the device.
​ Figure : Exception handler


​ ufshcd_get_ee_status
The ufshcd_get_ee_status API sends a device command query to read the
device exception event status.

​ ufshcd_bkops_ctrl
​ If the received flag has the URGENT BKOPS request set, the API calls
ufshcd_bkops_ctrl to handle the error request. It sends the query to read
the BKOPS status flag.

​ ufshcd_enable_auto_bkops
​ If the value of the status is equal to or more than PERF_IMPACT,
ufshcd_enable_auto_bkops is called. This allows UFS to execute BKOPS
when needed.

​ ufshcd_disable_auto_bkops
​ If the value is less than PERF_IMPACT, ufshcd_disable_auto_bkops is
called to disable auto BKOPS. The host then signals the device to execute
BKOPS.
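
The threshold comparison described for ufshcd_enable_auto_bkops and ufshcd_disable_auto_bkops can be sketched as follows. This is a hypothetical model: the enum names are invented for illustration, and the numeric levels are assumed to follow the background operations status encoding of the UFS specification (0 = none required, 1 = non-critical, 2 = performance impacted, 3 = critical). Verify them against the driver on your target.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed BKOPS status levels; names are hypothetical. */
enum model_bkops_status {
    MODEL_BKOPS_NO_OP = 0,
    MODEL_BKOPS_NON_CRITICAL = 1,
    MODEL_BKOPS_PERF_IMPACT = 2,
    MODEL_BKOPS_CRITICAL = 3,
};

/* True when auto BKOPS should be enabled, false when it should be disabled. */
static bool model_should_enable_auto_bkops(int status)
{
    assert(status >= MODEL_BKOPS_NO_OP && status <= MODEL_BKOPS_CRITICAL);
    return status >= MODEL_BKOPS_PERF_IMPACT;
}
```

In this model, a status of PERF_IMPACT or CRITICAL enables auto BKOPS, and a status of NO_OP or NON_CRITICAL disables it.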

Two-lane UFS host controller in MSM8998


Starting from MSM8998, the UFS host controller is configured with two Tx and
two Rx lanes. The motivation for the two-lane configuration is to achieve higher
Rx throughput.

For example, Seq Read for MSM8996 with one lane is approximately 530 MB/s,
and Seq Read for MSM8998 with two lanes is approximately 880 MB/s,
achieving an increase of approximately 66% in Rx throughput.

Configure the target to either one lane Rx, one lane Tx, or one lane for both Rx
and Tx by changing the flags in the kernel/msm-4.4/drivers/scsi/ufs/ufs-qcom.h
file. In the following example, the lane setting is changed from 2 to 1:

Default flags for the two-lane configuration:

    #define UFS_QCOM_LIMIT_NUM_LANES_RX 2
    #define UFS_QCOM_LIMIT_NUM_LANES_TX 2

Flags for the one-lane Rx and one-lane Tx configuration:

    #define UFS_QCOM_LIMIT_NUM_LANES_RX 1
    #define UFS_QCOM_LIMIT_NUM_LANES_TX 1

Auto hibern8 (AH8)

AH8 is a hardware feature where the controller sets the state of the link between
the host and device to hibern8. AH8 is more aggressive than the hibern8 issued
by the driver, which improves overall power optimization once AH8 is enabled.

Although AH8 is a hardware feature, the driver sets the AH8 time. It does so in
the
kernel_msm-4.4/kernel/msm-4.4/drivers/scsi/ufs/ufshcd.c
file in the following API:

    ufshcd_init_hibern8_on_idle()
        hba->hibern8_on_idle.delay_ms = 1

Change the AH8 time by changing the delay_ms value.

UFS-RESET

A new register, TLMM_UFS_RESET, allows the software to assert/deassert the
RESET_n signal of the UFS device. If the UFS device is reset using this register
while the MSM™ chipset remains powered on, it may result in disabling write
protect (WP) features configured for LUN when the target is booted up.

UFS clock scaling


The UFS device performs clock scaling depending on the UFS data load. The
UFS scales down to 50 MHz and scales up to 400 MHz. Clock scaling is
performed to achieve power optimization while ensuring the performance impact
is minimal.

The UFS driver (ufshcd.c) registers with kernel devfreq as follows:

● static struct devfreq_simple_ondemand_data ufshcd_ondemand_data =
  {
      .upthreshold = 70,
      .downdifferential = 65,
      .simple_scaling = 1,
  };

● static int ufshcd_devfreq_init(struct ufs_hba *hba)
  {
      /* excerpt: only the devfreq profile setup is shown */
      scaling->profile.polling_ms = 60;
      scaling->profile.target = ufshcd_devfreq_target;
      scaling->profile.get_dev_status = ufshcd_devfreq_get_dev_status;
  }

ufshcd_devfreq_get_dev_status() is executed every 60 ms and starts a
sampling window.

scaling->window_start_t // start of sampling window

If UFS operations fall within the sampling window, the API measures the time
taken to complete those operations.

scaling->tot_busy_t // total time the UFS driver was performing operation(s)

Based on these two variables and the ufshcd_ondemand_data structure, the
kernel devfreq determines whether the UFS frequency must be scaled up, scaled
down, or left unchanged; for example:

● If scaling->tot_busy_t is equal to or more than (upthreshold =) 70% of
the sampling window that started at scaling->window_start_t, devfreq scales
up the UFS frequency.
● If scaling->tot_busy_t is less than (upthreshold – downdifferential =) 5% of
the sampling window, devfreq scales down the UFS frequency.
● If scaling->tot_busy_t is less than (upthreshold =) 70%, but at least
(upthreshold – downdifferential =) 5% of the sampling window, devfreq
does not change the frequency.
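
The three cases can be condensed into one decision function. The sketch below is a user-space model of the rule as stated in this section, not the devfreq governor source: the names are hypothetical, the window length is assumed to be the time elapsed since scaling->window_start_t, and integer arithmetic is used to avoid floating point.

```c
#include <assert.h>
#include <stdint.h>

enum model_scale_decision {
    MODEL_SCALE_DOWN = -1,
    MODEL_SCALE_NONE = 0,
    MODEL_SCALE_UP = 1,
};

/*
 * busy_us:   total busy time inside the window (models scaling->tot_busy_t)
 * window_us: length of the sampling window
 * upthreshold, downdifferential: percentages, as in ufshcd_ondemand_data
 */
static enum model_scale_decision
model_scale_decision(uint64_t busy_us, uint64_t window_us,
                     unsigned int upthreshold, unsigned int downdifferential)
{
    assert(window_us > 0 && busy_us <= window_us);
    assert(downdifferential <= upthreshold && upthreshold <= 100);

    /* busy/window >= upthreshold%  ->  scale up */
    if (busy_us * 100 >= window_us * upthreshold)
        return MODEL_SCALE_UP;

    /* busy/window < (upthreshold - downdifferential)%  ->  scale down */
    if (busy_us * 100 < window_us * (upthreshold - downdifferential))
        return MODEL_SCALE_DOWN;

    return MODEL_SCALE_NONE;
}
```

With a 60 ms window and the values 70 and 65, 45 ms of busy time (75%) scales up, 2 ms (about 3%) scales down, and 30 ms (50%) leaves the frequency unchanged.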

If the kernel devfreq decides to change the frequency, it calls the routine
ufshcd_devfreq_target(), which performs some sanity checks before calling
ufshcd_devfreq_scale() to change the frequency and UFS gear.

UFS clock gating

If there is no activity for a given period of time, the UFS driver gates its clocks.
The clock gating time period differs with the clock frequency, for example:

● Time period = 10 ms when clocks are scaled to high frequency
● Time period = 50 ms when clocks are scaled to low frequency

The two main APIs that affect gating are:

● ufshcd_gate_work() – Clocks are gated only when the link is in the hibern8
state. This API first ensures that the link is in the hibern8 state, then
gates the clocks, and then puts the host in low power mode.
● ufshcd_ungate_work() – The ungating sequence is the opposite of the gating
sequence. First the host exits low power mode, then the clocks are
ungated, and finally the link exits the hibern8 state.

Debugging

To log a history of commands and events along with their timestamp, execute
the following command from a Trace32 simulator on RAM dumps:

v.v ufs_qcom_hosts[0]->hba->cmd_log.entries[]

This debug feature enables logging UFS commands in RAM and is useful in
analyzing a sequence of commands leading to a failure.

The following is the output of one of several commands captured by the
debug patch:

scsi: send: seq_no=46747 lun=0x0 cmd_id=0x2a lba=0x1414070
txfer_len=4096 tag=2, doorbell=0x24 outstanding=0x24 idn=0
time=41710801 us

The debug patch logs the following information when executing a
command:

● cmd type – Type of command, for example, scsi, query, nop, and so on
● string – Event, for example, send or complete
● sequence number – Number of commands executed before the current command
● lun – Logical unit for this particular command
● cmd_id – Command opcode
● lba – Logical block address
● txfer_len – Size of the transaction for a command, in bytes
● tag – Slot assigned to the command, which selects its bit in the doorbell
register
● doorbell – Bits of the doorbell register that are set when executing the
command
● idn – Used only for query commands; a value that indicates the type of data
● outstanding reqs – Bitmask of the transfer requests that are outstanding
● tstamp – Timestamp

The following entries are displayed in a Trace32 simulator, which prints 20
different log entries. Each log entry can be identified by its str, cmd_type,
and cmd_id. The timestamp records when the entry was logged, so the difference
between the send and completion timestamps of a command gives the time the
command took to complete.
1. (str = 0xFFFFFFA0A8E53D97 -> "dme_cmpl_2", cmd_type =
0xFFFFFFA0A8ECCAAC -> "dme", lun = 0x0, cmd_id = 0x17, lba = 0x0,
transfer_len = 0x0, idn = 0x0, doorbell = 0x0, outstanding_reqs = 0x0,
seq_num = 0xCB20, tag = 0x0, tstamp = 0x0000001B8EB168B2),
2. (str = 0xFFFFFFA0A8E53C3C -> "dme_send", cmd_type =
0xFFFFFFA0A8ECCAAC -> "dme", lun = 0x0, cmd_id = 0x18, lba = 0x0,
transfer_len = 0x0, idn = 0x0, doorbell = 0x0, outstanding_reqs = 0x0,
seq_num = 0xCB21, tag = 0x0, tstamp = 0x0000001B8F9FEA3B),
3. (str = 0xFFFFFFA0A8E53D97 -> "dme_cmpl_2", cmd_type =
0xFFFFFFA0A8ECCAAC -> "dme", lun = 0x0, cmd_id = 0x18, lba = 0x0,
transfer_len = 0x0, idn = 0x0, doorbell = 0x0, outstanding_reqs = 0x0,
seq_num = 0xCB22, tag = 0x0, tstamp = 0x0000001B8FA4570B),
4. (str = 0xFFFFFFA0A8E53A38 -> "scsi_send", cmd_type =
0xFFFFFFA0A8E50F61 -> "scsi", lun = 0x0, cmd_id = 0x2A, lba =
0x01A46458, transfer_len = 0x1000, idn = 0x0, doorbell = 0x1,
outstanding_reqs = 0x1, seq_num = 0xCB23, tag = 0x0, tstamp =
0x0000001B8FA59230),
5. (str = 0xFFFFFFA0A8E5448B -> "scsi_cmpl", cmd_type =
0xFFFFFFA0A8E50F61 -> "scsi", lun = 0x0, cmd_id = 0x2A, lba =
0x01A46458, transfer_len = 0x1000, idn = 0x0, doorbell = 0x0,
outstanding_reqs = 0x1, seq_num = 0xCB24, tag = 0x0, tstamp =
0x0000001B8FA6D33B)

To change the display format of individual fields in a log entry, select the
value for str or cmd_type, right-click, and change the format to string. This
shows what the str or cmd_type is.

For example, in (1), the str is “dme_cmpl_2”, the cmd_type is “dme”, and the
cmd_id is 0x17. For dme entries, the cmd_id is a UIC command opcode rather than
a SCSI opcode; in the upstream Linux ufshci.h, 0x17 and 0x18 are the DME
hibernate enter and hibernate exit commands. For scsi entries, the cmd_id
values can be looked up in scsi_proto.h to understand the type of command. The
control plane communicates with the UniPro transport layer via device
management entity service primitives, and the str shows that entry (1) is a
dme command completion.

The sequence number gives the order in which the commands were executed; its
value is incremented in each subsequent entry. Some commands do not access LUNs
or LBAs, so in (1), (2), and (3), the values for LUN and LBA remain zero.
Entries (4) and (5) are the scsi_send and scsi_cmpl entries of a command with
cmd_id 0x2A, which stands for the write_10 operation. Here the logical block
address has a value, which is the LBA at which the write operation starts.
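
When reading many scsi entries, a small lookup table saves repeated trips to scsi_proto.h. The sketch below is a hypothetical helper, not part of the debug patch: the table is deliberately partial, the function name is invented, and the opcode values listed are the standard SCSI ones. It applies to scsi entries only, not to dme or query entries.

```c
#include <assert.h>
#include <stddef.h>

struct model_opcode_name {
    unsigned int opcode;
    const char *name;
};

/* Partial list of standard SCSI opcodes that commonly appear in UFS logs. */
static const struct model_opcode_name model_scsi_opcodes[] = {
    { 0x00, "TEST_UNIT_READY" },
    { 0x03, "REQUEST_SENSE" },
    { 0x12, "INQUIRY" },
    { 0x1b, "START_STOP" },
    { 0x28, "READ_10" },
    { 0x2a, "WRITE_10" },
    { 0x35, "SYNCHRONIZE_CACHE" },
    { 0x42, "UNMAP" },
};

/* Map the cmd_id of a scsi log entry to a name, or "UNKNOWN" if not listed. */
static const char *model_scsi_opcode_name(unsigned int cmd_id)
{
    size_t count = sizeof(model_scsi_opcodes) / sizeof(model_scsi_opcodes[0]);

    assert(cmd_id <= 0xff);  /* a SCSI opcode is a single byte */

    for (size_t i = 0; i < count; i++) {
        if (model_scsi_opcodes[i].opcode == cmd_id)
            return model_scsi_opcodes[i].name;
    }
    return "UNKNOWN";
}
```

For entries (4) and (5) above, a cmd_id of 0x2A resolves to WRITE_10.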
WriteBooster


Triple-level-cell flash NAND (TLC NAND) stores up to 3 bits per cell. TLC NAND
bits are logically defined, which requires complicated programming and has a
higher probability of needing error correction. Single-level-cell NAND (SLC
NAND) stores 1 bit per cell.

To improve write performance, a WriteBooster buffer is used, where part of the
TLC NAND storage is configured as SLC NAND, either temporarily or
permanently. Using SLC NAND as a WriteBooster buffer ensures that
write requests are processed with lower latency, which also improves overall
write performance. In TLC NAND, some portions allocated as user space
are taken up as the WriteBooster buffer. The data written into the WriteBooster
buffer can be pushed into TLC NAND storage either by explicit host command or
implicitly while in the Hibernate state.
Figure : WriteBooster callflow

The WriteBooster feature is supported only on UFS 3.1 devices. It is enabled
only when the clocks are scaled up.

ufshcd_wb_config checks whether the WriteBooster feature is supported, using
the ufshcd_wb_sup API. Bit[8] of dExtendedUFSFeaturesSupport indicates whether
the device supports the WriteBooster feature. ufshcd_wb_config also configures
the LUN, which can be configured in two ways:

● LU dedicated buffer
● Shared buffer

ufshcd_wb_ctrl – Issues a query request that attempts to set a WriteBooster
parameter in a configuration descriptor to a value different from zero; if the
request succeeds, WriteBooster is enabled.

Flushing

When the entire buffer for WriteBooster is consumed, data is written to normal
storage instead of the WriteBooster buffer. The device informs the host when
the WriteBooster buffer is full or nearly full with the exception event
WRITEBOOSTER_FLUSH_NEEDED. The ufshcd_wb_flush_needed event mechanism is
enabled by setting the WRITEBOOSTER_EVENT_EN bit of the
wExceptionEventControl attribute.

There are two methods for flushing data from the WriteBooster buffer to the
normal storage:

● Explicit flush command, ufshcd_wb_buf_flush_enable – The host controller
writes this flag, through the device controller, to enable flushing of the
WriteBooster buffer whenever necessary. The driver queries the
dAvailableWriteBoosterBufferSize attribute and enables the WriteBooster
buffer flush if only 30% of the WriteBooster buffer is available. In reduction
cases, it flushes only if 10% is available.
● Flush during hibernate, ufshcd_wb_toggle_flush_during_h8 – Enables the flush
operation during hibernate. The device initiates a WriteBooster buffer flush
operation whenever the link enters the Hibernate state.
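
The support check and the flush thresholds can be sketched as two small predicates. This is a hypothetical model, not the driver code. It assumes that dAvailableWriteBoosterBufferSize is reported in units of 10% (0 to 10), that "reduction cases" refers to the user-space reduction buffer mode, and that "only 30% available" means 30% or less. All function and macro names are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit[8] of dExtendedUFSFeaturesSupport signals WriteBooster support. */
#define MODEL_WB_SUPPORT_BIT (UINT32_C(1) << 8)

static bool model_wb_supported(uint32_t ext_ufs_features)
{
    return (ext_ufs_features & MODEL_WB_SUPPORT_BIT) != 0;
}

/*
 * avail_tenths: available buffer in units of 10% (assumed encoding, 0..10).
 * user_space_reduction: true for the "reduction" case (assumed meaning).
 * Returns true when an explicit flush should be enabled.
 */
static bool model_wb_flush_needed(uint32_t avail_tenths, bool user_space_reduction)
{
    uint32_t threshold_tenths = user_space_reduction ? 1 : 3;

    assert(avail_tenths <= 10);
    return avail_tenths <= threshold_tenths;
}
```

In this model, a buffer with 30% left triggers a flush in the normal case but not in the reduction case, where the flush waits until only 10% is left.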
Provisioning


Provisioning is an event that creates and defines logical unit numbers (LUNs) on
a new card. The provisioning XML file can be found in
<target_build>\common\config\ufs\provision, where
<target_build> is the customer build ID.

Logical units can be configured in Secure mode using the bProvisioningType
parameter of the Unit Descriptor. The UFS device can be reprovisioned as many
times as needed if bConfigDescrLock="0". When the UFS device is provisioned
with bConfigDescrLock="1", this is the final provision and the device cannot be
reprovisioned.

UFS provisioning layout

HLOS kernel, HLOS filesystem, modem image, MBA, and ESP (EFI system
partition) are stored in LUN0. All peripheral image loader (PIL) images are stored
in the HLOS file system in LUN0. Well-known LUN 0x30 alternates between
LUN1 and LUN2 to provide a fail safe backup for the extended boot loader (XBL).

One-time programmable images (OTP) are stored in LUN3. The rest of the boot
chain is stored in LUN4, using the existing backup GPT mechanism for a fail safe
update.
Customers can configure the last two LUNs, but they must create all partitions at
the same time.

Tools required for provisioning

Qualcomm Flash Image Loader (QFIL) – A QTI tool that supports provisioning
and flashing images on UFS in a factory environment. If using a flashing tool
other than QFIL, the customer must change the block size to 4K to be compatible
with UFS.

For more information on provisioning and types of provisioning, refer to the
JEDEC specification Universal Flash Storage (UFS 3.1), JESD220E.
Customization


The following describes customization commands.

● Enable/disable clock scaling
  ○ To enable clock scaling:
    echo 1 > /sys/bus/platform/devices/1d84000.ufshc/clkscale_enable
  ○ To disable clock scaling:
    echo 0 > /sys/bus/platform/devices/1d84000.ufshc/clkscale_enable
  ○ To view the existing setting:
    cat /sys/bus/platform/devices/1d84000.ufshc/clkscale_enable
● Enable/disable clock gating (enabled by default)
  ○ To enable clock gating:
    echo 1 > /sys/bus/platform/devices/1d84000.ufshc/clkgate_enable
  ○ To disable clock gating:
    echo 0 > /sys/bus/platform/devices/1d84000.ufshc/clkgate_enable
  ○ To view the existing setting:
    cat /sys/bus/platform/devices/1d84000.ufshc/clkgate_enable
● Disable low power mode
    echo 0 > /sys/bus/platform/devices/1d84000.ufshc/hibern8_on_idle_enable
    echo 0 > /sys/bus/platform/devices/1d84000.ufshc/clkscale_enable
    echo 0 > /sys/bus/platform/devices/1d84000.ufshc/clkgate_enable
    echo on > /sys/bus/platform/devices/1d84000.ufshc/power/control
● Change the auto_hibern8 time
  By default, the time is set to 1 ms. The value is read and written in
  microseconds, so 1000 corresponds to 1 ms.
  ○ To view the existing setting:
    cat /sys/bus/platform/devices/1d84000.ufshc/auto_hibern8
    Output displayed on the screen: 1000
  ○ To change the time to 100 ms:
    echo 100000 > /sys/bus/platform/devices/1d84000.ufshc/auto_hibern8
Related documents


Title                                                        Number

Qualcomm Technologies, Inc.
Qualcomm Flash Image Loader (QFIL) User Guide                80-NN120-1

Resources
Universal Flash Storage (UFS) Host Controller Interface      JESD223A
Universal Flash Storage (UFS 1.1)                            JESD220A
Universal Flash Storage (UFS 3.1)                            JESD220E


Acronyms and terms


Acronym or term Definition

AH8 Auto hibern8

CDB Command descriptor block

CPort CPort is a SAP on the UniPro transport layer (L4) within a device that is
used for connection-oriented data transmission

CRB Command request block

CSR Configuration and Status Register Interface


DME Device management entity

EOP End of packet

FIFO First-in-first-out

GCC Global clock controller

HBA Host bus adaptor

HCE Host controller enable

HCS Host controller status

HPQ High priority queue

IAG Interrupt aggregation and generation

LBA List base address

LRB Local reference block


LUN Logical unit number

MIPI Mobile Industry Processor Interface

PLL Phase-locked loop

PRDT Physical region descriptor table

RPMB Replay protected memory block

SAP Service access point

SBC SCSI block commands

SOF Start of packet

SPC SCSI primary commands

TMR Task management request

TR Transfer request
UCS UFS command set

UFS Universal Flash Storage

UIC UFS interconnect

UPIU UFS protocol information unit

UTP UFS transfer protocol

UTRD UFS transfer request descriptor

UTMRD UFS task management request descriptor

UniPro Unified Protocol

WP Write protect
Revision history


Revision  Date         Description

A         May 2014     Initial release
B         July 2017    Added Section 5.3, Logical block provisioning, and
                       Section 10.5, Two-lane UFS host controller in MSM8998;
                       updated Section 10.4.1.4, ufshcd_reset_and_restore
C         July 2018    Editorial updates. No technical content was changed.
D         April 2020   Added WriteBooster section; updated UFS clock scaling,
                       UFS clock gating, and debugging sections
E         August 2023  Added Provisioning and Customization sections; updated
                       WriteBooster call flow
