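[CSAPP] Bomb Lab Notes
This lab is about defusing a binary bomb: put simply, you use your disassembly skills to find the six passwords that unlock the program.
I had heard of the Bomb Lab's reputation long ago, and since I had always considered reverse engineering hard work, I braced myself for a boss fight before starting. Once I got going, it turned out to be easier than I expected.
I think that is largely because the binary was not heavily optimized, so I could recover the source fairly easily and even recognize the data structures at a glance. Had there been even a little code obfuscation, this lab would have become a painful journey.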
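Preliminary tips
1. Debugging assembly with gdb
After some experimenting, I found that the following commands give the best full-screen debugging experience:
gdb -tui ./bomb #start with the text user interface
layout asm #large assembly view
layout regs #large register view
[Screenshot: preview of the resulting layout]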
2. How function arguments map to registers
Operand width (bits) \ Argument number | 1 | 2 | 3 | 4 | 5 | 6 |
---|---|---|---|---|---|---|
64 | %rdi | %rsi | %rdx | %rcx | %r8 | %r9 |
32 | %edi | %esi | %edx | %ecx | %r8d | %r9d |
16 | %di | %si | %dx | %cx | %r8w | %r9w |
8 | %dil | %sil | %dl | %cl | %r8b | %r9b |
3. If you get stuck, check the reference links at the end of this article first.
phase_1
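Run objdump -S --no-show-raw-insn ./bomb > dump.txt
to write the disassembly of the bomb executable into dump.txt, leaving out the redundant raw machine-code bytes.
Looking at the disassembly of main, there is a mov %rax,%rdi
right before the call to phase_1, so my guess is that rdi holds the address of the input string.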
0X400e32: callq 0X40149e <read_line>
0X400e37: mov %rax,%rdi
0X400e3a: callq 0X400ee0 <phase_1>
0X400e3f: callq 0X4015c4 <phase_defused>
Start gdb ./bomb, enter b phase_1 to set a breakpoint at phase_1, and use "testmsg" as the test input.
(gdb) b phase_1
Breakpoint 1 at 0x400ee0
(gdb) r
Starting program: /home/kangyu/Desktop/bomb/bomb
Welcome to my fiendish little bomb. You have 6 phases with
which to blow yourself up. Have a nice day!
testmsg
Use info r to inspect the register state:
(gdb) info r
rax 0x603780 6305664
rbx 0x0 0
rcx 0x7 7
rdx 0x1 1
rsi 0x603780 6305664
rdi 0x603780 6305664
rbp 0x402210 0x402210 <__libc_csu_init>
rsp 0x7fffffffde38 0x7fffffffde38
Note the value in rdi, then run x/s 0x603780 to view the string stored there:
(gdb) x/s 0x603780
0x603780 <input_strings>: "testmsg"
This confirms the guess.
Now look at the disassembly of phase_1:
0X0000000000400ee0 <phase_1>:
0X400ee0: sub $0x8,%rsp
0X400ee4: mov $0x402400,%esi
0X400ee9: callq 0X401338 <strings_not_equal>
0X400eee: test %eax,%eax
0X400ef0: je 0X400ef7 <phase_1+0x17>
0X400ef2: callq 0X40143a <explode_bomb>
0X400ef7: add $0x8,%rsp
0X400efb: retq
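My bold guess is that strings_not_equal checks whether two strings are equal, taking rdi as one argument and esi (the second argument register) as the other. If the two strings match, phase_1 is solved.
Set a breakpoint just before the call to strings_not_equal to see which string esi points to: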
(gdb) b *0x400ee9
Breakpoint 2 at 0x400ee9
(gdb) c
Continuing.
Breakpoint 2, 0x0000000000400ee9 in phase_1 ()
(gdb) x/s 0x402400
0x402400: "Border relations with Canada have never been better."
Note down this string, then restart the bomb and try it:
Welcome to my fiendish little bomb. You have 6 phases with
which to blow yourself up. Have a nice day!
Border relations with Canada have never been better.
Phase 1 defused. How about the next one?
The guess was right.
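The logic of phase_1 can be sketched in C as follows. This is a minimal sketch, not the bomb's real source: `phase_1_ok` is a hypothetical name, and `strings_not_equal` is modeled with `strcmp` on the assumption that it returns 0 for equal strings, as the `test %eax,%eax` / `je` pair suggests.

```c
#include <string.h>

/* Assumed model of the bomb's strings_not_equal: 0 means "equal". */
int strings_not_equal(const char *a, const char *b) {
    return strcmp(a, b) != 0;
}

/* Hypothetical helper: 1 if the line would defuse phase_1, 0 if it would explode. */
int phase_1_ok(const char *input) {
    /* The string found at 0x402400 in the gdb session. */
    const char *secret = "Border relations with Canada have never been better.";
    return !strings_not_equal(input, secret);
}
```

Calling `phase_1_ok("testmsg")` gives 0, while the string read from 0x402400 gives 1.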
phase_2
To make it easier to study, I replaced the jump target addresses with labels.
0X0000000000400efc <phase_2>:
0X400efc: push %rbp
0X400efd: push %rbx
0X400efe: sub $0x28,%rsp
0X400f02: mov %rsp,%rsi
0X400f05: callq 0X40145c <read_six_numbers>
0X400f0a: cmpl $0x1,(%rsp)
0X400f0e: je .L3
0X400f10: callq 0X40143a <explode_bomb>
0X400f15: jmp .L3
.L1
0X400f17: mov -0x4(%rbx),%eax
0X400f1a: add %eax,%eax
0X400f1c: cmp %eax,(%rbx)
0X400f1e: je .L2
0X400f20: callq 0X40143a <explode_bomb>
.L2
0X400f25: add $0x4,%rbx
0X400f29: cmp %rbp,%rbx
0X400f2c: jne .L1
0X400f2e: jmp .L4
.L3
0X400f30: lea 0x4(%rsp),%rbx
0X400f35: lea 0x18(%rsp),%rbp
0X400f3a: jmp .L1
.L4
0X400f3c: add $0x28,%rsp
0X400f40: pop %rbx
0X400f41: pop %rbp
0X400f42: retq
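Seeing read_six_numbers, I guessed right away that the password consists of six digits. I set a breakpoint on the line after read_six_numbers and tried "123456", "12345" and "1234567" as passwords; every one of them blew up before reaching the breakpoint.
So let's analyze the code of read_six_numbers: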
0X000000000040145c <read_six_numbers>:
0X40145c: sub $0x18,%rsp
0X401460: mov %rsi,%rdx
0X401463: lea 0x4(%rsi),%rcx
0X401467: lea 0x14(%rsi),%rax
0X40146b: mov %rax,0x8(%rsp)
0X401470: lea 0x10(%rsi),%rax
0X401474: mov %rax,(%rsp)
0X401478: lea 0xc(%rsi),%r9
0X40147c: lea 0x8(%rsi),%r8
0X401480: mov $0x4025c3,%esi
0X401485: mov $0x0,%eax
0X40148a: callq 0X400bf0 <__isoc99_sscanf@plt>
0X40148f: cmp $0x5,%eax
0X401492: jg .L1
0X401494: callq 0X40143a <explode_bomb>
.L1
0X401499: add $0x18,%rsp
0X40149d: retq
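It calls __isoc99_sscanf, which is just the standard library function. Its prototype is
int sscanf(const char *str, const char *format, ...)
The second argument of sscanf goes in rsi, and the code assigns esi right before the call, so let's inspect that address: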
(gdb) x/s 0x4025c3
0x4025c3: "%d %d %d %d %d %d"
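So the six numbers read by read_six_numbers are separated by spaces!
With a breakpoint at the end of read_six_numbers, I tested "1 2 3 4 5 6". This time nothing exploded.
The parsed values have to be stored somewhere that is easy to read back, so the natural guess is that they are laid out in order on the stack. I set a breakpoint on the instruction after read_six_numbers returns and checked with the distinctive input "1 1 4 5 1 4":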
Breakpoint 2, 0x0000000000400f30 in phase_2 ()
(gdb) info r
rax 0x6 6
rbx 0x0 0
rcx 0x0 0
rdx 0x7fffffffde14 140737488346644
rsi 0x0 0
rdi 0x7fffffffd770 140737488344944
rbp 0x402210 0x402210 <__libc_csu_init>
rsp 0x7fffffffde00 0x7fffffffde00
......
(gdb) x 0x7fffffffde00
0x7fffffffde00: "\001"
(gdb) x 0x7fffffffde04
0x7fffffffde04: "\001"
(gdb) x 0x7fffffffde08
0x7fffffffde08: "\004"
(gdb) x 0x7fffffffde0c
0x7fffffffde0c: "\005"
(gdb) x 0x7fffffffde10
0x7fffffffde10: "\001"
(gdb) x 0x7fffffffde14
0x7fffffffde14: "\004"
(gdb) x 0x7fffffffde18
0x7fffffffde18: "1\aaa@qq.com"
A perfect hit.
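What read_six_numbers does can be sketched in C like this. It is a minimal sketch under the assumption that the function is a thin wrapper around sscanf with the format string found at 0x4025c3; `read_six_numbers_ok` is a hypothetical name, and it returns 0 where the real bomb would call explode_bomb.

```c
#include <stdio.h>

/* Hypothetical model: 1 if six ints were parsed, 0 where the bomb would explode. */
int read_six_numbers_ok(const char *line, int out[6]) {
    int n = sscanf(line, "%d %d %d %d %d %d",
                   &out[0], &out[1], &out[2], &out[3], &out[4], &out[5]);
    return n > 5; /* mirrors: cmp $0x5,%eax ; jg */
}
```

With this model, "123456" parses as a single integer, so sscanf returns 1 and the check fails, which matches the explosions seen earlier.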
A side note: lea 0x4(%rsp),%rbx
and mov 0x4(%rsp),%rbx
do different things. In pseudocode:
rbx=rsp+4 //lea 0x4(%rsp),%rbx computes the address only
rbx=*(rsp+4) //mov 0x4(%rsp),%rbx loads the value stored at that address
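The difference between the two instructions can also be illustrated in C. This is only an analogy: `lea_like` and `mov_like` are hypothetical helpers, and the byte buffer stands in for the stack.

```c
#include <stdint.h>
#include <string.h>

/* lea 0x4(%rsp),%rbx : compute the address rsp+4, with no memory access. */
uintptr_t lea_like(const unsigned char *rsp) {
    return (uintptr_t)(rsp + 4);
}

/* mov 0x4(%rsp),%rbx : load the 8 bytes stored at address rsp+4. */
uint64_t mov_like(const unsigned char *rsp) {
    uint64_t value;
    memcpy(&value, rsp + 4, sizeof value);
    return value;
}
```

`lea_like` returns a pointer-sized number that depends on where the buffer lives, while `mov_like` returns whatever bytes happen to be stored there.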
So a hand-written C++ reconstruction of phase_2 looks like this:
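int sp[6]; //ints, 4 bytes per element
int *bx,*bp,ax;
if(sscanf(input,"%d %d %d %d %d %d",sp,sp+1,sp+2,sp+3,sp+4,sp+5)<=5){
BOOM(); //done inside read_six_numbers
}
if(!(sp[0]==1)){
BOOM();
}
bx=sp+1; //lea 0x4(%rsp),%rbx
bp=sp+6; //lea 0x18(%rsp),%rbp, one past the last element
do{
ax=*(bx-1);
ax+=ax;
if(!(ax==*bx)){
BOOM();
}
bx+=1;
if(bx==bp){
break;
}
}while(1);
return;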
Working through this code gives the password "1 2 4 8 16 32". Tested, OK.
Phase 1 defused. How about the next one?
1 2 4 8 16 32
That's number 2. Keep going!
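The phase_2 check boils down to "first element is 1, each later element is twice the previous one". A compact sketch of that condition, with the hypothetical name `phase_2_ok`:

```c
/* Hypothetical checker: 1 if the six numbers pass phase_2, 0 if the bomb would explode. */
int phase_2_ok(const int a[6]) {
    if (a[0] != 1)
        return 0;                 /* cmpl $0x1,(%rsp) */
    for (int i = 1; i < 6; i++) {
        if (a[i] != 2 * a[i - 1]) /* add %eax,%eax ; cmp %eax,(%rbx) */
            return 0;
    }
    return 1;
}
```

It accepts 1 2 4 8 16 32 and rejects the earlier test input 1 1 4 5 1 4.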
phase_3
First, the code:
0X0000000000400f43 <phase_3>:
0X400f43: sub $0x18,%rsp
0X400f47: lea 0xc(%rsp),%rcx
0X400f4c: lea 0x8(%rsp),%rdx
0X400f51: mov $0x4025cf,%esi
0X400f56: mov $0x0,%eax
0X400f5b: callq 0X400bf0 <__isoc99_sscanf@plt>
0X400f60: cmp $0x1,%eax
0X400f63: jg .L1
0X400f65: callq 0X40143a <explode_bomb>
.L1
0X400f6a: cmpl $0x7,0x8(%rsp)
0X400f6f: ja .L2
0X400f71: mov 0x8(%rsp),%eax
0X400f75: jmpq *0x402470(,%rax,8)
0X400f7c: mov $0xcf,%eax
0X400f81: jmp .L3
0X400f83: mov $0x2c3,%eax
0X400f88: jmp .L3
0X400f8a: mov $0x100,%eax
0X400f8f: jmp .L3
0X400f91: mov $0x185,%eax
0X400f96: jmp .L3
0X400f98: mov $0xce,%eax
0X400f9d: jmp .L3
0X400f9f: mov $0x2aa,%eax
0X400fa4: jmp .L3
0X400fa6: mov $0x147,%eax
0X400fab: jmp .L3
.L2
0X400fad: callq 0X40143a <explode_bomb>
0X400fb2: mov $0x0,%eax
0X400fb7: jmp .L3
0X400fb9: mov $0x137,%eax
.L3
0X400fbe: cmp 0xc(%rsp),%eax
0X400fc2: je .L4
0X400fc4: callq 0X40143a <explode_bomb>
.L4
0X400fc9: add $0x18,%rsp
0X400fcd: retq
It calls sscanf directly, so let's check the format string:
(gdb) x/s 0x4025cf
0x4025cf: "%d %d"
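So this phase takes two numbers as input.
The code also uses ja, which jumps on an unsigned "above" comparison, so a negative first number is treated as a huge unsigned value and fails the check as well.
Next comes jmpq, a jump with a 64-bit operand. With a leading *, it is an indirect jump through memory, which can be read roughly as
jmpq *d(a,b,c) = jump to the address stored at memory location a+b*c+d
Here the operand is *0x402470(,%rax,8)
, so 0x402470 is the base of a jump table with 8-byte entries. Inspecting memory shows that its first entry is 0x400f7c.
Now for the hand-written reconstruction: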
int sp[6]; //ints, 4 bytes per element; sp+2 is 0x8(%rsp), sp+3 is 0xc(%rsp)
ax=sscanf(input,"%d %d",sp+2,sp+3);
if(!(ax>1)){
BOOM();
}
if((unsigned)*(sp+2)>7){
BOOM();
}
ax=*(sp+2);
jmpq *(0x402470+ax*8); //jump to entry ax of the jump table; entry 0 is 0x400f7c
/*
0X400f7c: mov $0xcf,%eax
0X400f81: jmp .L3
0X400f83: mov $0x2c3,%eax
0X400f88: jmp .L3
......
*/
if(!(*(sp+3)==ax)){
BOOM();
}
return;
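The jump target depends on the input, but ax can only take values in [0,7], so each case can be worked out one by one. It turns out that ax=0 already works: entry 0 of the table is 0x400f7c, which loads 0xcf, so *(sp+2)=0 and *(sp+3)=0xcf=207. Tested, OK.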
That's number 2. Keep going!
0 207
Halfway there!
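The jump-table dispatch in phase_3 can be modeled in C as a two-step lookup. This is a sketch under stated assumptions: `value_at_target` and `expected_second` are hypothetical names, the target-to-value mapping is read off the listing, and the caller has to supply the eight table entries. Only entry 0 (0x400f7c) was confirmed above; the other entries would have to be read from memory at 0x402470.

```c
#include <stdint.h>

/* eax value loaded at each jump target, taken from the phase_3 listing. */
int value_at_target(uint64_t addr) {
    switch (addr) {
    case 0x400f7c: return 0xcf;
    case 0x400f83: return 0x2c3;
    case 0x400f8a: return 0x100;
    case 0x400f91: return 0x185;
    case 0x400f98: return 0xce;
    case 0x400f9f: return 0x2aa;
    case 0x400fa6: return 0x147;
    case 0x400fb9: return 0x137;
    default:       return -1; /* not a known target */
    }
}

/* Models jmpq *0x402470(,%rax,8): the target is table[idx], 8 bytes per entry.
   Returns the required second number, or -1 if idx is out of range or unknown. */
int expected_second(const uint64_t table[8], unsigned idx) {
    if (idx > 7)
        return -1; /* the ja branch: explode */
    return value_at_target(table[idx]);
}
```

With a table whose entry 0 is 0x400f7c, `expected_second(table, 0)` yields 207, the value used in the answer above.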
phase_4
Here is the disassembly; this time there is a helper function, func4.
0X000000000040100c <phase_4>:
0X40100c: sub $0x18,%rsp
0X401010: lea 0xc(%rsp),%rcx
0X401015: lea 0x8(%rsp),%rdx
0X40101a: mov $0x4025cf,%esi
0X40101f: mov $0x0,%eax
0X401024: callq 0X400bf0 <__isoc99_sscanf@plt>
0X401029: cmp $0x2,%eax
0X40102c: jne <explode_bomb>
0X40102e: cmpl $0xe,0x8(%rsp)
0X401033: jbe .L1
0X401035: callq <explode_bomb>
.L1
0X40103a: mov $0xe,%edx
0X40103f: mov $0x0,%esi
0X401044: mov 0x8(%rsp),%edi
0X401048: callq 0X400fce <func4>
0X40104d: test %eax,%eax
0X40104f: jne <explode_bomb>
0X401051: cmpl $0x0,0xc(%rsp)
0X401056: je .L2
0X401058: callq <explode_bomb>
.L2
0X40105d: add $0x18,%rsp
0X401061: retq
0X0000000000400fce <func4>:
0X400fce: sub $0x8,%rsp
0X400fd2: mov %edx,%eax
0X400fd4: sub %esi,%eax
0X400fd6: mov %eax,%ecx
0X400fd8: shr $0x1f,%ecx
0X400fdb: add %ecx,%eax
0X400fdd: sar %eax
0X400fdf: lea (%rax,%rsi,1),%ecx
0X400fe2: cmp %edi,%ecx
0X400fe4: jle .L1
0X400fe6: lea -0x1(%rcx),%edx
0X400fe9: callq 0X400fce <func4>
0X400fee: add %eax,%eax
0X400ff0: jmp .L2
.L1
0X400ff2: mov $0x0,%eax
0X400ff7: cmp %edi,%ecx
0X400ff9: jge .L2
0X400ffb: lea 0x1(%rcx),%esi
0X400ffe: callq 0X400fce <func4>
0X401003: lea 0x1(%rax,%rax,1),%eax
.L2
0X401007: add $0x8,%rsp
0X40100b: retq
The format string again asks for two integers:
(gdb) x/s 0x4025cf
0x4025cf: "%d %d"
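After laboriously working through the code, my first reading of test ax,ax
was that it just means ax==ax
, which is always true, so func4 would have no effect at all. Under that reading the whole block of assembly simplifies to: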
ax=sscanf(input,"%d %d",sp+2,sp+3);
if(!(ax==2)){
BOOM();
}
if(!(*(sp+2)<=0xe)){
BOOM();
}
func4(); //treated (mistakenly) as a no-op placeholder
if(!(*(sp+3)==0x0)){
BOOM();
}
return;
So, by this reading, the phase passes as long as the first number is at most 0xe and the second number is 0.
My answer was 0 0
Halfway there!
0 0
So you got that one. Try this one.
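Later I looked at other people's solutions and realized this was wrong; my answer was correct only by coincidence. test a,b
actually computes a&b
and sets the flags from the result without storing it, so test ax,ax
followed by jne checks whether ax is nonzero: the bomb explodes unless func4 returns 0
. That meant translating the assembly all over again.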
#include <stdio.h>
#include <stdlib.h>
void BOOM()
{
printf("BOOM!!!");
exit(0);
}
int ax, cx, dx, si, di;
void func4()
{
ax = dx;
ax -= si;
cx = ax;
cx = (unsigned)cx >> 31; //shr is a logical shift
ax += cx;
ax >>= 1;
cx = ax + si;
if (cx <= di)
goto L1;
dx = cx - 1;
func4();
ax += ax;
goto L2;
L1:
ax = 0;
if (cx >= di)
goto L2;
si = cx + 1;
func4();
ax = ax + ax + 1;
L2:
return;
}
int main()
{
int arg1, arg2;
if (scanf("%d %d", &arg1, &arg2) != 2){
BOOM();
}
if (!(arg1 <= 0xe)){
BOOM();
}
dx = 0xe;
si = 0x0;
di = arg1;
func4();
if (ax != 0){
BOOM();
}
if (arg2 != 0x0){
BOOM();
}
printf("OK");
return 0;
}
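Looking at the new translation, the only thing that affects func4 is the first number the user enters (arg1 above). Since arg1<=0xe, we can simply brute-force it with a small program: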
void f(){
for(int arg1=0;arg1<=0xe;arg1++){
dx = 0xe;
si = 0x0;
di = arg1;
func4();
if ((ax & ax) == 0){
printf("%d is ok\n",arg1);
}
}
}
The output is
0 is ok
1 is ok
3 is ok
7 is ok
So the first number can be 0, 1, 3 or 7, and the second number must be 0.
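For comparison, the same func4 can be written as an ordinary recursive function with explicit parameters instead of global "registers". This is my own restatement, not the bomb's source: the parameter names `x`, `lo` and `hi` (for edi, esi and edx) and the helper `phase_4_first_ok` are hypothetical. It shows that func4 is a binary search that encodes the path taken: going left doubles the result, going right doubles it and adds 1.

```c
/* Restatement of func4: binary search for x in [lo, hi], returning a path code. */
int func4(int x, int lo, int hi) {
    /* shr/add/sar computes (hi - lo) / 2 rounded toward zero, like C's integer division. */
    int mid = lo + (hi - lo) / 2;
    if (mid > x)
        return 2 * func4(x, lo, mid - 1);     /* left half */
    if (mid < x)
        return 2 * func4(x, mid + 1, hi) + 1; /* right half */
    return 0;                                 /* found */
}

/* Hypothetical helper: 1 if x is an acceptable first number for phase_4. */
int phase_4_first_ok(int x) {
    if (x < 0 || x > 0xe) /* jbe is unsigned, so negatives are rejected too */
        return 0;
    return func4(x, 0, 0xe) == 0;
}
```

The result is 0 only when the search never turns right, which happens exactly for 7, 3, 1 and 0.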
phase_5
0X0000000000401062 <phase_5>:
0X401062: push %rbx
0X401063: sub $0x20,%rsp
0X401067: mov %rdi,%rbx
0X40106a: mov %fs:0x28,%rax
0X401073: mov %rax,0x18(%rsp)
0X401078: xor %eax,%eax
0X40107a: callq 0X40131b <string_length>
0X40107f: cmp $0x6,%eax
0X401082: je .L2
0X401084: callq 0X40143a <explode_bomb>
0X401089: jmp .L2
.L1
0X40108b: movzbl (%rbx,%rax,1),%ecx
0X40108f: mov %cl,(%rsp)
0X401092: mov (%rsp),%rdx
0X401096: and $0xf,%edx
0X401099: movzbl 0x4024b0(%rdx),%edx
0X4010a0: mov %dl,0x10(%rsp,%rax,1)
0X4010a4: add $0x1,%rax
0X4010a8: cmp $0x6,%rax
0X4010ac: jne .L1
0X4010ae: movb $0x0,0x16(%rsp)
0X4010b3: mov $0x40245e,%esi
0X4010b8: lea 0x10(%rsp),%rdi
0X4010bd: callq 0X401338 <strings_not_equal>
0X4010c2: test %eax,%eax
0X4010c4: je .L3
0X4010c6: callq 0X40143a <explode_bomb>
0X4010cb: nopl 0x0(%rax,%rax,1)
0X4010d0: jmp .L3
.L2
0X4010d2: mov $0x0,%eax
0X4010d7: jmp .L1
.L3
0X4010d9: mov 0x18(%rsp),%rax
0X4010de: xor %fs:0x28,%rax
0X4010e7: je .L4
0X4010e9: callq 0X400b30 <__stack_chk_fail@plt>
.L4
0X4010ee: add $0x20,%rsp
0X4010f2: pop %rbx
0X4010f3: retq
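Looking at the lines around string_length, my guess is that they check whether the input string has length 6 and explode otherwise. I tested strings of length 5, 6 and 7 with a breakpoint right after the explode_bomb call; only length 6 did not explode, so the guess is correct.
Execution then jumps to L2, which sets ax to 0, and from there into L1.
L1 forms a loop that exits only when ax==6, and ax is incremented before each check, so ax is presumably the loop counter of L1.
Now look at L1 in detail. It uses movzbl, a zero-extending byte load. Remember that mov with a memory operand reads the value, while lea only computes the address. The load is based on bx; looking back, phase_5 starts with bx=di, so bx points to the input string, and bx never changes inside the loop. So movzbl (%rbx,%rax,1),%ecx
amounts to cx=input[ax]
The following mov %cl,(%rsp)
supports this, because cl is the low 8 bits of cx, exactly one byte.
The lines up to the next movzbl take the low four bits of input[ax]
and then load dx=*(0x4024b0+(input[ax]&0xf))
. Let's look at what is stored at 0x4024b0: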
(gdb) x/s 0x4024b0
0x4024b0 <array.3449>: "maduiersnfotvbylSo you think you can stop the bomb with ctrl-c, do you?"
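The first 16 characters are a seemingly random lowercase string, and input[ax]&0xf
has exactly 16 possible values, so the code uses it as an index into that 16-character table. The character looked up is then stored at sp+0x10+ax
, which is what part of the stack space reserved by the initial sub $0x20,%rsp
is used for.
After L1 finishes, strings_not_equal is called. Its two arguments are clearly the strings to compare: one is the buffer at sp+0x10 that we just filled, the other is at 0x40245e
. Its value is: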
(gdb) x/s 0x40245e
0x40245e: "flyers"
So the goal is clear: the input string must be turned into "flyers"
by the loop in L1. The loop was analyzed above, so it is time for another brute-force program:
#include <stdio.h>
char sa[20]="maduiersnfotvbyl";
char sb[20]="flyers";
char ans[20];
int main()
{
for(int i=0;i<6;i++){
for(int ch='a';ch<='z';ch++){
if(sa[ch&0xf]==sb[i]){
ans[i]=ch;
break;
}
}
}
printf("%s",ans);
return 0;
}
The result is "ionefg". Tested, and it passes.
So you got that one. Try this one.
ionefg
Good work! On to the next...
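To double-check an answer without running the bomb, the forward transformation of phase_5 can be sketched as a small C function. `phase_5_ok` is a hypothetical name; the table and the target string are the ones read at 0x4024b0 and 0x40245e.

```c
#include <string.h>

/* Hypothetical checker: 1 if the input passes phase_5, 0 if the bomb would explode. */
int phase_5_ok(const char *input) {
    static const char table[] = "maduiersnfotvbyl"; /* first 16 bytes at 0x4024b0 */
    char out[7];
    if (strlen(input) != 6)
        return 0;                        /* string_length check */
    for (int i = 0; i < 6; i++)
        out[i] = table[input[i] & 0xf];  /* low nibble selects a table entry */
    out[6] = '\0';                       /* movb $0x0,0x16(%rsp) */
    return strcmp(out, "flyers") == 0;
}
```

Because only the low four bits of each character matter, "ionefg" is not the only solution: any six characters with the same low nibbles work, for example "IONEFG".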
phase_6
0X00000000004010f4 <phase_6>:
0X4010f4: push %r14
0X4010f6: push %r13
0X4010f8: push %r12
0X4010fa: push %rbp
0X4010fb: push %rbx
0X4010fc: sub $0x50,%rsp
0X401100: mov %rsp,%r13
0X401103: mov %rsp,%rsi
0X401106: callq 0X40145c <read_six_numbers>
0X40110b: mov %rsp,%r14
0X40110e: mov $0x0,%r12d
#Part 1
.L0
0X401114: mov %r13,%rbp
0X401117: mov 0x0(%r13),%eax
0X40111b: sub $0x1,%eax
0X40111e: cmp $0x5,%eax
0X401121: jbe .L1
0X401123: callq 0X40143a <explode_bomb>
.L1
0X401128: add $0x1,%r12d
0X40112c: cmp $0x6,%r12d
0X401130: je .L4
0X401132: mov %r12d,%ebx
.L2
0X401135: movslq %ebx,%rax
0X401138: mov (%rsp,%rax,4),%eax
0X40113b: cmp %eax,0x0(%rbp)
0X40113e: jne .L3
0X401140: callq 0X40143a <explode_bomb>
.L3
0X401145: add $0x1,%ebx
0X401148: cmp $0x5,%ebx
0X40114b: jle .L2
0X40114d: add $0x4,%r13
0X401151: jmp .L0
#Part 2
.L4
0X401153: lea 0x18(%rsp),%rsi
0X401158: mov %r14,%rax
0X40115b: mov $0x7,%ecx
.L5
0X401160: mov %ecx,%edx
0X401162: sub (%rax),%edx
0X401164: mov %edx,(%rax)
0X401166: add $0x4,%rax
0X40116a: cmp %rsi,%rax
0X40116d: jne .L5
0X40116f: mov $0x0,%esi
0X401174: jmp .L9
#Part 3
.L6
0X401176: mov 0x8(%rdx),%rdx
0X40117a: add $0x1,%eax
0X40117d: cmp %ecx,%eax
0X40117f: jne .L6
0X401181: jmp .L8
.L7
0X401183: mov $0x6032d0,%edx
.L8
0X401188: mov %rdx,0x20(%rsp,%rsi,2)
0X40118d: add $0x4,%rsi
0X401191: cmp $0x18,%rsi
0X401195: je .L10
.L9
0X401197: mov (%rsp,%rsi,1),%ecx
0X40119a: cmp $0x1,%ecx
0X40119d: jle .L7
0X40119f: mov $0x1,%eax
0X4011a4: mov $0x6032d0,%edx
0X4011a9: jmp .L6
#Part 4
.L10
0X4011ab: mov 0x20(%rsp),%rbx
0X4011b0: lea 0x28(%rsp),%rax
0X4011b5: lea 0x50(%rsp),%rsi
0X4011ba: mov %rbx,%rcx
.L11
0X4011bd: mov (%rax),%rdx
0X4011c0: mov %rdx,0x8(%rcx)
0X4011c4: add $0x8,%rax
0X4011c8: cmp %rsi,%rax
0X4011cb: je .L12
0X4011cd: mov %rdx,%rcx
0X4011d0: jmp .L11
#Part 5
.L12
0X4011d2: movq $0x0,0x8(%rdx)
0X4011da: mov $0x5,%ebp
.L13
0X4011df: mov 0x8(%rbx),%rax
0X4011e3: mov (%rax),%eax
0X4011e5: cmp %eax,(%rbx)
0X4011e7: jge .L14
0X4011e9: callq 0X40143a <explode_bomb>
.L14
0X4011ee: mov 0x8(%rbx),%rbx
0X4011f2: sub $0x1,%ebp
0X4011f5: jne .L13
0X4011f7: add $0x50,%rsp
0X4011fb: pop %rbx
0X4011fc: pop %rbp
0X4011fd: pop %r12
0X4011ff: pop %r13
0X401201: pop %r14
0X401203: retq
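The code for this phase looks very long, but it can be split into five parts along the loop nesting and tackled one at a time.
Part 1 checks that every number is between 1 and 6 (the unsigned comparison of x-1 against 5 also rejects 0 and negative values) and that all six are distinct. Reversed: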
void f1(int a[]){
for(int i=0;i<6;i++){
if((unsigned)(a[i]-1)>5) BOOM(); //requires 1<=a[i]<=6
for(int j=i+1;j<6;j++){
if(a[i]==a[j]) BOOM();
}
}
}
Part 2 replaces each element x with 7-x
void f2(int a[]){
for(int i=0;i<6;i++)
a[i]=7-a[i];
}
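In part 3 I could see that it was building some kind of chained structure, but at first it did not occur to me that a real data structure was involved, so I got confused. Once I saw a linked list mentioned in someone else's post, it clicked immediately.
Let's first inspect the memory around 0x6032d0: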
(gdb) x/40x 0x6032d0
0x6032d0 <node1>: 0x0000014c 0x00000001 0x006032e0 0x00000000
0x6032e0 <node2>: 0x000000a8 0x00000002 0x006032f0 0x00000000
0x6032f0 <node3>: 0x0000039c 0x00000003 0x00603300 0x00000000
0x603300 <node4>: 0x000002b3 0x00000004 0x00603310 0x00000000
0x603310 <node5>: 0x000001dd 0x00000005 0x00603320 0x00000000
0x603320 <node6>: 0x000001bb 0x00000006 0x00000000 0x00000000
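The symbol names and the data layout show that this is a linked list. Each node is 16 bytes, and the fourth 32-bit word is the upper half of the 8-byte next pointer rather than a separate field, so the corresponding struct is
struct node{
int data;
int id;
struct node *next; //8 bytes on x86-64
};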
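Reversing part 3 gives:
while(1){
cx=*(sp+si*1); // cx=*(sp+0),*(sp+0x4),*(sp+0x8),*(sp+0xc),....
if(cx<=1) dx=0x6032d0; //dx=address of the list head
else {
ax=1;
dx=0x6032d0;
//walk the list until dx is the address of node number sp[si/4]
do{
dx=*(dx+0x8);
ax+=1;
}while(ax!=cx);
}
*(sp+si*2+0x20)=dx;//starting at sp+0x20, one node address is stored every 8 bytes
si+=4;
if(si==0x18) goto L10; //0x18 bytes means all six ints have been processed
}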
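Checking the stack memory layout confirms the derivation: the first six words are the numbers after part 2, and the matching node addresses follow from sp+0x20 onward.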
(gdb) x/20x 0x7fffffffddc0
0x7fffffffddc0: 0x00000006 0x00000005 0x00000004 0x00000003
0x7fffffffddd0: 0x00000002 0x00000001 0x00000000 0x00000000
0x7fffffffdde0: 0x00603320 0x00000000 0x00603310 0x00000000
0x7fffffffddf0: 0x00603300 0x00000000 0x006032f0 0x00000000
0x7fffffffde00: 0x006032e0 0x00000000 0x006032d0 0x00000000
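Part 4 relinks the list in that order:
bx=*(sp+0x20) //first recorded node address
ax=sp+0x28
si=sp+0x50
cx=bx
while(1)
{
//walk the addresses recorded on the stack in the previous step
//and relink the list nodes in this order
dx=*ax
*(cx+0x8)=dx
ax+=0x8
if(ax==si) goto L12
cx=dx
}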
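Part 5 walks the list; the input is a valid password only if the data values appear in non-increasing order
//here dx points to the last node of the relinked list and bx to the first
*(dx+0x8)=0;
bp=0x5;
do{
ax=*(bx+0x8); //ax=bx->next
ax=*ax; //ax=ax->data
if(bx->data<ax) BOOM();
bx=bx->next;
bp--;
}while(bp!=0);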
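From the memory around 0x6032d0
, the valid node order is 3->4->5->6->1->2
(data values 0x39c > 0x2b3 > 0x1dd > 0x1bb > 0x14c > 0xa8). Because of part 2, each number x
has to be entered as 7-x
, which gives 4 3 2 1 6 5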
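The derivation of the phase_6 answer can be sketched as a short C routine: sort the nodes by data in descending order, then undo the 7-x step. This is a sketch, not the bomb's code; `phase_6_solve`, `by_data_desc` and the reduced `struct node` (without the next pointer) are hypothetical, and the data values are copied from the memory dump of node1 to node6.

```c
#include <stdlib.h>

struct node {
    int data;
    int id;
};

/* qsort comparator: larger data first. */
int by_data_desc(const void *a, const void *b) {
    const struct node *x = a;
    const struct node *y = b;
    return (y->data > x->data) - (y->data < x->data);
}

/* Fills answer[] with the six numbers to type in for phase_6. */
void phase_6_solve(int answer[6]) {
    struct node nodes[6] = {
        {0x14c, 1}, {0x0a8, 2}, {0x39c, 3},
        {0x2b3, 4}, {0x1dd, 5}, {0x1bb, 6}
    };
    qsort(nodes, 6, sizeof nodes[0], by_data_desc);
    for (int i = 0; i < 6; i++)
        answer[i] = 7 - nodes[i].id; /* part 2 maps x to 7-x, so invert it */
}
```

Sorting is used here only as a convenient way to find the required order; the bomb itself never sorts, it just verifies the order the input produces.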
As other people pointed out, the bomb also contains a hidden phase; its entry point can be found by analyzing phase_defused.
0X00000000004015c4 <phase_defused>:
0X4015c4: sub $0x78,%rsp
0X4015c8: mov %fs:0x28,%rax
0X4015d1: mov %rax,0x68(%rsp)
0X4015d6: xor %eax,%eax
0X4015d8: cmpl $0x6,0x202181(%rip) # 603760 <num_input_strings>
0X4015df: jne 0X40163f <phase_defused+0x7b>
0X4015e1: lea 0x10(%rsp),%r8
0X4015e6: lea 0xc(%rsp),%rcx
0X4015eb: lea 0x8(%rsp),%rdx
0X4015f0: mov $0x402619,%esi
0X4015f5: mov $0x603870,%edi
0X4015fa: callq 0X400bf0 <__isoc99_sscanf@plt>
0X4015ff: cmp $0x3,%eax
0X401602: jne 0X401635 <phase_defused+0x71>
0X401604: mov $0x402622,%esi
0X401609: lea 0x10(%rsp),%rdi
0X40160e: callq 0X401338 <strings_not_equal>
0X401613: test %eax,%eax
0X401615: jne 0X401635 <phase_defused+0x71>
0X401617: mov $0x4024f8,%edi
0X40161c: callq 0X400b10 <puts@plt>
0X401621: mov $0x402520,%edi
0X401626: callq 0X400b10 <puts@plt>
0X40162b: mov $0x0,%eax
0X401630: callq 0X401242 <secret_phase>
0X401635: mov $0x402558,%edi
0X40163a: callq 0X400b10 <puts@plt>
0X40163f: mov 0x68(%rsp),%rax
0X401644: xor %fs:0x28,%rax
0X40164d: je 0X401654 <phase_defused+0x90>
0X40164f: callq 0X400b30 <__stack_chk_fail@plt>
0X401654: add $0x78,%rsp
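We notice a call to secret_phase here, so the conditions for entering the secret phase must be in this function. Let's look at the arguments passed to sscanf and strings_not_equal: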
(gdb) x/s 0x603870
0x603870 <input_strings+240>: "0 0"
(gdb) x/s 0x402619
0x402619: "%d %d %s"
(gdb) x/s 0x402622
0x402622: "DrEvil"
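So the hidden phase is entered only when that input line reads "0 0 DrEvil" (more precisely, two integers followed by DrEvil), and the check only runs once num_input_strings equals 6, that is, after phase 6. Among our inputs, only phase 4 used 0 0, so appending "DrEvil" to it gets us into the hidden phase.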
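The entry condition can be sketched in C as follows. `unlocks_secret` is a hypothetical name. The bomb's format string is "%d %d %s"; the sketch uses "%63s" with a fixed buffer purely as a safety measure of my own, which does not change the result for short inputs.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical model of the check in phase_defused:
   1 if the stored phase_4 line unlocks the secret phase, 0 otherwise. */
int unlocks_secret(const char *phase4_line) {
    int a, b;
    char word[64];
    /* All three fields must be present: cmp $0x3,%eax ; jne */
    if (sscanf(phase4_line, "%d %d %63s", &a, &b, word) != 3)
        return 0;
    return strcmp(word, "DrEvil") == 0; /* strings_not_equal must return 0 */
}
```

Note that the two integers themselves are not compared here, so any valid phase_4 answer followed by DrEvil satisfies this check.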
Welcome to my fiendish little bomb. You have 6 phases with
which to blow yourself up. Have a nice day!
Border relations with Canada have never been better.
1 2 4 8 16 32
0 207
0 0 DrEvil
ionefg
4 3 2 1 6 5
0 DrEvilPhase 1 defused. How about the next one?
That's number 2. Keep going!
Halfway there!
So you got that one. Try this one.
Good work! On to the next...
Curses, you've found the secret phase!
But finding it and solving it are quite different...
Once inside, we first study the code of secret_phase:
0X0000000000401242 <secret_phase>:
0X401242: push %rbx
0X401243: callq 0X40149e <read_line>
0X401248: mov $0xa,%edx
0X40124d: mov $0x0,%esi
0X401252: mov %rax,%rdi
0X401255: callq 0X400bd0 <strtol@plt>
0X40125a: mov %rax,%rbx
0X40125d: lea -0x1(%rax),%eax
0X401260: cmp $0x3e8,%eax
0X401265: jbe .L1
0X401267: callq 0X40143a <explode_bomb>
.L1
0X40126c: mov %ebx,%esi
0X40126e: mov $0x6030f0,%edi
0X401273: callq 0X401204 <fun7>
0X401278: cmp $0x2,%eax
0X40127b: je .L2
0X40127d: callq 0X40143a <explode_bomb>
.L2
0X401282: mov $0x402438,%edi
0X401287: callq 0X400b10 <puts@plt>
0X40128c: callq 0X4015c4 <phase_defused>
0X401291: pop %rbx
0X401292: retq
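Reversing it gives:
//read a line and convert it to a long in base 10 (edx=0xa); the result is in ax
ax=input;
bx=ax;
ax=ax-1;
if(ax>0x3e8) BOOM(); //unsigned compare, so 1<=input<=0x3e9
si=bx;
di=0x6030f0;
ax=fun7();
if(ax!=2) BOOM(); //passes when the return value is 2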
Let's see what is stored in the memory region at 0x6030f0:
(gdb) x/120x 0x6030f0
0x6030f0 <n1>: 0x00000024 0x00000000 0x00603110 0x00000000
0x603100 <n1+16>: 0x00603130 0x00000000 0x00000000 0x00000000
0x603110 <n21>: 0x00000008 0x00000000 0x00603190 0x00000000
0x603120 <n21+16>: 0x00603150 0x00000000 0x00000000 0x00000000
0x603130 <n22>: 0x00000032 0x00000000 0x00603170 0x00000000
0x603140 <n22+16>: 0x006031b0 0x00000000 0x00000000 0x00000000
0x603150 <n32>: 0x00000016 0x00000000 0x00603270 0x00000000
0x603160 <n32+16>: 0x00603230 0x00000000 0x00000000 0x00000000
0x603170 <n33>: 0x0000002d 0x00000000 0x006031d0 0x00000000
0x603180 <n33+16>: 0x00603290 0x00000000 0x00000000 0x00000000
0x603190 <n31>: 0x00000006 0x00000000 0x006031f0 0x00000000
0x6031a0 <n31+16>: 0x00603250 0x00000000 0x00000000 0x00000000
0x6031b0 <n34>: 0x0000006b 0x00000000 0x00603210 0x00000000
0x6031c0 <n34+16>: 0x006032b0 0x00000000 0x00000000 0x00000000
0x6031d0 <n45>: 0x00000028 0x00000000 0x00000000 0x00000000
0x6031e0 <n45+16>: 0x00000000 0x00000000 0x00000000 0x00000000
0x6031f0 <n41>: 0x00000001 0x00000000 0x00000000 0x00000000
0x603200 <n41+16>: 0x00000000 0x00000000 0x00000000 0x00000000
0x603210 <n47>: 0x00000063 0x00000000 0x00000000 0x00000000
0x603220 <n47+16>: 0x00000000 0x00000000 0x00000000 0x00000000
0x603230 <n44>: 0x00000023 0x00000000 0x00000000 0x00000000
0x603240 <n44+16>: 0x00000000 0x00000000 0x00000000 0x00000000
0x603250 <n42>: 0x00000007 0x00000000 0x00000000 0x00000000
0x603260 <n42+16>: 0x00000000 0x00000000 0x00000000 0x00000000
0x603270 <n43>: 0x00000014 0x00000000 0x00000000 0x00000000
0x603280 <n43+16>: 0x00000000 0x00000000 0x00000000 0x00000000
0x603290 <n46>: 0x0000002f 0x00000000 0x00000000 0x00000000
0x6032a0 <n46+16>: 0x00000000 0x00000000 0x00000000 0x00000000
0x6032b0 <n48>: 0x000003e9 0x00000000 0x00000000 0x00000000
0x6032c0 <n48+16>: 0x00000000 0x00000000 0x00000000 0x00000000
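Judging by the layout, this is clearly a binary tree. The corresponding struct:
struct node{
long data; //fun7 only reads the low 32 bits
struct node *left;
struct node *right;
long other;
};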
Next, the code of fun7:
0X0000000000401204 <fun7>:
0X401204: sub $0x8,%rsp
0X401208: test %rdi,%rdi
0X40120b: je .L2
0X40120d: mov (%rdi),%edx
0X40120f: cmp %esi,%edx
0X401211: jle .L1
0X401213: mov 0x8(%rdi),%rdi
0X401217: callq 0X401204 <fun7>
0X40121c: add %eax,%eax
0X40121e: jmp .L3
.L1
0X401220: mov $0x0,%eax
0X401225: cmp %esi,%edx
0X401227: je .L3
0X401229: mov 0x10(%rdi),%rdi
0X40122d: callq 0X401204 <fun7>
0X401232: lea 0x1(%rax,%rax,1),%eax
0X401236: jmp .L3
.L2
0X401238: mov $0xffffffff,%eax
.L3
0X40123d: add $0x8,%rsp
0X401241: retq
//on entry di points to the root node and si holds the input value
fun7:
sp-=0x8;
//walked past a leaf (null child)
if(di==0) {
ax=0xffffffff; //-1
return;
}
dx=*di; //data of the current node
if(dx<=si) { // data<=input
ax=0;
if(dx==si) { //data==input
return;
}
di=*(di+0x10); //di=di->right
call fun7;
ax=ax+ax+1;
return ;
}
else { //data>input, visit the left child
di=*(di+0x8); //di=di->left
call fun7;
ax+=ax;
return;
}
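Clearly this is not just a binary tree but a four-level binary search tree.
We need a return value of 2. Working backwards: 1*2 => 0*2+1 => 0*2 => 0
, that is, visit the nodes in the order LRL
and match there. Comparing with the memory dump above gives the constraints
0x24>x
0x8<x
0x16>x
0x14==x
So the input should be 0x14, i.e. 20. (Matching one level earlier, at the node holding 0x16=22, also returns 2, so 22 works as well.)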
Curses, you've found the secret phase!
But finding it and solving it are quite different...
20
Wow! You've defused the secret stage!
Congratulations! You've defused the bomb!
ALL STAGE CLEAR!
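The secret phase can also be checked by rebuilding the tree in C and trying every allowed input. This is a sketch under assumptions: `struct tnode`, `fun7` as written here and `secret_ok` are my own restatement, and the node values and links are transcribed from the memory dump at 0x6030f0 (n1, n21, n22 and so on).

```c
#include <stddef.h>

struct tnode {
    int data;
    const struct tnode *left;
    const struct tnode *right;
};

/* Tree transcribed from the dump: leaves first, root (n1) last. */
static const struct tnode
    n41 = {1, NULL, NULL},    n42 = {7, NULL, NULL},
    n43 = {20, NULL, NULL},   n44 = {35, NULL, NULL},
    n45 = {40, NULL, NULL},   n46 = {47, NULL, NULL},
    n47 = {99, NULL, NULL},   n48 = {1001, NULL, NULL},
    n31 = {6, &n41, &n42},    n32 = {22, &n43, &n44},
    n33 = {45, &n45, &n46},   n34 = {107, &n47, &n48},
    n21 = {8, &n31, &n32},    n22 = {50, &n33, &n34},
    n1  = {36, &n21, &n22};

/* Restatement of fun7: left doubles the result, right doubles and adds 1. */
int fun7(const struct tnode *t, int x) {
    if (t == NULL)
        return -1;                        /* mov $0xffffffff,%eax */
    if (t->data > x)
        return 2 * fun7(t->left, x);
    if (t->data == x)
        return 0;
    return 2 * fun7(t->right, x) + 1;
}

/* Hypothetical helper: 1 if x defuses the secret phase. */
int secret_ok(long x) {
    if ((unsigned)(x - 1) > 0x3e8)        /* lea -0x1(%rax),%eax ; cmp ; jbe */
        return 0;
    return fun7(&n1, (int)x) == 2;
}
```

Looping x from 1 to 1001 and calling `secret_ok(x)` reports exactly two accepted values, 20 and 22.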
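References:
Register operand sizes and argument counts, by alike_meng
Single-stepping through assembly in GDB, by 张雅宸
Learning Win32 assembly [28] - jump instructions: JMP, JECXZ, etc., by 万一的 Delphi 博客
Usage of x in gdb, by 我打打江南走过过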