Writing an 8051 Disassembler with AI


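If you know how each assembly instruction maps to its machine-code encoding, you can disassemble a raw binary or a Hex file and reverse-engineer it. The 8051 has 111 instructions, each 1, 2, or 3 bytes long. Writing a disassembler for it by hand is pure grunt work: you have to go through the encoding of every instruction one by one and write code to parse it. With AI, the same tool can be produced in whichever language you like. All you have to do is give the AI a one-sentence requirement: "a disassembler for 8051 hex files, implemented in (C++/C#/web page)". Within a few minutes it writes very high-quality code. To keep the tool cross-platform I chose a web page. The software the AI produced is described below.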
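Before any opcode can be decoded, a Hex file has to be turned into a flat byte image. The following is a minimal sketch of that step, not code taken from the tool: the function name `parseIntelHex` is my own, and it assumes the file only uses Intel HEX record types 00 (data) and 01 (end of file), which is the common case for small 8051 builds.

```javascript
// Hypothetical helper: convert Intel HEX text into a code image.
// Assumption: only record types 00 (data) and 01 (end of file) matter.
function parseIntelHex(text) {
  const image = new Uint8Array(0x10000).fill(0xFF); // erased flash reads as 0xFF
  let maxAddr = 0;
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === '') continue;
    // A record is ':' followed by an even number of hex digits.
    if (!/^:([0-9A-Fa-f]{2})+$/.test(line)) throw new Error('Malformed record');
    const bytes = [];
    for (let i = 1; i < line.length; i += 2) {
      bytes.push(parseInt(line.slice(i, i + 2), 16));
    }
    const count = bytes[0];
    // Layout: count, addr_high, addr_low, type, data..., checksum.
    if (bytes.length !== count + 5) throw new Error('Bad record length');
    // All bytes of a valid record, checksum included, sum to 0 modulo 256.
    const sum = bytes.reduce((a, b) => a + b, 0) & 0xFF;
    if (sum !== 0) throw new Error('Checksum mismatch');
    const addr = (bytes[1] << 8) | bytes[2];
    const type = bytes[3];
    if (type === 0x01) break;     // end-of-file record
    if (type !== 0x00) continue;  // other record types are ignored in this sketch
    for (let i = 0; i < count; i++) image[addr + i] = bytes[4 + i];
    maxAddr = Math.max(maxAddr, addr + count);
  }
  return image.subarray(0, maxAddr);
}
```

For example, the two records `:03000000020030CB` and `:00000001FF` yield the three bytes `02 00 30`, which is an `LJMP 0x0030` at address 0.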
Machine codes

  • Data transfer instructions (29)

| Mnemonic | Operands | Machine code (hex) | Bytes |
|---|---|---|---|
| MOV | A, Rn | 0xE8~0xEF | 1 |
| MOV | A, direct | 0xE5 direct | 2 |
| MOV | A, @Ri | 0xE6~0xE7 | 1 |
| MOV | A, #data | 0x74 data | 2 |
| MOV | Rn, A | 0xF8~0xFF | 1 |
| MOV | Rn, direct | 0xA8~0xAF direct | 2 |
| MOV | Rn, #data | 0x78~0x7F data | 2 |
| MOV | direct, A | 0xF5 direct | 2 |
| MOV | direct, Rn | 0x88~0x8F direct | 2 |
| MOV | direct, direct | 0x85 direct_src direct_dest | 3 |
| MOV | direct, @Ri | 0x86~0x87 direct | 2 |
| MOV | direct, #data | 0x75 direct data | 3 |
| MOV | @Ri, A | 0xF6~0xF7 | 1 |
| MOV | @Ri, direct | 0xA6~0xA7 direct | 2 |
| MOV | @Ri, #data | 0x76~0x77 data | 2 |
| MOV | DPTR, #data16 | 0x90 data_high data_low | 3 |
| MOVC | A, @A+DPTR | 0x93 | 1 |
| MOVC | A, @A+PC | 0x83 | 1 |
| MOVX | A, @Ri | 0xE2~0xE3 | 1 |
| MOVX | A, @DPTR | 0xE0 | 1 |
| MOVX | @Ri, A | 0xF2~0xF3 | 1 |
| MOVX | @DPTR, A | 0xF0 | 1 |
| PUSH | direct | 0xC0 direct | 2 |
| POP | direct | 0xD0 direct | 2 |
| XCH | A, Rn | 0xC8~0xCF | 1 |
| XCH | A, direct | 0xC5 direct | 2 |
| XCH | A, @Ri | 0xC6~0xC7 | 1 |
| XCHD | A, @Ri | 0xD6~0xD7 | 1 |
| SWAP | A | 0xC4 | 1 |
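The opcode ranges in the table (for example 0xE8~0xEF for `MOV A, Rn` and 0xE6~0xE7 for `MOV A, @Ri`) carry the register number in the low bits of the opcode. Here is a small sketch of how a disassembler could recover it. The helper name `registerOperand` is hypothetical, and the `MOVX` forms with `@Ri` (0xE2~0xE3, 0xF2~0xF3) are deliberately left out to keep it short.

```javascript
// Hypothetical helper: recover the register operand encoded in an opcode.
// On the 8051, a low nibble of 0x8..0xF selects R0..R7 (low 3 bits),
// and a low nibble of 0x6 or 0x7 selects @R0 or @R1 (low bit).
function registerOperand(opcode) {
  const low = opcode & 0x0F;
  if (low >= 0x08) return 'R' + (opcode & 0x07);
  if (low === 0x06 || low === 0x07) return '@R' + (opcode & 0x01);
  return null; // this opcode has no Rn/@Ri field in its low bits
}
```

With this, 0xE8 decodes to `R0`, 0xEF to `R7`, and 0xE7 to `@R1`, so one table entry per range is enough instead of one per register.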
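  • Arithmetic instructions (24)

| Mnemonic | Operands | Machine code (hex) | Bytes |
|---|---|---|---|
| ADD | A, Rn | 0x28~0x2F | 1 |
| ADD | A, direct | 0x25 direct | 2 |
| ADD | A, @Ri | 0x26~0x27 | 1 |
| ADD | A, #data | 0x24 data | 2 |
| ADDC | A, Rn | 0x38~0x3F | 1 |
| ADDC | A, direct | 0x35 direct | 2 |
| ADDC | A, @Ri | 0x36~0x37 | 1 |
| ADDC | A, #data | 0x34 data | 2 |
| SUBB | A, Rn | 0x98~0x9F | 1 |
| SUBB | A, direct | 0x95 direct | 2 |
| SUBB | A, @Ri | 0x96~0x97 | 1 |
| SUBB | A, #data | 0x94 data | 2 |
| INC | A | 0x04 | 1 |
| INC | Rn | 0x08~0x0F | 1 |
| INC | direct | 0x05 direct | 2 |
| INC | @Ri | 0x06~0x07 | 1 |
| INC | DPTR | 0xA3 | 1 |
| DEC | A | 0x14 | 1 |
| DEC | Rn | 0x18~0x1F | 1 |
| DEC | direct | 0x15 direct | 2 |
| DEC | @Ri | 0x16~0x17 | 1 |
| MUL | AB | 0xA4 | 1 |
| DIV | AB | 0x84 | 1 |
| DA | A | 0xD4 | 1 |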
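  • Control transfer instructions (17)

This table also lists JC, JNC, JB, JNB and JBC. The usual count of 17 places these five bit-test jumps in the bit manipulation group.

| Mnemonic | Operands | Machine code (hex) | Bytes |
|---|---|---|---|
| AJMP | addr11 | 0x01/0x21/0x41/0x61/0x81/0xA1/0xC1/0xE1 + addr_low | 2 |
| LJMP | addr16 | 0x02 addr_high addr_low | 3 |
| SJMP | rel | 0x80 offset | 2 |
| JMP | @A+DPTR | 0x73 | 1 |
| JZ | rel | 0x60 offset | 2 |
| JNZ | rel | 0x70 offset | 2 |
| JC | rel | 0x40 offset | 2 |
| JNC | rel | 0x50 offset | 2 |
| JB | bit, rel | 0x20 bit offset | 3 |
| JNB | bit, rel | 0x30 bit offset | 3 |
| JBC | bit, rel | 0x10 bit offset | 3 |
| CJNE | A, #data, rel | 0xB4 data offset | 3 |
| CJNE | A, direct, rel | 0xB5 direct offset | 3 |
| CJNE | Rn, #data, rel | 0xB8~0xBF data offset | 3 |
| CJNE | @Ri, #data, rel | 0xB6~0xB7 data offset | 3 |
| DJNZ | Rn, rel | 0xD8~0xDF offset | 2 |
| DJNZ | direct, rel | 0xD5 direct offset | 3 |
| ACALL | addr11 | 0x11/0x31/0x51/0x71/0x91/0xB1/0xD1/0xF1 + addr_low | 2 |
| LCALL | addr16 | 0x12 addr_high addr_low | 3 |
| RET | | 0x22 | 1 |
| RETI | | 0x32 | 1 |
| NOP | | 0x00 | 1 |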
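The `rel` and `addr11` operands in the table are not absolute addresses, so a disassembler has to compute the jump target before it can print it. This is a rough sketch of both calculations with helper names of my own (`relTarget`, `addr11Target`); `pc` is assumed to be the address of the first byte of the instruction.

```javascript
// Hypothetical helpers; `pc` is the address of the instruction's first byte.

// rel operands (SJMP, JZ, CJNE, DJNZ, ...): a signed 8-bit offset that is
// added to the address of the NEXT instruction.
function relTarget(pc, length, offset) {
  const signed = offset < 0x80 ? offset : offset - 0x100;
  return (pc + length + signed) & 0xFFFF;
}

// addr11 operands (AJMP, ACALL): opcode bits 7..5 are target bits 10..8,
// the second byte is target bits 7..0, and the upper five bits come from
// the address of the next instruction (same 2 KB page).
function addr11Target(pc, opcode, addrLow) {
  const next = (pc + 2) & 0xFFFF;
  return (next & 0xF800) | ((opcode & 0xE0) << 3) | addrLow;
}
```

For instance, the bytes `80 FE` at address 0x0100 are an `SJMP` to 0x0100 itself (an endless loop), and `E1 FF` at address 0x0800 is an `AJMP` to 0x0FFF.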
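  • Bit manipulation instructions (17)

The 12 rows below, together with the five bit-test jumps (JC, JNC, JB, JNB, JBC) listed in the control transfer table, make up the 17.

| Mnemonic | Operands | Machine code (hex) | Bytes |
|---|---|---|---|
| CLR | C | 0xC3 | 1 |
| CLR | bit | 0xC2 bit | 2 |
| SETB | C | 0xD3 | 1 |
| SETB | bit | 0xD2 bit | 2 |
| CPL | C | 0xB3 | 1 |
| CPL | bit | 0xB2 bit | 2 |
| ANL | C, bit | 0x82 bit | 2 |
| ANL | C, /bit | 0xB0 bit | 2 |
| ORL | C, bit | 0x72 bit | 2 |
| ORL | C, /bit | 0xA0 bit | 2 |
| MOV | C, bit | 0xA2 bit | 2 |
| MOV | bit, C | 0x92 bit | 2 |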
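Code analysis
The 8051 has an 8-bit instruction set, so a single opcode byte identifies the instruction. Starting at address 0x0, the disassembler reads the opcode and works out which of the 111 instructions it is. The instruction in turn determines its own length in bytes, and the next instruction starts that many bytes further on.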
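The loop described above can be sketched in a few lines. This is only an illustration under my own assumptions, not the code from the tool: `miniTable` is a hypothetical table with just four opcodes, and any opcode that is unknown (or cut off at the end of the image) is emitted as a single `DB` byte.

```javascript
// Hypothetical mini table: only the opcodes used in the example are listed.
const miniTable = {
  0x00: { mnemonic: 'NOP', bytes: 1 },
  0x02: { mnemonic: 'LJMP', bytes: 3 },
  0x22: { mnemonic: 'RET', bytes: 1 },
  0x74: { mnemonic: 'MOV A, #data', bytes: 2 },
};

// Linear sweep: decode the opcode at `pc`, then advance by its length.
function sweep(code, table) {
  const out = [];
  let pc = 0;
  while (pc < code.length) {
    const entry = table[code[pc]];
    // Unknown opcode or truncated instruction: emit one raw byte and move on.
    const complete = entry !== undefined && pc + entry.bytes <= code.length;
    const length = complete ? entry.bytes : 1;
    const mnemonic = complete ? entry.mnemonic : 'DB';
    out.push({ address: pc, mnemonic, raw: Array.from(code.slice(pc, pc + length)) });
    pc += length;
  }
  return out;
}
```

Calling `sweep([0x02, 0x00, 0x03, 0x74, 0x55, 0x22], miniTable)` gives three entries at addresses 0, 3 and 5: `LJMP`, `MOV A, #data` and `RET`. A linear sweep like this also decodes data embedded in code memory as if it were instructions, which is a known limitation of the approach.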
  // Excerpt: a few representative entries from the opcode table.
  const opcodeTable = {
    0x00: { mnemonic: 'NOP', bytes: 1, cycles: 1 },
    0x01: { mnemonic: 'AJMP', bytes: 2, cycles: 2, type: 'ajmp' },
    0x05: { mnemonic: 'INC', bytes: 2, cycles: 1, type: 'inc_dir' },
    0x15: { mnemonic: 'DEC', bytes: 2, cycles: 1, type: 'dec_dir' },
    0x22: { mnemonic: 'RET', bytes: 1, cycles: 2 },
    0x25: { mnemonic: 'ADD', bytes: 2, cycles: 1, type: 'add_a_dir' },
    // ORL C, /bit takes 2 machine cycles on the standard 8051.
    0xA0: { mnemonic: 'ORL', bytes: 2, cycles: 2, type: 'orl_c_nbit' },
    0xC0: { mnemonic: 'PUSH', bytes: 2, cycles: 2, type: 'push_dir' },
    0xC2: { mnemonic: 'CLR', bytes: 2, cycles: 1, type: 'clr_bit' },
    0xFF: { mnemonic: 'MOV', bytes: 1, cycles: 1, operand: 'R7, A' }
  };
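`mnemonic` is the instruction name shown in the disassembly. `bytes` is the total length of the instruction, so it determines both how many operand bytes follow the opcode and the address offset of the next instruction.
The SFRs occupy the direct address range 0x80~0xFF, so a data structure is needed to map those addresses to register names.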
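const sfrNames = {
  0x80: 'P0',   0x81: 'SP',   0x82: 'DPL',  0x83: 'DPH',
  0x87: 'PCON', 0x88: 'TCON', 0x89: 'TMOD', 0x8A: 'TL0',
  0x8B: 'TL1',  0x8C: 'TH0',  0x8D: 'TH1',  0x90: 'P1',
  0x98: 'SCON', 0x99: 'SBUF', 0xA0: 'P2',   0xA8: 'IE',
  0xB0: 'P3',   0xB8: 'IP',   0xD0: 'PSW',
  // 0xE0 is the accumulator, written ACC in direct addressing (e.g. PUSH ACC).
  // It appears only once: a second 0xE0 key would override this entry.
  0xE0: 'ACC',
  0xF0: 'B'
};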
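To show how such a map might be used, here is a minimal sketch of formatting a `direct` operand: addresses from 0x80 upward are looked up by name, and everything else falls back to a hex literal. The names `sfrSubset`, `hex8` and `formatDirect` are made up for this example, and the `30H` literal style is just one common assembler convention.

```javascript
// Trimmed, hypothetical copy of the SFR map for this example.
const sfrSubset = { 0x80: 'P0', 0x81: 'SP', 0x90: 'P1', 0xD0: 'PSW', 0xE0: 'ACC' };

// Format an 8-bit value as an assembler-style hex literal, e.g. 0x30 -> "30H".
function hex8(value) {
  const digits = value.toString(16).toUpperCase().padStart(2, '0');
  // A literal that starts with a letter gets a leading 0 so that it is
  // not mistaken for a symbol name.
  return (/^[A-F]/.test(digits) ? '0' : '') + digits + 'H';
}

// Direct addresses 0x80..0xFF are SFRs; use the name when one is known.
function formatDirect(addr, names) {
  return (addr >= 0x80 && names[addr]) || hex8(addr);
}
```

With this, the bytes `C0 D0` print as `PUSH PSW`, while `C0 30` prints as `PUSH 30H`. An SFR address that is missing from the map, such as 0xC8, falls back to `0C8H`.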
A similar data structure maps bit addresses to names:
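const bitNames = {
  // P0 sits at SFR address 0x80, so its bits are bit addresses 0x80~0x87.
  0x80: 'P0.0', 0x81: 'P0.1', 0x82: 'P0.2', 0x83: 'P0.3',
  0x84: 'P0.4', 0x85: 'P0.5', 0x86: 'P0.6', 0x87: 'P0.7',
  // ...
};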
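Bit names can also be computed instead of listed one by one. On the 8051, bit addresses 0x00~0x7F belong to the internal RAM bytes 0x20~0x2F, and bit addresses 0x80~0xFF belong to the SFR whose address is the bit address with its low three bits cleared. The sketch below relies on that layout; `bitSfrs` and `bitName` are hypothetical names, and this is an alternative to the lookup table, not what the tool itself does.

```javascript
// Hypothetical subset of bit-addressable SFRs (byte address -> name).
const bitSfrs = { 0x80: 'P0', 0x88: 'TCON', 0x90: 'P1', 0xD0: 'PSW', 0xE0: 'ACC' };

// Turn a bit address into a "byte.bit" style name.
function bitName(bitAddr, sfrs) {
  const bit = bitAddr & 0x07;
  const hex = (v) => '0x' + v.toString(16).toUpperCase();
  if (bitAddr < 0x80) {
    // Bit addresses 0x00..0x7F live in internal RAM bytes 0x20..0x2F.
    return hex(0x20 + (bitAddr >> 3)) + '.' + bit;
  }
  // Bit addresses 0x80..0xFF: the owning SFR is at (bitAddr & 0xF8).
  const base = bitAddr & 0xF8;
  return (sfrs[base] || hex(base)) + '.' + bit;
}
```

Under these assumptions `bitName(0x80, bitSfrs)` is `P0.0`, `bitName(0xD7, bitSfrs)` is `PSW.7` (the carry flag), and `bitName(0x00, bitSfrs)` is `0x20.0`.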
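Conclusion
The whole disassembler takes fewer than 700 lines of code, which shows how strong the AI's coding ability is. Our own code might need many rounds of iteration to become this compact, and it will also be easy to extend the tool to parse ARM or RISC-V later. Finally, here is the software discussed above: https://share.weiyun.com/DOY32236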