Commit 707295b

First stab at pickling extensions.
1 parent 6e78419 commit 707295b

2 files changed: +224 -18 lines changed

doc/sphinx/source/index.rst

+19-18
@@ -9,24 +9,25 @@ Coding Patterns for Python Extensions
 This describes reliable patterns of coding Python Extensions in C. It covers the essentials of reference counts, exceptions and creating functions that are safe and efficient.
 
 .. toctree::
-:numbered:
-:maxdepth: 3
-
-refcount
-exceptions
-canonical_function
-parsing_arguments
-new_types
-module_globals
-super_call
-compiler_flags
-debugging/debug
-memory_leaks
-thread_safety
-code_layout
-cpp
-miscellaneous
-further_reading
+:numbered:
+:maxdepth: 3
+
+refcount
+exceptions
+canonical_function
+parsing_arguments
+new_types
+module_globals
+super_call
+compiler_flags
+debugging/debug
+memory_leaks
+thread_safety
+code_layout
+cpp
+pickle
+miscellaneous
+further_reading
 
 
 Indices and tables

doc/sphinx/source/pickle.rst

+205
@@ -0,0 +1,205 @@
.. highlight:: c
    :linenothreshold: 10

.. toctree::
    :maxdepth: 2

====================================
Pickling and C Extensions
====================================

If you need to support pickling of the specialised types in your C extension then you need to implement some special methods.

This example shows how to provide pickle support for the ``custom2.Custom`` type described in the C extension tutorial in the
`Python documentation <https://docs.python.org/3/extending/newtypes_tutorial.html#adding-data-and-methods-to-the-basic-example>`_.

Pickle Version Control
-------------------------------

Since the whole point of ``pickle`` is persistence, pickled objects can hang around in databases, file systems, data from the `shelve <https://docs.python.org/3/library/shelve.html#module-shelve>`_ module and so on for a long time.
It is entirely possible that by the time an object is un-pickled your C extension has moved on, and then things become awkward.

It is *strongly* recommended that you add some form of version control to your pickled objects.
In this example I just have a single integer version number which I write to the pickled object.
If the number does not match on unpickling then I raise an exception.
When I change the type API I would, judiciously, change this version number.

Clearly more sophisticated strategies are possible, such as supporting older versions of the pickled object in some way, but this will do for now.

We add some simple pickle version information to the C extension:

.. code-block:: c

    static const char* PICKLE_VERSION_KEY = "_pickle_version";
    static int PICKLE_VERSION = 1;

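The "more sophisticated strategies" mentioned above could take the form of upgrading an old state dictionary one version at a time until it reaches the current version.
The following Python sketch illustrates the idea only: the version 2 layout (an extra ``middle`` field) and the names ``upgrade_v1_to_v2``, ``UPGRADERS`` and ``upgrade_state`` are hypothetical and are not part of the ``custom2`` example.

```python
import pickle

PICKLE_VERSION_KEY = "_pickle_version"
PICKLE_VERSION = 2  # Hypothetical: assume the type has moved on to version 2.


def upgrade_v1_to_v2(state):
    """Hypothetical upgrade: assume version 2 added a 'middle' field."""
    new_state = dict(state)
    new_state.setdefault("middle", "")
    new_state[PICKLE_VERSION_KEY] = 2
    return new_state


# Maps an old version number to the function that upgrades it by one version.
UPGRADERS = {1: upgrade_v1_to_v2}


def upgrade_state(state):
    """Return the state upgraded to PICKLE_VERSION, or raise if that is impossible."""
    if PICKLE_VERSION_KEY not in state:
        raise KeyError(f'No "{PICKLE_VERSION_KEY}" in pickled dict.')
    while state[PICKLE_VERSION_KEY] != PICKLE_VERSION:
        version = state[PICKLE_VERSION_KEY]
        if version not in UPGRADERS:
            raise ValueError(f"Can not upgrade pickle version {version!r}.")
        state = UPGRADERS[version](state)
    return state


# A state dictionary as it would have been written by version 1.
old_pickle = pickle.dumps(
    {"first": "FIRST", "last": "LAST", "number": 11, PICKLE_VERSION_KEY: 1}
)
new_state = upgrade_state(pickle.loads(old_pickle))
print(new_state)
```

In a C extension the same chain would run at the start of ``__setstate__``, before any member of the object is assigned.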
Now we can implement ``__getstate__`` and ``__setstate__``; think of these as symmetric operations. First ``__getstate__``.

Implementing ``__getstate__``
---------------------------------

``__getstate__`` is called when the object is pickled.
In this example it returns a dictionary of the internal state of the ``Custom`` object.
Note that a ``Custom`` object has two Python objects (``first`` and ``last``) and a C integer (``number``) that needs to be converted to a Python object.
We also need to add the version information.

Here is the C implementation:

.. code-block:: c

    /* Pickle the object */
    static PyObject *
    Custom___getstate__(CustomObject *self, PyObject *Py_UNUSED(ignored)) {
        /* "{sOsOsisi}" builds a dict: C string keys with two Python objects
         * and two C ints as values. */
        PyObject *ret = Py_BuildValue("{sOsOsisi}",
                                      "first", self->first,
                                      "last", self->last,
                                      "number", self->number,
                                      PICKLE_VERSION_KEY, PICKLE_VERSION);
        return ret;
    }

Implementing ``__setstate__``
---------------------------------

The implementation of ``__setstate__`` un-pickles the object.
This is a little more complicated as there is quite a lot of error checking going on.
We are being passed an arbitrary Python object and need to check:

* It is a Python dictionary.
* It has a version key and the version value is one that we can deal with.
* It has the required keys and values to populate our ``Custom`` object.

Note that our ``__new__`` method (``Custom_new()``) has already been called to create ``self``.
When setting a member we need to release the reference to the existing value set by ``Custom_new()``, otherwise we will have a memory leak.

.. code-block:: c

    /* Un-pickle the object */
    static PyObject *
    Custom___setstate__(CustomObject *self, PyObject *state) {
        /* Error check. */
        if (!PyDict_CheckExact(state)) {
            PyErr_SetString(PyExc_ValueError, "Pickled object is not a dict.");
            return NULL;
        }
        /* Version check. */
        /* Borrowed reference but no need to increment as we create a C long
         * from it. */
        PyObject *temp = PyDict_GetItemString(state, PICKLE_VERSION_KEY);
        if (temp == NULL) {
            /* PyDict_GetItemString does not set any error state so we have to. */
            PyErr_Format(PyExc_KeyError, "No \"%s\" in pickled dict.",
                         PICKLE_VERSION_KEY);
            return NULL;
        }
        int pickle_version = (int) PyLong_AsLong(temp);
        if (pickle_version == -1 && PyErr_Occurred()) {
            /* Not convertible to a C long; PyLong_AsLong has set the exception. */
            return NULL;
        }
        if (pickle_version != PICKLE_VERSION) {
            PyErr_Format(PyExc_ValueError,
                         "Pickle version mismatch. Got version %d but expected version %d.",
                         pickle_version, PICKLE_VERSION);
            return NULL;
        }
        /* Fetch and check every value before changing self so that a failure
         * leaves the object untouched. These are all borrowed references and,
         * as above, PyDict_GetItemString does not set any error state. */
        PyObject *first = PyDict_GetItemString(state, "first");
        if (first == NULL) {
            PyErr_SetString(PyExc_KeyError, "No \"first\" in pickled dict.");
            return NULL;
        }
        PyObject *last = PyDict_GetItemString(state, "last");
        if (last == NULL) {
            PyErr_SetString(PyExc_KeyError, "No \"last\" in pickled dict.");
            return NULL;
        }
        PyObject *number = PyDict_GetItemString(state, "number");
        if (number == NULL) {
            PyErr_SetString(PyExc_KeyError, "No \"number\" in pickled dict.");
            return NULL;
        }
        /* No need to incref number as we create a C long from it. */
        int number_value = (int) PyLong_AsLong(number);
        if (number_value == -1 && PyErr_Occurred()) {
            return NULL;
        }
        /* NOTE: Custom_new() will have been invoked so self->first and self->last
         * already hold references. Take our own reference to each new value,
         * then release the old one. */
        PyObject *old = self->first;
        Py_INCREF(first);
        self->first = first;
        Py_XDECREF(old);

        /* Similar to self->first above. */
        old = self->last;
        Py_INCREF(last);
        self->last = last;
        Py_XDECREF(old);

        self->number = number_value;

        Py_RETURN_NONE;
    }

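The checks listed above, and the fact that unpickling goes through ``__new__`` and ``__setstate__`` rather than ``__init__``, can be seen in a pure Python stand-in.
This is only a sketch: ``PyCustom`` is a hypothetical class that imitates the C type, and the explicit ``__new__`` and ``__setstate__`` calls spell out roughly what ``pickle.loads`` does for such an object.

```python
import pickle

PICKLE_VERSION_KEY = "_pickle_version"
PICKLE_VERSION = 1


class PyCustom:
    """Hypothetical pure Python stand-in for the C Custom type."""

    init_calls = 0  # Counts __init__ calls to show that unpickling skips it.

    def __new__(cls, *args, **kwargs):
        self = super().__new__(cls)
        # Like Custom_new(): members start out as empty strings and zero.
        self.first = ""
        self.last = ""
        self.number = 0
        return self

    def __init__(self, first="", last="", number=0):
        type(self).init_calls += 1
        self.first, self.last, self.number = first, last, number

    def __getstate__(self):
        return {"first": self.first, "last": self.last,
                "number": self.number, PICKLE_VERSION_KEY: PICKLE_VERSION}

    def __setstate__(self, state):
        if type(state) is not dict:
            raise ValueError("Pickled object is not a dict.")
        if PICKLE_VERSION_KEY not in state:
            raise KeyError(f'No "{PICKLE_VERSION_KEY}" in pickled dict.')
        if state[PICKLE_VERSION_KEY] != PICKLE_VERSION:
            raise ValueError("Pickle version mismatch.")
        for key in ("first", "last", "number"):
            if key not in state:
                raise KeyError(f'No "{key}" in pickled dict.')
        self.first = state["first"]
        self.last = state["last"]
        self.number = int(state["number"])


original = PyCustom("FIRST", "LAST", 11)
# The state dictionary is what ends up in the pickle.
pickled_state = pickle.dumps(original.__getstate__())

# Roughly what unpickling does: __new__ first, then __setstate__; never __init__.
result = PyCustom.__new__(PyCustom)
result.__setstate__(pickle.loads(pickled_state))
print(result.first, result.last, result.number, PyCustom.init_calls)
```

Because ``__init__`` is skipped, ``__setstate__`` is the only place where the restored values can be validated.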
Add the Special Methods
---------------------------------

Now we need to add these two special methods to the methods table, which now looks like this:

.. code-block:: c

    static PyMethodDef Custom_methods[] = {
        {"name", (PyCFunction) Custom_name, METH_NOARGS,
                "Return the name, combining the first and last name"
        },
        {"__getstate__", (PyCFunction) Custom___getstate__, METH_NOARGS,
                "Pickle the Custom object"
        },
        {"__setstate__", (PyCFunction) Custom___setstate__, METH_O,
                "Un-pickle the Custom object"
        },
        {NULL}  /* Sentinel */
    };

Example of Using ``custom2.Custom``
-------------------------------------

We can test this with Python code that pickles one object and then creates another object from that pickle:

.. code-block:: python

    import pickle

    import custom2

    original = custom2.Custom('FIRST', 'LAST', 11)
    print(
        f'original is {original} @ 0x{id(original):x} first: {original.first} last: {original.last}'
        f' number: {original.number} name: {original.name()}'
    )
    pickled_value = pickle.dumps(original)
    print(f'Pickled original is {pickled_value}')
    result = pickle.loads(pickled_value)
    print(
        f'result is {result} @ 0x{id(result):x} first: {result.first} last: {result.last}'
        f' number: {result.number} name: {result.name()}'
    )

Running this gives:

.. code-block:: sh

    $ python main.py
    original is <custom2.Custom object at 0x102b00810> @ 0x102b00810 first: FIRST last: LAST number: 11 name: FIRST LAST
    Pickled original is b'\x80\x04\x95[\x00\x00\x00\x00\x00\x00\x00\x8c\x07custom2\x94\x8c\x06Custom\x94\x93\x94)\x81\x94}\x94(\x8c\x05first\x94\x8c\x05FIRST\x94\x8c\x04last\x94\x8c\x04LAST\x94\x8c\x06number\x94K\x0b\x8c\x0f_pickle_version\x94K\x01ub.'
    result is <custom2.Custom object at 0x102a3f510> @ 0x102a3f510 first: FIRST last: LAST number: 11 name: FIRST LAST

So we have pickled one object and recreated a different, but equivalent, instance from that pickle.

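The pickled bytes shown above are hard to read, but the standard library ``pickletools`` module can list what they contain.
The sketch below is an illustration only: so that it runs without the compiled ``custom2`` module it pickles a plain dictionary with the same keys as the state returned by ``__getstate__``, and ``state_strings`` is a name made up here.

```python
import pickle
import pickletools

# Stand-in for the state dictionary that Custom.__getstate__ returns.
state = {"first": "FIRST", "last": "LAST", "number": 11, "_pickle_version": 1}
pickled_value = pickle.dumps(state, protocol=4)

# genops() yields an (opcode, argument, position) tuple for every pickle opcode.
# Collect the string arguments, which are the keys and the string values.
state_strings = [
    arg for _opcode, arg, _pos in pickletools.genops(pickled_value)
    if isinstance(arg, str)
]
print(state_strings)

# pickletools.dis() prints a full, human readable, disassembly.
pickletools.dis(pickled_value)
```

Seeing ``_pickle_version`` among the strings is a quick way to confirm that the version information really was written to the pickle.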
Pickling Objects with External State
-----------------------------------------

This is just a simple example. If your object relies on external state such as open files, databases and the like, you need to be careful, and knowledgeable, about your state management.

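One common approach, in the spirit of the "Handling Stateful Objects" section of the ``pickle`` documentation, is to leave the external resource out of the pickled state and re-acquire it when un-pickling.
The Python sketch below illustrates this for an open file. ``LineReader`` and its attributes are hypothetical, and a C extension would do the equivalent work inside its own ``__getstate__`` and ``__setstate__``.

```python
import os
import pickle
import tempfile


class LineReader:
    """Hypothetical reader that holds an open file, which can not be pickled."""

    def __init__(self, filename):
        self.filename = filename
        self.file = open(filename)
        self.lineno = 0

    def readline(self):
        self.lineno += 1
        return self.file.readline().rstrip("\n")

    def __getstate__(self):
        # Save only what is needed to re-open the file at the same place.
        return {"filename": self.filename, "lineno": self.lineno}

    def __setstate__(self, state):
        self.filename = state["filename"]
        self.lineno = state["lineno"]
        # Re-acquire the external resource and skip the lines already read.
        self.file = open(self.filename)
        for _ in range(self.lineno):
            self.file.readline()


with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
    f.write("one\ntwo\nthree\n")

reader = LineReader(f.name)
reader.readline()  # Consume "one".
# Only the state dictionary is pickled, never the open file object.
pickled_state = pickle.dumps(reader.__getstate__())

restored = LineReader.__new__(LineReader)
restored.__setstate__(pickle.loads(pickled_state))
next_line = restored.readline()  # Continues from the saved position.
print(next_line)

reader.file.close()
restored.file.close()
os.remove(f.name)
```

This only works if the file still exists, and is unchanged, when the object is un-pickled, which is exactly the kind of state management that needs care.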
References
-----------------------

* Python API documentation for `__setstate__ <https://docs.python.org/3/library/pickle.html#object.__setstate__>`_
* Python API documentation for `__getstate__ <https://docs.python.org/3/library/pickle.html#object.__getstate__>`_
* Useful documentation for `Handling Stateful Objects <https://docs.python.org/3/library/pickle.html#pickle-state>`_
* Python `pickle module <https://docs.python.org/3/library/pickle.html>`_
* Python `shelve module <https://docs.python.org/3/library/shelve.html>`_

