Unpickling a Python 2 object with Python 3
I'm wondering if there is a way to load an object that was pickled in Python 2.4, with Python 3.4.
I've been running 2to3 on a large amount of company legacy code to get it up to date.
Having done this, when running the file I get the following error:
      File "H:\fixers - 3.4\addressfixer - 3.4\trunk\lib\address\address_generic.py", line 382, in read_ref_files
        d = pickle.load(open(mshelffile, 'rb'))
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 1: ordinal not in range(128)
Looking at the pickled object in question, it's a dict in a dict, containing keys and values of type str.
So my question is: is there a way to load an object, originally pickled in Python 2.4, with Python 3.4?
You'll have to tell pickle.load() how to convert Python 2 bytestring data to Python 3 strings, or you can tell pickle to leave them as bytes.
The default is to try to decode all string data as ASCII, and that decoding fails. See the pickle.load() documentation:
Optional keyword arguments are fix_imports, encoding and errors, which are used to control compatibility support for pickle stream generated by Python 2. If fix_imports is true, pickle will try to map the old Python 2 names to the new names used in Python 3. The encoding and errors tell pickle how to decode 8-bit string instances pickled by Python 2; these default to ‘ASCII’ and ‘strict’, respectively. The encoding can be ‘bytes’ to read these 8-bit string instances as bytes objects.
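To see the effect of the encoding argument without a Python 2 interpreter at hand, here is a small sketch. The PY2_PICKLE bytes are a hand-built stand-in for what Python 2 would write for a str holding non-ASCII bytes; they are an assumption for illustration, not your actual file.

```python
import pickle

# Hand-built stand-in for a Python 2 pickle of the str 'caf\xc3\xa9'
# (SHORT_BINSTRING opcode 'U', length 5, the raw bytes, then STOP).
PY2_PICKLE = b'U\x05caf\xc3\xa9.'

# Default encoding is ASCII with strict errors, so non-ASCII bytes fail.
try:
    pickle.loads(PY2_PICKLE)
    default_failed = False
except UnicodeDecodeError:
    default_failed = True

# encoding='bytes' skips decoding and hands back the raw bytes.
as_bytes = pickle.loads(PY2_PICKLE, encoding='bytes')
```

With the default settings the load fails in the same way as in the question, while encoding='bytes' returns the data undecoded.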
Setting the encoding to latin1 allows you to import the data directly:
    with open(mshelffile, 'rb') as f:
        d = pickle.load(f, encoding='latin1')
but you'll need to verify that none of your strings are decoded using the wrong codec; Latin-1 works for any input as it maps the byte values 0-255 to the first 256 Unicode codepoints directly.
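As a rough illustration of what "decoded using the wrong codec" looks like, the sketch below assumes the original Python 2 strings were UTF-8 (an assumption; your data may use a different encoding). PY2_PICKLE is a hand-built stand-in for a Python 2 pickle and fix_mojibake is a hypothetical helper name. Because Latin-1 maps every byte to exactly one codepoint, the original bytes can always be recovered by encoding back to Latin-1.

```python
import pickle

# Hand-built stand-in for a Python 2 pickle of {'key': 'caf\xc3\xa9'},
# where the value holds UTF-8 encoded bytes (assumed for this example).
PY2_PICKLE = b'}U\x03keyU\x05caf\xc3\xa9s.'

d = pickle.loads(PY2_PICKLE, encoding='latin1')
value = d['key']  # loads without error, but each byte became one character


def fix_mojibake(text, real_encoding='utf-8'):
    """Undo a Latin-1 decode and redo it with the codec the data really used."""
    return text.encode('latin1').decode(real_encoding)


fixed = fix_mojibake(value)
```

If the strings in your pickle are plain ASCII, the Latin-1 result is already correct and no repair step is needed.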
The alternative would be to load the data with encoding='bytes', and decode all bytes keys and values afterwards.
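A minimal sketch of that second approach, assuming the data is a nested structure of dicts, lists and tuples and that the strings are UTF-8 (both are assumptions; adjust the codec to whatever your data actually uses). decode_bytes is a hypothetical helper, and PY2_NESTED is a hand-built stand-in for a Python 2 pickle of a dict in a dict.

```python
import pickle


def decode_bytes(obj, encoding='utf-8'):
    """Recursively decode bytes found in dicts, lists and tuples."""
    if isinstance(obj, bytes):
        return obj.decode(encoding)
    if isinstance(obj, dict):
        return {decode_bytes(k, encoding): decode_bytes(v, encoding)
                for k, v in obj.items()}
    if isinstance(obj, list):
        return [decode_bytes(item, encoding) for item in obj]
    if isinstance(obj, tuple):
        return tuple(decode_bytes(item, encoding) for item in obj)
    return obj  # ints, floats, None and so on pass through unchanged


# Hand-built stand-in for a Python 2 pickle of {'outer': {'key': 'caf\xc3\xa9'}}.
PY2_NESTED = b'}U\x05outer}U\x03keyU\x05caf\xc3\xa9ss.'

raw = pickle.loads(PY2_NESTED, encoding='bytes')
decoded = decode_bytes(raw)
```

Decoding with the right codec raises an error on bad input instead of silently producing wrong characters, which is the main advantage over the latin1 shortcut.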