Stuff What I Posted: 12/12/2010

Friday 17 December 2010

Pickling

One of the use cases I have for unpickling is context-dependent whitelisting. You can easily do this by instantiating a cPickle.Unpickler object and setting a replacement find_global function on it.

        def find_global(moduleName, className):
            # Apply any configured renames before doing the lookup.
            t = namespaceSubstitutions.get((moduleName, className), None)
            if t is not None:
                moduleName, className = t

            # Check the whitelist before importing, so a rejected module is never loaded.
            if moduleName +"."+ className not in namespaceWhitelist and moduleName not in namespaceWhitelist:
                raise cPickle.UnpicklingError("%s.%s is not whitelisted for unpickling" %
                   (moduleName, className))

            mod = __import__(moduleName, globals(), locals(), [])
            # This won't have given us "X.Y.B", but rather "X".  So walk down to "B".
            for subModuleName in moduleName.split(".")[1:]:
                mod = getattr(mod, subModuleName)

            return getattr(mod, className)

        unpickler = cPickle.Unpickler(StringIO(packet))
        unpickler.find_global = find_global

        return unpickler.load()

This allows a whitelist to be enforced for specific cases where unpickling is done, perhaps for objects coming over the wire, but not for objects read from disk.
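For comparison, here is a minimal sketch of the same idea in Python 3 spelling, where the hook is a find_class method overridden on a pickle.Unpickler subclass rather than a find_global attribute. The namespace_whitelist contents and the whitelisted_loads helper are made up for illustration, and the substitution table is left out.

```python
import io
import pickle
from collections import OrderedDict

# Hypothetical whitelist: entries are either "module" or "module.ClassName".
namespace_whitelist = {"collections.OrderedDict"}


class WhitelistUnpickler(pickle.Unpickler):
    def find_class(self, module_name, class_name):
        qualified = module_name + "." + class_name
        # Reject before importing anything, so unlisted modules are never loaded.
        if (qualified not in namespace_whitelist
                and module_name not in namespace_whitelist):
            raise pickle.UnpicklingError(
                "%s is not whitelisted for unpickling" % qualified)
        return super().find_class(module_name, class_name)


def whitelisted_loads(packet):
    # Only data arriving through this function is subject to the whitelist.
    return WhitelistUnpickler(io.BytesIO(packet)).load()
```

Anything unpickled through whitelisted_loads is checked, while plain pickle.loads elsewhere (say, for files on disk) is unaffected.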

Another use case I have, this time for pickling, is to transform objects that are being pickled into forms that are compatible with whatever is unpickling them. Let's say your server application is built using a framework so complicated and heavyweight that whatever machine it runs on literally groans at the effort. Let's call it Twifted [1]. Your client application, however, has to be extremely lightweight, and doesn't need all the functionality that the server application does. So in order to make programming on the server more natural, you allow programmers to send objects in handy Twifted form over the wire to the client, perhaps as return values from your RPC calls.

In order to do this, you hook into pickling. Now copy_reg.pickle allows you to install global functions that can transform objects before they get pickled. But you want to do it on a case-by-case basis: it needs to be done for the objects that get sent over the wire, but not for the ones that go to disk. And the Pickler object doesn't have a friendly overridable function that allows you to do this, in the way find_global does for Unpickler.

This is one possibility:

import copy_reg
import cPickle

reducers = {}

def register_pickle_convertor(source_type, convertor_func):
    global reducers
    # Use the limited copy_reg API to do a trial install and to validate its correctness.
    copy_reg.pickle(source_type, convertor_func)
    # Now unregister it directly, because there is no API to do this.
    del copy_reg.dispatch_table[source_type]

    # Put the validated reducer in our back pocket for use as required.
    print "Registered sake RPC convertor for objects of type %r" % source_type
    reducers[source_type] = convertor_func

def modified_cPickle_dumps(obj):
    global reducers
    copy_reg.dispatch_table.update(reducers)
    try:
        return cPickle.dumps(obj, cPickle.HIGHEST_PROTOCOL)
    finally:
        # Remove our influence.
        for source_type in reducers:
            del copy_reg.dispatch_table[source_type]

I haven't put that much thought into it, but I'd like to find a better solution.
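One possibly better solution, if Python 3.3 or later is an option: pickle.Pickler instances there accept a dispatch_table attribute of their own, so the reducers can be scoped to a single pickler without touching the global table. A minimal sketch, where TwiftedThing, reduce_twifted_thing and wire_dumps are hypothetical names invented for the example:

```python
import copyreg
import io
import pickle


class TwiftedThing(object):
    """Hypothetical heavyweight server-side type."""
    def __init__(self, value):
        self.value = value


def reduce_twifted_thing(obj):
    # Send a plain dict to the client instead of the framework object.
    return dict, ({"value": obj.value},)


# Reducers that apply only to objects going over the wire.
wire_reducers = {TwiftedThing: reduce_twifted_thing}


def wire_dumps(obj):
    buf = io.BytesIO()
    pickler = pickle.Pickler(buf, pickle.HIGHEST_PROTOCOL)
    # A private table for this pickler; copyreg.dispatch_table is left alone.
    pickler.dispatch_table = copyreg.dispatch_table.copy()
    pickler.dispatch_table.update(wire_reducers)
    pickler.dump(obj)
    return buf.getvalue()
```

Because nothing global is modified, there is no install and uninstall dance around the dumps call, and pickling to disk elsewhere keeps its normal behaviour.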

[1] I've never used Twisted, and the choice of this name has no bearing on all the suggestions to do so that I have ignored over the years.

Thursday 16 December 2010

unittest.py quirk

I am using Python 2.7 and whatever version of the unittest module comes with it. Maybe this is fixed over in the (still not relevant to me) 3.x branches. Anyway, oftentimes I will set up some simple mocking in the setUp method of my test case, and then the test method will error in a way that is unrelated to my mocking. But unittest handles the error before tearDown is called, and that error handling collides with my mocking, so instead of finding out which test failed I find out about the collision.

It goes something like this.

import __builtin__
import unittest

class MyTESTSSSS(unittest.TestCase):
    def setUp(self):
        self.oldOpen = open
        __builtin__.open = ReplacementOpenFunc

    def testSomething(self):
        1/0

Of course, that may not reproduce the problem; it's just for the purpose of illustration. Anyway, what's my point... oh, in order to work around this, I now have to structure my tests in a way that enforces pre-error teardown.

class MyTESTSSSS(unittest.TestCase):
    def setUp(self):
        self.oldOpen = open
        __builtin__.open = ReplacementOpenFunc

    def preErrorTearDown(self):
        __builtin__.open = self.oldOpen

    def testSomething(self):
        try:
            self._testSomething()
        finally:
            self.preErrorTearDown()

    def _testSomething(self):
        1/0

Has anyone else encountered this problem and taken a different approach to recovering from it?

If the answer involves a mocking framework, decorators, or context managers, then it is the wrong answer for me :-)

test_repr.py

Note to self. test_repr.py in the Python test suite does lots of tests based on long file names. However, you don't need to have your source code under too many sub-directories on Windows before a test hits the path length limit and fails unexpectedly. I think in my case it was c:\Users\MyName\Stackless\stackless-branches\release27-maint\release27-maint-export\release27-maint\...
