Really Complicated Documents

The library supports arbitrarily structured objects through the type expression feature. Expressions can include sequences, collections and nested messages, covering most application persistence requirements.

Type expressions can also include the PointerTo type. This is the mechanism by which applications can store and recover graph objects - linked lists and trees. Even something as convoluted as an AST (i.e. abstract syntax tree) for a regular expression, can be stored and recovered.

The Python memory model is a handle-based model - there is no explicit support for pointers as there is in languages such as C++ and Go. However, handles can be managed in a way that simulates pointer behaviour and this is what the library PointerTo type is for.

Value Versus Pointer

To understand how the library can store and recover complex graph objects, the next few paragraphs demonstrate the use of PointerTo and how that manifests in the stored representations.

Consider the following Python script:

>> import ansar as ar
>> f = ar.File('flag', bool)
>> b = False
>> f.store(b)

This produces perhaps the simplest possible application object:

{
    "value": false
}

By making a single change to the type expression:

>> import ansar as ar
>> f = ar.File('pointer-to-flag', ar.PointerTo(bool))
>> b = False
>> f.store(b)

The change in representation illustrates the effect of the change in type expression:

{
    "aliased": [
        [
            1100,
            false
        ]
    ],
    "value": 1100
}

The value has changed from false to the small integer 1100 and that same integer value appears in the aliased list, paired up with the false value.

There is no obvious difference when recovering the representation:

>>> r, _ = f.recover()
>>> r
False

Many Pointers To The Same Value

The next step is to demonstrate how the management of PointerTo objects can be used to share a common object. An example of sharing a value in Python, looks like this:

>> b = True
>> a = [b, b, b, b]

Creating a stored representation looks like this:

>> f = ar.File('array-of-pointer', ar.ArrayOf(ar.PointerTo(bool), 4))
>> f.store(a)

The representation records the single instance of a bool and the multiple references to that instance:

{
    "aliased": [
        [
            1100,
            true
        ]
    ],
    "value": [
        1100,
        1100,
        1100,
        1100
    ]
}

After recovering this representation, a few careful Python expressions expose the true reason for the existence of PointerTo:

>>> r, _ = f.recover()
>>> r
[True, True, True, True]
>>> id(r[0]) == id(r[2])
True
>>> id(r[0]) == id(r[3])
True

Comparing the id()s of the array elements shows that the recovered array r is populated with the single handle, referring to a common bool value.

Exploiting this ability to manage Python handles leads to much more significant constructions, such as linked lists. The following sections show how such lists can be built, stored and recovered.

A Linked List

Consider the following class declaration:

import ansar as ar

class Node(ar.Message):
    def __init__(self, next=None):
        ar.Message.__init__(self)
        self.next = next

ar.bind(Node, type_details={
    'next': ar.PointerTo(Node),
})

A Node contains a single member next that refers to the following instance of a Node, presumably until there is a None value. This Python session demonstrates the construction and persistence of a linked-list of Nodes:

>>> next = None
>>> for i in range(4):
...     c = Node(next)
...     next = c
...
>>> c
<__main__.Node object at 0x7f711cfc0550>
>>> c.next
<__main__.Node object at 0x7f711cfc0510>
>>> c.next.next
<__main__.Node object at 0x7f711cfc0490>
>>> c.next.next.next
<__main__.Node object at 0x7f711cfc04d0>
>>> c.next.next.next.next
>>>
>>> f = ar.File('linked-list', ar.PointerTo(Node))
>>> f.store(c)
>>> r, _ = f.recover()
>>> r
<__main__.Node object at 0x7f711cfc0d50>
>>> r.next
<__main__.Node object at 0x7f711cfc80d0>
>>> r.next.next
<__main__.Node object at 0x7f711cfc8190>
>>> r.next.next.next
<__main__.Node object at 0x7f711cfc8250>
>>> r.next.next.next.next

The for loop creates a linked-list in a reverse direction - the first Node object created is eventually the last of 4 Nodes in the list. The list passes through store() and recover() operations. The recovered list is examined in the same manner as the original, verifing that all links are in place and that the list is properly terminated (i.e. the final next member is set to the value None).

The stored representation looks like this:

{
    "aliased": [
        [
            1100,
            {
                "next": 1101
            }
        ],
        [
            1101,
            {
                "next": 1102
            }
        ],
        [
            1102,
            {
                "next": 1103
            }
        ],
        [
            1103,
            {
                "next": null
            }
        ]
    ],
    "value": 1100
}

A Map Of Loops

Linking can be combined with the structured types. Here is the construction of a map of loops:

>>> f = ar.File('map-loop', ar.MapOf(str,ar.PointerTo(Node)))
>>> m = {}
>>> def loop():
...      next = None
...      for x in range(4):
...          n = Node(next)
...          next = n
...      next.next.next.next.next = next
...      return next
...
>>> m['bjorn']=loop()
>>> m['sven']=loop()
>>> m['hilda']=loop()
>>> m['freya']=loop()
>>> bjorn = m['bjorn']
>>> bjorn
<__main__.Node object at 0x7f629da00690>
>>> bjorn.next.next.next.next
<__main__.Node object at 0x7f629da00690>
>>> bjorn.next.next.next.next.next.next.next.next
<__main__.Node object at 0x7f629da00690>
>>> f.store(m)
>>> r, _ = f.recover()
>>>
... bjorn = r['bjorn']
>>> bjorn
<__main__.Node object at 0x7f629da07650>
>>> bjorn.next.next.next.next
<__main__.Node object at 0x7f629da07650>

A dict is populated with named loops. This creates a total of 16 Node objects divided across 4 loops. This can all be traced through the following representation:

{
    "aliased": [
        [
            1100,
            {
                "next": 1101
            }
        ],
        [
            1101,
            {
                "next": 1102
            }
        ],
        [
            1102,
            {
                "next": 1100
            }
        ],
        [
            1103,
            {
                "next": 1104
            }
        ],
        [
            1104,
            {
                "next": 1105
            }
        ],
        [
            1105,
            {
                "next": 1103
            }
        ],
        [
            1106,
            {
                "next": 1107
            }
        ],
        [
            1107,
            {
                "next": 1108
            }
        ],
        [
            1108,
            {
                "next": 1106
            }
        ],
        [
            1109,
            {
                "next": 1110
            }
        ],
        [
            1110,
            {
                "next": 1111
            }
        ],
        [
            1111,
            {
                "next": 1109
            }
        ]
    ],
    "value": [
        [
            "bjorn",
            1100
        ],
        [
            "sven",
            1103
        ],
        [
            "hilda",
            1106
        ],
        [
            "freya",
            1109
        ]
    ]
}

Warning

The entries in the MapOf must be PointerTo(Node) not plain Node.

A Document With Everything

This section combines all the constructs presented in the preceding sections. The result is a single application object that includes significant structuring and graph object elements. The techniques demonstrated should be enough to serve as the basis for almost any imaginable data persistence requirement.

The class declaration looks like this:

import ansar as ar

class Node(ar.Message):
    def __init__(self, next=None):
        ar.Message.__init__(self)
        self.next = next

ar.bind(Node, type_details={
    'next': ar.PointerTo(Node),
})

class Doc(ar.Message):
    def __init__(self, flag=None, pointer_to_flag=None, array_to_flag=None, linked_list=None, loop=None, map_of_loops=None):
        ar.Message.__init__(self)
        self.flag=flag
        self.pointer_to_flag=pointer_to_flag
        self.array_to_flag=array_to_flag
        self.linked_list=linked_list
        self.loop=loop
        self.map_of_loops=map_of_loops

ar.bind(Doc, type_details={
    'flag': bool,
    'pointer_to_flag': ar.PointerTo(bool),
    'array_to_flag': ar.ArrayOf(ar.PointerTo(bool), 4),
    'linked_list': ar.PointerTo(Node),
    'loop': ar.PointerTo(Node),
    'map_of_loops': ar.MapOf(str,ar.PointerTo(Node)),
})

Construction and verification of a recovered Doc looks like:

>>> f = ar.File('doc', Doc)
>>> d = Doc()
>>> def links():
...     next = None
...     for x in range(4):
...         n = Node(next)
...         next = n
...     return next
...
>>> def loop():
...     next = links()
...     next.next.next.next.next = next
...     return next
...
>>> flag = True
>>> d.flag = flag
>>> d.pointer_to_flag = flag
>>> d.array_to_flag = [flag, flag, flag, flag]
>>> d.linked_list = links()
>>> d.loop = loop()
>>> m = {}
>>> m['bjorn']=loop()
>>> m['sven']= loop()
>>> m['hilda']=loop()
>>> m['freya']=loop()
>>> d.map_of_loops = m
>>>
>>> f.store(d)
>>> r, _ = f.recover()
>>>
... bjorn = r.map_of_loops['bjorn']
>>> bjorn
<__main__.Node object at 0x7f73f3521490>
>>> bjorn.next.next.next.next
<__main__.Node object at 0x7f73f3521490>
>>> bjorn.next.next.next.next.next.next.next.next
<__main__.Node object at 0x7f73f3521490>

The complete representation appears below:

{
    "aliased": [
        [
            1100,
            true
        ],
        [
            1101,
            {
                "next": 1102
            }
        ],
        [
            1102,
            {
                "next": 1103
            }
        ],
        [
            1103,
            {
                "next": 1104
            }
        ],
        [
            1104,
            {
                "next": null
            }
        ],
        [
            1105,
            {
                "next": 1106
            }
        ],
        [
            1106,
            {
                "next": 1107
            }
        ],
        [
            1107,
            {
                "next": 1108
            }
        ],
        [
            1108,
            {
                "next": 1105
            }
        ],
        [
            1109,
            {
                "next": 1110
            }
        ],
        [
            1110,
            {
                "next": 1111
            }
        ],
        [
            1111,
            {
                "next": 1112
            }
        ],
        [
            1112,
            {
                "next": 1109
            }
        ],
        [
            1113,
            {
                "next": 1114
            }
        ],
        [
            1114,
            {
                "next": 1115
            }
        ],
        [
            1115,
            {
                "next": 1116
            }
        ],
        [
            1116,
            {
                "next": 1113
            }
        ],
        [
            1117,
            {
                "next": 1118
            }
        ],
        [
            1118,
            {
                "next": 1119
            }
        ],
        [
            1119,
            {
                "next": 1120
            }
        ],
        [
            1120,
            {
                "next": 1117
            }
        ],
        [
            1121,
            {
                "next": 1122
            }
        ],
        [
            1122,
            {
                "next": 1123
            }
        ],
        [
            1123,
            {
                "next": 1124
            }
        ],
        [
            1124,
            {
                "next": 1121
            }
        ]
    ],
    "value": {
        "array_to_flag": [
            1100,
            1100,
            1100,
            1100
        ],
        "flag": true,
        "linked_list": 1101,
        "loop": 1105,
        "map_of_loops": [
            [
                "bjorn",
                1109
            ],
            [
                "sven",
                1113
            ],
            [
                "hilda",
                1117
            ],
            [
                "freya",
                1121
            ]
        ],
        "pointer_to_flag": 1100
    }
}