Really Complicated Documents¶
The library supports arbitrarily structured objects through the type expression feature. Expressions can include sequences, collections and nested messages, covering most application persistence requirements.
Type expressions can also include the PointerTo type. This is the mechanism by which applications can store and recover graph objects - linked lists and trees. Even something as convoluted as an AST (i.e. abstract syntax tree) for a regular expression, can be stored and recovered.
The Python memory model is a handle-based model - there is no explicit support for pointers as there is in languages such as C++ and Go. However, handles can be managed in a way that simulates pointer behaviour and this is what the library PointerTo type is for.
Value Versus Pointer¶
To understand how the library can store and recover complex graph objects, the next few paragraphs demonstrate the use of PointerTo and how that manifests in the stored representations.
Consider the following Python script:
>> import ansar as ar
>> f = ar.File('flag', bool)
>> b = False
>> f.store(b)
This produces perhaps the simplest possible application object:
{
"value": false
}
By making a single change to the type expression:
>> import ansar as ar
>> f = ar.File('pointer-to-flag', ar.PointerTo(bool))
>> b = False
>> f.store(b)
The change in representation illustrates the effect of the change in type expression:
{
"aliased": [
[
1100,
false
]
],
"value": 1100
}
The value
has changed from false
to the small integer 1100
and
that same integer value appears in the aliased
list, paired up with
the false
value.
There is no obvious difference when recovering the representation:
>>> r, _ = f.recover()
>>> r
False
Many Pointers To The Same Value¶
The next step is to demonstrate how the management of PointerTo objects can be used to share a common object. An example of sharing a value in Python, looks like this:
>> b = True
>> a = [b, b, b, b]
Creating a stored representation looks like this:
>> f = ar.File('array-of-pointer', ar.ArrayOf(ar.PointerTo(bool), 4))
>> f.store(a)
The representation records the single instance of a bool
and
the multiple references to that instance:
{
"aliased": [
[
1100,
true
]
],
"value": [
1100,
1100,
1100,
1100
]
}
After recovering this representation, a few careful Python expressions expose the true reason for the existence of PointerTo:
>>> r, _ = f.recover()
>>> r
[True, True, True, True]
>>> id(r[0]) == id(r[2])
True
>>> id(r[0]) == id(r[3])
True
Comparing the id()s
of the array elements shows that the recovered array
r
is populated with the single handle, referring to a common bool
value.
Exploiting this ability to manage Python handles leads to much more significant constructions, such as linked lists. The following sections show how such lists can be built, stored and recovered.
A Linked List¶
Consider the following class declaration:
import ansar as ar
class Node(ar.Message):
def __init__(self, next=None):
ar.Message.__init__(self)
self.next = next
ar.bind(Node, type_details={
'next': ar.PointerTo(Node),
})
A Node
contains a single member next
that refers to the following
instance of a Node
, presumably until there is a None
value. This
Python session demonstrates the construction and persistence of a linked-list
of Nodes
:
>>> next = None
>>> for i in range(4):
... c = Node(next)
... next = c
...
>>> c
<__main__.Node object at 0x7f711cfc0550>
>>> c.next
<__main__.Node object at 0x7f711cfc0510>
>>> c.next.next
<__main__.Node object at 0x7f711cfc0490>
>>> c.next.next.next
<__main__.Node object at 0x7f711cfc04d0>
>>> c.next.next.next.next
>>>
>>> f = ar.File('linked-list', ar.PointerTo(Node))
>>> f.store(c)
>>> r, _ = f.recover()
>>> r
<__main__.Node object at 0x7f711cfc0d50>
>>> r.next
<__main__.Node object at 0x7f711cfc80d0>
>>> r.next.next
<__main__.Node object at 0x7f711cfc8190>
>>> r.next.next.next
<__main__.Node object at 0x7f711cfc8250>
>>> r.next.next.next.next
The for
loop creates a linked-list in a reverse direction - the first
Node
object created is eventually the last of 4 Nodes
in the list.
The list passes through store()
and recover()
operations. The recovered
list is examined in the same manner as the original, verifing that all
links are in place and that the list is properly terminated (i.e. the final
next
member is set to the value None
).
The stored representation looks like this:
{
"aliased": [
[
1100,
{
"next": 1101
}
],
[
1101,
{
"next": 1102
}
],
[
1102,
{
"next": 1103
}
],
[
1103,
{
"next": null
}
]
],
"value": 1100
}
A Loop Of Links¶
A single additional line of Python links the final Node
back to the
start of the list, creating an endless loop. The library detects “cycles”
or “circular references” and uses back-patching to complete complex
constructions of this nature. This ability is necessary for the proper
handling of graph objects:
>>> next = None
>>> for i in range(4):
... c = Node(next)
... next = c
...
>>> c
<__main__.Node object at 0x7fe4c7150650>
>>> c.next.next.next.next
>>> c.next.next.next.next = c
>>> c.next.next.next.next
<__main__.Node object at 0x7fe4c7150650>
>>> c.next.next.next.next.next.next.next.next
<__main__.Node object at 0x7fe4c7150650>
>>>
>>> f = ar.File('linked-loop', ar.PointerTo(Node))
>>> f.store(c)
>>>
>>> r, _ = f.recover()
>>> r
<__main__.Node object at 0x7fe4c7156150>
>>> r.next.next.next.next
<__main__.Node object at 0x7fe4c7156150>
>>>
The assignment of the start Node
to the fourth link, closes the loop. The
loop is then passed through a store and recover process. The start object and
final link of the recovered loop, are shown to be the same address. The
stored representation looks like:
{
"aliased": [
[
1100,
{
"next": 1101
}
],
[
1101,
{
"next": 1102
}
],
[
1102,
{
"next": 1103
}
],
[
1103,
{
"next": 1100
}
]
],
"value": 1100
}
A Map Of Loops¶
Linking can be combined with the structured types. Here is the construction of a map of loops:
>>> f = ar.File('map-loop', ar.MapOf(str,ar.PointerTo(Node)))
>>> m = {}
>>> def loop():
... next = None
... for x in range(4):
... n = Node(next)
... next = n
... next.next.next.next.next = next
... return next
...
>>> m['bjorn']=loop()
>>> m['sven']=loop()
>>> m['hilda']=loop()
>>> m['freya']=loop()
>>> bjorn = m['bjorn']
>>> bjorn
<__main__.Node object at 0x7f629da00690>
>>> bjorn.next.next.next.next
<__main__.Node object at 0x7f629da00690>
>>> bjorn.next.next.next.next.next.next.next.next
<__main__.Node object at 0x7f629da00690>
>>> f.store(m)
>>> r, _ = f.recover()
>>>
... bjorn = r['bjorn']
>>> bjorn
<__main__.Node object at 0x7f629da07650>
>>> bjorn.next.next.next.next
<__main__.Node object at 0x7f629da07650>
A dict
is populated with named loops. This creates a total of 16 Node
objects
divided across 4 loops. This can all be traced through the following representation:
{
"aliased": [
[
1100,
{
"next": 1101
}
],
[
1101,
{
"next": 1102
}
],
[
1102,
{
"next": 1100
}
],
[
1103,
{
"next": 1104
}
],
[
1104,
{
"next": 1105
}
],
[
1105,
{
"next": 1103
}
],
[
1106,
{
"next": 1107
}
],
[
1107,
{
"next": 1108
}
],
[
1108,
{
"next": 1106
}
],
[
1109,
{
"next": 1110
}
],
[
1110,
{
"next": 1111
}
],
[
1111,
{
"next": 1109
}
]
],
"value": [
[
"bjorn",
1100
],
[
"sven",
1103
],
[
"hilda",
1106
],
[
"freya",
1109
]
]
}
Warning
The entries in the MapOf must be PointerTo(Node) not
plain Node
.
A Document With Everything¶
This section combines all the constructs presented in the preceding sections. The result is a single application object that includes significant structuring and graph object elements. The techniques demonstrated should be enough to serve as the basis for almost any imaginable data persistence requirement.
The class declaration looks like this:
import ansar as ar
class Node(ar.Message):
def __init__(self, next=None):
ar.Message.__init__(self)
self.next = next
ar.bind(Node, type_details={
'next': ar.PointerTo(Node),
})
class Doc(ar.Message):
def __init__(self, flag=None, pointer_to_flag=None, array_to_flag=None, linked_list=None, loop=None, map_of_loops=None):
ar.Message.__init__(self)
self.flag=flag
self.pointer_to_flag=pointer_to_flag
self.array_to_flag=array_to_flag
self.linked_list=linked_list
self.loop=loop
self.map_of_loops=map_of_loops
ar.bind(Doc, type_details={
'flag': bool,
'pointer_to_flag': ar.PointerTo(bool),
'array_to_flag': ar.ArrayOf(ar.PointerTo(bool), 4),
'linked_list': ar.PointerTo(Node),
'loop': ar.PointerTo(Node),
'map_of_loops': ar.MapOf(str,ar.PointerTo(Node)),
})
Construction and verification of a recovered Doc
looks like:
>>> f = ar.File('doc', Doc)
>>> d = Doc()
>>> def links():
... next = None
... for x in range(4):
... n = Node(next)
... next = n
... return next
...
>>> def loop():
... next = links()
... next.next.next.next.next = next
... return next
...
>>> flag = True
>>> d.flag = flag
>>> d.pointer_to_flag = flag
>>> d.array_to_flag = [flag, flag, flag, flag]
>>> d.linked_list = links()
>>> d.loop = loop()
>>> m = {}
>>> m['bjorn']=loop()
>>> m['sven']= loop()
>>> m['hilda']=loop()
>>> m['freya']=loop()
>>> d.map_of_loops = m
>>>
>>> f.store(d)
>>> r, _ = f.recover()
>>>
... bjorn = r.map_of_loops['bjorn']
>>> bjorn
<__main__.Node object at 0x7f73f3521490>
>>> bjorn.next.next.next.next
<__main__.Node object at 0x7f73f3521490>
>>> bjorn.next.next.next.next.next.next.next.next
<__main__.Node object at 0x7f73f3521490>
The complete representation appears below:
{
"aliased": [
[
1100,
true
],
[
1101,
{
"next": 1102
}
],
[
1102,
{
"next": 1103
}
],
[
1103,
{
"next": 1104
}
],
[
1104,
{
"next": null
}
],
[
1105,
{
"next": 1106
}
],
[
1106,
{
"next": 1107
}
],
[
1107,
{
"next": 1108
}
],
[
1108,
{
"next": 1105
}
],
[
1109,
{
"next": 1110
}
],
[
1110,
{
"next": 1111
}
],
[
1111,
{
"next": 1112
}
],
[
1112,
{
"next": 1109
}
],
[
1113,
{
"next": 1114
}
],
[
1114,
{
"next": 1115
}
],
[
1115,
{
"next": 1116
}
],
[
1116,
{
"next": 1113
}
],
[
1117,
{
"next": 1118
}
],
[
1118,
{
"next": 1119
}
],
[
1119,
{
"next": 1120
}
],
[
1120,
{
"next": 1117
}
],
[
1121,
{
"next": 1122
}
],
[
1122,
{
"next": 1123
}
],
[
1123,
{
"next": 1124
}
],
[
1124,
{
"next": 1121
}
]
],
"value": {
"array_to_flag": [
1100,
1100,
1100,
1100
],
"flag": true,
"linked_list": 1101,
"loop": 1105,
"map_of_loops": [
[
"bjorn",
1109
],
[
"sven",
1113
],
[
"hilda",
1117
],
[
"freya",
1121
]
],
"pointer_to_flag": 1100
}
}