.. _really-complicated-documents: Really Complicated Documents **************************** The library supports arbitrarily structured objects through the type expression feature. Expressions can include sequences, collections and nested messages, covering most application persistence requirements. Type expressions can also include the :ref:`PointerTo` type. This is the mechanism by which applications can store and recover `graph` objects - linked lists and trees. Even something as convoluted as an AST (i.e. abstract syntax tree) for a regular expression, can be stored and recovered. The Python memory model is a handle-based model - there is no explicit support for pointers as there is in languages such as C++ and Go. However, handles can be managed in a way that simulates pointer behaviour and this is what the library :ref:`PointerTo` type is for. Value Versus Pointer ==================== To understand how the library can store and recover complex graph objects, the next few paragraphs demonstrate the use of :ref:`PointerTo` and how that manifests in the stored representations. Consider the following Python script:: >> import ansar as ar >> f = ar.File('flag', bool) >> b = False >> f.store(b) This produces perhaps the simplest possible application object: .. literalinclude:: include/bool.json By making a single change to the type expression:: >> import ansar as ar >> f = ar.File('pointer-to-flag', ar.PointerTo(bool)) >> b = False >> f.store(b) The change in representation illustrates the effect of the change in type expression: .. literalinclude:: include/pointer-to-bool.json The ``value`` has changed from ``false`` to the small integer ``1100`` and that same integer value appears in the ``aliased`` list, paired up with the ``false`` value. There is no obvious difference when recovering the representation:: >>> r, _ = f.recover() >>> r False Many Pointers To The Same Value ------------------------------- The next step is to demonstrate how the management of :ref:`PointerTo` objects can be used to share a common object. An example of sharing a value in Python, looks like this:: >> b = True >> a = [b, b, b, b] Creating a stored representation looks like this:: >> f = ar.File('array-of-pointer', ar.ArrayOf(ar.PointerTo(bool), 4)) >> f.store(a) The representation records the single instance of a ``bool`` and the multiple references to that instance: .. literalinclude:: include/array-of-pointer.json After recovering this representation, a few careful Python expressions expose the true reason for the existence of :ref:`PointerTo`:: >>> r, _ = f.recover() >>> r [True, True, True, True] >>> id(r[0]) == id(r[2]) True >>> id(r[0]) == id(r[3]) True Comparing the ``id()s`` of the array elements shows that the recovered array ``r`` is populated with the single handle, referring to a common ``bool`` value. Exploiting this ability to manage Python handles leads to much more significant constructions, such as linked lists. The following sections show how such lists can be built, stored and recovered. A Linked List ============= Consider the following class declaration: .. literalinclude:: include/linked-list.py A ``Node`` contains a single member ``next`` that refers to the following instance of a ``Node``, presumably until there is a ``None`` value. This Python session demonstrates the construction and persistence of a linked-list of ``Nodes``:: >>> next = None >>> for i in range(4): ... c = Node(next) ... next = c ... >>> c <__main__.Node object at 0x7f711cfc0550> >>> c.next <__main__.Node object at 0x7f711cfc0510> >>> c.next.next <__main__.Node object at 0x7f711cfc0490> >>> c.next.next.next <__main__.Node object at 0x7f711cfc04d0> >>> c.next.next.next.next >>> >>> f = ar.File('linked-list', ar.PointerTo(Node)) >>> f.store(c) >>> r, _ = f.recover() >>> r <__main__.Node object at 0x7f711cfc0d50> >>> r.next <__main__.Node object at 0x7f711cfc80d0> >>> r.next.next <__main__.Node object at 0x7f711cfc8190> >>> r.next.next.next <__main__.Node object at 0x7f711cfc8250> >>> r.next.next.next.next The ``for`` loop creates a linked-list in a reverse direction - the first ``Node`` object created is eventually the last of 4 ``Nodes`` in the list. The list passes through :py:meth:`~ansar.file.File.store` and :py:meth:`~ansar.file.File.recover` operations. The recovered list is examined in the same manner as the original, verifing that all links are in place and that the list is properly terminated (i.e. the final ``next`` member is set to the value ``None``). The stored representation looks like this: .. literalinclude:: include/linked-list.json A Loop Of Links =============== A single additional line of Python links the final ``Node`` back to the start of the list, creating an endless loop. The library detects "cycles" or "circular references" and uses back-patching to complete complex constructions of this nature. This ability is necessary for the proper handling of graph objects:: >>> next = None >>> for i in range(4): ... c = Node(next) ... next = c ... >>> c <__main__.Node object at 0x7fe4c7150650> >>> c.next.next.next.next >>> c.next.next.next.next = c >>> c.next.next.next.next <__main__.Node object at 0x7fe4c7150650> >>> c.next.next.next.next.next.next.next.next <__main__.Node object at 0x7fe4c7150650> >>> >>> f = ar.File('linked-loop', ar.PointerTo(Node)) >>> f.store(c) >>> >>> r, _ = f.recover() >>> r <__main__.Node object at 0x7fe4c7156150> >>> r.next.next.next.next <__main__.Node object at 0x7fe4c7156150> >>> The assignment of the start ``Node`` to the fourth link, closes the loop. The loop is then passed through a store and recover process. The start object and final link of the recovered loop, are shown to be the same address. The stored representation looks like: .. literalinclude:: include/linked-loop.json A Map Of Loops ============== Linking can be combined with the structured types. Here is the construction of a map of loops:: >>> f = ar.File('map-loop', ar.MapOf(str,ar.PointerTo(Node))) >>> m = {} >>> def loop(): ... next = None ... for x in range(4): ... n = Node(next) ... next = n ... next.next.next.next.next = next ... return next ... >>> m['bjorn']=loop() >>> m['sven']=loop() >>> m['hilda']=loop() >>> m['freya']=loop() >>> bjorn = m['bjorn'] >>> bjorn <__main__.Node object at 0x7f629da00690> >>> bjorn.next.next.next.next <__main__.Node object at 0x7f629da00690> >>> bjorn.next.next.next.next.next.next.next.next <__main__.Node object at 0x7f629da00690> >>> f.store(m) >>> r, _ = f.recover() >>> ... bjorn = r['bjorn'] >>> bjorn <__main__.Node object at 0x7f629da07650> >>> bjorn.next.next.next.next <__main__.Node object at 0x7f629da07650> A ``dict`` is populated with named loops. This creates a total of 16 ``Node`` objects divided across 4 loops. This can all be traced through the following representation: .. literalinclude:: include/map-loop.json .. warning:: The entries in the :ref:`MapOf` must be :ref:`PointerTo(Node)` `not` plain ``Node``. A Document With Everything ========================== This section combines all the constructs presented in the preceding sections. The result is a single application object that includes significant structuring and graph object elements. The techniques demonstrated should be enough to serve as the basis for almost any imaginable data persistence requirement. The class declaration looks like this: .. literalinclude:: class-and-store/doc_every.py Construction and verification of a recovered ``Doc`` looks like:: >>> f = ar.File('doc', Doc) >>> d = Doc() >>> def links(): ... next = None ... for x in range(4): ... n = Node(next) ... next = n ... return next ... >>> def loop(): ... next = links() ... next.next.next.next.next = next ... return next ... >>> flag = True >>> d.flag = flag >>> d.pointer_to_flag = flag >>> d.array_to_flag = [flag, flag, flag, flag] >>> d.linked_list = links() >>> d.loop = loop() >>> m = {} >>> m['bjorn']=loop() >>> m['sven']= loop() >>> m['hilda']=loop() >>> m['freya']=loop() >>> d.map_of_loops = m >>> >>> f.store(d) >>> r, _ = f.recover() >>> ... bjorn = r.map_of_loops['bjorn'] >>> bjorn <__main__.Node object at 0x7f73f3521490> >>> bjorn.next.next.next.next <__main__.Node object at 0x7f73f3521490> >>> bjorn.next.next.next.next.next.next.next.next <__main__.Node object at 0x7f73f3521490> The complete representation appears below: .. literalinclude:: class-and-store/doc_every.json