.. _a-folder-of-jobs:

A Folder Of Jobs
****************

This section introduces the :py:class:`~ansar.folder.Folder` class and how it can be used to manage collections of files containing application objects. Folders can be treated like maps of objects. Hierarchies of folders with specific content at each point can be created and managed in a clear and concise way.

A Folder In The Filesystem
==========================

A :py:class:`~ansar.folder.Folder` is an absolute location in the filesystem. Once created it always refers to the same location, independent of where the host application may move to::

    >>> import ansar as ar
    >>>
    >>> f = ar.Folder('mighty-thor')
    >>> f.path
    '/home/bjorn/mighty-thor'

Bjorn is working in his home folder. Internally the :py:class:`~ansar.folder.Folder` object converts the relative name ``mighty-thor`` to the full pathname. All subsequent operations on the object will operate on that absolute location. Full pathnames passed to the :py:class:`~ansar.folder.Folder` are adopted without change and no name at all is a synonym for the current folder.

Creation of :py:class:`~ansar.folder.Folder` objects is also the mechanism for creation of folders in the filesystem. This means that the ``mighty-thor`` folder is assured to exist on disk once the ``f`` variable has been assigned. Any errors result in an exception.

A Folder Of Folders And Files
=============================

The following code has an excellent chance of producing a folder hierarchy in your own home folder:

.. code-block:: python

    import os
    import ansar as ar

    home = ar.Folder(os.environ['HOME'])

    gods = home.folder('gods')

    odin = gods.folder('odin')
    loki = gods.folder('loki')
    thor = gods.folder('thor')

That hierarchy will look like this:

.. figure:: home-zeus.png
    :align: center

Note the use of the :py:meth:`~ansar.folder.Folder.folder` method to create `sub-folders` from the parent. The new :py:class:`~ansar.folder.Folder` refers to the `absolute location` below the parent.

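
The two behaviours described so far, fixing an absolute location at construction time and creating the directory as a side effect, can be illustrated with the standard library alone. The ``SketchFolder`` class and ``build_hierarchy`` function below are hypothetical stand-ins written for this illustration. They are not the ansar implementation and mirror only the ``path`` attribute and ``folder`` method named in the text.

```python
import os
import tempfile


class SketchFolder:
    """Hypothetical stand-in for illustration only, not ansar's Folder."""

    def __init__(self, path=None):
        # No name at all is treated as the current folder.
        # abspath() fixes the location at construction time.
        self.path = os.path.abspath(path or '.')
        # Constructing the object also creates the folder on disk.
        # Any failure surfaces as an OSError.
        os.makedirs(self.path, exist_ok=True)

    def folder(self, name):
        # A sub-folder is an absolute location below the parent.
        return SketchFolder(os.path.join(self.path, name))


def build_hierarchy(root):
    """Create gods/odin, gods/loki and gods/thor below root."""
    home = SketchFolder(root)
    gods = home.folder('gods')
    return {name: gods.folder(name) for name in ('odin', 'loki', 'thor')}


# A temporary directory stands in for the home folder.
with tempfile.TemporaryDirectory() as tmp:
    made = build_hierarchy(tmp)
    created = sorted(os.listdir(os.path.join(tmp, 'gods')))
    all_absolute = all(os.path.isabs(f.path) for f in made.values())
```

Because the path is resolved once in the constructor, later changes to the working directory would not alter where a ``SketchFolder`` points.
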
Remembering the ``ReceivedJob`` class, work can now be delegated with the following:

.. code-block:: python

    f = loki.file('job', ReceivedJob)
    j = ReceivedJob(title='royal decree', service='herculean task')
    f.store(j)

The :py:meth:`~ansar.folder.Folder.file` method is used to create a :py:class:`~ansar.file.File` object at the `absolute location` provided by the parent folder object. The :py:meth:`~ansar.file.File.store` method is used to pass the job on to Loki.

.. note::

    The parameters passed on creation of a :py:class:`~ansar.folder.Folder` are all saved in the object and influence the subsequent behaviour of class methods. They are also passed on to the child objects created by the :py:meth:`~ansar.folder.Folder.folder` and :py:meth:`~ansar.folder.Folder.file` methods, where appropriate.

Listing The Files In A Folder
=============================

A folder is a container of files. These can be `fixed decorations` on a known hierarchy of folders, or they can be a dynamic collection, where the set of files available at any one time is unknown. The latter is the case for a spooling area where jobs are persisted until completed or abandoned. The next few paragraphs are relevant to folders that behave like spooling areas.

Assuming that ``loki`` is conscientious about his responsibilities, he might check for new assignments using this::

    received = [m for m in loki.matching()]

The :py:meth:`~ansar.folder.Folder.matching` generator method returns a sequence of the filenames detected in the folder.
Given the following folder listing::

    $ ls /home/bjorn/gods/loki
    2888-43c4-998f-3b5671f69459.xml  4409-4182-a1fc-dde4004ccbe9.xml  549d-4ba9-9a08-f77b50540c92.xml
    2856-4e96-bc0b-3840ae3b2c6a.xml  3128-4f85-9729-691661b55682.xml  2eaf-4efb-b07a-aa1ad6e67d04.xml
    631b-4f18-9207-0e39940a668b.xml  1fae-4dc2-b274-149f7520bed0.xml  4995-40a3-8ccd-116bcf78fd83.xml
    5f26-4d12-8276-b615244edc4e.xml  3dec-4518-be5b-953065216afc.xml  b11b-4d55-8168-cdeab30ae771.xml
    configuration.json

The :py:meth:`~ansar.folder.Folder.matching` method will return the sequence "2888-43c4-998f-3b5671f69459", "4409-4182-a1fc-dde4004ccbe9", "549d-4ba9-9a08-f77b50540c92", etc. The method automatically truncates the file extension resulting in a name suitable for any file operations that might follow. As always, this automated handling of file extension can be disabled by passing ``decorate_names=False`` on creation of the ``loki`` :py:class:`~ansar.folder.Folder` object. The ``configuration`` name will not appear in the listing as it does not end with the extension setting for the folder.

If a folder is to contain a mixture of fixed decorations and dynamic content the proper way to do that is using the `re` (i.e. regular expression) parameter on creation of the :py:class:`~ansar.folder.Folder` object::

    loki = gods.folder('loki', te=ReceivedJob, re='^.{27}$', encoding=ar.CodecXml)

.. note::

    The ``te`` parameter is optional for the :py:class:`~ansar.folder.Folder` class, unlike for the :py:class:`~ansar.file.File` class. For this reason it must be named.

This brute-force expression will cause the ``loki`` folder object to limit its attention to those filenames that are 27 characters long (e.g. the length of "2888-43c4-998f-3b5671f69459"). Internally the expression match is performed on the truncated version of the filename - with no file extension. The folder can then contain fixed decorations and the :py:class:`~ansar.folder.Folder` methods involved in processing dynamic content will not "see" them.
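
The filtering described above, an extension check followed by an expression match on the truncated name, can be sketched in plain Python. The ``sketch_matching`` function and its ``extension`` and ``pattern`` parameters are hypothetical names chosen for this illustration and are not part of ansar. A ``configuration.xml`` entry is added to the sample listing as an assumption, to show a fixed file that shares the extension.

```python
import os
import re

# A shortened copy of the listing, plus an assumed configuration.xml.
listing = [
    '2888-43c4-998f-3b5671f69459.xml',
    '4409-4182-a1fc-dde4004ccbe9.xml',
    '549d-4ba9-9a08-f77b50540c92.xml',
    'configuration.json',
    'configuration.xml',
]


def sketch_matching(filenames, extension='.xml', pattern='^.{27}$'):
    """Yield truncated names that carry the extension and match the pattern."""
    expression = re.compile(pattern)
    for name in filenames:
        stem, ext = os.path.splitext(name)
        if ext != extension:
            continue        # Wrong extension, e.g. configuration.json.
        if expression.match(stem):
            yield stem      # The match is made on the truncated name.


received = list(sketch_matching(listing))
```

In this sketch both ``configuration`` files are skipped: one fails the extension check and the other fails the 27-character expression.
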
The ``configuration.json`` file can be replaced with a ``configuration.xml`` file, if that was the true intent.

It is also valid to create several :py:class:`~ansar.folder.Folder` objects that refer to the same absolute location but are created with different `re` expressions. As long as the expressions describe mutually exclusive names the different dynamic collections can exist alongside each other.

Of course, the simplest arrangement is for any dynamic content to be assigned its own dedicated folder. Considering the ease with which folders can be created "on disk" there is less justification for maintaining folders with mixed content.

Working With A Folder Of Files
==============================

The :py:meth:`~ansar.folder.Folder.each` method is similar to :py:meth:`~ansar.folder.Folder.matching` except that it returns a sequence of ready-made :py:class:`~ansar.file.File` objects. This means that the object inside the file is one method call away::

    for f in loki.each():
        j, _ = f.recover()
        # Process the job here.
        f.store(j)

The :py:meth:`~ansar.file.File.recover` method, introduced in a previous section, is being used to load the file contents into a ``ReceivedJob``. The caller is free to process the job and perhaps save the results back into the file.

Yet another method exists to further automate the processing of folders. The :py:meth:`~ansar.folder.Folder.recover` method goes all the way and returns a sequence of the ``ReceivedJob`` objects. Actually, it returns a 3-tuple of 1) a unique key, 2) the recovered object and 3) the detected version. An extra parameter is required at :py:class:`~ansar.folder.Folder` construction time::

    kn = (lambda j: j.unique_id, lambda j: str(j.unique_id))
    loki = gods.folder('loki', te=ReceivedJob, re='^.{27}$', encoding=ar.CodecXml, keys_names=kn)

The `keys_names` parameter delivers a pair of functions to the :py:class:`~ansar.folder.Folder` object.
These two functions are used internally during the execution of several :py:class:`~ansar.folder.Folder` methods, to calculate a key value and a filename, respectively.

When the :py:meth:`~ansar.folder.Folder.recover` method opens a file and loads the contents, this results in an instance of the ``te``. The method then calls the first function passing the freshly loaded object. The function can make use of any of the values within the object to formulate the key. The constraints are that the result must be acceptable as a unique Python ``dict`` key and that the value is "stable", i.e. the key formulated for an object will be the same each time the object is loaded. Whatever that function produces becomes the first element of the ``k, j, _`` tuple below:

.. code-block:: python

    jobs = {k: j for k, j, _ in loki.recover()}

This gives the application complete control over the key value used by the ``dict`` comprehension.

Calling the :py:meth:`~ansar.folder.Folder.store` method looks like this::

    loki.store(jobs)

The method iterates the collection of ``jobs`` writing the latest values from each object into a system file. To do this it uses the second ``keys_names`` function, passing the current object and getting a filename in return. The function can make use of any of the values within the object to formulate the filename. The constraints are that the result must be acceptable as a filename and the value is "stable", i.e. the filename constructed for an object will be the same each time the object is stored.

In advanced use there can also be the need for an additional "tag" that distinguishes one set of :py:class:`~ansar.folder.Folder`-related materials from another. Simply adding the "job-" prefix to the constructed filename is an example of a tag. An additional collection of objects co-habiting the same space might be given the "schedule-" prefix.

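
As a minimal sketch of the pair of functions described above, the following uses named functions in place of lambdas and adds the "job-" tag to the filename. ``SketchJob`` is a hypothetical stand-in for ``ReceivedJob`` with an assumed ``unique_id`` field of type ``uuid.UUID``. The functions are called directly here, which only approximates what a folder object would do internally.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class SketchJob:
    """Hypothetical job type; only unique_id is taken from the text above."""
    title: str
    unique_id: uuid.UUID = field(default_factory=uuid.uuid4)


def job_key(j):
    # Must be hashable and the same every time the object is loaded.
    return j.unique_id


def job_name(j):
    # Must be a valid filename and the same every time the object is stored.
    # The "job-" prefix is the tag separating this collection from others.
    return 'job-' + str(j.unique_id)


# The pair, in the order key function then name function.
kn = (job_key, job_name)

batch = [SketchJob('royal decree'), SketchJob('herculean task')]
jobs = {kn[0](j): j for j in batch}
names = sorted(kn[1](j) for j in jobs.values())
```

Both functions derive their result from a value that never changes for a given object, which is what makes the key and the filename stable.
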
The final effect of the second ``keys_names`` function is that the application has complete control over where objects are stored, i.e. under what filenames.

There is no requirement relating the keys and the filenames. The set of keys produced for a set of objects in a :py:class:`~ansar.folder.Folder` is independent of the set of filenames produced for those same objects. There can be cases where the same value can be used for both but doing so is a design choice.

.. note::

    The :py:meth:`~ansar.folder.Folder.store` and :py:meth:`~ansar.folder.Folder.recover` methods are not designed to work in the same way. The first is a method that accepts an entire ``dict`` whereas the second is a `generator` method that can be used to `construct` a ``dict``, by visiting one file at a time.

    This design difference is because recovery of objects involves version information and the application needs an opportunity to respond to that version, for each individual file. Refer to :ref:`versions-upgrading-and-migration` for more information.

The individual jobs can be modified::

    for k, j in jobs.items():
        if update_job(j):
            loki.update(jobs, j)

Or the entire collection can be processed and then saved back to the folder as a single operation::

    for k, j in jobs.items():
        update_job(j)
    loki.store(jobs)

There are also methods to support adding new jobs, removing individual jobs and lastly, the removal of an entire collection. This group of methods assumes the ``dict`` object to be the canonical reference, modifying the related folder contents as needed.

A Few Details
=============

The 3 "scanning" methods - :py:meth:`~ansar.folder.Folder.matching`, :py:meth:`~ansar.folder.Folder.each` and :py:meth:`~ansar.folder.Folder.recover`, provide different styles of folder processing. To avoid the dangers associated with modifications to folder contents during scanning, the latter 2 methods take filename snapshots using :py:meth:`~ansar.folder.Folder.matching` and then iterate the snapshots.

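
The value of iterating a snapshot can be shown with a short standard-library sketch. The ``snapshot`` function below is a hypothetical helper written for this illustration and is not an ansar method. It collects the truncated names into a list before any processing begins, so files can be removed or added during the loop without affecting which names are visited.

```python
import os
import tempfile


def snapshot(path, extension='.xml'):
    """Return a list (not a live iterator) of the truncated names in path."""
    return [os.path.splitext(n)[0] for n in sorted(os.listdir(path))
            if n.endswith(extension)]


with tempfile.TemporaryDirectory() as tmp:
    # Populate the folder with three placeholder files.
    for name in ('a', 'b', 'c'):
        with open(os.path.join(tmp, name + '.xml'), 'w') as f:
            f.write('job')

    visited = []
    for name in snapshot(tmp):
        visited.append(name)
        # Changing the folder while scanning does not disturb the loop,
        # because the loop walks the snapshot rather than the directory.
        os.remove(os.path.join(tmp, name + '.xml'))
        with open(os.path.join(tmp, name + '-done.xml'), 'w') as f:
            f.write('job')

    remaining = snapshot(tmp)
```

The files created during the loop appear only in the second snapshot, never in the first.
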
The style based on the :py:meth:`~ansar.folder.Folder.matching` method is the most powerful but also requires the most boilerplate code. Using the :py:meth:`~ansar.folder.Folder.each` method avoids the responsibility of creating a correct :py:class:`~ansar.file.File` object and allows for both :py:meth:`~ansar.file.File.recover` and :py:meth:`~ansar.file.File.store` operations on the individual objects. Lastly, the :py:meth:`~ansar.folder.Folder.recover` method requires the least boilerplate but is constrained in one important aspect; there is no :py:class:`~ansar.file.File` object available. Processing a folder with the :py:meth:`~ansar.folder.Folder.recover` method is a "read-only" process - without a :py:class:`~ansar.file.File` object there can be no :py:meth:`~ansar.file.File.store`.

The :py:meth:`~ansar.folder.Folder.clear` method uses a snapshot to select files for deletion, rather than a wholesale delete of all folder contents. This preserves the integrity of the folder where it is being shared with fixed files, and other :py:class:`~ansar.folder.Folder` objects defined with different `re` expressions.

Snapshots are also used to delete any "dangling" files at the end of a call to :py:meth:`~ansar.folder.Folder.store`. This ensures that the set of files in the folder is consistent with the contents of the presented ``dict``.
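
The removal of dangling files can be sketched as a set difference between a snapshot taken before writing and the set of filenames actually written. The ``sketch_store`` function, the ``name_of`` helper and the dictionary-based jobs below are hypothetical and exist only for this illustration. The real method may select and write files differently.

```python
import os
import tempfile


def sketch_store(path, jobs, name_of, extension='.xml'):
    """Write every job, then delete matching files no longer in jobs."""
    # Snapshot of the files this collection is responsible for.
    before = {n for n in os.listdir(path) if n.endswith(extension)}
    written = set()
    for job in jobs.values():
        filename = name_of(job) + extension
        with open(os.path.join(path, filename), 'w') as f:
            f.write(str(job))
        written.add(filename)
    # Anything in the snapshot that was not rewritten is dangling.
    for dangling in before - written:
        os.remove(os.path.join(path, dangling))


def name_of(job):
    # Tagged filename derived from a stable value in the job.
    return 'job-' + job['id']


with tempfile.TemporaryDirectory() as tmp:
    # A fixed file sharing the folder; it must survive every store.
    open(os.path.join(tmp, 'configuration.json'), 'w').close()
    jobs = {1: {'id': '1'}, 2: {'id': '2'}}
    sketch_store(tmp, jobs, name_of)
    after_first = sorted(os.listdir(tmp))
    del jobs[2]                        # Drop one job from the dict.
    sketch_store(tmp, jobs, name_of)   # Its file is removed on this store.
    final = sorted(os.listdir(tmp))
```

Only files selected by the snapshot are candidates for deletion, so the fixed ``configuration.json`` file is left untouched.
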