The scope of this service is to provide an effective solution for analysing and categorising all of an organisation's legacy data, in order to apply the required retention period data and to undertake a data cleansing exercise that removes files which have exceeded the specified retention period. For most organisations, the volume of data makes this exercise impossible to undertake manually. The DataCube uniquely provides an effective approach to this problem: the appliance is introduced into your network and used by Apperception's staff to complete this service.
The DataCube is a workflow management system that discovers, collects, analyses, indexes and categorises large volumes of data across distributed file systems. A DataCube appliance is installed and configured on the network. Once the files have been identified, a data inventory is built that contains a record for every required data file, identifying the file name, file path and various attributes obtained by analysing the content and metadata properties of the file. At the core of the DataCube are three essential components:
- A Latent Semantic Indexing (LSI) engine that analyses the content of the data file and compares the words in the text against example documents (“exemplars”). The LSI engine identifies similar concepts, establishes how strongly the data file correlates with any given exemplar, and builds an index of all of the data.
- A Schema Management module that contains details of the categories to be used for an organisation’s taxonomy and defines the attributes and rules for what each category should contain (please see Taxonomy / Schema Creation Services). For the purpose of this service, the DataCube uses up to 25 exemplars identified for each schema/taxonomy category, which the LSI engine uses to identify the category that any given file in the data inventory belongs to. When the category has been identified, the properties for that category are applied to the data file.
- A Retention Policy Management module that provides reports on which documents may be destroyed or should be reviewed, based on their retention status. The module is configurable, allowing some users to review and delete documents and others only to review them.
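The exemplar matching described in the list above can be illustrated with a minimal sketch. It is not the DataCube's LSI engine: it swaps in plain word-count cosine similarity, which only compares shared words and does not identify related concepts as LSI does. The names `term_vector`, `cosine`, `categorise` and `EXEMPLARS`, and the exemplar texts themselves, are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Lower-case the text and count its word tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors (0.0 to 1.0)."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def categorise(text, exemplars):
    """Return (category, score) for the exemplar that best matches the text."""
    doc = term_vector(text)
    best_category, best_score = None, 0.0
    for category, examples in exemplars.items():
        for example in examples:
            score = cosine(doc, term_vector(example))
            if score > best_score:
                best_category, best_score = category, score
    return best_category, best_score


# Hypothetical exemplars: one short example document per category.
EXEMPLARS = {
    "Supplier contracts": [
        "This agreement is made between the council and the supplier "
        "for the supply of goods and services."
    ],
    "Employment contracts": [
        "This contract of employment sets out the salary, hours of work "
        "and notice period of the employee."
    ],
}

category, score = categorise(
    "Contract of employment: the employee's salary and notice period "
    "are set out below.",
    EXEMPLARS,
)
print(category, round(score, 2))
```

In this sketch the file is assigned to the category of its single best-matching exemplar. A real deployment would also need a minimum score below which a file is left uncategorised.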
The DataCube’s Schema Management module will be used to create a bespoke schema (or a personalised version of the existing Local Government Classification Schema), based on the client’s Retention Policy.
The DataCube LGCS Schema already contains exemplars relating to the UK Local Government Classification Schema (LGCS). Although these can be used as the basis for Local Government clients, we will still need to identify exemplars of specific documents used by the client. These may include supplier contracts, employment contracts, service records and case files.
The DataCube builds an index of all of the documents to be categorised. This index is used, along with the client schema, to update the database record for each document. As each document is categorised, its deletion date or earliest review date is also recorded.
Once the documents have been categorised, the Retention Policy Management module provides detailed reports on the documents that are due for deletion or review. Those due for deletion may be deleted automatically from the interface. Those due for review are presented to the operator, who can choose to ignore or delete each document and can reset its earliest review date.
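The reporting step can be sketched as follows, assuming each inventory record already carries the deletion date or earliest review date recorded during categorisation. The `InventoryRecord` fields and the functions `retention_report` and `reset_review_date` are hypothetical names for illustration, not the DataCube's own data model or interface.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class InventoryRecord:
    """Hypothetical data inventory record for one categorised file."""
    file_path: str
    category: str
    deletion_date: Optional[date] = None
    review_date: Optional[date] = None


def retention_report(records, today):
    """Split records into those due for deletion and those due for review."""
    due_for_deletion, due_for_review = [], []
    for record in records:
        if record.deletion_date is not None and record.deletion_date <= today:
            due_for_deletion.append(record)
        elif record.review_date is not None and record.review_date <= today:
            due_for_review.append(record)
    return due_for_deletion, due_for_review


def reset_review_date(record, new_date):
    """Operator decision: keep the document and review it again later."""
    record.review_date = new_date


# Illustrative records only; paths, categories and dates are invented.
records = [
    InventoryRecord("/shares/hr/contract_a.doc", "Employment contracts",
                    deletion_date=date(2020, 3, 31)),
    InventoryRecord("/shares/legal/case_b.pdf", "Case files",
                    review_date=date(2021, 6, 30)),
    InventoryRecord("/shares/buying/supplier_c.pdf", "Supplier contracts",
                    deletion_date=date(2030, 1, 1)),
]

to_delete, to_review = retention_report(records, today=date(2022, 1, 1))

# The operator chooses to keep the reviewed document for another two years.
reset_review_date(to_review[0], date(2024, 1, 1))
```

A record with no date set is simply left out of both lists, which keeps uncategorised files away from any deletion decision.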
The service is carried out on your premises. The system does not change your original data or its location, except by placing a metadata tag on each file; even the last accessed date stays the same.
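One way to read a file for analysis while leaving its last accessed date unchanged is sketched below. It is an illustration only and does not describe how the DataCube itself achieves this. The function name `read_preserving_access_time` is hypothetical, and the approach assumes the account running it has permission to set file timestamps.

```python
import os


def read_preserving_access_time(path):
    """Read a file's bytes, then restore its original access and modified times."""
    before = os.stat(path)
    with open(path, "rb") as handle:
        content = handle.read()
    # Reading may have updated the access time, so put the original values back.
    os.utime(path, ns=(before.st_atime_ns, before.st_mtime_ns))
    return content
```

Note that restoring timestamps in this way can itself update the file's metadata change time on some file systems.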