The old adage, “there’s light at the end of the tunnel,” is nearly always a sign of hope: usually a task nearly done or a difficult time nearly over.
Sadly, in today’s world of massive data production and usage, its amusing retort, “let’s hope it is not the headlights of a train coming towards us,” is more apt.
The light bearing down on us is the new HMG Protective Marking scheme, and the way its changes will affect how we protect and mark our documents. This new iteration of the Protective Marking scheme is scheduled to be in place and workable by April 2014.
The reasons for the change may lie in the ever-mysterious thinking of some government departments, but in this case it seems to have been brought about by three drivers:
The first is the need to align ourselves with a security marking system that is similar, if not quite identical, to those of our allies, to maximise the safe sharing of data. The US scheme has fewer categories: Top Secret, Secret, Confidential and Unclassified. The UK’s ‘new’ scheme proposes to simplify the tiers from the current six to three: Top Secret, Secret and Official. (The whole concept of ‘Unclassified’ is being removed.) The ‘Official’ category will no doubt have sub-categories such as sensitive, internal, public and the like.
The second driver is to remove the highly prescriptive ‘resultant’ descriptors, which dictate the protective level of a document or data by its potential damage to the country, organisation or individual. HMG wanted to create a more useful set of descriptors which removed the seemingly arbitrary indicators and concentrated on protecting data according to its relative sensitivity, and on being able to react to its relationship with the information it covers.
Finally, it seeks to place the decision making regarding the protective level right back with the information user or creator. By having levels which reflect real-life scenarios rather than the abstract effects of breaches, it wants to put accountability back into the hands, or keystrokes, of the users. The removal of ‘Unclassified’ means that each document will have to go through a positive classification activity, i.e. the user choosing the level, rather than defaulting passively to unclassified.
So what does this mean to most Local Authorities? Well first, there is a thought that the government will accept a ‘static’ or parallel running of the schemes. This translates into leaving existing documents marked as per today’s system and marking all new documents with the new scheme. Time will phase out the old system as old documents time-expire and with little effort the new system will be implemented.
Critics of that methodology point out that the scheme in fact becomes more complex as people handle dual systems, with potentially differing markings on identical documents. The use of documentation may also fall as people seek to limit their use of old data, and effectiveness may fall with it.
The alternative is to re-mark all existing documents under the new scheme. This is widely expected to follow the path of least resistance and end up with whole swathes of incorrectly marked documents.
Well, for once, the Authorities who have not implemented protective marking are almost on a level playing field with their colleagues who have done so.
The perennial problem is to be able to assess the content of a document and place it into a protective marking category. Doing it manually is impossible. Manual marking was always a convenient way to move the issue, and the cost, onto the front-line departments, who could conveniently be said to be ‘closer to the coalface so more able to make the best judgements regarding the protective level.’
This, of course, has failed dismally. The individuals are struggling to keep up with today’s issues and citizen-facing activities, so the thought of them hunting through historical documents and categorising them is delusional. “OK, hire some temps, bring in some interns,” the mantra goes. Of course, this means a decision is made which protects money and enables an answer to be given to reviewers about the action being taken, but it never solves the problem.
Just to understand the mathematics: studies have shown that a manual system is unrealistic, but for comparison purposes a guide to the costs can be obtained from a research article (Oard et al. 2008), which reported that in a scientifically supervised trial a team of undergraduate and graduate legal students sorted an average of 21.5 documents per hour, at a categorisation accuracy of only 55.5%. To manually sort 50,000 documents would therefore take 50,000 / 21.5 ≈ 2,326 hours, which at 6 hours per day is about 388 days, or nearly 2 man-years (with an error rate of almost 1 in 2).
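For anyone who wants to rerun that arithmetic with their own volumes, here is a minimal sketch. The throughput, accuracy and hours-per-day figures are the ones quoted above; the figure of 220 working days per year is our own assumption, used only to turn days into man-years, and the function name is purely illustrative.

```python
def manual_sort_estimate(docs, docs_per_hour=21.5, hours_per_day=6,
                         accuracy=0.555, working_days_per_year=220):
    """Rough effort estimate for sorting documents by hand.

    docs_per_hour, hours_per_day and accuracy come from the figures
    quoted in the article; working_days_per_year is an assumption.
    """
    hours = docs / docs_per_hour
    days = hours / hours_per_day
    return {
        "hours": hours,
        "days": days,
        "man_years": days / working_days_per_year,
        # Documents expected to end up in the wrong category.
        "misclassified": round(docs * (1 - accuracy)),
    }


estimate = manual_sort_estimate(50_000)
for name, value in estimate.items():
    print(f"{name}: {value:,.1f}")
```

Changing the first argument to your own document count gives a quick feel for the scale of the manual option.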
But the problem of looking for needles in haystacks is not a new one. Unlike the movies where the hero finds the missing document and links the plot together, real life is not so dramatic, but does have the same problems.
Legal companies often have to search through millions of documents to find the ‘smoking gun.’ Enron was a case in point, where the resources needed to find the connections, and who knew what, were almost more expensive than the legal representatives. Almost!!
Apperception Services Ltd has developed a system called DataCube. It conceptually searches unstructured data and sorts documents into subject categories by the concepts in their content, not by title, strings of words or names.
Using a taxonomy of either your own design or, in the case of the local government market, the LGCS (personalised to your Authority), DataCube can search for and locate each document and place it into an appropriate LGCS category. Each category has parameters for each of its activities and transactions (data retention dates, protective labelling level), which enable the document to be classified into the appropriate protective labelling category, including the exceptions to the rules.
Having sorted each document into its LGCS category and sub category, DataCube then runs a labelling programme against the dataset, marking the metadata with its LGCS category and its protective level.
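To make the idea of category parameters driving the label concrete, here is a small illustrative sketch. It is not DataCube's code or interface: the class names, the metadata keys and the two example categories are all hypothetical, and only the three marking tiers come from the scheme described earlier.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict


class Marking(Enum):
    # The three tiers of the new scheme as described in this article.
    OFFICIAL = "OFFICIAL"
    SECRET = "SECRET"
    TOP_SECRET = "TOP SECRET"


@dataclass(frozen=True)
class CategoryRule:
    """Parameters held against one taxonomy category (hypothetical)."""
    marking: Marking
    retention_years: int


def label_document(metadata: Dict[str, str],
                   rules: Dict[str, CategoryRule]) -> Dict[str, str]:
    """Return a copy of the metadata with the category's marking applied."""
    rule = rules[metadata["category"]]
    labelled = dict(metadata)
    labelled["protective_marking"] = rule.marking.value
    labelled["retention_years"] = str(rule.retention_years)
    return labelled


# Hypothetical categories, for illustration only.
rules = {
    "example-category-a": CategoryRule(Marking.OFFICIAL, 6),
    "example-category-b": CategoryRule(Marking.SECRET, 25),
}

doc = {"path": "/shared/file-001.doc", "category": "example-category-b"}
print(label_document(doc, rules))
```

Because the marking lives on the category rather than on each document, changing one rule and re-running the labelling step updates every document in that category.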
This enables each Authority to retro-label all of its legacy data. It identifies documents that are past their retention period and prohibited files (whether against policy or the law) such as YouTube or music downloads, and it can also identify where each document is located within the Authority’s systems.
Confidential documents can be found and removed if inappropriately stored. Orphan documents, from long-dead projects or from employees who have retired or left, can be secured or removed.
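As a simple illustration of the retention check, the sketch below flags a document whose retention period has run out. It assumes retention is counted in whole years from the creation date, which may not match your own retention policy, and the function names are hypothetical rather than anything taken from DataCube.

```python
from datetime import date


def retention_expiry(created: date, retention_years: int) -> date:
    """Date on which the retention period ends (assumed: whole years)."""
    try:
        return created.replace(year=created.year + retention_years)
    except ValueError:
        # Created on 29 February and the target year is not a leap year.
        return created.replace(year=created.year + retention_years, day=28)


def is_past_retention(created: date, retention_years: int,
                      today: date) -> bool:
    """True if the document has outlived its retention period."""
    return today > retention_expiry(created, retention_years)


# Example: a document created in March 2005 with a 6-year retention period.
print(is_past_retention(date(2005, 3, 1), 6, today=date(2014, 4, 1)))
```

Passing `today` in explicitly, rather than reading the clock inside the function, keeps the check repeatable when it is run against a whole dataset.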
Existing labelled documents can be dealt with similarly, and much more efficiently: address each of the LGCS categories, either reaffirming or changing its protective marking parameters, and then run the DataCube indexing against the data set to create new, accurately positioned protective labels.
So don’t panic: that light coming towards you just might be a rescue team armed with the DataCube!!
Please feel free to contact Apperception and discuss the steps we can help you take in defending your data.
Oard et al. 2008
HMG Security Policy Framework…