The emerging fourth generation of electronic records management technologies - and the generation I'm stuck on.
I gave a presentation recently about how I see the records market. It forced me to consider the different generations of technology, and while there are different ways to slice it, this is one that I think is relevant to the current market and the practice transition that's going on at the moment.
To me, it's useful to think about records systems as two things: a place to keep the records (a repository), and a place to keep the metadata about them (a registry), which includes the classification information.
I'm going to skip over everything before we had electronic registries - because as interesting as tally sticks are, I don't think those technologies (yes, they're technologies) significantly impact on the transition that's happening at the moment.
When I look at the market through the repository and metadata lens, I see three generations with one emerging -
- First generation - electronic registries with physical repositories.
- Second generation - electronic registries with tightly coupled electronic repositories.
- Third generation - electronic registries with loosely coupled repositories that require structure to drive classification - think of the vendors that use SharePoint as their repository, but still require you to tie a location in SharePoint to a classification in the registry.
- Fourth generation - electronic registries with loosely coupled repositories that do not require structure to classify records - that instead use crawlers to crawl a repository and understand what content is available to capture in the registry, and then autoclassification as the primary/only means of classification.
Each generation solved a problem.
The first generation took us away from card catalogues and solved a set of challenges around access to the catalogue and how it could be searched.
The second generation solved the problem that came with managing electronic repositories separately from electronic registries: our custodial model of records didn't work unless we could take custody, and without control of a repository we didn't really have the control of the record that custody required.
The third generation started to solve the problem that moving records is impractical and creates a set of challenges around duplication. These systems started to give organisations the ability to manage records in a place that they didn't own, with various versions of SharePoint as the most common target and file servers well represented too. For some players this represents an evolution from the second generation, but one of the typical things that second generation players are doing is synchronising the managed repository with their own, and maintaining that synchronisation over time. The key feature that distinguishes these records systems from future systems (I think) will be their reliance on structuring information in specific ways to drive its classification.
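To make that reliance on structure concrete, here's a minimal sketch of location-driven classification. It isn't any vendor's implementation; the folder paths, the classification labels and the `classify_by_location` function are all hypothetical, made up for illustration.

```python
from pathlib import PurePosixPath
from typing import Optional

# Hypothetical mapping maintained in the registry: each managed location in
# the repository is tied to a classification. Paths and labels are invented.
LOCATION_TO_CLASSIFICATION = {
    "/sites/finance/invoices": "Financial Management - Accounts Payable",
    "/sites/hr/recruitment": "Personnel - Recruitment",
}


def classify_by_location(path: str) -> Optional[str]:
    """Return the classification of the nearest mapped ancestor folder.

    The content of the record is never inspected: where it sits in the
    repository structure decides how it is classified.
    """
    current = PurePosixPath(path)
    for folder in [current, *current.parents]:
        label = LOCATION_TO_CLASSIFICATION.get(str(folder))
        if label is not None:
            return label
    return None  # saved outside the mapped structure, so unclassified


print(classify_by_location("/sites/finance/invoices/2019/inv-001.pdf"))
print(classify_by_location("/sites/misc/notes.docx"))
```

The second call returns `None`, which is the weakness in a nutshell: anything saved outside the agreed structure simply doesn't get classified.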
The fourth generation is the one that's starting to emerge. Its key distinguishing feature is the reliance on some branch of AI for classification - whether this is machine learning, natural language processing or something else. The typical model for this type of system is that it connects to a repository, does some kind of indexing process to build a registry, and then uses autoclassification to classify the objects in the repository. This means that the structure of the repository (i.e. the BCS), with all its inherent problems, becomes irrelevant to decisions about classification.
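The connect, index, classify model can be sketched roughly like this. It's a toy under loud assumptions: the repository is an in-memory dict, and a simple keyword overlap stands in for whatever machine learning or natural language processing a real product would use. `RegistryEntry`, `crawl`, `autoclassify`, `build_registry` and the keyword lists are all hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List, Optional, Tuple


@dataclass
class RegistryEntry:
    location: str                  # where the object lives in the repository
    classification: Optional[str]  # assigned from content, not from location


def crawl(repository: Dict[str, str]) -> Iterator[Tuple[str, str]]:
    """Walk the repository and yield (location, text) for each object."""
    for location, text in repository.items():
        yield location, text


# Stand-in for a trained model: keyword sets per classification (invented).
KEYWORDS = {
    "Financial Management": {"invoice", "payment", "budget"},
    "Personnel": {"applicant", "interview", "leave"},
}


def autoclassify(text: str) -> Optional[str]:
    """Pick the classification whose keywords overlap the text the most."""
    words = set(text.lower().split())
    scores = {label: len(words & kws) for label, kws in KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def build_registry(repository: Dict[str, str]) -> List[RegistryEntry]:
    """Index the repository and classify each object from its content."""
    return [RegistryEntry(loc, autoclassify(text)) for loc, text in crawl(repository)]


# An unstructured repository: the folder names tell you nothing.
repository = {
    "/dump/doc1.txt": "Invoice for payment of consulting services",
    "/dump/misc/doc2.txt": "Interview notes for the shortlisted applicant",
    "/dump/doc3.txt": "Lunch menu",
}
registry = build_registry(repository)
for entry in registry:
    print(entry.location, "->", entry.classification)
```

Notice that the location is recorded but never used to decide the classification, which is the whole point of this generation.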
There's another generation that I'm really stuck on how to categorise. There's a good argument to be made for business systems as pre-dating the first generation described above. The relational database model has been around since the 1970s, and people have been creating records in relational databases by any definition of records that we have now. I'm a bit stuck on it - so I'm interested in your thoughts.