Next Generation Enterprise Historians

We are at a crossroads: advances in information systems technology are challenging our fundamentals, and so they should. But what are we at risk of losing as we progress?

Are Historians Old or New Tech?

In this article, “Process Historian” is used synonymously with “Enterprise Historian”.

As a rule, both vintage and modern historians are in use across a myriad of applications. Historians are even used in systems where alternative technologies (including relational databases) are better suited. Modern historians offer added benefits because developers have learned from the past and built new capability onto newer platforms. Old or new is not an either/or question: the answer is old AND new. This is an important distinction to keep in mind.

Are Historians Old or New Tech?
When do I need a Process Historian?
What is Process Data?
How important is Data Access Speed for Process Data?
Why are Process Historians so expensive?
Can I use other cloud platforms instead of a Process Historian?
Is the Hybrid model viable?
Next Generation Historian Hypothesis

When do I need a Process Historian?

Visualisation, user controls, and alarms aside, SCADA can act as a Front-End Processor (FEP), in other words a data concentrator for historians. SCADA systems gather data from multiple sources (devices) and condense and organise the information so that operators can manage and operate their assets effectively in real time. The better the SCADA and its configuration, the better the outcomes for the operators and the business as a whole. SCADA systems can produce rich data sets which historically have been challenging to access. Then along came the data historian, and with it new context and high-performance data acquisition for end users, including corporate users.

What is Process Data?

Process data consists of sampled readings from devices and software-based systems. The bulk of the samples come from instruments monitoring environmental variables such as pressure, temperature, and speed. Process data can produce very high volumes of samples. When systems are architected, the Data Management Plan (i.e. how to manage the payload by design) is too often neglected because systems often start small and work very well as a proof of concept. By comparison, even in the smallest systems where a Process Historian would be used, the flood of data can easily overwhelm an alternative relational database.
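
To give a feel for the volumes involved, the following back-of-the-envelope sketch estimates raw sample counts. The tag count, scan interval, and 16 bytes per uncompressed sample (an 8-byte timestamp plus an 8-byte value) are illustrative assumptions, not figures from any particular system.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def samples_per_day(tag_count, scan_interval_s):
    """Raw samples produced per day when every tag is read at a fixed interval."""
    return tag_count * SECONDS_PER_DAY // scan_interval_s


def raw_bytes_per_day(tag_count, scan_interval_s, bytes_per_sample=16):
    """Uncompressed storage per day; bytes_per_sample is an assumed figure."""
    return samples_per_day(tag_count, scan_interval_s) * bytes_per_sample


# Hypothetical site: 5,000 tags, each sampled once per second.
daily_samples = samples_per_day(tag_count=5_000, scan_interval_s=1)
daily_bytes = raw_bytes_per_day(tag_count=5_000, scan_interval_s=1)
print(f"{daily_samples:,} samples per day")
print(f"{daily_bytes / 1e9:.1f} GB per day uncompressed")
```

Under these assumptions, 5,000 tags sampled once per second produce 432 million samples, or roughly 6.9 GB of uncompressed data, every day.
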
When process data is stored in a Process Historian, the data is highly compressed. This makes the Process Historian a perfect repository for process data for the life of the system. In comparison, using alternative technologies would likely require a large commitment to disk storage or the use of cloud storage. The cloud storage option is a no-brainer if the historian is already hosted in the cloud.
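
As a rough illustration of why time series samples compress so well, the sketch below applies delta encoding followed by general-purpose compression. This is an assumed, simplified technique chosen for illustration only; it is not the algorithm of any particular historian product, and the sample data is hypothetical. The encoding is lossless, so the raw values can be recovered exactly.

```python
import struct
import zlib


def delta_encode(values):
    """Keep the first value, then store only the change from the previous one."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas):
    """Rebuild the original values by accumulating the stored changes."""
    values, running = [], 0
    for delta in deltas:
        running += delta
        values.append(running)
    return values


def pack(ints):
    """Serialise integers as little-endian signed 64-bit values."""
    return struct.pack(f"<{len(ints)}q", *ints)


def unpack(blob):
    return list(struct.unpack(f"<{len(blob) // 8}q", blob))


def compress_series(values):
    return zlib.compress(pack(delta_encode(values)))


def decompress_series(blob):
    return delta_decode(unpack(zlib.decompress(blob)))


# Hypothetical data: one hour of 1-second timestamps and a slowly drifting
# pressure reading stored as integer pascals.
timestamps = [1_700_000_000 + i for i in range(3600)]
pressure = [101_325 + (i // 60) for i in range(3600)]

for name, series in (("timestamps", timestamps), ("pressure", pressure)):
    raw_size = len(pack(series))
    packed_size = len(compress_series(series))
    print(f"{name}: {raw_size} bytes raw, {packed_size} bytes compressed")
```

Because consecutive samples differ only slightly, the deltas are small and repetitive, which is what lets the compressed form shrink far below the raw size without discarding any readings.
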
There are several options, including hybrid topologies, when designing an information system that includes both historians and other business data repositories. In practice, the most common topology includes a process data repository and a business data repository. This distinction is important because the two kinds of data differ in nature.

How important is Data Access Speed for Process Data?

Process Historians provide several access methods for retrieving historical data for client applications or third-party interfaces. A third-party interface could be an Enterprise Asset Management (EAM) system, an Enterprise Resource Planning (ERP) system, or any other business system that can consume the type of data available in the process historian. Methods could include SQL, REST APIs, .NET APIs, etc. These give the Process Historian the same interface options that alternative technologies provide. The key is the speed at which high volumes of data can be retrieved. Ever had to wait a few seconds to retrieve a bank statement online? Compared with process data, the volume of data in that scenario is like a grain of sand next to the moon. The profile and content of business data are simply not the same as those of process data, and this must be considered when building interfaces, reports, client tools, and applications for process data.

New historian technologies are significantly faster than the pioneering ones, so comparing pioneering historian technologies with alternative technologies produces a skewed result.

Why are Process Historians so expensive?

Process historians can be very expensive to purchase outright. Subscription models can soften the optics, but the total lifecycle cost should still be considered. The high ongoing cost of historians drives early adopters of the technology to look at other options. This is particularly true where the historian is not critical to operations or production.

Alternatively, there are very creative ways to capture process data and store it in “databases”. The methods are varied and include capture directly from PLCs, IoT devices, and of course traditional SCADA systems. The big-end-of-town historians have been in place for decades. They are reliable, have a strong following, and have mature application tools that provide out-of-the-box functionality. It is this rich “out of the box” functionality that needs to be considered very carefully when embarking on the “no historian pathway”. With today’s modern “alternatives” to historians, the client tool visualisations are still maturing; more importantly, the way they manage process data, and in particular how they cope with large volumes of time series data, is limiting.

A common workaround adopted by non-historian advocates is to cube or precondition the process data before it is stored. In other words, the raw data is converted into something new and lightweight, based on a concept determined to be valid at the time the system is set up. Those who have worked with process data realise that there is much to discover about how an asset performs over its entire lifecycle. The lessons to learn are not known at the time the system is set up; that is why they are lessons to learn. Discarding or degrading the raw data reduces its value. For example, the smallest change, undetectable to the naked eye, can be used with AI to detect and protect against unnecessary asset failure. This is not functionally possible if the raw data is lost, intentionally pre-processed, or displaced. The baseline concept is to store the raw data first, then additionally condition it into a second lightweight repository. In time and at will, more conditioning can occur as new knowledge is used to hypothesise about new pathways for improvement. This is only possible if preservation of raw data is the vision.
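
The “store it, then condition it” idea can be sketched as follows. This is a minimal illustration under assumed names and data: `raw_store`, `hourly_rollup`, and `largest_step` are hypothetical, and a real system would use a historian and a separate repository rather than in-memory lists.

```python
from collections import defaultdict
from statistics import mean


def hourly_rollup(raw_samples):
    """Condition raw (epoch_seconds, value) samples into hourly summaries.

    The raw samples are only read, never modified or discarded.
    """
    buckets = defaultdict(list)
    for timestamp, value in raw_samples:
        hour_start = timestamp - timestamp % 3600
        buckets[hour_start].append(value)
    return {
        hour: {"min": min(vals), "max": max(vals),
               "mean": mean(vals), "count": len(vals)}
        for hour, vals in sorted(buckets.items())
    }


def largest_step(raw_samples):
    """A later question: the biggest jump between consecutive samples.

    This cannot be answered from the hourly summaries alone.
    """
    ordered = sorted(raw_samples)
    return max(abs(b[1] - a[1]) for a, b in zip(ordered, ordered[1:]))


# Hypothetical raw repository: preserved for the life of the asset.
raw_store = [(0, 10.0), (1800, 12.0), (3600, 11.0), (5400, 30.0), (7199, 11.5)]

# Second, lightweight repository derived from the raw data.
conditioned_store = hourly_rollup(raw_store)
print(conditioned_store)

# New conditioning added later, possible only because the raw data was kept.
print(largest_step(raw_store))
```

The design choice is that conditioning is always derived from the raw repository, so new summaries can be added at any time without having guessed them when the system was set up.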

Can I use other cloud platforms instead of a Process Historian?

Yes, you can. You can also use a spreadsheet, but who would? Most Process Historians have client tools, including spreadsheet plug-ins. These analysis tools demonstrate how important it is to be able to dynamically create concepts and visualisations based on the current goals of the business. Whatever the solution, the client tools and the data captured should not inhibit a data-driven business from discovering new insights.

There are several very important factors to consider when deciding on the type of data repository for process data. The key topics are:

Type of data (Real time or time series)
Payload cost (related to volume of data in and out of the cloud)
Open source or not (cybersecurity policy normally mandates the approach to open source)
Data acquisition methods (speed of storage and retrieval)
Cloud (data security mandate, privacy, and data recovery/loss)
Longevity of application solutions

This list is not comprehensive, but it does demonstrate the depth of consideration required when deciding on the architecture of a system, and consequently on the most appropriate technologies to meet the solution requirements.

Perhaps the greatest challenge in using alternative technologies is continuity and the lifecycle of products. Some organisations that adopted “alternative IT solutions” in the last 3-5 years (as of 2023) are learning the hard way that their applications have already reached End of Life. This is unheard of for Industrial Automation Vendors. Practitioners in the Automation Industry also know that the cost of engineering far exceeds the cost of software, so when a decision is made to migrate to a new software platform or application, the cost to the business is magnified. When an End of Life announcement is made and a new solution is forced upon a business to replace an information system asset, the impact is huge.

Is the Hybrid model viable?

The hybrid model provides the best of both worlds. It should never have been an “either/or” debate. The either/or approach has led many organisations to an embarrassing destination. Chances are that many of them, even as they continue to invest in developers, report writing, and new plug-ins, will never achieve an uplift in asset performance.
At a glance, the business end of asset performance monitoring was realised with the birth of Big Data. With this birth, elements of industrial systems were simply replicated into another environment, i.e. the corporate space. Dashboards and charts about performance were brought into the boardroom, but the data was still sourced from the same place. Data practitioners understand well that the most attractive reasons to forgo the benefits of hardened process historian technologies are cosmetic. Modern visualisation tools sell well. Process historian visualisation tools struggle to keep pace with the visualisation tools hatched from big data because industrial automation is, by nature, robust, hardened, and conservative. To change industrial applications as frequently as non-industrial applications change, just to keep up with cosmetic benefits, would be irresponsible.
The hybrid model also deals with the silent killer: payload cost. In certain countries (including Australia) the cost of repeatedly transferring high volumes of data to and from the cloud is inhibiting, to say the least. A system architecture that includes a well-considered Data Management Plan allows the hybrid model to keep costs under control as the user base of data consumers explodes. This factor must not be overlooked: it is not superficial, it is fundamental. Unless the alternative technologies embed historians within their products, they will never be able to meet the needs of asset performance management. This is why Microsoft, Amazon, and others have developed or acquired new technologies for time series data applications. Although these applications are not aimed at industrial process data solutions, they are in part an acknowledgement that, at scale, the right technology must be used rather than generic databases.

Next Generation Historian Hypothesis

Next generation historians are likely to be needed more as data infrastructure components than as feature-rich client applications. In addition, with the increasing complexity that comes from security in design, historian technology is not likely to become simpler; rather, it will follow the same pattern. The decision not to use feature-rich client applications in the more secure layers of a network does offer some consolation.
As organisations invest more in developing their own analytical tools and in-house skills, the benefits of feature-rich historians are going to dissipate. The drive is for more open access to contextualised data. If a particular historian fails to provide contextualised data, then its days are over. In some respects, historians are likely to revert to their former primary purpose, but with far greater performance, security, and robustness than ever before.
