Optimizing Outcomes In Complex Process Manufacturing Environments | A Multi-Dimensional Algorithmic Challenge

Developments, expansions and upgrades in industrial information technology have introduced a large number of new threats to industry.
As pressure mounts on resources, production performance optimization, and process innovation, so does the range of opportunities for bad actors to intrude into industrial control systems. Cyber threat actors have become exceptionally skilled at infiltrating their targets. In fact, Industrial Control Systems (ICS) are recognized as among the most attractive targets for threat actors.

These networks were generally thought to be more secure because they were isolated from the corporate network and the internet. The advent and rapid uptake of Industrial IoT strategies among large and medium-scale manufacturing facilities has forced a convergence of operational technology (OT) systems directly with corporate IT systems. Data that was once isolated by the nature of its specific audience and intended use is now being used to build analytics and predictive algorithmic models that enable performance visibility with a holistic scope. Production optimization strategies implemented at individual production sites are now being generalized across the entire fleet, and the resulting analytics feed reports at the board level.

This report delves into the nature of the data and modeling leveraged by artificial intelligence algorithms that optimize production processes without the intervention of engineers and operations staff. Therein lies the inherent risk: bad actors could inflict high-impact cyber intrusions that immobilize plants and production operations for days or weeks, leading to massive financial and credibility losses.

Attackers have managed to compromise systems and steal valuable production data. Following such specialized attacks, mechanical controls can be taken over or compromised, centrifuge operation can be dynamically reconfigured, and devices can be reprogrammed to accelerate or slow down ICS operations. The result may be industrial equipment that is destroyed or permanently damaged, or danger to the lives of personnel working at the ICS site. The immaturity of research and methodologies for measuring and managing industrial cyber risk, together with the scarcity of industrial-control-system-specific security metrics, has often been identified as a barrier to implementing effective defenses. Some of the most effective tools for combating these threats are emerging techniques in Artificial Intelligence. By combining AI analytics with real-time data monitoring, orchestration, and automated response, these solutions consistently outperform legacy systems and human-intervention-driven response times.

Advanced malware campaigns such as botnets, ransomware, and advanced persistent threats (APTs) are among the main threats to computer security. While a campaign is maintained, the infected side needs to communicate with the attacker to receive updated commands, report status, and exfiltrate valuable data, and the attacker needs to send customized payloads and exploits for specific tasks. The delivery of commands, payloads, and other components must be conducted covertly and evasively to keep the malware from being detected and traced.

Delivering malware covertly and evasively is critical to advanced malware campaigns. Neural network models are poorly explainable yet generalize well. Research on AlexNet models has demonstrated that malware embedded in a model's neurons can be delivered covertly, with little or no impact on the performance of the neural network. Meanwhile, because the structure of the neural network model remains unchanged, it can pass the security scans of antivirus engines. With the widespread application of artificial intelligence, using neural networks as an attack vector is an emerging trend.

It is therefore appropriate to first understand the intricate mechanics of algorithmic optimization in order to appreciate the insidious nature of the damage that poor cyber risk management processes and procedures could invite.

In order to home in on opportunity areas for AI/ML-supported algorithmic optimization in a process plant environment, it is always helpful to go back to first principles and ask a few rule-of-thumb questions. Beginning a new optimization project with a process scope that is not only manageable, but also promises large payoffs through better prediction and optimization outcomes, allows data scientists and engineers to share early wins and better showcase project success.

  • Which product lines have high quality-rejection rates?
  • What refining outputs face the most frequent pressure to maintain target production levels?
  • What is the economic impact of discarding reject batches?
  • How much new economic value is created by an incremental quality improvement over a three-year period?
  • What data pools exist in the operations side of the facility? – sensor data, lab data, SCADA/PLC data, maintenance ticket data, operator data.
  • Which OT/IT systems hold this event data?
  • Is there an identified executive champion to drive the project?
  • How can AI algorithmic outputs be operationalized quickly?

We examine innovations from three companies in this report. Each brings field experience solving domain-specific problems using analytics in an IIoT setting. Two are early-stage technology companies with a track record of combining domain-specific models with AI to solve client process issues. The third is a business unit of a large global conglomerate.

Oil & gas midstream refining and chemical process/batch manufacturing often pose the hardest multivariate problems for process optimization. The three companies exemplify the use of deep domain expertise, with operating experience in upstream and midstream oil & gas and in downstream aromatics and chemicals process industries.


Source: Technip / TOTAL


Source: Technip Stone & Webster Technology

Dynamic control as the process unfolds presents unique challenges: recalibrating temperature or tweaking flow rates requires frequent feedback signals in the form of process-condition data or lab checks of output quality. While standard operating procedures are well established and codified within the chemical manufacturing industry, they are generally encoded assuming static operating conditions. Quite often they lack explicit procedures for handling dynamically changing conditions within actual manufacturing processes. Dynamic changes often introduce unknown variables that the physics models used to define plant process conditions were not designed to include as part of the built-in process control action.

For example, the mixing vessels may contain residue from a prior production run; ambient air used in process calculations may carry moisture that is not accounted for; and fine airborne suspended solids in the production area may play a larger-than-anticipated role in influencing product quality, something not always covered in the assumptions behind the process calculations.

For a manufacturer these variables create productivity blind spots:

  • What influence does each of these factors have on product quality outcomes? (How do we distinguish factors that are noise from factors that are signal?)
  • How do we predict expected quality outcome based on current conditions algorithmically, and with reasonable accuracy?
  • Can we model incorporating each influencer variable? (Some variables have disproportionately more influence on quality outcomes than others.)
  • What real-time prescriptive action can be recommended to operators on the production control floor to minimize wasteful outcomes?

AI plays a role here in processing multiple signals to predict the quality of current production, and in identifying correlations between the various input parameters (lab-quality signals, sensor anomalies, process signals, and ambient condition data) and quality outcomes.
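As an illustration of that correlation step, the sketch below ranks candidate signal streams by their correlation with a quality outcome, separating signal from noise. All signal names and the data-generating process are invented for the example.

```python
# Sketch: ranking candidate input signals by correlation with a quality
# outcome. All names and the synthetic data are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Synthetic signal streams: one strong driver, one weak driver, one pure noise.
reactor_temp = rng.normal(350.0, 5.0, n)     # strong influence (assumed)
feed_rate    = rng.normal(12.0, 1.0, n)      # weak influence (assumed)
ambient_hum  = rng.normal(0.4, 0.1, n)       # unrelated noise

quality = 0.8 * (reactor_temp - 350.0) + 0.1 * (feed_rate - 12.0) \
          + rng.normal(0.0, 1.0, n)

signals = {"reactor_temp": reactor_temp,
           "feed_rate": feed_rate,
           "ambient_humidity": ambient_hum}

# Pearson correlation with the outcome separates signal from noise.
scores = {name: abs(np.corrcoef(x, quality)[0, 1])
          for name, x in signals.items()}
for name, r in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name:18s} |r| = {r:.2f}")
```

In a real plant the same ranking would run over hundreds of historized sensor tags rather than three synthetic arrays, but the principle is unchanged.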

Abstracting Hundreds Of Sensor Signal-Streams Into An Algorithm That Outputs A Few Discrete Virtual “Process-Health Indicators”.  This Is The First Milestone In An IIoT-Analytics Journey.

Another useful optimization concept to keep in mind here comes from control theory. Traditional feedback-based control corrects for past errors and works mostly linearly on a single variable. Model-based control, on the other hand, corrects for future errors and works on multiple variables in a non-linear fashion; its primary design focus is optimization. Model-based control also allows improvement of existing control and optimization processes through better control within each unit, better coordination of production units, and improved control over inventory and final product quality.
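The contrast can be made concrete on a toy first-order plant: a proportional feedback controller reacts to the past error (and settles with a residual offset), while a controller that inverts a model of the plant cancels the future error in a single step. The dynamics and gains below are illustrative assumptions, not any vendor's control scheme.

```python
# Sketch contrasting the two control philosophies on a toy first-order
# system x[k+1] = a*x[k] + b*u[k]. Dynamics and gains are illustrative.
import numpy as np

a, b, setpoint = 0.9, 0.5, 10.0

def feedback_step(x, kp=0.8):
    """Classic feedback: react to the *past* error on a single variable.
    Pure proportional action leaves a steady-state offset."""
    return kp * (setpoint - x)

def model_based_step(x):
    """Model-based: use the plant model to cancel the *future* error,
    choosing u so that a*x + b*u equals the setpoint exactly."""
    return (setpoint - a * x) / b

def simulate(controller, steps=30):
    x = 0.0
    for _ in range(steps):
        x = a * x + b * controller(x)
    return x

x_fb = simulate(feedback_step)    # settles near 8.0 (offset below target)
x_mb = simulate(model_based_step) # hits the 10.0 setpoint
print(f"feedback:    x = {x_fb:.3f}")
print(f"model-based: x = {x_mb:.3f}")
```

Industrial model-based controllers (MPC) extend this idea to many interacting variables with constraints, but the core distinction, reacting to past error versus predicting and cancelling future error, is the one shown here.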

Model of the interacting system with its surrounding medium.

Finally, the complexity of dynamic optimization varies with the scope of the process supported. Optimizing a single-unit process is easier to solve because of its smaller scope (10-20 independent variables, roughly 50,000 equations) and has been the traditional focus of most optimization applications. However, while smaller and more manageable in scope, it may not achieve the full targeted benefit if upstream unit optimization results in downstream constraint violations. Multi-unit optimization is more difficult to solve because of its larger scope (50-100 independent variables, 200,000+ equations), but it holds the promise of greater optimization benefits because unit interactions are considered, yielding a global optimum. This kind of wider-scope optimization is the current focus today. (More details below.)



Machine learning technology absorbs sensor and maintenance data over long periods of time and then identifies patterns in that data, patterns that would typically remain elusive to an operator. A major benefit of machine learning is that it can learn under many different types of conditions (for instance, seasonal variation, different operating conditions, varying duty cycles, different inputs) based on the real-world behavior of the equipment. It learns failure signatures from a machine, as opposed to modeling the machine's environment. Those learned signatures can then transfer from one machine to another, helping the second machine avoid the conditions that caused failure in the first.
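A minimal sketch of such a transferable signature, assuming (for illustration only) that the signature is simply the mean and covariance of healthy sensor readings with a Mahalanobis-distance alarm; all readings are synthetic and the sensor names are invented.

```python
# Sketch: learning a "healthy" signature on one machine and reusing it on
# another of the same type. Signature = mean/covariance of healthy
# [vibration, temperature] readings; alarm = squared Mahalanobis distance.
import numpy as np

rng = np.random.default_rng(3)

# Machine A: healthy readings gathered across varied duty cycles.
healthy_a = rng.normal([2.0, 70.0], [0.3, 2.0], size=(1000, 2))

mu = healthy_a.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(healthy_a, rowvar=False))

def anomaly_score(x):
    """Squared Mahalanobis distance from the learned healthy signature."""
    d = x - mu
    return float(d @ cov_inv @ d)

# Transfer: the signature learned on machine A scores readings from
# machine B, flagging the same (assumed) failure mode.
normal_b  = np.array([2.1, 69.0])
failing_b = np.array([3.5, 82.0])   # hypothetical wear pattern
print(f"normal:  score = {anomaly_score(normal_b):.1f}")
print(f"failing: score = {anomaly_score(failing_b):.1f}")
```

Production systems use far richer models than a single Gaussian, but the transfer mechanism, carrying a learned signature rather than a physical model of each machine's environment, is the point being illustrated.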

Optimization is a highly complex task in which a large number of controllable parameters affect production in one way or another. Somewhere on the order of 100 different control parameters must be adjusted to find the best combination of all the variables. A machine learning-based prediction model provides a "production-rate landscape", with peaks and valleys representing high and low production. The multi-dimensional optimization algorithm then moves around this landscape looking for the highest peak, representing the highest possible production rate.

By moving through this "production-rate landscape", the algorithm can give recommendations on how best to reach that peak, i.e. which control variables to adjust and by how much. Such machine learning-based production optimization thus consists of three main components:

  1. Prediction algorithm: Your first, important step is to ensure you have a machine-learning algorithm that is able to successfully predict the correct production rates given the settings of all operator-controllable variables.
  2. Multi-dimensional optimization: You can use the prediction algorithm as the foundation of an optimization algorithm that explores which control variables to adjust in order to maximize production.
  3. Actionable output: As output from the optimization algorithm, you get recommendations on which control variables to adjust and the potential improvement in production rate from these adjustments.
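The three components above can be sketched end to end on a toy plant: a quadratic surrogate learned from historical data (prediction), a hill climb over the predicted landscape (multi-dimensional optimization), and a recommended setpoint change with its predicted gain (actionable output). The hidden plant response, parameter counts, and step sizes are all illustrative assumptions.

```python
# Sketch of the three components: learned prediction model, optimizer
# over the predicted landscape, and actionable recommendations.
import numpy as np

rng = np.random.default_rng(1)

def true_plant(u):
    """Hidden production-rate response (unknown to the algorithm)."""
    opt = np.array([0.3, -0.2, 0.5])
    return 100.0 - np.sum((u - opt) ** 2, axis=-1)

# 1) Prediction algorithm: fit a quadratic surrogate on historical data.
U = rng.uniform(-1, 1, size=(400, 3))
y = true_plant(U) + rng.normal(0, 0.1, 400)
features = np.hstack([np.ones((400, 1)), U, U ** 2])
coef, *_ = np.linalg.lstsq(features, y, rcond=None)

def predict(u):
    f = np.concatenate([[1.0], u, u ** 2])
    return f @ coef

# 2) Multi-dimensional optimization: hill-climb the predicted landscape.
u = np.zeros(3)                      # current operator setpoints
for _ in range(200):
    candidate = np.clip(u + rng.normal(0, 0.05, 3), -1, 1)
    if predict(candidate) > predict(u):
        u = candidate

# 3) Actionable output: recommended adjustments and predicted gain.
gain = predict(u) - predict(np.zeros(3))
print("recommended setpoints:", np.round(u, 2))
print(f"predicted production-rate gain: {gain:.2f}")
```

A real deployment would use a far richer model and a constrained optimizer over ~100 variables, but the pipeline shape, predict, search, recommend, is the same.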

Source: UOP

One of the companies featured in this report shared interesting experiences building such prediction and optimization models using machine learning algorithms in the upstream oil & gas space.

(Source: Vegard Flovik / Kongsberg Digital)


With sufficient information about the current situation, a well-made physics-based model enables us to understand complex processes and predict future events. Such models have already been applied all across our modern society for vastly different processes, such as predicting the orbits of massive space rockets or the behavior of nano-sized objects which are at the heart of modern electronics.

A common key question is how you choose between a physics-based model and a data-driven ML model. The answer depends on what problem you are trying to solve. In this setting, there are two main classes of problems:

1)   No Direct Theoretical Knowledge Available About The System, But A Lot Of Experimental Data On System Behavior.    In the absence of direct knowledge about the behavior of a system, one cannot formulate mathematical (physics, thermodynamics, mass balance) models to describe it and make accurate predictions. However, with many example outcomes, one can use a machine-learning-based model. Given enough example outcomes (the training data), the model should be able to learn any underlying pattern between the information available about the system (the input variables) and the outcomes it is expected to predict (the output variables).

2)   Good Understanding Of The Physical System, Allowing Its Dynamics To Be Described Mathematically.   If a problem can be well described using a physics-based (thermodynamics, mass balance, fluid dynamics, structural dynamics) model, this approach is often the better solution.

However, this approach has some limitations, especially in dynamic real-time situations. One key aspect is the computational cost of the model: while a physics-based model describes the system in detail, solving it can be complicated and time-consuming. A physics-based approach might therefore break down if we aim for a model that can make real-time predictions on live data. In this case, a simpler ML-based model could be an option. Given enough examples of how a physical system behaves, the machine learning model can learn this physical behavior and make accurate predictions. The computational cost of an ML model is incurred mainly in the training phase; once the model has finished training, making predictions on new live data is fast and straightforward.
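A small example of the trade-off, assuming a toy heat-balance "physics model" solved iteratively by bisection: the slow solver is run offline to generate training data, and a cheap fitted surrogate then answers live queries with a handful of multiplications. The equation, constants, and polynomial degree are all illustrative assumptions.

```python
# Sketch: replacing a slow physics-based solver with a fast surrogate for
# real-time use. Toy steady-state heat balance, all values illustrative.
import numpy as np

SIGMA, H, T_AMB, EPS = 5.67e-8, 10.0, 300.0, 0.9

def physics_solver(q, lo=300.0, hi=2000.0, iters=60):
    """Solve H*(T - T_AMB) + EPS*SIGMA*T^4 = q for T by bisection.
    Stands in for an expensive iterative physics model."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if H * (mid - T_AMB) + EPS * SIGMA * mid ** 4 < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Offline (training phase): run the expensive model to generate data ...
q_train = np.linspace(1e3, 5e4, 200)
T_train = np.array([physics_solver(q) for q in q_train])

# ... and fit a cheap surrogate (here, a cubic polynomial in q).
surrogate = np.poly1d(np.polyfit(q_train, T_train, 3))

# Online: predictions are now a few multiplies, fast enough for live data.
q_live = 2.5e4
print(f"physics:   T = {physics_solver(q_live):7.1f} K")
print(f"surrogate: T = {surrogate(q_live):7.1f} K")
```

The surrogate is only trustworthy inside the regime spanned by its training data, which is why simulated data (as in the Kongsberg case below) is so valuable for filling out that regime.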



Kongsberg Digital provides next generation software and digital solutions to customers within upstream oil & gas, as well as renewables & utilities industries. Its LedaFlow tool offers dynamic process and advanced transient multiphase flow simulations.

Kongsberg shared a great example of a case combining physics-based simulators and machine learning: its virtual flowmeter (VFM) for monitoring and predicting multiphase well flow rates, gas-oil ratios, and water content.

Virtual flowmeters work by calculating flow from existing instrumentation, knowledge of the facility, and fluid properties, using correlations that relate the flow rate to the pressure and temperature drop through the system. Typically, such flows are measured with expensive multiphase flow meters (MPFMs), which can account for a significant share of a facility's capital expense. Using a VFM turns this into an analytics service that helps clients gain real-time data on the constituent properties of any given stream of produced fluids. Another advantage of a VFM is that the prediction solution is relatively insensitive to the loss of one or two physical measurements across a multi-well field.

The Kongsberg-designed VFM is capable of modelling the flows of individual phases of various intermingled fluids in a single stream. To train the algorithm, the team used data from two sources, augmented with engineered features:

  • Available field data on temperatures, pressures and choke settings (the elemental parameters from which subsea multiphase meters compute quantities such as total mass flow rate, gas volume fraction, and water-liquid ratio).
  • Data generated by its own LedaFlow simulation tool. Access to simulated data was a valuable ingredient for the training. It overcomes one of the biggest constraints in machine learning – the availability of high-quality training sets.
  • Additional engineered features describing some basic physics of the system (for example, the density of the fluid mixture as a function of the pressure differential between the drill-head and the upstream inlet of the choke, or heat capacity described by the relevant temperature differentials).
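A sketch of how such a training set might be assembled, with field rows and simulator rows sharing one schema and physics-inspired differential features added on top. All column names, value ranges, and units here are assumptions for illustration, not Kongsberg's actual schema.

```python
# Sketch: assembling a VFM training set from field data, simulator data,
# and physics-inspired engineered features. All names/ranges are assumed.
import numpy as np

rng = np.random.default_rng(2)

def make_rows(n, source):
    """Synthetic stand-in for one data source (field or simulator)."""
    return dict(
        p_up=rng.uniform(150.0, 250.0, n),   # upstream pressure, bar
        p_dn=rng.uniform(50.0, 140.0, n),    # downstream pressure, bar
        t_in=rng.uniform(60.0, 90.0, n),     # inlet temperature, degC
        t_out=rng.uniform(30.0, 55.0, n),    # outlet temperature, degC
        choke=rng.uniform(0.1, 1.0, n),      # choke opening fraction
        source=np.full(n, source))

def engineer(d):
    """Engineered features: the differentials flow correlations rely on."""
    d["dp"] = d["p_up"] - d["p_dn"]          # pressure drop across choke
    d["dt"] = d["t_in"] - d["t_out"]         # temperature drop
    return d

# Scarce field data plus abundant simulator data fill out the training set.
field = engineer(make_rows(200, "field"))
sim   = engineer(make_rows(800, "simulator"))

X = {k: np.concatenate([field[k], sim[k]]) for k in ["dp", "dt", "choke"]}
print({k: v.shape for k, v in X.items()})
```

The simulator rows dominate deliberately: as the report notes, access to simulated data overcomes one of the biggest constraints in machine learning, the scarcity of high-quality training sets.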

The New Frontier For Cybersecurity In Industrial Systems