Following individual attacks by state cyber-armies in the past, organized cybercriminals have now also discovered the manufacturing industry as a target. This shortens the time companies have left to implement at least basic protection in operational technology (OT) and to integrate it into IT security. Concepts and technologies are available and, especially at the process level, they enable increasing automation and integration that can take things up a gear.
The drive for business efficiency is pushing ahead digitalization and the accompanying networking of manufacturing environments. Exceptional circumstances like the coronavirus pandemic have accelerated this process even further, for example through the introduction of ad hoc remote access points in some places.
This is understandable: necessary business transformations have always led to (sometimes disruptive) changes and especially in manufacturing, the show must go on, even during the pandemic.
However, the stark rise in the number of cyber incidents in manufacturing worldwide makes it clear that we should pause for a moment in this transformation. It’s important to address cybersecurity weak spots in the right way. This means first closing existing gaps as quickly as possible, so that we can move forward on a solid foundation.
The lockdown was the defining topic of 2020 and it presented difficult choices for decision-makers in businesses. Even companies that previously didn’t think much of working from home were quick to discover that remote access points were the method of choice for business continuity. For office work, such a remote working environment was comparatively easy to achieve. This applied to the white-collar sector, but not to warehouses or manufacturing. Here, a large (if not the largest) proportion of added value is generated on site, just as in the operation of critical infrastructures.
Home office in manufacturing? Difficult. But in fact, machines and plants can of course be maintained and operated from a distance. In line with this, the number of remote access points in operational technology (OT) also grew rapidly in 2020.
The lockdown forced us to act. During this time, I saw many manufacturing environments where this was implemented in a pragmatic way, in remote administration for example. In one case, an “F” box (a small DSL router) was hanging on the side wall of a 19-inch rack where we had just installed an OT security monitoring system for machine control, taped onto a switch. And whenever I stopped and looked at such a setup for a few seconds, people often said to me: “It’s just temporary”.
That’s perhaps an extreme example of how we opened up our OT networks and are now reacting in retrospect: thinking about segmentation and transparency (monitoring of network activity) and trying to catch up without disrupting production.
This has made us vulnerable. So it’s no wonder that damage to manufacturing plants from ransomware in particular, which spreads across unsecured network connections and is most successful at infecting unpatched systems, is currently the most relevant threat. This was a predictable development once the speed of networking outdated systems overtook the speed of protecting them.
We therefore need to be faster in two ways. First, we need to close the protection gap that has opened up. That includes, for example, the lack of segmentation, access controls (including secure, industrial-grade remote maintenance access points), monitoring, endpoint protection, and cybersecurity processes in manufacturing.
Second, there is the question of response times: especially in the case of a ransomware attack in older manufacturing environments, the infection can, under certain circumstances, spread across the whole network in a matter of minutes. Here, a manual response is too slow. However, an automatic response is no easy task in the manufacturing environment, where endpoints and protocols are poorly standardized and sensitive (to changes to or installation of software). Standard IT solutions are often unusable here because of their high false positive rates, and under certain circumstances they can even erroneously block critical process communications.
How can we speed up now?
It is well known that the integration of IT and OT security is a necessary requirement for effective cybersecurity. If we look at the above example of a ransomware infection, such attacks often start with activities on the IT side (phishing emails, malware downloads, visits to infected websites, suspicious DNS requests, installation of suspicious software on endpoints, remote access at unusual times, etc.) that are easily detected with the methods and technologies currently used there. Alarms triggered on the IT side can therefore serve as an early-warning system, and analyses from the IT side shorten response times on the OT side. This integrated approach (IT plus OT) therefore facilitates more targeted responses. If necessary, SIEM or SOAR use cases and scenarios for OT systems can also be activated: security information and event management (SIEM) and security orchestration, automation and response (SOAR) automate and accelerate typical manual level 1 SOC analyst activities like verification, qualification and queries.
Any technology whose objective is to automate processes, machines, and plants can be described as automation technology. This means that these processes, machines, and plants are put in a position where they can work without human input and control themselves.
Institut für Integrierte Produktion (IPH: Institute for Integrated Manufacturing)
In manufacturing, all the technology that we now want to protect in the field of OT security has led to a shift towards higher efficiency and profitability. In IT security, automated responses to specific threats have likewise led to a considerable increase in efficiency that we now take for granted (endpoint protection, security proxies, mail gateways, network filters, etc.). Until now, those responsible for OT systems have understandably been more cautious and have acted passively: suspicious activities are detected, but not automatically blocked. They are worried that by blocking a connection they might, for example, block a critical control signal with a safety function (safety instrumented system, SIS).
Nevertheless, it would also help here if each alarm didn’t have to first land in a ticket queue and wait there to be processed by an analyst. This is especially true for our example of the spread of a ransomware infection.
What can technology offer at the moment in terms of automation in the OT security sector? Some examples include:
Security gateways suitable for industrial networks (aka OT firewalls): among other things, these gateways can recognize and stop critical commands (Stop PLC, Upload Program, etc.). They can also react to known attack signatures specific to control devices (e.g. in Server Message Block packets) and block the corresponding packets (virtual patching).
Intrusion detection systems (IDS) for industrial networks that control firewalls: these IDS passively learn the “normal” behavior of the plant and recognize deviations from it, partially through machine learning. These systems can control firewalls via suitable open interfaces (typically REST APIs). This makes it possible, for example, to automatically block the communication between an infected machine and an attacker’s command & control (C2) server in a targeted way. In this way, the download or activation of malware can be prevented in a matter of seconds.
SOAR: for me, one of the most exciting topics in this sector is the effective use of the SOAR technology that is available today. It makes it possible to orchestrate and automate incident response processes, which can accelerate them considerably. For this, the SOAR tool is supplied via APIs with data (alarms, triggers, etc.) from different security systems in “quasi real time” (within seconds, which is not real time in the process-control sense). At the same time, it has access to data sources for automatic contextual enrichment (asset or vulnerability databases, IoCs, threat intelligence information, user information, and physical context such as the status of security cameras). When a situation triggers the SOAR system, it follows a predefined playbook. It can then use further automation interfaces to carry out certain actions autonomously, where necessary.
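The second example above, an IDS that learns normal behavior and then drives a firewall, can be illustrated with a minimal Python sketch. It makes several assumptions that are not in the text: the baseline is a plain set of observed flows rather than machine learning, the class and function names are hypothetical, and the request body is an invented format for an imagined firewall REST endpoint, not the API of any real product.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Flow:
    """One observed network connection (hypothetical, simplified model)."""
    src: str
    dst: str
    port: int


@dataclass
class BaselineIds:
    """Learns which flows are normal, then flags anything it has not seen."""
    known: set = field(default_factory=set)
    learning: bool = True

    def observe(self, flow: Flow) -> bool:
        """Return True if the flow deviates from the learned baseline."""
        if self.learning:
            self.known.add(flow)
            return False
        return flow not in self.known


def build_block_request(flow: Flow) -> dict:
    """Build the JSON body for an assumed firewall REST endpoint."""
    return {
        "action": "deny",
        "source": flow.src,
        "destination": flow.dst,
        "port": flow.port,
    }


# Learning phase: passively record the plant's normal traffic.
ids = BaselineIds()
for normal in [Flow("10.0.1.10", "10.0.1.20", 502),
               Flow("10.0.1.11", "10.0.1.20", 502)]:
    ids.observe(normal)
ids.learning = False

# Detection phase: a machine suddenly talks to an unknown external host.
suspicious = Flow("10.0.1.10", "203.0.113.7", 443)
if ids.observe(suspicious):
    request = build_block_request(suspicious)
    print(request)  # in a real setup this body would be sent to the firewall
```

The sketch deliberately stops at building the request: whether it is sent automatically or only after an analyst confirms it is exactly the design decision discussed in the following paragraphs.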
Are the technologies described above now ready to go live? The topic of breakdowns due to false positive responses from cybersecurity components can’t be ignored, especially since such components have a history of this in IT security. However, my opinion is that the approaches described above can considerably reduce cyber-risk without increasing failure risk, provided they are implemented with an understanding of the OT process they are protecting and in the right place. The key points here are “in the right place” and “semi-automated”.
“In the right place” refers to segmentation and the automatic blocking of easily detectable attack patterns. For this purpose, there are known signatures that are highly specific and have a very low false positive rate. The block needs to be carried out at the zone border of the OT process and not within it. This makes it highly probable that the connection between the attacker and their objective can be interrupted, preventing further damage. In addition, an incorrectly blocked connection won’t immediately halt the process; it will, for example, only temporarily block an ERP system’s read access.
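The rule described here can be written down as a small decision function. The following Python sketch is only an illustration under stated assumptions: the zone names, the `Detection` fields, and the threshold `MAX_FP_RATE` are hypothetical and would have to be chosen for the specific plant.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """A signature match reported by a security gateway (hypothetical model)."""
    signature_id: str
    src_zone: str
    dst_zone: str
    false_positive_rate: float  # estimated rate between 0.0 and 1.0


# Assumed threshold: only signatures this reliable may block automatically.
MAX_FP_RATE = 0.001


def decide(detection: Detection) -> str:
    """Block automatically only at a zone border and only for reliable signatures."""
    crosses_border = detection.src_zone != detection.dst_zone
    high_confidence = detection.false_positive_rate <= MAX_FP_RATE
    if crosses_border and high_confidence:
        return "block"
    # Inside a zone, or with an unreliable signature, only raise an alert.
    return "alert"


border_hit = Detection("smb-exploit", "office-it", "ot-cell-1", 0.0001)
inside_hit = Detection("smb-exploit", "ot-cell-1", "ot-cell-1", 0.0001)
noisy_hit = Detection("generic-anomaly", "office-it", "ot-cell-1", 0.05)

for hit in (border_hit, inside_hit, noisy_hit):
    print(hit.signature_id, hit.src_zone, "->", hit.dst_zone, ":", decide(hit))
```

Traffic inside the OT zone is never blocked by this function, which reflects the concern that a false positive there could interrupt the process itself.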
“Semi-automated” means that technical systems like SOAR or modern network detection & response (NDR) systems automatically process as much information as possible in advance, present the result to an analyst and let the analyst decide what to do. The analyst can then initiate what they believe is the best response with the click of a button (and that response is itself automated). An example: using machine learning, a network detection sensor identifies abnormal behavior in control components and suspicious changes in the network communication topology, such as a new remote connection from an unfamiliar source. The SOAR system reacts and
1. queries the asset database for the criticality and OT context of the affected systems,
2. checks whether a maintenance window has been planned for this time period,
3. and checks whether vulnerabilities are known in the network components listed in the asset database as being used for remote access, and whether these are potentially already being actively exploited.
The result is then presented to the analyst, who uses their expertise to decide whether the system needs to block the remote connection, for example.
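The three enrichment steps and the hand-over to the analyst could look roughly like the following Python sketch. Everything in it is an assumption made for illustration: the in-memory dictionaries stand in for a real asset database, maintenance calendar, and vulnerability feed, and the asset names, the vulnerability ID, and the `enrich` function are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime

# Hypothetical stand-ins for the asset DB, maintenance calendar and vulnerability feed.
ASSET_DB = {
    "plc-07": {"criticality": "high", "process": "filling line",
               "remote_gateway": "vpn-gw-2"},
}
MAINTENANCE_WINDOWS = {
    "plc-07": [(datetime(2021, 3, 1, 22, 0), datetime(2021, 3, 2, 2, 0))],
}
KNOWN_VULNERABILITIES = {
    "vpn-gw-2": [{"id": "EXAMPLE-0001", "actively_exploited": True}],
}


@dataclass
class Case:
    """Enriched summary that is shown to the analyst."""
    asset: str
    criticality: str
    in_maintenance_window: bool
    exploited_vulnerabilities: list
    recommendation: str


def enrich(asset: str, seen_at: datetime) -> Case:
    # Step 1: criticality and OT context from the asset database.
    record = ASSET_DB[asset]
    # Step 2: was a maintenance window planned for this point in time?
    windows = MAINTENANCE_WINDOWS.get(asset, [])
    in_window = any(start <= seen_at <= end for start, end in windows)
    # Step 3: known, actively exploited vulnerabilities in the remote access component.
    vulns = KNOWN_VULNERABILITIES.get(record["remote_gateway"], [])
    exploited = [v["id"] for v in vulns if v["actively_exploited"]]
    if in_window and not exploited:
        recommendation = "likely planned maintenance"
    else:
        recommendation = "review: consider blocking the remote connection"
    return Case(asset, record["criticality"], in_window, exploited, recommendation)


# A new remote connection to plc-07 is seen outside the planned window.
case = enrich("plc-07", datetime(2021, 3, 1, 14, 30))
print(case)
```

The function only produces a recommendation. The blocking action itself stays behind the analyst's decision, which is what keeps the approach semi-automated.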
The ransomware example here serves to illustrate thoughts on automation. You can considerably reduce the risk of ransomware even with simple means: create and protect backups, segment your OT network at least roughly into zones that each group the systems which have to communicate with one another without interruption, and secure remote maintenance access points like engineering notebooks.
Finally, here is another exciting tip: since businesses in critical infrastructure are increasingly being exposed to the risk of physical and cyberattacks, security companies like Hitachi ABB, Deutsche Telekom Security, and Securitas are offering a common security concept: the Industrial Security Center for IT and OT infrastructures.