Today’s incident response teams operate like hospital emergency rooms (ERs). When an alert comes in—malware infection, intrusion, anomalous behavior, etc.—a specialist is called in for diagnosis and remediation.
In cybersecurity, as in healthcare, good preventive care can improve outcomes and reduce costs. At Cisco, I’ve been a part of a team responsible for imagining a preventive-care approach to cybersecurity—“cyber hygiene.” Here’s our current thinking on what’s needed.
Foundation: A Holistic View of Digital Assets
The first requirement is a way to discover which assets are at risk, that is, out of compliance with organizational or regulatory policy. Take Cisco IT, for instance. We have a policy that requires Linux servers to run Cisco Advanced Malware Protection (AMP) for Endpoints and to require Cisco Duo for multi-factor authentication. But with tens of thousands of data center servers, it’s highly likely that some will be out of compliance at any given time. Why? Obsolete or decommissioned virtual machines sometimes linger on the network longer than expected. And some servers might not be registered with the configuration management database (CMDB).
Assessing risk requires collecting information that’s scattered across diverse, often independently administered systems. These include CMDBs, antimalware solutions, intrusion prevention systems, vulnerability management systems, and others. The aim is a holistic view of the digital estate—device information, configuration details, location, business function, applications, security control coverage, security control performance, and more.
It turns out that building a holistic inventory is not so easy. Most customers we speak with are still trying to make their CMDB a single source of truth—up-to-date and accurate. The sticking point is that neither CMDBs nor security controls are designed to reconcile conflicting information from various data silos. Solving this problem is an important first step to cyber hygiene.
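To make the idea concrete, here is a minimal sketch in Python of merging records from separate systems into one inventory and flagging policy gaps. The host names, field names, and the `build_inventory` and `find_gaps` helpers are hypothetical and are not taken from any Cisco product. The sketch also assumes every source identifies a server by the same hostname, which sidesteps the hard part described above: reconciling records that conflict.

```python
from collections import defaultdict

# Hypothetical exports from three independently administered systems.
cmdb = [
    {"hostname": "web-01", "owner": "ecommerce", "region": "EMEA"},
    {"hostname": "db-01", "owner": "finance", "region": "AMER"},
]
antimalware_hosts = ["web-01", "db-01", "build-07"]  # hosts reporting an agent
mfa_hosts = ["web-01", "build-07"]                   # hosts enforcing MFA

# Assumed policy: every server needs both controls.
REQUIRED_CONTROLS = {"antimalware", "mfa"}


def build_inventory(cmdb, antimalware_hosts, mfa_hosts):
    """Merge per-source records into one entry per hostname."""
    inventory = defaultdict(lambda: {"in_cmdb": False, "controls": set()})
    for record in cmdb:
        entry = inventory[record["hostname"]]
        entry["in_cmdb"] = True
        entry.update(owner=record["owner"], region=record["region"])
    for host in antimalware_hosts:
        inventory[host]["controls"].add("antimalware")
    for host in mfa_hosts:
        inventory[host]["controls"].add("mfa")
    return dict(inventory)


def find_gaps(inventory):
    """Return {hostname: [problems]} for hosts that violate the policy."""
    gaps = {}
    for host, entry in sorted(inventory.items()):
        missing = REQUIRED_CONTROLS - entry["controls"]
        problems = sorted(f"missing {control}" for control in missing)
        if not entry["in_cmdb"]:
            problems.append("not registered in CMDB")
        if problems:
            gaps[host] = problems
    return gaps


inventory = build_inventory(cmdb, antimalware_hosts, mfa_hosts)
gaps = find_gaps(inventory)
for host, problems in gaps.items():
    print(host, problems)
```

In this toy data, build-07 has both controls but never made it into the CMDB, which is the kind of asset a single-source inventory would miss.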
Understanding Hygiene by Business Application, Geography, or Other Factors
Jumping ahead, imagine that you now have a holistic asset inventory. Suppose 82% of your data center servers comply with your organization’s security policy. Without a way to drill down, that number isn’t especially useful. Are all business applications earning a “B” grade, or is one application 95% compliant (very good) while others are 62% compliant (failing)? Are the non-compliant servers spread across functions (less risk), or are most used for one critical business process (unacceptable risk)?
To produce actionable information, the cyber hygiene dashboard needs multiple business-aligned views, such as geography, executive, and business function. By filtering the data through these lenses, the security team can better understand risk and conduct targeted campaigns. If just one of four geographical regions is lax, for example, the security team can discuss concerns with the responsible executive. This targeted, risk-based approach is more effective than sending a general email to everyone owning an out-of-compliance asset, and then not having the resources to follow up.
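The drill-down described here amounts to grouping the same inventory by different attributes. The following Python sketch shows one way to do that. The sample rows, the field names, and the `compliance_by` helper are illustrative assumptions, not part of any existing dashboard.

```python
from collections import defaultdict

# Hypothetical inventory rows; "compliant" means all required controls are present.
servers = [
    {"host": "a1", "application": "billing", "region": "AMER", "compliant": True},
    {"host": "a2", "application": "billing", "region": "EMEA", "compliant": True},
    {"host": "e1", "application": "email", "region": "EMEA", "compliant": False},
    {"host": "e2", "application": "email", "region": "EMEA", "compliant": False},
    {"host": "e3", "application": "email", "region": "AMER", "compliant": True},
]


def compliance_by(servers, dimension):
    """Return the percentage of compliant servers per value of `dimension`."""
    totals = defaultdict(int)
    passing = defaultdict(int)
    for server in servers:
        key = server[dimension]
        totals[key] += 1
        if server["compliant"]:
            passing[key] += 1
    return {key: round(100 * passing[key] / totals[key], 1) for key in totals}


overall = round(100 * sum(s["compliant"] for s in servers) / len(servers), 1)
by_application = compliance_by(servers, "application")
by_region = compliance_by(servers, "region")

print("overall:", overall)
print("by application:", by_application)
print("by region:", by_region)
```

The overall figure for this sample is 60%, which hides the fact that the billing application is fully compliant while email is the one that needs attention.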
Triage: Setting Priorities Based on Asset Risk
Let’s say that only 62% of servers in a region have the necessary controls in place and that remediation will take four months. Which servers should you remediate first?
We recommend a triage process based on business risk. The standard risk formula is:
Asset Criticality × Probability of Loss Event = Risk
Criticality is the value of an asset to the organization. For example, an email server for the entire company is more critical than a development server for one engineer in a lab. The probability of a loss event depends on the asset’s exposure (e.g., Internet-accessible versus internal only) and compensating controls (e.g., multi-factor authentication, antimalware, and vulnerability scanning). Business risk calculations are more nuanced than this, but that’s the idea.
Manually calculating business risk isn’t practical because you’re dealing with dozens of criticality and probability factors, possibly for tens of thousands of servers. In our view, a comprehensive cyber hygiene platform needs to automate business risk calculations.
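As a rough illustration of what automating the formula could look like, the Python sketch below scores a handful of servers and ranks them for remediation. Every number in it is an assumption made up for the example: the 1-to-5 criticality scale, the base probabilities per exposure level, and the factor by which each compensating control reduces probability. A real platform would weigh many more factors.

```python
from dataclasses import dataclass, field

# Assumed base probability of a loss event by exposure (illustrative values).
BASE_PROBABILITY = {"internet": 0.5, "internal": 0.25}

# Assumed multiplier applied for each compensating control that is present.
CONTROL_FACTOR = {"mfa": 0.5, "antimalware": 0.5, "vuln_scanning": 0.75}


@dataclass
class Server:
    name: str
    criticality: int              # assumed scale: 1 (low) to 5 (high)
    exposure: str                 # "internet" or "internal"
    controls: set = field(default_factory=set)


def probability_of_loss(server):
    """Start from the exposure baseline and reduce it per control in place."""
    probability = BASE_PROBABILITY[server.exposure]
    for control in server.controls:
        probability *= CONTROL_FACTOR[control]
    return probability


def risk(server):
    """Risk = asset criticality x probability of loss event."""
    return server.criticality * probability_of_loss(server)


servers = [
    Server("mail-01", criticality=5, exposure="internet", controls={"mfa"}),
    Server("dev-lab-03", criticality=1, exposure="internal"),
    Server("erp-02", criticality=4, exposure="internal", controls={"antimalware"}),
    Server("web-04", criticality=3, exposure="internet"),
]

# Highest risk first: this is the suggested remediation order.
ranked = sorted(servers, key=risk, reverse=True)
for server in ranked:
    print(f"{server.name}: {risk(server):.2f}")
```

With these assumed weights, the unprotected Internet-facing web server outranks the more critical mail server because the mail server already has multi-factor authentication in place.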
Information Flow from Incident Response to Cyber Hygiene
Finally, our team is exploring how to continuously improve cyber hygiene via a feedback loop. Returning to the healthcare analogy, after a patient is seen in the ER, the patient’s primary care doctor follows up to understand the cause of the incident and suggest behavior changes to prevent a recurrence. In the case of a broken hip, for instance, the doctor might recommend a grab bar in the shower.
Applying the analogy to cybersecurity, say the incident response team learns of an Exchange server breach from the intrusion detection system. An investigation reveals that the vector was a phishing attack. In response, the cyber hygiene team could step up workforce education and increase the frequency of phishing email tests.
As threat volume continues to grow, our research shows that a formal cyber hygiene strategy will be vital to protect assets, information, and the business. Must-haves include a holistic asset inventory, a process to automatically calculate business risk, and a process to identify the remediation actions with the most potential to reduce risk.
What are the roadblocks to cyber hygiene in your organization? What would your ideal solution look like?
Author Travis Sugarbaker is engineering technical lead at Cisco Systems.