Incident Response involves the methods, policies, and procedures that are used by an organization to respond to a cyber attack. The aims of incident response are to limit the impact of the attack, assess the damage caused, and implement recovery procedures. Because of the potential large-scale loss of property and revenue that can be caused by cyber-attacks, it is essential that organizations create and maintain detailed incident response plans and designate personnel who are responsible for executing all aspects of that plan. In this article, I want to talk about some of the ways to establish Incident Response Capability in cybersecurity.
The U.S. National Institute of Standards and Technology (NIST) recommendations for incident response are detailed in their Special Publication 800-61, revision 2 entitled “Computer Security Incident Handling Guide,”
Note: Although this chapter summarizes much of the content in the NIST 800-61r2 standard, you should be familiar with the entire publication as it covers four major exam topics for the Understanding Cisco Cybersecurity Operations Fundamentals exam.
The NIST 800-61r2 standard provides guidelines for incident handling, particularly for analyzing incident-related data, and determining the appropriate response to each incident. The guidelines can be followed independently of particular hardware platforms, operating systems, protocols, or applications.
The first step for an organization is to establish a computer security incident response capability (CSIRC). NIST recommends creating policies, plans, and procedures for establishing and maintaining a CSIRC.
An incident response policy details how incidents should be handled based on the organization’s mission, size, and function. The policy should be reviewed regularly to adjust it to meet the goals of the roadmap that has been laid out. Policy elements include the following:
- Statement of management commitment
- Purpose and objectives of the policy
- Scope of the policy
- Definition of computer security incidents and related terms
- Organizational structure and definition of roles, responsibilities, and levels of authority
- Prioritization of severity ratings of incidents
- Performance measures
- Reporting and contact forms
A good incident response plan helps to minimize damage caused by an incident. It also helps to make the overall incident response program better by adjusting it according to lessons learned. It will ensure that each party involved in the incident response has a clear understanding of not only what they will be doing, but what others will be doing as well. Plan elements are as follows:
- Strategies and goals
- Senior management approval
- An organizational approach to incident response
- How the incident response team will communicate with the rest of the organization and with other organizations
- Metrics for measuring the incident response capacity
- How the program fits into the overall organization
The procedures that are followed during an incident response should follow the incident response plan. Procedures elements are as follows:
- Technical processes
- Using techniques
- Filling out forms,
- Following checklists
These are typical standard operating procedures (SOPs). These SOPs should be detailed so that the mission and goals of the organization are in mind when these procedures are followed. SOPs minimize errors that may be caused by personnel that are under stress while participating in incident handling. It is important to share and practice these procedures, making sure that they are useful, accurate, and appropriate.
Incident Response Stakeholders
Other groups and individuals within the organization may also be involved with incident handling. It is important to ensure that they will cooperate before an incident is underway. Their expertise and abilities can help the Computer Security Incident Response Team (CSIRT) to handle the incident quickly and correctly. These are some of the stakeholders that may be involved in handing a security incident:
- Management – Managers create the policies that everyone must follow. They also design the budget and are in charge of staffing all of the departments. Management must coordinate the incident response with other stakeholders and minimize the damage of an incident.
- Information Assurance – This group may need to be called in to change things such as firewall rules during some stages of incident management such as containment or recovery.
- IT Support – This is the group that works with the technology in the organization and understands it the most. Because IT support has a deeper understanding, it is more likely that they will perform the correct action to minimize the effectiveness of the attack or preserve evidence properly.
- Legal Department – It is a best practice to have the legal department review the incident policies, plans, and procedures to make sure that they do not violate any local or federal guidelines. Also, if any incident has legal implications, a legal expert will need to become involved. This might include prosecution, evidence collection, or lawsuits.
- Public Affairs and Media Relations – There are times when the media and the public might need to be informed of an incident, such as when their personal information has been compromised during an incident.
- Human Resources – The human resources department might need to perform disciplinary measures if an incident caused by an employee occurs.
- Business Continuity Planners – Security incidents may alter an organization’s business continuity. It is important that those in charge of business continuity planning are aware of security incidents and the impact they have had on the organization as a whole. This will allow them to make any changes in plans and risk assessments.
- Physical Security and Facilities Management – When a security incident happens because of a physical attack, such as tailgating or shoulder surfing, these teams might need to be informed and involved. It is also their responsibility to secure facilities that contain evidence from an investigation.
The Cybersecurity Maturity Model Certification
The Cybersecurity Maturity Model Certification (CMMC) framework was created to assess the ability of organizations that perform functions for the U.S. Department of Defense (DoD) to protect the military supply chain from disruptions or losses due to cybersecurity incidents. Security breaches related to DoD information indicated that NIST standards were not sufficient to mitigate against the increasing and evolving threat landscape, especially from nation-state treat actors. In order for companies to receive contracts from the DoD, those companies must be certified. The certification consists of five levels, with different levels required depending on the degree of security required by the project.
The CMMC specifies 17 domains, each of which has a varying number of capabilities that are associated with it. The organization is rated by the maturity level that has been achieved for each of the domains. One of the domains concerns incident response. The capabilities that are associated with the incident response domain are as follows:
- Plan incident response
- Detect and report events
- Develop and implement a response to a declared incident
- Perform post-incident reviews
- Test incident response
The CMMC certifies organizations by level. For most domains, there are five levels, however, for incident response, there are only four. The higher the level that is certified, the more mature the cybersecurity capability of the organization. A summary of the incidence response domain maturity levels is shown below.
- Level 2 – Establish an incident response plan that follows the NIST process. Detect, report, and prioritize events. Respond to events by following predefined procedures. Analyze the cause of incidents in order to mitigate future issues.
- Level 3 – Document and report incidents to stakeholders that have been identified in the incident response plan. Test the incident response capability of the organization.
- Level 4 – Use knowledge of attacker tactics, techniques, and procedures (TPT) to refine incident response planning and execution. Establish a security operation center (SOC) that facilitates a 24/7 response capability.
- Level 5 – Utilize accepted and systematic computer forensic data gathering techniques including the secure handling and storage of forensic data. Develop and utilize manual and automated real-time responses to potential incidents that follow known patterns.
NIST Incident Response Life Cycle
NIST defines four steps in the incident response process life cycle, as shown in the figure.
- Preparation – The members of the CSIRT are trained in how to respond to an incident. CSIRT members should continual develop knowledge of emerging threats.
- Detection and Analysis – Through continuous monitoring, the CSIRT quickly identifies, analyzes, and validates an incident.
- Containment, Eradication, and Recovery – The CSIRT implements procedures to contain the threat, eradicate the impact on organizational assets, and use backups to restore data and software. This phase may cycle back to detection and analysis to gather more information, or to expand the scope of the investigation.
- Post-Incident Activities – The CSIRT then documents how the incident was handled, recommends changes for future response, and specifies how to avoid a reoccurrence.
The incident response life cycle is meant to be a self-reinforcing learning process whereby each incident informs the process for handling future incidents. Each of these phases are discussed in more detail in this topic.
The image depicts the NIST incident response cycle, with arrows showing the normal workflow and feedback in an incident response.
Incident Response Life Cycle
The preparation phase is when the CSIRT is created and trained. This phase is also when the tools and assets that will be needed by the team to investigate incidents are acquired and deployed. The following list has examples of actions that also take place during the preparation phase:
- Organizational processes are created to address communication between people on the response team. This includes such things as contact information for stakeholders, other CSIRTs, and law enforcement, an issue tracking system, smartphones, encryption software, etc.
- Facilities to host the response team and the SOC are created.
- Necessary hardware and software for incident analysis and mitigation is acquired. This may include forensic software, spare computers, servers and network devices, backup devices, packet sniffers, and protocol analyzers.
- Risk assessments are used to implement controls that will limit the number of incidents.
- Validation of security hardware and software deployment is performed on end-user devices, servers, and network devices.
- User security awareness training materials are developed.
Additional incident analysis resources might be required. Examples of these resources are a list of critical assets, network diagrams, port lists, hashes of critical files, and baseline readings of system and network activity. Mitigation software is also an important item when preparing to handle a security incident. An image of a clean OS and application installation files may be needed to recover a computer from an incident.
Often, the CSIRT may have a jump kit prepared. This is a portable box with many of the items listed above to help in establishing a swift response. Some of these items may be a laptop with appropriate software installed, backup media, and any other hardware, software, or information to help in the investigation. It is important to inspect the jump kit on a regular basis to install updates and make sure that all the necessary elements are available and ready for use. It is helpful to practice deploying the jump kit with the CSIRT to ensure that the team members know how to use its contents properly.
The same boxes as the previous section are shown with the preparation box highlighted.
Detection and Analysis
The same boxes as the previous section are shown with the detection and analysis box highlighted.
Detection & Analysis Phase
Because there are so many different ways in which a security incident can occur, it is impossible to create instructions that completely cover each step to follow to handle them. Different types of incidents will require different responses.
An organization should be prepared to handle any incident but should focus on the most common types of incidents so that they can be dealt with swiftly. These are some of the more common types of attack vectors:
- Web – Any attack that is initiated from a website or application hosted by a website.
- Email – Any attack that is initiated from an email or email attachment.
- Loss or Theft – Any equipment that is used by the organization such as a laptop, desktop, or smartphone can provide the required information for someone to initiate an attack.
- Impersonation – When something or someone is replaced for the purpose of malicious intent.
- Attrition – Any attack that uses brute force to attack devices, networks, or services.
- Media – Any attack that is initiated from external storage or removable media.
Containment, Eradication, and Recovery
The same boxes as the previous section are shown with the containment, eradication, and recovery box highlighted.
Containment, Eradication, and Recovery Phase
After security incident has been detected and sufficient analysis has been performed to determine that the incident is valid, it must be contained in order to determine what to do about it. Strategies and procedures for incident containment need to be in place before an incident occurs and implemented before there is widespread damage.
For every type of incident, a containment strategy should be created and enforced. These are some conditions to determine the type of strategy to create for each incident type:
- How long it will take to implement and complete a solution?
- How much time and how many resources will be needed to implement the strategy?
- What is the process to preserve evidence?
- Can an attacker be redirected to a sandbox so that the CSIRT can safely document the attacker’s methodology?
- What will be the impact to the availability of services?
- What is the extent of damage to resources or assets?
- How effective is the strategy?
During containment, additional damage may be incurred. For example, it is not always advisable to unplug the compromised host from the network. The malicious process could notice this disconnection to the CnC controller and trigger a data wipe or encryption on the target. This is where experience and expertise can help to contain an incident beyond the scope of the containment strategy.
The same boxes as the previous section are shown with the post-incident activity box highlighted.
Post-Incident Activity Phase
After incident response activities have eradicated the threats and the organization has begun to recover from the effects of the attack, it is important to take a step back and periodically meet with all of the parties involved to discuss the events that took place and the actions of all of the individuals while handling the incident. This will provide a platform to learn what was done right, what was done wrong, what could be changed, and what should be improved upon.
After a major incident has been handled, the organization should hold a “lessons learned” meeting to review the effectiveness of the incident handling process and identify necessary hardening needed for existing security controls and practices. Examples of good questions to answer during the meeting include the following:
- Exactly what happened, and when?
- How well did the staff and management perform while dealing with the incident?
- Were the documented procedures followed? Were they adequate?
- What information was needed sooner?
- Were any steps or actions taken that might have inhibited the recovery?
- What would the staff and management do differently the next time a similar incident occurs?
- How could information sharing with other organizations be improved?
- What corrective actions can prevent similar incidents in the future?
- What precursors or indicators should be watched for in the future to detect similar incidents?
- What additional tools or resources are needed to detect, analyze, and mitigate future incidents?
Incident Data Collection and Retention
By having ‘lessons learned’ meetings, the collected data can be used to determine the cost of an incident for budgeting reasons, as well as to determine the effectiveness of the CSIRT, and identify possible security weaknesses throughout the system. The collected data needs to be actionable. Only collect data that can be used to define and refine the incident handling process.
A higher number of incidents handled can show that something in the incidence response methodology is not working properly and needs to be refined. It could also show incompetence in the CSIRT. A lower number of incidents might show that network and host security has been improved. It could also show a lack of incident detection. Separate incident counts for each type of incident may be more effective at showing strengths and weakness of the CSIRT and implemented security measures. These subcategories can help to target where a weakness resides, rather than whether there is a weakness at all.
The time of each incident provides insight into the total amount of labor used and the total time of each phase of the incident response process. The time until the first response is also important, as well as how long it took to report the incident and escalate it beyond the organization, if necessary.
It is important to perform an objective assessment of each Incident. The response to an incident that has been resolved can be analyzed to determine how effective it was. NIST Special Publication 800-61 provides the following examples of activates that are performed during an objective assessment of an incident:
- Reviewing logs, forms, reports, and other incident documentation for adherence to established incident response policies and procedures.
- Identifying which precursors and indicators of the incident were recorded to determine how effectively the incident was logged and identified.
- Determining if the incident caused damage before it was detected.
- Determining if the actual cause of the incident was identified, and identifying the vector of attack, the vulnerabilities exploited, and the characteristics of the targeted or victimized systems, networks, and applications.
- Determining if the incident is a recurrence of a previous incident.
- Calculating the estimated monetary damage from the incident (e.g., information and critical business processes negatively affected by the incident).
- Measuring the difference between the initial impact assessment and the final impact assessment.
- Identifying which measures, if any, could have prevented the incident.
- Subjective assessment of each incident requires that incident response team members assess their own performance, as well as that of other team members and of the entire team. Another valuable source of input is the owner of a resource that was attacked, in order to determine if the owner thinks the incident was handled efficiently and if the outcome was satisfactory.
There should be a policy in place in each organization that outlines how long evidence of an incident is retained. Evidence is often retained for many months or many years after an incident has taken place. In some cases, compliance regulations may mandate the retention period. These are some of the determining factors for evidence retention:
- Prosecution – When an attacker will be prosecuted because of a security incident, the evidence should be retained until after all legal actions have been completed. This may be several months or many years. In legal actions, no evidence should be overlooked or considered insignificant. An organization’s policy may state that any evidence surrounding an incident that has been involved with legal actions must never be deleted or destroyed.
- Data Type – An organization may specify that specific types of data should be kept for a specific period of time. Items such as email or text may only need to be kept for 90 days. More important data such as that used in an incident response (that has not had legal action), may need to be kept for three years or more.
- Cost – If there is a lot of hardware and storage media that needs to be stored for a long time, it can become costly. Remember also that as technology changes, functional devices that can use outdated hardware and storage media must be stored as well.
Reporting Requirements and Information Sharing
Governmental regulations should be consulted by the legal team to determine precisely the organization’s responsibility for reporting the incident. In addition, management will need to determine what additional communication is necessary with other stakeholders, such as customers, vendors, partners, etc.
Beyond the legal requirements and stakeholder considerations, NIST recommends that an organization coordinate with organizations to share details for the incident. For example, the organization could log the incident in the VERIS community database.
The critical recommendations from NIST for sharing information are as follows:
- Plan incident coordination with external parties before incidents occur.
- Consult with the legal department before initiating any coordination efforts.
- Perform incident information sharing throughout the incident response life cycle.
- Attempt to automate as much of the information sharing process as possible.
- Balance the benefits of information sharing with the drawbacks of sharing sensitive information.
Share as much of the appropriate incident information as possible with other organizations.