Update: August 21, 2024
It has been just over a month since the CrowdStrike update that caused Windows blue screen errors. Our support team has been working with customers throughout that time, and we want to share an update with all of our customers.
According to CrowdStrike’s Root Cause Analysis (RCA), the issue was caused by a mismatch between the 20 input fields the Falcon sensor expected and the 21 fields included in a Rapid Response Content update. This mismatch led to an out-of-bounds memory read and a subsequent system crash.
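To make the failure mode concrete, here is a minimal Python sketch of a field-count mismatch and the two kinds of safeguards that address it: a validation check on the update and a bounds check at read time. It is an illustration only, not CrowdStrike's code. The names `validate_field_count` and `read_input` are hypothetical, and Python raises an error on a bad index where kernel-level code would read invalid memory.

```python
EXPECTED_FIELDS = 20  # field count the sensor expected, per the RCA summary above


def validate_field_count(update_fields):
    """Validator-style check: accept an update only if its field count matches."""
    return len(update_fields) == EXPECTED_FIELDS


def read_input(sensor_inputs, index):
    """Bounds-checked read: return None instead of reading past the end."""
    if 0 <= index < len(sensor_inputs):
        return sensor_inputs[index]
    return None


# Hypothetical data: 20 values available, but the update defines 21 fields.
sensor_inputs = [f"value_{i}" for i in range(20)]
update_fields = [f"field_{i}" for i in range(21)]

print(validate_field_count(update_fields))  # False: the update would be rejected
print(read_input(sensor_inputs, 20))        # None: the 21st read is refused
print(read_input(sensor_inputs, 19))        # value_19: the last valid field
```

Either check alone would stop the bad read in this sketch. Applying both means a mistake that slips past one layer is still caught by the other.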
Below are the improvements made by CrowdStrike, the steps we’ve taken at ActZero, and recommendations for our customers.
CrowdStrike Improvements
- Bug Fixes: Measures were put in place to prevent problematic Channel File 291 updates from being created. New validation checks ensure the correct number of input fields, preventing this issue from recurring.
- Enhanced Sensor Error Checking: Additional safeguards, known as “bounds checking,” were added to the Content Interpreter for Channel File 291. These safeguards prevent the system from reading data outside of safe limits, reducing the risk of crashes.
- Improved QA Tools: Additional checks were implemented in the Content Validator.
- Staged Rollouts: New deployment layers were implemented to ensure updates pass multiple checks before full rollout.
- Manual Testing Updates: Testing procedures for Content Configuration Systems have been improved, including upgrades to template testing and automation.
- Update Scheduling: Customers now have more control over when Rapid Response Content updates are applied.
- Third-Party QA Reviews: Two independent security vendors have been engaged to review the Falcon sensor code and processes.
ActZero Improvements
- “General Availability” Scheduling for Updates: Following the outage, CrowdStrike introduced new scheduling options for two types of updates: Rapid Response Content (detection and prevention logic) and Sensor Operations (kernel updates and settings). For each, customers can choose Early Access, General Availability (phased rollout), or Pause Updates. Based on CrowdStrike’s recommendation, we have set all our tenants to General Availability for both channels.
- Enhanced Notifications: ActZero is improving our notification process to include critical details, such as:
  - Affected systems that need remediation
  - Specific instructions for systems that are encrypted or hosted in Azure
- Ensuring Accurate Contact Information: Customers should confirm contact information configured in their portal to ensure notifications are being received by all required staff. This helps avoid missed alerts and ensures timely communication during critical updates or incidents.
Recommended Action Steps for Customers
- Update Contact Info: Make sure all relevant parties are added to your communication channel in the ActZero defense portal under the Onboarding section. Additionally, whitelist our IP address so that our emails get through.
- Tabletop Test: Download our Incident Response Guide and simulate blue screen scenarios.
- Provide Feedback: Schedule time with our Customer Experience team to share your experience and insights.
We’re committed to helping you stay protected and informed. Please reach out if you have any questions or need further assistance.
Original Update: July 22, 2024
A recent update from CrowdStrike caused blue screen errors on Windows systems. CrowdStrike has provided a workaround to address this issue. Please follow the steps below if you are experiencing system crashes.
Reboot the host normally. If the host crashes again:
- Boot Windows into Safe Mode or the Windows Recovery Environment.
- Navigate to the C:\Windows\System32\drivers\CrowdStrike directory.
- Locate the file matching “C-00000291*.sys” and delete it.
  - Channel file "C-00000291*.sys" with a timestamp of 0527 UTC or later is the reverted (good) version.
  - Channel file "C-00000291*.sys" with a timestamp of 0409 UTC is the problematic version.
- Boot the host normally.
- Note: BitLocker-encrypted hosts may require a recovery key.
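As a rough aid for the timestamp check in the steps above, the following Python sketch classifies channel files by their modification time. It only reads metadata and never deletes anything. Several things here are assumptions rather than part of CrowdStrike's guidance: the incident date of July 19, 2024 (the steps give only the times), the use of the file's last-modified time as "the timestamp", and the helper names `classify` and `report`.

```python
from datetime import datetime, timezone
from pathlib import Path

# Assumption: the steps above list only times of day; the date is supplied here.
INCIDENT_DAY = (2024, 7, 19)
PROBLEMATIC = datetime(*INCIDENT_DAY, 4, 9, tzinfo=timezone.utc)   # 0409 UTC
REVERTED = datetime(*INCIDENT_DAY, 5, 27, tzinfo=timezone.utc)     # 0527 UTC
CHANNEL_DIR = Path(r"C:\Windows\System32\drivers\CrowdStrike")


def classify(mtime_utc):
    """Label a timezone-aware UTC timestamp using the two times listed above."""
    stamp = mtime_utc.replace(second=0, microsecond=0)  # compare at minute precision
    if stamp >= REVERTED:
        return "reverted (good)"
    if stamp == PROBLEMATIC:
        return "problematic"
    return "unknown"  # anything else should be checked manually


def report(directory=CHANNEL_DIR):
    """Map each matching channel file name to its label (read-only)."""
    results = {}
    for path in directory.glob("C-00000291*.sys"):
        mtime = datetime.fromtimestamp(path.stat().st_mtime, tz=timezone.utc)
        results[path.name] = classify(mtime)
    return results


if __name__ == "__main__":
    for name, label in report().items():
        print(f"{name}: {label}")
```

The sketch reports rather than removes files because removal should only be done from Safe Mode or the Windows Recovery Environment. A timestamp that matches neither listed value is labeled "unknown" so that nobody acts on a guess.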
For more detailed instructions for virtual (VDI), AWS, or Azure-based environments, click here.
Note:
- Windows hosts which have not been impacted do not require any action as the problematic channel file has been reverted.
- Hosts running Windows 7/2008 R2 are not impacted.
- This issue is not impacting Mac- or Linux-based hosts.
- Please do not attempt to remove the files without being in Safe Mode or the Windows Recovery Environment. You will not be able to remove the file, and this will generate an alert and ticket.
Our Recommendation
We strongly advise you to follow the steps provided by CrowdStrike if you encounter system crashes. Please do not remove CrowdStrike software, as doing so will leave your systems vulnerable and hinder our ability to monitor and detect any system activity. Attackers may exploit these issues, so it is crucial to maintain your protection.
If you have any questions or need further assistance, please do not hesitate to contact our support team.
How Did this Happen?
The CrowdStrike outage was caused by a flawed security update that led to widespread disruptions in various sectors globally. Specifically, a defect in a single CrowdStrike content update for Windows triggered significant issues with Windows applications.
Issue:
- Kernel-level integration: CrowdStrike's Falcon sensor operates at the kernel level of Windows, which gives it deep access to system processes and resources. When a faulty update is deployed, it can cause conflicts with core Windows components.
- Driver issues: The update in question involved a problematic driver file. Specifically, a file named "C-00000291*.sys" in the CrowdStrike directory was causing Windows to crash. Drivers interact directly with hardware and the operating system, so a malfunctioning driver can lead to system-wide instability.
- Boot process interference: The faulty update caused many systems to enter a boot loop or display the Blue Screen of Death (BSOD). The update was interfering with critical startup processes in Windows.
In this specific incident, the problematic update caused Windows hosts to experience a blue screen error related to the Falcon sensor.
We are working closely with CrowdStrike to resolve this issue. In the meantime, if you require immediate assistance please call the ActZero 24x7 Breach number at 1-855-917-4981.