How Microsoft's Cloud Configuration Failure Exposed 38TB of Data

  • Published On: September 22, 2023 Updated On: September 26, 2023

In a recent incident that sent shockwaves through the cybersecurity community, Microsoft's AI research team inadvertently exposed a staggering 38 terabytes of sensitive data. This incident serves as a stark reminder of the critical importance of securing your cloud storage buckets. Let's delve into the incident and use it as a backdrop to discuss the essential security measures you need to implement in AWS, Google Cloud, and Azure.

The Microsoft Incident: A Data Exposure Shockwave

Wiz Research recently disclosed an incident involving Microsoft’s AI research team: the accidental exposure of 38 terabytes of sensitive data. The case raises important questions about data security, especially for teams that handle large volumes of AI training data in cloud storage.

The Configuration Mishap

The issue arose when Microsoft’s AI research team shared a bucket of open-source training data through GitHub. They used Azure’s Shared Access Signature (SAS) tokens to generate a URL for data sharing.

Normally, these tokens can limit access to specific files or folders, but in this instance, the team misconfigured the SAS token. Instead of sharing just the intended data, the URL granted access to the entire Azure storage account, revealing an extra 38 terabytes of confidential information.

The Stakes Were High

What makes this incident particularly concerning is the type of data exposed: disk backups of two Microsoft employees’ workstations, private keys and passwords, and more than 30,000 internal Microsoft Teams messages from 359 Microsoft employees.

Worse yet, the SAS token configuration allowed for “full control,” which means an attacker could not only read but also write, delete, or overwrite files.
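To make the risk concrete, here is a minimal Python sketch that inspects the query string of a SAS URL and flags the kinds of problems described above. It assumes the standard SAS query parameters (`sp` for permissions, `se` for expiry, `srt` for account-level resource types, `sr` for a single resource). The function name `audit_sas_url`, the 7-day threshold, and the example URL are hypothetical, and the token shown is not the one from the incident.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlparse, parse_qs

# Permission letters that allow modifying or destroying data.
RISKY_PERMISSIONS = {"w": "write", "d": "delete", "a": "add", "c": "create"}
MAX_LIFETIME = timedelta(days=7)  # assumed policy threshold, adjust as needed


def audit_sas_url(url: str, now: datetime) -> list:
    """Return a list of human-readable findings for a SAS URL."""
    params = {k: v[0] for k, v in parse_qs(urlparse(url).query).items()}
    findings = []

    # An account SAS carries `srt` (signed resource types); a service SAS
    # scoped to one blob or container carries `sr` instead.
    if "srt" in params:
        findings.append("account-level SAS: scope covers the whole storage account")

    granted = [name for letter, name in RISKY_PERMISSIONS.items()
               if letter in params.get("sp", "")]
    if granted:
        findings.append("grants more than read access: " + ", ".join(granted))

    expiry = params.get("se")
    if expiry is None:
        findings.append("no expiry (`se`) found in the URL")
    else:
        expires_at = datetime.fromisoformat(expiry.replace("Z", "+00:00"))
        if expires_at.tzinfo is None:  # date-only values are treated as UTC
            expires_at = expires_at.replace(tzinfo=timezone.utc)
        if expires_at - now > MAX_LIFETIME:
            findings.append(f"expires too far in the future: {expiry}")
    return findings


# Hypothetical URL with a placeholder signature.
example = ("https://exampleaccount.blob.core.windows.net/"
           "?sv=2021-08-06&ss=b&srt=sco&sp=rwdl&se=2040-01-01T00:00:00Z&sig=REDACTED")
for finding in audit_sas_url(example, now=datetime(2023, 9, 22, tzinfo=timezone.utc)):
    print(finding)
```

A check like this only reads what is visible in the URL. It cannot tell who has received the link, so it complements, rather than replaces, short expiry times and narrow scopes.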

A limited selection of confidential documents was discovered in the computer backups. (Source: Wiz Research)

What Could Have Gone Wrong?

While the team’s original aim was to share AI models for image recognition, the slip could have been far worse. The models are stored as CKPT files, a format produced by TensorFlow and serialized with Python’s pickle module, which can execute arbitrary code when a file is loaded. Because the misconfigured SAS token also granted write access, an attacker could have injected malicious code into the model files and compromised every user who trusted Microsoft’s GitHub repository and downloaded them.
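The pickle risk can be illustrated with a short, harmless Python sketch. It follows the "restricting globals" pattern described in the Python `pickle` documentation: an `Unpickler` subclass whose `find_class` resolves only an allow-list of names. The names `RestrictedUnpickler`, `Payload`, and `ALLOWED`, and the benign `eval("1 + 1")` payload, are illustrative assumptions and have nothing to do with the actual exposed models.

```python
import io
import pickle

# Only these (module, name) pairs may be resolved while unpickling.
ALLOWED = {("builtins", "set"), ("builtins", "frozenset")}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the pickle stream references a global object.
        if (module, name) in ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")


def restricted_loads(data: bytes):
    """Like pickle.loads, but refuses globals outside ALLOWED."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


class Payload:
    """Stand-in for a tampered file: asks the loader to call eval()."""
    def __reduce__(self):
        return (eval, ("1 + 1",))  # benign here; could be any callable


tampered = pickle.dumps(Payload())

# Plain data round-trips without touching find_class.
print(restricted_loads(pickle.dumps({"weights": [0.1, 0.2]})))

try:
    restricted_loads(tampered)
except pickle.UnpicklingError as err:
    print("rejected:", err)
```

An allow-list narrows what a pickle can reference, but it is a mitigation sketch rather than a guarantee. The safer default is to avoid unpickling files from locations that others can write to.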

On surface-level inspection, Microsoft’s Azure storage account would appear private. The SAS tokens provided a false sense of security, as they made the data look inaccessible while masking the very real exposure.

Cloud Storage Security Checklist

To ensure your cloud storage buckets are properly configured and avoid falling victim to a similar incident, follow this comprehensive checklist, which covers AWS (S3), Google Cloud (GCS), and Azure (Blob Storage):

Access Control and Permissions

     

Implement the principle of least privilege for IAM roles and permissions.

Use IAM roles and policies for access control.

Ensure MFA is enabled for sensitive operations.

Regularly review and audit IAM policies and access.
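One way to make least privilege reviewable is to scan policy documents for wildcards. The sketch below assumes AWS-style JSON policies (`Statement`, `Effect`, `Action`, `Resource`); the helper name `find_wildcards` and the sample policy are hypothetical.

```python
import json


def _as_list(value):
    """Policy fields may be a single string or a list of strings."""
    return [value] if isinstance(value, str) else list(value or [])


def find_wildcards(policy_json: str) -> list:
    """Return (statement index, field, value) for each overly broad Allow entry."""
    policy = json.loads(policy_json)
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement is also valid
        statements = [statements]

    findings = []
    for index, statement in enumerate(statements):
        if statement.get("Effect") != "Allow":
            continue
        for field in ("Action", "Resource"):
            for value in _as_list(statement.get(field)):
                # Flag "*" and service-wide patterns such as "s3:*".
                if value == "*" or value.endswith(":*"):
                    findings.append((index, field, value))
    return findings


sample_policy = json.dumps({
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "s3:*", "Resource": "*"},
        {"Effect": "Allow", "Action": ["s3:GetObject"],
         "Resource": "arn:aws:s3:::example-bucket/public/*"},
    ],
})

for finding in find_wildcards(sample_policy):
    print(finding)
```

Only the first statement is flagged here; the second is limited to one action on one prefix. A scan like this could run as part of the regular policy review mentioned above.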

Encryption

     

Enable server-side encryption for data at rest (e.g., SSE-S3 or SSE-KMS on AWS).

Enforce encryption for data in transit (SSL/TLS).

Manage encryption keys securely, separate from data.

Logging and Monitoring

     

Enable bucket-level logging and auditing.

Set up CloudWatch (AWS), Cloud Monitoring (GCP), or Azure Monitor for monitoring and alerts.

Configure CloudTrail (AWS) or Cloud Audit Logs (GCP) for auditing and tracking activities.

Access Policies

     

Use bucket policies (S3) or IAM policies (GCS) to control access.

Apply conditional policies based on user attributes (e.g., IP address, time of day).

Use signed URLs or tokens for temporary, controlled access when needed, scoped to specific objects with minimal permissions and a short expiry.
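A conditional policy can be sketched as plain data. The example below builds an S3-style bucket policy that denies requests not sent over TLS (using the `aws:SecureTransport` condition key) and requests from outside an allowed network (using `aws:SourceIp`). The bucket name `example-bucket`, the CIDR range, and the function name `build_bucket_policy` are placeholders.

```python
import json


def build_bucket_policy(bucket: str, allowed_cidr: str) -> dict:
    """Build a deny-by-condition bucket policy as a plain dictionary."""
    resources = [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": resources,
                # Matches any request that did not use HTTPS.
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            },
            {
                "Sid": "DenyOutsideAllowedNetwork",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": resources,
                # Matches any request whose source IP is outside the range.
                "Condition": {"NotIpAddress": {"aws:SourceIp": allowed_cidr}},
            },
        ],
    }


policy = build_bucket_policy("example-bucket", "203.0.113.0/24")
print(json.dumps(policy, indent=2))
```

A blanket IP-based deny also applies to administrators, so a policy like this is best tried on a non-critical bucket first.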

Public Access

Avoid making buckets or objects public unless necessary.

Use Access Control Lists (ACLs) to restrict public access when required.

Versioning and Backup

     

Enable versioning for the bucket or container.

Implement a backup strategy for critical data.

Regularly test data restoration processes.

Cross-Origin Resource Sharing (CORS)

     

Configure CORS settings to control which domains can access resources.

Implement proper CORS rules for web applications if necessary.
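As an illustration, the sketch below checks a CORS configuration for overly broad rules before it is applied. It assumes the S3-style rule layout (`CORSRules` with `AllowedOrigins`, `AllowedMethods`, `AllowedHeaders`, `MaxAgeSeconds`); other providers use different field names, and the function name `risky_cors_rules` and the sample origins are hypothetical.

```python
def risky_cors_rules(cors_config: dict) -> list:
    """Return indexes of rules that allow any origin to send write-style requests."""
    write_methods = {"PUT", "POST", "DELETE"}
    risky = []
    for index, rule in enumerate(cors_config.get("CORSRules", [])):
        any_origin = "*" in rule.get("AllowedOrigins", [])
        allows_writes = write_methods & set(rule.get("AllowedMethods", []))
        if any_origin and allows_writes:
            risky.append(index)
    return risky


cors_config = {
    "CORSRules": [
        {   # Narrow rule: one known web application, read-only.
            "AllowedOrigins": ["https://app.example.com"],
            "AllowedMethods": ["GET", "HEAD"],
            "AllowedHeaders": ["Authorization"],
            "MaxAgeSeconds": 3000,
        },
        {   # Broad rule: any website may upload or delete.
            "AllowedOrigins": ["*"],
            "AllowedMethods": ["GET", "PUT", "DELETE"],
            "AllowedHeaders": ["*"],
            "MaxAgeSeconds": 3000,
        },
    ]
}

print(risky_cors_rules(cors_config))
```

CORS only governs which browser origins may call the storage endpoint from a web page. It does not authenticate anyone, so it works alongside the access policies above rather than in place of them.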

Network Access Controls

     

Use private network paths, such as Virtual Private Cloud (VPC) endpoints or private endpoints, for network-level access control (if applicable).

Configure firewalls, security groups, and network policies to restrict incoming and outgoing traffic.

Documentation and Training

     

Maintain clear documentation of bucket configurations and access policies.

Ensure the cloud engineering team is trained in best practices for bucket security.

Third-Party Solutions

     

Consider using third-party security solutions or services for additional protection.

Compliance Standards

     

Ensure bucket configurations align with relevant compliance standards (e.g., GDPR, HIPAA).

By adhering to these best practices and regularly reviewing and updating your cloud storage configurations, you can significantly enhance the security of your data in AWS, Google Cloud, and Azure.

Conclusion

The Microsoft incident shows how a single misconfigured sharing link can expose far more than intended. Data breaches can have far-reaching consequences, and prevention is always better than remediation. By following the checklist and best practices outlined here, you can reduce the risk of inadvertent data exposure and keep your sensitive information safe.

Remember, cybersecurity is an ongoing process. Stay vigilant, stay informed, and continue to adapt your security measures to the evolving threat landscape.