A Primer on Cloud Logging for Incident Response
Overview
This post provides an overview of common log sources in Azure and AWS, along with the associated storage and analysis options.
At a high level, cloud-based incidents can be categorized into host-based compromises (that is, compromises primarily involving virtual machines hosted in the cloud) and identity-based or resource-based compromises (compromises primarily involving cloud-native services and identities). These scenarios often overlap depending on the scope of an incident, but the investigative approaches are distinct. For example, a compromise of a public-facing virtual machine for cryptocurrency mining purposes warrants disk acquisition and analysis, but the scope of the compromise may not extend into other resources or identities in the tenant. On the other hand, investigating an identity compromise relies heavily on tenant logs and may not involve any virtual machine artifacts if no hosts were targeted by the threat actor.
The steps below represent a high-level approach to investigating cloud-based incidents originating from a compromised identity. This process can be reversed if the investigation originates from a compromised system.
- Review log sources containing events that affect identity and access.
  - In Azure, these events would include the Azure Active Directory (AD) Audit logs and Sign-in logs.
  - In AWS, these events would be included in CloudTrail.
- Review resource logs, such as logs from virtual machine creation or storage account access. These logs indicate whether a resource has been created or destroyed, or whether data has been written to or read from a storage bucket. There could be many other resource logs to review if they were configured ahead of time.
- Review network logs to investigate network communications within the virtual network(s). These might come from the cloud provider solution or third-party firewalls, depending on the environment.
- If needed, perform traditional host-based forensic analysis.
Overview of Logs
Azure
In Azure, logs are organized into five categories:
- Tenant logs: These contain information about operations conducted by tenant-wide services, most notably the Azure AD log, which contains audit logs, sign-in logs, and provisioning logs.
- Subscription logs: Available under the 'Activity log' service, these contain information about resources being created, modified, or deleted.
- Resource logs: These can be generated by any resource, such as Network Security Groups (NSG) and Storage Accounts. Resource logs are disabled by default.
- Operating System logs: Generated by and collected from virtual machines or containers, these logs need to be configured manually.
- Application logs: These include custom logs enabled by the developer. As such, these need to be configured manually.
Azure provides four ways to collect or examine tenant, subscription, and resource logs:
- Viewed directly in the Azure Portal
- Stored and viewed in a Log Analytics Workspace, which requires manual configuration
- Stored in a Storage Account and viewed using a tool such as Storage Explorer
- Streamed to a SIEM via an Event Hub
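Once sign-in logs have been exported by one of these routes, a first triage pass can be scripted. The following is a minimal sketch, not a definitive implementation: it assumes the events have already been loaded as a list of dictionaries whose field names (`userPrincipalName`, `status.errorCode`, `location.countryOrRegion`) follow the Microsoft Graph sign-in schema, and it assumes that an `errorCode` of 0 means success. Logs exported through other routes may nest these fields differently, and the sample data and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample events; field names are modeled on the Microsoft Graph
# sign-in schema and are an assumption about how the export is shaped.
SAMPLE_SIGN_INS = [
    {"userPrincipalName": "alice@example.com", "ipAddress": "198.51.100.7",
     "createdDateTime": "2023-01-10T08:15:00Z",
     "status": {"errorCode": 0}, "location": {"countryOrRegion": "US"}},
    {"userPrincipalName": "alice@example.com", "ipAddress": "203.0.113.50",
     "createdDateTime": "2023-01-10T09:02:00Z",
     "status": {"errorCode": 0}, "location": {"countryOrRegion": "RO"}},
    {"userPrincipalName": "bob@example.com", "ipAddress": "198.51.100.9",
     "createdDateTime": "2023-01-10T09:30:00Z",
     "status": {"errorCode": 50126}, "location": {"countryOrRegion": "BR"}},
    {"userPrincipalName": "bob@example.com", "ipAddress": "198.51.100.9",
     "createdDateTime": "2023-01-10T09:45:00Z",
     "status": {"errorCode": 0}, "location": {"countryOrRegion": "US"}},
]


def countries_by_user(sign_ins):
    """Map each user to the set of countries with a successful sign-in."""
    seen = defaultdict(set)
    for event in sign_ins:
        status = event.get("status") or {}
        # Assumption: a non-zero errorCode is treated as a failed sign-in.
        if status.get("errorCode") != 0:
            continue
        user = event.get("userPrincipalName")
        country = (event.get("location") or {}).get("countryOrRegion")
        if user and country:
            seen[user].add(country)
    return dict(seen)


def multi_country_users(sign_ins):
    """Return users who signed in successfully from more than one country."""
    by_user = countries_by_user(sign_ins)
    return sorted(user for user, countries in by_user.items() if len(countries) > 1)


if __name__ == "__main__":
    for user in multi_country_users(SAMPLE_SIGN_INS):
        print("review:", user)
```

A user appearing in this list is not necessarily compromised (VPNs and travel are common causes), so the output is best treated as a starting point for review.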
Amazon Web Services (AWS)
CloudTrail
CloudTrail is AWS’ go-to audit log—it records API calls made in the account, and since almost every action in AWS involves an Application Programming Interface (API) call (including web console actions), these logs are extremely valuable. These are also enabled by default.
Each account includes one default log, the CloudTrail event history, which is free and retains the last 90 days of management events. Additional logs, known as 'trails', can be set up and sent to an S3 bucket or another platform. The trails themselves are a free feature, but the account is billed for storage.
Note that CloudTrail events are recorded in Coordinated Universal Time (UTC) and typically committed to CloudTrail within 15 minutes of the API call.[1]
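To make this concrete, here is a small sketch of how a CloudTrail log file delivered to S3 might be filtered for identity-related activity. It assumes the file has already been downloaded and decompressed into a JSON string with a top-level `Records` array. The watch list of event names is only an illustrative starting point, and the function name and sample data are hypothetical.

```python
import json

# Illustrative watch list of identity-related API calls; adjust to the case.
IDENTITY_EVENTS = {
    "ConsoleLogin",
    "CreateUser",
    "CreateAccessKey",
    "AttachUserPolicy",
    "AssumeRole",
}

# Hypothetical sample mimicking the shape of a CloudTrail log file.
SAMPLE_LOG = json.dumps({"Records": [
    {"eventTime": "2023-01-10T09:05:00Z", "eventSource": "iam.amazonaws.com",
     "eventName": "CreateAccessKey", "sourceIPAddress": "203.0.113.50",
     "userIdentity": {"type": "IAMUser",
                      "arn": "arn:aws:iam::123456789012:user/alice"}},
    {"eventTime": "2023-01-10T09:01:00Z", "eventSource": "signin.amazonaws.com",
     "eventName": "ConsoleLogin", "sourceIPAddress": "203.0.113.50",
     "userIdentity": {"type": "IAMUser",
                      "arn": "arn:aws:iam::123456789012:user/alice"}},
    {"eventTime": "2023-01-10T09:03:00Z", "eventSource": "ec2.amazonaws.com",
     "eventName": "DescribeInstances", "sourceIPAddress": "198.51.100.7",
     "userIdentity": {"type": "IAMUser",
                      "arn": "arn:aws:iam::123456789012:user/bob"}},
]})


def identity_events(raw_log, watched=IDENTITY_EVENTS):
    """Return watched events from one log file, oldest first."""
    records = json.loads(raw_log).get("Records", [])
    hits = [r for r in records if r.get("eventName") in watched]
    # eventTime is an ISO 8601 UTC string, so string order is time order.
    return sorted(hits, key=lambda r: r.get("eventTime", ""))


if __name__ == "__main__":
    for record in identity_events(SAMPLE_LOG):
        identity = record.get("userIdentity", {})
        print(record["eventTime"], record["eventName"],
              record.get("sourceIPAddress"), identity.get("arn"))
```

Because the timestamps are in UTC, they can be lined up directly against other UTC sources when building a timeline.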
Flow Logs
AWS supports network capture in the form of packet captures (PCAPs) and flow logs. Flow logs can be enabled at the host level (via the network interface), the subnet level, or the Virtual Private Cloud (VPC) level. These logs are collected Out-of-Band (OOB), which means they place no additional load on the network.
Flow logs can be shipped to an S3 bucket, with traffic aggregated into records over either 1-minute or 10-minute intervals. Note that even if your account does not have flow logs enabled at the start of an incident, you should consider enabling them to capture potentially ongoing malicious traffic!
Besides S3, flow logs can be sent to CloudWatch (for additional cost).
Remember that if a load balancer is in use, flow logs will not record the public source IP of an attacker; the load balancer's own logs need to be collected as well and correlated with the flow logs to recover the actual source IP.
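As an illustration of what a flow log record contains, the sketch below parses records in the default (version 2) space-separated format and lists rejected connections. It assumes the default field order; flow logs configured with a custom format would need a different field list. The helper names are hypothetical, and the sample addresses are illustrative.

```python
from datetime import datetime, timezone

# Field order of the default (version 2) flow log format.
FIELDS = (
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
)
INT_FIELDS = ("srcport", "dstport", "protocol", "packets", "bytes", "start", "end")

SAMPLE_LINES = [
    "2 123456789010 eni-1235b8ca123456789 172.31.16.139 172.31.16.21 "
    "20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK",
    "2 123456789010 eni-1235b8ca123456789 172.31.9.69 172.31.9.12 "
    "49761 3389 6 20 4249 1418530010 1418530070 REJECT OK",
]


def parse_flow_record(line):
    """Split one default-format record into a dict with typed fields."""
    parts = line.split()
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    record = dict(zip(FIELDS, parts))
    for name in INT_FIELDS:
        # Records without data carry "-" placeholders instead of numbers.
        record[name] = None if record[name] == "-" else int(record[name])
    return record


def rejected_connections(lines):
    """Return (start time in UTC, source, destination, port) for REJECT records."""
    results = []
    for line in lines:
        record = parse_flow_record(line)
        if record["action"] == "REJECT":
            started = datetime.fromtimestamp(record["start"], tz=timezone.utc)
            results.append((started.isoformat(), record["srcaddr"],
                            record["dstaddr"], record["dstport"]))
    return results


if __name__ == "__main__":
    for row in rejected_connections(SAMPLE_LINES):
        print(row)
```

Converting the `start` field to UTC keeps the output comparable with CloudTrail timestamps.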
DNS Logs
Route 53 is AWS' DNS solution, which has two log sets of interest:
- DNS zone query logs record queries made against your AWS-hosted domain names.
- Resolver query logs record all DNS queries made within a VPC.
If Route 53 is in use in your account, resolver query logs are of particular interest for identifying malicious outbound traffic, such as initial Command and Control (C2) queries.
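One simple way to start looking for suspicious lookups is to surface domain names that were queried only rarely. The sketch below assumes resolver query logs have been exported as one JSON object per line with a `query_name` field. That field name reflects the resolver query log format as we understand it and should be verified against your own logs. The function names and sample data are hypothetical.

```python
import json
from collections import Counter

# Hypothetical sample; only the fields used below are included.
SAMPLE_LINES = [
    '{"query_timestamp": "2023-01-10T09:00:00Z", "query_name": "example.com.", "srcaddr": "10.0.0.5"}',
    '{"query_timestamp": "2023-01-10T09:00:05Z", "query_name": "Example.com.", "srcaddr": "10.0.0.6"}',
    '{"query_timestamp": "2023-01-10T09:00:09Z", "query_name": "example.com.", "srcaddr": "10.0.0.5"}',
    '{"query_timestamp": "2023-01-10T09:01:00Z", "query_name": "x9f3.bad-domain.example.", "srcaddr": "10.0.0.5"}',
]


def query_counts(log_lines):
    """Count queries per domain name, ignoring case and the trailing dot."""
    counts = Counter()
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        name = record.get("query_name", "").rstrip(".").lower()
        if name:
            counts[name] += 1
    return counts


def rare_queries(log_lines, max_count=1):
    """Return names queried at most max_count times, sorted alphabetically."""
    counts = query_counts(log_lines)
    return sorted(name for name, count in counts.items() if count <= max_count)


if __name__ == "__main__":
    for name in rare_queries(SAMPLE_LINES):
        print("rarely queried:", name)
```

Rarity alone does not make a domain malicious, so the threshold is a tuning knob rather than a detection rule.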
S3
Recall that CloudTrail records API calls, including those related to S3 interaction. However, by default CloudTrail does not record object-level activity within an S3 bucket, such as objects being read through a web browser; rather, such activity is recorded in the S3 Server Access log. This feature is unfortunately disabled by default. Be sure to enable this log manually to record web browser access to your S3 buckets!
Native Analysis Solutions
Finally, AWS offers several log analysis solutions, in addition to exporting to your own SIEM:
- Athena + Glue
  - If you have large numbers of logs in S3 buckets, you can search all of them in place with Athena + Glue.
  - Glue crawls the bucket and adds the logs into a table for searching in Athena.
  - Athena allows you to perform SQL-like queries across all of the logs that Glue crawled; no extra processing necessary.
- GuardDuty
  - This is AWS' pay-per-usage log analyzer; it evaluates CloudTrail events as well as VPC flow logs and DNS logs. It is helpful as a first-pass triage of logs. Amazon describes it best:
    “Amazon GuardDuty identifies unusual activity within your accounts, analyzes the security relevance of the activity, and gives the context in which it was invoked. This allows a responder to determine if they should spend time on further investigation. GuardDuty findings are assigned a severity, and actions can be automated by integrating with AWS Security Hub, Amazon EventBridge, AWS Lambda, and AWS Step Functions. Amazon Detective is also tightly integrated with GuardDuty, so you can perform deeper forensic and root cause investigation.”[2]
- Amazon Detective
  - Whereas GuardDuty is labeled as Amazon’s "intelligent threat detection" service, Detective is Amazon's "security investigation" service. Detective allows you to correlate events from multiple data sources for a unified view of security alerts. GuardDuty must be enabled for at least 48 hours before Detective can be enabled.
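Returning to the Athena option listed above, the sketch below shows roughly what a query over CloudTrail data could look like. It only builds the SQL text; submitting it to Athena is left out. The column names (`eventtime`, `eventname`, `sourceipaddress`, `useridentity.arn`) are assumed to match the table in your catalog, and a table produced by a Glue crawler may name or nest them differently. The table name `cloudtrail_logs` and the helper function are hypothetical.

```python
import re


def build_cloudtrail_query(table, event_names, start_utc):
    """Build a SQL string selecting the given API calls since start_utc."""
    # Basic input validation, since values are interpolated into SQL text.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", table):
        raise ValueError(f"unexpected table name: {table!r}")
    if not event_names:
        raise ValueError("at least one event name is required")
    for name in event_names:
        if not re.fullmatch(r"[A-Za-z0-9]+", name):
            raise ValueError(f"unexpected event name: {name!r}")
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z", start_utc):
        raise ValueError("start_utc must look like 2023-01-10T00:00:00Z")

    quoted = ", ".join(f"'{name}'" for name in event_names)
    # Assumption: eventtime is stored as an ISO 8601 UTC string, so a plain
    # string comparison is enough to filter by time.
    return (
        "SELECT eventtime, eventname, sourceipaddress, useridentity.arn "
        f"FROM {table} "
        f"WHERE eventname IN ({quoted}) "
        f"AND eventtime >= '{start_utc}' "
        "ORDER BY eventtime"
    )


if __name__ == "__main__":
    print(build_cloudtrail_query(
        "cloudtrail_logs",  # hypothetical table name
        ["ConsoleLogin", "CreateAccessKey"],
        "2023-01-10T00:00:00Z",
    ))
```

Restricting the time range in the query also helps limit how much data is scanned.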
Conclusion
In this post, we gave a brief overview of common log sources and analysis options in Azure and AWS. Our hope is that you can use the information in this primer to review your own cloud accounts, to gain familiarity with the available logging, and perhaps to enhance your configuration.