By admin • January 18, 2025
Recently, one of our Sitecore Managed Cloud clients reported slow performance during a specific period. Our team investigated to identify the root cause; here is a summary of what we found and how we addressed it.
Observing High CPU Usage
The first indicator was a spike in CPU usage on the hosting infrastructure, which suggested either unusually high traffic or resource-intensive operations.
Identifying Traffic Patterns Using Azure Front Door Logs
To pinpoint the source of the traffic, we turned to Azure Front Door Logs. By querying the logs, we could analyze incoming requests, identify patterns, and trace them back to the IP addresses generating the most traffic. Below are the key queries we used (sanitized to remove sensitive information):
Top Requests by IP
We started by identifying the IPs responsible for the highest number of requests:
AzureDiagnostics
| where ResourceType == "FRONTDOORS"
| summarize AggregatedValue = count() by clientIp_s
| top 100 by AggregatedValue
This allowed us to see which IPs were contributing the most traffic to the system during the period of slowness.
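The query above relies on the time range chosen in the Log Analytics portal. As a minimal sketch, the same ranking can be pinned to an explicit window and limited to access-log rows, so that other Front Door log categories written to the same table are not counted as requests. The `startTime` and `endTime` values are placeholders, and the `Category` value is an assumption that should be checked against the rows in your own workspace.

```kusto
// Placeholder window: replace with the period in which slowness was reported.
let startTime = datetime(2025-01-01 00:00:00);
let endTime = datetime(2025-01-01 06:00:00);
AzureDiagnostics
| where TimeGenerated between (startTime .. endTime)
| where ResourceType == "FRONTDOORS"
// Assumption: access-log rows use this Category; =~ compares case-insensitively.
| where Category =~ "FrontDoorAccessLog"
| summarize AggregatedValue = count() by clientIp_s
| top 100 by AggregatedValue
```

Putting the time filter first keeps the scan small, which matters on busy workspaces.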
Drilling Down to Specific IP Behavior
After isolating the IPs generating significant traffic, we analyzed their request patterns:
Query: Requests from a Specific IP
let specificIp = "<IP_ADDRESS>"; // Replace <IP_ADDRESS> with the specific IP
AzureDiagnostics
| where ResourceType == "FRONTDOORS"
| where clientIp_s == specificIp
| project TimeGenerated, requestUri_s, httpMethod_s, userAgent_s
| order by TimeGenerated desc
This query showed which URLs were being accessed, the HTTP method used, and the user agent for each request.
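A raw list of requests can be hard to read when an IP sends thousands of them. One possible complement, sketched here rather than taken from the original investigation, is to bucket the same IP's requests into fixed intervals and chart them, which makes bursts easy to line up against the CPU spike. The 5-minute bin size is an arbitrary choice.

```kusto
let specificIp = "<IP_ADDRESS>"; // Replace <IP_ADDRESS> with the specific IP
AzureDiagnostics
| where ResourceType == "FRONTDOORS"
| where clientIp_s == specificIp
// Count requests per 5-minute bucket (bin size is an assumption; adjust as needed).
| summarize Requests = count() by bin(TimeGenerated, 5m)
| order by TimeGenerated asc
| render timechart
```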
Grouping Requests for Better Insights
To understand the behavior of the specific IP further, we grouped the requests by URI and hostname:
Query: Grouping Requests by URI
let specificIp = "<IP_ADDRESS>"; // Replace <IP_ADDRESS> with the specific IP
AzureDiagnostics
| where ResourceType == "FRONTDOORS"
| where clientIp_s == specificIp
| summarize Count = count() by requestUri_s
| order by Count desc
This revealed which URLs were being hit most frequently by the IP, helping us identify potential abuse or misconfiguration.
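If the same page is requested with many different query strings, grouping by the full URI spreads those hits across many rows. A hedged variant is to group by path only, using `parse_url()`. This assumes `requestUri_s` holds absolute URLs (scheme and host included), which the hostname extraction later in this post also relies on.

```kusto
let specificIp = "<IP_ADDRESS>"; // Replace <IP_ADDRESS> with the specific IP
AzureDiagnostics
| where ResourceType == "FRONTDOORS"
| where clientIp_s == specificIp
// parse_url() returns a dynamic object; Path excludes the query string.
| extend Path = tostring(parse_url(requestUri_s).Path)
| summarize Count = count() by Path
| order by Count desc
```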
Query: Grouping Requests by Hostname
let specificIp = "<IP_ADDRESS>"; // Replace <IP_ADDRESS> with the specific IP
AzureDiagnostics
| where ResourceType == "FRONTDOORS"
| where clientIp_s == specificIp
| extend Hostname = tostring(extract(@"https?://([^/]+)", 1, requestUri_s))
| summarize Count = count() by Hostname
| order by Count desc
This query extracted the hostname from the request URIs, allowing us to see traffic patterns by domain.
Total Requests per Domain
To gain a broader understanding of the traffic across all IPs, we grouped the total requests by hostname:
AzureDiagnostics
| where ResourceType == "FRONTDOORS"
| extend Hostname = tostring(extract(@"https?://([^/]+)", 1, requestUri_s))
| summarize Count = count() by Hostname
| order by Count desc
This gave us a high-level view of how traffic was distributed across different domains, helping identify any anomalies.
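To tell broad, organic traffic apart from a few heavy clients, the per-domain totals can be paired with the number of distinct client IPs behind them. The following is a sketch along those lines, not one of the queries used in the investigation. `dcount()` returns an estimate, which is normally accurate enough for this kind of comparison.

```kusto
AzureDiagnostics
| where ResourceType == "FRONTDOORS"
| extend Hostname = tostring(extract(@"https?://([^/]+)", 1, requestUri_s))
// A high request count with few distinct IPs points to concentrated traffic.
| summarize Requests = count(), DistinctIps = dcount(clientIp_s) by Hostname
| order by Requests desc
```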
Outcome and Next Steps
Based on the investigation, we identified specific IPs contributing disproportionately to traffic, potentially causing the slowness. We recommended the following actions:
- Implementing rate limiting on Azure Front Door to mitigate excessive requests from specific IPs.
- Adding WAF (Web Application Firewall) rules to block or throttle suspicious activity.
- Monitoring traffic patterns continuously using Azure Monitor and Log Analytics.
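For the continuous monitoring recommendation, a query like the one below could serve as the basis for a log alert in Azure Monitor. It is a minimal sketch: `requestThreshold`, the one-hour lookback, and the 5-minute window are hypothetical values that would need tuning against the site's normal traffic.

```kusto
// Hypothetical threshold: requests per IP per 5 minutes considered excessive.
let requestThreshold = 1000;
AzureDiagnostics
| where TimeGenerated > ago(1h)
| where ResourceType == "FRONTDOORS"
| summarize Requests = count() by clientIp_s, bin(TimeGenerated, 5m)
// Keep only the IP/window pairs that exceed the threshold.
| where Requests > requestThreshold
| order by Requests desc
```

Any rows returned identify an IP and time window worth reviewing before deciding on a block or rate-limit rule.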
Using Azure’s diagnostics logs and a few targeted queries, we quickly traced the likely cause of the slowness to specific high-traffic IPs and proposed concrete mitigations to keep the client’s Sitecore platform performant and reliable.