Troubleshooting TCM Monitor and Snapshot Job Errors with errorDetails
When working with Tenant Configuration Management, or TCM, there are two asynchronous experiences where administrators typically need deeper troubleshooting information: monitor runs and snapshot jobs. A monitor run can come back as failed or partiallySuccessful. A snapshot job can do the same. The default Microsoft Graph response is useful for knowing that something went wrong, but it usually does not give you enough information to know what to fix.
The missing piece is the errorDetails property. It is not returned by default. You must ask Microsoft Graph for it explicitly by using the $select query parameter.
The short version
If you already have the identifier of the failing object, call the object directly and select errorDetails:
GET https://graph.microsoft.com/v1.0/admin/configurationManagement/configurationMonitoringResults/{configurationMonitoringResultId}?$select=id,monitorId,runStatus,errorDetails
GET https://graph.microsoft.com/v1.0/admin/configurationManagement/configurationSnapshotJobs/{configurationSnapshotJobId}?$select=id,displayName,status,errorDetails
The Microsoft Graph documentation uses the camel-case property name errorDetails. If you have seen examples written as ?$select=errordetails, I recommend keeping the documented casing in your automation to avoid surprises.
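If you build these URLs in code rather than by hand, a small helper can make sure errorDetails is never forgotten. The following is a minimal Python sketch, not an official client; the function name error_details_url and the GRAPH_BASE constant are my own illustrative names, and only the endpoint paths come from the requests above.

```python
from urllib.parse import quote

# Base path copied from the requests above.
GRAPH_BASE = "https://graph.microsoft.com/v1.0/admin/configurationManagement"


def error_details_url(collection: str, object_id: str, fields: list[str]) -> str:
    """Build a GET URL for one object that always selects errorDetails."""
    selected = list(fields)
    # errorDetails is not returned by default, so add it when the caller forgot.
    if "errorDetails" not in selected:
        selected.append("errorDetails")
    # Percent-encode the identifier so reserved characters cannot alter the path.
    safe_id = quote(object_id, safe="")
    return f"{GRAPH_BASE}/{collection}/{safe_id}?$select={','.join(selected)}"


print(error_details_url(
    "configurationMonitoringResults",
    "66fa1689-22cb-49c1-8b5a-c94822b7b13b",
    ["id", "monitorId", "runStatus"],
))
```

The helper keeps the documented camel-case spelling in one place, so every caller gets the same property name.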
Why the default response is not enough
Both monitor results and snapshot jobs expose a status field (runStatus for monitor results, status for snapshot jobs). That field tells you the outcome of the operation, but not necessarily the reason behind the outcome. For example, the following response tells us that the monitor run failed:
{
"id": "66fa1689-22cb-49c1-8b5a-c94822b7b13b",
"monitorId": "<monitorId>",
"tenantId": "<tenantId>",
"runInitiationDateTime": "2026-06-22T12:00:36.1084955Z",
"runCompletionDateTime": "2026-06-22T12:01:11.1084955Z",
"runStatus": "failed",
"driftsCount": 0
}
That is a good health signal, but it is not yet a troubleshooting signal. If this is part of an automated monitoring pipeline, you do not want your alert to simply say "the monitor failed." You want it to say which resource type failed, which instance failed, and what Microsoft Graph reported as the error.
What errorDetails contains
The errorDetails collection is designed to provide exactly that missing context. For monitor runs, the documented type is errorDetail. For snapshot jobs, the documentation describes the property as the details of errors related to the reasons why the snapshot cannot complete. In practice, the important troubleshooting dimensions are:
- resourceType: the TCM resource type that failed, such as microsoft.teams.meetingpolicy or microsoft.exchange.transportrule.
- resourceInstanceName: the specific resource instance that caused the issue, when available.
- errorMessage: the error text that points you toward the remediation.
A selected response for a failed monitor run could look like this:
{
"id": "<resultId>",
"monitorId": "69b6b9ba-20c9-4ffb-beef-263c07063222",
"runStatus": "failed",
"errorDetails": [
{
"resourceType": "microsoft.teams.meetingpolicy",
"resourceInstanceName": "Global",
"errorMessage": "Access Denied."
}
]
}
That is much more actionable. Instead of guessing whether the issue is with the monitor definition, the TCM service principal, a workload role, or a resource-specific problem, you now have a concrete starting point.
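To turn a response like this into an alert that names the resource type, the instance, and the error, something along these lines could work. It is a minimal Python sketch: the alert_lines function and the line format are my own choices, and the sample payload mirrors the response above with a placeholder identifier.

```python
import json


def alert_lines(result: dict) -> list[str]:
    """Format one alert line per errorDetails entry of a monitor result."""
    lines = []
    # errorDetails is absent unless it was requested with $select.
    for detail in result.get("errorDetails") or []:
        resource_type = detail.get("resourceType", "<unknown resource type>")
        # resourceInstanceName is only present when Graph can name the instance.
        instance = detail.get("resourceInstanceName") or "<no instance>"
        message = detail.get("errorMessage", "<no message>")
        lines.append(f"{result.get('runStatus')}: {resource_type} [{instance}]: {message}")
    return lines


# Sample payload shaped like the selected response shown above.
sample = json.loads("""
{
  "id": "<resultId>",
  "monitorId": "69b6b9ba-20c9-4ffb-beef-263c07063222",
  "runStatus": "failed",
  "errorDetails": [
    {
      "resourceType": "microsoft.teams.meetingpolicy",
      "resourceInstanceName": "Global",
      "errorMessage": "Access Denied."
    }
  ]
}
""")

for line in alert_lines(sample):
    print(line)
```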
Finding failed monitor runs
Monitor run history is exposed through configurationMonitoringResults. The list operation supports $select, $filter, $orderby, and $top, so you can quickly find recent failed or partially successful runs.
GET https://graph.microsoft.com/v1.0/admin/configurationManagement/configurationMonitoringResults?$filter=runStatus eq 'failed'&$orderby=runInitiationDateTime desc&$top=10
Once you have the failed result identifier, retrieve the error details:
GET https://graph.microsoft.com/v1.0/admin/configurationManagement/configurationMonitoringResults/<resultId>?$select=id,monitorId,runInitiationDateTime,runCompletionDateTime,runStatus,driftsCount,errorDetails
You can also filter for partially successful monitor runs:
GET https://graph.microsoft.com/v1.0/admin/configurationManagement/configurationMonitoringResults?$filter=runStatus eq 'partiallySuccessful'&$orderby=runInitiationDateTime desc
This is important because a partially successful run may still contain useful drift results for some resources while one or more resources failed. Treating partiallySuccessful as "good enough" in automation can hide resource coverage gaps.
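A script can build both list queries from one helper and treat partiallySuccessful exactly like failed when deciding whether to investigate. This is a minimal Python sketch; list_runs_url and needs_triage are illustrative names of my own, and the query options are the ones used in the requests above.

```python
from urllib.parse import quote, urlencode

RESULTS_URL = (
    "https://graph.microsoft.com/v1.0/admin/configurationManagement/"
    "configurationMonitoringResults"
)


def list_runs_url(run_status: str, top: int = 10) -> str:
    """Build the list query for the most recent runs with the given status."""
    params = {
        "$filter": f"runStatus eq '{run_status}'",
        "$orderby": "runInitiationDateTime desc",
        "$top": str(top),
    }
    # quote_via=quote encodes spaces as %20; "$" and "'" are left readable.
    return RESULTS_URL + "?" + urlencode(params, safe="$'", quote_via=quote)


def needs_triage(run_status: str) -> bool:
    """Treat partially successful runs as incidents, not as 'good enough'."""
    return run_status in {"failed", "partiallySuccessful"}


for status in ("failed", "partiallySuccessful"):
    print(list_runs_url(status))
```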
Finding failed snapshot jobs
Snapshot jobs are exposed through configurationSnapshotJobs. Snapshot jobs are asynchronous, so the normal workflow is to create the snapshot, poll the job until it completes, and then download the snapshot from the resourceLocation value when the job succeeds or partially succeeds.
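The poll-then-download part of that workflow can be sketched as follows. This is a minimal Python sketch under stated assumptions: fetch_job is a hypothetical callable that performs the GET and returns the parsed job, the in-progress status names "notStarted" and "running" and the spelling "succeeded" are assumptions not confirmed in this post, and the interval and poll limit are arbitrary.

```python
import time
from typing import Callable, Optional

# Assumption: status names for a job that has not finished yet.
IN_PROGRESS = {"notStarted", "running"}
# "partiallySuccessful" is named in this post; "succeeded" is an assumed spelling.
DOWNLOADABLE = {"succeeded", "partiallySuccessful"}


def wait_for_job(
    fetch_job: Callable[[], dict],
    interval_seconds: float = 30.0,
    max_polls: int = 40,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the job leaves the in-progress states or the limit is hit."""
    for attempt in range(max_polls):
        job = fetch_job()
        if job.get("status") not in IN_PROGRESS:
            return job
        # Do not sleep after the final attempt.
        if attempt + 1 < max_polls:
            sleep(interval_seconds)
    raise TimeoutError(f"job still in progress after {max_polls} polls")


def snapshot_location(job: dict) -> Optional[str]:
    """Return resourceLocation only when the job produced a snapshot."""
    if job.get("status") in DOWNLOADABLE:
        return job.get("resourceLocation")
    return None


# Usage with canned responses instead of real Graph calls.
canned = iter([
    {"status": "running"},
    {"status": "partiallySuccessful", "resourceLocation": "https://example.invalid/snapshot"},
])
job = wait_for_job(lambda: next(canned), sleep=lambda seconds: None)
print(job["status"], snapshot_location(job))
```

Passing the sleep function in as a parameter keeps the loop easy to test without waiting.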
To list recent failed snapshot jobs:
GET https://graph.microsoft.com/v1.0/admin/configurationManagement/configurationSnapshotJobs?$filter=status eq 'failed'&$orderby=createdDateTime desc&$top=10
Then call the specific job and select errorDetails:
GET https://graph.microsoft.com/v1.0/admin/configurationManagement/configurationSnapshotJobs/<jobId>?$select=id,displayName,status,createdDateTime,completedDateTime,errorDetails
For partially successful jobs, use the same pattern:
GET https://graph.microsoft.com/v1.0/admin/configurationManagement/configurationSnapshotJobs?$filter=status eq 'partiallySuccessful'&$orderby=createdDateTime desc
The snapshot job status enumeration includes partiallySuccessful as an evolvable enum value. If you are building strongly typed clients, make sure your code does not break when Graph returns future enum members. For raw REST calls, this usually just means treating status values as strings and not assuming that today's list is the complete list forever.
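One way to stay tolerant of future enum members is to map the status strings you know about and send everything else to an explicit "unrecognized" bucket instead of raising. The sketch below is in Python and is only illustrative: the Outcome enum and classify function are my own names, and apart from failed and partiallySuccessful the status spellings in the table are assumptions.

```python
from enum import Enum


class Outcome(Enum):
    IN_PROGRESS = "in progress"
    SUCCEEDED = "succeeded"
    NEEDS_TRIAGE = "needs triage"
    UNRECOGNIZED = "unrecognized"


# Only "failed" and "partiallySuccessful" are named in this post;
# the other spellings are assumptions for illustration.
KNOWN_STATUSES = {
    "notStarted": Outcome.IN_PROGRESS,
    "running": Outcome.IN_PROGRESS,
    "succeeded": Outcome.SUCCEEDED,
    "failed": Outcome.NEEDS_TRIAGE,
    "partiallySuccessful": Outcome.NEEDS_TRIAGE,
}


def classify(status: str) -> Outcome:
    """Map a raw status string to an outcome without failing on new values."""
    # A value added to the enum later lands in UNRECOGNIZED instead of raising.
    return KNOWN_STATUSES.get(status, Outcome.UNRECOGNIZED)


for raw in ("partiallySuccessful", "someFutureStatus"):
    print(raw, "->", classify(raw).value)
```

Logging the raw string alongside the UNRECOGNIZED outcome gives you a clear signal that the status table needs updating.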
Using Microsoft Graph PowerShell
If you are using the Microsoft Graph PowerShell SDK, the command names are available in the Microsoft.Graph.ConfigurationManagement module. The generated cmdlets expose the same resources as the REST API. If the SDK cmdlet in your installed version does not expose a convenient select parameter for the non-default property, you can always use Invoke-MgGraphRequest and call the REST endpoint directly.
Import-Module Microsoft.Graph.Authentication
Import-Module Microsoft.Graph.ConfigurationManagement
Connect-MgGraph -Scopes 'ConfigurationMonitoring.Read.All'
$resultId = '<resultId>'
$uri = "/v1.0/admin/configurationManagement/configurationMonitoringResults/$resultId" +
'?$select=id,monitorId,runStatus,errorDetails'
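# Request PSObject output so Format-Table can resolve the property names (the default output is a hashtable).
$result = Invoke-MgGraphRequest -Method GET -Uri $uri -OutputType PSObject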
$result.errorDetails | Format-Table resourceType, resourceInstanceName, errorMessage -AutoSize
For snapshot jobs, only the path changes:
$snapshotJobId = '<jobId>'
$uri = "/v1.0/admin/configurationManagement/configurationSnapshotJobs/$snapshotJobId" +
'?$select=id,displayName,status,errorDetails'
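# Request PSObject output so Format-Table can resolve the property names (the default output is a hashtable).
$job = Invoke-MgGraphRequest -Method GET -Uri $uri -OutputType PSObject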
$job.errorDetails | Format-Table resourceType, resourceInstanceName, errorMessage -AutoSize
Building better automation around TCM errors
The practical value of errorDetails shows up when you build it into your operational workflows. A monitor failure should not require someone to manually open Graph Explorer, re-run the query, and discover that the service principal is missing a workload role. Your automation can do that triage automatically.

At a minimum, I recommend logging these fields whenever a monitor result or snapshot job is not fully successful:
- Object identifier: the monitoring result ID or snapshot job ID.
- Operation status: failed or partiallySuccessful.
- Resource type: the TCM resource type that failed.
- Resource instance: the resource instance name, when present.
- Error message: the text returned by Graph.
From there, you can group errors by resourceType. If every Teams-related resource fails with an access error, you are probably dealing with a missing workload role or permission. If only one resource instance fails, the issue is likely more specific to that object or its configuration. If snapshot jobs fail across every requested resource, look first at the TCM service principal setup and the tenant-level permissions.
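The logging and grouping described above could look roughly like this. It is a minimal Python sketch: to_log_records and group_by_resource_type are illustrative names, the record keys are my own choice, and the sample entries (including the second instance name and the transport rule message) are invented for the example.

```python
from collections import defaultdict


def to_log_records(object_id: str, status: str, error_details: list) -> list:
    """Flatten errorDetails into one log record per error."""
    records = []
    for detail in error_details or []:
        records.append({
            "objectId": object_id,
            "status": status,
            "resourceType": detail.get("resourceType"),
            # The instance name is optional, so it may be None.
            "resourceInstanceName": detail.get("resourceInstanceName"),
            "errorMessage": detail.get("errorMessage"),
        })
    return records


def group_by_resource_type(records: list) -> dict:
    """Group log records by the TCM resource type that failed."""
    groups = defaultdict(list)
    for record in records:
        groups[record["resourceType"]].append(record)
    return dict(groups)


# Invented sample data for illustration only.
details = [
    {"resourceType": "microsoft.teams.meetingpolicy",
     "resourceInstanceName": "Global", "errorMessage": "Access Denied."},
    {"resourceType": "microsoft.teams.meetingpolicy",
     "resourceInstanceName": "ExamplePolicy", "errorMessage": "Access Denied."},
    {"resourceType": "microsoft.exchange.transportrule",
     "errorMessage": "Example error text."},
]
records = to_log_records("<jobId>", "partiallySuccessful", details)
for resource_type, items in group_by_resource_type(records).items():
    print(resource_type, len(items))
```

A resource type with many records and a single shared message usually points at a permission or role problem, while a lone record points at one object.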
Common remediation patterns
The error message is the source of truth, but most failures tend to fall into a few operational categories:
- Missing Microsoft Graph permission: the calling application or user does not have the required ConfigurationMonitoring.Read.All or ConfigurationMonitoring.ReadWrite.All permission for the operation being performed.
- Missing TCM service principal permission: the Unified Tenant Configuration Management service principal does not have the workload permissions required to read or evaluate the selected resource type.
- Missing workload role: some workloads require the TCM service principal to hold specific Microsoft Entra roles in addition to Graph permissions.
- Unsupported or unavailable resource state: the requested resource type or instance cannot be processed in the tenant's current state.
- Transient workload issue: a backend workload error may require retrying the monitor run or snapshot job after the service recovers.
The key is to avoid treating all TCM failures as equal. A failed monitor run with one Exchange transport rule error is a different operational problem than a snapshot job where every Teams resource failed due to access denied.
Suggested pattern for scripts
The following pseudo-flow is what I normally recommend for automation:
# 1. Find recent failed or partially successful monitor runs.
# 2. For each result, call the result directly with $select=errorDetails.
# 3. Write one log record per error detail.
# 4. Group by resourceType and errorMessage.
# 5. Route the issue to the right owner or remediation workflow.
That same pattern applies to snapshot jobs. The only difference is the collection and status field names:
| Scenario | Collection | Status field | Error property |
|---|---|---|---|
| Monitor run | configurationMonitoringResults | runStatus | errorDetails |
| Snapshot job | configurationSnapshotJobs | status | errorDetails |
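Putting the flow and the table together, steps 1 to 4 can be sketched as a single function that works for both scenarios. This is a Python sketch under assumptions, not a finished tool: get_json is a hypothetical callable that sends an authenticated GET and returns the parsed JSON, the SCENARIOS dictionary and triage function are my own names, and paging through @odata.nextLink is left out for brevity. Step 5, routing, is left to the caller.

```python
from collections import Counter
from typing import Callable

BASE = "/v1.0/admin/configurationManagement"

# Collection and status field per scenario, as in the table above.
SCENARIOS = {
    "monitor": {"collection": "configurationMonitoringResults", "status_field": "runStatus"},
    "snapshot": {"collection": "configurationSnapshotJobs", "status_field": "status"},
}


def triage(scenario: str, get_json: Callable[[str], dict]) -> Counter:
    """Count errors per (resourceType, errorMessage) for unsuccessful objects."""
    cfg = SCENARIOS[scenario]
    collection, field = cfg["collection"], cfg["status_field"]
    summary = Counter()
    for status in ("failed", "partiallySuccessful"):
        # Step 1: find objects with this status (first page only).
        listing = get_json(f"{BASE}/{collection}?$filter={field} eq '{status}'")
        for item in listing.get("value", []):
            # Step 2: call the object directly and select errorDetails.
            detail = get_json(f"{BASE}/{collection}/{item['id']}?$select=id,{field},errorDetails")
            # Steps 3 and 4: one entry per error, grouped by type and message.
            for err in detail.get("errorDetails") or []:
                summary[(err.get("resourceType"), err.get("errorMessage"))] += 1
    return summary


def fake_get_json(url: str) -> dict:
    """Canned responses standing in for real Graph calls."""
    if "?$filter=" in url:
        return {"value": [{"id": "r1"}]} if "'failed'" in url else {"value": []}
    return {"id": "r1", "errorDetails": [{
        "resourceType": "microsoft.teams.meetingpolicy",
        "resourceInstanceName": "Global",
        "errorMessage": "Access Denied.",
    }]}


for key, count in triage("monitor", fake_get_json).items():
    print(key, count)
```

Injecting get_json keeps the triage logic independent of how you authenticate, so the same function can sit behind Invoke-MgGraphRequest output, an SDK client, or a plain HTTP library.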
Conclusion
The errorDetails property is one of those small Graph API details that makes a big operational difference. Without it, a failed TCM monitor run or snapshot job is only a status value. With it, you can see the failing resource type, the affected instance, and the message that points you toward remediation.
If you are building automation around Tenant Configuration Management, make $select=errorDetails part of your failure-handling path. It will make your alerts more useful, your logs more actionable, and your troubleshooting much faster.