Troubleshooting Agent Certificate Issues

Topic

Backup fails with "Critical backup failure during data transfer (Certificate error communicating with Agent)"

Environment

  • Datto SIRIS
  • Datto ALTO
  • Datto NAS

Description

To better secure the agent software running in your production environment, we use a certificate-based trust system that rejects requests for a new backup or for system configuration information from anything other than the paired BCDR device. To maintain the trust chain at the core of this system, protected machines running the Datto Windows Agent must occasionally renew their certificates and validate that the certificate chain still traces back to a trusted root CA.

Along with checking for and honoring certificate revocations, these processes involve two main operations that, if they fail, can prevent the agent software from taking backups consistently: encrypted network communication with Datto and CA internet resources, and installation of new local and CA certificates. This guide provides a set of tools to address common issues that occur during those tasks.

Symptoms

Issues with the Datto Windows Agent certificate can manifest as:

  • Datto Backup Agent Service unable to start on the protected machine
  • Backup errors displayed in the BCDR device UI referencing agent certificates
  • Datto Windows Agent installation failures

If you suspect that the symptoms above stem from problems generating or validating certificates but cannot clearly identify the cause, our support team can help identify a certificate-based issue. Otherwise, work through the following sections to evaluate the issue further and determine which stage of the process is failing.

Network Issues

Although all of the Datto agent software's Network and Bandwidth Requirements must be met for the software to function, pay particular attention to the following items on protected machines experiencing certificate-based issues:

  • DNS / Hostname resolution including:

    • device.dattobackup.com

    • cacerts.digicert.com

  • TCP port / service restrictions at the host, gateway, or network level including:

    • HTTP: TCP port 80, outbound to WAN

    • HTTPS: TCP port 443, outbound to WAN

The most up-to-date list of required internet destinations can be found here, but in lieu of a comprehensive investigation, you can confirm network access to the primary resources involved in the certificate generation process with a couple of quick tests.

Testing

You can evaluate the network configuration for certificate renewal with a few simple network tests available in CMD.exe and PowerShell.

In PowerShell:

Test-NetConnection -Port 443 device.dattobackup.com  

In CMD:

nslookup device.dattobackup.com
nslookup cacerts.digicert.com
nslookup ocsp.digicert.com
nslookup device.dattobackup.com <local_DNS_IP_addr>  

If any of these tests fails, follow up with your network administrator to adjust the configuration or further troubleshoot an environmental issue. Common causes at this stage include issues with DNS forwarding, network firewall restrictions, or host firewall restrictions.
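The individual checks above can also be run as one pass over every hostname and port this article mentions. The following is a minimal sketch, assuming Python 3 is available on a machine on the same network as the protected system. The function names (resolve, can_connect, run_checks) are illustrative and are not part of any Datto tooling. Not every host is expected to answer on both ports, so read the results against the requirements listed earlier.

```python
import socket

# Hostnames and ports taken from this article; adjust to your environment.
HOSTS = ["device.dattobackup.com", "cacerts.digicert.com", "ocsp.digicert.com"]
PORTS = [80, 443]


def resolve(host, resolver=socket.getaddrinfo):
    """Return the sorted unique IP addresses for host, or [] if lookup fails."""
    try:
        infos = resolver(host, None)
    except OSError:  # socket.gaierror is a subclass of OSError
        return []
    return sorted({info[4][0] for info in infos})


def can_connect(host, port, timeout=5.0, connector=socket.create_connection):
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        conn = connector((host, port), timeout)
    except OSError:  # refused, unreachable, timed out, or DNS failure
        return False
    conn.close()
    return True


def run_checks(hosts=HOSTS, ports=PORTS,
               resolver=socket.getaddrinfo,
               connector=socket.create_connection):
    """Return a list of (host, check_name, ok) tuples for DNS and TCP checks."""
    results = []
    for host in hosts:
        dns_ok = bool(resolve(host, resolver))
        results.append((host, "dns", dns_ok))
        for port in ports:
            # When DNS failed, record the TCP check as failed without trying it.
            tcp_ok = dns_ok and can_connect(host, port, connector=connector)
            results.append((host, f"tcp/{port}", tcp_ok))
    return results
```

Calling run_checks() with no arguments performs live lookups and connections. A host that fails "dns" points toward DNS forwarding, while a host that resolves but fails a "tcp/..." check points toward a firewall restriction.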

Cryptography Issues

Intercepting encrypted traffic for monitoring can be a powerful tool in a secure environment's arsenal, but to maintain strict trust relationships between Datto endpoints, the software must communicate with our certificate resources without interference. Systems that can interfere with certificate retrieval include full HTTPS proxies and network-level SSL monitoring such as deep packet inspection (DPI).
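One way to see whether such a system sits between the protected machine and Datto is to look at who issued the certificate the machine actually receives. The sketch below assumes Python 3 is available. The helper names (issuer_fields, describe_tls) are illustrative only, and Python's TLS stack is not the one the agent uses, so treat the result as a hint rather than a verdict.

```python
import socket
import ssl


def issuer_fields(cert):
    """Flatten the 'issuer' entry of an SSLSocket.getpeercert() dict.

    getpeercert() returns the issuer as a tuple of RDNs, each a tuple of
    (field, value) pairs; this returns a plain {field: value} dict.
    """
    fields = {}
    for rdn in cert.get("issuer", ()):
        for key, value in rdn:
            fields[key] = value
    return fields


def describe_tls(host, port=443, timeout=10.0):
    """Handshake with host and return (protocol_version, issuer_fields).

    Raises ssl.SSLError if TLS 1.2 or higher cannot be negotiated or if the
    presented chain is not trusted by this machine, and OSError if the TCP
    connection cannot be opened.
    """
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2  # refuse older protocols
    with socket.create_connection((host, port), timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            return tls.version(), issuer_fields(tls.getpeercert())
```

For example, describe_tls("device.dattobackup.com") returns the negotiated protocol version and the issuer of the presented certificate. An issuer that names your firewall or proxy vendor, or an internal CA, rather than a public certificate authority suggests that TLS traffic is being intercepted.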

Testing

One way to confirm that the agent software can obtain and sign its certificate is to retrieve the signing certificate from our cloud resources with curl, simulating the agent's encrypted conversation. The test shows whether network resources are interfering with TLS communication for the protected machine, or whether the machine lacks the cryptographic libraries needed to communicate with our servers.

In CMD:

curl -sv -d "{\"caId\": \"dla\", \"caVersion\": 1, \"action\": \"retrieveCaCert\"}" https://device.dattobackup.com/certApi.php 

Example output of a successful test:

C:\Users\Administrator>curl -sv -d "{\"caId\": \"dla\", \"caVersion\": 1, \"action\": \"retrieveCaCert\"}" https://device.dattobackup.com/certApi.php
*   Trying 8.34.176.9:443...
* Connected to device.dattobackup.com (8.34.176.9) port 443 (#0)
* schannel: disabled automatic use of client certificate
* ALPN: offers http/1.1
* ALPN: server did not agree on a protocol. Uses default.
> POST /certApi.php HTTP/1.1
> Host: device.dattobackup.com
> User-Agent: curl/7.83.1
> Accept: */*
> Content-Length: 59
> Content-Type: application/x-www-form-urlencoded
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Date: Wed, 15 Feb 2023 18:54:09 GMT
< Server: Apache
< Vary: Accept-Encoding
< Content-Length: 1920
< Content-Type: application/json
<
{"status":"success","message":"Successfully created certificate.","certificate":"-----BEGIN CERTIFICATE-----\nMIIE8zCCAtugAwIBAgIUHuLNTt9LMh\/nLRkQZXgVhlw8kBgwDQYJKoZIhvcNAQEL\nBQAwGzEZMBcGA1UEAwwQZGxhLmNhLmRhdHRvLmNvbTAeFw0yMDAxMjAxMDA1MjVa\nFw0yNjAxMTgxMDA1MjVaMBsxGTAXBgNVBAMMEGRsYS5jYS5kYXR0by5jb20wggIi\nMA0GCSqGSIb3DQEBAQUAA4ICDwAwggIKAoICAQDoq+OmDYy3SAbRvVh0N8QXgvq1\nrcdn1D9Og5651m0ErojFdo5WoIwIDEcmJ5JDHi8Q3MjxZHfNyiogNzCtWqY+jQf+\nqqv6CRKHcx7BpjXGpzL41B75olckQ0jClaBeY+EmKzB5YlVAkDr+ooR20oElasZ\/\nIYu3giTyZ4I+8AMRXnsUul6yn160gMEFARcl+8T4CT\/frTu7hyKkk20jhQuS+dj6\nKVGDIQAoiGMossVuqq4N\/Pq8Co+O0lxQj3qtZoYTT43tBNu5wHDNW69WsA6lRZUW\nZwOqXwsX8P1Hu2b1R2O\/qxDAtaLTXbILK79Snivaf\/2kfoUYR1gAeoDcUxHZzsaW\nVjbwViWqf1TtUEeadF3BF4bn9RJBqzfZNeEEYkTNMalE4JHZz42Zm70qclmruDfk\nAGv0o1Yhaf5CHT4AYajObfpqI0aHbk7n+Lxi8+dYJ48b\/\/YFJZTFca2FJIwYugDy\n4+\/Bp2c6Py53GEPbGqj\/xddBGXi3zraWo\/DGAhF3mI2IefnWV06JGmM6wLxVkdYd\nBOhjR5ktpqV4VE7l+OLek1aTH5gLTochwoqYdKys4yu8j5rcw\/pu4daz4AJGhYKP\nMyrg1dXNnqIYfGsWU3SBQsiJ\/cKVRDENxx3NkgmS\/xdl01nUj7xaj97GuStQIZ5c\nJ9oFPIUoOLB2GQBEjwIDAQABoy8wLTAMBgNVHRMEBTADAQH\/MB0GA1UdDgQWBBT5\nmL2kbH555kVU+6dxhVoe1it2tTANBgkqhkiG9w0BAQsFAAOCAgEAdXQqGWeT1tdd\nRnnaT1\/VzLd8dnQ\/D9myeGeaWcm8MeDUnbhOEeuv2BH3geRplagZ0tSFnlWcEril\n3hzMrRJp1ck++dNjoW96+1yqNr7p0Kz7iJ0PL+nW5gIamcdR5\/hAYftU43hA0hpj\nKyWPstr0N5BfGUXqs7c\/7Xw\/WjALdbmLFwSUtoLYL\/qAsYXSxRG+Funf+Ax\/zqhZ\n+ShzAWihnHBkxMFPw\/VvZjT7LXYSlLTZwb2HiUB414ZItppwqoLjXPBzZMGvNVmn\nz5Q3bP81hDM4k9o4PyE815RonuL8OzwHEAlqQYwavLf2En4lLQ72by+Nwlo9lcYo\nDQHjS6fnJZhj8W6JbEYi+XK\/NkkJQmj88U2rLt3v1qIUlVXubOTSLBMvrAt2TuGn\nqHggR\/byGD976fDzhb+WQEKLE0sFiHRlcApdP7kSGWuF3vI6N\/k6RE1UZaCwZ5cw\nJWU5mt3F2pPyJSKHfMgWuucSlbXv0Yice6PT6BfC6MQtTHmasvJ3qinKHGQdY2\/K\nVSi\/T\/sC8hU\/DjQ+5hBabWYX88z7NNNv\/XwJKDIy2LREiYh1dUKcAXKsEeVF23sp\n7WD7ZCFl8vd3Zm8HoCNsPXA65nl+dvlX3VnW20KACaCqMQ+wjaoBMj9Cjbvb3UQb\nZPqlnvwfCrbi1sSsutYapeADy7LMej0=\n-----END CERTIFICATE-----\n"}* Connection #0 to host device.dattobackup.com left intact 

This command retrieves the CA certificate from Datto resources and provides diagnostic output if communication fails. For instance, Schannel reports that it cannot complete a TLS handshake with device.dattobackup.com if the protected machine cannot negotiate TLS 1.2 or higher. To resolve failures during this test, consult your network administrators or apply the most up-to-date Windows patches available for the operating system.
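When the response body is long, it can be hard to tell at a glance whether it is the expected JSON document or, for example, a block page returned by a proxy. The following minimal sketch checks a saved response body. It assumes Python 3 and relies only on the response shape shown in the example output above (a "status" of "success" and a PEM-encoded "certificate" field). The function name extract_ca_cert is illustrative.

```python
import json
import ssl


def extract_ca_cert(body):
    """Return the DER bytes of the certificate in a certApi.php response body.

    Raises ValueError if the body is not JSON, does not report success, or
    does not carry a PEM-encoded certificate.
    """
    try:
        doc = json.loads(body)
    except json.JSONDecodeError as exc:
        # An HTML block page or an empty body ends up here.
        raise ValueError(f"response is not JSON: {exc}") from exc
    if not isinstance(doc, dict) or doc.get("status") != "success":
        raise ValueError(f"unexpected response: {body[:200]!r}")
    pem = doc.get("certificate")
    if not isinstance(pem, str):
        raise ValueError("response has no certificate field")
    # PEM_cert_to_DER_cert raises ValueError on a bad header, footer, or payload.
    return ssl.PEM_cert_to_DER_cert(pem.strip())
```

To use it, save only the response body (for example, by running curl without -v and redirecting the output to a file), read the file as text, and pass the contents to extract_ca_cert. A ValueError means the machine received something other than the expected certificate response. Note that this only checks the PEM framing and Base64 payload; it does not validate the certificate itself.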