TA 5 stops processing; timeout errors in server log
Hello all,
I am going to email Parker support with this, but I thought I would give the forum a try too.
We installed ThinkAutomation version 5 a few months ago, as a new install on a Windows Server 2019 machine in the Azure cloud with an Azure SQL Managed Instance. We use TA to monitor a mailbox in Microsoft Office 365.
Since we installed TA it has been working; however, from time to time it just stops, and I have to check manually to find the errors below in the Server Log in Studio. When these errors happen, the O365 monitoring stops and we lose data.
Errors:
A transport-level error has occurred when receiving results from the server. (provider: TCP Provider, error: 0 - The semaphore timeout period has expired.) Executing clsMessageStoreRead.HasPendingOutbox
The connection is broken and recovery is not possible. The client driver attempted to recover the connection one or more times and all attempts failed. Increase the value of ConnectRetryCount to increase the number of recovery attempts. Executing clsMessageStoreRead.HasPendingOutbox
The TA services are still running; TA just stops processing messages. What is causing these errors, and how can I fix them? I tried searching the web page (frames?? really?) for ConnectRetryCount and semaphore timeout, but the web page search is awful and I had no luck there.
Not only do I need to fix these errors, but I also need better application documentation. Is there a searchable PDF version available?
I finally found the log file PSL.Log so I will set up a log monitor there, but these errors are unacceptable and I can't keep manually checking TA to make sure it is working.
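A log monitor along those lines could be sketched in Python as below. This is only a rough sketch under some assumptions: the two error phrases are copied from the errors quoted above, the path to PSL.Log is passed in by the caller because it is not given in this thread, and the file is assumed to be readable as UTF-8 text. The function names are made up for illustration and are not part of ThinkAutomation.

```python
import sys
from pathlib import Path

# Phrases taken from the two errors quoted in this post.
ERROR_MARKERS = (
    "semaphore timeout period has expired",
    "connection is broken and recovery is not possible",
)


def find_connection_errors(lines):
    """Return the lines that contain any of the known error phrases."""
    hits = []
    for line in lines:
        lowered = line.lower()
        if any(marker in lowered for marker in ERROR_MARKERS):
            hits.append(line.rstrip("\n"))
    return hits


def read_new_lines(log_path, offset):
    """Read what was appended since byte `offset`; return (lines, new_offset)."""
    path = Path(log_path)
    if path.stat().st_size < offset:
        offset = 0  # file shrank (rotated or truncated), so start over
    with path.open("rb") as handle:
        handle.seek(offset)
        data = handle.read()
    # Assumption: the log is UTF-8; undecodable bytes are replaced, not fatal.
    text = data.decode("utf-8", errors="replace")
    return text.splitlines(), offset + len(data)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python check_psl_log.py <path to PSL.Log>
    new_lines, _ = read_new_lines(sys.argv[1], 0)
    for hit in find_connection_errors(new_lines):
        print(hit)
```

Run from a scheduled task, with the returned offset saved between runs, something like this would only report errors written since the previous check; how the alert is delivered (email, ticket, and so on) is left open.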
Thanks in advance for any help.
- Daniel Horton @daniel.horton
Hi JayJay,
The issue you are seeing here is connectivity to the Message Store database. The log is trying to assist you, but the root cause is the more important item to resolve: TCP/IP connectivity issues to your (presumably external) Message Store. ThinkAutomation has built-in methods to handle disruptions where connectivity is lost temporarily, and it will halt processing until the database connection is restored. However, this only covers temporary situations such as the database service restarting. Core network failures cannot be handled in all circumstances, as I'm sure you can appreciate, because they cannot be predicted or controlled by the application.
Our recommendation here would be to look into any potential issues with connectivity over your designated remote connection between the two machines. There may be a way to improve resilience in the Connection String, but this would not be advised. Our advice would be to review and resolve any underlying connectivity/communication issues that may be occurring at the times this problem appears.
Because the Message Store is a core database used by ThinkAutomation, problems are to be expected whenever this connectivity is not available.
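For reference only, since the advice above is to fix the underlying connectivity rather than tune the Connection String: the error message names the SqlClient keyword ConnectRetryCount, which has a companion keyword, ConnectRetryInterval (seconds between attempts). The Python sketch below simply shows where those keywords would sit in a connection string. The helper name, the server and database names, and the chosen values are all hypothetical, and the naive split on ";" assumes no quoted values that contain semicolons.

```python
def with_retry_settings(connection_string, retry_count=3, retry_interval=10):
    """Return a copy of the connection string with the retry keywords set."""
    overrides = {
        "connectretrycount": ("ConnectRetryCount", str(retry_count)),
        "connectretryinterval": ("ConnectRetryInterval", str(retry_interval)),
    }
    parts = []
    for item in connection_string.split(";"):
        if not item.strip():
            continue  # skip empty segments such as a trailing ";"
        key = item.split("=", 1)[0].strip().lower()
        if key not in overrides:
            parts.append(item.strip())  # keep every unrelated setting as-is
    for name, value in overrides.values():
        parts.append(f"{name}={value}")
    return ";".join(parts) + ";"


# Placeholder server and database names, for illustration only.
example = "Server=tcp:example.database.windows.net,1433;Database=TAStore;ConnectRetryCount=1"
print(with_retry_settings(example))
```

Settings like these only lengthen how long the client keeps trying to reconnect; they do not address the network drops themselves.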
If you are looking for 100% uptime, it may be wise to consider hosting the database locally to ThinkAutomation and replicating it externally.
Our Help documentation is available here and can be searched using your web browser's text search - https://support.thinkautomation.com/
Issues such as the one you are experiencing are caused by factors outside of the product, and our error handling is attempting to guide you to this understanding. As with all technical products, it is difficult to document every possible error when so many external factors are involved. The Community is the place where further knowledge on these types of items can be shared, so thank you for your contribution.