Cluster Has Inconsistent View of Existing Docker Containers #145
It appears that the "clusterSpecificAgentInstances" map in the "DockerPlugin" class becomes stale and inconsistent over time. When the server sends a SERVER_PING request, it includes a list of agents, which the plugin can access through the "pluginRequest.listAgents()" method. The plugin iterates through this list to clean up the agents. During cleanup, the "ServerPingRequestExecutor" calls the "performCleanupForACluster" method, which queries "clusterSpecificAgentInstances" through the "instancesCreatedAfterTimeout" method of the "DockerContainers" class. The agents in question never get cleaned up or disabled because the "DockerContainers" instance being queried is stale: it has not been refreshed since the plugin was instantiated. The code that adds these agents to the list of agents to disable already exists; it simply operates on stale data.
To address the inconsistent view, I've introduced a periodic forced refresh of "clusterSpecificAgentInstances": a task scheduled every hour resets the "refreshed" variable to false, so the next refresh reloads the container list. This seems to help, but I'm not sure whether there is a better fix for the root cause.
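
For reviewers, here is a minimal sketch of the idea rather than the actual change. "RefreshableInstances" and "PeriodicRefreshScheduler" are hypothetical stand-ins, not classes from the plugin; the sketch only assumes that the real "DockerContainers" guards its reload with a boolean "refreshed" flag, as described above.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for a per-cluster container cache guarded by a "refreshed" flag.
class RefreshableInstances {
    private final AtomicBoolean refreshed = new AtomicBoolean(false);
    private int reloadCount = 0;

    // Reloads only when the cache is marked stale; otherwise this is a no-op.
    synchronized void refreshAll() {
        if (refreshed.compareAndSet(false, true)) {
            reloadCount++; // placeholder for re-listing containers from the Docker daemon
        }
    }

    // Marks the cache stale so the next refreshAll() call reloads it.
    void markStale() {
        refreshed.set(false);
    }

    synchronized int reloadCount() {
        return reloadCount;
    }
}

// Hypothetical helper that schedules the periodic reset of the flag.
class PeriodicRefreshScheduler {
    static ScheduledExecutorService schedule(RefreshableInstances instances,
                                             long period, TimeUnit unit) {
        // A daemon thread keeps the scheduler from blocking JVM shutdown.
        ScheduledExecutorService scheduler =
                Executors.newSingleThreadScheduledExecutor(runnable -> {
                    Thread thread = new Thread(runnable, "force-refresh");
                    thread.setDaemon(true);
                    return thread;
                });
        // The task only flips the flag; the actual reload happens on the next refreshAll().
        scheduler.scheduleAtFixedRate(instances::markStale, period, period, unit);
        return scheduler;
    }
}
```

With an hourly period this would be wired up as `PeriodicRefreshScheduler.schedule(instances, 1, TimeUnit.HOURS)`. The scheduled task does no Docker I/O itself; it only invalidates the flag, so the cost of the reload is paid by the next caller that refreshes.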