We rely on our Operations and Service Desk to keep our systems running. But Smart Mover is difficult to monitor, and finding the actual error in a message takes a lot of effort. A few things that could be improved:
a) Error report by mail
The current error mail contains the entire log for the execution of a job. If that job runs 10 steps and handles 100 files, it becomes very difficult to spot the exact error. It would be easier to understand if the error report contained:
Server/Host + Process + Task + File/Job + Error
b) Mails are plain text
As a result, the red marker that identifies the error isn't available. It wouldn't solve everything, but when you get 130 or more lines, it does make the error easier to spot.
c) Errors can be extensive
For errors in Elvis we get the full Java stack trace. Instead of "Elvis version conflict" we get "SpringServiceExceptionHandler.Call.Failed" and so on. That is very difficult for support to act on.
d) Difficult to integrate with monitoring
Only one person can use SM Manager at a time, and we're not logged on to our servers all the time. We use Datadog and would like to see SM errors on our dashboards. Datadog can import JSON or mail, but the current very long messages don't make that easy.
e) Warning per file
For monitoring it is useful to know whether a problem affects one file or many. Using JSON could perhaps make it possible to report the number of files affected.
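To illustrate points a), d) and e), here is a rough Python sketch of what a structured error report might look like. Everything in it is an assumption on our side: the field names (host, process, task, file, error), the sample values and the summarize helper are hypothetical and not part of Smart Mover or Datadog.

```python
import json
from collections import Counter
from dataclasses import dataclass, asdict


@dataclass
class ErrorRecord:
    """One error, following the Server/Host + Process + Task + File/Job + Error idea."""
    host: str
    process: str
    task: str
    file: str
    error: str


def summarize(records):
    """Build a compact report: totals first, then the individual errors."""
    files = {r.file for r in records}
    return {
        "error_count": len(records),
        "files_affected": len(files),
        # How often each short error message occurred
        "errors_by_message": dict(Counter(r.error for r in records)),
        "errors": [asdict(r) for r in records],
    }


# Hypothetical sample data, only to show the shape of the output
records = [
    ErrorRecord("sm-server-01", "Import", "Upload", "a.jpg", "Elvis version conflict"),
    ErrorRecord("sm-server-01", "Import", "Upload", "b.jpg", "Elvis version conflict"),
    ErrorRecord("sm-server-01", "Import", "Move", "b.jpg", "Target folder missing"),
]

payload = json.dumps(summarize(records), indent=2)
print(payload)
```

With a report shaped like this, a dashboard could show the number of files affected directly, and support would see the short error message without reading the whole job log.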
How do you monitor Smart Mover?