Important: Because triggers are a new and evolving feature, backward compatibility between releases is not guaranteed at this time.
That is what happened to me when I tried to set up a new trigger to monitor the dfs_capacity_used_non_hdfs metric in HDFS on CDH5.4.2 (I verified that the issue is still there in CDH5.5.0).
I used the Create Trigger button on the status page of the HDFS service to create a new trigger, changed the default name, entered a metric and a value, and changed the action to Mark as bad. The editor then showed that everything was OK and that the trigger had not fired. So I pressed Create Trigger, sure that it would send me an alert once non-DFS usage got too high.
Well, everything needs to be tested. So I created another trigger with a lower value that should have made it fire, but nothing happened. After checking various things, I figured out that CM had created the trigger with the literal variable name $SERVICENAME instead of the actual value. If you hit this issue, the servicemonitor logs will show something like “Could not parse trigger expression: …”
The fix seems simple: replace $SERVICENAME with HDFS and save. If you do only that, CM will complain that the expression of a trigger created in the editor was changed manually. To prevent that, you can remove the expressionEditorConfig section. The more consistent way, though, is to delete the trigger first, using the documented way from the Edit Trigger page, and then create it again manually. I prefer this approach because we can’t be sure that CM doesn’t keep metadata somewhere else.
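The manual fix can be sketched in a few lines of Python. This is only an illustration under assumptions: it supposes the triggers are available as a JSON list of objects, that the expression lives in a field called triggerExpression, and that the sample expression below resembles what the editor produced. Only $SERVICENAME and expressionEditorConfig come from the issue described above; the function name and the sample data are hypothetical.

```python
import json


def fix_triggers(triggers_json: str, service_name: str) -> str:
    """Replace the unexpanded $SERVICENAME placeholder and drop editor metadata.

    Assumes (not confirmed by CM documentation here) that triggers are a JSON
    list of objects with a "triggerExpression" field.
    """
    triggers = json.loads(triggers_json)
    for trigger in triggers:
        if "triggerExpression" in trigger:
            trigger["triggerExpression"] = trigger["triggerExpression"].replace(
                "$SERVICENAME", service_name
            )
        # Removing this section avoids the "manually changed expression" complaint.
        trigger.pop("expressionEditorConfig", None)
    return json.dumps(triggers, indent=2)


# Hypothetical sample resembling a trigger saved with the placeholder intact.
sample = json.dumps([
    {
        "triggerName": "Non DFS usage",
        "triggerExpression": "IF (SELECT dfs_capacity_used_non_hdfs "
                             "WHERE serviceName=$SERVICENAME) DO health:bad",
        "expressionEditorConfig": {},
    }
])

print(fix_triggers(sample, "HDFS"))
```

The function leaves every other field untouched, so it should be safe to run over triggers that were already created by hand.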
Another issue, however, is that CM gives you no link to this page. The trigger would normally appear under Health tests if it had been created without the initial problem. To reach the Edit Trigger page you can use your browser history or build the URL manually. Just go to any health test of the service and replace the tail of the URL with healthTestName=alarm%3A<trigger name>. If you used spaces in <trigger name>, replace them with plus signs.
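If you would rather not encode the name by hand, the URL tail can be built with Python's standard library. This is a small sketch: the function name is hypothetical, and it only produces the healthTestName part, not the rest of the CM URL. Note that quote_plus also escapes other special characters, which goes slightly beyond the colon and space handling described above.

```python
from urllib.parse import quote_plus


def health_test_param(trigger_name: str) -> str:
    """Build the URL tail for a trigger's health test page.

    quote_plus turns ":" into "%3A" and spaces into "+".
    """
    return "healthTestName=" + quote_plus("alarm:" + trigger_name)


print(health_test_param("Non DFS usage"))
# prints: healthTestName=alarm%3ANon+DFS+usage
```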
Triggers are awesome, but you should create them manually until the editor is fixed. Also, if you have already created triggers using the editor, you may want to check whether they are actually working. You should see them in the Health tests list, and you shouldn’t see any parsing errors in the servicemonitor logs.
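The log check can be scripted as well. The sketch below scans an iterable of log lines for the error message quoted earlier; the function name and the sample lines are made up for illustration, and you would need to feed it the lines of your own servicemonitor log file.

```python
def find_trigger_parse_errors(lines):
    """Return the log lines that report a trigger expression parse failure."""
    marker = "Could not parse trigger expression"
    return [line.rstrip("\n") for line in lines if marker in line]


# Hypothetical log excerpt; real servicemonitor lines will look different.
sample_log = [
    "INFO  Something unrelated\n",
    "WARN  Could not parse trigger expression: ...\n",
]

for hit in find_trigger_parse_errors(sample_log):
    print(hit)
```

An empty result for the period since the triggers were created suggests the expressions were parsed correctly.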