Alerting integration for violations #580

ritazh · 2020-04-21T17:43:50Z

e.g. send violations to slack

This is from the CNCF webinar.
To summarize the ask from the webinar: when a violation is detected, it would be good to get an alert for that event into systems like Slack, Datadog, or Prometheus.

maxsmythe · 2020-04-22T03:30:45Z

Could you add a link or something so we know what the goal of this bug would be?

This possibly sounds like an enforcement action. I also wonder how this would interact with alerts sourced from Prometheus.

One danger to watch out for: API request volume to the admin server can be extremely high for some kinds, so we risk spamming the alert pipeline without some volume-reducing solution.
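
As a rough illustration of what a volume-reducing step could look like, here is a minimal Python sketch that suppresses repeat alerts for the same constraint/resource pair inside a time window. The `AlertThrottle` class, its method, and the choice of key are hypothetical; nothing like this is confirmed to exist in Gatekeeper.

```python
import time
from typing import Dict, Optional, Tuple


class AlertThrottle:
    """Suppress repeat alerts for the same key within a time window.

    Hypothetical helper for illustration only; not a Gatekeeper API.
    """

    def __init__(self, window_seconds: float) -> None:
        self.window_seconds = window_seconds
        # Maps (constraint, resource) to the time the last alert was sent.
        self._last_sent: Dict[Tuple[str, str], float] = {}

    def should_send(self, constraint: str, resource: str,
                    now: Optional[float] = None) -> bool:
        """Return True if an alert for this pair should go out now."""
        now = time.monotonic() if now is None else now
        key = (constraint, resource)
        last = self._last_sent.get(key)
        if last is not None and now - last < self.window_seconds:
            # Same pair alerted recently: drop this one.
            return False
        self._last_sent[key] = now
        return True
```

Keying on the constraint and resource means a single noisy object cannot flood the pipeline, while a new violating object still alerts immediately.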

swapnild2111 · 2020-04-22T06:27:22Z

Hello,

I think I am waiting for the same thing. I have deployed Gatekeeper with dryrun enabled, and I can see violations in the status field. However, I am not sure how to set up alerts for these violations in Datadog / Prometheus / Slack.

Could you please help?

maxsmythe · 2020-04-22T23:10:17Z

What kind of alerts are you looking to have?

swapnild2111 · 2020-04-23T16:16:47Z

Alerts as in: send a Slack message, or show it in logs / metrics in Datadog, saying violations were found, with details.

sozercan · 2020-04-23T16:56:25Z

We can do a write-up about integrating gatekeeper metrics with prometheus and alertmanager (which includes integrations with slack, datadog and many others)

Other than violations over a certain threshold, is there anything else you would like alerts on?

swapnild2111 · 2020-04-23T17:14:51Z

That would be great :)

maxsmythe · 2020-04-24T03:48:03Z

Also, the logs can be parsed for more detailed data about rejections to alert on.

bytemare · 2020-05-06T08:16:57Z

Hello there 👋
One of the teams I'm working with has deployed OPA Gatekeeper, and we would like to do the same: monitor every policy/compliance violation, without blocking deployments yet (or the devs would kill us).

Ideally, we would need alerts sent over webhooks in json or syslog, containing all the info about the violations.

Is this possible/configurable at this moment, or planned?
I would gladly help if needed.

Thanks

maxsmythe · 2020-05-07T01:56:02Z

We are emitting audit violations via stderr/stdout logs. Are you able to pipe those into syslog/ELK/other log aggregator and use those to drive alerts?

That would probably give you the most detailed violation information.
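
For anyone wiring this up by hand, a minimal Python sketch of the log-driven approach could look like the following. It assumes the logs are JSON, one object per line, and that audit violations carry an `event_type` field equal to `violation_audited`; both the field name and the value are assumptions to check against your own log output. The function name `iter_violations` is hypothetical.

```python
import json
from typing import Dict, Iterable, Iterator


def iter_violations(lines: Iterable[str],
                    event_type: str = "violation_audited") -> Iterator[Dict]:
    """Yield parsed log records whose event_type matches.

    Assumes one JSON object per line; the "event_type" field name and
    its default value are assumptions, not confirmed in this thread.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            # Skip anything that is not structured JSON.
            continue
        if isinstance(record, dict) and record.get("event_type") == event_type:
            yield record
```

Passing `sys.stdin` as `lines` lets the same function sit at the end of a log pipe; each yielded record can then be forwarded to whatever alerting system is in use.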

swapnild2111 · 2020-05-07T03:52:05Z

What I have done is:

Enabled enforcementAction: dryrun
Added --log-denies
Added a unique log message for violations.

After this I could see violations in logs, which I am streaming to Datadog.

In Datadog, I have created charts & added monitors by tracing those unique log messages.
The things I can do with this approach are pretty limited.

If I could get dryrun_violation_count etc. in metrics, things would become much easier.

sozercan · 2020-05-07T04:16:20Z

@swapnild2111 you can get violations count, like:

gatekeeper_violations{enforcement_action="deny"} 19
gatekeeper_violations{enforcement_action="dryrun"} 7

See https://github.com/open-policy-agent/gatekeeper/blob/master/docs/Metrics.md for list of all metrics
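
If you only want a quick threshold check outside of Prometheus, here is a minimal Python sketch that reads the two sample lines above from scraped `/metrics` text. It only handles the exact single-label form shown in this comment; real output may carry extra labels, in which case a proper Prometheus client parser is the better choice. The function names and the thresholds are hypothetical.

```python
import re
from typing import Dict

# Matches only the single-label form shown above, e.g.
# gatekeeper_violations{enforcement_action="deny"} 19
_SAMPLE = re.compile(
    r'^gatekeeper_violations\{enforcement_action="(?P<action>[^"]+)"\}'
    r'\s+(?P<value>\S+)$'
)


def violation_counts(metrics_text: str) -> Dict[str, float]:
    """Return {enforcement_action: count} from exposition-format text."""
    counts: Dict[str, float] = {}
    for line in metrics_text.splitlines():
        match = _SAMPLE.match(line.strip())
        if match:
            counts[match.group("action")] = float(match.group("value"))
    return counts


def over_threshold(counts: Dict[str, float],
                   thresholds: Dict[str, float]) -> Dict[str, float]:
    """Return the actions whose count exceeds the given threshold."""
    return {
        action: value
        for action, value in counts.items()
        if action in thresholds and value > thresholds[action]
    }
```

With the sample values above and a hypothetical threshold of 10 for both actions, only `deny` would be reported.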

swapnild2111 · 2020-05-11T09:57:17Z

thank you, it worked perfectly for me :)

lechuk47 · 2020-06-12T05:45:03Z

It would be useful to have the constraint details as metric tags. For example, having the constraint name and type as tags on the violation metrics would be enough to set alerts in Prometheus.

morganwalker · 2020-08-21T01:38:05Z

@swapnild2111 how did you leverage those metrics via Datadog? While I plan on parsing the logs ingested to DD for violations, ideally I'd like to be able to use the metrics in DD.

swapnild2111 · 2020-12-15T09:37:37Z

@morganwalker sorry for the very late reply.

I have the annotations below on my deployment to send Prometheus metrics to Datadog:

ad.datadoghq.com/manager.check_names: '["prometheus"]'
ad.datadoghq.com/manager.init_configs: '[{}]'
ad.datadoghq.com/manager.instances: '[{"prometheus_url":"http://%%host%%:8888/metrics", "namespace": "gatekeeper-system", "metrics":["*"]}]'
prometheus.io/port: "8888"
prometheus.io/scrape: "true"

Would that be helpful for you?

teochenglim · 2021-07-12T01:39:52Z

Anyone working on slack yet?

Allow an optional Slack feature that can be enabled. Usually you just need 2 inputs: the "slack webhook url" and "which channel to send to". Since a webhook is used, you only need an HTTP POST to make it work, so there is not much dependency.

My suggestion is to have 2 kinds of Slack message:

1. Real time: one message per violation.
2. Every hour/day/week: a report of the occurrence count per report type. This is a bit more complex because you need to hold state, but maybe the existing Prometheus metrics can be reused?

As in many other frameworks, the Slack webhook URL could be created as a k8s secret.

Helm example:

slack:
  enable: true  ## default false
  slack_channel: ""
  slack_title: ""  ## if we monitor all security audits in a single room, a title helps to tell them apart
  slack_text_prefix: ""  ## prefix to tell which cluster the message is from, e.g. staging/production
  slack_text_subfix: ""
  slack_webhook_url: ""  ## if we don't want to create the k8s secret, just give the URL
  slack_secret_name: ""  ## name of the k8s secret where the Slack webhook URL is defined
  slack_report:
    slack_cron: "0 2 * * *"  ## minute, hour, day (month), month, day (week)
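
To show how small the HTTP POST side is, here is a minimal Python sketch using only the standard library. Slack incoming webhooks accept a JSON body with a `text` field; everything else here (the function names, and how the title and prefix from the proposed values are combined into the text) is a hypothetical illustration, not an existing Gatekeeper feature.

```python
import json
import urllib.request


def build_slack_payload(text: str, title: str = "", prefix: str = "") -> bytes:
    """Build the JSON body for a Slack incoming webhook.

    The title/prefix handling mirrors the proposed helm values above
    and is an assumption about how they might be used.
    """
    parts = [part for part in (title, f"{prefix}{text}") if part]
    return json.dumps({"text": "\n".join(parts)}).encode("utf-8")


def post_to_slack(webhook_url: str, payload: bytes, timeout: float = 5.0) -> int:
    """POST the payload to the webhook URL and return the HTTP status."""
    request = urllib.request.Request(
        webhook_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Keeping payload construction separate from the network call makes the message format easy to test without a real webhook URL.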

maxsmythe · 2021-07-12T23:04:03Z

@sozercan did we ever document alert manager integrations? That seems like it would address use case #2.

As for use case #1, that sounds similar to:

#1037
#898
The push based pipeline referenced in #897

IIRC we were also thinking about generic webhook-based reporting at some point

stale · 2022-07-23T06:17:30Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

stale · 2022-10-01T22:02:54Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

debu99 · 2023-06-15T01:43:27Z

Is this feature available now?

a-thorat · 2023-11-07T23:55:50Z

@swapnild2111 @maxsmythe
I am trying to implement violation alerting with MS Teams for the Gatekeeper Operator installed on OpenShift v4.13, but I have not been able to, as everything is coming out of the operator. Any idea how I can integrate it here?

ritazh added the enhancement (New feature or request) and help wanted (Extra attention is needed) labels Apr 21, 2020

ralgozino mentioned this issue Feb 3, 2021

Does it support exporting violations? sighupio/gatekeeper-policy-manager#108

Closed

sunstonesecure-robert mentioned this issue Sep 2, 2021

Gatekeeper Adapter kubernetes-sigs/wg-policy-prototypes#90

Closed

stale bot added the wontfix (This will not be worked on) label Jul 23, 2022

ritazh added the docs (Pure prose reporting) label and removed the wontfix (This will not be worked on) label Aug 2, 2022

ctrought mentioned this issue Aug 17, 2022

Admission Events InvolvedObject Namespace #2230

Closed

ralgozino mentioned this issue Sep 21, 2022

No any logs of violations sighupio/gatekeeper-policy-manager#433

Open

stale bot added the stale label Oct 1, 2022

ritazh added triaged and removed stale labels Oct 3, 2022

salaxander self-assigned this Feb 21, 2024

salaxander removed their assignment Apr 10, 2024
