Changelog

All notable changes from the upstream Prometheus Operator chart will be added to this file.

[Package Version 00] - 2020-07-19

Added

Added Prometheus Adapter as a dependency to the upstream Prometheus Operator chart to allow users to expose custom metrics from the default Prometheus instance deployed by this chart
Removed prometheus-operator/cleanup-crds.yaml and prometheus-operator/crds.yaml from the Prometheus Operator upstream chart in favor of using only the CRD directory to install the CRDs.
Added support for rkeControllerManager, rkeScheduler, rkeProxy, and rkeEtcd PushProx exporters for monitoring k8s components within RKE clusters
Added support for a k3sServer PushProx exporter that monitors k3s server components (kubeControllerManager, kubeScheduler, and kubeProxy) within k3s clusters
Added support for kubeAdmControllerManager, kubeAdmScheduler, kubeAdmProxy, and kubeAdmEtcd PushProx exporters for monitoring k8s components within kubeAdm clusters
Added support for rke2ControllerManager, rke2Scheduler, rke2Proxy, and rke2Etcd PushProx exporters for monitoring k8s components within rke2 clusters
Exposed prometheus.prometheusSpec.ignoreNamespaceSelectors on values.yaml and set it to false by default. When enabled, this value instructs the default Prometheus server deployed with this chart to ignore the namespaceSelector field within any ServiceMonitor or PodMonitor CRs that it selects, which prevents ServiceMonitors and PodMonitors from configuring Prometheus to scrape resources outside the namespace they are deployed in. Users who need a single ServiceMonitor / PodMonitor to monitor resources across several namespaces (such as the resources used to monitor Istio in a default installation) should not enable this option, since they would then have to create one ServiceMonitor / PodMonitor CR per namespace they would like to monitor. Relevant fields were also updated in the default README.md.
Added grafana.sidecar.dashboards.searchNamespace to values.yaml with a default value of cattle-dashboards. The namespace provided should contain all ConfigMaps with the label grafana_dashboard and will be searched by the Grafana Dashboards sidecar for updates. The namespace specified is also created along with this deployment. All default dashboard ConfigMaps have been relocated from the deployment namespace to the namespace specified
Added monitoring-admin, monitoring-edit, and monitoring-view default ClusterRoles to allow admins to assign roles to users to interact with Prometheus Operator CRs. These can be enabled by setting .Values.global.rbac.userRoles.create (default: true). In a typical RBAC setup, you might want to use a ClusterRoleBinding to bind these roles to a Subject to allow them to set up or view ServiceMonitors / PodMonitors / PrometheusRules and view Prometheus or Alertmanager CRs across the cluster. If .Values.global.rbac.userRoles.aggregateRolesForRBAC is enabled, these ClusterRoles will aggregate into the respective default ClusterRoles provided by Kubernetes
Added monitoring-config-admin, monitoring-config-edit and monitoring-config-view default Roles to allow admins to assign roles to users to be able to edit / view Secrets and ConfigMaps within the cattle-monitoring-system namespace. These can be enabled by setting .Values.global.rbac.userRoles.create (default: true). In a typical RBAC setup, you might want to use a RoleBinding to bind these roles to a Subject within the cattle-monitoring-system namespace to allow them to modify Secrets / ConfigMaps tied to the deployment, such as your Alertmanager Config Secret.
Added monitoring-dashboard-admin, monitoring-dashboard-edit and monitoring-dashboard-view default Roles to allow admins to assign roles to users to be able to edit / view ConfigMaps within the cattle-dashboards namespace. These can be enabled by setting .Values.global.rbac.userRoles.create (default: true) and deploying Grafana as part of this chart. In a typical RBAC setup, you might want to use a RoleBinding to bind these roles to a Subject within the cattle-dashboards namespace to allow them to create / modify ConfigMaps that contain the JSON used to persist Grafana Dashboards on the cluster.
Added default resource limits for Prometheus Operator, Prometheus, AlertManager, Grafana, kube-state-metrics, and node-exporter
Added a default template rancher_defaults.tmpl to AlertManager that Rancher will offer to users in order to help configure the way alerts are rendered on a notifier. Also updated the default template deployed with this chart to reference that template and added an example of a Slack config using this template as a comment in the values.yaml.
Added support for private registries by introducing a new global.cattle.systemDefaultRegistry field that, if supplied, is automatically prepended to every image used by the chart.
Added a default nginx proxy container deployed with Grafana whose config is set in the ConfigMap located in charts/grafana/templates/nginx-config.yaml. The purpose of this container is to make it possible to view Grafana's UI through a proxy that has a subpath (e.g. Rancher's proxy). This proxy container is set to listen on port 8080 (with a portName of nginx-http instead of the default service), which is also where the Grafana service will now point to, and will forward all requests to the Grafana container listening on the default port 3000.
Added a default nginx proxy container deployed with Prometheus whose config is set in the ConfigMap located in templates/prometheus/nginx-config.yaml. The purpose of this container is to make it possible to view Prometheus's UI through a proxy that has a subpath (e.g. Rancher's proxy). This proxy container is set to listen on port 8081 (with a portName of nginx-http instead of the default web), which is also where the Prometheus service will now point to, and will forward all requests to the Prometheus container listening on the default port 9090.
Added support for passing CIS Scans in a hardened cluster by introducing a Job that patches the default service account within the cattle-monitoring-system and cattle-dashboards namespaces on install or upgrade and adding a default allow all NetworkPolicy to the cattle-monitoring-system and cattle-dashboards namespaces.
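As a rough illustration, several of the values introduced above could be set together in a values.yaml override like the sketch below. The key paths and the defaults shown for ignoreNamespaceSelectors, searchNamespace, and userRoles.create are taken from the entries above. The registry hostname is a hypothetical placeholder, and the aggregateRolesForRBAC value is only an example because its default is not stated in this changelog.

```yaml
# Sketch of a values.yaml override; not the chart's full set of defaults.
global:
  cattle:
    # Hypothetical private registry; when set, it is prepended to every image.
    systemDefaultRegistry: registry.example.com
  rbac:
    userRoles:
      create: true                 # default stated in this changelog
      aggregateRolesForRBAC: true  # example value; default not stated here
prometheus:
  prometheusSpec:
    ignoreNamespaceSelectors: false  # default stated in this changelog
grafana:
  sidecar:
    dashboards:
      searchNamespace: cattle-dashboards  # default stated in this changelog
```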

Modified

Updated the chart name from prometheus-operator to rancher-monitoring and added the io.rancher.certified: rancher annotation to Chart.yaml
Modified the default node-exporter port from 9100 to 9796
Modified the default nameOverride to rancher-monitoring. This change is necessary as the Prometheus Adapter's default URL (http://{{ .Values.nameOverride }}-prometheus.{{ .Values.namespaceOverride }}.svc) is based off of the value used here; if modified, the default Adapter URL must also be modified
Modified the default namespaceOverride to cattle-monitoring-system. This change is necessary as the Prometheus Adapter's default URL (http://{{ .Values.nameOverride }}-prometheus.{{ .Values.namespaceOverride }}.svc) is based off of the value used here; if modified, the default Adapter URL must also be modified
Configured some default values for grafana.service values and exposed them in the default README.md
Changed the default namespaces of the following ServiceMonitors from the deployment namespace to the namespaces listed below so that they continue to monitor metrics when prometheus.prometheusSpec.ignoreNamespaceSelectors is enabled:
- core-dns: kube-system
- api-server: default
- kube-controller-manager: kube-system
- kubelet: {{ .Values.kubelet.namespace }}
Disabled the following deployments by default (can be enabled if required):
- AlertManager
- kube-controller-manager metrics exporter
- kube-etcd metrics exporter
- kube-scheduler metrics exporter
- kube-proxy metrics exporter
Updated default Grafana deploymentStrategy to Recreate to prevent deployments from being stuck on upgrade if a PV is attached to Grafana
Modified the default <serviceMonitor|podMonitor|rule>SelectorNilUsesHelmValues to default to false. As a result, Prometheus selects all CRs in all namespaces by default, regardless of their labels, rather than only the ones tagged with the label release: rancher-monitoring.
Modified the default images used by the rancher-monitoring chart to point to Rancher mirrors of the original images from upstream.
Modified the behavior of the chart to create the Alertmanager Config Secret via a pre-install hook instead of using the normal Helm lifecycle to manage the secret. Because the secret is only created on a helm install, changes made to the Config Secret on a live cluster are never overwritten by a helm upgrade. If you would like the secret to be cleaned up on a helm uninstall, enable alertmanager.cleanupOnUninstall; this is disabled by default to prevent the loss of alerting configuration on an uninstall.
Modified the default securityContext for Pod templates across the chart to {"runAsNonRoot": "true", "runAsUser": "1000"} and replaced grafana.rbac.pspUseAppArmor with grafana.rbac.pspAnnotations={} in order to make it possible to deploy this chart on a hardened cluster which does not support Seccomp or AppArmor annotations in PSPs. Users can always choose to specify the annotations they want to use for the PSP directly as part of the values provided.
Modified .Values.prometheus.prometheusSpec.containers to take in a string representing a template that should be rendered by Helm (via tpl) instead of allowing a user to provide YAML directly.
Modified the default Grafana configuration to auto assign users who access Grafana to the Viewer role and enable anonymous access to Grafana dashboards by default. This default works well for a Rancher user who is accessing Grafana via the kubectl proxy on the Rancher Dashboard UI since anonymous users who enter via the proxy are authenticated by the k8s API Server, but you can / should modify this behavior if you plan on exposing Grafana in a way that does not require authentication (e.g. as a NodePort service).
Modified the default Grafana configuration to add a default dashboard for Rancher on the Grafana home page.
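For orientation, a few of the modified defaults above might look roughly like this in values.yaml. This is a sketch rather than the chart's actual file: nameOverride, namespaceOverride, alertmanager.cleanupOnUninstall, and grafana.rbac.pspAnnotations are named in the entries above, while the alertmanager.enabled toggle and the placement of the SelectorNilUsesHelmValues keys under prometheus.prometheusSpec are assumptions based on the upstream Prometheus Operator chart layout.

```yaml
# Sketch only; key placement marked "assumed" follows the upstream chart layout.
nameOverride: rancher-monitoring
namespaceOverride: cattle-monitoring-system
alertmanager:
  enabled: true              # assumed upstream toggle; re-enables AlertManager
  cleanupOnUninstall: false  # default: keep the Config Secret on uninstall
prometheus:
  prometheusSpec:
    # Assumed location of the <serviceMonitor|podMonitor|rule> selector flags.
    serviceMonitorSelectorNilUsesHelmValues: false
    podMonitorSelectorNilUsesHelmValues: false
    ruleSelectorNilUsesHelmValues: false
grafana:
  rbac:
    pspAnnotations: {}       # supply PSP annotations here if the cluster supports them
```

If nameOverride or namespaceOverride is changed from these defaults, the Prometheus Adapter URL described above has to be updated to match.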
