As part of the Helm CRD installation process shown [here](43853ea772/pkg/action/install.go (L160-L162)), charts that both attempt to install CRDs and then install resources (e.g. `len(totalItems) > 0`) invalidate the cache and attempt to get the CRDs again to `Give time for the CRD to be recognized.` before continuing the Helm installation.
Because Monitoring V2 currently does not observe this invalidation and wait, flakey race conditions documented in issues like https://github.com/rancher/rancher/issues/32025 and https://github.com/rancher/rancher/issues/29171 can be caused. This commit introduces that invalidation process.
This is required due to the change released in v0.1.0 of the charts-build-scripts to switch the package used for YAML marshalling / unmarshalling.
Note: This does not need a packageVersion bump since it has already been bumped once for this release.
(cherry picked from commit 7ec1716f66)
This commit adds an nginx proxy in front of Prometheus that intercepts the requests that are sent to the Prometheus pod. This change was necessary since the Prometheus pod encounters issues with Rancher proxy URLs that are formatted like:
```
<server-url>/k8s/clusters/<cluster-id>/api/v1/namespaces/cattle-monitoring-system/services/http:rancher-monitoring-prometheus:9090/proxy
```
Specifically, if using the root_url option, it doubles up this URL when making requests for resources, e.g.
```
<server-url>/k8s/clusters/<cluster-id>/api/v1/namespaces/cattle-monitoring-system/services/http:rancher-monitoring-prometheus:9090/proxy/<my-path>
=>
<server-url>/k8s/clusters/<cluster-id>/api/v1/namespaces/cattle-monitoring-system/services/http:rancher-monitoring-prometheus:9090/proxy/k8s/clusters/<cluster-id>/api/v1/namespaces/cattle-monitoring-system/services/http:rancher-monitoring-prometheus:9090/proxy/<my-path>
```
However, this does not resolve the issue in https://github.com/rancher/rancher/issues/29068.
This commit adds an nginx proxy in front of Grafana that intercepts the requests that are sent to the Grafana pod. This change was necessary since the Grafana pod encounters issues with Rancher proxy URLs that are formatted like:
```
<server-url>/k8s/clusters/<cluster-id>/api/v1/namespaces/cattle-monitoring-system/services/http:rancher-monitoring-grafana:80/proxy
```
Specifically, if using the root_url option, it doubles up this URL when making requests for resources, e.g.
```
<server-url>/k8s/clusters/<cluster-id>/api/v1/namespaces/cattle-monitoring-system/services/http:rancher-monitoring-grafana:80/proxy/<my-path>
=>
<server-url>/k8s/clusters/<cluster-id>/api/v1/namespaces/cattle-monitoring-system/services/http:rancher-monitoring-grafana:80/proxy/k8s/clusters/<cluster-id>/api/v1/namespaces/cattle-monitoring-system/services/http:rancher-monitoring-grafana:80/proxy/<my-path>
```
This commit adds support for deploying rancher-monitoring into hardened clusters.
It modifies some of the default securityContexts and does some misc. fixes such as:
- Removing default AppArmor PSP annotations from Grafana (related to https://github.com/helm/charts/issues/9090)
- Modifying rkeScheduler and rkeControllerManager to use localhost to scrape components since the endpoints aren't exposed in a hardened cluster
These changes have been verified on a hardened RKE cluster.
This commit renames `grafana-dashboards` to `cattle-dashboards` and deprecates the `grafana-datasources` namespace in favor of the normal release namespace.
Related Issue: rancher/rancher#28887
- `rke2-scheduler`
- `rke2-controller-manager`
- `rke2-proxy`
- `rke2-etcd`
All exporters are created from the cattle-pushprox chart. This commit
also modifies the relevant Grafana Dashboard ConfigMaps and
PrometheusRules to deploy if the PushProx exporters are enabled.
See changes to `overlay/CHANGELOG.md` for details on what has been added/modified.
This commit sets the following field to false:
`<serviceMonitor|podMonitor|rule>SelectorNilUsesHelmValues: true`
As a result, we look for all CRs with any labels in all namespaces rather than just
the ones tagged with `release: rancher-monitoring`.
This commit adds NoExecute / NoSchedule tolerations by default to all of the
PushProx exporters since the default expectation when deploying these exporters
is that they are deployed on the expected nodes based on nodeSelector labels
regardless of any taints added to those nodes.
Users can always choose to override these settings if necessary.
- Moves `monitoringRole` settings into `global.rbac` in values.yaml
- Moves user ClusterRoles into one file: `rancher-monitoring/clusterroles.yaml`
- Reformats user ClusterRoles format to look like upstream format
- Enables aggregateRolesForRBAC by default
- Updates README.md and CHANGELOG.md for relevant ClusterRole changes
Removes the code that supports the Helm 2 hack for crd/ (`prometheus-operator/cleanup-crds.yaml` and
`prometheus-operator/crds.yaml`) and removes crd-install hooks from crds.
Also updates the README.md and CHANGELOG.md accordingly.
This commit adds script changes to automatically allow packages to split
the CRD components located in a crd/ directory into a separate package.
It also automatically adds in a validation yaml helper to the main package
to prevent a user from installing the base package without installing the crd
install package first.
Any package can enable the creation of a separate crd package by just adding
`splitCRDsIntoSeparatePackage: true` into the package.yaml, as shown in the
rancher-monitoring chart.