
Fluentd log is full of backslashes and Kibana doesn't show k8s pod logs #2545


Open
avarf opened this issue Aug 6, 2019 · 17 comments
Labels
bug (Something isn't working)

Comments

@avarf

avarf commented Aug 6, 2019

Describe the bug
I set up an EFK stack for gathering logs from my different k8s pods, based on this tutorial: https://mherman.org/blog/logging-in-kubernetes-with-elasticsearch-Kibana-fluentd/, on a MicroK8s single-node cluster. Everything is up and working and I can connect Kibana to Elasticsearch and see the indexes, but in the Discover section of Kibana there are no logs related to my pods, only kubelet logs.

When I checked the fluentd logs, I saw that they are full of backslashes:

2019-08-05 15:23:17 +0000 [warn]: #0 [in_tail_container_logs] pattern not match: "2019-08-05T17:23:10.167379794+02:00 stdout P 2019-08-05 15:23:10 +0000 [warn]: #0 [in_tail_container_logs] pattern not match: \"2019-08-05T17:23:07.09726655+02:00 stdout P 2019-08-05 15:23:07 +0000 [warn]: #0 [in_tail_container_logs] pattern not match: \\\"2019-08-05T17:23:04.433817307+02:00 stdout P 2019-08-05 15:23:04 +0000 [warn]: #0 [in_tail_container_logs] pattern not match: \\\\\\\"2019-08-05T17:22:52.546188522+02:00 stdout P 2019-08-05 15:22:52 +0000 [warn]: #0 [in_tail_container_logs] pattern not match: \\\\\\\\\\\\\\\"2019-08-05T17:22:46.694679863+02:00 stdout F \\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\

There are many more backslashes, but I copied only this much to show the log.

Your Environment

  • Fluentd or td-agent version: I tested this with two images, fluent/fluentd-kubernetes-daemonset:v1.4-debian-elasticsearch and also v1.3, but the results were the same
  • Operating system: Ubuntu 18.04, but fluentd is running in a container in a single-node Kubernetes cluster on MicroK8s

Your Configuration
Based on the tutorial mentioned earlier, I am using two config files to set up fluentd:

  1. fluentd-rbac.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluentd
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: fluentd
  namespace: kube-system
rules:
- apiGroups:
  - ""
  resources:
  - pods
  - namespaces
  verbs:
  - get
  - list
  - watch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: fluentd
roleRef:
  kind: ClusterRole
  name: fluentd
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: ServiceAccount
  name: fluentd
  namespace: kube-system
  2. fluentd-daemonset.yaml
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: kube-system
  # namespace: default
  labels:
    k8s-app: fluentd-logging
    version: v1
    kubernetes.io/cluster-service: "true"
spec:
  template:
    metadata:
      labels:
        k8s-app: fluentd-logging
        version: v1
        kubernetes.io/cluster-service: "true"
    spec:
      serviceAccount: fluentd
      serviceAccountName: fluentd
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      containers:
      - name: fluentd
        image: fluent/fluentd-kubernetes-daemonset:v1.4-debian-elasticsearch
        env:
          - name:  FLUENT_ELASTICSEARCH_HOST
            value: "elasticsearch.logging"
          - name:  FLUENT_ELASTICSEARCH_PORT
            value: "9200"
          - name: FLUENT_ELASTICSEARCH_SCHEME
            value: "http"
          - name: FLUENT_UID
            value: "0"
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      terminationGracePeriodSeconds: 30
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
@avarf avarf added the bug label Aug 6, 2019
@repeatedly
Member

When I checked the fluentd logs, I saw that they are full of backslashes:

This log is a single line, right? If so, it seems several logs are merged into one.
Could you share a configuration/application example to reproduce the problem?

@avarf
Author

avarf commented Aug 8, 2019

No, the log is not a single line: there are single lines of actual log and then pages of backslashes, but I didn't want to copy all the meaningless backslashes, and when I searched for an "error" there wasn't any.
Regarding the configuration, you already have all of it: I followed that tutorial, used the image and the environment variables that you can see in the yaml files, and ran it on MicroK8s on Ubuntu 18.04.

@codesqueak

Any progress on this issue? I seem to have hit exactly the same problem.

My setup is slightly different, using

image: fluent/fluentd-kubernetes-daemonset:v1-debian-elasticsearch

but otherwise substantially the same.

Looking at the logs, fluentd appears to be repeatedly reprocessing the same information: it objects to the format, which generates a new, longer log entry, which is then reprocessed... and around we go.
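
To make the loop concrete, here is a schematic sketch (illustrative, not copied from a real log) of how each pass wraps the previous warning in quotes and escapes it, so the backslashes in front of each nested quote grow as 1, 3, 7, 15, ... (2^n - 1), which matches the escalation visible in the excerpt in the original report:

pass 1: [warn]: pattern not match: "... stdout P <app log line>"
pass 2: [warn]: pattern not match: "... pattern not match: \"... stdout P <app log line>\""
pass 3: [warn]: pattern not match: "... pattern not match: \"... pattern not match: \\\"<app log line>\\\"\""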

@ebastos

ebastos commented Dec 9, 2019

I have the same problem after following this tutorial, but using k3s as my Kubernetes distribution.

If I strip the backslashes I can see something like:

# kubectl logs --tail=5 fluentd-48jkv -n kube-logging |tr -s "\\"
tr: warning: an unescaped backslash at end of string is not portable
\"
2019-12-09 20:23:29 +0000 [warn]: #0 [in_tail_container_logs] pattern not match: "2019-12-09T20:23:24.66350503Z stdout F \"\"\""
2019-12-09 20:23:29 +0000 [warn]: #0 [in_tail_container_logs] pattern not match: "2019-12-09T20:23:24.664147887Z stdout P 2019-12-09 20:23:24 +0000 [warn]: #0 [in_tail_container_logs] pattern not match: \"2019-12-09T20:23:21.243596958Z stdout P 2019-12-09 20:23:21 +0000 [warn]: #0 [in_tail_container_logs] pattern not match: \"2019-12-09T20:23:07.807619666Z stdout P 2019-12-09 20:23:07 +0000 [warn]: #0 [in_tail_container_logs] pattern not match: \"2019-12-09T20:23:01.152628152Z stdout F \"

But otherwise it's not even possible to see what is going on:

# kubectl logs --tail=5 fluentd-48jkv -n kube-logging |egrep -o '\\'|wc -l
32650

My fluentd.yaml is as follows:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluentd
  namespace: kube-logging
  labels:
    app: fluentd
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fluentd
  labels:
    app: fluentd
rules:
- apiGroups:
  - ""
  resources:
  - pods
  - namespaces
  verbs:
  - get
  - list
  - watch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: fluentd
roleRef:
  kind: ClusterRole
  name: fluentd
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: ServiceAccount
  name: fluentd
  namespace: kube-logging
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: kube-logging
  labels:
    app: fluentd
spec:
  selector:
    matchLabels:
      app: fluentd
  template:
    metadata:
      labels:
        app: fluentd
    spec:
      serviceAccount: fluentd
      serviceAccountName: fluentd
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      containers:
      - name: fluentd
        image: fluent/fluentd-kubernetes-daemonset:v1.4.2-debian-elasticsearch-1.1
        env:
          - name:  FLUENT_ELASTICSEARCH_HOST
            value: "elasticsearch.kube-logging.svc.cluster.local"
          - name:  FLUENT_ELASTICSEARCH_PORT
            value: "9200"
          - name: FLUENT_ELASTICSEARCH_SCHEME
            value: "http"
          - name: FLUENTD_SYSTEMD_CONF
            value: disable
        resources:
          limits:
            memory: 512Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      terminationGracePeriodSeconds: 30
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers

@etegan

etegan commented Jan 13, 2020

Same issue. Does anyone have a solution for this?

@neoxack

neoxack commented Jan 30, 2020

Same issue \\\\

@johnfedoruk

If your fluentd logs are growing in backslashes, then your fluentd container is parsing its own logs and recursively generating new logs.

Consider creating a fluentd-config.yaml file that is set up to ignore /var/log/containers/fluentd* logs. My example here will help you parse Apache logs... RTFM for more information on configuring sources.

Here is my fluentd-config.yaml file:

kind: ConfigMap
apiVersion: v1
metadata:
  name: fluentd-config
  namespace: kube-logging
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
data:
  containers.input.conf: |-
    <source>
      @type tail
      @id in_tail_container_logs
      path /var/log/containers/*.log
      exclude_path ["/var/log/containers/fluentd*"]
      pos_file /var/log/fluentd-containers.log.pos
      tag kubernetes.*
      read_from_head true
      format /^.* (?<source>(stderr|stdout))\ F\ (?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^ ]*) +\S*)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$/
      time_format %d/%b/%Y:%H:%M:%S %z
    </source>
  output.conf: |-
    # Enriches records with Kubernetes metadata
    <filter kubernetes.**>
      type kubernetes_metadata
    </filter>
    <match **>
       type elasticsearch
       log_level info
       include_tag_key true
       host elasticsearch.kube-logging.svc.cluster.local
       port 9200
       logstash_format true
       # Set the chunk limits.
       buffer_chunk_limit 2M
       buffer_queue_limit 8
       flush_interval 5s
       # Never wait longer than 5 minutes between retries.
       max_retry_wait 30
       # Disable the limit on the number of retries (retry forever).
       disable_retry_limit
       # Use multiple threads for processing.
       num_threads 2
    </match>

Then you will want to update your fluentd DaemonSet. I have had success with the gcr.io/google-containers/fluentd-elasticsearch:v2.0.1 image. Attach your fluentd-config to your fluentd DaemonSet.

Here's what that looks like:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: kube-logging
  labels:
    app: fluentd
spec:
  selector:
    matchLabels:
      app: fluentd
  template:
    metadata:
      labels:
        app: fluentd
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ''
    spec:
      serviceAccount: fluentd
      serviceAccountName: fluentd
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      containers:
      - name: fluentd
        image: gcr.io/google-containers/fluentd-elasticsearch:v2.0.1
        env:
          - name: FLUENTD_SYSTEMD_CONF
            value: "disable"
          - name: FLUENTD_ARGS
            value: "--no-supervisor -q"
        resources:
          limits:
            memory: 512Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlogcontainers
          mountPath: /var/log/containers
          readOnly: true
        - name: config
          mountPath: /etc/fluent/config.d
      terminationGracePeriodSeconds: 30
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlogcontainers
        hostPath:
          path: /var/log/containers/
      - name: config
        configMap:
          name: fluentd-config

Best of luck!
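
For the fluent/fluentd-kubernetes-daemonset images used earlier in the thread, the same self-exclusion can also be achieved without a custom ConfigMap, via an environment variable (this is the approach later comments in this thread settle on):

        env:
          # Keep fluentd from tailing its own container logs
          - name: FLUENT_CONTAINER_TAIL_EXCLUDE_PATH
            value: /var/log/containers/fluent*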


@lecaros

lecaros commented Oct 20, 2020

I see 2 possible concurrent causes:

  1. You're not excluding fluentd logs (hence the numerous '\' and the circular log messages)
  2. k3s will prefix the log lines with datetime, stream (stdout, stderr) and a log tag. So, if your message is "hello Dolly", k3s will save it to the file as:
    2020-10-20T18:05:39.163671864-05:00 stdout F "hello Dolly"

The pattern not match warnings explain why Kibana doesn't show any of your messages: they are never sent to your Elasticsearch service.

Having a proper filter/parser would help here.
Can you post your fluentd conf?
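
For reference, the mismatch comes down to the on-disk log format. The default daemonset configuration parses Docker's json-file driver output, while CRI runtimes (containerd under k3s or MicroK8s) write a plain-text prefix. Illustrative lines, not taken from an actual cluster:

Docker json-file:  {"log":"hello Dolly\n","stream":"stdout","time":"2020-10-20T18:05:39.163671864Z"}
CRI (containerd):  2020-10-20T18:05:39.163671864-05:00 stdout F hello Dolly

The JSON parser fails on the CRI line, so every such line produces a pattern not match warning instead of being shipped to Elasticsearch.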

@mariusgrigoriu

Is there a good way to ship fluentd's own logs as well, if possible?

@mickdewald

I ran into this issue as well because I was using containerd instead of Docker. I solved it by adding the following configuration:

- name: FLUENT_CONTAINER_TAIL_PARSER_TYPE
  value: /^(?<time>.+) (?<stream>stdout|stderr) [^ ]* (?<log>.*)$/

@108356037

@micktg
Your solution fixed my problem! Much appreciation!

@repeatedly
Member

For the latest images, using the cri parser is better than a regexp: https://github.com/fluent/fluentd-kubernetes-daemonset#use-cri-parser-for-containerdcri-o-logs
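
Based on that README, on images where the cri parser is available it can be selected through the same environment variable used in the comments above (a minimal sketch following that documentation):

        env:
          # Use the built-in CRI parser instead of a hand-written regexp
          - name: FLUENT_CONTAINER_TAIL_PARSER_TYPE
            value: "cri"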

@varungupta19

I followed a DigitalOcean tutorial https://www.digitalocean.com/community/tutorials/how-to-set-up-an-elasticsearch-fluentd-and-kibana-efk-logging-stack-on-kubernetes to set up my EFK stack for Kubernetes and faced the same issue. The answer above by @micktg resolved it. I added the lines below to the environment variables in my fluentd yaml file, so now my environment variables look like this:

    env:
      - name:  FLUENT_ELASTICSEARCH_HOST
        value: "elasticsearch.kube-logging.svc.cluster.local"
      - name:  FLUENT_ELASTICSEARCH_PORT
        value: "9200"
      - name: FLUENT_ELASTICSEARCH_SCHEME
        value: "http"
      - name: FLUENTD_SYSTEMD_CONF
        value: disable
      - name: FLUENT_CONTAINER_TAIL_EXCLUDE_PATH
        value: /var/log/containers/fluent*
      - name: FLUENT_CONTAINER_TAIL_PARSER_TYPE
        value: /^(?<time>.+) (?<stream>stdout|stderr)( (?<logtag>.))? (?<log>.*)$/ 

@rawinng

rawinng commented Aug 13, 2021

I found that @micktg's and @varungupta19's answers solve the problem.

@jsvasquez

Thanks, @micktg and @varungupta19. Problem solved.

@Athuliva

Athuliva commented Apr 6, 2024

(quoting @varungupta19's earlier comment and its env block in full)

Adding value: /^(?<time>.+) (?<stream>stdout|stderr)( (?<logtag>.))? (?<log>.*)$/ didn't help me. I am trying to implement this on MicroK8s.

openstack-mirroring pushed a commit to openstack/openstack-helm-infra that referenced this issue Jul 9, 2024
+ prevent Fluentd from parsing its own logs and fix an issue with
  endless backslashes (fluent/fluentd#2545)
+ increase chunk limit size
+ add storage for systemd plugin configuration
+ add pos_file parameter for the tail sources

Change-Id: I7d6e54d2324e437c92e5e8197636bd6c54419167
openstack-mirroring pushed a commit to openstack/openstack that referenced this issue Jul 9, 2024
* Update openstack-helm-infra from branch 'master'
  to 01e66933b3c2b93c6677c04a00361ceeb70a9634
  - [fluentd] Adjust configuration for v1.15
    
    + prevent Fluentd from parsing its own logs and fix an issue with
      endless backslashes (fluent/fluentd#2545)
    + increase chunk limit size
    + add storage for systemd plugin configuration
    + add pos_file parameter for the tail sources
    
    Change-Id: I7d6e54d2324e437c92e5e8197636bd6c54419167
openstack-mirroring pushed a commit to openstack/openstack that referenced this issue Mar 18, 2025
* Update openstack-helm from branch 'master'
  to b95bb6767865077c0ccec867d52429e8549bcdb8
  - Merge remote-tracking branch 'openstack-helm-infra/master'
    
    The openstack-helm and openstack-helm-infra are merged into one. See
    https://lists.openstack.org/archives/list/[email protected]/thread/YRWSN6X2MTVGFPCULJ344RSDMCQDO7ZG/
    for the discussion which led up to this.
    
  - Prepare for upcoming merger with OSH
    
    * Remove unnecessary scripts
    * Sync identical scripts
    * Update chart_version.sh so to make it identical with OSH
    * Sync README.rst, tox.ini, .gitignore, Makefile
    * Rename some files to prevent merge conflicts
    * Sync releasenotes config
    
    Change-Id: Ibfdebcb62a416fc1b15989a1fd89b897a783d8f4
    
  - Move docs to the openstack-helm repo
    
    This is to prepare for the upcoming merger
    to the openstack-helm repo and to reduce
    the number of merge conflicts.
    
    Depends-On: I6a4166f5d4d69279ebd56c66f74e2cbc8cbd17dd
    Change-Id: I3cb3f2c44d8401e1d0de673bf83f8e294433b8df
    
  - Merge "Do not install reno globally using pip"
  - [ceph-rook] Skip check iteration if TOOLS_POD empty
    
    When we deploy Ceph cluster using Rook we
    check its status in a loop using Rook tools pod.
    If tools pod is not found, let's skip the iteration.
    
    Change-Id: Ib6bc90034961f89b8bb53081db5bf03c4d736110
    
  - Do not install reno globally using pip
    
    It does not work on Ubuntu Noble which requires using virtual
    env when you try to install packages using pip. Also
    for deployment tests reno is not needed because we build
    charts with SKIP_CHANGELOG=1
    
    Change-Id: I8f0578ed2e1d0e757add155c618eea2e8a2e30d2
    
  - [deploy-env] Do not use kubernetes.core ansible module
    
    It fails with the error: `No module named 'yaml'`.
    So let's use native helm with the ansible shell instead.
    
    Change-Id: If652d603cfcaeb0b70c9b566b90d98e627d3bada
    
  - Remove ceph repos from deploy-env role
    
    Also do not install ceph-common package on the
    test nodes.
    
    Change-Id: Ia33f2e7f26f3ccaec4863a22702946b4383d39a5
    
  - Update ceph-rook.sh script
    
    While we are waiting for Ceph cluster to be ready
    we check Ceph status in a loop using tools pod provided by Rook.
    We have to get this tools pod name every iteration
    within the loop because K8s can terminate the pod for
    some reason and this is expected behavior.
    
    Change-Id: Iabb98e94d7470fe996091bf77787637f3e8f4798
    
  - Merge "Add ingress deployment to deploy-env role"
  - Merge "Cleanup FEATURE_GATES env var"
  - Add ingress deployment to deploy-env role
    
    Change-Id: I910862d391650c443c6f0e352b3687120af14a91
    
  - Cleanup FEATURE_GATES env var
    
    A while ago we changed the way how we define features
    for the test jobs and this variable is not needed any more
    
    Also change keystone jobs so they don't deploy
    Rook but instead use local volumes.
    
    Depends-On: Ib4afe58b27cd255ce844626b1eee5ecc82e3aeb3
    Change-Id: Ia8971bd8c3723542a275c9658db7f9a5bb943f92
    
  - Update jobs
    
    * Make TLS job non-voting because it is unstable and
      we don't want it to block the gate.
    * Ceph migration job is not very important now. Let's
      run it in a periodic-weekly pipeline.
    
    Change-Id: Iadb67e1c5218794d15e60538abb2e869ae7e67c0
    
  - Merge "Remove resource limits for Rook"
  - Remove resource limits for Rook
    
    Change-Id: I857f75974a2ba0e3374fb46e06c7bce7fa04980c
    
  - Cleanup old scripts
    
    The env-variables.sh get-values-overrides.sh
    and wait-for-pods.sh are not needed any more
    since they are now a part of openstack-helm-plugin.
    
    Change-Id: I044ee7e7182822a9d7e5fd3e56c444fbfea9a753
    
  - helm-toolkit: always add pod mounts for db-sync job
    
    Always include mounts defined for the db-sync job under the pods section
    rather than requiring every chart to pass it in explicitly. Now the
    passed in value can be just for overrides. Since charts today already
    pass this in, we need to de-duplicate it to ensure we don't create this
    multiple times.
    
    Change-Id: I042e79cee7859ebdc001a056edc75eb89dd3e5b3
    
  - Rolling update on secret changes
    
    Change-Id: If1bb0218eb70a2bed55f18b9fb6dd36ea042286c
    
  - [ceph] Update Ceph and Rook
    
    This change updates all of the charts that use Ceph images to use
    new images based on the Squid 19.2.1 release.
    
    Rook is also updated to 1.16.3 and is configured to deploy Ceph
    19.2.1.
    
    Change-Id: Ie2c0353a4bfa181873c98ce5de655c3388aa9574
    
  - Merge "Ceph migration gate improvement"
  - Ceph migration gate improvement
    
    This change addresses instability in the Ceph migration gate job
    by adding extra nodes to the cluster.
    
    Change-Id: Id60d61274a42f87280748f0b4b9c0c3c7adb7357
    
  - [k8s,ceph,docker] Apt repository filename cleanup
    
    The prerequisites and containerd tasks that add the Kubernetes,
    Ceph, and Docker repos to apt list the filenames as
    kubernetes.list, ceph.list, and docker.list, which results in the
    files created under /etc/apt/sources.list.d being named with
    '.list.list' extensions. This change simply removes the redundant
    '.list' from the filenames to clean that up.
    
    Change-Id: I3672873149d137ad89c176cabad4c64dcff2bfee
    
  - Merge "Add OVN network logging parser"
  - Add OVN network logging parser
    
    Change-Id: I03a1c600c161536e693743219912199fabc1e5a5
    
  - [openvswitch] Make --user flag optional
    
    Add the ability to run the OVS server as root
    since the following change lacks backward compatibility:
    https://review.opendev.org/c/openstack/openstack-helm-infra/+/939580
    
    Change-Id: I071f77be0d329fbe98ce283324466bf129fe190d
    
  - [deploy-env] Install reno for OSH jobs
    
    This commit requires reno installed on the system:
    https://review.opendev.org/c/openstack/openstack-helm/+/940142
    
    Change-Id: I874fabc0199229d8e05d1e0bb2626d7630c06a12
    
  - Generate CHANGELOG.md for charts
    
    We use reno>=4.1.0 features to combine release notes from
    a bunch of release notes files. Reno uses git tags
    to figure out which release notes files to include
    to a given release.
    
    When building charts for deployment tests we skip
    generating CHANGELOG.md files.
    
    Change-Id: I2f55e76844afa05139a5c4b63ecb6c0ae2bcb5b2
    
  - Merge "Revert "Temporarily disable voting for ovn job""
  - Merge "Ceph rook gates improvement"
  - Ceph rook gates improvement
    
    This patchset fixes the instability of the
    ceph-rook gates by adding extra nodes to the
    cluster.
    
    Also improved ceph deployment process monitoring.
    
    Change-Id: I405e501afc15f3974a047475a2b463e7f254da66
    
  - Merge "ovn: implement Daemonset overrides"
  - Revert "Temporarily disable voting for ovn job"
    
    This reverts commit 7c6c32038d6e49bd8188d02199f054002266432d.
    
    Reason for revert: OVN issue is resolved
    
    Change-Id: I2425701e6075335433b90c949bac444fcebe3ac9
    
  - Merge "Run ovn controller with non root openvswitch user"
  - Run ovn controller with non root openvswitch user
    
    We recently updated the openvswitch chart to run
    ovs db server as non root.
    
    See: https://review.opendev.org/c/openstack/openstack-helm-infra/+/939580
    
    Also ovn-kubernetes script ovnkube.sh that we are using for
    lifecycle management of OVN components tries to update the
    ownership of OVS run and config directories before start.
    
    So we have to pass the correct username to the script
    so it does not break the OVS files permissions.
    
    Change-Id: Ie00dd2657c616645ec237c0880bbc552b3805236
    
  - Ensure python and pip installed for lint and build chart jobs
    
    Change-Id: I7819d67894eff03e57fe1c22f02e167a6c63b346
    
  - Update create db user queries
    
    This commit changes the queries to use % instead
    of %% in the Host field of CREATE USER and GRANT
    ALL statements.
    
    It also uplifts fresh jammy images for mariadb.
    
    Change-Id: I6779f55d962bc9d8efc3b3bfe05b72cbe0b7f863
    
  - Temporarily disable voting for ovn job
    
    OVN jobs is failing due to recent changes:
    https://review.opendev.org/c/openstack/openstack-helm-infra/+/939580
    https://review.opendev.org/c/openstack/openstack-helm-images/+/939589
    
    This is to unblock unrelated PRs.
    
    Change-Id: Id6f411c8ddf819e3f96401995afe5fcdca2386af
    
  - Merge "update openvswitch to run with non-root user"
  - Merge "[memcached] Expose exporter port via service"
  - Install reno>=4.1.0 on test env nodes
    
    This is needed to generate CHANGELOG.md files from
    release nodes while building chart tarballs.
    
    Change-Id: I3c52f4ace6770515d64bfdf4433d27fd3a674eb0
    
  - [memcached] Expose exporter port via service
    
    Pods may be discovered via prometheus endpoint scraper [0]
    expose exporter port via service to have ability to scrape over endpoints.
    
    [0] https://prometheus.io/docs/prometheus/latest/configuration/configuration/#endpoints
    
    Change-Id: I59a4472f13753db0ff2dc48559dd644d2648d97e
    
  - ovn: implement Daemonset overrides
    
    Change-Id: I2735748a200071c9488810456b8cccfc3bb2cff6
    
  - Merge "Add OVN Kubernetes support"
  - update openvswitch to run with non-root user
    
    Change-Id: I27a0927fb8b01b4eb997e8e7b840adc7a9e56d26
    
  - Merge "set hugepage mount point permission for nova when using dpdk"
  - Merge "Add release note template"
  - set hugepage mount point permission for nova when using dpdk
    
    Change-Id: Ic4b6e8aac5a4c6b6398e5ef03fa9608c43f766ed
    
  - Add OVN Kubernetes support
    
    This patch introduce OVN Kubernetes support.
    With OVN Kubernetes (https://github.com/ovn-org/ovn-kubernetes)
    OVN services control gets more native in Kubernetes way.
    
    At this point we only use OVN Kubernetes utilities
    to run and probe OVN components. We don't use OVN-Kubernetes
    CNI and CRD features.
    
    Depends-On: I2ec8ebb06a1ab7dca6651f5d1d6f34e417021447
    Change-Id: I5821149c987070125f14d01c99343b72f234fc36
    
  - Merge "[helm-toolkit] Allow to pass raw network policy"
  - Add release note template
    
    Change-Id: Ied6af6bf7521a92c70170a62d6ad8b29c731eac0
    
  - Update values_overrides to use images from buildset registry
    
    Recently we moved all overrides to a separate directory and
    if we want to test images published to buildset registry
    we have to update those overrides before deployment.
    
    Change-Id: I9a515b5ba98be7ee0225fc1c95a35828055383f6
    
  - [helm-toolkit] Allow to pass raw network policy
    
    Allow to pass raw network policy via values, labels
    without spec are ingnored in this case.
    
    values: |
      network_policy:
        myLabel:
          spec:
            <RAW SPEC>
    Change-Id: I87fce44f143fbdf9771ad043133dee22daced3f3
    
  - Merge "[memcached] Allign with security best practices"
  - Merge "[memcached] Unhardcode port in exporter"
  - Merge "[memcached] Enasure liveness probe is enabled"
  - Merge "Delete setup.py to avoid validate_build_sdist"
  - Merge "[memcached] Drop max_surge option"
  - Merge "Ensure memcached pods antiaffinity"
  - Delete setup.py to avoid validate_build_sdist
    
    To create git tags we have to submit PRs to
    the openstack/releases which checks if a project
    contains setup.py file. If it does then the validation
    test tries to build sdist package. For openstack-helm
    this is not needed.
    
    Change-Id: I3030dcf21d58d54d37b03e2db20004d086dbfaa9
    
  - [memcached] Allign with security best practices
    
    * Add runAsNonRoot directive
    * Drop all capabilities
    * Mount bianries with 550 and 65534 fsgroup
    
    Change-Id: I0636088b40ce8ebaef84dad017ddbcaaecfc8221
    
  - [memcached] Unhardcode port in exporter
    
    * Pick up port for exporter from endpoints
    * Drop exporter port from service as we should not use
      service that do loadbalancing among pods which are independent
    
    Change-Id: I0408039ba87aca5b8b3c9333644fa0c92f0ca01a
    
  - [ceph-osd] Remove wait_for_degraded_objects
    
    This PS removes the wait_for_degraded_objects
    function from ceph-osd helm-test script because
    not all pgs may be in good condition even if all
    osds are up and running. The pgs will get healthy
    after complete osd charts set upgrade is complete.
    
    Change-Id: Ia8da3d96e01b765c5cb691dd0af15f36a7292e89
    
  - Merge "Append metadata suffix when building charts"
  - [memcached] Enasure liveness probe is enabled
    
    Change-Id: I4980d2e9ec4fbfc8e57bd643b703d37c12b32dfa
    
  - [memcached] Drop max_surge option
    
    We do not use service proxy to comminicate to memcached.
    All services has exact number of endpoints to communicate.
    Having max_surge is useless as clients will never use it.
    
    Change-Id: I74a665c96cfc99cbb8d31c4a17700c467c746c9e
    
  - Ensure memcached pods antiaffinity
    
    Use required* antiaffinity to make sure we do not have
    two pods sitting on same node as it does not make any
    sense.
    
    Change-Id: I6c0c55733b75eb1bd53eee855907533d672dbf22
    
  - Append metadata suffix when building charts
    
    Change-Id: Ic9af11193f097c3bad99b63c63abc5e8dd93de53
    
  - [deploy-env] Fix fetching images
    
    Even with the docker proxy cache we often get
    jobs failed due to Docker Hub rate limits.
    As per recommendation from the Opendev Infra team
    let's pull as many as possible images from other
    registires.
    
    This PR updates the dnsmasq and nginx images used
    for auxiliary purposes during deployments.
    
    Change-Id: I58946e6fc63d726e08d83ea7f96e7fef140ddf21
    
  - Update versions of all charts to 2024.2.0
    
    As per agreement with
    https://docs.openstack.org/openstack-helm/latest/specs/2025.1/chart_versioning.html
    
    Change-Id: Ia064d83881626452dc3c0cf888128e152692ae77
    
  - Update Chart.yaml apiVersion to v2
    
    Change-Id: I66dcaedefd0640f8a7b5343363354ba539d70627
    
  - Enable temporarily disabled jobs
    
    Here I7bfdef3ea2128bbb4e26e3a00161fe30ce29b8e7
    we disabled some jobs that involve scripts from
    OSH git repo because these scripts had to be
    aligned with the new values_overrides location and
    directory structure.
    
    Change-Id: I7d0509051c8cd563a3269e21fe09eb56dcdb8f37
    
  - Move values overrides to a separate directory
    
    This is the action item to implement the spec:
    doc/source/specs/2025.1/chart_versioning.rst
    
    Also add overrides env variables
    
    - OSH_VALUES_OVERRIDES_PATH
    - OSH_INFRA_VALUES_OVERRIDES_PATH
    
    This commit temporarily disables all jobs that involve scripts
    in the OSH git repo because they need to be updated to work
    with the new values_overrides structure in the OSH-infra repo.
    Once this is merged I4974785c904cf7c8730279854e3ad9b6b7c35498
    all these disabled test jobs must be enabled.
    
    Depends-On: I327103c18fc0e10e989a17f69b3bff9995c45eb4
    Change-Id: I7bfdef3ea2128bbb4e26e3a00161fe30ce29b8e7
    
  - [ceph] Fix for ceph-osd pods restart
    
    This PS updates ceph-osd pod containers making
    sure that osd pods are not stuck at deletion. In
    this PS we are taking care of another background
    process that has to be terminated by preStop
    script.
    
    Change-Id: Icebb6119225b4b88fb213932cc3bcf78d650848f
    
  - [ceph] Fix for ceph-osd pods restart
    
    This PS updates ceph-osd pod containers making sure
    that osd pods are not stuck at deletion.
    
    It adds missed lifecycle preStop action for log0runner container.
    
    Change-Id: I8d6853a457d3142c33ca6b5449351d9b05ffacda
    
  - [ceph] Fix for ceph-osd pods restart
    
    This PS updates ceph-osd pod containers making sure
    that osd pods are not stuck at deletion. Also
    added similar approach to add lifecycle ondelete
    hook to kill log-runner container process before pod restart.
    
    And added wait_for_degraded_object function to
    helm-test pod making sure that newly deployed pod
    are joined the ceph cluster and it is safe to go
    on with next ceph-osd chart releade upgrade.
    
    Change-Id: Ib31a5e1a82526906bff8c64ce1b199e3495b44b2
    
  - Merge "Remove tini from ceph-osd chart"
  - Remove tini from ceph-osd chart
    
    Removing tini from ceph daemon as this didn't resolve
    an issue with log runner process as will be resolved in
    another change in post-apply job.
    
    Change-Id: I4ebb1d12e736d387e6e34354619a532dd50dfeae
    
  - Bump K8s to v1.31
    
    Change-Id: I384b10ef7b2da42d2227b4134e4ece4c5f9aa6d1
    
  - Merge "Remove 2023.1 build jobs"
  - Merge "[mariadb] Add probes for exporter"
  - Merge "Allow to use default storage class"
  - Merge "[mariadb] Add terminationGracePeriodSeconds"
  - Merge "[mariadb] Use service IP to discover endpoints"
  - Merge "[mariadb] Implement mariadb upgrade on start"
  - Merge "[mariadb] Avoid using deprecated isAlive"
  - [mariadb] Add probes for exporter
    
    Implement readiness/liveness probes for exporter
    
    Change-Id: I7e73872dd35b8e6adf67d585e7d4d9250eca70c3
    
  - Allow to use default storage class
    
    When name of storage class is specified as default, do not add
    storageClassName option to let kubernetes pick a default
    
    Change-Id: I25c60e49ba770ce10ea2ec68c3555ffea49848fe
    
  - [mariadb] Add terminationGracePeriodSeconds
    
    Allow to set terminationGracePeriodSeconds for server instace to let
    more time to shutdown all clients gracefully.
    Increase timeout to 600 seconds by default.
    
    Change-Id: I1f4ba7d5ca50d1282cedfacffbe818af7ccc60f2
    
  - [mariadb] Use service IP to discover endpoints
    
    It was observed that under certain circumstances
    galera instances can use old IP address of the node
    after pod restart. This patch changes the value of
    wsrep_cluster_address variable - instead of listing
    all dns names of the cluster nodes the discovery service
    IP address is used. In this case cluster_node_address is set to IP
    address instead of DNS name - otherwise SST method will fail.
    
    Co-Authored-By: Oleksii Grudev <[email protected]>
    
    Change-Id: I8059f28943150785abd48316514c0ffde56dfde5
    
  - [mariadb] Implement mariadb upgrade on start
    
    Call mysql_upgrade during start to check and upgrade if needed
    
    Change-Id: I9c4ac1a5ea5f492282bb6bb1ee9923b036faa998
    
  - [mariadb] Avoid using deprecated isAlive
    
    The method was deprecated and later dropped, switch to is_alive()
    
    Co-Authored-By: dbiletskiy <[email protected]>
    
    Change-Id: Ie259d0e59c68c9884e85025b1e44bcd347f45eff
    
  - Remove 2023.1 build jobs
    
    The 2023.1 release is unmaintained since 2024-10-30.
    See https://releases.openstack.org/
    
    Change-Id: I8375b16338b172a5875b7a379df085020490305c
    
  - Merge "Update ceph-osd to be able to use tini"
  - Merge "ovn: fix resources"
  - [mariadb] Refactor liveness/readiness probes
    
    * Move all probes into single script to reduce code duplication
    * Check free disk percent, fail when we consume 99% to avoid
      data corruption
    * Do not restart container when SST is in progress
    
    Change-Id: I6efc7596753dc988aa9edd7ade4d57107db98bdd
    
  - [mariadb] Give more time on resolving configmap update conflicts
    
    Make 'data too old' timeout dependent on state report interval. Increase
    timeout to 5 times of report interval.
    
    Change-Id: I0c350f9e64b65546965002d0d6a1082fd91f6f58
    
  - Prevent TypeError in get_active_endpoint function
    
    Sometimes "endpoints_dict" var can be evaluated to None
    resulting in "TypeError: 'NoneType' object is not iterable"
    error. This patch catches the exception while getting
    list of endpoints and checks the value of
    endpoints_dict.  Also the amount of active endpoints is being logged
    for debug purposes.
    
    Change-Id: I79f6b0b5ced8129b9a28c120b61e3ee050af4336
    
  - [mariadb] Remove useless retries on conflics during cm update
    
    The retries were originally added at [0] but they were never working.
    We pass fixed revision that we would like to see during patch to avoid
    race condition, into the safe_update_configmap. We can't organize retries
    inside function as it will require change of the original revision which
    may happen only at upper layer. Revert patch partially.
    
    [0] https://review.opendev.org/c/openstack/openstack-helm-infra/+/788886
    
    Change-Id: I81850d5e534a3cfb3c4993275757c244caec8be9
    
  - [mariadb] Stop running threads on sigkill
    
    Stop monitor cluster and leader election threads on sigkill.
    This allows to terminate all threads from start.py and actually
    exit earlier than terminationGracePeriod in statefulset.
    Drop preStop hook which is redundant with stop_mysqld() function call.
    
    Change-Id: Ibc4b7604f00b1c5b3a398370dafed4d19929fd7d
    
  - ovn: fix resources
    
    Change-Id: I2b0c70550379dd214bc67869a7c74518b7004c7f
    
  - [mariadb] Improve python3 compatibility
    
    Decode byte sequence into string before printing log.
    
    Change-Id: Icd61a1373f5c62afda0558dfadc2add9138cff6d
    
  - [mariadb] Improve leader election on cold start
    
    During cold start we pick leader node by seqno. When node is running
    of finished non gracefully seqno may stay as -1 unless periodic task
    update its based on local grastate.dat or will detect latest seqno via
    wsrep_recover. This patch adds an unfinite waiter to leader election
    function to wait unless all nodes report seqno different that -1 to make
    sure we detect leader based on correct data.
    
    Change-Id: Id042f6f4c915b21b905bde4d57d40e159d924772
    
  - [mysql] Use constant for mysqld binary name
    
    Change-Id: I996141242dac9978283e5d2086579c75d120ed8b
    
  - Update ceph-osd to be able to use tini
    
    Sometimes the pod fails to terminate correctly,
    leaving zombie processes. Add option to use tini
    to handle processes correctly. Additionally update
    log-tail script to handle sigterm and sigint.
    
    Change-Id: I96af2f3bef5f6c48858f1248ba85abdf7740279c
    
  - Merge "Mariadb chart updates"
  - Merge "Update grafana helm test"
  - Merge "ovn: make gateway label configurable"
  - Mariadb chart updates
    
    This PS is for improvements for wait_for_cluster mariadb job.
    
    Change-Id: I46de32243e3aaa98b7e3e8c132a84d7b65d657cc
    
  - Update grafana helm test
    
    Adds setting XDG_CONFIG_HOME and XDG_CACHE_HOME to
    a writable path.
    
    Change-Id: Ieb2a6ca587ecefe24d04392970c415409c8f5e1b
    
  - Update helm test for Elasticsearch
    
    Removing the use of python during helm test script as it
    is no longer in the image.
    
    Change-Id: Id8feff1bee8c3f2dd277307d176f6a535c5f7ba6
    
  - ovn: make gateway label configurable
    
    Change-Id: I88ab77e61e9766e12eb3aff899e0d6dd24a8d3c0
    
  - Merge "Add 2024.2 overrides"
  - Merge "[helm-toolkit] Fix db-init and db-drop scripts"
  - [memcached] Fix statefulset spec format
    
    Recently we switched from Deployment to Statefulset
    to make it possible to work with memcached instances
    directly w/o load balancer. The strategy field is not
    valid for statefulsets, so here we remove it.
    
    Change-Id: I52db7dd4563639a55c12850147cf256cec8b1ee4
    
  - Add 2024.2 overrides
    
    Change-Id: Ic43f14e212f4de6616b4255bdd5ce562c5bcf9b0
    
  - [helm-toolkit] Fix db-init and db-drop scripts
    
    Wrap queries into sqlalchemy.text before executing them.
    
    Change-Id: I783bd05bdd529c73825311515e1390f3cc077c4f
    
  - Merge "Add app.kubernetes.io/name label to openstack pods"
  - Merge "[mariadb] Add cluster wait job"
  - Add app.kubernetes.io/name label to openstack pods
    
    This commit adds recommended kubernetes name label to pods definition.
    
    This label is used by FluxCD operators to correctly look for the
    status of every pod.
    
    Change-Id: I866f1dfdb3ca8379682e090aca4c889d81579e5a
    Signed-off-by: Johnny Chia <[email protected]>
    
  - Merge "Allow share OVN DB NB/SB socket"
  - Merge "Revert "[rabbitmq] Use short rabbitmq node name""
  - Allow share OVN DB NB/SB socket
    
    This will help other services to access to OVN DB.
    So services like Octavia can use OVN Octavia provider agent.
    
    Change-Id: Iddaa6214ece63a5f1e692fe019bcba1b41fdb18f
    
  - Merge "[mariadb] Remove ingress deployment"
  - Merge "Allow to package charts in specified directory"
  - Merge "Allow to pass custom helm charts version"
  - Allow to package charts in specified directory
    
    Use make PACKAGE_DIR=/foo/bar/
    
    Change-Id: I37db3f507c9375c64081adcf994ede3829dbb34b
    
  - Allow to pass custom helm charts version
    
    * Allow to pass custom helm chart version during build like
        make all version=1.2.3+custom123
    * add get-version target that allows to get version based on
      number of git commits in format <git-tag>+<commits number>
    
    Change-Id: I1f04aeaa8dd49dfa2ed1d76aabd54a0d5bf8f573
    
  - [helm-toolkit] Update toolkit to support fqdn alias
    
    This change add the ability to add fqdn alias to namespace and cluster ingress resources. This change is specifically required for keystone so HA of backup solution can be implemented.This change allows user to specify alias_fqdn in the endpoints section, and user can have alias configued. This change is backward compatible, so without specifying this option in charts gives one fqdn ingress rule without cname alias as default behaviour.
    
    Change-Id: Ib1c60524e2f247bb057318b1143bfbc3bde5b73a
    
  - Revert "[rabbitmq] Use short rabbitmq node name"
    
    Rabbitmqcluster does not work with short node names, as
    there is unresolvable dependency in dns resolution, it is
    not possible to resolve only pod name svc must be added.
    
    This reverts commit bb7580944a5268a1e5f7fcd195b156f53dc668c5.
    
    Change-Id: I42b25ba4f569bae94bbc2939a1022bd14e66e527
    
  - [libvirt] Add 2023.1 overrides
    
    Recently we fixed the libvirt.sh script
    and removed the conditionals cgroup commands
    which were introduced for smooth transition
    to Jammy and cgroups v2
    
    https://review.opendev.org/c/openstack/openstack-helm-infra/+/929401
    
    But because we didn't have overrides for 2023.1
    we used to run 2023.1 with the default libvirt image
    openstackhelm/libvirt:latest-ubuntu_focal
    which does not work with cgroups v2 on the host
    system with this recent fix (see above).
    
    So the 2023.1 Ubuntu Jammy compute-kit test jobs fails.
    This PR fixes this job by means of introducing
    explicit image overrides for 2023.1.
    
    Change-Id: Ie81f8fb412362388274ea92ad7fa5d3d176c0441
    
  - Merge "Add local volume provisioner chart"
  - Merge "[libvirt] Implement daemonset overrides for libvirt"
  - Merge "[libvirt] Make readiness probe more tiny"
  - Merge "[libvirt] Allow to generate dynamic config options"
  - Merge "[memcached] Change deployment type to statefulset"
  - [libvirt] Implement daemonset overrides for libvirt
    
    The patch implements libvirt daemonset to use overrides daemonset_overrides_root
    
      .Values:
        overrides:
          libvirt_libvirt:
            labels:
              label::value:
                values:
                  override_root_option: override_root_value
                  conf:
                    dynamic_options:
                      libvirt:
                        listen_interface: null
    
    Change-Id: If4c61f248d752316c54955ebf9712bb3235c06fd
    
  - Merge "[mariadb] Switch to controller deployment"
  - Merge "[mariadb] Deploy exporter as sidecar"
  - Merge "[mariadb] Avoid using cluster endpoints"
  - Merge "[helm-toolkit] Add daemonset_overrides_root util"
  - Merge "Remove trailing slash in endpoinds"
  - Merge "Add ability to get multiple hosts endpoint"
  - Add local volume provisioner chart
    
    Some applications require perisitant volumes to be stored
    on the hosts where they running, usually its done via
    kubernetes PV. One of PV implementations is local-volume-provisioner [0]
    
    This patch adds helm chart to deploy LVP. Since LVP creates a volumes for
    each mountpoint, helm chart provides a script to create  mountpoints
    in the directory, which later exposed to kubernetes as individual volumes.
    
    Change-Id: I3f61088ddcbd0a83a729eb940cbf9b2bf1e65894
    
  - [memcached] Change deployment type to statefulset
    
    For effective cache use all endpoints should be specified
    explicitly as memcache client use specific algorithm to
    identify on which cache server key is stored based on
    servers availability and key name.
    If memcached deployed behind the service unless same key is
    stored on all memcached instances clients will always got
    cache misses and will require to use heavy calls to database.
    So in the end all keys will be stored on all memcached instances.
    Furthermore delete operations such as revoke token or remove
    keystone group call logic in service to remove data from cache
    if Loadbalancer is used this functionality can't work as we
    can't remove keys from all backends behind LB with single call.
    
    Change-Id: I253cfa2740fed5e1c70ced7308a489568e0f10b9
    
  - [mariadb] Add cluster wait job
    
    Add job that waits when initial bootstrapping of cluster is completed
    which is required to pause db creation and initialization when cluster
    is not fully bootstrapped.
    
    Change-Id: I705df1a1b1a34f464dc36a36dd7964f8a7bf72d9
    
  - [mariadb] Remove ingress deployment
    
    Ingress deployment is not used for a while and there are
    more elegant ways to provide same functionality based on
    controller to pick up master service.
    Remove ingress deployment completely.
    
    Change-Id: Ica5d778f5122f8a4f0713353aa5e0ef4e21c77f8
    
  - [mariadb] Switch to controller deployment
    
    Move primary node selector into mariadb controller, this patch
    partially reverts 07bd8c92a259557d07119525c85bea4b8fc6006e
    
    Change-Id: Id53a6503b177f0c46e89a7def2c0773a68b8d8e8
    
  - Merge "Add snippet configmap_oslo_policy"
  - [libvirt] Make readiness probe more tiny
    
    Use virsh connect instead of list which is heavy and may
    stuck for a while when libvirt creating domains.
    
    Change-Id: I515c70b0b3a050599726ca2548eeeb7fd3f3e6ea
    
  - [libvirt] Allow to generate dynamic config options
    
    It may be required to use some dynamic options such as IP address
    from interface where to bind service. This patch adds ability to
    use dynamic logic in option detection and fill it in the configuration
    file later.
    
    Co-Authored-By: dbiletskiy <[email protected]>
    
    Change-Id: I8cc7da4935c11c50165a75b466d41f7d0da3e77c
    
  - Merge "[libvirt] Allow to initialize virtualization modules"
  - [helm-toolkit] Add daemonset_overrides_root util
    
    The helm-toolkit.utils.daemonset_overrides function have some limitations:
    
     * it allows to override only conf values specifid in configmap-etc
     * it doesn't allow to override values for daemonsets passed via env variables
       or via damoenset definition. As result it is impossible to have mixed
       deployment when one compute is configured with dpdk while other not.
     * it is impossible to override interface names/other information stored in
       <service>-bin configmap
     * It allows to schedule on both hosts and labels, which adds some
       uncertainty
    
    This implementation is intended to handle those limitations:
    
     * it allows to schedule only based on labels
     * it creates <service>-bin per daemonset override
     * it allows to override values when rendering daemonsets
    
     It picks data from the following structure:
    
      .Values:
        overrides:
          mychart_mydaemonset:
            labels:
              label::value:
                values:
                  override_root_option: override_root_value
                  conf:
                    ovs_dpdk:
                      enabled: true
                    neutron:
                      DEFAULT:
                        foo: bar
    
    Change-Id: I5ff0f5deb34c74ca95c141f2402f375f6d926533
    
  - Remove trailing slash in endpoinds
    
    This patch removes trailing slash in endpoint address
    in case the path is empty.
    
    Co-Authored-By: Vasyl Saienko [email protected]
    
    Change-Id: I11ace7d434b7c43f519d7ec6ac847ef94916202f
    
  - Add ability to get multiple hosts endpoint
    
    For memcache we should set specify all hosts directly in the config
    as client do key spreading based on what hosts are alive, when LB
    address is used memcached can't work effectively.
    This patch updates endpoint_host_lookup	to handle this scenario
    
    Change-Id: I8c70f8e9e82bf18d04499a132ef9a016d02cea31
    
  - Add snippet configmap_oslo_policy
    
    Openstack policies can be applied without service restart
    keep all policies in single configmap to have ability to
    do not restart services on policy changes.
    
    This patch adds a snippet of configmap that will later be used
    in other helm charts.
    
    Change-Id: I41d06df2fedb7f6cf0274c886dc9b94134507aca
    
  - Merge "[rabbitmq] Use short rabbitmq node name"
  - Merge "[rabbitmq] Set password for guest user rabbitmq"
  - Merge "[memcached] Allow to configure additional service parameters"
  - Merge "[mariadb] Add mariadb controller support"
  - Merge "Add service params snippet"
  - Merge "[libvirt] Remove hugepages creation test"
  - Merge "[libvirt] Handle cgroupv2 correctly"
  - Merge "Add compute-kit-2023-1-ubuntu_focal job"
  - Merge "[etcd] Add cronjob with database compaction"
  - Merge "[etcd] Switch etcd to staetefulset"
  - [libvirt] Allow to initialize virtualization modules
    
    Add init-modules libvirt container which allows to initialize
    libvirt modules during start. The script is provided via
    .Values.init_modules.script data structure
    
    Change-Id: I9d5c48448b23b6b6cc18d273c9187a0a79db4af9
    
  - [libvirt] Remove hugepages creation test
    
    The tests is useless as libvirt is not running in the pod
    cgroup so pod settings are not applied to it.
    
    Change-Id: Ice3957c800e29a0885a341103c453c4d6c921fd3
    
  - [libvirt] Handle cgroupv2 correctly
    
    The list of default kernel cgroup controllers may be changed
    an example is kernel upgrade from 5.4.x to 5.15.x where misc controller
    is enabled by default. Unhardcode list of controllers to have ability
    to override them for never kernel version and allow to do not kill
    qemu processes with container restart.
    
    Change-Id: Ic4f895096a3ad2228c31f19ba1190e44f562f2a0
    
  - Add compute-kit-2023-1-ubuntu_focal job
    
    This is necessary to test if libvirt changes
    are compatible with cgroups v1.
    
    Change-Id: I3cfb4e747a4cd23bc2d7051ef526fd58dc38aaf8
    
  - [mariadb] Deploy exporter as sidecar
    
    Deploy exporter as a sidecar to provide correct mysql metrics.
    
    Co-Authored-By: Oleh Hryhorov <[email protected]>
    
    Change-Id: I25cfeaf7f95f772d2b3c07a6a91220d0154b4eea
    
  - [mariadb] Avoid using cluster endpoints
    
    Switch to namespaced based endpoints to remove requirement
    configure kubernetes internal cluster domain name which can't
    be get from kubernetes API.
    
    Change-Id: I8808153a83e3cec588765797d66d728bb6133a5c
    
  - [memcached] Allow to configure additional service parameters
    
    Use the following structure in values to define addtional service
    parameters:
    
    Values: network:
        memcached:
          service:
            type: loadBalancer
            loadBalancerIP: 1.1.1.1
    Change-Id: I94c87e530d90f603949ccacbf0602273feec741a
    
  - [mariadb] Add mariadb controller support
    
    This patch adds mairadb controller that is responsible to mark one
    ready pod as mariadb_role: primary to forward all traffic to it.
    This will allow to drop nginx ingress controller which adds extra
    hops between client and server and uses heavy customized nginx templates.
    
    Change-Id: I3b29bc2029bfd39754516e73a09e4e14c52ccc99
    
  - Add service params snippet
    
    Allows to add custom parameters to services, and ingress services
    from values as is.
    
    Co-Authored-By: Mykyta Karpin <[email protected]>
    
    Change-Id: I42b8d07126de2cf12ddc3a934d1fd4e3a2ee0051
    
  - [etcd] Add cronjob with database compaction
    
    etcd database need to be periodically compacted and defrag
    This patch adds jobs to perform required maintenance actions
    automatically.
    
    Co-Authored-By: Oleh Hryhorov <[email protected]>
    
    Change-Id: I31b48bb198f7322c343c7d0171322759893e374f
    
  - [etcd] Switch etcd to staetefulset
    
    * Switch etcd to statefulset
    * Allow to use persistant volumes to store etcd data
    * Allow to deploy in clustered mode
    
    Change-Id: I2baf5bdd05c280067991bb8b7f00c887ffd95c20
    
  - [rabbitmq] Use short rabbitmq node name
    
    The patch switches rabbitmq to use short node names, this will
    allow to do not care about internal domain name as it is can't
    be get from k8s API.
    
    Change-Id: I6d80bc4db4e497f7485fb5416818e0b61f821741
    Related-Prod: PRODX-3456
    
  - [rabbitmq] Set password for guest user rabbitmq
    
    The guest account is enabled by default and has access to all
    vhosts. Allow changing the guest password during rabbitmq
    configuration.
    
    Change-Id: If23ab8d5587b13e628bce5bcb135a367324dca80
    
  - [rabbitmq] Allow bootstrapping rabbitmq with an initial config
    
    Prepare rabbitmq to run in non-clustered mode, in which it may be
    useful to bootstrap the cluster with fresh data each time, since
    we do not use durable queues stored on the filesystem in
    OpenStack.

    Two new data structures are added to the rabbitmq Values:
    
      users:
        auth:
          keystone_service:
            username: keystone
            password: password
        path: /keystone
      aux_conf:
        policies:
          - vhost: "keystone"
            name: "ha_ttl_keystone"
            definition:
              ha-mode: "all"
              ha-sync-mode: "automatic"
              message-ttl: 70000
            priority: 0
            apply-to: all
            pattern: '^(?!amq\.).*'
    
    Change-Id: Ia0dd1a8afe7b6e894bcbeafedf75131de0023df0
    
  - [rabbitmq] Do not use hardcoded username in rabbitmq chown container
    
    Pick up the UID from .Values.pod.security_context.server.pod.runAsUser
    as this is the user that we use to run the service.
    
    Change-Id: Id4c53b0a882b027e320b08ed766cb473ab9ab535
    
  - [rabbitmq] Update readiness/liveness commands
    
    Use the lightweight rabbitmqctl ping command for the readiness and
    liveness probes. check_port_connectivity is not suitable for
    liveness as it does not check that the rabbitmq instance is
    actually running and that we can authenticate.
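
    A minimal sketch of such probes in a pod spec (the exec form of
    rabbitmqctl ping; the delays and periods here are illustrative,
    not the chart's actual values):

      readinessProbe:
        exec:
          command: ["rabbitmqctl", "ping"]
        initialDelaySeconds: 10
        periodSeconds: 10
      livenessProbe:
        exec:
          command: ["rabbitmqctl", "ping"]
        initialDelaySeconds: 30
        periodSeconds: 30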
    
    Change-Id: I6f157e9aef3450dba1ad7e0cb19491a41f700bbc
    
  - Decode url-encoded password for rabbit connection
    
    Resolve access failures that occur when the RabbitMQ password
    contains special characters; see the change below.
    
    https://pikachu.space/openstack/openstack-helm-infra/commit/6c5cc2fdf04d32fbf5fed2b90c6fdca60286d567
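
    The idea, sketched in Python (a hypothetical helper, not the
    chart's actual code): the stored password is URL-encoded, so it
    must be decoded before being handed to the client.

      from urllib.parse import unquote

      def decode_rabbit_password(encoded: str) -> str:
          # e.g. 'p%40ss%2Fword' -> 'p@ss/word'
          return unquote(encoded)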
    
    story: 2011222
    task: 50999
    Change-Id: I0cfc6e2228bc4b1327efb7da293849d6d1bbff19
    
  - Run utils-defragOSDs.sh in ceph-osd-default container
    
    The Ceph defragosds cronjob script used to connect to OSD pods
    without explicitly specifying the ceph-osd-default container and
    could end up running the defrag script in the log-runner
    container, where the script is mounted with 0644 permissions and
    the shell fails to run it.
    
    Change-Id: I4ffc6653070dbbc6f0766b278acf0ebe2b4ae1e1
    
  - Merge "Update deploy-env role"
  - Update deploy-env role
    
    - Use kubeadm configuration to not set taints
      on control plane nodes (instead of removing them after
      deployment).
    - Fix ssh client key permissions.
    - Update the Mariadb ingress test job so it is inherited
      from the plain compute-kit test job, and also remove
      it from the check pipeline.
    
    Change-Id: I92c73606ed9b9161f39ea1971b3a7db7593982ff
    
  - [osh-selenium] Upgrade image to ubuntu_jammy
    
    + run tests in a read-only file system
    + change google-chrome data directory from ~/.config/google-chrome
      (which is immutable) to /tmp/google-chrome (writable), otherwise
      Chrome fails to launch
    + activate the new headless mode as the old one will soon be removed
      https://developer.chrome.com/docs/chromium/new-headless
    
    Change-Id: I7d183b3f3d2fdc3086a5db5fa62473f777b9eb7a
    
  - Ingress-nginx controller upgrade for mariadb
    
    This PS bumps the ingress-nginx controller version up
    to v1.11.2 in the mariadb chart due to a CVE
    vulnerability.

    nginx.tmpl from the mariadb chart has been updated to
    match the latest 1.11.2 ingress-controller image.
    
    Change-Id: Ie2fd811f8123515f567afde62bbbb290d58dd1b2
    
  - Merge "Add the ability to use custom Nagios plugins"
  - Add the ability to use custom Nagios plugins
    
    Change-Id: Ib309499140994448d7b3e0eef0c875c6edb3a2ac
    
  - Add retry logic to index creation script
    
    - Re-add the retry logic to the index creation script.
    - Fix a small regex bug.
    - Also add a function to look up the id of a view, because the new
      views API requires an id to set the default view.
    - Set noglob to make sure the asterisks in the view names aren't
      expanded.
    
    Change-Id: Idfd56f09a739731f2ce3153b8fc284bb499a91d4
    
  - Merge "[ceph] Remove dependencies on legacy provisioners"
  - [ceph] Remove dependencies on legacy provisioners
    
    The legacy RBD provisioner and the CephFS provisioner haven't been
    used in some time. This change removes them.
    
    Change-Id: I313774627fcbaed34445ebe803adf4861a0f3db5
    
  - Parse nova metadata in libvirt exporter
    
    Change-Id: Ib49968d919bda72caffd09d57a283587ae867fec
    
  - Merge "Updating script to use data views to support kibana 8.0 and beyond as some of api is now depreacated."
  - Updating script to use data views to support kibana 8.0 and beyond
    as some of the API is now deprecated.
    
    Change-Id: I58d5c388cc0f6ba56c5fe646be352a0641e0661d
    
  - Upgrade env
    
    - K8s 1.30.3
    - Helm 3.14.0
    - Crictl 1.30.1
    - Calico 3.27.4
    - Cilium 1.16.0
    - Ingress-nginx Helm chart 4.11.1
    
    Change-Id: I3d5a3d855b0b4b0b66e42d94e1e9704f7f91f88b
    
  - Add 2024.1 overrides to some charts
    
    - Add 2024.1 overrides to those charts where
      there are overrides for previous releases.
    - Update some jobs to use 2024.1 overrides.
    - Update default images in grafana, postgresql,
      nagios, ceph-rgw, ceph-provisioners,
      kubernetes-node-problem-detector
    - Install tzdata package on K8s nodes. This
      is necessary for kubernetes-node-problem-detector
      chart which mounts /etc/localtime from hosts.
    
    Change-Id: I343995c422b8d35fa902d22abf8fdd4d0f6f7334
    
  - Merge "Use predefined Helm repo in deployment scripts"
  - Update deploy-env role
    
    When generating keys and sharing them between nodes
    in a multinode env it is important that the task which
    generates the keys finishes before these keys are
    used on another node.
    
    The PR splits the Ansible block into two blocks and
    makes sure the playbook deploy-env is run with the linear
    strategy. Thus we can be sure that keys are first generated
    on all affected nodes and only then used to set up
    tunnels and passwordless ssh.
    
    Change-Id: I9985855d7909aa5365876a24e2a806ab6be1dd7c
    
  - Use predefined Helm repo in deployment scripts
    
    Change-Id: Icd55637a8909cc261e6bde307e556476cacb1c1f
    
  - Merge "ovn: Use chart name in oci_image_registry secret"
  - Remove gateway node role
    
    With elasticsearch 8, gateway is no longer a valid node role:

    https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-node.html#node-roles

    Change-Id: I4f522bc29b51645b6cfc16faaa3d250d7b18c51f
    
  - ovn: Use chart name in oci_image_registry secret
    
    The current values.yaml uses the service name to create separate
    secrets. However, helm-toolkit indexes into oci_image_registry using
    .Chart.Name and not $serviceName so the secrets are not used.
    
    Change-Id: I50f575f951c19ab728f9e40a73bc893e4f7356f2
    
  - Add Flannel deployment to deploy-env role
    
    Change-Id: I72f3f29196ea1d433655c8862ac34718df18c7ea
    
  - Update kubernetes-entrypoint image
    
    Use quay.io/airshipit/kubernetes-entrypoint:latest-ubuntu_focal
    by default instead of 1.0.0, which is v1-formatted and
    no longer supported by docker.
    
    Change-Id: I6349a57494ed8b1e3c4b618f5bd82705bef42f7a
    
  - Align db scripts with sqlalchemy 2.0
    
    Change-Id: I0b6c500e8257c333c16c15d7d338651ee5b2ca27
    
  - [fluentd] Adjust configuration for v1.15
    
    + prevent Fluentd from parsing its own logs and fix an issue with
      endless backslashes (https://github.com/fluent/fluentd/issues/2545)
    + increase chunk limit size
    + add storage for systemd plugin configuration
    + add pos_file parameter for the tail sources
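
    A sketch of the resulting tail source (paths and the parser choice
    are illustrative, not the chart's exact configuration):

      <source>
        @type tail
        path /var/log/containers/*.log
        # skip fluentd's own container logs to avoid the recursive
        # backslash-escaping loop (fluent/fluentd#2545)
        exclude_path ["/var/log/containers/*fluentd*.log"]
        # remember the read position across restarts
        pos_file /var/log/fluentd-containers.log.pos
        tag kubernetes.*
        <parse>
          # the parser type depends on the container runtime log format
          @type none
        </parse>
      </source>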
    
    Change-Id: I7d6e54d2324e437c92e5e8197636bd6c54419167
    
  - Test job for legacy OSH Ceph to Rook migration
    
    At the moment the recommended way of managing Ceph clusters
    is using the Rook-Ceph operator. However, some users
    still utilize the legacy OSH Ceph* charts. Since Ceph is
    a critical part of the infrastructure, we suggest a migration
    procedure, and this PR is to test it.
    
    Change-Id: I837c8707b9fa45ff4350641920649188be1ce8da
    
  - Add Cilium deployment to deploy-env role
    
    Change-Id: I7cec2d3ff09ec3f85992162bbdb8c351660f7de8
    
  - Merge "Couple tiny fixes for deploy-env role"
  - Couple tiny fixes for deploy-env role
    
    - typo in the setup of the wireguard tunnel
    - wrong home directory when setting up the k8s client for the root user
    
    Change-Id: Ia50f9f631b56538f72843112745525bc074e7948
    
  - Setup passwordless ssh from primary to cluster nodes
    
    Here we add Ansible tasks to the deploy-env role
    to setup passwordless ssh from the primary node
    to K8s cluster nodes. This is necessary for some
    test scripts like for example Ceph migration script.
    
    Change-Id: I1cae1777d51635a19406ea054f4d83972e5fe43c
    
  - Update curator to 8.0.10
    
    Update es curator to 8.0.10 and use appropriate config options for
    the es_client python module that has been incorporated in 8.0.9
    
    https://github.com/elastic/curator/compare/v8.0.8...v8.0.9
    
    https://github.com/elastic/curator/blob/bd5dc942bbf173d5e456f1a3c5ca8bec1c0df2ac/docs/usage.rst#log-settings

    Change-Id: I88071162f5bc0716bfb098525ed2eacd48367d98
    
  - Merge "Simplify ceph-adapter-rook"
  - Merge "Update deploy-env role to support root user"
  - Simplify ceph-adapter-rook
    
    - Do not deploy anything in the ceph namespace
    - Prepare the admin key secret in the openstack namespace.
      Get the admin key from the Ceph tools pod.
    - Prepare Ceph client config with the mon_host
      taken from the rook-ceph-mon-endpoints configmap
      as recommended in the Rook documentation.
    
    Change-Id: Idd4134efab49de032a389283e611c4959a6cbf24
    
  - Add value for rendering sidecar without feature
    
    Add an option to deploy the rendering sidecar without the k8s
    sidecar feature.
    
    Change-Id: I4b8052166bad8965df9daa6b28e320d9132150cd
    
  - Update deploy-env role to support root user
    
    Change-Id: I4126155eec03677cf29edfb47e80f54ab501705d
    
  - Add image rendering sidecar
    
    This PS adds a sidecar for the grafana image renderer. Starting
    with Grafana v10 it will be necessary to use an image rendering
    plugin or a remote renderer.
    
    https://grafana.com/docs/grafana/latest/setup-grafana/image-rendering/
    
    Change-Id: I4ebdac84769a646fa8154f80aaa2692c9f89eeb8
    
  - [openstack-exporter] Switch to jammy-based images
    
    Change-Id: I5326bb5231d3339d722ac67227e60bac592eb916
    
  - Updating openvswitch to run as child process
    
    On containerd v1.7+ openvswitch restarts when
    containerd is restarted. To prevent this, add tini
    and run OVS as a child process.
    
    Change-Id: I382dc2db12ca387b6d32304315bbee35d8e00562
    
  - Use OSH helm plugin rabbitmq and memcached scripts
    
    Change-Id: Ia06ee7f159c6ed028ab75fcb5707ee6e42179d98
    
  - Merge "Fix selenium test for additional compatibility."
  - Fix selenium test for additional compatibility.
    
    Change-Id: I2b5bd47d1a648813987ff10184d2468473454dfd
    
  - Bump K8s version to 1.29.5
    
    Change-Id: I4a3c7a17f32b5452145e1677e3c5072875dc9111
    
  - Merge "Escape special characters in password for DB connection"
  - Escape special characters in password for DB connection
    
    Passwords with special characters need to be URL-encoded to be
    parsed correctly.
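
    Sketched in Python (illustrative, not the chart's actual code):

      from urllib.parse import quote

      # encode every reserved character, including '/' and '@'
      quote("p@ss/word", safe="")  # -> 'p%40ss%2Fword'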
    
    Change-Id: Ic7e0e55481d9ea5ce2621cf0d67e80b9ee43cde0
    
  - Cleanup unused scripts
    
    Change-Id: I3bad13cc332fd439b3b56cfa5fc596255bc466f2
    
  - Merge "Fix typo in the ovn chart"
  - Fix typo in the ovn chart
    
    Change-Id: Ib69c6af7b79578090e23ea574da0029cf3168e03
    
  - Merge "Add configurable probes to rabbitmq"
  - Add configurable probes to rabbitmq
    
    Currently rabbitmq probes are hardcoded, with no ability to
    customize them via values.
    
    Signed-off-by: Ruslan Aliev <[email protected]>
    Change-Id: Ibbe84e68542296f3279c2e59986b9835fe301089
    
  - [deploy-env] Add mirror to Docker configuration
    
    There are some docker_container tasks which pull docker images.
    This commit adds a mirror configuration to daemon.json to prevent
    issues related to the pull rate limit.
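
    The relevant daemon.json stanza looks roughly like this (the
    mirror URL is a placeholder):

      {
        "registry-mirrors": ["https://mirror.example.com"]
      }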
    
    + update tls job according to the changes in openstack-helm
    
    Depends-On: Ia58916e3dc5e0f50b476ece9bba31d8d656b3c44
    Change-Id: Iac995500357336566cdbf9ddee0ae85b0b0347cd
    
  - [chromedriver] Loosen compatibility up with Chrome
    
    Chromedriver had strict version selection. This commit allows
    it to pick the closest patch version to google-chrome-stable.
    
    Change-Id: I435985573f69ee4bb0f6009416452649f302c0fe
    
  - Add env variables to deploy from Helm repos
    
    These env variables will be defined in test
    jobs. By default we will deploy from local charts,
    but some jobs will deploy from charts published
    on an HTTP server (local or public).
    
    - OSH_HELM_REPO
    - OSH_INFRA_HELM_REPO
    - DOWNLOAD_OVERRIDES
    
    Change-Id: Ic92b97eb5df4f7f8c4185c06654de4b4d890fbc6
    
  - Remove ingress chart
    
    We have not been using it for a while, since some
    time ago we switched to the upstream ingress-nginx.
    
    Change-Id: I2afe101cec2ddc562190812fc27bb3fad11469f1
    
  - Install OSH Helm plugin
    
    Depends-On: I71ab6ad104beb491b5b15b7750e2fc0988db82bf
    Change-Id: I8f30fbdf94d76ef9fa2985a25c033df290995326
    
  - [chromedriver] Change json api endpoint
    
    Choose a more reliable json file from the upstream to refer to.
    "Stable" versions of Chrome and Chromedriver became unsynchronized for some reason.
    
    Change-Id: I1688a867ea1987105e7a79c89ba7ea797819a12f
    
  - Merge "Clean up outdated deploy k8s scripts"
  - Update test jobs
    
    - Remove openstack-helm-infra-openstack-support* jobs.
      Instead of these jobs we run compute-kit, cinder and tls
      jobs defined in the openstack-helm repo.
    - Remove all experimental jobs since they are outdated and
      do not work. We will later add some of the test cases
      including apparmor, network policy, tenant Ceph and others.
    
    Change-Id: I8f3379c06b4595ed90de025d32c89de29614057d
    
  - Clean up outdated deploy k8s scripts
    
    Change-Id: I8481869a6547feae2ac057b65c8c4aecc2c1f505
    
  - Enable job for DPDK
    
    Depends-On: I3ad5b63a0813761a23573166c5024e17d87f775d
    Change-Id: I4851767a79bc4571a0f38622fe309807b53a7504
    
  - Merge "helm-toolkit: Enable custom secret annotations"
  - Merge "Add conf file for MongoDB"
  - Merge "make ovn db file path as configurable"
  - make ovn db file path as configurable
    
    Change-Id: I8b0f5c0bda2f1305e0460adc35e85b130f4cf9ff
    
  - Add conf file for MongoDB
    
    Change-Id: If6635557d4b0f65188da0d7450ad37630b811996
    
  - helm-toolkit: Enable custom secret annotations
    
    Enable custom annotations for secrets [registry, tls]
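
    A hedged example of how such annotations might be supplied via
    values (the key layout here is an assumption for illustration,
    not the chart's documented interface):

      annotations:
        secret:
          tls:
            example.com/managed-by: helm-toolkit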
    
    Change-Id: I811d5553f51ad2b26ea9d73db945c043ee2e7a10
    
  - Merge "Update deploy-env role README.md"
  - Merge "Add 2023.2 Ubuntu Jammy overrides"
  - Add custom job annotations snippet and use it
    
    Add the ability for charts that use helm-toolkit to allow the users to
    set custom annotations on jobs. Use the snippet in a generic way in the
    job templates provided by helm-toolkit.
    
    Change-Id: I5d60fe849e172c19d865b614c3c44ea618f92f20
    Depends-On: I3991d6984563813d5a3a776eabd52e2e89933bd8
    Signed-off-by: Doug Goldstein <[email protected]>
    
  - Update deploy-env role README.md
    
    Change-Id: Ia2ace3541be97577f1225d54417f6a287b7a8eb2
    
  - Run more test jobs when helm-toolkit updated
    
    Specifically we would like at least the following
    deployments to be tested when helm-toolkit is updated
    - compute-kit
    - cinder
    - tls
    
    Change-Id: I3991d6984563813d5a3a776eabd52e2e89933bd8
    
  - Merge "Add 2024.1 overrides"
  - Fix coredns resolver
    
    Forward requests for unknown names to 8.8.8.8
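
    The corresponding Corefile stanza looks roughly like this (a
    sketch; the other plugins shown are typical defaults, not the
    role's exact zone block):

      .:53 {
          errors
          health
          kubernetes cluster.local in-addr.arpa ip6.arpa
          forward . 8.8.8.8
          cache 30
      }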
    
    NOTE: Temporarily disable the DPDK job, which turned out to
    be incompatible with this PR:
    https://review.opendev.org/c/openstack/openstack-helm/+/914399
    It wasn't tested with the DPDK job.
    
    Change-Id: I936fb1032a736f7b09ad50b749d37095cce4c392
    
  - Add 2024.1 overrides
    
    Depends-On: Iadc9aec92b756de2ecfcb610e62c15bdbad4bb9e
    Change-Id: Icf98f9af863f60fa93ff70d2e8256810bed2b9f9
    
  - Add 2023.2 Ubuntu Jammy overrides
    
    Change-Id: Ia23370d07faf1f8a1e05447459ce9872e8d4e875
    
  - Rename dpdk job to reflect the OpenStack version
    
    Change-Id: I9c04a60ae8b7fde35a8a970e3b74bcaad7bd564f
    
  - Merge "Add custom secret annotations helm-toolkit snippet"
  - Add custom secret annotations helm-toolkit snippet
    
    Change-Id: Ic61afcb78495b35ee42232b435f54344f0a0a057
    
  - Bump RabbitMQ version 3.9.0 -> 3.13.0
    
    Also
    - Update default Heat image to 2023.2 used for
      init and test jobs
    - Add overrides for
      - yoga-ubuntu_focal
      - zed-ubuntu_focal
      - zed-ubuntu_jammy
      - 2023.1-ubuntu_focal
      - 2023.1-ubuntu_jammy
      - 2023.2-ubuntu_jammy
    
    Change-Id: I516c655ea1937f9bd1d363ea86d35e05e3d54eed
    
  - Merge "Refactor deploy-env role"
  - Merge "Add custom pod annotations helm-toolkit snippet"
  - Refactor deploy-env role
    
    - Make it less mixed. Each task file
      deploys one feature.
    - Deploy Metallb
    - Deploy Openstack provider network gateway
    
    Change-Id: I41f0353b286f817cb562b3bd59992e4baa473568
    
  - Merge "Bump containerd sandbox image from 3.6 to 3.9"
  - Merge "Update ovn controller init script"
  - Add custom pod annotations helm-toolkit snippet
    
    Change-Id: I898afae7945c03aec909e5edcd1c760c4d8ff9d6
    
  - Update ovn controller init script
    
    - OVN init script must be able to attach an interface
      to the provider network bridge and migrate the IP from the
      interface to the bridge, exactly as the Neutron OVS agent
      init script does.
    
    - OVN init script sets the gateway option on those OVN controller
      instances which are running on nodes with the l3-agent=enabled
      label.
    
    Change-Id: I24345c1f85c1e75af6e804f09d35abf530ddd6b4
    
  - Bump containerd sandbox image from 3.6 to 3.9
    
    Fixes the following kubeadm warning:
    
    W0321 01:33:46.409134   14953 checks.go:835] detected that the
    sandbox image "registry.k8s.io/pause:3.6" of the container
    runtime is inconsistent with that used by kubeadm.
    It is recommended that using "registry.k8s.io/pause:3.9"
    as the CRI sandbox image.
    
    Change-Id: I8129a6e9ad3acdf314e2853851cd5274855e3209
    
  - [rook-ceph] Add a script to migrate Ceph clusters to Rook
    
    This change adds a deployment script that can be used to migrate a
    Ceph cluster deployed with the legacy openstack-helm-infra Ceph
    charts to Rook. This process is disruptive. The Ceph cluster goes
    down and comes back up multiple times during the migration, but the
    end result is a Rook-deployed Ceph cluster with the original
    cluster FSID and all OSD data intact.
    
    Change-Id: Ied8ff94f25cd792a9be9f889bb6fdabc45a57f2e
    
  - Fix registry bootstrap values
    
    The quay.io/airshipit/kubernetes-entrypoint:v1.0.0 image format is
    deprecated and no longer supported by the docker registry.

    This is a temporary fix to download the image from a third-party
    repo until we update quay.io/airshipit/kubernetes-entrypoint:v1.0.0.
    
    The deprecation message is as follows:
    
    [DEPRECATION NOTICE] Docker Image Format v1 and Docker
    Image manifest version 2, schema 1 support is disabled
    by default and will be removed in an upcoming release.
    Suggest the author of quay.io/airshipit/kubernetes-entrypoint:v1.0.0
    to upgrade the image to the OCI Format or Docker Image
    manifest v2, schema 2. More information at
    https://docs.docker.com/go/deprecated-image-specs/
    
    The docker-registry container must not start before the
    docker-images PVC is bound.
    
    Change-Id: I6bff98aa7d0b23e13a17a038f3039b7956703d40
    
  - Fixing rolebindings generation for init container
    
    This part has to use the same configuration
    as the init container (see line 96).
    
    Change-Id: I06c1f3ad586863d4dcfab559d13a592fc576f857
    
  - Merge "Update Ceph images to patched 18.2.2 and restore debian-reef repo"
  - Update Ceph images to patched 18.2.2 and restore debian-reef repo
    
    This change updates the Ceph images to 18.2.2 images patched with a
    fix for https://tracker.ceph.com/issues/63684. It also reverts the
    package repository in the deployment scripts to use the debian-reef
    directory on download.ceph.com instead of debian-18.2.1. The issue
    with the repo that prompted the previous change to debian-18.2.1
    has been resolved and the more generic debian-reef directory may
    now be used again.
    
    Change-Id: I85be0cfa73f752019fc3689887dbfd36cec3f6b2
    
  - Include values_overrides for OpenStack components
    
    Fixes an issue where override files for OS charts were
    missing due to specifying the wrong project directory.
    
    Change-Id: I4af6715a33c7de43068ed76a8115c12a2c0969ed
    
  - Merge "bugfix: updated permissions of ceph user created to allow rbd profile"
  - [ceph-osd] Allow lvcreate to wipe existing LV metadata
    
    In some cases when OSD metadata disks are reused and redeployed,
    lvcreate can fail to create a DB or WAL volume because it overlaps
    an old, deleted volume on the same disk whose signature still
    exists at the offsets that trigger detection, aborting the LV
    creation process when the user is asked whether or not to wipe the
    old signature. Adding a --yes argument to the lvcreate command
    automatically answers yes to the wipe question and allows lvcreate
    to wipe the old signature.
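
    Illustrative command (the volume and group names are hypothetical):

      # --yes auto-confirms wiping any old signature found on the disk
      lvcreate --yes -n osd-db-lv -L 50g ceph-db-vg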
    
    Change-Id: I0d69bd920c8e62915853ecc3b22825fa98f7edf3
    
  - Workaround for debian-reef folder issue
    
    This PS changes the ceph repo to debian-18.2.1 from
    debian-reef due to some issues with the debian-reef
    folder at https://download.ceph.com/
    
    Change-Id: I31c501541b54d9253c334b56df975bddb13bbaeb
    
  - bugfix: updated permissions of ceph user created to allow rbd profile
    
    Change-Id: I9049e4312aa6cb92a832d5100ba1da995233c48e
    
  - [mariadb] Switch to ingress-less mariadb
    
    This PS switches mariadb to use the primary service by
    default instead of the ingress based deployment. The
    primary service is created and automatically updated
    based on the leader election process in the start.py
    entrypoint script.
    
    Mariadb primary service was introduced by this PS:
    
    https://review.opendev.org/c/openstack/openstack-helm-infra/+/905797
    
    Change-Id: I4992276d0902d277a7a81f2730c22635b15794b0
    
  - Merge "Remove unused nodesets"
  - Add compute-kit job with DPDK enabled
    
    + add role for enabling hugepages
    
    Change-Id: I89d3c09ea3bedcba6cb51178c8d1ac482a57af01
    Depends-On: I2f9d954258451f64eb87d03affc079b71b00f7bd
    
  - Merge "[deploy-env] Docker env setup"
  - Merge "Remove some aio jobs"
  - [deploy-env] Docker env setup
    
    This PS adds a connection reset for the ansible session,
    letting the zuul user use the newly installed docker
    environment without sudo.
    
    Change-Id: I37a2570f1dd58ec02338e07c32ec15eacbfaf4b6
    
  - Remove calico chart
    
    Tigera provides tools for managing Calico deployments (helm chart,
    operator and even a plain kubectl manifest). Also there are plenty
    of other networking solutions on the market, and users can choose
    the CNI implementation on their own.

    There have not been many contributions to this chart for quite some
    time, and we don't use this chart in any test jobs. In the
    deploy-env role we use the upstream Calico manifest.
    
    Change-Id: I6005e85946888c52e0d273c61d38f4787e43c20a
    
  - Remove unused nodesets
    
    Change-Id: Ifc5ea6a83729fc2313c209f683ef7476d6a14272
    
  - Remove some aio jobs
    
    These two jobs openstack-helm-infra-aio-monitoring and
    openstack-helm-infra-aio-logging were only needed for
    backward compatibility.
    
    Depends-On: I9c3b8cd18178aa57ce44564490ef1b61f275ae29
    Change-Id: I09d0e48128a3fd98fa9148b8e520df75d6e5be50
    
  - Merge "Bump Calico version to v3.27.0"
  - Fix prevent trailing whitespace lint command
    
    Recently we added a jpg file to the OSH documentation
    but the lint job didn't run due to the job configuration.

    Then for the next PR the lint job did run and failed
    due to trailing whitespace in the jpg file.
    
    Change-Id: I9abf8f93a4566411076190965f282375846dc5db
    
  - Bump Calico version to v3.27.0
    
    Change-Id: I8daa54e70c66cec41733d6b9fd5c9dd4597ff9c1
    
  - Merge "Use upstream ingress-nginx chart"
  - Use upstream ingress-nginx chart
    
    Change-Id: I90a1a1e27f0b821bbecfe493057eada81d4f9424
    
  - Merge "Use containerized Openstack client"
  - Merge "[openvswitch] Add overrides values for dpdk"
  - Use containerized Openstack client
    
    Change-Id: I17c841b74bf92fc3ac375404b27fa2562603604f
    
  - [openvswitch] Add overrides values for dpdk
    
    Change-Id: I756f35f1251244bc76f87a18a1a2e51f13a8c010
    
  - [ceph] Update Ceph images to Jammy and Reef 18.2.1
    
    This change updates all Ceph images in openstack-helm-infra to
    ubuntu_jammy_18.2.1-1-20240130.
    
    Change-Id: I16d9897bc5f8ca410059a5f53cc637eb8033ba47
    
  - [ceph-rook] Update Rook and increase ceph-mon memory limit
    
    This change updates Rook to the 1.13.3 release. It also increases
    the memory limit for ceph-mon pods deployed by Rook to prevent
    pod restarts due to liveness probe failures that sometimes result
    from probes causing ceph-mon pods to hit their memory limit.
    
    Change-Id: Ib7d28fd866a51cbc5ad0d7320ae2ef4a831276aa
    
  - Merge "[mariadb] Add mariadb-server-primary service"
  - [elasticsearch-exporter] Update to the latest v1.7.0
    
    The current version of the exporter is outdated; switch to the upstream
    + rename --es.snapshots to --collector.snapshots (v1.7.0) and
      --es.cluster_settings to --collector.clustersettings (v1.6.0)
    
    Change-Id: I4b496d859a4764fbec3271817391667a53286acd
    
  - [mariadb] Add mariadb-server-primary service
    
    This PS adds a mariadb-server-primary service that is created
    and automatically updated based on the leader election process in
    the start.py entrypoint script.
    
    Change-Id: I1d8a8db0ce8102e5e23f7efdeedd139726ffff28
    Signed-off-by: Sergiy Markin <[email protected]>
    
  - Merge "Change default ingress path type to prefix"
  - Change default ingress path type to prefix
    
    Due to CVE-2022-4886 the default pathType for an ingress should be
   …