Clearing the OpenShift log data using the curator pod (OpenShift 3.6)


In our test OpenShift cluster, we had a disk space problem on the infra node. The df -h output below shows the numbers.

Previously:

Filesystem Size Used Avail Use% Mounted on
/dev/mapper/rhel-root 50G 2.0G 49G 4% /
devtmpfs 7.8G 0 7.8G 0% /dev
/dev/sdd2 50G 50G 44K 100% /srv/logging
/dev/sdc1 300G 14G 287G 5% /srv/nfs
/dev/sdd1 51G 33M 51G 1% /srv/metrics
/dev/sda1 1014M 186M 829M 19% /boot
/dev/mapper/rhel-var 15G 12G 3.7G 76% /var
/dev/mapper/rhel-tmp 1014M 33M 982M 4% /tmp
/dev/mapper/rhel-usr_local_bin 1014M 33M 982M 4% /usr/local/bin
tybsrhosinode01.defence.local:/srv/nfs/registry-storage 300G 14G 287G 5% /var/lib/origin/openshift.local.volumes/pods/f53d5429-fd45-11e8-a82a-0050569897ab/volumes/kubernetes.io~nfs/vol


As you can see, /srv/logging was 100% full: logging was eating the entire 50G volume because we were not forwarding the log data to an external log aggregator.

Here is what I did to clear this huge pile of log data on this OpenShift 3.6 cluster using the curator pod. In a similar situation, you may want to back up your log data before following the steps below.

1- Log in to your master node

2- Switch to the “logging” project/namespace
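Once you are on the master, switching to the project is a single command, and confirming it afterwards doesn't hurt:

oc project logging   # switch to the aggregated-logging project
oc project           # confirm the current project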

3- oc get pods will output something like this

NAME READY STATUS RESTARTS AGE
logging-curator-1-8jbjc 1/1 Running 0 32m
logging-es-data-master-1rfk7xci-6-xbthc 1/1 Running 263 71d
logging-fluentd-5pfll 1/1 Running 3 1y
logging-fluentd-gpqmh 1/1 Running 2 62d
logging-fluentd-r2hjr 1/1 Running 18 1y
logging-fluentd-tznxt 1/1 Running 1 67d
logging-kibana-1-4hlzq 2/2 Running 2 166d

Note the curator pod (logging-curator-1-8jbjc here); we will recycle it in step 6.

4- oc get cm will output something like below

NAME DATA AGE
logging-curator 1 1y
logging-elasticsearch 2 1y
logging-fluentd 3 1y

5- oc edit cm logging-curator (edit the cm for the curator)

It will be something like this:
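I don't have the exact contents from our cluster handy, but in OpenShift 3.x the logging-curator configmap carries a config.yaml key that ships mostly commented out. As a rough sketch (the retention values below are illustrative, not the ones from our setup):

data:
  config.yaml: |
    # Uncommenting .defaults makes curator prune indices older than the
    # given age on its daily run.
    .defaults:
      delete:
        days: 30
      runhour: 0
      runminute: 0

    # Optional: separate retention for the .operations indices
    #.operations:
    #  delete:
    #    weeks: 8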

Here, what I did was to uncomment the “.defaults” section of the configuration.

6- oc delete pod logging-curator-1-8jbjc (Delete the curator pod)

The curator pod gets deleted, its deployment config spins up a replacement, and the new pod picks up the updated configmap and starts clearing the log data.
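To verify, watch for the replacement pod and check its logs; the pod name below is only an example, yours will have a different suffix:

oc get pods                        # wait for the new logging-curator pod to be Running
oc logs logging-curator-1-xxxxx    # look for messages about indices being deleted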

Now:

Filesystem Size Used Avail Use% Mounted on
/dev/mapper/rhel-root 50G 2.0G 49G 4% /
devtmpfs 7.8G 0 7.8G 0% /dev
/dev/sdd2 50G 1.2G 49G 3% /srv/logging
/dev/sdc1 300G 14G 287G 5% /srv/nfs
/dev/sdd1 51G 33M 51G 1% /srv/metrics
/dev/sda1 1014M 186M 829M 19% /boot
/dev/mapper/rhel-var 15G 12G 3.9G 75% /var
/dev/mapper/rhel-tmp 1014M 33M 982M 4% /tmp
/dev/mapper/rhel-usr_local_bin 1014M 33M 982M 4% /usr/local/bin
tybsrhosinode01.defence.local:/srv/nfs/registry-storage 300G 14G 287G 5% /var/lib/origin/openshift.local.volumes/pods/f2c43ef8-fd53-11e8-a82a-0050569897ab/volumes/kubernetes.io~nfs/vol

For more OpenShift stuff see my devops sub-blog.

Hope this helps.
Good Luck,
Serdar