Optimizing etcd Performance: Compaction, Defragmentation, and Tuning in OpenShift (4.16)

What static factors influence the load on etcd? (2 examples)

  • Number of nodes
  • Number of pods

What dynamic factors influence the load on etcd? (3 examples)

  • Changes in endpoints (pod scaling, HPA, etc.)
  • Pod restarts
  • Job executions

Does etcd maintain historical key values?

Yes. Etcd stores historical key values until compaction is performed.


What does history compaction in etcd do?

  • Removes old key versions
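
To make the idea of history and compaction concrete, here is a minimal toy model in Python. It is only a sketch: the class name ToyMVCCStore and its methods are invented for illustration and do not mirror etcd's real storage engine or API.

```python
class ToyMVCCStore:
    """Toy model of revisioned key history. Hypothetical, not etcd's real engine."""

    def __init__(self):
        self.revision = 0
        self.history = {}  # key -> list of (revision, value), oldest first

    def put(self, key, value):
        # Every write creates a new revision and keeps the older versions.
        self.revision += 1
        self.history.setdefault(key, []).append((self.revision, value))
        return self.revision

    def get(self, key, revision=None):
        # Return the value of the key as of the given revision (latest if None).
        rev = self.revision if revision is None else revision
        candidates = [v for r, v in self.history.get(key, []) if r <= rev]
        if not candidates:
            raise KeyError(f"{key!r} has no version at revision {rev}")
        return candidates[-1]

    def compact(self, revision):
        # Drop versions that were superseded at or before the compaction revision.
        for key, versions in self.history.items():
            older = [(r, v) for r, v in versions if r <= revision]
            newer = [(r, v) for r, v in versions if r > revision]
            # Keep the newest version at or before the compaction point
            # so the key still resolves from that revision onward.
            self.history[key] = older[-1:] + newer


store = ToyMVCCStore()
store.put("a", "v1")
store.put("a", "v2")
store.put("a", "v3")
old_value = store.get("a", revision=1)  # history is still readable here
store.compact(2)                        # the version from revision 1 is removed
```

After compact(2), reading the key at revision 1 fails in this toy model, while the latest value is unaffected.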

What is the effect of etcd history compaction?

  • Reduces database size
  • Improves performance

Does history compaction return reclaimed space to the filesystem?

No. It only marks the space as unused. Therefore, defragmentation is necessary.


Does OpenShift automatically perform compaction of etcd?

Yes. Etcd history is compacted automatically every 5 minutes.


Why is defragmentation of etcd important?

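It improves performance: compaction leaves unused space inside the database file, and defragmentation shrinks the file by reclaiming it.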


What does defragmentation in etcd do?

  • Reclaims free space in the etcd database
  • Makes the space available to the filesystem

Does OpenShift automatically perform defragmentation of etcd?

Yes - the cluster etcd operator performs it automatically.


Can defragmentation of etcd be performed manually?

Yes, using the command etcdctl defrag
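
When defragmenting by hand, members are usually handled one at a time. The sketch below orders the member pods so that the current leader comes last. This ordering is a commonly recommended practice, not something stated above, and the pod names and the is_leader field are hypothetical inputs (in practice they could be read from the output of etcdctl endpoint status).

```python
def defrag_order(members):
    """Return pod names with followers first and the leader last.

    `members` is a list of dicts with the hypothetical keys "pod" and "is_leader".
    """
    # sorted() is stable and False sorts before True, so followers keep
    # their relative order and the leader moves to the end.
    return [m["pod"] for m in sorted(members, key=lambda m: m["is_leader"])]


members = [
    {"pod": "etcd-master-0", "is_leader": False},
    {"pod": "etcd-master-1", "is_leader": True},
    {"pod": "etcd-master-2", "is_leader": False},
]
for pod in defrag_order(members):
    print(f"defragment {pod} with etcdctl defrag, then wait before the next member")
```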


Does OpenShift have mechanisms suggesting when defragmentation of etcd should be performed?

Yes - Prometheus alerting rules (surfaced through Alertmanager) compare the space actually in use with the total size of the etcd database file.
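
The comparison can be sketched as follows. The two inputs correspond to the etcd metrics etcd_mvcc_db_total_size_in_bytes and etcd_mvcc_db_total_size_in_use_in_bytes. The default thresholds (less than 50% in use, more than 100 MiB in use) are illustrative assumptions and should be checked against the alert rules shipped with your cluster.

```python
MIB = 1024 * 1024


def fragmentation_report(db_size_bytes, in_use_bytes,
                         max_in_use_ratio=0.5, min_in_use_bytes=100 * MIB):
    """Summarize how fragmented an etcd database file is.

    The thresholds are assumptions for illustration, not authoritative values.
    """
    if db_size_bytes <= 0 or not 0 <= in_use_bytes <= db_size_bytes:
        raise ValueError("expected 0 <= in_use_bytes <= db_size_bytes and db_size_bytes > 0")
    in_use_ratio = in_use_bytes / db_size_bytes
    return {
        # Space that defragmentation could hand back to the filesystem.
        "reclaimable_bytes": db_size_bytes - in_use_bytes,
        "in_use_ratio": in_use_ratio,
        # Flag only databases that are both sparse and large enough to matter.
        "defrag_suggested": in_use_ratio < max_in_use_ratio
        and in_use_bytes > min_in_use_bytes,
    }


report = fragmentation_report(db_size_bytes=2000 * MIB, in_use_bytes=800 * MIB)
print(report)
```

The minimum-size guard keeps small databases from being flagged, because defragmenting them reclaims little space.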


Can the settings for automatic defragmentation be changed?

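No. The defragmentation thresholds are constants in the cluster etcd operator source code and are not exposed as settings: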
https://access.redhat.com/solutions/5564771

https://github.com/openshift/cluster-etcd-operator/blob/release-4.16/pkg/operator/defragcontroller/defragcontroller.go#L28C1-L28C6


Can defragmentation impact cluster operation?

Yes - an etcd member blocks reads and writes while it is being defragmented, which can lead to API interruptions.


Can automatic defragmentation of the etcd database be disabled?

Yes - by creating an empty config map:
oc create configmap etcd-disable-defrag -n openshift-etcd-operator

https://access.redhat.com/solutions/6960380


What can be done to ensure that etcd database defragmentation does not impact cluster stability?

Regular node reboots can be performed.

During node startup, the etcd operator may detect the need for defragmentation and execute it. This happens at a stage when the node is not yet serving business workloads.

For example, if a customer regularly updates OpenShift to newer z-stream versions (the z in x.y.z), this process occurs transparently.


Why is defragmentation necessary even when using SSD/NVMe disks (which don’t have seek-time issues)?

Defragmentation is done at the database level, not the filesystem level. A smaller database performs better because its internal data structures are cheaper to manage.


What should be monitored in the context of etcd performance?

  • WAL fsync duration (time to flush write-ahead log entries from RAM to disk)
  • Number of leader changes
  • Network latency between etcd members
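
The three signals above map onto standard etcd Prometheus metrics. The sketch below pairs a PromQL expression with a limit for each one. The limits (10 ms WAL fsync p99, no leader changes in the last hour, 50 ms peer round-trip p99) are illustrative assumptions and should be adjusted to your environment.

```python
# PromQL expressions built from standard etcd metric names.
ETCD_QUERIES = {
    "wal_fsync_p99_seconds": (
        "histogram_quantile(0.99, "
        "rate(etcd_disk_wal_fsync_duration_seconds_bucket[5m]))"
    ),
    "leader_changes_1h": "increase(etcd_server_leader_changes_seen_total[1h])",
    "peer_rtt_p99_seconds": (
        "histogram_quantile(0.99, "
        "rate(etcd_network_peer_round_trip_time_seconds_bucket[5m]))"
    ),
}

# Illustrative limits (assumptions, not authoritative values).
LIMITS = {
    "wal_fsync_p99_seconds": 0.010,
    "leader_changes_1h": 0,
    "peer_rtt_p99_seconds": 0.050,
}


def breached(samples, limits=LIMITS):
    """Return the sorted names of signals whose sampled value exceeds its limit."""
    return sorted(
        name for name, value in samples.items()
        if name in limits and value > limits[name]
    )


# Hypothetical sample values, as if returned by the queries above.
samples = {
    "wal_fsync_p99_seconds": 0.025,
    "leader_changes_1h": 0,
    "peer_rtt_p99_seconds": 0.012,
}
print(breached(samples))
```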

What tuning parameters are key for etcd?

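  • etcd database size (the backendQuotaGiB field of the etcd/cluster resource)
  • Hardware speed tolerance (the controlPlaneHardwareSpeed field of the etcd/cluster resource)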

Where are tuning parameters for etcd set?

oc edit etcd/cluster 


What is hardware speed tolerance in etcd?

A parameter defining tolerance for hardware delays during data writes.


When should hardware speed tolerance be considered for modification?

  • High write latency on disk
  • Performance issues with etcd

Why modify hardware speed tolerance?

To adjust etcd to operate in environments with slower or unstable disks. For example, it changes timeout values such as the heartbeat interval and the leader election timeout.


In which environments may hardware speed tolerance need to be set to Slower?

  • Installations with network disks, e.g., iSCSI
  • Virtualized environments (frequently)
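
As a sketch of how such a change could be prepared, the helper below builds an oc patch command for the etcd/cluster resource. It assumes the field is spec.controlPlaneHardwareSpeed and that the accepted values are "", "Standard", and "Slower"; verify both against the documentation for your OpenShift version. The function name is hypothetical.

```python
import json

# Assumed set of accepted values; "" means the platform default.
ALLOWED_SPEEDS = {"", "Standard", "Slower"}


def hardware_speed_patch_command(speed):
    """Build (but do not run) an oc command that sets the hardware speed tolerance."""
    if speed not in ALLOWED_SPEEDS:
        raise ValueError(f"unsupported speed {speed!r}, expected one of {sorted(ALLOWED_SPEEDS)}")
    patch = json.dumps({"spec": {"controlPlaneHardwareSpeed": speed}})
    return ["oc", "patch", "etcd/cluster", "--type=merge", "-p", patch]


# Print the command for review instead of executing it.
print(" ".join(hardware_speed_patch_command("Slower")))
```

Returning the argument list rather than running it leaves the decision to apply the change to the operator, since it triggers a rollout of etcd.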

Why increase the etcd database size?

To store more data (for large clusters)


In what phase is the etcd database expansion functionality?

Technology Preview


How much RAM must the master node have in relation to the etcd database size?

The recommendation is at least 3 times the size of the etcd database.
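
The rule of thumb is simple arithmetic. The sketch below applies it to a database size in GiB; the function name is hypothetical, and the result is a lower bound that ignores the memory needed by other control plane components.

```python
def min_control_plane_ram_gib(etcd_db_size_gib, factor=3):
    """Lower bound on master node RAM implied by the 3x rule of thumb."""
    if etcd_db_size_gib <= 0:
        raise ValueError("etcd_db_size_gib must be positive")
    return etcd_db_size_gib * factor


for size_gib in (8, 16, 32):
    print(f"{size_gib} GiB etcd database: at least {min_control_plane_ram_gib(size_gib)} GiB RAM")
```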


Why does a larger etcd database affect master node RAM requirements?

The API Server Cache will need to cache more elements.


What can a slow disk used by etcd affect?

  • Performance
  • Stability

What is critical for the performance of etcd?

Low disk latency (SSD or NVMe)


What can happen with high disk latency?

  • Leader loss
  • Timeouts
  • Slower OpenShift API operations
  • Impact on all cluster applications

What can cause high etcd latencies?

  • Other processes with intensive I/O
  • High network latency

Can OpenShift have a dedicated disk for etcd?

Yes


Where are procedures and recommendations about etcd for OpenShift available?

https://docs.redhat.com/en/documentation/openshift_container_platform/4.16/html/scalability_and_performance/recommended-performance-and-scalability-practices#recommended-etcd-practices
