17 July 2020
Ivan Mikheykin, software engineer

Kubernetes operators made easy with shell-operator: project status & news

Operators are an excellent choice when you need to expand the capabilities of Kubernetes. No wonder this approach to automation has become really popular among K8s users. Last year, we announced shell-operator, a tool that significantly simplifies the process of creating Kubernetes operators.

To make it possible, we have implemented a framework allowing you to run custom scripts (written in Bash, Python, etc.) triggered by specific events in the K8s cluster.

Over the past year, shell-operator has gained loyal users (see details below) and received new features and enhancements. With the recent v1.0.0-beta.11 release behind us (our rationale for keeping the beta status is provided below), we have decided to outline the current state of the project and cover the features implemented since the first public version.

Design and purpose

But let’s start with a brief explanation of how shell-operator works and what it is meant for.

Shell-operator runs in a pod in the Kubernetes cluster. It consists of:

  • a Go binary that subscribes to API events and runs hooks (while passing them details about the event);
  • a set of hooks, where each hook can be a Bash or Python script, or any other executable file.

At the same time, hooks:

  • determine which events they need and for which Kubernetes objects;
  • perform the required actions if these events occur in K8s.

Thus, shell-operator is a bridge between Kubernetes API events and scripts that process them.
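To make this bridge more tangible, here is a minimal sketch of a hook written in Python. Treat it as an illustration rather than code taken from the project: the binding name watch-pods is hypothetical, and the --config convention, the BINDING_CONTEXT_PATH variable, and the binding context fields are assumptions that follow the project's README, so check them against the version you run.

```python
#!/usr/bin/env python3
"""Minimal shell-operator hook sketch; field names are assumptions."""
import json
import os
import sys

# Assumed hook configuration: ask to be run when a Pod is added.
HOOK_CONFIG = {
    "configVersion": "v1",
    "kubernetes": [
        {
            "name": "watch-pods",  # hypothetical binding name
            "apiVersion": "v1",
            "kind": "Pod",
            "executeHookOnEvent": ["Added"],
        }
    ],
}


def describe(contexts):
    """Return one human-readable line per binding context entry."""
    lines = []
    for ctx in contexts:
        if ctx.get("type") == "Event":
            meta = ctx.get("object", {}).get("metadata", {})
            lines.append(
                f"{ctx.get('watchEvent')}: "
                f"{meta.get('namespace')}/{meta.get('name')}"
            )
        else:
            count = len(ctx.get("objects", []))
            lines.append(f"{ctx.get('type')}: {count} object(s)")
    return lines


def main(argv):
    if "--config" in argv[1:]:
        # Step 1: tell shell-operator which events this hook needs.
        print(json.dumps(HOOK_CONFIG))
        return
    # Step 2: the hook was triggered; event details arrive in a JSON file.
    path = os.environ.get("BINDING_CONTEXT_PATH")
    if not path:
        return  # not started by shell-operator, nothing to process
    with open(path) as f:
        for line in describe(json.load(f)):
            print(line)


if __name__ == "__main__":
    main(sys.argv)
```

The same two-step contract (declare the events, then react to them) applies to hooks in any language; only the body of describe would change for a real task.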

More details on how you can use shell-operator to create your K8s operators are available in the project’s announcement as well as its README.

Why did we even create shell-operator in the first place? In Kubernetes, operators act as a common pattern for the “correct automation.” However, developing a fully-featured operator (i.e. written in Go and built with the appropriate SDK) is not so easy. The basic framework that shell-operator provides significantly lowers the barrier to entry, allowing you to solve small operational tasks inside the cluster quickly and effectively. And, just as important, it offers the correct way to do that.

What are those tasks exactly? You can find examples of using shell-operator in the project repository. We at Flant use it as a library as well (yes, there is such a possibility!): shell-operator serves as a basis for addon-operator that manages additional components for a Kubernetes cluster.

NB: Here is our announcement of this other Open Source project, addon-operator, for those who might be interested.

Now let us proceed to the most crucial changes that shell-operator has undergone over the past year!

Main improvements

In the early versions of shell-operator, a hook could access only one object: the one bound to the event in the cluster. As hooks evolved within addon-operator, they gained the ability to subscribe to object changes. However, they still had to call kubectl to get an up-to-date list of other objects. To get rid of excessive kubectl calls and, thus, speed up the process, we have implemented several ways to access up-to-date lists of objects:

  • The Synchronization + Event mode. The hook gets a list of current objects at the start and then works with a single object. This mode is enabled by default (it looks similar to the reconcile loop in operator-sdk).
  • The snapshots mode. The hook gets a complete list of current objects at each start. (A snapshot is a list of cached Kubernetes objects used in hooks.)
  • The grouped snapshots mode. It is best suited for cases when a hook is subscribed to different types of resources and has to respond to changes based on up-to-date information about all of them, regardless of what has changed.
  • Also, you can watch a resource without reacting to its changes (in other words, “accumulate a snapshot”). For example, a hook can respond to changes in a CustomResource and, at the same time, get the current ConfigMap object without an extra invocation of kubectl. (See the executeHookOnSynchronization and executeHookOnEvent flags for more information.)
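The modes above can be sketched from the hook's point of view. The snippet below is a hedged illustration, not the project's code: the binding names crontabs and settings and the CronTab kind are hypothetical, and the includeSnapshotsFrom option and the objects, object, and snapshots fields of the binding context are assumptions about the v1 hook format that should be verified against the documentation for your release.

```python
# Assumed configuration: react to a custom resource, and only accumulate
# a snapshot of ConfigMaps without being triggered by them.
HOOK_CONFIG = {
    "configVersion": "v1",
    "kubernetes": [
        {
            "name": "crontabs",  # hypothetical binding name
            "apiVersion": "stable.example.com/v1",  # hypothetical CRD group
            "kind": "CronTab",
            "includeSnapshotsFrom": ["settings"],  # assumed option name
        },
        {
            "name": "settings",  # hypothetical binding name
            "apiVersion": "v1",
            "kind": "ConfigMap",
            "executeHookOnSynchronization": False,
            "executeHookOnEvent": [],  # watch, but never trigger the hook
        },
    ],
}


def object_names(entries):
    """Extract metadata.name from entries shaped like {"object": {...}}."""
    return [entry["object"]["metadata"]["name"] for entry in entries]


def summarize(ctx):
    """Describe what a single binding context entry lets the hook see."""
    summary = {"binding": ctx.get("binding"), "type": ctx.get("type")}
    if ctx.get("type") == "Synchronization":
        # At start-up: the full list of currently existing objects.
        summary["current"] = object_names(ctx.get("objects", []))
    elif ctx.get("type") == "Event":
        # Afterwards: the single object that has changed.
        summary["changed"] = ctx["object"]["metadata"]["name"]
    # Cached objects from other bindings, keyed by binding name.
    summary["snapshots"] = {
        name: object_names(items)
        for name, items in ctx.get("snapshots", {}).items()
    }
    return summary
```

With such a layout, the hook handling a CronTab event reads the ConfigMap from the snapshot that came with the event instead of calling kubectl.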

Other notable new features:

  • Thanks to switching to the dynamic Kubernetes client, shell-operator can now subscribe to any existing kind (Kubernetes resource type), including Custom Resources.
  • You can run hooks in different queues (see the queue parameter). Later, we also added a command and endpoints for examining the state of hook queues.
  • Hooks can now subscribe to multiple resource names.
  • Hooks can now subscribe to “dynamic namespaces,” i.e. to watch the resources in namespaces having specific labels.
  • Hooks can now export custom metrics for scraping by Prometheus. You can specify a dedicated port for those metrics.
  • A specific framework that simplifies writing shell-based hooks has been added.
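As a rough illustration of two of these features, the sketch below shows a binding that uses a separate queue and dynamic namespaces, followed by a helper that exports custom metrics. Everything specific here is an assumption to be checked against the documentation: the nesting of the namespace labelSelector, the METRICS_PATH environment variable, and the JSON-lines format with name, set/add, and labels keys. The binding, queue, label, and metric names are hypothetical.

```python
import json
import os

# Assumed binding: Secrets in namespaces labelled team=payments,
# processed in a dedicated queue.
HOOK_CONFIG = {
    "configVersion": "v1",
    "kubernetes": [
        {
            "name": "payment-secrets",  # hypothetical binding name
            "apiVersion": "v1",
            "kind": "Secret",
            "queue": "secrets",  # hypothetical queue name
            "namespace": {
                "labelSelector": {"matchLabels": {"team": "payments"}}
            },
        }
    ],
}


def metric_line(name, value, labels=None, op="set"):
    """Serialize one metric operation as a JSON object on a single line."""
    if op not in ("set", "add"):
        raise ValueError(f"unsupported operation: {op}")
    return json.dumps({"name": name, op: value, "labels": labels or {}})


def render_metrics(metrics):
    """Join metric operations into JSON-lines text, one per line."""
    return "".join(metric_line(**m) + "\n" for m in metrics)


def export_metrics(metrics):
    """Append metrics to the file named by METRICS_PATH (assumed variable)."""
    path = os.environ.get("METRICS_PATH")
    if path:
        with open(path, "a") as f:
            f.write(render_metrics(metrics))
```

A hook would call export_metrics at the end of its run, for example with [{"name": "secrets_seen", "value": 3}]; outside shell-operator, where the variable is not set, the call does nothing.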

Less significant changes

  • Hooks can now return the configuration in YAML format (in addition to JSON).
  • Logging in JSON format using logrus was added (see the LOG_TYPE environment variable).
  • The listen-address setting was added for hooks to run in the hostNetwork: true mode.
  • Rate limit (qps, burst) settings were added for the Kubernetes API client.
  • Now you can specify the address of the Kubernetes API server via the kube-server flag.
  • Various constraints to speed up the simultaneous running of hooks were removed.
  • jqFilter expressions are now run using the libjq-go library, which avoids the need to run an additional jq process.
  • The internal zombie reaper, which processed SIGCHLD signals and reaped orphan processes (a possible by-product of running Bash scripts), was removed. Shell-operator now relies on tini to reap zombies.
  • Using shell-operator as a library was streamlined.
  • The kubectl version was updated (from 1.13 to 1.17.4), and the build is now based on Alpine 3.11.

Current state and future plans

The shell-operator project is still in beta. Despite this, we use it heavily as the basis for addon-operator, a tool we continuously operate in multiple (100+) Kubernetes clusters.

In order to release a stable version of shell-operator as a public project, we are planning to implement (at minimum):

  • e2e testing (#63),
  • a multi-architecture build process (#184),
  • an update of client-go to 0.18.0, support for context, and (finally) caching of objects in client-go (#188).

Adoption by the community

Over the last year, we have seen clear signs of community interest in shell-operator:

  • The project has been included in various lists of useful Kubernetes tools (awesome-kubernetes, Cloud Zone), as well as mentioned during webinars (Weaveworks), meet-ups (K8s Meetup Tokyo), and even in books.
  • It has found practical use in real-life projects. Of those that we know about, the most exciting one is the installer for the KubeSphere platform. There are also many basic operators on GitHub that use the shell-operator framework (here is a small list).
  • Third-party contributors to the project have emerged as well: their contribution is modest so far, but we welcome everyone who wants to participate in the development.
  • Currently, the project has over 600 stars on GitHub, and we would love to see new ones! 😉

Please also note that we are going to present shell-operator at the KubeCon + CloudNativeCon Europe 2020 virtual event this coming August. More details are available here.

Thank you for your interest in shell-operator! If you have any questions, please do not hesitate to ask them in the comments.

Afterword

This article was originally posted on Medium. New texts from our engineers are published here, on blog.flant.com. Please follow us on Twitter or subscribe by email below to get the latest updates!