    by

    Sandor Guba

    Published on 02/21/2022
    Last updated on 06/18/2024

    Get More Out of Your Kubernetes Events

    Kubernetes has become the go-to platform for hosting container-based applications. Although Kubernetes is widely adopted, some benefits of running your applications on it are still little known. This post shows why Kubernetes events are so important and how they help tackle both simple and complex problems. Before we dig deeper, let's get an overview of what Kubernetes events are.

    Kubernetes events

    We have an earlier blog post about Kubernetes events to get your feet wet, so here is just a quick overview of what events are. Kubernetes is built on a set of controllers that keep the state of the system in sync with the resource definitions. These controllers communicate with users via events. The most basic example is what happens when you create a Deployment:
    • The controller manager creates a ReplicaSet for that Deployment.
    • The ReplicaSet determines how many pods it should deploy.
    • The scheduler (another controller) assigns pods to nodes.
    • Finally, the kubelet (a per-node controller) runs the containers.
    As you can see, even a simple deployment passes through several controllers, and a lot can go wrong along the way. Both the success and the failure of an operation result in an event in Kubernetes. You can check those events via kubectl or your preferred GUI or CLI tool. If you don't want to filter events, you can simply use:
    kubectl get events
    and the result will be something similar to this:
    LAST SEEN   TYPE      REASON              OBJECT                                               SUBOBJECT                          SOURCE                                                                             MESSAGE                                                                                                                                                                                         FIRST SEEN   COUNT   NAME
    3m7s        Normal    LeaderElection      configmap/banzaicloud-thanos-operator                                                   one-eye-thanos-operator-6467d7bd65-8xb27_01984c3f-b24d-4ebe-8156-d9a321a3a5d5      one-eye-thanos-operator-6467d7bd65-8xb27_01984c3f-b24d-4ebe-8156-d9a321a3a5d5 became leader                                                                                                     3m7s         1       banzaicloud-thanos-operator.16cb552d7b33e74b
    3m7s        Normal    LeaderElection      lease/banzaicloud-thanos-operator                                                       one-eye-thanos-operator-6467d7bd65-8xb27_01984c3f-b24d-4ebe-8156-d9a321a3a5d5      one-eye-thanos-operator-6467d7bd65-8xb27_01984c3f-b24d-4ebe-8156-d9a321a3a5d5 became leader                                                                                                     3m7s         1       banzaicloud-thanos-operator.16cb552d7b341142
    2m39s       Normal    LeaderElection      configmap/banzaicloud-thanos-operator                                                   one-eye-thanos-operator-6467d7bd65-8xb27_6b649c1c-1cc3-47cf-ae12-894b19b4ee99      one-eye-thanos-operator-6467d7bd65-8xb27_6b649c1c-1cc3-47cf-ae12-894b19b4ee99 became leader                                                                                                     2m39s        1       banzaicloud-thanos-operator.16cb5533d626f885
    2m39s       Normal    LeaderElection      lease/banzaicloud-thanos-operator                                                       one-eye-thanos-operator-6467d7bd65-8xb27_6b649c1c-1cc3-47cf-ae12-894b19b4ee99      one-eye-thanos-operator-6467d7bd65-8xb27_6b649c1c-1cc3-47cf-ae12-894b19b4ee99 became leader                                                                                                     2m3
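The table view gets hard to read quickly, so it can help to post-process the JSON form of the same list. The sketch below assumes the output of `kubectl get events -o json` (a list object with an `items` array) has already been loaded; the sample events and the `warnings_by_object` helper are hypothetical and only illustrate grouping Warning events by the object they refer to.

```python
import json
from collections import defaultdict

# Hypothetical sample shaped like `kubectl get events -o json` output.
SAMPLE = """
{"items": [
  {"type": "Normal",  "reason": "Scheduled",
   "involvedObject": {"kind": "Pod", "namespace": "default", "name": "web-1"}},
  {"type": "Warning", "reason": "BackOff",
   "involvedObject": {"kind": "Pod", "namespace": "default", "name": "web-1"}},
  {"type": "Warning", "reason": "FailedMount",
   "involvedObject": {"kind": "Pod", "namespace": "default", "name": "web-2"}}
]}
"""


def warnings_by_object(event_list):
    """Group the reasons of Warning events by kind/namespace/name."""
    grouped = defaultdict(list)
    for event in event_list.get("items", []):
        if event.get("type") != "Warning":
            continue
        obj = event.get("involvedObject", {})
        key = f'{obj.get("kind")}/{obj.get("namespace")}/{obj.get("name")}'
        grouped[key].append(event.get("reason"))
    return dict(grouped)


if __name__ == "__main__":
    print(warnings_by_object(json.loads(SAMPLE)))
```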

    Structure of an event

    Events are regular API objects, so you can fetch a single event by name and print its full definition as YAML:
    kubectl get event one-eye-thanos-operator.16cb552b0653a67d -o yaml
    apiVersion: v1
    count: 1
    eventTime: null
    firstTimestamp: "2022-01-18T10:02:12Z"
    involvedObject:
      apiVersion: apps/v1
      kind: Deployment
      name: one-eye-thanos-operator
      namespace: default
      resourceVersion: "4521231"
      uid: ee22d555-1bdf-4424-a1ea-19a1382c958d
    kind: Event
    lastTimestamp: "2022-01-18T10:02:12Z"
    message: Scaled up replica set one-eye-thanos-operator-6467d7bd65 to 1
    metadata:
      creationTimestamp: "2022-01-18T10:02:12Z"
      name: one-eye-thanos-operator.16cb552b0653a67d
      namespace: default
      resourceVersion: "4521234"
      selfLink: /api/v1/namespaces/default/events/one-eye-thanos-operator.16cb552b0653a67d
      uid: e5cf909c-53c2-4e5e-be4c-af92a956a12c
    reason: ScalingReplicaSet
    reportingComponent: ""
    reportingInstance: ""
    source:
      component: deployment-controller
    type: Normal
    As you can see, Kubernetes events are resources just like deployments or pods. They have the same version and metadata fields. However, they have a couple of event-specific fields as well. The most important ones are:
    • eventTime: the timestamp of an atomic event
    • firstTimestamp: the first timestamp of a continuous event
    • lastTimestamp: the last timestamp of a continuous event
    • count: the number of times this event was triggered
    • message: a human-readable message
    • reason: a short description of the event
    • involvedObject: a reference to the Kubernetes resource the event is related to
    • metadata: the event's own metadata, including name, uid, and so on
    • source: the component that reported the event
    • type: the event type, such as Normal or Warning
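Because an event carries either eventTime or the firstTimestamp/lastTimestamp pair, any consumer has to normalize the two shapes before it can sort or plot events. The following is a minimal sketch of that normalization, not part of any Kubernetes client; `parse_ts` and `event_time_range` are hypothetical helper names, and the sample event reuses the values shown above.

```python
from datetime import datetime, timezone


def parse_ts(value):
    """Parse an RFC 3339 timestamp such as 2022-01-18T10:02:12Z; None stays None."""
    if not value:
        return None
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def event_time_range(event):
    """Return (start, end) for an event.

    Atomic events use eventTime for both ends; continuous events use
    firstTimestamp and lastTimestamp, falling back to whichever is present.
    """
    atomic = parse_ts(event.get("eventTime"))
    if atomic is not None:
        return atomic, atomic
    first = parse_ts(event.get("firstTimestamp"))
    last = parse_ts(event.get("lastTimestamp"))
    return first or last, last or first


EVENT = {
    "eventTime": None,
    "firstTimestamp": "2022-01-18T10:02:12Z",
    "lastTimestamp": "2022-01-18T10:02:12Z",
    "count": 1,
    "reason": "ScalingReplicaSet",
}

if __name__ == "__main__":
    start, end = event_time_range(EVENT)
    print(start.isoformat(), end.isoformat())
```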

    To store or not to store

    Events are garbage collected by the Kubernetes API server after a short period of time. This TTL is configurable; a typical value is one hour, but there are exceptions, such as 5 minutes on EKS. Still, events can be really useful when debugging what happened in your cluster, which is why storing them is a common practice. The difficulty is that events do not fit neatly into one category: they are not really metrics, they differ a bit from logs, and they have some trace-like properties as well.

    Store events as logs

    A trivial approach is to store events as logs. There are some problems with this approach, though: events have fields that connect different components. If you treat events like standard log lines and ingest them into a log database like Loki, you lose a lot of that information. Of course, it is possible to retrieve it later at query time, but you need to be prepared to parse those fields from your raw data.
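One way to soften this problem is to emit each event as a structured log record rather than a free-text line, so the linking fields stay queryable. The sketch below is an illustration under that assumption; the record layout and the `to_log_record` helper are hypothetical and not a format required by Loki or any other log store.

```python
import json


def to_log_record(event):
    """Flatten an event into one structured log record, keeping the linking fields."""
    obj = event.get("involvedObject") or {}
    source = event.get("source") or {}
    return {
        "timestamp": event.get("eventTime") or event.get("lastTimestamp"),
        "level": "warning" if event.get("type") == "Warning" else "info",
        "reason": event.get("reason"),
        "message": event.get("message"),
        "count": event.get("count"),
        # These fields tie the record back to the resource it describes.
        "object_kind": obj.get("kind"),
        "object_namespace": obj.get("namespace"),
        "object_name": obj.get("name"),
        "object_uid": obj.get("uid"),
        "source_component": source.get("component"),
    }


EVENT = {
    "eventTime": None,
    "lastTimestamp": "2022-01-18T10:02:12Z",
    "type": "Normal",
    "reason": "ScalingReplicaSet",
    "message": "Scaled up replica set one-eye-thanos-operator-6467d7bd65 to 1",
    "count": 1,
    "involvedObject": {
        "kind": "Deployment",
        "namespace": "default",
        "name": "one-eye-thanos-operator",
        "uid": "ee22d555-1bdf-4424-a1ea-19a1382c958d",
    },
    "source": {"component": "deployment-controller"},
}

if __name__ == "__main__":
    print(json.dumps(to_log_record(EVENT), sort_keys=True))
```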

    Store events as metrics

    As events have a lot of simple attributes (like reason), they can be translated into metrics. A reasonable transformation is to use the reason field as the metric name and the count field as the value. All the other relevant attributes can become labels on the metric. From this information you can create a nice overview of what's happening in your cluster. This provides an overall health indicator, yet you lose a lot of important information, because time series databases don't handle high-cardinality data well. If you need more than aggregated values, such as the message or the name field, every event becomes its own time series. That does not sound good, does it?
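The cardinality problem is easy to see with a toy model of a metrics store. The sketch below assumes a series is identified by the metric name (the reason) plus its label values; the `to_series` and `lookup` helpers and the sample events are hypothetical and do not reflect any particular time series database.

```python
from collections import Counter


def lookup(event, dotted_path):
    """Read a nested field such as 'involvedObject.kind'; missing fields give ''."""
    value = event
    for part in dotted_path.split("."):
        value = value.get(part, "") if isinstance(value, dict) else ""
    return value


def to_series(events, label_paths):
    """Sum event counts per (reason, label values) combination."""
    series = Counter()
    for event in events:
        labels = tuple((path, lookup(event, path)) for path in label_paths)
        series[(event["reason"], labels)] += event.get("count") or 1
    return series


# Hypothetical events: the same failure on three different pods.
EVENTS = [
    {"reason": "BackOff", "type": "Warning", "count": 4,
     "involvedObject": {"kind": "Pod", "name": "web-1"}},
    {"reason": "BackOff", "type": "Warning", "count": 2,
     "involvedObject": {"kind": "Pod", "name": "web-2"}},
    {"reason": "BackOff", "type": "Warning", "count": 1,
     "involvedObject": {"kind": "Pod", "name": "web-3"}},
]

# Low-cardinality labels collapse into a single aggregated series.
coarse = to_series(EVENTS, ["type", "involvedObject.kind"])
# Adding the object name creates one series per pod.
fine = to_series(EVENTS, ["type", "involvedObject.kind", "involvedObject.name"])

if __name__ == "__main__":
    print(len(coarse), len(fine))
```

With only type and kind as labels the three events share one series; adding the object name yields one series per pod, and adding the message would do the same per distinct message.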

    Store events as traces

    An interesting approach is to store events as traces. Traces not only show individual events, but can also represent hierarchy and time ranges visually. Kspan is a proof of concept of how to represent events as traces. I won't go into details here; you can follow up on the kspan project page.
    [Figure: screenshot from the kspan project]

    Best of both worlds

    All the above solutions have their pros and cons, but we wanted something truly useful. Most of the time you need events tied to a resource. This can happen when you investigate a maze of application behavior because you got a response time alert. In that situation, you want to filter events for the related objects and see timelines of when the alert was active. Eventually, events become another aspect of correlation. In a following post we will discuss the correlation feature of MCOM as well, so we won't leave you hanging.

    Handling events the MCOM way

    Let's talk about how Cisco MCOM handles events. First of all, we need to collect them all. To extract events from Kubernetes we use a modified version of Heptio's eventrouter. This simple yet great tool fetches events from the Kubernetes API server and prints them to the container's standard output. From there we have just the right tool to parse them and send them to OpenSearch: Cisco MCOM provides the Flow and Output resources out of the box for ingesting Kubernetes events. We decided to use OpenSearch as our event backend because of the extensive query language it provides.
    [Figure: handling events the MCOM way]

    Querying events

    As previously mentioned, we store historical event data in OpenSearch and use its Elasticsearch-style Query DSL to filter and aggregate results. Query DSL can express complex queries using a tree of clauses encoded as JSON. When fetching events for correlation, we use it to filter events by involved object and time range, and also to sort and aggregate them before they leave OpenSearch, all in a single expression. Let's see an example:
    {
    	"collapse": {
    		"field": "event.metadata.name.keyword"
    	},
    	"query": {
    		"constant_score": {
    			"filter": {
    				"bool": {
    					"must": [
    						{
    							"term": {
    								"event.involvedObject.apiVersion.keyword": "v1"
    							}
    						},
    						{
    							"term": {
    								"event.involvedObject.kind.keyword": "Pod"
    							}
    						},
    						{
    							"term": {
    								"event.involvedObject.namespace.keyword": "default"
    							}
    						},
    						{
    							"term": {
    								"event.involvedObject.name.keyword": "nginx-558bd4d5db-6v9sc"
    							}
    						},
    						{
    							"bool": {
    								"should": [
    									{
    										"range": {
    											"event.eventTime": {
    												"from": "2022-02-14T01:00:00Z",
    												"include_lower": true,
    												"include_upper": true,
    												"to": null
    											}
    										}
    									},
    									{
    										"bool": {
    											"must": {
    												"range": {
    													"event.lastTimestamp": {
    														"from": "2022-02-14T01:00:00Z",
    														"include_lower": true,
    														"include_upper": true,
    														"to": null
    													}
    												}
    											}
    										}
    									}
    								]
    							}
    						}
    					]
    				}
    			}
    		}
    	},
    	"size": 10000,
    	"sort": [
    		{
    			"event.lastTimestamp": {
    				"missing": "_first",
    				"order": "desc",
    				"unmapped_type": "date"
    			}
    		},
    		{
    			"event.eventTime": {
    				"missing": "_first",
    				"order": "desc",
    				"unmapped_type": "date"
    			}
    		}
    	]
    }
    As you can see, there's quite a hierarchy of objects to express all these conditions and transformations, so let's take it clause by clause. The collapse clause deduplicates the results by event name, so each event is returned only once. The query clause describes the logical combination of different filters. In our case, the first four term clauses filter the events by involved object, and the last clause defines the time range predicate for both kinds of events: those with eventTime, and those with firstTimestamp and lastTimestamp (the query checks lastTimestamp for the latter). Lastly, the size clause limits the result set size and the sort clause specifies an ordering by event timestamp. And that's it! Not so complicated after all. The results are then shown on a timeline in our correlation view.
    [Figure: correlation view]
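Since only the involved object and the start time change between requests, a query like this lends itself to being generated. The sketch below builds an equivalent query body as a Python dictionary; `build_event_query` is a hypothetical helper, not MCOM's implementation, and it uses the `gte` form of the range clause in place of `from` with `include_lower`.

```python
import json


def build_event_query(api_version, kind, namespace, name, since, size=10000):
    """Build a query body for the events of one object since a given time."""

    def term(field, value):
        # Exact match on a keyword sub-field of the involved object.
        return {"term": {f"event.involvedObject.{field}.keyword": value}}

    def since_range(field):
        # Inclusive lower bound, no upper bound.
        return {"range": {f"event.{field}": {"gte": since}}}

    def newest_first(field):
        return {f"event.{field}": {
            "missing": "_first", "order": "desc", "unmapped_type": "date"}}

    return {
        # Return each event name only once.
        "collapse": {"field": "event.metadata.name.keyword"},
        "query": {"constant_score": {"filter": {"bool": {"must": [
            term("apiVersion", api_version),
            term("kind", kind),
            term("namespace", namespace),
            term("name", name),
            # Match either timestamp style.
            {"bool": {"should": [
                since_range("eventTime"),
                since_range("lastTimestamp"),
            ]}},
        ]}}}},
        "size": size,
        "sort": [newest_first("lastTimestamp"), newest_first("eventTime")],
    }


if __name__ == "__main__":
    query = build_event_query(
        "v1", "Pod", "default", "nginx-558bd4d5db-6v9sc", "2022-02-14T01:00:00Z")
    print(json.dumps(query, indent=2))
```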

    How to try it out?

    All steps are manually reproducible but there is quite a bit of configuration required. To simplify the deployment, Cisco MCOM provides command-line options to deploy the event backend as described above.
    one-eye logging install -us
    one-eye opensearch install
    one-eye event-backend install
    And we are ready to browse our events! In a future post we will show a practical example of using logs, metrics, and events together in the correlation view.