How Tinder Eased Development Pain With Ignis

Posted on:
November 15, 2024

Authored by: Edward Owens

At Tinder, we run hundreds of microservices, each with its own network configuration, roles, metric rollups, and environment variables (“env vars”). This variety of cloud configurations makes it difficult for developers to know whether the code they write will actually work once deployed. Developers were forced to push every change to a feature branch, deploy the code to staging, and inspect the logs and metrics to gain insight into their changes.

The latency in this development loop became a significant bottleneck in our feature development and reduced overall stability. We needed a general-purpose solution that used the same (or very similar) configuration as production and was accessible early in the development cycle.

Developing locally

To solve the above problems, we decided to go with a software-based solution called Ignis. Ignis is a CLI tool written in Go that lets our developers run locally only the service they’re developing, while forwarding its traffic to, and receiving traffic from, a shared development Kubernetes cluster. This multi-tenant cluster is very similar to our production cluster, boosting developers’ confidence that if it works via Ignis, it will work in production. Ignis works via the proxy command. For example:

ignis proxy SvcA

Ignis will fetch the configuration for SvcA and swap out the application container for a proxy container. Once the pod starts, Ignis spins up a development container on the dev’s machine and connects its outgoing traffic to the proxy server. This dev container has the same env vars and access to the filesystem, via a network volume mount, as the proxy server running in the cluster.
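The container swap described above can be pictured with a small Go sketch. It is illustrative only and is not Ignis's actual code: the Container and PodSpec types are simplified stand-ins for the Kubernetes types of the same name, and the function name, image names, and env var are all hypothetical.

```go
package main

import "fmt"

// Container and PodSpec are simplified stand-ins for the Kubernetes types
// of the same name. Only the fields needed for this sketch are modeled.
type Container struct {
	Name  string
	Image string
	Env   map[string]string
}

type PodSpec struct {
	Containers []Container
}

// swapForProxy (hypothetical helper) returns a copy of spec in which the
// container named appName runs proxyImage instead of its original image.
// Env vars and all other containers are carried over unchanged. The second
// result reports whether a container with that name was found.
func swapForProxy(spec PodSpec, appName, proxyImage string) (PodSpec, bool) {
	out := PodSpec{Containers: make([]Container, len(spec.Containers))}
	copy(out.Containers, spec.Containers)
	found := false
	for i := range out.Containers {
		if out.Containers[i].Name == appName {
			out.Containers[i].Image = proxyImage
			found = true
		}
	}
	return out, found
}

func main() {
	// Hypothetical service configuration fetched for SvcA.
	spec := PodSpec{Containers: []Container{
		{Name: "svc-a", Image: "registry.example/svc-a:1.4.2", Env: map[string]string{"REGION": "us-east-1"}},
		{Name: "mesh-sidecar", Image: "registry.example/mesh:2.0"},
	}}
	proxied, ok := swapForProxy(spec, "svc-a", "registry.example/ignis-proxy:dev")
	fmt.Println(ok, proxied.Containers[0].Image, proxied.Containers[0].Env["REGION"])
}
```

Keeping everything except the image identical is what lets the proxy pod inherit the service's env vars and sidecars in this sketch.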

When a developer runs their service in the container, any outgoing traffic sent from the container is forwarded to the proxy server running in the development Kubernetes cluster. This subjects its egress traffic to our service-specific networking configuration (enabled via our service mesh), allowing developers to test any network configuration changes before deploying them. Every proxy session is also connected to our metrics and logging pipelines, allowing devs to test dashboards and alerts.

API Gateway

Running a service in isolation is helpful, but developers would ideally be able to receive traffic in their proxy as well. To enable this functionality, Ignis also has a command called ignis workspace expose, which exposes the developer’s workspace via an API gateway. This gateway has a unique URL and the exact same routing configuration as our API gateway in production.

When you create your gateway, we can update the gateway’s config dynamically to route to your proxy variant of Svc A rather than the Svc A used by the environment. This allows you to develop locally, in real time, with an actual Tinder client, and see the implications of your changes. You can also pair with client developers on features, as they can point their client to your personal gateway rather than the gateway for the broader environment.
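As a rough illustration of the idea, a personal gateway can be thought of as the shared routing table plus a set of per-developer overrides. The Go sketch below assumes that model; the Gateway type, its methods, and the addresses are hypothetical and do not reflect how Ignis or Tinder's API gateway is actually implemented.

```go
package main

import "fmt"

// Gateway models a routing table from service name to backend address.
// Every name and address in this sketch is hypothetical.
type Gateway struct {
	defaults  map[string]string // backends shared by the whole environment
	overrides map[string]string // this developer's proxy variants
}

// NewGateway creates a personal gateway on top of the shared routes.
func NewGateway(defaults map[string]string) *Gateway {
	return &Gateway{defaults: defaults, overrides: map[string]string{}}
}

// Expose points a service at the developer's proxy, for this gateway only.
func (g *Gateway) Expose(service, proxyAddr string) {
	g.overrides[service] = proxyAddr
}

// Backend resolves a service, preferring the developer's override and
// falling back to the shared environment route.
func (g *Gateway) Backend(service string) (string, bool) {
	if addr, ok := g.overrides[service]; ok {
		return addr, true
	}
	addr, ok := g.defaults[service]
	return addr, ok
}

func main() {
	shared := map[string]string{
		"svc-a": "svc-a.dev.svc:8080",
		"svc-b": "svc-b.dev.svc:8080",
	}
	gw := NewGateway(shared)
	gw.Expose("svc-a", "svc-a.alice.svc:8080")

	a, _ := gw.Backend("svc-a") // resolves to the developer's proxy
	b, _ := gw.Backend("svc-b") // resolves to the shared backend
	fmt.Println(a)
	fmt.Println(b)
}
```

Because overrides live on the personal gateway rather than in the shared table, other developers' gateways keep resolving Svc A to the environment's copy.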

Routing

A major shortcoming of the gateway is its inability to redirect traffic to your proxy further in the call stack. For example, given the following request chain, what if you want to proxy Svc B?

Svc A -> Svc B -> Svc C

The gateway would route the request to Svc A, which doesn’t (and shouldn’t) know that your proxy exists, so Svc A would make its downstream request to the Svc B running in the cluster. This bypasses your proxy, meaning you can only test a service if it’s called directly by the gateway. This is hugely limiting: developers have to manually craft requests to simulate traffic from an upstream service rather than receiving that traffic from the service directly.

To solve this, we created a separate service called the Smart Router. The Smart Router is ultimately a gRPC/HTTP proxy that leverages the Kubernetes informer API to dynamically route incoming requests based on a special Ignis header/metadata key. Our service mesh is configured to redirect all traffic to the Smart Router when it detects the Ignis header in an outgoing request. All of our services also propagate this header if it’s detected (similar to a tracing header).
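Header propagation of this kind is commonly done with a small piece of middleware. The Go sketch below shows one way an HTTP service might carry such a header from an incoming request to its downstream calls. It is an assumption-laden illustration: the header name X-Ignis-Workspace, the helper names, and the URL are hypothetical, and the post does not describe how Tinder's services implement propagation.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// ignisHeader is a hypothetical name; the post does not name the real header.
const ignisHeader = "X-Ignis-Workspace"

type ctxKey struct{}

// captureIgnisHeader stores the incoming header value, if present, in the
// request context so that handlers can forward it on downstream calls.
func captureIgnisHeader(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if v := r.Header.Get(ignisHeader); v != "" {
			r = r.WithContext(context.WithValue(r.Context(), ctxKey{}, v))
		}
		next.ServeHTTP(w, r)
	})
}

// newOutgoingRequest builds a downstream request and copies the header
// from ctx onto it when one was captured.
func newOutgoingRequest(ctx context.Context, method, url string) (*http.Request, error) {
	req, err := http.NewRequestWithContext(ctx, method, url, nil)
	if err != nil {
		return nil, err
	}
	if v, ok := ctx.Value(ctxKey{}).(string); ok {
		req.Header.Set(ignisHeader, v)
	}
	return req, nil
}

// simulateHop runs one in-memory request through the middleware and returns
// the header value that the handler's downstream request would carry.
func simulateHop(incomingValue string) string {
	var forwarded string
	handler := captureIgnisHeader(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		out, err := newOutgoingRequest(r.Context(), http.MethodGet, "http://svc-b.example/users")
		if err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		forwarded = out.Header.Get(ignisHeader)
	}))
	in := httptest.NewRequest(http.MethodGet, "/profile", nil)
	if incomingValue != "" {
		in.Header.Set(ignisHeader, incomingValue)
	}
	handler.ServeHTTP(httptest.NewRecorder(), in)
	return forwarded
}

func main() {
	fmt.Printf("with header: %q\n", simulateHop("alice"))
	fmt.Printf("without header: %q\n", simulateHop(""))
}
```

Requests that arrive without the header leave the service without it, which is what keeps ordinary traffic on the normal routing path.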

When it receives a request, the Smart Router checks whether a developer is running a pod for the requested service. If such a pod is running in the developer’s namespace, the Smart Router redirects the request to that developer’s proxy pod. Otherwise, the request is forwarded as normal. Ignis injects this special header in both the gateway and the proxy server. This special routing is what enables the multi-tenancy of the environment: a request that does not originate from an Ignis resource follows the normal routing path, while a request sent from a developer’s proxy session or through their personalized gateway is routed via the Smart Router.
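The routing decision can be sketched in a few lines of Go. This is a simplified model rather than the Smart Router's real code. It assumes the Ignis header value identifies the developer's namespace, which the post does not state. It also replaces the informer-fed view of running pods with a plain map, and all names and addresses are hypothetical.

```go
package main

import "fmt"

// podKey identifies a service running in one developer namespace.
type podKey struct {
	Namespace string
	Service   string
}

// SmartRouter holds an index of developer proxy pods alongside the normal
// in-cluster addresses. In a real deployment the proxy index would be kept
// current by watching the cluster; a static map stands in for that here.
type SmartRouter struct {
	proxies  map[podKey]string
	defaults map[string]string
}

// Route picks the destination for a request to service. namespace is the
// value assumed to be carried by the Ignis header; an empty value means
// the request did not originate from an Ignis resource.
func (r *SmartRouter) Route(namespace, service string) (string, bool) {
	if namespace != "" {
		key := podKey{Namespace: namespace, Service: service}
		if addr, ok := r.proxies[key]; ok {
			return addr, true
		}
	}
	addr, ok := r.defaults[service]
	return addr, ok
}

func main() {
	router := &SmartRouter{
		proxies: map[podKey]string{
			{Namespace: "alice", Service: "svc-b"}: "svc-b.alice.svc:8080",
		},
		defaults: map[string]string{
			"svc-a": "svc-a.dev.svc:8080",
			"svc-b": "svc-b.dev.svc:8080",
			"svc-c": "svc-c.dev.svc:8080",
		},
	}

	// Walk the chain Svc A -> Svc B -> Svc C for a request tagged "alice".
	for _, svc := range []string{"svc-a", "svc-b", "svc-c"} {
		addr, _ := router.Route("alice", svc)
		fmt.Println(svc, "->", addr)
	}
}
```

In this model only the hop to Svc B is diverted to the developer's pod. Svc A and Svc C resolve to the shared cluster copies, and untagged requests never consult the proxy index.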

Impact

Ignis was first released to developers in July 2019. Since then, we’ve been able to kill a dozen staging environments, consolidate our development environments, and decommission our old development account. This amounted to an immediate and significant reduction in cloud spend while simultaneously increasing our development velocity and reducing onboarding time.

Ignis has also opened the door to integration testing, and many developers leverage it for their testing today, particularly the Smart Router’s ability to isolate traffic when sending test requests. Developers can run variants of a service in their namespace and run automated tests to ensure those variants are functioning properly.

Takeaways

We are continuing to iterate and improve on Ignis. In fact, V2 of Ignis was released in January 2024 with a complete architectural overhaul and a ton of new features, including:

  • Remote development using the most popular IDEs.
  • Access to an IDE in the proxy session container.
  • A dashboard showing where your traffic is routed.
  • Easy CRD support.
  • Extensive monitoring of developer connection latency, failure rates, error codes, general Ignis usage, etc.

These new features reduce friction for developers, which ultimately reduces frustration and improves motivation. Monitoring in particular gives us the opportunity to learn what’s most frustrating about the development process and make further improvements.

Today every developer uses Ignis in some way, whether it’s a client dev investigating a bug through the Ignis gateway, a QA engineer testing the stability of a service, a backend dev writing a new feature, or a devops engineer issuing a test SQL query.

How Tinder Eased Development Pain With Ignis was originally published in Tinder Tech Blog on Medium.
