Debugging .NET Applications Running as Docker Microservices: From Intrusive to Non-Intrusive

Docker has enjoyed meteoric growth, becoming the leading platform for containerization and microservices. While Docker does run natively on Windows, it is more at home on Linux, and there are differences between Windows and Linux containers. For example, Linux Docker containers run .NET Core, so an image typically weighs in at around 200 MB, whereas Windows Docker containers are what you need for the .NET Framework and typically run at about 2 GB. For new development, Linux containers are therefore preferable, but if you have legacy code written on the .NET Framework, you can still use Windows containers. Either way, we all know that developers spend much of their time debugging, and debugging only gets more difficult as you move up the CI/CD pipeline from Development to Production. In this blog post, I will talk about how to debug .NET applications in Docker containers, starting from a single container in Development and moving on to debugging an application built on several microservices in Production. The post is based on a webinar I recently hosted, “Advanced Debugging for .NET Applications Built on Docker Microservices.” To view the full webinar recording, scroll down to the end of this post.

How to debug Docker Linux containers in Visual Studio

When debugging a .NET application in Visual Studio, the IDE connects to your application through the operating system’s Debugging API.

Figure: Visual Studio connects to your application through the operating system’s debugging API

However, once you place your application inside a Docker container, it’s no longer running on the same OS, and you need to bridge that gap using Visual Studio’s Attach to Process feature. Under the hood, Visual Studio runs the Docker commands necessary to expose the processes running in the container and then displays them so you can choose which process to attach to.

What Visual Studio is actually doing is injecting VsDbg, its remote debugger, into the container. It then communicates with it over the container’s standard I/O streams (stdio), so it doesn’t need the certificates normally required for remote debugging. Visual Studio connects to VsDbg, which uses the native debugging API of the OS the container runs on to debug your application.
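If you’re curious, you can reproduce that first step yourself with plain Docker commands. The following is only a rough sketch, assuming a Linux container named myservice whose image has bash and curl available; the container name is made up, and the exact script flags Visual Studio uses may differ:

# Install VsDbg inside the running container using Microsoft's getvsdbg script
docker exec -i myservice bash -c "curl -sSL https://aka.ms/getvsdbgsh | bash /dev/stdin -v latest -l /vsdbg"

# List the processes running in the container to find the one to attach to
docker top myservice

From there, the debugger binary installed under /vsdbg is launched inside the container and driven over stdio, which is exactly the mechanism described above.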

Figure: Visual Studio’s Attach to Process dialog
Tip: Just like Visual Studio can inject VsDbg to enable debugging Docker containers, you can inject other .NET utilities that may come in handy using the docker exec command. For example, inject dotnet-trace to enable remote diagnostics on the container so you can measure performance, see event traces, and view metrics, or inject dotnet-dump to generate a dump file of your container to analyze a crash or exception. These tools can be really handy inside your container, but be aware that they also incur overhead. For example, you would need to create a volume mapping if you want to extract a dump file out of the container and examine it in your Windows environment.
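Here is a minimal sketch of what that might look like, assuming a Linux container named myservice whose image includes the .NET SDK (so dotnet tool install works) and whose .NET process is PID 1; the container name, paths, and PID are illustrative:

# Install the diagnostic tools as global tools inside the container
docker exec -i myservice dotnet tool install --global dotnet-trace
docker exec -i myservice dotnet tool install --global dotnet-dump

# Collect a performance/event trace from the .NET process
docker exec -i myservice /root/.dotnet/tools/dotnet-trace collect --process-id 1

# Write a dump to a directory that is volume-mapped to the host (e.g. docker run -v c:\dumps:/dumps ...)
# so you can open it on your Windows machine
docker exec -i myservice /root/.dotnet/tools/dotnet-dump collect --process-id 1 --output /dumps/myservice.dmp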

Once Visual Studio has attached the debugger to your service, you can debug it just like you’re used to, breakpoints and all.

From Microservice to Cluster

Being able to debug a single Docker container is great, but microservices never live alone. The whole point of an application built on a microservices architecture is to have many well-encapsulated services communicating with each other. Service A calls service B, which does something that needs service C… and so on. To run and debug an application with many microservices, you need a container orchestration tool. The tool Visual Studio uses is Docker Compose: adding container orchestration support to your solution generates a docker-compose.yml file that defines how each of your services is built and run.
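For illustration only, the generated file looks roughly like this; the service names (ordersapi, inventoryapi) and ports are made up, and the docker-compose.yml Visual Studio generates will reflect your own projects:

version: "3.4"

services:
  ordersapi:
    image: ordersapi
    build:
      context: .
      dockerfile: OrdersApi/Dockerfile
    ports:
      - "5000:80"

  inventoryapi:
    image: inventoryapi
    build:
      context: .
      dockerfile: InventoryApi/Dockerfile
    ports:
      - "5001:80"

Set the docker-compose project as the startup project and press F5; Visual Studio builds the images, starts all the containers, and attaches the debugger to each service.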

And don’t forget, Visual Studio will stop execution at any breakpoints you may have left from your last test with a single container.

So that sounds great, doesn’t it? Debugging a Docker microservices application with breakpoints and all. But it’s not time to party yet.

Timeout hell and other limitations of debugging Docker microservices

Let’s start by acknowledging that freezing a microservice at a breakpoint in a Production system is rarely a viable option. You’re not usually willing to risk the user experience while trying to debug an issue. Not only that, an unresponsive microservice can skew performance metrics and raise alerts in any tools that are monitoring Production. And then there’s PII (personally identifiable information): you might not be allowed to expose your Production environment to the developer who is best equipped to troubleshoot a particular issue.

Even putting all that aside, what happens when Visual Studio pauses execution of a microservice at a breakpoint depends on the orchestration tool you’re using. In the case of Docker Compose, other microservices waiting for a response from the paused one are left in limbo and eventually fail with network timeouts. In other words, stopping the execution of one microservice causes a chain of cascading errors that halts every other microservice in that call chain. They all become unresponsive in what you might call “timeout hell.”

If you’re using Kubernetes, the behavior may be different. Kubernetes relies on redundancy: a microservice stopped at a breakpoint is deemed unresponsive, and Kubernetes diverts traffic to one of the other instances. So you can’t really debug your issue unless you attach to every redundant instance of the microservice in order to catch the code execution flow of the error. Moreover, depending on how your Kubernetes cluster is configured, Kubernetes may simply “kill” the unresponsive container and spin up a new instance, one that doesn’t have your breakpoint in it. There goes your debug session.
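That kill-and-replace behavior typically comes from the cluster’s health checks. As a hedged illustration (the service name, port, and health endpoint are made up), a deployment like the following will consider a container that stops answering /health to be dead and replace it, breakpoint and all:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: ordersapi
spec:
  replicas: 3                 # redundancy: traffic shifts to the healthy replicas
  selector:
    matchLabels:
      app: ordersapi
  template:
    metadata:
      labels:
        app: ordersapi
    spec:
      containers:
        - name: ordersapi
          image: ordersapi:latest
          livenessProbe:        # a container frozen at a breakpoint fails this probe
            httpGet:
              path: /health
              port: 80
            periodSeconds: 10
            failureThreshold: 3 # after roughly 30 seconds of silence, the container is restarted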

In general, the ephemeral nature of microservices (and similarly, serverless code) poses real problems when debugging applications that use this technology because it can be virtually impossible to accurately reproduce an error when the problematic code/scenario is running one moment and gone the next.

The evolution of non-intrusive debugging

Local debugging has been around for a while and is inherently intrusive. You’re trying to extract data, such as logs, from the system, and when the debugger halts the execution of your application at a breakpoint, you can inspect things like memory, parameters, and local variables. This is all done in milliseconds, at what I’ll casually call “computer speed.” But then you need to process this data to reason about and understand what happened. That happens at “human speed,” which is much slower.

Now, this dichotomy between fast computers and slow humans may be tolerable in Development, but not when you consider Production systems. Why should the computer (and your application) have to wait for us humans to figure something out?

What we really want to do is separate these processes and allow us slow humans the privilege of time to understand and debug our Production systems, but without freezing them. This is what I call non-intrusive debugging.

Let’s look at the evolution of non-intrusive debugging.

Gen 0: Console

Every developer out there has already engaged in non-intrusive debugging. One of the first things you learn when you start to code is how to write messages out to the console. Something like this:

Console.WriteLine("Calling function ProcessOrder");

Of course, trying to debug by scrolling up and down your console is not very practical.

Gen 1: Logs

So along came logging, and now you would try debugging like this:

Log.Warning("Customer has negative age", user, age);

You can output your log entries to a separate file and peruse it at your convenience, along with the multitude of other log files you probably need to look at.

Gen 2: Structured logs

Then we took another step forward and started putting more specific data in our log entries by parameterizing them like this:

Log.Warning("Customer {@Customer} has negative age {Age}", user, age);

Once our logs had some structure, we could search and filter them, making it much easier to find what we were looking for.
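To make the search-and-filter point concrete, here is a small sketch using Serilog (which is where the {@Customer} syntax above comes from), writing events as structured JSON so a log analyzer can index each property; the packages and sample objects are assumptions for this example:

using Serilog;
using Serilog.Formatting.Compact;

// Assumes the Serilog, Serilog.Sinks.Console, and Serilog.Formatting.Compact NuGet packages
Log.Logger = new LoggerConfiguration()
    .WriteTo.Console(new CompactJsonFormatter())   // emit structured JSON instead of flat text
    .CreateLogger();

var user = new { Name = "Jane", Id = 42 };   // illustrative customer object
var age = -3;

// {@Customer} serializes the whole object; {Age} becomes a queryable property
Log.Warning("Customer {@Customer} has negative age {Age}", user, age);

Log.CloseAndFlush();

Because every event carries named properties rather than a flat string, you can later ask your log store for, say, all warnings where Age is negative for a specific customer.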

Gen 3: Log analyzers

And when log files became proverbial haystacks, we started using log analyzers like DataDog and Logz.io (to arbitrarily name two of the many available), and debugging in Production now usually starts from a dashboard of aggregated logs and charts.

So it looks like we’re in a pretty good place, with these tools aggregating and summarizing our logs into beautiful, colorful charts. Well, these charts are certainly leaps and bounds beyond Console.WriteLine, but so are today’s computer programs. The fact is, even today’s most sophisticated log analyzers don’t do enough to help us debug microservices.

Applications built on microservices are extremely complex. It’s almost impossible to follow the code execution path of an error as it traverses a multitude of redundantly deployed, ephemeral microservices, with intermittent database requests, networking, and messaging, all while generating terabytes of log entries. No matter how hard we try, we never have the log entries in the right place. If we knew exactly which log entries to write, we would already know where the bugs were and could have avoided them in the first place.

Even this level of non-intrusive debugging is inadequate to debug microservices in Production.

Gen 4: Code-level observability

Let’s rewind a bit. Earlier I talked about local debugging. During development, you put breakpoints in your code and step through it with your debugger with full visibility into logs, method parameters, local variables, and pretty much any data you need. That’s the experience you want when debugging Docker microservices applications in Production – but without the breakpoints. You can’t freeze your Production systems. Well, that experience is now available.

Ozcode Production Debugger records the complete error execution flow of an exception, with all the data I just mentioned, across microservices. Basically, you get that IDE-like debugging experience, but it’s non-intrusive and has zero impact on your Production systems. To get the full debugging experience for your microservices application, install the Ozcode agent in each Docker microservice by adding a few lines to its Dockerfile, and you’re ready to go. You get all the exceptions from all of the microservices in one place, making it much easier to correlate what happened between them.


The next generation: Using dynamic logging with tracepoints for non-intrusive debugging of logical errors across microservices

Logical errors can be the most difficult to solve; there’s no exception to point you to where something went wrong. This is where dynamic logging with tracepoints becomes invaluable, and tracepoints are very different from logs. With logs, you have to decide ahead of time exactly what information you want to extract. With tracepoints, you decide after the fact which point in the code you want to inspect, and you can choose any point, because all the information you might need has already been recorded. We are about to release this feature in Ozcode Production Debugger, and it works like this:

  • Define a Tracepoint Session.
  • Select the object you suspect is causing the problem. Ozcode goes through your Production code, decompiles the corresponding piece of code, and presents it to you to debug with tracepoints.
  • Use structured logging statements to specify what information the Production Debugger should display. The syntax lets you specify exactly what data to log, and you can log anything, because everything has been captured.


The beauty of this is that it’s completely non-intrusive debugging. There’s no need for any source code integration, and you don’t need to care what version is running, yet you’re debugging the actual Production code where the error occurred.

Now, because you have access to all the Production code, you can place tracepoints anywhere and follow the execution of an action across all the microservices in your application. Here’s what the output of a session with several tracepoints distributed across your application might look like.

We can see the tracepoint output of a CoursesController microservice together with the tracepoint output of a RoomsController microservice and see how they correlate. And if that’s not enough to understand what happened, we can keep adding tracepoints on the fly, anywhere in the code, until it is.

Where do we go from here?

As the use of microservices continues to rise and new technologies like serverless take hold, the challenges developers and DevOps engineers face in troubleshooting their systems and fixing bugs will only increase. The only way to meet those challenges is to bridge the gap between Production code and developers by bringing observability down to the code level.

Here is the full webinar recording.


Idan Shatz
