Why I Hate Bugs, and What I’ve Decided to Do About Them


I started Ozcode nine years ago for one simple reason: I hate debugging.
In the past, I worked for a company that makes medical devices, which meant that every bug in our software could have a direct effect on people’s health. This greatly amplified the stress and frustration of debugging. This and other experiences taught me to master handling exceptions, setting breakpoints, and deciphering call stacks, yet it still amazed me how hard it is to track failures down and fix them, especially when the code is already deployed to a production (or even staging) environment. So I started Ozcode to try to turn debugging into a straightforward process instead of a guessing game.

You Don’t Watch Movies Backwards

There are a lot of error monitoring tools on the market, like Raygun, Stackify, and Microsoft Application Insights. Sure, they give you a nice dashboard so that you can see all the different exceptions and give you an idea of where the error occurred. The problem is that looking at error reports is like seeing the last frame of a murder mystery movie, where everyone is already dead. These exception handling tools tend to hint at how you got to the point of failure without giving you the whole story.

One frame just isn’t enough (Dial M for Murder, 1954)

Moreover, as much as I love the technological advancements behind distributed apps, microservices, and serverless technologies, which enable a great customer experience, debugging in today’s cloud-native development cycle is more complicated than ever. With the shift to the cloud, new lines of code are often executed minutes after they were written, usually in a Docker container somewhere in the cloud, with little or no visibility into how they ran, so errors are even tougher to track down.

Similarly, we face the challenge of distributed code: most tech stacks are composed of numerous microservices or serverless functions, which makes tracing difficult, if not impossible. Think about it. When a user clicks a button in their web browser, it sends a request to some REST API, which promptly calls another service, which then calls yet another service. Good luck trying to trace errors through that chain of completely isolated pieces of software.
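To make that chain concrete, here is a toy sketch in Python. The three functions stand in for separate services, and all of their names are made up for illustration. It shows one common mitigation, which is an assumption on my part rather than anything specific to a particular tool: the entry point generates a request ID and every hop passes it along, so a failure deep in the chain can at least be tied back to the click that caused it.

```python
import uuid


class ServiceError(Exception):
    """Error raised by a hypothetical downstream service."""


def inventory_service(item_id, request_id):
    # Deepest service in the chain: this is where the failure really happens.
    if item_id < 0:
        raise ServiceError(f"[{request_id}] inventory: invalid item {item_id}")
    return {"item_id": item_id, "in_stock": True}


def order_service(item_id, request_id):
    # Middle service: forwards the same request_id downstream.
    stock = inventory_service(item_id, request_id)
    return {"order": "created", **stock}


def rest_api(item_id):
    # Entry point: creates one ID per button click and passes it along.
    request_id = uuid.uuid4().hex[:8]
    try:
        return request_id, order_service(item_id, request_id)
    except ServiceError as err:
        return request_id, {"error": str(err)}
```

Even with the ID in place, all you learn is which request failed and where. The values and decisions that led there are still missing.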

The Bane of Log Files

You know you have a bug when your error monitoring tool or APM alerts you (or when angry customers yell at you over the phone). But then you still have to reproduce the issue and isolate what’s unique about that particular faulty scenario.

The common method of production debugging is to look up the log files of each service running in the cloud (the only way to gain visibility into a cloud service). But, and pardon my language, log files suck! Log files are usually spread across different file system locations, machines, and dashboards. To continue the murder mystery metaphor, log files give you various frames of the crime movie, so you get a flash of a knife or a gun, but you still need to piece together the correct murder weapon and killer, the motive, and the conclusion.

Reading through log files is time-consuming and inefficient. You don’t know exactly what the problem is, so you can’t immediately search for a particular symptom. Of course, inserting a breakpoint and attaching a debugger is out of the question because that would break the production environment.

You also need to figure out which version of the software is live. Then you have to go through massive amounts of logging data to find the log lines that correlate to the problem. Think you’ve got it? The next step is to form a hypothesis and then try to validate that hypothesis.
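Here is a rough sketch, in Python, of what that correlation step tends to look like when done by hand. The log format (an ISO timestamp, a request ID, then a message), the service names, and the sample lines are all invented for illustration. Real logs are rarely this tidy.

```python
from datetime import datetime


def parse_line(service, line):
    # Assumed format: "<ISO timestamp> <request id> <message>"
    timestamp, request_id, message = line.split(" ", 2)
    return datetime.fromisoformat(timestamp), service, request_id, message


def timeline(logs_by_service, request_id):
    # Flatten every service's log, keep one request, and sort by time.
    entries = [
        parse_line(service, line)
        for service, lines in logs_by_service.items()
        for line in lines
    ]
    matching = [entry for entry in entries if entry[2] == request_id]
    return sorted(matching, key=lambda entry: entry[0])


# Hypothetical log lines collected from three different machines.
logs = {
    "inventory": ["2021-03-01T10:00:02 abc123 ERROR invalid item -1"],
    "api": [
        "2021-03-01T10:00:00 abc123 POST /orders",
        "2021-03-01T10:00:05 zzz999 GET /health",
    ],
    "orders": ["2021-03-01T10:00:01 abc123 creating order"],
}

for when, service, _, message in timeline(logs, "abc123"):
    print(when.time(), service, message)
```

This only works if every service logged the same request ID and the machines’ clocks agree, which is exactly the kind of assumption that tends to fall apart in production.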

Usually, you don’t have enough information to understand the root cause of the bug, so your first guess will not pan out, and it’s back to adding more log lines and trying again. This means changing your code, redeploying it to the production environment, and waiting for the error to occur again. You’ll likely go through multiple iterations before you completely understand and resolve the issue.

Spend Your Time Producing, not Reproducing

Now that I’ve named the main reasons why I hate debugging, here’s what I decided to do about it: build Ozcode and put an end to the pain of reproducing errors. With Ozcode, it’s like watching the director’s cut. Our lightweight agent technology gives you the full version of the “crime movie”, empowering developers to use time travel debugging techniques to follow the elusive chain of causality that led up to the moment of failure.

Ozcode silently monitors applications for errors, allowing you to see everything from the first frame to the last. This eliminates the guesswork and minimizes the time it takes to remove errors from your production environment (though the agent can also be used in staging or QA, not just production).

When an error occurs, Ozcode captures time travel data for the particular flow of events that led to the exception, using contextual logging to provide you with the full picture (the whole murder mystery film, in our metaphor). The results of this trace are delivered through a browser-based, IDE-like debugging experience. Instead of slaving through never-ending log files, developers can finally debug the actual failure, without having to reproduce it!

Not to mention, considering today’s geographically-distributed teams, Ozcode’s debugging agent can function across locations, allowing remote teams to work together more efficiently. We create a web-based collaborative debugging experience, initiated by sending a link to begin debugging sessions so that teams can solve issues faster than ever before.

From the CTO or R&D manager’s point of view, Ozcode saves money. Debugging in the production environment can be much more expensive than debugging during development or testing. Bugs that appear in production can cause downtime and churn, so when issues appear in a live product, speedy triage is essential.

Who Can Use Ozcode? (Not just murder mystery buffs!)

The Ozcode Production Debugger has something for everyone in the company who is involved in solving bugs in production, from SREs to QA teams. And of course, it is ideal for developers using .NET Framework or .NET Core (on Windows or Linux), as well as those building ASP.NET web apps, Windows Services, and traditional desktop applications.

Ozcode can help take the pain out of your debugging process by providing you with context: the full story of your bug.

So if you’re tired of using old methods to solve increasingly difficult problems, then Ozcode Production Debugger is for you. Start for FREE right now and get rid of your debugging headaches so you can spend more time producing code that works.

Omer Raviv
