Why We Need Software Monitoring

Clouds are natural. The billowing mists of vapor that make up our planet’s cloud formations and systems are part of the natural phenomena that make our world so special. Computing clouds are obviously less natural: we fabricate them out of virtualized compute ‘instances’ that we define via Software-as-a-Service (SaaS) processes and tools, which enable us to provision them for specific tasks and functions.

But as digital and ordered as they are, computing clouds often work themselves into a state of tension almost naturally. To be fair to the cloud, it is we the users (and the machines that we also empower to connect with the cloud) who knock cloud instances out of kilter as we overload them, misconfigure them or integrate them with non-native services that they don’t dovetail or balance well with.

Monitoring as a business

These realities bring us to a point where cloud monitoring has become a specialist discipline in and of itself. We have cloud observability specialists, we have Application Performance Management (APM) specialists and we have cloud-native security specialists that devote a large proportion of their efforts to cloud controls - and then we have monitoring purists.

Styling itself as a dedicated monitoring vendor, eG Innovations is a company known for its cloud-based application performance and IT-infrastructure monitoring solutions. The company has monitoring tools that work on both operational clouds and on software application development environments and virtual workspaces used by its software engineers.

Technical product specialist at eG Innovations Rachel Berry says that, like many organizations, the company has evolved to have multiple on-site development teams in multiple countries. It also has a substantial number of employees who work from home, work remotely or operate on hybrid work schedules. This dispersal means that eG Innovations has to keep its developers productive and content by making sure they have software tools and applications available 24/7. If someone can’t check in a 'code fix', it ultimately also affects the ability to service customer support tickets.

“Developers need to be able to properly test their work, so they don’t get swamped with support tickets when code goes into production or is released to customers. Collaboration tools and mechanisms are used to collect data so different teams aren’t finger-pointing or blaming each other,” said Berry. “We use a mixture of real user monitoring and synthetic monitoring (robot users, simulating access 24/7) to detect issues proactively and to resolve them. Virtual Desktop Infrastructure (VDI) is extremely useful for standardizing development environments and ensuring our IT teams only support a limited known configuration. VDI also helps us avoid ‘problematic’ technologies such as VPNs.”
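The ‘robot user’ idea Berry describes can be sketched in a few lines of Python. This is a minimal illustration rather than eG Innovations’ implementation: the names `run_probe`, `ProbeResult`, `healthy_login` and `broken_login`, and the two-second slowness threshold, are all assumptions made for the example. A real synthetic check would script an actual login or page load and run on a schedule.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class ProbeResult:
    ok: bool
    latency_s: float
    error: str = ""


def run_probe(check: Callable[[], None],
              slow_after_s: float = 2.0,
              clock: Callable[[], float] = time.monotonic) -> ProbeResult:
    """Run one synthetic check and time it.

    An exception, or a run slower than slow_after_s (an assumed
    threshold), counts as a failure.
    """
    start = clock()
    try:
        check()
    except Exception as exc:
        return ProbeResult(False, clock() - start, repr(exc))
    latency = clock() - start
    if latency > slow_after_s:
        return ProbeResult(False, latency, "slow response")
    return ProbeResult(True, latency)


# Hypothetical stand-ins for scripted user journeys.
def healthy_login() -> None:
    pass


def broken_login() -> None:
    raise ConnectionError("login page unreachable")


results = [run_probe(healthy_login), run_probe(broken_login)]
for result in results:
    print(result)
```

Passing the clock in as a parameter keeps the probe testable without waiting on real time; a scheduler would call `run_probe` around the clock and forward failures to whoever is on call.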

Roasting bandwidth hoggers

Thinking about the working environment that she and the team oversee, Berry explains that many of the company’s developers access its VDI systems remotely from laptops. Often when they encounter user experience problems, the root cause is something associated with the physical endpoint or with the worker’s home location (poor ISP connection, Wi-Fi router issues, other household members gaming or streaming and hogging bandwidth etc.) - and this means the company has to have tools in place to troubleshoot home and remote workers’ hardware and home networking.

“We have many shared resources that our developers leverage, particularly databases. If these have a problem the effects can impact multiple teams and block progress,” clarified Berry. “Having database monitoring in place that our developers have visibility on is extremely important for our business continuity. Similar services that should be monitored are systems responsible for building and delivering customer patches and responses. Uptime and performance of infrastructure services including file servers, Active Directory, hypervisors and even storage devices are important to ensure that developers remain productive.”

Integrating monitoring into DevOps

The eG Innovations operations team works to continually provide developers with performance monitoring data (both live and historical) that allows them to assess the impact of change in IT services. Being able to automatically detect changes in an application or cloud service’s baseline performance and then correlate that performance change to newly deployed versions and code releases (of that same application or service) as they are implemented makes development processes faster and raises quality.
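As a rough sketch of that detect-then-correlate step, the Python below flags the first latency sample that rises well above a historical baseline and then looks up the most recent release deployed before it. The function names, the three-sigma threshold, the sample values and the version labels are all hypothetical; production tools generally use more robust baselining than a single-sample threshold.

```python
from statistics import mean, stdev
from typing import Optional, Sequence, Tuple


def first_regression(baseline: Sequence[float],
                     recent: Sequence[Tuple[int, float]],
                     threshold_sigmas: float = 3.0) -> Optional[int]:
    """Return the timestamp of the first recent sample that exceeds
    baseline mean + threshold_sigmas * standard deviation, or None."""
    limit = mean(baseline) + threshold_sigmas * stdev(baseline)
    for timestamp, value in recent:
        if value > limit:
            return timestamp
    return None


def suspect_release(change_ts: int,
                    deployments: Sequence[Tuple[int, str]]) -> Optional[str]:
    """Return the latest deployment at or before the performance change."""
    earlier = [(ts, name) for ts, name in deployments if ts <= change_ts]
    return max(earlier)[1] if earlier else None


# Hypothetical response times in milliseconds, with timestamps in minutes.
baseline_ms = [100, 102, 98, 101, 99, 100]
recent_ms = [(10, 101), (11, 103), (12, 140), (13, 150)]
deployments = [(5, "v1.4.0"), (11, "v1.4.1"), (20, "v1.5.0")]

change_at = first_regression(baseline_ms, recent_ms)
if change_at is not None:
    print(change_at, suspect_release(change_at, deployments))
```

In this made-up data the jump appears at minute 12, one minute after `v1.4.1` went out, so that release becomes the first place to look.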

“This is particularly important to us for finding issues early. Having Application Performance Monitoring (APM) in place alongside stress and load testing allows our developers to identify bottlenecks and their causes even down to a single line of code in a Java or .NET application. This helps us avoid many bugs or performance issues reaching customers or even our own QA team,” clarified Berry, with an appropriate nod to the use of APM, which remains a key discipline in this context.

Beyond the usual suspects among the developer tools deployed at this level for configuration management, testing and code repository review, such as GitHub, Jenkins and Ansible, the team also continually monitors applications and tools such as O365, Zoom and Microsoft Teams. It is important to have visibility into the root cause of issues, particularly if tools are delivered as SaaS or are cloud hosted: is it an Azure problem or a bandwidth issue?

The company’s monitoring infrastructure is also integrated with an IT Service Management (ITSM) function and the ticketing tools it offers, such as Jira and ServiceNow. This helps ensure the team can track and review problems and set targets for issue resolution. Treating the company’s software application developers and their issues with the same diligence and urgency as its customers is the mantra and ethos being used here.

What 'type' of cloud?

“Many development teams now deploy applications on cloud infrastructure including public clouds such as Azure, Amazon AWS or Google GCP for agility. Often there is a lack of coordination between IT teams provisioning cloud resources and the development teams that need those resources. An important decision that has to be taken when provisioning resources is the type of cloud instances to use. Development teams often describe their requirements in terms of CPU and memory needed (e.g. 4 vCPUs, 16 GB RAM), while IT teams have to provision VMs by choosing an instance type,” noted Berry, in specific detail.

For example, she says, if the IT team uses a ‘burstable’ instance type [one with a low CPU baseline that can burst above it only while accrued CPU credits last] because it is cheaper, it may not match the resource usage needs of the development team (who may be thinking they are getting a virtual machine (VM) with dedicated capacity). When stress testing the application, the VM may run out of CPU credits and performance may be poor, leading to developer frustration.
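The arithmetic behind that failure mode can be sketched with a deliberately simplified credit model in Python. It assumes one CPU credit equals one vCPU running at 100% for one minute, that credits accrue at the baseline rate, and that throttling starts when the balance reaches zero. The function name, the 20% baseline and the 120 starting credits are hypothetical and do not describe any particular provider’s instance type; only the 4 vCPU figure echoes the example Berry gives.

```python
from typing import Optional


def minutes_until_throttled(vcpus: int,
                            baseline: float,
                            load: float,
                            starting_credits: float,
                            max_minutes: int = 24 * 60) -> Optional[int]:
    """Simplified burstable-instance model (an assumption, not a provider API).

    baseline and load are per-vCPU utilization fractions (0.0 to 1.0).
    Returns the minute at which the credit balance runs out, or None if
    the load is sustainable or the balance lasts beyond max_minutes.
    """
    earned_per_min = vcpus * baseline
    spent_per_min = vcpus * load
    if spent_per_min <= earned_per_min:
        return None  # At or below baseline, credits never run down.
    balance = starting_credits
    for minute in range(1, max_minutes + 1):
        balance += earned_per_min - spent_per_min
        if balance <= 0:
            return minute
    return None


# A stress test at 90% CPU on a hypothetical 4 vCPU burstable VM.
stress_test = minutes_until_throttled(vcpus=4, baseline=0.2, load=0.9,
                                      starting_credits=120)
# Ordinary light use at 10% CPU on the same VM.
light_use = minutes_until_throttled(vcpus=4, baseline=0.2, load=0.1,
                                    starting_credits=120)
print(stress_test, light_use)
```

Under these assumed numbers the stress test drains 2.8 credits a minute and hits the throttle after 43 minutes, while light use never does, which is why the problem tends to surface only under load.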

No amount of debugging code will reveal the cause of some issues. Having the right oversight and monitoring for cloud environments is key for application success in the cloud. By monitoring and tracking the availability and performance of developers’ tools and applications, we can set internal Service Level Agreements (SLAs) and Key Performance Indicators (KPIs) to quantify whether our developers and of course our users are getting what they need.
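A minimal Python sketch of how such an internal SLA might be quantified from monitoring data follows. The function names, the per-minute check cadence and the 99.9% and 99.5% targets are assumptions for illustration, not figures from eG Innovations.

```python
from typing import Sequence


def availability_pct(check_results: Sequence[bool]) -> float:
    """Percentage of monitoring checks that succeeded."""
    if not check_results:
        raise ValueError("no check results to evaluate")
    return 100.0 * sum(check_results) / len(check_results)


def meets_sla(check_results: Sequence[bool], target_pct: float) -> bool:
    """True if measured availability reaches the agreed target."""
    return availability_pct(check_results) >= target_pct


# One hypothetical day of per-minute checks on a build server: 3 failures.
day_of_checks = [True] * 1437 + [False] * 3

print(round(availability_pct(day_of_checks), 2))
print(meets_sla(day_of_checks, target_pct=99.9))
print(meets_sla(day_of_checks, target_pct=99.5))
```

Three failed minutes in a day works out to roughly 99.79% availability, which would miss a 99.9% target but meet a 99.5% one.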

The only question now is, who’s monitoring the monitoring team, right?
