ARM and APM: Unified Performance Assurance

6m • Unknown Channel • devops • tutorial • intermediate

Key Points

Assuring app performance requires both the application‑level insight of APM and the infrastructure‑level optimization of ARM, which together guarantee resources are available when needed.
In the “it’s the node” scenario, an ARM system uses real‑time infrastructure and application metrics to automatically tune cloud resources, eliminating guesswork about where performance bottlenecks lie.
In the “it’s the code” scenario, APM provides deep runtime diagnostics that help developers quickly identify and fix code‑related issues in development or production environments.
Turbonomic Application Resource Management for IBM Cloud Paks pulls data from IBM Observability by Instana (or other APM tools like AppDynamics, Dynatrace, New Relic) to map end‑to‑end relationships from business applications down to containers, pods, and nodes—all without installing extra agents.
By automating resource allocation decisions based on fine‑grained metrics such as transaction counts, CPU, and heap usage, the ARM/APM combo turns the usual “whodunnit” blame game into proactive performance assurance.


# ARM and APM: Unified Performance Assurance

**Source:** [https://www.youtube.com/watch?v=C9Sm0pmQLC0](https://www.youtube.com/watch?v=C9Sm0pmQLC0)
**Duration:** 00:06:59

## Sections

- [00:00:00](https://www.youtube.com/watch?v=C9Sm0pmQLC0&t=0s) **Combining ARM and APM for Performance** - The speaker explains how integrating Application Resource Management with Application Performance Monitoring ensures steady app performance by optimizing infrastructure resources (“it’s the node”) and providing detailed runtime diagnostics for code issues (“it’s the code”).
- [00:03:04](https://www.youtube.com/watch?v=C9Sm0pmQLC0&t=184s) **Automating Resource Management Decisions** - The speaker explains how Turbonomic’s data‑driven action recommendations, ranging from non‑disruptive memory increases to potentially disruptive VM resource reductions, can be manually vetted, then fully automated to lower IT spend and prevent application performance issues, especially when paired with Instana’s real‑time microservice metrics.
- [00:06:09](https://www.youtube.com/watch?v=C9Sm0pmQLC0&t=369s) **Integrating APM with ARM** - The speaker explains how combining Turbonomic's Application Resource Management with Instana's APM automates performance decisions, prevents issues before they arise, and reduces IT spend.

## Full Transcript

0:01 You likely monitor app performance, but what do you do to assure app performance? And when users report their app is slow, what do you do? Hi, I'm Dan Kehn from IBM Cloud. In this video, I'll explain how platforms for Application Resource Management, or ARM, and Application Performance Monitoring, or APM, can work together to solve some particularly knotty performance problems for your business applications.
0:24 So how do you assure app performance? In principle, it's simple: you need to make sure the applications have the resources they need, when they need them. This is where the goals of APM and ARM systems intersect. You can get steady, predictable performance by combining the application-level understanding of an APM with the infrastructure know-how of an ARM.
0:45 To explain how ARM and APM systems work, I'll walk through two scenarios. The first I call "it's the node". Instead of relying on educated guesses, I'll show how an ARM system can use infrastructure and application metrics to optimize cloud resources for performance. The second I call "it's the code". As a developer, I know that better runtime diagnostics will make my job easier. I'll show how an APM system can help debug tricky problems, whether they're in your dev or production environment. Let's get to it.
1:17 I referred to this first scenario as the "it's the node" view of performance because today's modern business applications run in containers, which are hosted by virtual machines that are on-premises or in the public cloud. But that's the infrastructure's view of the world. Your customers think in terms of what the application does, not how it executes.
1:36 Once deployed to select environments, Turbonomic Application Resource Management for IBM Cloud® Paks uses the data from standard APM APIs to discover both application and cloud entities. It then builds this end-to-end supply chain view from the top-level business application down to the supporting infrastructure. These diagram relationships act as a single source of truth for application performance in your hybrid environment.
2:00 In this case, Turbonomic is pulling in analytics data from IBM Observability by Instana APM. But it can also work with APM tools like AppDynamics, Dynatrace, and New Relic. No agents are required to be installed. With the help of this data, it recognizes their associated transactions, services, and application components, as well as the infrastructure components like containers, pods, and nodes.
2:25 Okay, let's return to my opening question. When the user reports their application is slow, what happens next? Whether it's a war room or a group chat, I'll bet your approach boils down to creating a whodunnit list of suspected services and infrastructure components.
2:40 How do we change this blame game? By automating resource allocation decisions based on data like application transactions and application CPU, or application heap. Not just large-grain metrics like container RAM usage. That's the secret sauce of the ARM/APM combo. It can avoid the mistake of incorrectly estimating resource requirements, which leads to performance problems.
3:04 The Turbonomic proposed actions shown here are fundamental ARM decisions independent of the underlying technology. Yikes! That's a lot of actions. There are actions for decreasing node memory, reallocating VM resources, and tweaking the permitted IOPS. But don't worry, lots of actions are a good thing.
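The point above about deciding from application-level data (heap, transactions) rather than only coarse container metrics can be illustrated with a small sketch. This is not Turbonomic's algorithm or API: the `AppSample` fields, the `recommend_memory_action` function, and the 85% and 40% thresholds are all hypothetical, chosen only to show how the two views of the same workload can disagree.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AppSample:
    """One observation of a containerized service (all fields hypothetical)."""
    heap_used_mb: float            # heap actually in use, as an APM tool might report it
    heap_max_mb: float             # configured maximum heap
    container_mem_used_mb: float   # coarse container-level RAM usage
    container_mem_limit_mb: float  # container memory limit


def recommend_memory_action(sample: AppSample,
                            high: float = 0.85,
                            low: float = 0.40) -> Optional[str]:
    """Propose an action from heap utilization; return None if no change is needed.

    The thresholds are illustrative assumptions, not values from any product.
    """
    heap_util = sample.heap_used_mb / sample.heap_max_mb
    if heap_util >= high:
        return "increase memory"
    if heap_util <= low:
        return "decrease memory"
    return None


# Container RAM looks busy, yet the application heap is mostly idle
# (for example, a runtime holding on to memory it has reserved).
sample = AppSample(heap_used_mb=512, heap_max_mb=2048,
                   container_mem_used_mb=3686, container_mem_limit_mb=4096)
container_util = sample.container_mem_used_mb / sample.container_mem_limit_mb
action = recommend_memory_action(sample)
print(f"container RAM utilization: {container_util:.0%}")
print(f"proposed action from heap data: {action}")
```

Looking only at container RAM, this workload appears close to its limit. The heap figures suggest the opposite conclusion, which is the kind of mis-estimate the finer-grained metrics are meant to avoid.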
3:23 Sure, until you trust the recommendations, you'll go through them manually, but long-term you let the system handle them automatically. In other words, the performance objective is decision automation for resource management, not just process automation. These actions can help you reduce your IT spend. And since the actions are based on usage data, you'll get more recommendations over time.
3:43 You begin by reviewing the actions you should take. This view provides context, like whether it's a non-disruptive action, such as increasing memory, or potentially disruptive, like reducing the resources of a VM. Once you validate enough proposed actions and you are confident the recommendations are trustworthy, you can move to the next stage: decision automation. That's the end game: eliminating resource problems as a source of application delay so your users don't need to call you about performance problems in the first place.
4:13 Okay, in the prior scenario, Turbonomic plus Instana helped intelligently manage resources based on actual runtime metrics. It took advantage of microservice metrics captured by Instana like load, latency, error rate, and saturation. That can lead to better performance and reduce IT costs. As a bonus, you can escape the resource management blame game. But for diagnosing actual performance problems at the code level, Instana's always-on tracing can really save debug time.
4:40 Let's take a quick look at the "it's the code" scenario. Based on a chat message from Cloud Pak® for Watson AIOps, you can proactively investigate and resolve brewing issues before they impact users. Combined with the deep-dive performance monitoring tools from Instana, you can confirm the details that led to an incident report. In this case, a sudden increase in the errors returned by the discount service.
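A sudden increase in a service's error rate, like the one in this incident, can be flagged by comparing each new sample against a rolling baseline instead of a hand-maintained absolute threshold. The sketch below is a generic illustration under stated assumptions, not Instana's detection method: the `SpikeDetector` class, its window size, the three-sigma rule, and the sample error rates are all hypothetical.

```python
from collections import deque
from statistics import mean, stdev


class SpikeDetector:
    """Flag samples that sit far above a rolling baseline (illustrative only)."""

    def __init__(self, window: int = 30, sigmas: float = 3.0, min_samples: int = 5):
        self.history = deque(maxlen=window)   # recent non-anomalous samples
        self.sigmas = sigmas
        self.min_samples = max(min_samples, 2)  # stdev needs at least two points

    def observe(self, error_rate: float) -> bool:
        """Return True if this sample looks like a spike relative to the baseline."""
        is_spike = False
        if len(self.history) >= self.min_samples:
            baseline = mean(self.history)
            # Floor the spread so a very flat baseline does not flag tiny noise.
            spread = max(stdev(self.history), 0.01)
            is_spike = error_rate > baseline + self.sigmas * spread
        if not is_spike:
            # Only normal samples update the baseline, so a spike does not mask itself.
            self.history.append(error_rate)
        return is_spike


# Made-up error rates for a service: steady around 1%, then a jump to 45%.
rates = [0.010, 0.012, 0.009, 0.011, 0.010, 0.011, 0.45]
detector = SpikeDetector()
flags = [detector.observe(r) for r in rates]
print(flags)
```

The baseline adapts as traffic changes, so nobody has to pick and maintain a fixed "alert above N%" value per service. The sigma multiplier is still a tuning knob, just a relative one.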
5:07 The event that triggered the incident shows that the error rate spiked suddenly. This unexpected behavior is automatically detected by Instana without the need to manually set and maintain thresholds. As shown in the related events, the abnormal termination of the MySQL process was the root cause. Once the database was brought back online, the discount service returned to normal operation.
5:29 Let's review the trace logs that pointed to the cause. Here we see that the failing calls to the discount service took about two seconds, suggesting a timeout situation. We can confirm this in the trace detail. It shows how the error in the timeline, indicated by the red triangle, propagated up the call stack, ultimately returning a server 500 error.
5:51 If needed, the ops team and dev team have access to more low-level detail, such as the call stack, application components stack, log details, even previews of the underlying source code. Now you see why, as a developer, I'm excited about how Instana helps diagnose and resolve difficult code problems, even in a production environment.
6:09 Okay, let's summarize. By integrating APM and ARM, the ARM system can automate decisions informed by application metrics as well as infrastructure awareness. That's why the combination of Turbonomic and Instana helps you take full advantage of the performance elasticity of cloud platforms without over-provisioning.
6:28 What could this mean to your company? ARM automation can address performance problems before they occur and reduce IT spend. This means more time and money to put towards expanding your business rather than reacting to alerts and trouble tickets.
6:42 Thanks for watching. If you'd like to see more videos like this in the future, please click like and subscribe.
6:48 If you want to learn more about Turbonomic Application Resource Management for IBM Cloud Paks and IBM Observability by Instana APM, please make sure to check out the links in the description.
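The staged path described in the first scenario, from manually reviewing proposed actions to letting the system execute them, can be sketched as a simple approval policy. Everything here is hypothetical: the `Mode` and `Action` types, the `needs_approval` function, and in particular the intermediate `AUTOMATE_SAFE` stage are assumptions made for illustration and are not taken from the video or from Turbonomic's product.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    RECOMMEND = "recommend"          # every action waits for a human
    AUTOMATE_SAFE = "automate_safe"  # assumed middle stage: only non-disruptive actions run unattended
    AUTOMATE_ALL = "automate_all"    # full decision automation


@dataclass(frozen=True)
class Action:
    target: str        # hypothetical entity name
    description: str
    disruptive: bool   # e.g. reducing VM resources may be disruptive


def needs_approval(action: Action, mode: Mode) -> bool:
    """Decide whether a proposed action must be reviewed manually."""
    if mode is Mode.RECOMMEND:
        return True
    if mode is Mode.AUTOMATE_SAFE:
        return action.disruptive
    return False


# The two example kinds of action mentioned in the transcript.
actions = [
    Action("discount-service", "increase memory", disruptive=False),
    Action("vm-worker-1", "reduce VM resources", disruptive=True),
]

for mode in Mode:
    queued = [a.description for a in actions if needs_approval(a, mode)]
    print(f"{mode.value}: needs manual review -> {queued}")
```

Starting in `RECOMMEND` mode and widening the policy only as trust builds mirrors the progression the speaker describes: validate enough proposed actions first, then move to decision automation.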