Zero Outage Map
As mentioned above, the Zero Outage Map is based on a value chain approach, so let’s first define the value chain concept in the context of IT.
The obvious question is why choose an IT value chain concept and what does it entail? The value chain is a well-known business concept, described by Michael Porter in 1985 (see graphic 2). The principles are pretty simple:
- A chain of related steps (primary activities) that successively create value for the respective stakeholders, e.g. customers, shareholders. The key to this definition is that the value outcome of the chain is greater than the sum of the parts. The additional value is created by the synergy of working together in an integrated fashion based on a common context.
- In order to create the value in an efficient and consistent fashion, the supporting activities come into play, which can be common business functions (e.g. procurement), technology (e.g. master data of a supply chain) or infrastructure (e.g. a conveyor belt setup in a production line). Even though these activities are not directly value generating, they provide the common connecting tissue that makes the value chain more efficient, repeatable and predictable.
IT cannot really claim predictability: the typical end-to-end maturity is fairly low, as is the level of collaboration across organisational and technology silos. IT has grown by answering technology disruptions with dedicated solutions in a siloed manner, never having the time to mature and truly integrate them into the overall landscape.
When thinking about this, it becomes glaringly obvious that the value chain concept is highly applicable to IT, and specifically to Zero Outage. First, when decomposing the steps required to deliver end-to-end services, one can quickly see how they need to be logically chained, still allowing iteration. Second, to deliver to the Zero Outage quality level, the chain needs to be highly automated, fault tolerant and most importantly predictable.
Leverage of IT4IT™
In order to design the Zero Outage Map as an end-to-end value chain, it seems obvious to leverage the IT4IT™ Value Chain standard, published by The Open Group, which provides a description of the IT landscape and how to run IT as a business (see graphic 3). Effectively, The Open Group applied and translated the Porter Value Chain to the IT problem. The huge advantage is that this is defined as an open industry standard, continuously reviewed and evolved by a representative group of consumers and providers.
The IT4IT™ value chain is structured into four value streams, describing the well-known Plan, Build, Run phases, but adding the Deliver phase, which segregates the concerns of creating service release packages and the actual implementation in production via service order and fulfilment catalogues. This is a direct response to IT trends, such as DevOps, Cloud and Service Broker becoming pervasive.
Furthermore, based on the value chain concept, The Open Group developed and published the IT4IT Reference Architecture standard, which provides a functional and an information model prescribing how to deliver services in a business fashion. This is a very good starting point for the Zero Outage Industry Standard to expand on, adding Zero Outage-specific architectural policies and data model aspects.
When looking at the IT4IT™ Value Chain, the commonality and applicability to the Zero Outage problem is obvious. All phases are significant in the evolution of the Zero Outage interpretation of the value chain, articulated by the Zero Outage Map:
- Properly rationalize the business demand and “Plan” for Zero Outage quality services within the given service delivery boundary conditions, such as strategic, financial, legal, and architectural. This entails strategic criteria, such as an operating model tuned towards Zero Outage, as well as specific design criteria, such as properly architecting the infrastructure stack to ensure the required level of resilience and security.
- Next, the services need to be “Built” to meet the criteria determined in the “plan” phase. In particular the specific design criteria need to be translated into the appropriate non-functional requirements, such as availability, performance and security, which are key to delivering Zero Outage. The Zero Outage Industry Standard will provide practical guidance to develop actionable requirements, which guide the proper service design and development of the service and are particularly useful to hold service providers accountable when sourcing elements of the service.
- Traditionally “plan” and “build” were followed by “run”, but the revolution of virtualisation technology, sourcing models and development methodologies (agile, DevOps) required the innovation of a fourth, intermediate step called “Deliver”. These revolutions all have one major consequence: complexity, making it much more difficult to see and track what is going on. That is in direct contradiction to the keys of the value chain concept, namely collaboration and predictability. The traditionalists would say “Keep cloud away from Zero Outage” and the New Agers would counter “Cloud solves your resilience problem by definition”. As always, there is a point in both, therefore we need to make services work in hybrid environments (being the dynamic mix of traditional and cloud elements), and we need to make them manageable. One crucial element is the ability to construct services from various catalogues and various providers, and to activate such services in heterogeneous, hybrid infrastructure environments while keeping them under control. This includes control of usage and respective charging, as well as manageability. Even though there is more to it, this is the essence of “deliver”, what IT4IT calls Request to Fulfil.
- Finally, and probably best known in terms of processes and management maturity, is the “Run” phase, which assures that the services in use are delivered at the required Zero Outage quality level and stay that way through the reality of inevitable, continuous and dynamic change. However, Zero Outage changes the game significantly. Organisations used to be fairly nonchalant about the notion of “proactive operations”; it seemed to be “good enough” to automate the known and fatalistically accept the disruption of new surprises. However, Zero Business Outage means that there can’t be surprises. Therefore Zero Outage requires innovation of the traditional “run” towards anticipating and preventing issues before they disrupt the business.
In addition to the four main phases of Plan, Build, Deliver and Run, the Zero Outage Map also reflects the fact that services become obsolete, hence a phase to “Retire” services has been added. Zero Outage services typically require costly and/or labour intensive components, which should be released and made available to other use cases when no longer required.
Like the IT4IT™ Value Chain, we place the Service Model in the centre as the connecting tissue and the heart of the Zero Outage Industry Standard, as described earlier, and of specific relevance to the platform and security workstream work.
We have chosen to depict the value chain as a circular rather than a linear model, knowing that modern IT requires continuous iteration between various capabilities within and between phases. In addition we chose to depict Supporting Functions as a surrounding frame and focused on selected functions specifically important for Zero Outage:
- Governance: Zero Outage has a lot to do with guaranteeing a certain quality, which in turn requires governance and control to ensure it actually happens. In addition, it requires governance continuously across the entire value chain. At any given point one needs to be able to determine the current state of service delivery and the required course of action.
- Risk Management: decisions always need to find the right balance between conflicting priorities and service delivery boundary conditions. In order to avoid any business degradation one can’t really afford surprises, hence risks need to be proactively understood and managed, especially the impact of actions and changes.
- Analytics & Reporting: this is a key enabler as it provides crucial insights to continuously improve service delivery. It could be argued that this is a sub-function of Governance and Risk Management, but it has evolved to be a science. Big data technologies have evolved to become a source of innovation in and of themselves, e.g. the determination of anomalies and anticipation of failure is only feasible through the level of analysis that can be done today.
- Supplier Management: multi-supplier service delivery is mainstream, even though the maturity of it is often more on the low end. One of the objectives of the Zero Outage Industry Standard is to structure and streamline the cooperation of suppliers jointly delivering Zero Outage compliant services. One of the key elements is to make the touchpoints between suppliers transparent and measurable.
The service model provides the common context throughout the execution of the value chain, from Plan to Run. It is the source of truth that captures and shares the relevant information about a service at any given point in time.
One can compare the model to master data controlling a supply chain. The consistency and integrity of the service model is the basis for achieving transparency and traceability of the characteristics of a service throughout its lifetime.
The service model evolves over the lifecycle of a service throughout the value chain:
- Conceptual service – the conceptual model represents the output of planning the service, essentially the description “why” and “what” needs to be built in a Zero Outage compliant architectural context.
- Logical service – the logical model expands on the conceptual, adding conclusions “how” the service is built to meet the Zero Outage relevant non-functional requirements in a Zero Outage compliant technology and system architecture.
- Physical service – the physical model further expands on the logical model “with what” the service has been realised in the physical world and how it's being managed and kept current. We delineate between two instances of the physical service: the “Desired Service” model as output of the request fulfilment and the “Actual Service” model as being recognized and managed in operations.
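One way to make the evolving service model concrete is to sketch the three levels as nested data structures, where each level references the design it refines. This is a minimal illustration only; the class and field names below are assumptions, not definitions from the Zero Outage or IT4IT™ standards.

```python
from dataclasses import dataclass

# Illustrative sketch: names are assumptions, not prescribed by the standard.

@dataclass
class ConceptualService:
    """Plan output: the 'why' and 'what' that needs to be built."""
    purpose: str
    quality_objectives: list

@dataclass
class LogicalService:
    """Build design output: 'how' the non-functional requirements are met."""
    concept: ConceptualService
    non_functional_requirements: dict
    architecture: str

@dataclass
class PhysicalService:
    """Realisation: 'with what' the service exists in the physical world."""
    logical: LogicalService
    components: list

concept = ConceptualService("online payments", ["Zero Outage availability"])
logical = LogicalService(concept, {"availability": "99.999%"},
                         "redundant app tier + HA database")

# Two physical instances exist: the "Desired Service" output by fulfilment
# and the "Actual Service" recognised in operations; both reference the
# same logical design, so any drift between them is visible.
desired = PhysicalService(logical, ["lb-1", "app-1", "app-2", "db-ha"])
actual = PhysicalService(logical, ["lb-1", "app-1", "app-2", "db-ha"])

# Traceability: from the running service back to the business purpose.
assert actual.logical.concept.purpose == "online payments"
```

The point of the nesting is traceability: every attribute of the running service can be linked back through the logical design to the conceptual purpose that justified it.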
So, throughout the value chain the respective capabilities are based upon a formal specification of the service, always talking about the same service but at different levels of granularity and specificity. The generic structure of the service model is described in the “Layered Model” section below, which the platform workstream will expand upon with detailed design.
A value chain implementation typically includes tools from various vendors, hence it is mandatory to have a common interpretation of the service model data, requiring a common syntax and semantics of its attributes. This is what the IT4IT™ standard started to create and continues to evolve. However, it is likely that Zero Outage use cases will require additional specifications, therefore we plan to cooperate with The Open Group and drive those additions through their standardisation process.
After exploring the Zero Outage Map on the highest level, let’s drill down one step deeper and look at the capabilities required for the individual phases of the value chain and their respective relevance for the Zero Outage use cases.
In this chapter the Zero Outage Map specification adds to the content of the IT4IT™ standard, but also slightly deviates. This is because the IT4IT™ standard focuses on the functional and information model rather than the capability level, which it only loosely articulates and mostly from a traditional IT perspective. Therefore this capability view adds forward thinking to reformulate known IT capabilities for the new requirements of the digital enterprise.
The overall capability view in Graphic 6 expands each phase with a cycle of four core capabilities that articulate what needs to happen. These cycles, however, neither work independently nor in a fixed waterfall-type manner. On the contrary, there are typically iterations within the cycle and interoperability between the cycles at any given point in time.
The important fact is that all these interactions happen in a transparent and traceable fashion, tracked in the common context of the service model, which evolves in the level of detail and prescriptiveness over the course of the value chain. It is important to understand the service model concept before diving into the capabilities themselves.
- Service Strategy and Sourcing – description of the business boundary conditions for delivering the specific service quality level for the target market, e.g. the Zero Outage justification for a target market (market opportunity, financials etc.), the portfolio priorities and the related sourcing strategy. The resulting conceptual service model codifies the architectural consequences of delivering the service according to these strategic boundary conditions.
- Enterprise Architecture Management – the methodology, the architecture and technology guidelines for the overall service and the underlying application(s) e.g. native cloud applications need to be architected as a set of integrated micro services in order to leverage the resilience capabilities of the underlying cloud infrastructure. The resulting structure, design guidelines and architectural policies become part of the conceptual model.
- Business Demand Management – rationalising, harmonising and grouping the business and operational demand for existing and new services into an actionable portfolio backlog. Actionable means translating the business purpose into a qualified demand specifically describing the non-functional characteristics. That allows the backlog to be effectively prioritised according to the strategic and architectural guidelines, which guides agile development.
- Portfolio Management – determining/analysing the scope and value of the “to be” portfolio based upon what has been done (“as is”), what should be done (value generating backlog priorities) and what can be done (budget, skills). The result is a conceptual model articulating a clear proposal/expectation per desired service in the portfolio.
Specific relevance to Zero Outage: in order to achieve Zero Outage, it is NOT enough to simply improve the maturity of managing the service. The objective is to fundamentally change the approach from reactive to proactive management, to avoid issues before they occur. The prerequisites for addressing that problem are transparency and predictability of what is being done, which requires consistent and formal documentation throughout the lifecycle of a service, as the basis of learning what we don’t know. One could argue that all capabilities are important, but two seem most relevant:
- Enterprise Architecture – transparency and predictability start with a structured (service-oriented) architecture of the service model and the platform that provides the required architectural and technical criteria.
- Business Demand Management – one can avoid the majority of issues by building the right service and building it right in the first place. The latter is critical in avoiding service degradation: building the service based on the right availability, performance and security requirements. Practical guidelines and examples as to how to translate generic demand, e.g. business continuity, into the right architecture are critical for Zero Outage and will make a notable difference.
Generic description of the capabilities
- Requirements Management – for the services to be newly sourced/developed or updated, the related business demands must be translated into functional requirements. Most important for Zero Outage is the translation of architectural policies into non-functional requirements, e.g. security, performance, resilience. Analysing, scoping, prioritising and planning the requirements as consumable, packaged value.
- Service Design Engineering – translating the conceptual into a logical model of the service. Developing user stories and detailed design from the respective requirements in the backlog. Driving sourcing decisions on the service component level.
- Service Development and Testing – the actual development of the service, which might entail SW development, sourcing or buying elements of the SW, system integration etc. There is no limitation regarding the choice or mix of agile and waterfall models. Once the consumable value is integrated, the function of the service and the non-functional constraints need to be verified for real life.
- Service Release Management – once the service is sufficiently tested (based on the intended quality level) and the risk analysis meets the release criteria, the service is ready for general release; a deployable release package is generated and verified to ensure it can be successfully deployed.
Specific relevance to Zero Outage: building the service continues the thread of designing and building the service right. The following two capabilities are especially exposed to that problem:
- Requirements Management – planning the right architecture is the first step; now this needs to be broken down into actionable and measurable non-functional requirements that can be acted upon by developers to create better code, which is better suited to the target production environments. The ability to formulate high-quality non-functional requirements typically has low maturity in most organisations, hence practical best practices will help tremendously.
- Service Design Engineering – consequently the next critical step is to design a Zero Outage compliant service, which translates the non-functional requirements into the appropriate service architecture and its related logical service model. This involves decisions, such as which deployment model (traditional vs. cloud) to use for which layer, and taking the interdependencies between layers into account. Zero Outage design principles will prescribe the right approach to service modelling and its underlying technologies. Equally important for Zero Outage is the design of the required test cases and their level of automation.
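As a rough illustration of what “actionable and measurable” could mean in practice, a non-functional requirement can be expressed as structured data that always names how it will be measured or verified. The attribute names and values below are hypothetical, invented purely for illustration, not prescribed by the Zero Outage Industry Standard.

```python
# Hypothetical illustration of turning vague demand ("the service must be
# highly available") into actionable, measurable non-functional requirements.
nfrs = {
    "availability": {"target": "99.999%",
                     "measured_as": "successful / total requests, monthly"},
    "performance":  {"p99_latency_ms": 200,
                     "measured_as": "server-side, per endpoint"},
    "resilience":   {"single_failure_impact": "none",
                     "verified_by": "failover test per release"},
    "security":     {"tls_min_version": "1.3",
                     "verified_by": "automated scan in CI"},
}

def is_actionable(requirement: dict) -> bool:
    """A requirement only guides development (and holds suppliers
    accountable) if it states how it is measured or verified."""
    return "measured_as" in requirement or "verified_by" in requirement

# A vague demand fails the test: there is nothing to measure or verify.
vague = {"availability": {"target": "high"}}
```

Expressing requirements this way makes them checkable per release, which is exactly what is needed to hold service providers accountable when sourcing elements of the service.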
Generic description of the capabilities
- Service Offer Creation – instead of just passing a release along to the operations team, the activation of the service is based upon an offer catalogue specification from which services can be consumed. Therefore the release package is published as a service catalogue entry. In today’s virtualised world, this may involve aggregation of various service components from different catalogues into one consumable item.
- Service Consumption Management – providing a seamless consumption experience via the service catalogue (e.g. as part of the self-service portal) facilitating the shopping, request and ordering process, hiding the complexity of the service from the user (individual or business).
- Service Activation - when being ordered, the service needs to be activated based on the consumer requirements. This involves determining the required physical or virtual infrastructure on which to realise the service, interlocking with the change process. Again, in today’s hybrid IT world, this may likely involve deploying components of the service to different physical infrastructures using different fulfilment engines.
- Service Usage Management - measuring the usage of the activated service and facilitating the appropriate charging, specified in the delivery model, e.g. cross-charge or general allocation through the IT Financial Management system in place.
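The aggregation of service components from different catalogues into one consumable item, described above, can be sketched minimally as follows. The catalogue entries, component names and provider names are invented for illustration; a real catalogue system would carry far richer metadata.

```python
# Hypothetical fulfilment catalogues from two different providers.
compute_catalogue = {"vm.small": {"provider": "cloud-a", "resilience": "zone"}}
database_catalogue = {"db.ha": {"provider": "cloud-b", "resilience": "multi-zone"}}

def aggregate_offer(name, parts, catalogues):
    """Combine components from several fulfilment catalogues into one
    consumable catalogue entry, keeping the component structure visible
    so the service stays traceable and manageable."""
    offer = {"name": name, "components": {}}
    for part in parts:
        for catalogue in catalogues:
            if part in catalogue:
                offer["components"][part] = catalogue[part]
                break
        else:
            raise KeyError(f"component {part!r} not found in any catalogue")
    return offer

# One consumable item, assembled from two providers' catalogues.
offer = aggregate_offer("payments-service", ["vm.small", "db.ha"],
                        [compute_catalogue, database_catalogue])
```

The consumer orders one item; the preserved component structure is what later allows activation across heterogeneous, hybrid infrastructures while keeping usage and charging under control.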
Specific relevance to Zero Outage: the structure of the catalogue based service activation is important to sustain data integrity, especially due to the complexity and lack of transparency of cloud and multi-supplier models. The following two capabilities prominently drive that characteristic:
- Service Offer Creation – it is critical to model the complexity of virtualised, distributed service components in the service catalogue system. The modelled structure of related components drives the automatic service aggregation of the components from the underlying fulfilment catalogue(s).
- Service Activation – parts of the service may reside in different deployment models controlled by different suppliers, hence it is critical to maintain access points in the desired service model (e.g. through a standardised API gateway). The deployment models might dynamically change, hence integration points are critical to manage these changes while maintaining the related Zero Outage requirements, such as the resilience level of a component. Again, the aforementioned design principles will include that prescription.
Generic description of the capabilities
- Preventive Health Management – classically this includes monitoring the availability and performance of the services and driving event management. For Zero Outage that is not sufficient, though: it is all about anticipating issues before they occur.
- Service Assurance – managing processes to assure that the required service levels are being met, which includes help desk, incident and problem management. This involves the ability to analyse and diagnose the relevance, potential impact and cause of potential issues, proactively avoiding the breach of service levels.
- Knowledge Management & Automation – translating learnings into predictable and repeatable actions. Determining the required tasks, priorities and timelines for repair while keeping the overall resilience level of the service. Translating best practices into automated runbooks, minimising manual effort and human error.
- Configuration & Change Management – discovering the actual state of the service, comparing it to the desired state and reconciling inconsistencies (via the physical service model). This provides the basis for managing the risk and the execution of required changes. Those actions could be performed and tracked automatically (e.g. adding bandwidth) or formally governed and requiring interaction (e.g. version upgrade), based on the guidelines specified in the service model. Patching is a specific type of change, similar to a release; the difference becomes marginal in the agile world.
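The discover-compare-reconcile idea can be sketched minimally, assuming (purely for illustration) that the desired and actual states are available as flat key-value configurations, which is a strong simplification of a real physical service model:

```python
def reconcile(desired: dict, actual: dict) -> dict:
    """Compare desired and actual configuration and report the drift
    that change management then has to govern or auto-remediate."""
    missing = {k: v for k, v in desired.items() if k not in actual}
    unexpected = {k: v for k, v in actual.items() if k not in desired}
    changed = {k: (desired[k], actual[k])
               for k in desired.keys() & actual.keys()
               if desired[k] != actual[k]}
    return {"missing": missing, "unexpected": unexpected, "changed": changed}

# Illustrative states: a lost replica (resilience risk) and an
# unexpected open debug port (security risk) both show up as drift.
desired = {"app_version": "2.1", "replicas": 3, "tls": "1.3"}
actual = {"app_version": "2.1", "replicas": 2, "tls": "1.3", "debug_port": 8080}
drift = reconcile(desired, actual)
```

Which drift items may be auto-remediated and which require a formally governed change is exactly the kind of guideline the service model is meant to specify.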
Specific relevance to Zero Outage: the intention of Zero Outage is to guarantee no service degradation, while changes are performed on components of the service. That means we cannot afford surprises managing services, hence we need to manage key characteristics (e.g. resilience) proactively at all times, and we need to learn what we don’t know. One could argue that all Run capabilities (integrated) are critical for Zero Outage, but two seem to be most critical for proactive resilience:
- Preventive Health Management – we need to learn what we don’t know, to anticipate issues before they occur. While the end-to-end monitoring of structured data from the known environment is very important, it is not enough. It needs to evolve towards understanding patterns of behaviour, pinpointing abnormal behaviour, and investigating and mitigating/resolving those proactively. Transparency and integration of structured data is the basis for establishing a system of record, but the ability to capture vast amounts of unstructured data, and to integrate and analyse it in the context of the system of record, brings it to a different level. The use of modern big data technologies is a true innovation opportunity towards establishing a system of insight for IT that helps avoid surprises.
- Configuration & Change Management – still today, many issues resulting in outages are based on poorly managed changes. Managing the risk of change and sustaining the service model integrity is key to achieving Zero Outage. This is not only a process question, but also needs to be reflected in the underlying data structure, namely the service model and the way access points are defined and brokered.
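The shift from monitoring known thresholds to pinpointing abnormal behaviour can be illustrated with a deliberately simple baseline-deviation check. Real preventive health management would use far richer statistical or machine-learning models over structured and unstructured data, so treat this purely as a sketch of the principle:

```python
import statistics

def anomalies(series, window=10, threshold=3.0):
    """Flag points that deviate strongly from the recent baseline,
    a minimal stand-in for pattern-of-behaviour analytics."""
    flagged = []
    for i in range(window, len(series)):
        baseline = series[i - window:i]
        mean = statistics.mean(baseline)
        stdev = statistics.pstdev(baseline) or 1e-9  # guard flat baselines
        if abs(series[i] - mean) / stdev > threshold:
            flagged.append(i)
    return flagged

# Illustrative latency samples: the spike at index 11 is flagged before
# any fixed service-level threshold would have been breached.
latency_ms = [20, 21, 19, 20, 22, 21, 20, 19, 21, 20, 20, 95, 21]
print(anomalies(latency_ms))  # → [11]
```

The value is in the timing: the deviation is surfaced as an investigable signal the moment behaviour departs from the norm, rather than after a service level has already been breached.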
We earlier introduced four supporting functions that do not directly create value themselves but support the value generating capabilities across the chain to enable consistency and efficiency. Hence, one can look at supporting functions as overarching principles that need to be applied within all of the capabilities.
The graphic above articulates a couple of bottom line aspects, conclusions and dependencies:
- Governance is NOT a control instance from above, but is stimulating the right decisions from within. Decisions need to be made at each phase across the value chain, e.g.:
- Defining the right policies to drive zero outage quality.
- Designing the service system according to the requirements.
- Activating the service in the right deployment model.
- Doing the right changes to the running service at the right time to further improve rather than impair quality.
Hence we need to think about the right stimuli driving the right behaviour within the capabilities across the value chain.
- Risk Management is NOT as much concerned with the known issues, but is attempting to understand the unknown and control the impact that stems from it. Sourcing questions carry a lot of uncertainty, release decisions always balance the conflicting objectives of time and quality, and changes in a complex production environment often cause seemingly unforeseeable domino effects. Every step of creating and managing zero outage services carries risk to be managed and controlled, hence we need to understand the uncertainties.
- Analytics & Reporting is NOT focused on creating after-the-fact charts to demonstrate the results of what has been done. On the contrary, for zero outage service delivery it needs to focus on creating insight into how things are being done at any given point in time. The objective is not to justify the past but predict the future, and with that enable intelligent and situational decision-making to guarantee zero outage. Naturally, like governance, that has to happen within every capability across the value chain.
- Supplier Management is NOT trying to procure individual pieces of a zero outage service from individual suppliers, but manages the construct of dependencies between all the suppliers that are necessary to plan, build, deliver and run a zero outage service. We need to make sure that every constituency in that construct understands its role and dependencies to other constituencies, fostering collaboration between all involved parties.
It becomes obvious that all four supporting functions are heavily interrelated and dependent on each other:
- In order to drive the right decisions, the appropriate insight is required at the right time. On the other hand, understanding the decisions to be made is crucial to design the appropriate analytics in the first place.
- Sourcing a portion of a service is an uncertainty, at least perception-wise, as it is outside the immediate control. Supplier management is an input into risk management as well as a means to mitigate said risk.
- Governance drives supplier decisions as part of an overall architecture, which defines the required analytics, which feeds the insight to understand the uncertainties, which in turn refines the governance for the certainties.
All of the above only works if the data underlying the value chain is made transparent and traceable, which is enabled by the heart of the value chain, the service model. Understanding all of the intricacies of the service and their relationship is the basis for the supporting functions as much as it is the basis for the capabilities.
Let’s focus first on the two supporting functions most critical and intrinsic to the Zero Outage value proposition, Governance and Risk Management.
One can find various definitions of the term “IT Governance”, articulated by standards associations, consultancies and IT vendors. The common denominator comes down to the following: “IT Governance is a framework that ensures IT supports and enables the achievement of company strategies and business objectives in an effective and efficient manner.”
There are a couple of important terms in there that imply consequences:
- Framework: it does not say capability or function or process or even organisation, but implies a set of principles and methodologies, leaving the actual implementation open.
- Business objectives: the focus is on achieving the intended end result, hence on the outcome of the IT value chain, rather than on the result of individual steps in between.
- Ensures: in order to ensure the achievement, governance needs to direct the value chain, to monitor the value chain, and evaluate the results, which in turn creates directives.
- Effective and efficient: besides the business objectives, the directive also includes IT objectives for how to achieve the business results.
With the definition and specific relevance of the Zero Outage Map capabilities in mind, a key conclusion becomes obvious:
- For Zero Outage to become a reality, the culture applied in the value chain needs to change towards a self-governing, fully transparent, and collaborative attitude with zero tolerance for quality compromises.
- Therefore, governance can NOT be a separate capability, process, function, or even organisation that controls the IT value chain from the outside, but a set of principles and methodologies need to be consistently applied within the value chain.
The monitor-evaluate-direct construct fulfils two purposes:
- Top-Down: breaking down and refining the overall business objectives into corresponding, more granular and actionable objectives for the capabilities down the road. It is essential that service quality objectives, experienced by the consumer, can be linked back to the strategy that drives their creation. It starts with the Enterprise Architecture capability in the planning phase, continued by Service Design in Build, Service Offer Creation in Deliver and Service Assurance in Run, which play the key role in deriving and setting the respective directives.
- Bottom-up: continuously improving the way things are being done by providing full transparency and traceability of service creation and delivery aspects across the value chain, enabled by the service model. At any given point in time execution effectiveness and efficiency can be evaluated against the respective objectives.
It is easier said than done to drive a change towards such a culture and attitude, but it is as critical as defining the appropriate architecture with the value chain definition at its heart. Actually, the two need to go hand in hand: discussions about culture will become more concrete when moving from capability description into the functional and integration prescription of the underlying reference architecture. On that level the limiting and desired behaviours become more obvious, driving the definition of governance objectives and corresponding KPIs (key performance indicators).
Again, let’s first look at a definition that attempts to capture the popular approaches from various constituencies, such as ISO (International Organisation for Standardisation), NIST (National Institute of Standards and Technology), ISACA (Information Systems Audit and Control Association) and TOG (The Open Group): “The ongoing total process of identifying, controlling, and eliminating or minimising uncertain events that may affect system resources.”
That translates into a generic framework of processes:
- Risk assessment: identification (assets, threats, vulnerabilities, controls, consequences), estimation (quantitative, qualitative), evaluation (compare risk level against criteria, prioritize)
- Risk mitigation (or treatment): deciding the appropriate security measures/controls to reduce, retain, avoid or transfer risks
- Risk monitoring and review (or evaluation and assessment): ongoing regular check of implemented security measures/controls, e.g. vulnerability assessment
Then let’s look at the definition of the managed object, IT Risk:
- “Risk is an uncertain event or condition that, if it occurs, has an effect on at least one objective” (PMI)
- Or, “Risk is the probable frequency and probable magnitude of future loss” (TOG)
- Or, more focused on information security, “Risk is the potential that a given threat will exploit vulnerabilities of an asset or group of assets and thereby cause harm to the organization” (ISO)
To put it into a simple formula: Risk = Likelihood * Impact, whereby
Likelihood = Threats * Vulnerabilities / Proactive Controls and
Impact = Consequence to the Business / Reactive Controls.
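The formula can be made concrete with a short sketch. All inputs are assumed to be dimensionless scores on arbitrary illustrative scales; the point is only to show how the factors interact, in particular how strengthening proactive controls reduces likelihood and therefore risk.

```python
# Sketch of: Risk = Likelihood * Impact, where
#   Likelihood = Threats * Vulnerabilities / Proactive Controls
#   Impact     = Consequence to the Business / Reactive Controls
# All values below are illustrative, dimensionless scores.

def likelihood(threats: float, vulnerabilities: float,
               proactive_controls: float) -> float:
    return threats * vulnerabilities / proactive_controls

def impact(consequence: float, reactive_controls: float) -> float:
    return consequence / reactive_controls

def risk(threats: float, vulnerabilities: float, proactive_controls: float,
         consequence: float, reactive_controls: float) -> float:
    return (likelihood(threats, vulnerabilities, proactive_controls)
            * impact(consequence, reactive_controls))

# Doubling proactive controls halves the likelihood and thus the risk score:
base = risk(4, 3, 2, 10, 5)        # likelihood 6.0 * impact 2.0 = 12.0
hardened = risk(4, 3, 4, 10, 5)    # likelihood 3.0 * impact 2.0 = 6.0
print(base, hardened)              # → 12.0 6.0
```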
When looking at IT statistics, outages and security breaches are at the top of the list of high-priority, most impactful IT uncertainties, which places them at the centre of the Zero Outage problem statement. Therefore, risk management is
- a key critical methodology
- inherently related to governance
- consequently an integral set of principles and methodologies that need to be consistently applied with the capabilities across the value chain
And the same comments made in the governance section regarding culture and behaviours apply to risk management as well. This section explains “WHY” risk management is critical; when we dive into the reference architecture work, we will become prescriptive about “HOW” to actually use the theory as part of the day-to-day capabilities.
Now let’s take a look into the other two supporting functions that also provide the basis for managing risk and governing a zero outage service delivery successfully: Analytics & Reporting and Supplier Management, or better multi-supplier management.
Although these terms are used fairly frequently and seem to be fairly well understood, it makes sense to first define the outcome and position the function in the context of the Zero Outage value chain.
Ultimately, the proper leverage of analytics and reporting increases value and speed, while reducing cost and risk. In order to optimize that outcome, analytics and reporting provides the required insight into the value chain to enable the right decisions at the right time. With that it directly supports the outcome of the overall value chain, rather than optimizing any particular capability within it. It is therefore correctly positioned as a supporting function across the entire value chain.
Analytics and reporting are actually two different functions that work closely together to achieve the desired outcome. Analytics explores the data about the value chain and turns it into meaningful insight. Reporting organizes that insight into summarized information for specific audiences to help them make the best decisions in their realm of responsibility, which might relate to a distinct capability (e.g. rationalize portfolio priorities), a phase of the value chain (e.g. deliver consumer excellence) or the entire value chain (e.g. govern for continuous improvement). Whatever decision is enabled at whatever point in the value chain, the insight focuses on optimising the outcome of the value chain at large, rather than sub-optimising a certain capability or function along the way.
Since analytics and reporting is all about turning data into insight, let’s have a look at the different types of systems involved:
- System of record: holds and keeps the integrity of the authoritative data to run the value chain, essentially the structured master data providing the common context for capabilities along the value chain to perform collaboratively and consistently. Obviously this has a lot to do with the service model and associated data evolving along the value chain, representing what we know about service delivery. It is structured data, easy to query and allowing deductive, logical reasoning.
- System of engagement: holds data about the interactions with the value chain, including both human and system interactions, such as consumption utilization, quality experience or monitored data. The respective capabilities align such transactional data with the relevant structured data of the system of record, so it can be used for deductive analysis as well as for discriminant analysis (e.g. determining patterns in monitored data).
- System of insight: integrates and analyses data from the systems of record and engagement as well as value chain external context (e.g. unstructured social media or business process performance). Technologically (big data) it is able to consume vast amounts of structured, unstructured and transactional data, producing a data lake on which behaviour-driven insights can be analysed.
The resulting insight can then directly feed business rules implementing logic, or inform stakeholders in a focused way to take decisive action.
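The interplay of the three system types can be sketched as follows. This is a hypothetical toy example: the service model entries, latency events and the 200 ms threshold are all invented for illustration; real systems of insight operate on far larger, partly unstructured data.

```python
# Hypothetical sketch: joining structured master data (system of record)
# with transactional events (system of engagement) to derive an insight
# (system of insight) that feeds a simple business rule.
from collections import defaultdict
from statistics import mean

# System of record: authoritative, structured service model (assumed shape)
service_model = {
    "svc-42": {"name": "payments",  "tier": "gold"},
    "svc-77": {"name": "reporting", "tier": "bronze"},
}

# System of engagement: monitored response times, aligned to the service model
events = [
    {"service": "svc-42", "latency_ms": 180},
    {"service": "svc-42", "latency_ms": 240},
    {"service": "svc-77", "latency_ms": 900},
]

# System of insight: integrate both sources into per-service insight
latencies = defaultdict(list)
for e in events:
    latencies[e["service"]].append(e["latency_ms"])

insights = {
    sid: {"tier": service_model[sid]["tier"], "avg_ms": mean(samples)}
    for sid, samples in latencies.items()
}

# Business rule fed by the insight: flag gold-tier services averaging > 200 ms
alerts = [service_model[sid]["name"]
          for sid, i in insights.items()
          if i["tier"] == "gold" and i["avg_ms"] > 200]
print(alerts)  # → ['payments']
```

Note how the rule only becomes possible once the transactional engagement data is interpreted against the authoritative context of the system of record.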
While it is important to understand the different sets of data and to structure the conceptual approach accordingly, the reference architecture work will guide how to apply the concept in reality. E.g. the integrated analysis of service consumption experience and monitored data on top of an accurate physical service model is the basis to move from traditional incident to preventive health management.
Supplier Management drives the crucial and seamless collaboration between all the suppliers that are necessary to plan, build, deliver and run a zero outage service.
The basis for achieving that lofty goal is an understanding of all service-oriented dependencies between the components of a service, as codified in the service model. As with all supporting functions, supplier management is applied throughout the entire value chain in order to manage relationships and contracts with all involved parties.
The sourcing strategy provides the boundary conditions to determine potential types of supplied values that are outside the core competency of the company, within or outside IT. This might entail entire services, underpinning dependent services, technology components to build the service or value chain components (sourced capabilities) to deliver the service or parts of it.
For each element to be sourced, suppliers and their “product” need to be evaluated, and mutual expectations need to be negotiated, decided and contractually formalized before the sourced “product” is made consumable, either as part of the service or of the value chain.
Budget and service cost management are necessary dependent functions throughout the decision making process along the value chain to drive make vs buy decisions. Collaboration tools, such as ChatOps, enable the continuous synchronization of common context and mutual expectations.
The real-time sharing of insights with all involved parties is absolutely crucial in order to act as one team towards delivering zero outage quality, both preventively to avoid failures (e.g. joint analysis of patterns) and reactively to restore components before they impact the business (e.g. service-model-context-sensitive event/incident case exchange). These are just examples; more applications of the model will materialize when diving into the reference architecture.