Rebecca Wirfs-Brock and I developed a workshop on decision making in software architecture. This is an extract from a paper from the IEEE International Conference on Software Architecture (ICSA 2019) that describes the workshop. The final version is in the ICSA proceedings. A preprint of the full paper is available here.
First, we shape our architecture. Then, our architecture shapes us. As architects we bring part of ourselves to the systems we work with. We evolve with our architectures. In this tutorial we consider the metaphor of “terroir” to understand architectures and their sense of place. Terroir is the French word used to describe the set of all environmental factors that affect the observable characteristics of an organism, e.g., the unique set of contextual characteristics of place that influence food crops, coffee, tea, or wine. So too in systems, architectures are uniquely shaped by the culture and context of a place. Factors include people, organization, culture, technology, and tenets shared among the architects and makers. Understanding an architecture is a first step towards evaluating it. The concepts and practical tools covered in this tutorial are well suited to conducting architecture analyses and reviews, and they integrate with any other processes an organization might be using.
Structure and Outline
- Part 1: Making sense of system architectures. The tutorial begins by introducing some concepts that help people to make sense of a system architecture. The outcome includes insight not just into the architecture itself, but also the wider context, including culture, decision-making processes, attitudes, constraints, and assumptions that contribute to the architecture. We will demonstrate how to see and interpret patterns to understand architecture context and understand the decision-making landscape of which architects are part.
- Part 2: Decision models for architects. Having established a sense of place for the architecture, we will move into discussing decision models. Different kinds of decision are necessary to evolve our architectures. Sometimes we need to make high-stakes decisions under conditions of uncertainty, with insufficient information, and too little time. Other times we need to balance deep thought, collaboration, and trade-offs among different architecture qualities.
- Part 3: Taking action to evolve our architectures in conditions of uncertainty. Once we have a sense of place, and we have decided how we will make decisions, we will move into action. In this tutorial we focus on making decisions and acting in conditions of volatility, uncertainty, complexity, and ambiguity. We explore the roles of heuristics and experimentation for making decisions under such conditions, and how this influences the evolution and evolve-ability of our architectures.
- Part 4: Practical considerations for the dimension of time in architecture decisions. In this section we will look at the time factors that affect our architectures and architecture decisions. These include when decisions are made, the cadence of decision making, the impact of decisions over time, and challenges around ensuring follow-through and consistency of decisions over time.
- Part 5: Summary and closing activities. Summary of concepts, decision models and tools; Q&A. In this section we spend time to ensure participants have at least one or two practical things they are ready to try when they get back to the office.
This is the abstract and recommendations from a paper on understanding the context in which architects make decisions that I co-wrote with Rebecca Wirfs-Brock. We presented the paper at the 2018 European Conference on Software Architecture (ECSA 2018) in Madrid. The full paper is available in the ECSA 2018 proceedings, published by Springer. A preprint version is available here.
Many organizations struggle with efficient architecture decision-making approaches. Often, the decision-making approaches are not articulated or understood. This problem is particularly evident in large, globally distributed organizations with multiple large products and systems. The significant architecture decisions of a system are a critical organization knowledge asset, as well as a determinant of success. However, the environment in which decisions get made, recorded, and followed up on often confounds rather than helps articulation and execution of architecture decisions. This paper looks at aspects of architecture decision-making, drawing from an industry-based case study. The data represents findings from a qualitative case study involving a survey and three focus groups across multiple organizations in a global technology company. Architects in this organization are responsible for multiple products and systems, where individual products can include 50 or more teams. The impact is not just on others in the system; architecture decisions also impact other decisions and other architects. The findings suggest recommendations for organizations to improve how they make and manage architecture decisions. In particular, this paper notes the relevance of group decision-making, decision scope, and social factors such as trust in effective architecture decision-making.
- Consider the space-time separation of teams, and how that impacts architecture decisions. When dealing with teams who are separated in space (through multiple geographies) and time (through multiple time zones), make an effort to compartmentalize the scope of responsibility of teams such that coherent architecture decisions can be made in each location.
- Establish clear decision-making boundaries. Articulate who is responsible for which type of decisions. This can be based on scope of decision (product, system, component, etc.), nature of decision (product, technology, etc.), or something else.
- If your organization is using an agile development approach, then take the time to articulate how architecture fits.
- Understand who is impacted by decisions made by architects. Establish a feedback loop so that architects understand that impact in a timely manner.
- Start with why. Architects in this study reported a much higher degree of success in decision adoption when other people understood why a decision was being taken. This is an important part of the context of architecture decisions.
- Take the time to foster trust among architects and those impacted by decisions.
- Consider how architecture decisions are retained and communicated. We see a need for retaining and communicating architecture decisions and their rationale, especially when decisions have broad impact. Documenting decisions, to be effective, should fit into existing processes.
- Some decisions are necessarily made for short-term expediency, e.g. to address an immediate customer need. Consider a mechanism to flag these decisions (especially those that will incur architecture debt) and manage them, perhaps in a product debt backlog, for periodic review.
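As an illustration of the recommendations above, here is a minimal sketch of how a decision record might capture the rationale, scope, owner, and impacted parties, and flag expedient decisions for periodic review. It is not taken from the paper: every name here (`DecisionRecord`, `Scope`, `debt_backlog`, the 90-day review interval, the sample decisions) is a hypothetical choice made for the example.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from enum import Enum
from typing import List


class Scope(Enum):
    # Hypothetical scope levels, mirroring "product, system, component".
    PRODUCT = "product"
    SYSTEM = "system"
    COMPONENT = "component"


@dataclass
class DecisionRecord:
    title: str
    rationale: str               # the "why" behind the decision
    scope: Scope                 # which decision-making boundary it falls in
    owner: str                   # who is responsible for the decision
    decided_on: date
    impacted: List[str] = field(default_factory=list)  # teams to hear back from
    expedient: bool = False      # made for short-term expediency?
    review_after_days: int = 90  # assumed review interval, not from the paper

    def review_due(self, today: date) -> bool:
        """Only expedient decisions are queued for periodic review."""
        if not self.expedient:
            return False
        return today >= self.decided_on + timedelta(days=self.review_after_days)


def debt_backlog(records: List[DecisionRecord], today: date) -> List[DecisionRecord]:
    """Return the expedient decisions whose review date has arrived."""
    return [r for r in records if r.review_due(today)]


# Hypothetical sample data.
records = [
    DecisionRecord(
        title="Hard-code region list for customer launch",
        rationale="Immediate customer need; configuration service not ready",
        scope=Scope.COMPONENT,
        owner="component architect",
        decided_on=date(2018, 1, 10),
        impacted=["billing team"],
        expedient=True,
    ),
    DecisionRecord(
        title="Adopt a shared event schema",
        rationale="Reduce integration cost across products",
        scope=Scope.SYSTEM,
        owner="system architect",
        decided_on=date(2018, 1, 15),
        impacted=["all product teams"],
    ),
]

due = debt_backlog(records, today=date(2018, 6, 1))
print([r.title for r in due])
```

Keeping the rationale and the impacted parties in the same record as the decision is one way to make "start with why" and the feedback loop part of an existing process rather than a separate document.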
This is the abstract from a paper I wrote about my experiences using sensemaking in large-scale transformation efforts. I presented this at the 49th Hawaii International Conference on System Sciences (HICSS 2016). The final paper is available as part of the HICSS proceedings. A pre-print is available here.
For organizations undergoing agile and lean transformation, it can be difficult to get meaningful, actionable insights into progress and impediments. Teams and organizations are best understood as complex adaptive human systems. Understanding what is happening in such systems requires approaches grounded in the complexity sciences and social sciences. This paper describes an approach using complexity science and sensemaking that helps an organization understand its culture, how it is progressing with its strategic initiatives, and the types of impediments that are holding it back. It provides a means of qualitative and quantitative analysis that helps teams and organizations improve. This paper also correlates the experiences of the people in the organization to its goals of being a more agile organization.
This is the abstract and summary of lessons learned from an experience report I wrote and presented at the Agile 2015 conference in Washington DC. The full paper is available here. Among other things, the paper talks about using A3 problem solving, Cynefin, and the Containers, Differences, Exchanges model from Human Systems Dynamics in the context of portfolio management in large organizations.
Working in a multi-team, multi-program, multi-product environment brings several challenges. One of those is providing a smooth flow of work to teams, and incorporating their feedback, while staying responsive to the needs of the business in a changing environment. Managing the portfolio backlog is a critical piece of the solution. This Experience Report documents several years’ experience working in such environments. The focus of this Experience Report is specifically on managing the portfolio backlog, not the full scope of what could be considered under a portfolio management strategy and implementation. We have found that getting the portfolio backlog management strategy right is a key element in the success of the overall portfolio management approach.
Summary of Lessons Learned
This section summarizes some of the key lessons learned in managing portfolio backlogs. Some general lessons related to solving problems in organizations include:
- Understanding the nature of the problem helps us to take appropriate action to solve the problem. The Cynefin framework helps with this.
- Make sure you are solving actual problems and causes, not just symptoms. A3 problem solving helps with this.
- Understand how to create a balance between agility, self-organization and coherence. HSD and the CDE model help with this.
- Focus on the end-to-end flow of value through your organization, and on actively removing anything that impedes the flow of work. Lean thinking helps with this.
- Understand what success and failure could look like before running your experiments. This will help you be selective about the patterns you pay attention to.
Some specific lessons related to managing portfolio backlogs in large organizations include:
- Define the focus of your portfolio. In general, it is good practice to base the portfolio structure on your product line rather than organization structure. The former is what your customers care about; the latter is more transient.
- Understand what content goes on the portfolio backlog. Define different types of items, e.g., features, initiatives, architecture items, etc.
- Focus on the flow of work from portfolio to teams. The portfolio backlog management approach is an enabler of flow. Define policies for centralized portfolio-level decisions and localized program- and team-level decisions.
- Set up a portfolio backlog management meeting at a regular cadence with the right participants. Create a Definition of Ready for portfolio items. Focus the meeting on feedback from the development teams, and on moving portfolio backlog items to a ‘ready’ state. Do not let it become a status or strategy planning meeting.
- Create conditions that encourage a strong relationship between product managers, engineering leaders and architects. Together they bring multiple important perspectives to creating the portfolio backlog items. Consider also adding user experience design leaders to this mix, depending on the nature of your products.
Finally, this is a process of continuous experimentation and improvement. While some things can ultimately be moved to the obvious domain of best practices, or the complicated domain of good practices, we still operate within an ever-changing and complex environment that requires continuous awareness, experimentation, learning and adaptation. We continue to experiment and make improvements.
This abstract is from a paper I co-wrote with Kieran Conboy for the 37th International Conference on Software Engineering (ICSE 2015) in Firenze, Italy. The final paper is available in the conference proceedings. A pre-print version is available here.
Contemporary lean thinking, especially in knowledge work areas like software engineering, begins with understanding flow. Architecture plays a vital role in enabling the flow of value in software engineering teams and organizations. To date there has been little research in understanding impediments to flow in software engineering organizations. A focus on enabling flow through removing impediments is a useful perspective in creating a more agile, lean thinking software engineering organization, particularly when supported by appropriate metrics. This paper presents a case study of how architecture-related impediments impact the flow of work in software engineering teams and organizations. The key contributions of this paper are centered on the concept of flow and impediments in modern software engineering, and its relationship with architecture. We develop an understanding of how a focus on flow and removing impediments, supported by appropriate metrics, is helpful in identifying architecture-related challenges. Drawing on research of one company’s practices, the paper presents an example of a scenario where flow analysis using specific metrics reveals architecture-related impediments and shows how addressing these impediments improves effectiveness and productivity in ways that would not otherwise have been revealed.
This is the abstract and conclusions from a paper I presented at Agile 2014. The full text is here.
When adopting agile and lean approaches in our company, one goal for teams and organizations is to achieve a smooth end-to-end flow of work through the system. This paper presents a useful set of metrics that reveal how work is flowing. It describes four metrics we find useful: Cumulative Flow, Throughput Analysis combined with Demand Analysis, Cycle Time and Lead Time.
These metrics help you understand Flow in your teams and organizations. In particular:
- Cumulative Flow Diagrams (CFDs) give deeper insight into what’s happening in queues or workflow states, and help diagnose problems.
- Throughput Analysis shows how work is flowing through our system over time. It is even more useful when combined with a Demand Analysis that shows the proportion of work flowing through the system that is Value Demand versus Failure Demand.
- Cycle Time analysis shows how long it takes for work items to pass through one or a subset of workflow states. This enables teams to make predictions about how long it takes to process planned work items.
- Lead Time analysis shows how long it takes for work items to pass through the entire organization. This enables the organization to make predictions about how long it will take to process requests. We generally use Lead Time to understand the time it takes work to pass through all states, from the moment there is a request or idea, to the moment the work is complete and in the hands of customers.
- All these metrics can be used to indicate the presence of impediments to Flow in your system. The combination of these metrics offers good insight into what’s happening in an organization. They provide insight and visibility on status, and inform forecasting around when specific content might be delivered.
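The metrics above can be computed from very little data. The following is a minimal sketch, not the tooling described in the paper: the `WorkItem` fields, the function names, and the sample dates are all hypothetical, and Cycle Time is assumed here to cover the span from "started" to "delivered" (one possible subset of workflow states).

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional, Tuple


@dataclass
class WorkItem:
    # All field names are hypothetical, chosen for this illustration.
    requested: date            # when the request or idea was first recorded
    started: date              # when a team began work on the item
    delivered: Optional[date]  # when it reached customers; None if unfinished
    demand: str                # "value" or "failure"


def lead_time_days(item: WorkItem) -> int:
    """Days from request to delivery (all workflow states)."""
    return (item.delivered - item.requested).days


def cycle_time_days(item: WorkItem) -> int:
    """Days from start of work to delivery (assumed subset of states)."""
    return (item.delivered - item.started).days


def weekly_throughput(items: List[WorkItem]) -> Dict[Tuple[int, int], int]:
    """Count delivered items per ISO (year, week)."""
    counts: Counter = Counter()
    for item in items:
        if item.delivered is not None:
            iso = item.delivered.isocalendar()
            counts[(iso[0], iso[1])] += 1
    return dict(counts)


def failure_demand_share(items: List[WorkItem]) -> float:
    """Proportion of delivered items that were Failure Demand."""
    done = [i for i in items if i.delivered is not None]
    if not done:
        return 0.0
    return sum(1 for i in done if i.demand == "failure") / len(done)


# Hypothetical sample data.
items = [
    WorkItem(date(2014, 3, 3), date(2014, 3, 10), date(2014, 3, 14), "value"),
    WorkItem(date(2014, 3, 4), date(2014, 3, 11), date(2014, 3, 21), "failure"),
    WorkItem(date(2014, 3, 5), date(2014, 3, 17), date(2014, 3, 20), "value"),
    WorkItem(date(2014, 3, 6), date(2014, 3, 18), None, "value"),
]

delivered = [i for i in items if i.delivered is not None]
print("lead times:", [lead_time_days(i) for i in delivered])    # [11, 17, 15]
print("cycle times:", [cycle_time_days(i) for i in delivered])  # [4, 10, 3]
print("throughput:", weekly_throughput(items))
print("failure demand share:", failure_demand_share(items))
```

In this sample the gap between Lead Time and Cycle Time is the time items spent waiting before work started, which is the kind of queueing a Cumulative Flow Diagram would also make visible.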
This is the abstract and conclusions from a short paper I wrote and presented at the 15th International Conference on Agile Software Development (XP 2014) in Rome, Italy. A preprint version of the full paper is available here.
Teams and organizations are complex adaptive systems. Self-organization in complex adaptive systems evolves through a set of Simple Rules. Self-organization is a core tenet of agile teams. Self-organization does not mean everyone gets to do whatever they want to do. Team members create contracts with each other. These contracts create boundaries, or containers, within which self-organization can occur. Teams also create contracts with other teams, the wider organization and other stakeholders. The contracts are both implicit and explicit. Social contracts in complex adaptive systems are more effective if they are based on Simple Rules. Social Contract Theory acts as a lens through which we can better understand these social contracts in agile teams. This paper represents ongoing research that examines the role of Simple Rules and Social Contract Theory in fostering self-organization in agile development teams. The paper discusses four examples of social contracts in agile teams: definition of done, definition of ready, working agreements, and retrospectives.
This paper described the connection between Social Contract Theory and agile teams, viewing agile teams as complex adaptive systems. The field of Human Systems Dynamics provides a suitable lens through which to view teams and organizations as complex adaptive social systems, and defines necessary conditions for self-organization using Containers, Differences and Exchanges. The social contracts in agile teams and organizations are based on the Simple Rules that govern emergence and self-organization.
Simple Rules support coherent behaviors in a system. Definition of done, definition of ready, and working agreements are all examples of social contracts, created using Simple Rules, in agile teams and organizations. In addition, there are examples of social contracts to be found in retrospectives, including the prime directive, second directive and ground rules. These Simple Rules and Social Contracts support emergent behaviors and self-organization.
Teams own their own Simple Rules. As teams adapt their Simple Rules, new patterns are formed in the system. These patterns are governed by the social contracts created by the Simple Rules. Violating the Simple Rules creates a tension in the system that can be resolved by the team enforcing the rules or altering the rules (an Exchange intervention), or by the team membership changing (a Container intervention).
Social Contracts exist within agile teams, between agile teams, between agile teams and management, and within management teams.