Wednesday, November 5, 2008

Knowledge needs to be collected from natural owners within the enterprise

This item starts with a high-level summary with a few pictures, followed by a detailed discussion.

High-level summary
Asking everyone to look at a single model doesn’t work:

Each person has a different model (data and a way of looking at it) that they are mainly focused on:

Asking people to deal with models containing data they don’t understand (or organised in a way they don't understand) won't work well.
Asking people to deal only with data they do understand will work:

This allows us all to have the models we want composed of data that others can interact with (i.e. examine, keep current etc.)

Detailed discussion

The problem
Most people accept that a critical need of large, complex enterprises is to effectively manage their knowledge about how they operate and seek to operate. Without this knowledge they can't make informed decisions about how to make transformations. Models of the enterprise contain knowledge and support decision making. However, modelling has almost universally failed to deliver on its promises and to be effectively adopted as a way of managing the knowledge needed to support transformations in large enterprises.

This article draws on a decade of experience working with a wide range of clients undertaking transformations in many geographies and sectors. It looks at the question of why modelling by itself fails to provide an effective solution for enterprise decision making. Future articles will explore the characteristics of solutions and strategies that work.

Independent, separated data for modelling
To support business transformation, enterprises need to develop decision support solutions of which models are an essential component. For this approach to be effective, it is important to be able to separate the enterprise data from the models which represent the data. There are two reasons for this: the nature of models and the nature of organisations. We'll look at models first.

What do we mean by models?
Models are particularly useful for answering questions. So modelling is often used in design and planning when many alternatives are being considered. Models can also often be used to create visual representations, which have the power to provide an immediate perspective on a lot of data. The way something is modelled depends almost entirely on the purpose of the model: that is, on what you expect to learn from the model, and to some extent how you want to visualize things.

Models consist of data-elements related in fairly complex ways: objects, properties of objects, relationships between objects, calculated values, and so on. As they are always designed to achieve a purpose, and no one model can achieve all purposes, data-elements will usually appear in many models, each with a different purpose. (At this point we could discuss metadata and how it is maintained. That, however, along with semantics, ontology, taxonomies, patterns and so on, is best avoided in an introductory article.)

More precisely, I use the term model to mean semantically explicit representations of some perceived reality. For example, models may be presented visually as diagrams (but clearly most diagrams are not models). Or they may be presented numerically or textually: for example, a cashflow spreadsheet and a project plan are both models.

Personally I remain skeptical of most things purporting to be models that are presented in text documents, drawings, or presentation tools such as Word, Visio, and PowerPoint. Why? Because it is too easy to make things look the way you want them to look, and this is harder with real models. Of course talented people, as well as lazy people, can create models that don’t represent reality. But fortunately these are usually fairly easily tested.

The inherent multiplicity of models. An analogy from architecture
The way things are modelled depends on the questions to be asked of the model. If we think about something we are all familiar with, such as a building or a city, we can see that the way we model it will vary depending on the question being asked. Here are just a few examples of the questions asked of buildings:
- usability: What will the user experience be like? For example: What will it look like? (data on: light sources, surface textures, transparency and translucence) How will it behave acoustically? (data on: materials' acoustic properties, the location and disposition of building elements, noise sources)
- availability: How resilient will it be? How will it behave in distress? (data on: the nature of the materials and how they respond to stress, how building elements are connected, etc.) How will it behave in an earthquake? (data on: structural properties, connections, etc.) How will it behave in a fire? (data on: how materials respond to heat, how they combust, etc.)
- cost to create: How much will it cost to establish? (data on: materials, products, volumes and surface areas, product and material costs, labour rates, etc.)
- cost to own: How much will it cost to maintain and operate? (data on: thermal properties, service costs and lifecycles, energy utilization, ongoing costs for services, etc.)
- capacity: How will it perform with respect to throughput? (data on: elevators' performance and their patterns of use, vehicular and ambulatory egress patterns and loads)
- optimization: Where should things be for optimal performance, e.g. space planning? (data on: costs of people and things, degrees of interaction, etc.)
- value: How much value will it generate? (data on: rental properties, e.g. useable floor space, rental rates, etc.)
- greenness: What will its carbon footprint be and how can this be reduced?
- connectivity: How well will WiFi operate in the building and how can this be improved?

So, what does this tell us about modelling things?
To answer the questions we may wish to ask about a building, many different types of models may be required. The questions reflect different criteria or constraints that need to be considered in design and planning. The way things are presented and answered reflects the level of interest of different stakeholders: property developer, property owner, property occupier, real estate agent, various construction trades, and so on. Each stakeholder is the natural owner of some data. The multiplicity of models needed has important consequences for the data on which they are based.

Some data-elements are common to many models. For a building, these might be the number of floors, the location, the major walls, etc. Other data-elements are unique to particular models, for example a range of material characteristics, how things are connected, market characteristics, physical characteristics of the location, and so on. And inevitably, we will find that data-elements originating with one particular model need to be reused in another; and probably augmented or renormalised to enable the models to be properly correlated.
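As a rough sketch of this point, the Python below keeps a handful of building data-elements in one place and lets two separate models read from it. Everything here is hypothetical and for illustration only: the element names, the figures, and the two model functions are assumptions, not taken from any real project or tool.

```python
# Hypothetical shared data-elements for a building; all values are illustrative.
building_data = {
    "floors": 10,                  # common to many models
    "floor_area_m2": 500.0,        # usable area per floor, common to many models
    "build_cost_per_m2": 2000.0,   # needed only by the cost-to-create model
    "annual_rent_per_m2": 300.0,   # needed only by the value model
}


def cost_to_create(data):
    """Cost model: total usable area multiplied by build cost per square metre."""
    return data["floors"] * data["floor_area_m2"] * data["build_cost_per_m2"]


def annual_rental_value(data):
    """Value model: total usable area multiplied by annual rent per square metre."""
    return data["floors"] * data["floor_area_m2"] * data["annual_rent_per_m2"]


def elements_used(model, data):
    """Report which data-elements a model actually reads."""

    class Recorder(dict):
        """A dict that remembers every key looked up with square brackets."""

        def __init__(self, *args):
            super().__init__(*args)
            self.seen = set()

        def __getitem__(self, key):
            self.seen.add(key)
            return super().__getitem__(key)

    recorder = Recorder(data)
    model(recorder)
    return recorder.seen


# Data-elements read by both models, i.e. the ones the models have in common.
shared = elements_used(cost_to_create, building_data) & elements_used(
    annual_rental_value, building_data
)
```

Here `shared` holds the elements both models depend on (the number of floors and the floor area), while each model also needs an element the other never touches. The data sits outside either model, which is what allows a third model to reuse it later.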

The success of our suite of models is going to be critically dependent on collecting and organising the data effectively. The set of models may well represent the sum of our knowledge of the building, but they are not necessarily a good way to manage that knowledge.

The data will come from many different stakeholders, who are often the natural owners of the data. As we discuss later, it's important that the various stakeholders can contribute without needing to understand each other's models and domains.

What kind of data will we need to model an enterprise?
The building analogy illustrates that it is unlikely that we can ever define a priori a canonical set of data-elements which will suffice for all our models. The nature of the questions determines the data required. Notice how carbon footprint and WiFi performance feature in the list presented earlier: despite the fact we have been building for millennia, new questions (and therefore new data requirements) continue to emerge from changes in our environment and behaviour. As with buildings, we need to understand enterprises from many perspectives:
- how our users (customers, partners, employees) will perceive and react to us
- how resilient our process and systems will be
- how much changes will cost to make
- how much things will cost to maintain and operate
- how things will perform, where things should be (data, systems, people) for optimal performance
- how much value will be generated (by products, services, systems).
And, as with buildings, the success of our management of the knowledge of the enterprise is going to depend on effectively collecting and maintaining data contributed by many different stakeholders.

Fragmentation of interests in complex organisations
In an enterprise many people will be interested in things from many different perspectives: economic, operations, organisation, product, market, contracts, and so on. They will be interested in these at different times: now, next year, and in the future. Most of them, while having a broad interest in many aspects of the enterprise, will only have a detailed interest in, and definitive knowledge of, the specific subset of the enterprise they are focused on.

So while many people may be interested in the services an enterprise provides, some will be interested in these purely at a business level and others in the technical aspects; some will be interested in the costs associated with a service, some in how the service is delivered (procedures, policies, information), some more in who or what performs the service, and some in what agreements are associated with the services. And so while many people may be interested in a particular service, few will have definitive knowledge of all its aspects. This is clearly less true in very small enterprises, those that have a very slow rate of change, or those that are intrinsically very simple. In what follows we are really talking about large, complex enterprises with an ongoing need for change.

Fragmentation of models
Often a specific set of data-elements is first considered, and captured in a structured and semantically explicit way, when something is being conceived, planned or changed and an appropriate model is needed. Later these data-elements, whose initial purpose was design and planning, are reused in different ways by other models (reporting, etc.). Eventually when things are operationalised the data-elements are updated by different groups with different interests.

In this situation the form of the data-elements tends to be tightly coupled with the specific purpose and implementation of the original model. This militates against most other people interacting with the data-elements. It is not that most people are unable to understand the model if it is explained to them; it is that they don’t want to understand a model created for a purpose different from theirs. That is, they don’t want to invest time and energy learning about data and relationships that are, from their perspective, extraneous.

So for most people, only interested in a small subset of the data contained in a complex model, the model will be too complex for them to grasp, or contain too much extraneous data. The model will seem like a complex monolithic assemblage of data not necessarily well suited to harvesting the data for reuse. It is certainly not something most people will feel comfortable updating. This is especially the case if the people are not modellers by nature or not familiar with the modelling tool or technology that was used.

The more generic the modelling tool, the worse this problem is. That is, the more the alien tool becomes an obstruction to shared understanding. Ironically, this obstruction is commensurate with the potential ability of such tools to manage knowledge.

Types of tool
Generic modelling tools (such as meta-modellers, spreadsheets, etc.) allow users to define their own semantics (though, obviously, meta-metamodels constrain these types of tools). These tools are useful because of the broad set of semantics they can deal with and the way they allow business people to model in a language that makes sense to them. This means that if an insurance company wants to model a policy or a claim, or an entertainment organization wants to model a theatre or a theme park, as objects (with specific properties, relationships, calculated values, and so on) then they can.

By comparison, dedicated modelling tools usually deal with a very narrow set of semantics. A planning tool, for example, might essentially deal with just tasks and resources; a data modelling tool with just half a dozen object types; and so on.

The tool impedance problem – an example
A spreadsheet is a modelling tool that most of us are familiar with. Spreadsheets can be used to create complex models. They are also effectively, in a business sense, meta-modellers. The objects that can be represented are constrained (by validation, formatting) and the relationships that can be defined are limited (to formulas, lookups, etc.), but a set of business semantics can be defined.

Many of us are quite capable of using spreadsheets and creating such complex models. Yet when someone presents us with their complex spreadsheet containing lots of data and many relationships representing business semantics, and asks us to review it and perhaps update some data, most of the time our reaction is “I don’t really want to spend the time understanding your model, just tell me what you want to know”. This is a reasonable reaction.

Which brings us back to data independence
The unavoidable conclusion here is that there must be a neutral way, independent of any modelling paradigm, for updating the data-elements represented in a model. A way that suits the interest and level of understanding of each person or role expected to be involved. A way, moreover, that ensures the data-sets are available for use in models of every kind: in other words, that makes the data fully reusable. Only then will we have a virtuous cycle of data use and update that will result in knowledge being accreted as a natural by-product of day-to-day behaviour.
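To make the idea of model-independent data a little more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the `DataStore` class, the role names, the element names and the figures are all hypothetical, and a real solution would need far more (history, validation, access control and so on). The point it shows is that each natural owner updates only the elements they understand, while any model can read whatever it needs.

```python
class DataStore:
    """Hypothetical model-independent store of data-elements with natural owners."""

    def __init__(self):
        self._values = {}
        self._owners = {}

    def register(self, name, owner, value):
        """Record a data-element together with the role that owns it."""
        self._owners[name] = owner
        self._values[name] = value

    def update(self, name, role, value):
        """Let only the natural owner change an element; no model knowledge needed."""
        if self._owners.get(name) != role:
            raise PermissionError(f"{role} does not own {name}")
        self._values[name] = value

    def get(self, name):
        """Read a data-element; any model may do this."""
        return self._values[name]

    def owned_by(self, role):
        """The small view of the data that one role is asked to keep current."""
        return {n: self._values[n] for n, o in self._owners.items() if o == role}


# Two roles each contribute the data they are the natural owner of.
store = DataStore()
store.register("service_hours_per_month", "operations", 160)
store.register("hourly_rate", "finance", 50.0)


def monthly_service_cost(s):
    """One of potentially many models; it reads the store but does not own the data."""
    return s.get("service_hours_per_month") * s.get("hourly_rate")


cost_before = monthly_service_cost(store)
store.update("hourly_rate", "finance", 55.0)  # finance updates only what it owns
cost_after = monthly_service_cost(store)
```

Finance never has to look at the cost model to keep the hourly rate current, and the model picks up the new value the next time it runs. Other models could read the same two elements for quite different purposes.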

We have described the reasons that enterprise modelling has so far delivered less than promised. The problem so far has been that the datasets represented in models have not been visible in a form independent of models. The data have not been reusable by all models, all tools, and all people with an interest in them. The result has been incomplete and ineffective management of the knowledge, making models too expensive to populate and keep current.

Solutions to this problem will be the subject of the next article.