---
## Part I: Introduction
### Highlight (yellow) - Chapter 1: Why do a risk analysis? > Page 3 · Location 354
1.1 Moving on from "What If" Scenarios
Single-point or deterministic modelling involves using a single "best-guess" estimate of each variable within a model to determine the model's outcome(s). Sensitivities are then performed on the model to determine how much that outcome might in reality vary from the model outcome. This is achieved by selecting various combinations for each input variable. These various combinations of possible values around the "best guess" are commonly known as "what if" scenarios. The model is often also "stressed" by putting in values that represent worst-case scenarios. Consider a simple problem that is just the sum of five cost items. We can use the three points, minimum, best guess and maximum, as values to use in a "what if" analysis. Since there are five cost items and three values per item, there are 3^5 = 243 possible "what if" combinations we could produce. Clearly, this is too large a set of scenarios to have any practical use. This process suffers from two other important drawbacks: only three values are being used for each variable, where they could, in fact, take any number of values; and no recognition is being given to the fact that the best-guess value is much more likely to occur than the minimum and maximum values. We can stress the model by adding up the minimum costs to find the best-case scenario, and add up the maximum costs to get the worst-case scenario, but in doing so the range is usually unrealistically large and offers no real insight. The exception is when the worst-case scenario is still acceptable. Quantitative risk analysis (QRA) using Monte Carlo simulation (the dominant modelling technique in this book) is similar to "what if" scenarios in that it generates a number of possible scenarios. However, it goes one step further by effectively accounting for every possible value that each variable could take and weighting each possible scenario by the probability of its occurrence. QRA achieves this by modelling each variable within a model by a probability distribution. The structure of a QRA model is usually (there are some important exceptions) very similar to a deterministic model, with all the multiplications, additions, etc., that link the variables together, except that each variable is represented by a probability distribution function instead of a single value. The objective of a QRA is to calculate the combined impact of the uncertainty in the model's parameters in order to determine an uncertainty distribution of the possible model outcomes.
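A minimal sketch in Python of the contrast just described, using made-up cost figures (the five minimum/best-guess/maximum triples below are illustrative, not from the book): the "what if" enumeration produces 3^5 = 243 equally weighted scenarios, while the Monte Carlo run weights each outcome by how likely it is.

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical five cost items: (minimum, best guess, maximum) in $000s
cost_items = [(8, 10, 15), (18, 20, 30), (4, 5, 9), (40, 50, 70), (9, 12, 20)]

# "What if" approach: enumerate every min/best/max combination (3^5 = 243 scenarios)
what_if_totals = [sum(combo) for combo in itertools.product(*cost_items)]
print(len(what_if_totals), "what-if scenarios, range",
      min(what_if_totals), "to", max(what_if_totals))

# QRA approach: model each item as a (triangular) distribution and simulate
samples = sum(rng.triangular(lo, mode, hi, size=10_000)
              for lo, mode, hi in cost_items)
print("Monte Carlo 5th-95th percentile:", np.percentile(samples, [5, 95]).round(1))
```

The simulated total is much narrower than the what-if range because extreme combinations of all five costs are sampled roughly as rarely as they would actually occur.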
### Highlight (yellow) - Chapter 1: Why do a risk analysis? > Page 5 · Location 375
1.2 The Risk Analysis Process

Figure 1.2 shows a typical flow of activities in a risk analysis, leading from problem formulation to decision. This section and those that follow provide more detail on each activity.

Figure 1.2 The risk analysis process.
### Highlight (yellow) - Chapter 1: Why do a risk analysis? > Page 10 · Location 507
Management Options

The manager evaluating the possible options for dealing with a defined risk issue needs to consider many things:

- Is the risk assessment of sufficient quality to be relied upon?
- How sensitive is the ranking of each option to model uncertainties?
- What are the benefits relative to the costs associated with each risk management option?
- Are there any secondary risks associated with a chosen risk management option?
- How practical will it be to execute the risk management option?

Is the risk assessment of sufficient quality to be relied upon? (See Chapter 3.) How sensitive is the ranking of each option to model uncertainties? On this last point, we almost always would like to have better data, or greater certainty about the form of the problem: we would like the distribution of what will happen in the future to be as narrow as possible. However, a decision-maker cannot wait indefinitely for better data and, from a decision-analytic point of view, may quickly reach the point where the best option has been determined and no further data (or perhaps only a very dramatic change in knowledge of the problem) will make another option preferable. This concept is known as decision sensitivity. For example, in Figure 1.3 the decision-maker considers any output below a threshold T (shown with a dashed line) to be perfectly acceptable (perhaps this is a regulatory threshold or a budget). The decision-maker would consider option A to be completely unacceptable and option C to be perfectly fine, and would only need more information about option B to be sure whether it was acceptable or not, in spite of all three having considerable uncertainty.

Figure 1.3 Different possible outputs compared with a threshold T.

1.5 Inefficiencies in Transferring Risks to Others

A common method of managing risks is to force or persuade another party to accept the risk on your behalf. For example, an oil company could require that a subcontractor welding a pipeline accept the costs to the oil company resulting from any delays they incur or any poor workmanship. The welding company will, in all likelihood, be far smaller than the oil company, so possible penalty payments would be catastrophic. The welding company will therefore value the risk as very high and will require a premium greatly in excess of the expected value of the risk. On the other hand, the oil company may be able to absorb the risk impact relatively easily, so would not value the risk as highly. The difference in the utility of these two companies is shown in Figures 1.4 to 1.7, which demonstrate that the oil company will pay an excessive amount to eliminate the risk.

Figure 1.4 The contractor's utility function is highly concave over the money gain/loss range in question. That means, for example, that the contractor would value a loss of 100 units of money (e.g. $100 000) as a vastly larger loss in absolute utility terms than a gain of $100 000 might be.

Figure 1.5 Over that same money gain/loss range, the oil company has an almost exactly linear utility function. The contractor, required to take on a risk with an expected value of −$60 000, would value this as −X utiles. To compensate, the contractor would have to charge an additional amount well in excess of $100 000. The oil company, on the other hand, would value −$60 000 in rough balance with +$60 000, so will be paying considerably in excess of its valuation of the risk to transfer it to the contractor.
Figure 1.6 Imagine the risk has a 10% probability of occurring, and its impact would be −$300 000, to give an expected value of −$30 000. If $300 000 is the total capital value of the contractor, it won't much matter to the contractor whether the risk impact is $300 000 or $3 000 000; they still go bust. This is shown by the shortened utility curve and the horizontal dashed line for the contractor.

Figure 1.7 In this situation, the contractor now values any risk with an impact that exceeds its capital value at a level that is lower than the oil company does (shown as "Discrepancy"). It may mean that the contractor can offer a more competitive bid than another, larger contractor who would feel the full risk impact, but the oil company will not have covered the risk it had hoped to transfer, and so again will be paying more than it should to offload the risk.

Of course, one way to avoid this problem is to require evidence from the contractor that they have the necessary insurance or capital base to cover the risk they are being asked to absorb. A far more realistic approach to sharing risks is through a partnership arrangement. A list of risks that may impact on the various parties involved in the project is drawn up, and for each risk one then asks:

- How big is the risk?
- What are the risk drivers?
- Who is in control of the risk drivers?
- Who has the experience to control them?
- Who could absorb the risk impacts?
- How can we work together to manage the risks?
- What arrangement would efficiently allocate the risk impacts and rewards for good risk management?
- Can we insure, etc., to share risks with outsiders?

The more one can allocate ownership of risks, and opportunities, to those who control them the better, up to the point where the owner could not reasonably bear the risk impact where others can. Answering the questions above will help you construct a contractual arrangement that is risk efficient, workable and tolerable to all parties.
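The gap the figures describe can be made concrete with a toy expected-utility calculation. The exponential utility form and the risk-tolerance values below are illustrative assumptions; only the 10% / −$300 000 gamble comes from the text above.

```python
import numpy as np

# Exponential utility U(x) = 1 - exp(-x / R), where R is a risk tolerance in $.
# A small R gives the contractor's highly concave curve; a huge R is near-linear.
def certainty_equivalent(outcomes, probs, risk_tolerance):
    """Sure amount whose utility equals the gamble's expected utility."""
    expected_utility = np.sum(probs * (1 - np.exp(-outcomes / risk_tolerance)))
    return -risk_tolerance * np.log(1 - expected_utility)

# The risk from Figure 1.6: 10% chance of a -$300 000 impact, 90% chance of nothing
outcomes = np.array([-300_000.0, 0.0])
probs = np.array([0.1, 0.9])

print("Expected value:", np.sum(probs * outcomes))                                   # -30 000
print("Contractor CE (R = $150k):", round(certainty_equivalent(outcomes, probs, 150_000)))
print("Oil company CE (R = $50m):", round(certainty_equivalent(outcomes, probs, 50_000_000)))
```

With these assumed parameters the contractor's certainty equivalent comes out near −$74 000 against an expected value of −$30 000, while the oil company values the risk at almost exactly its expected value; that spread is the inefficiency the figures illustrate.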
### Highlight (yellow) - Chapter 1: Why do a risk analysis? > Page 16 · Location 652
Ranking risks

P–I scores can be used to rank the identified risks. A scaling factor, or weighting, is assigned to each phrase used to describe each type of impact. Table 1.5 provides an example of the type of scaling factors that could be associated with each phrase/impact type combination.

Table 1.5 An example of the scores that could be associated with descriptive risk categories to produce a severity score.

| Category  | Score |
|-----------|-------|
| Very high | 5     |
| High      | 4     |
| Medium    | 3     |
| Low       | 2     |
| Very low  | 1     |

In this type of scoring system, the higher the score, the greater is the risk. A base measure of risk is probability × impact. The categorising system in Table 1.1 is on a log scale, so, to make Table 1.5 consistent, we can define the severity of a risk with a single type of impact as the sum of its probability and impact scores, which leaves the severity on a log scale too. If a risk has k possible types of impact (quality, delay, cost, reputation, environmental, etc.), perhaps with different probabilities for each impact type, we can still combine them into one score (a plausible form is sketched below). The severity scores are then used to determine the most important risks, enabling the management to focus resources on reducing or eliminating risks from the project in a rational and efficient manner. A drawback to this approach of ranking risks is that the process is quite dependent on the granularity of the scaling factors that are assigned to each phrase describing the risk impacts. If we have better information on probability or impact than the scoring system would allow, we can assign a more accurate (non-integer) score.
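The severity formulas themselves are not reproduced above. A plausible reconstruction, assuming only the log-scale argument in the text (scores add where the underlying quantities multiply), is sketched below; the combined k-impact form is one self-consistent way to do it, not necessarily the book's exact formula.

```latex
% Single impact type: on a log scale, probability x impact becomes a sum of scores.
\[
\text{Severity} = P + I
\]
% One self-consistent way to combine k impact types (assumption): take each
% P_i + I_i back off the log scale, add the contributions, and re-take the log.
\[
\text{Severity} = \log_{10}\!\left( \sum_{i=1}^{k} 10^{\,P_i + I_i} \right)
\]
```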
### Highlight (yellow) - Chapter 3: The quality of a risk analysis > Page 29 · Location 937
3.1 The Reasons Why a Risk Analysis can be Terrible
From Figure 3.1 I think you'll see that there really needs to be more communication between decision-makers and their risk analysts and a greater attempt to work as a team. I see the risk analyst as an important avenue of communication between those "on the ground" who understand the problem at hand and hold the data and those who make decisions. The risk analyst needs to understand the context of the decision question and have the flexibility to be able to find the method of analysis that gives the most useful information. I've heard too many risk analysts complain that they get told to produce a quantitative model by the boss, but have to make the numbers up because the data aren't there. Now doesn't that seem silly? I'm sure the decision-maker would be none too happy to know the numbers are all made up, but the risk analyst is often not given access to the decision-makers to let them know. On the other hand, in some business and regulatory environments they are trying to follow a rule that says a quantitative risk analysis needs to be completed; the box needs ticking. Regulations and guidelines can be a real impediment to creative thinking. I've been in plenty of committees gathered to write risk analysis guidelines, and I've done my best to reverse the tendency to be formulaic. My argument is that in 19 years we have never done the same risk analysis twice: every one has its individual peculiarities. Yet the tendency seems to be the reverse: I trained over a hundred consultants in one of the big four management consultancy firms in business risk modelling techniques, and they decided that, to ensure that they could maintain consistency, they would keep it simple and essentially fill in a template of three-point estimates with some correlation. I can see their point: if every risk analyst developed a fancy and highly individual model it would be impossible to ensure any quality standard. The problem is, of course, that the standard they will maintain is very low. Risk analysis should not be a packaged commodity but a voyage of reasoned thinking leading to the best possible decision at the time.

I think it is usually pretty easy to see early on in the risk analysis process that a quantitative risk analysis will be of little value. There are several key areas where it can fall down:

1. It can't answer all the key questions.
2. There are going to be a lot of assumptions.
3. There is going to be one or more show-stopping assumption.
4. There aren't enough good data or experts.

We can get around 1 sometimes by doing different risk analyses for different questions, but that can be problematic when each risk analysis has a different set of fundamental assumptions: how do we compare their results? For 2 we need to have some way of expressing whether a lot of little assumptions compound to make a very vulnerable analysis: if you have 20 assumptions (and 20 is quite a small number), all pretty good ones (say we think there's a 90% chance each is correct), but the analysis is only useful if all the assumptions are correct, then we only have a 0.9^20 ≈ 12% chance that the assumption set is correct. Of course, if this were the real problem we wouldn't bother writing models. In reality, in the business world particularly, we deal with assumptions that are good enough because the answers we get are close enough.
In some more scientific areas, like human health, we have to deal with assumptions such as: compound X is present; compound X is toxic; people are exposed to compound X; the exposure is sufficient to cause harm; and treatment is ineffective. The sequence then produces the theoretical human harm we might want to protect against, but if any one of those assumptions is wrong there is no human health threat to worry about. If 3 occurs we have a pretty good indication that we don't know enough to produce a decent risk analysis model, but maybe we can produce two or three crude models under different possible assumptions and see whether we come to the same conclusion anyway. Area 4 is the least predictable because the risk analyst doing a preliminary scoping can be reassured that the relevant data are available, but then finds out they are not available either because the data turn out to be clearly wrong (we see this a lot), the data aren't what was thought, there is a delay past the deadline in the data becoming available or the data are dirty and need so much rework that it becomes impractical to analyse them within the decision timeframe. There is a lot of emphasis placed on transparency in a risk analysis, which usually manifests itself in a large report describing the model, all the data and sources, the assumptions, etc., and then finishes with some of the graphical and numerical outputs described in Chapter 5. I've seen reports of 100 or 200 pages that seem far from transparent to me: who really has the time or inclination to read such a document? The executive summary tends to focus on the decision question and numerical results, and places little emphasis on the robustness of the study.
### Highlight (yellow) - Chapter 3: The quality of a risk analysis > Page 35 · Location 1060
3.4 The Biggest Uncertainty in a Risk Analysis

The techniques discussed above have focused on the vulnerability of the results of a risk analysis to the parameters of a model. When we are asked to review or audit a risk analysis, the client is often surprised that our first step is not to look at the model mathematics and supporting statistical analyses, but to consider what the decision questions are, whether there were a number of assumptions, whether it would be possible to do the analysis a different (usually simpler, but sometimes more complex and precise) way and whether this other way would give the same answers, and to see if there are any means for comparing predictions against reality. What we are trying to do is see whether the structure and scope of the analysis are correct. The biggest uncertainty in a risk analysis is whether we started off analysing the right thing and in the right way. Finding the answer is very often not amenable to any numerical technique because we will not have any alternative to compare against. If we do, it might nonetheless take a great deal of effort to put together an alternative risk analysis model, and a model audit is usually too late in the process to be able to start again. A much better idea, in my view, is to get a sense at the beginning of a risk analysis of how confident we should be that the analysis will be scoped sufficiently broadly, or how confident we are that the world is adequately represented by our model. Needless to say, we can also start rather confident that our approach will be quite adequate and then, once having delved into the details of the problem, find out we were quite mistaken, so it is important to keep revisiting our view of the appropriateness of the model. We encourage clients, particularly in the scientific areas of risk in which we work, to instigate a solid brainstorming session of experts and decision-makers whenever it has been decided that a risk analysis is to be undertaken, or maybe is just under consideration. The focus is to discuss the form and scope of the potential risk analysis. The experts first of all need to think about the decision questions, discuss with decision-makers any possible alternatives or supplements to those questions and then consider how they can be answered and what the outputs should look like (e.g. only the mean is required, or some high percentile). Each approach will have a set of assumptions that need to be thought through carefully: What would the effect be if the assumptions are wrong? If we use a conservative assumption and estimate a risk that is too high, are we back to where we started? We need to think about data requirements too: Is the quality likely to be good and are the data easily attainable?
### Highlight (yellow) - Chapter 4: Choice of model structure > Page 38 · Location 1162
4.1.2 Influence diagrams

Influence diagrams are quite popular. They essentially replicate the mathematics you can build in a spreadsheet, but the modelling environment is quite different (Figure 4.1 is a simple example). Analytica® is the most popular influence diagram tool. Variables (called nodes) are represented as graphical objects (circles, squares, etc.) and are connected together with arrows (called arcs) which show the direction of interaction between these variables. The visual result is a network that shows the viewer which variables affect which, but you can imagine that such a diagram quickly becomes overly complex, so one builds submodels. Click on a model object and it opens another view to show a lower level of interaction. Personally, I don't like them much because the mathematics and data behind the model are hard to get to, but others love them. They are certainly very visual.

Figure 4.1 Example of a simple influence diagram.

4.1.3 Event trees

Event trees offer a way to describe a sequence of probabilistic events, together with their probabilities and impacts. They are perhaps the most useful of all the methods for depicting a probabilistic sequence, because they are very intuitive, the mathematics to combine the probabilities is simple and the diagram helps ensure the necessary discipline. Event trees are built out of nodes (boxes) and arcs (arrows) (Figure 4.2).

Figure 4.2 Example of a simple event tree.

The tree starts from the left with a node (in the diagram below, "Select animal", to denote the random selection of an animal from some population), and arrows to the right indicate possible outcomes (here, whether the animal is infected with some particular disease agent, or not) and their probabilities (p, which would be the prevalence of infected animals in the population, and (1 − p) respectively). Branching out from these boxes are arrows to the next probability event (the testing of an animal for the disease), and attached to these arrows are the conditional probabilities of the next level of event occurring. The conditional nature of the probabilities in an event tree is extremely important to underline. In this example, the probabilities attached to the test branches are conditional on the infection status of the animal, e.g. P(test positive | infected) and P(test positive | not infected). Thus, following the rules of conditional probability algebra, we can say, for example, that P(infected and test positive) = p × P(test positive | infected). Event trees are very useful for building up your probability thinking, although they will get quite complex rather quickly. We use them a great deal to help understand and communicate a problem.
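A numerical sketch of that conditional algebra in Python; the prevalence, sensitivity and specificity values are assumptions for illustration (the text does not give numbers), and the last step shows the Bayesian reversal that an event tree makes easy.

```python
# Illustrative values (assumptions, not from the book's figure)
p = 0.02          # prevalence: P(animal infected)
se = 0.95         # sensitivity: P(test positive | infected)
sp = 0.98         # specificity: P(test negative | not infected)

# Conditional probability algebra along the branches of the event tree
p_infected_and_positive = p * se
p_clean_and_positive = (1 - p) * (1 - sp)
p_positive = p_infected_and_positive + p_clean_and_positive

# Reversing the tree with Bayes' theorem: P(infected | test positive)
p_infected_given_positive = p_infected_and_positive / p_positive
print(round(p_positive, 4), round(p_infected_given_positive, 4))
```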
### Highlight (yellow) - Chapter 4: Choice of model structure > Page 40 · Location 1190
4.1.4 Decision trees

Decision trees are like event trees but add possible decision options (Figure 4.3). They have a role in risk analysis, and in fields like petroleum exploration they are very popular. They sketch the possible decisions that one might make and the outcomes that might result. Decision tree software (which can also produce event trees) can calculate the best option to take under the assumption of some user-defined utility function. Again, personally I am not a big fan of decision trees in actual model writing. I find that it is difficult for decision-makers to be comfortable with defining a utility curve, so I don't have much use for the analytical component of decision tree software, but they are helpful for communicating the logic of a problem.

Figure 4.3 Example of a simple decision tree. The decision options are to make either of two investments or do nothing, with associated revenues as a result. More involved decision trees would include two or more sequential decisions depending on how well the investment went.

4.1.5 Fault trees

Fault trees start from the reverse approach to an event tree. An event tree looks forward from a starting point and considers the possible future outcomes. A fault tree starts with the outcome and looks at the ways it could have arisen. A fault tree is therefore constructed from the right with the outcome, moves to the left with the possible immediate events that could have made that outcome arise, continues backwards with the possible events that could have made the first set of events arise, etc. Fault trees are very useful for focusing attention on what might go wrong and why. They have been used in reliability engineering for a long time, but also have applications in areas like terrorism. For example, one might start with the risk of deliberate contamination of a city's drinking water supply and then consider routes that the terrorist could use (pipeline, treatment plant, reservoir, etc.) and the probabilities of being able to do that given the security in place.

4.1.6 Discrete event simulation

Discrete event simulation (DES) differs from Monte Carlo simulation mainly in that it models the evolution of a (usually stochastic) system over time. It does this by allowing the user to define equations for each element in the model for how it changes, moves and interacts with other elements. Then it steps the system through small time increments and keeps track of where all elements are at any time (e.g. parts in a manufacturing system, passengers in an airport or ships in a harbour).
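A minimal discrete event simulation sketch in Python, in the spirit of the description above rather than any model from the book: ships arrive at a single hypothetical berth, queue if it is busy, and unload for a random time; all the parameters are made up.

```python
import numpy as np

rng = np.random.default_rng(4)

def mean_wait(n_ships=10_000, mean_interarrival=5.0, mean_unload=4.0):
    # Arrival events: cumulative sum of exponential inter-arrival gaps
    arrivals = np.cumsum(rng.exponential(mean_interarrival, n_ships))
    unload_times = rng.exponential(mean_unload, n_ships)
    berth_free_at = 0.0
    total_wait = 0.0
    for arrive, unload in zip(arrivals, unload_times):
        start = max(arrive, berth_free_at)   # ship waits until the berth is free
        total_wait += start - arrive
        berth_free_at = start + unload       # event: unloading completes
    return total_wait / n_ships

print(round(mean_wait(), 2), "time units of queueing per ship, on average")
```

Even this toy version shows the essence of DES: the model tracks the state of each element through time, and queueing behaviour emerges from the interaction rules rather than being specified directly.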
### Highlight (yellow) - Chapter 4: Choice of model structure > Page 41 · Location 1212
More sophisticated tools can increase the clock steps when nothing is happening, then decrease them again to get a more accurate approximation to the continuous behaviour being modelled. We have used DES for a variety of clients, one of which was a shipping firm that regularly received LNG ships at its site on a narrow shared waterway. The client wanted to investigate the impact of constructing an alternative berthing system designed to reduce the impact of their activities on other shipping movements, and the model evaluated the benefits of such a system. Within the DES model, movements of the client's and any other relevant shipping traffic were simulated, taking into account restrictions of movements by certain rules and regulations and evaluating the costs of delays. The stand-alone model, as well as documentation and training, was provided to the client and helped them to persuade the other shipping operators and the Federal Energy Regulatory Commission (FERC) of the effectiveness of their plan. Figure 4.4 shows a screen shot of the model (it looks better in colour). Going from left to right, we can see that currently there is one ship in the upper harbour, four in the inner harbour, none at the city front and one in the outer harbour. In the client's berth, two ships are unloading with 1330 and 2430 units of materials still on board. In the upper right-hand corner the number of ships entering the shared waterway is visible, including the number of ships that are currently in a queue (three and two ships of a particular type). Finally, the lower right-hand corner shows the current waterway conditions, which dictate some of the rules such as "only ships of a certain draft can enter or exit the waterway given a particular current, tide, wind speed and visibility".

Figure 4.4 Example of a DES model.

DES allows us to model extremely complicated systems in a simple way by defining how the elements interact and then letting the model simulate what might happen. It is used a great deal to model, for example, manufacturing processes, the spread of epidemics, all sorts of complex queuing systems, traffic flows and crowd behaviour to design emergency exits. The beauty of a visual interface is that anyone who knows the system can check whether it behaves as expected, which makes it a great communication and validation tool.

4.2 Calculation Methods

Given a certain probability model that we wish to evaluate, there are several methods that we could use to produce the required answer, which I describe below.

4.2.1 Calculating moments

This method uses some probability laws that are discussed later in this book. In particular it uses the following rules:

1. The mean of the sum of two distributions is equal to the sum of their means, i.e. E(a + b) = E(a) + E(b) and E(a − b) = E(a) − E(b).
2. The mean of the product of two independent distributions is equal to the product of their means, i.e. E(ab) = E(a)E(b).
3. The variance of the sum of two independent distributions is equal to the sum of their variances, i.e. V(a + b) = V(a) + V(b) and V(a − b) = V(a) + V(b).
4. V(na) = n²V(a), where n is some constant.

The moments calculation method replaces each uncertain variable with its mean and variance and then uses the above rules to estimate the mean and variance of the model's outcome.
So, for example, suppose three variables a, b and c have the following means and variances:

| Variable | Mean | Variance |
|----------|------|----------|
| a        | 70   | 14       |
| b        | 16   | 2        |
| c        | 12   | 4        |

If the problem is to calculate 2a + b − c, the result can be estimated as follows:

E(2a + b − c) = 2(70) + 16 − 12 = 144
V(2a + b − c) = 2²(14) + 2 + 4 = 62

These two values are then used to construct a normal distribution of the outcome, Normal(144, σ), where σ = √62 ≈ 7.87 is the standard deviation of the distribution, which is the square root of the variance. This method is useful in certain situations, like the summation of a large number of potential risks and in the determination of aggregate distributions (Section 11.2). It does have some fairly severe limitations: it cannot easily cope with divisions, exponents, power functions, branching, etc. In short, this technique becomes very difficult to execute for all but the most simple models that also reasonably obey its set of assumptions.

4.2.2 Exact algebraic solutions

Each probability distribution has associated with it a probability distribution function that mathematically describes its shape. Algebraic methods have been developed for determining the probability distribution functions of some combinations of variables, so for simple models one may be able to find an equation directly that describes the output distribution. For example, it is quite simple to calculate the probability distribution function of the sum of two independent distributions (the following maths might not make sense until you've read Chapter 6). Let X be the first distribution with density f(x) and cumulative distribution function F_X(x), and let Y be the second distribution with density g(x). Then the cumulative distribution function of the sum of X and Y, F_{X+Y}, is given by
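A quick cross-check of the moments calculation by simulation; the choice of normal input distributions below is purely an assumption for the check, since the moments method itself uses only the means and variances.

```python
import numpy as np

rng = np.random.default_rng(7)

# Means and variances from the example above
mu = {"a": 70, "b": 16, "c": 12}
var = {"a": 14, "b": 2, "c": 4}

# Moments method: E(2a + b - c) and V(2a + b - c) for independent a, b, c
mean_out = 2 * mu["a"] + mu["b"] - mu["c"]        # 144
var_out = 2**2 * var["a"] + var["b"] + var["c"]   # 62
print(mean_out, var_out, round(float(np.sqrt(var_out)), 3))

# Cross-check by simulation, assuming (purely for illustration) normal inputs
n = 100_000
a = rng.normal(mu["a"], np.sqrt(var["a"]), n)
b = rng.normal(mu["b"], np.sqrt(var["b"]), n)
c = rng.normal(mu["c"], np.sqrt(var["c"]), n)
out = 2 * a + b - c
print(round(out.mean(), 2), round(out.var(), 2))
```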
### Highlight (yellow) - Chapter 4: Choice of model structure > Page 43 · Location 1278
F_{X+Y}(a) = ∫ F_X(a − y) g(y) dy, with the integral taken over all y    (4.1)

The sum of two independent distributions is sometimes known as the convolution of the distributions. By differentiating this equation, we obtain the density function of X + Y:

f_{X+Y}(a) = ∫ f(a − y) g(y) dy    (4.2)

So, for example, we can determine the distribution of the sum of two independent Uniform(0, 1) distributions. The probability distribution functions f(x) and g(x) are both 1 for 0 ≤ x ≤ 1, and zero otherwise. From Equation (4.2) we get

f_{X+Y}(a) = ∫₀¹ f(a − y) dy

For 0 ≤ a ≤ 1, the integrand is 1 only where 0 ≤ y ≤ a, which gives f_{X+Y}(a) = a. For 1 ≤ a ≤ 2, the integrand is 1 only where a − 1 ≤ y ≤ 1, which gives f_{X+Y}(a) = 2 − a, which is a Triangle(0, 1, 2) distribution. Thus, if our risk analysis model was just the sum of several simple distributions, we could use these equations repeatedly to determine the exact output distribution. There are a number of advantages to this approach, for example: the answer is exact; one can immediately see the effect of changing a parameter value; and one can use differential calculus to explore the sensitivity of the output to the model parameters. A variation of the same approach is to recognise the relationship between certain distributions (for example, the sum of independent Poisson distributions is itself a Poisson distribution, and the sum of independent Normal distributions is itself Normal). There are plenty of such relationships, and many are described in Appendix III, but nonetheless the distributions used in a risk analysis model don't usually allow such simple manipulation and the exact algebraic technique becomes hugely complex and often intractable very quickly, so it cannot usually be considered as a practical solution.

4.2.3 Numerical approximations

Some fast Fourier transform and recursive techniques have been developed for directly, and very accurately, determining the aggregate distribution of a random number of independent random variables. A lot of focus has been paid to this particular problem because it is central to the actuarial need to determine the aggregate claim payout an insurance company will face. However, the same generic problem occurs in banking and other areas. I describe these techniques in Section 11.2.2. There are other numerical techniques that can solve certain types of problem, particularly via numerical integration. ModelRisk, for example, provides the function VoseIntegrate, which will perform a very accurate numerical integration. Consider a function that relates the probability of illness, P_ill(D), to
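The Uniform(0, 1) + Uniform(0, 1) = Triangle(0, 1, 2) result is easy to verify numerically; a small Python check comparing an empirical histogram with the triangular density:

```python
import numpy as np

rng = np.random.default_rng(3)

# Sum of two independent Uniform(0, 1) variables
s = rng.uniform(0, 1, 200_000) + rng.uniform(0, 1, 200_000)

# Triangle(0, 1, 2) density: f(a) = a for 0 <= a <= 1, and f(a) = 2 - a for 1 <= a <= 2
hist, edges = np.histogram(s, bins=20, range=(0, 2), density=True)
mid = (edges[:-1] + edges[1:]) / 2
triangle = np.where(mid <= 1, mid, 2 - mid)

# Largest gap between the simulated density and the exact convolution result
print(float(np.max(np.abs(hist - triangle))))   # small, and shrinks with more samples
```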
### Highlight (yellow) - Chapter 4: Choice of model structure > Page 44 · Location 1311
the number of virus particles ingested, D. If we believed that the number of virus particles followed a Lognormal(100, 10) distribution, we could calculate the probability of illness by integrating the dose–response function against the density of the dose:

P(ill) = ∫ P_ill(D) f(D) dD, with the integral taken from D = 1 to D = 1000

where f(D) is the Lognormal(100, 10) density and, in ModelRisk, the VoseIntegrate function interprets "#" to be the variable to integrate over, the integration being done between 1 and 1000. The answer is 2.10217E-05, a value that we could only determine with accuracy using Monte Carlo simulation by running a large number of iterations.

4.2.4 Monte Carlo simulation

This technique involves the random sampling of each probability distribution within the model to produce hundreds or even thousands of scenarios (also called iterations or trials). Each probability distribution is sampled in a manner that reproduces the distribution's shape. The distribution of the values calculated for the model outcome therefore reflects the probability of the values that could occur. Monte Carlo simulation offers many advantages over the other techniques presented above:

- The distributions of the model's variables do not have to be approximated in any way.
- Correlation and other interdependencies can be modelled.
- The level of mathematics required to perform a Monte Carlo simulation is quite basic.
- The computer does all of the work required in determining the outcome distribution.
- Software is commercially available to automate the tasks involved in the simulation.
- Complex mathematics can be included (e.g. power functions, logs, IF statements, etc.) with no extra difficulty.
- Monte Carlo simulation is widely recognised as a valid technique, so its results are more likely to be accepted.
- The behaviour of the model can be investigated with great ease.
- Changes to the model can be made very quickly and the results compared with previous models.

Monte Carlo simulation is often criticised as being an approximate technique. However, in theory at least, any required level of precision can be achieved by simply increasing the number of iterations in a simulation. The limitations are in the number of random numbers that can be produced from a random number generating algorithm and, more commonly, the time a computer needs to generate the iterations. For a great many problems, these limitations are irrelevant or can be avoided by structuring the model into sections. The value of Monte Carlo simulation can be demonstrated by considering the cost model problem of Figure 4.5. Triangular distributions represent uncertainty variables in the model. There are many other, very intuitive, distributions in common use (Figure 4.6 gives some examples) that require little or no probability knowledge to understand. The cumulative distribution of the results is shown in Figure 4.7, along with the distribution of the values that are generated from running a "what if" scenario analysis using three values as discussed at the beginning of this chapter. The figure shows that the Monte Carlo outcome does not have anywhere near as wide a range as the "what if" analysis. This is because the "what if" analysis effectively gives equal probability weighting to all scenarios, including where all costs turned out to be their maximum and all costs turned out to be their minimum. Let us allow, for a minute, the maximum to mean the value that only has a 1% chance of being exceeded (say). The probability that all five costs could be at their maximum at the same time would equal (0.01)^5, or 1 in 10 000 000 000: not a realistic outcome!
Monte Carlo simulation therefore provides results that are also far more realistic than those that are produced by simple "what if" scenarios.

Figure 4.5 Construction project cost model.

Figure 4.6 Examples of intuitive and simple probability distributions.

Figure 4.7 Comparison of distributions of results from "what if" and risk analyses.
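For the numerical-integration idea above, here is a sketch in Python using scipy rather than ModelRisk's VoseIntegrate. The dose–response function is not reproduced in the text, so the exponential form and its parameter r below are placeholders; only the Lognormal(100, 10) dose distribution and the 1-to-1000 integration range come from the example.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import lognorm

# Lognormal with mean 100 and standard deviation 10 for the ingested dose D,
# converted to scipy's parameterisation (sigma and scale of the underlying normal)
mean, sd = 100.0, 10.0
sigma = np.sqrt(np.log(1 + (sd / mean) ** 2))
scale = mean / np.sqrt(1 + (sd / mean) ** 2)     # equals exp(mu)
dose = lognorm(s=sigma, scale=scale)

# Placeholder dose-response curve (the book's actual P_ill(D) is not shown here)
def p_ill(d, r=1e-7):
    return 1 - np.exp(-r * d)

# P(ill) = integral over D of P_ill(D) * f(D), between D = 1 and D = 1000
p, err = quad(lambda d: p_ill(d) * dose.pdf(d), 1, 1000)
print(p, err)
```

The numerical result will of course differ from the book's 2.10217E-05 unless the genuine dose–response function is substituted for the placeholder.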
### Highlight (yellow) - Chapter 4: Choice of model structure > Page 47 · Location 1356
4.3 Uncertainty and Variability

"Variability is a phenomenon in the physical world to be measured, analysed and where appropriate explained. By contrast, uncertainty is an aspect of knowledge." Sir David Cox

There are two components of our inability to be able precisely to predict what the future holds: these are variability and uncertainty. This is a difficult subject, not least because of the words that we risk analysts have available to describe the various concepts and how these words have been used rather carelessly. Bearing this in mind, a good start will be to define the meaning of various keywords. I have used the now fairly standard meanings for uncertainty and variability, but might be considered to be deviating a little from the common path in my explanation of the units of uncertainty and variability. The reader should bear in mind the comments I'll make about the different meanings that various disciplines assign to certain words. As long as the reader manages to keep the concepts clear, it should be an easy enough task to work out what another author means even if some of the terminology is different.

Variability

Variability is the effect of chance and is a function of the system. It is not reducible through either study or further measurement, but may be reduced by changing the physical system. Variability has been described as "aleatory uncertainty", "stochastic variability" and "interindividual variability". Tossing a coin a number of times provides us with a simple illustration of variability. If I toss the coin once, I will have a head (H) or tail (T), each with a probability of 50% if one presumes a fair coin. If I toss the coin twice, I have four possible outcomes {HH, HT, TH, TT}, each with a probability of 25% because of the coin's symmetry. We cannot predict with certainty what the tosses of a coin will produce because of the inherent randomness of the coin toss. The variation among a population provides us with another simple example. If I randomly select people off the street and note some physical characteristic, like their height, weight, sex, whether they wear glasses, etc., the result will be a random variable with a probability distribution that matches the frequency distribution of the population from which I am sampling. So, for example, if 52% of the population are female, a randomly sampled person will be female with a probability of 52%. In the nineteenth century a rather depressing philosophical school of thought, usually attributed to the mathematician Marquis Pierre-Simon de Laplace, became popular, which proposed that there was no such thing as variability, only uncertainty, i.e. that there is no randomness in the world and an omniscient being or machine, a "Laplace machine", could predict any future event. This was the foundation of the physics of the day, Newtonian physics, and even Albert Einstein believed in determinism of the physical world, saying the often quoted "Der Herr Gott würfelt nicht" ("God does not play dice"). Heisenberg's uncertainty principle, one of the foundations of modern physics and, in particular, quantum mechanics, shows us that this is not true at the molecular level, and therefore subtly at any greater scale.
In essence, it states that, the more one characteristic of a particle is constrained (for example, its location in space), the more random another characteristic becomes (if the first characteristic is location, the second will be its velocity). Einstein tried to prove that it is our knowledge of one characteristic that we are losing as we gain knowledge of another characteristic, rather than any characteristic being a random variable, but he has subsequently been proven wrong both theoretically and experimentally. Quantum mechanics has so far proved itself to be very accurate in predicting experimental outcomes at the molecular level, where the predictable random effects are most easily observed, so we have a lot of empirical evidence to support the theory. Philosophically, the idea that everything is predetermined (i.e. the world is deterministic) is very difficult to accept too, as it deprives us humans of free will. The non-existence of free will would in turn mean that we are not responsible for our actions: we are reduced to complicated machines and it is meaningless to be either praised or punished for our deeds and misdeeds, which of course is contrary to the principles of any civilisation or religion. Thus, if one accepts the existence of free will, one must also accept an element of randomness in all things that humans affect. Popper (1988) offers a fuller discussion of the subject. Sometimes systems are too complex for us to understand properly. For example, stock markets produce varying stock prices all the time that appear random. Nobody knows all the factors that influence a stock price over time; it is essentially infinitely complex and we accept that this is best modelled as a random process.

Uncertainty

Uncertainty is the assessor's lack of knowledge (level of ignorance) about the parameters that characterise the physical system that is being modelled. It is sometimes reducible through further measurement or study, or by consulting more experts. Uncertainty has also been called "fundamental uncertainty", "epistemic uncertainty" and "degree of belief". Uncertainty is by definition subjective, as it is a function of the assessor, but there are techniques available to allow one to be "objectively subjective". This essentially amounts to a logical assessment of the information contained in available data about model parameters without including any prior, non-quantitative information. The result is an uncertainty analysis that any logical person should agree with, given the available information.

Total uncertainty

Total uncertainty is the combination of uncertainty and variability. These two components act together to erode our ability to be able to predict what the future holds. Uncertainty and variability are philosophically very different, and it is now quite common for them to be kept separate in risk analysis modelling. Common mistakes are failure to include uncertainty in the model, or modelling variability in some parts of the model as if it were uncertainty. The former will provide an overconfident (i.e. insufficiently spread) model output, while the latter can grossly overinflate the total uncertainty. Unfortunately, as you will have gathered, the term "uncertainty" has been applied to both the meaning described above and total uncertainty, which has left the risk analyst with some problems of terminology.
Colleagues have suggested the word "indeterminability" to describe total uncertainty (perhaps a bit of a mouthful, but still the best suggestion I've heard so far). There has been a rather protracted argument between traditional (frequentist) and Bayesian statisticians over the meaning of words like probability, frequency, confidence, etc. Rather than go through their various interpretations here, I will simply present you with how I use these words. I have found my terminology helps clarify my thoughts and those of my clients and course participants very well. I hope they will do the same for you.

Probability

Probability is a numerical measurement of the likelihood of an outcome of some stochastic process. It is thus one of the two components, along with the values of the possible outcomes, that describe the variability of a system. The concept of probability can be developed neatly from two different approaches. The frequentist approach asks us to imagine repeating the physical process an extremely large number of times (trials) and then to look at the fraction of times that the outcome of interest occurs. That fraction is asymptotically (meaning as we approach an infinite number of trials) equal to the probability of that particular outcome for that physical process. So, for example, the frequentist would imagine that we toss a coin a very large number of times. The fraction of the tosses that comes up heads is approximately the true probability of a single toss producing a head, and, the more tosses we do, the closer the fraction becomes to the true probability. So, for a fair coin we should see the number of heads stabilise at around 50% of the trials as the number of trials gets truly huge. The philosophical problem with this approach is that one usually does not have the opportunity to repeat the scenario a very large number of times. The physicist or engineer, on the other hand, could look at the coin, measure it, spin it, bounce lasers off its surface, etc., until one could declare that, owing to symmetry, the coin must logically have a 50% probability of falling on either surface (for a fair coin, or some other value for an unbalanced coin as the measurements dictated). Probability is used to define a probability distribution, which describes the range of values the variable may take, together with the probability (likelihood) that the variable will take any specific value.

Degree of uncertainty

In this context, "degree of uncertainty" is our measure of how much we believe something to be true. It is one of the two components, along with the plausible values of the parameter, that describe the uncertainty we may have about the parameter of the physical system ("the state of nature", if you like) to be modelled. We can thus use the degree of uncertainty to define an uncertainty distribution, which
describes the range of values within which we believe the parameter lies, as well as the level of confidence we have about the parameter being any particular value, or lying within any particular range. A distribution of confidence looks exactly the same as a distribution of probability, and this can lead, all too easily, to confusion between the two quantities.

Frequency

Frequency is the number of times a particular characteristic appears in a population. Relative frequency is the fraction of times the characteristic appears in the population. So, in a population of 1000 people, 22 of whom have blue eyes, the frequency of blue eyes is 22 and the relative frequency is 0.022 or 2.2%. Frequency, by the definition used here, must relate to a known population size.
### Highlight (yellow) - Chapter 4: Choice of model structure > Page 51 · Location 1478
Uncertainty and variability are described by distributions that, to all intents and purposes, look and behave exactly the same. One might therefore reasonably conclude that they can be used together in the same Monte Carlo model: some distributions reflecting the uncertainty about certain parameters in the model, the other distributions reflecting the inherent stochastic nature of the system. We could then run a simulation on such a model which would randomly sample from all the distributions, and our output would therefore take account of all uncertainty and variability. Unfortunately, this does not work out completely. The resultant single distribution is equivalent to our "best-guess" distribution of the composite of the two components. Technically, it is difficult to interpret, as the vertical scale represents neither uncertainty nor variability, and we have lost some information in knowing what component of the resultant distribution is due to the inherent randomness (variability) of the system, and what component is due to our ignorance of that system. It is therefore useful to know how to keep these two components separate in an analysis if necessary.
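One common way of keeping the two components separate is a nested, second-order simulation: an outer loop samples the uncertain parameters and an inner loop samples the variability given those parameters. A minimal sketch with made-up numbers follows; the Beta posterior and the 365-trial binomial are illustrative assumptions, not an example from the book.

```python
import numpy as np

rng = np.random.default_rng(11)

# Outer loop: uncertainty about an event probability p, expressed as a
# posterior-style Beta(4 + 1, 46 + 1) from 4 events in 50 hypothetical trials.
# Inner loop: variability - how many events occur in 365 opportunities given p.
n_uncertainty, n_variability = 200, 5_000
p95_per_parameter = []
for p in rng.beta(4 + 1, 46 + 1, size=n_uncertainty):        # uncertainty draw
    events = rng.binomial(n=365, p=p, size=n_variability)     # variability draws
    p95_per_parameter.append(np.percentile(events, 95))

# The spread of the 95th percentile across uncertainty draws shows how much of
# the total spread is ignorance about p rather than inherent randomness.
print(np.percentile(p95_per_parameter, [5, 50, 95]))
```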
### Highlight (yellow) - Chapter 4: Choice of model structure > Page 59 · Location 1633
4.4.2 Monte Carlo sampling

Monte Carlo sampling uses the above sampling method exactly as described. It is the least sophisticated of the sampling methods discussed here, but is the oldest and best known. Monte Carlo sampling got its name as the code word for work that von Neumann and Ulam were doing during World War II on the Manhattan Project at Los Alamos for the atom bomb, where it was used to integrate otherwise intractable mathematical functions (Rubinstein, 1981). However, one of the earliest examples of the use of the Monte Carlo method was in the famous Buffon's needle problem, where needles were physically thrown randomly onto a gridded field to estimate the value of π. At the beginning of the twentieth century the Monte Carlo method was also used to examine the Boltzmann equation, and in 1908 the famous statistician Student (W. S. Gosset) used the Monte Carlo method for estimating the correlation coefficient in his t-distribution. Monte Carlo sampling satisfies the purist's desire for an unadulterated random sampling method. It is useful if one is trying to get a model to imitate a random sampling from a population or for doing statistical experiments. However, the randomness of its sampling means that it will over- and undersample from various parts of the distribution and cannot be relied upon to replicate the input distribution's shape unless a very large number of iterations are performed. For nearly all risk analysis modelling, the pure randomness of Monte Carlo sampling is not really relevant. We are almost always far more concerned that the model will reproduce the distributions that we have determined for its inputs. Otherwise, what would be the point of expending so much effort on getting these distributions right?
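Buffon's needle is easy to reproduce with random sampling rather than physical needles; a short Python version (the needle length L and line spacing d are arbitrary choices with L ≤ d):

```python
import numpy as np

rng = np.random.default_rng(2024)

# Buffon's needle: drop needles of length L onto a floor ruled with parallel
# lines a distance d apart (L <= d); the crossing probability is 2L / (pi * d)
L, d, n = 1.0, 2.0, 1_000_000
centre = rng.uniform(0, d / 2, n)       # distance from needle centre to the nearest line
theta = rng.uniform(0, np.pi / 2, n)    # acute angle between needle and the lines
crosses = centre <= (L / 2) * np.sin(theta)

pi_estimate = 2 * L / (d * crosses.mean())
print(pi_estimate)
```

The slow convergence of this estimate towards π is itself a nice illustration of the point made above: pure Monte Carlo sampling needs a very large number of iterations before its irregular sampling evens out.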
### Highlight (yellow) - Chapter 4: Choice of model structure > Page 63 · Location 1718
4.5 Simulation Modelling

My cardinal rule of risk analysis modelling is: "Every iteration of a risk analysis model must be a scenario that could physically occur". If the modeller follows this "cardinal rule", he or she has a much better chance of producing a model that is both accurate and realistic and will avoid most of the problems I so frequently encounter when reviewing a client's work. Section 7.4 discusses the most common risk modelling errors. A second very useful rule is: "Simulate when you can't calculate". In other words, don't simulate when it is possible, and not too onerous, to determine the answer exactly and directly through normal mathematics. There are several reasons for this: simulation provides an approximate answer and mathematics can give an exact answer; simulation will often not be able to provide the entire distribution, especially at the low-probability tails; mathematical equations can be updated instantaneously in light of a change in the value of a parameter; and techniques like partial differentiation that can be applied to mathematical equations provide methods to optimise decisions much more easily than simulation. In spite of all these benefits, algebraic solutions can be excessively time consuming or intractable for all but the simplest problems. For those who are not particularly mathematically inclined or trained, simulation provides an efficient and intuitive approach to modelling risky issues.
### Highlight (yellow) - Chapter 4: Choice of model structure > Page 64 · Location 1759
It is very common for people to include rare events in a risk analysis model that is primarily concerned with the general uncertainty of the problem, even though doing so provides little extra insight. For example, we might construct a model to estimate how long it will take to develop a software application for a client: designing, coding, testing, etc. The model would be broken down into key tasks and probabilistic estimates made for the duration of each task. We would then run a simulation to find the total effect of all these uncertainties. We would not include in such an analysis the effect of a plane crashing into the office or the project manager quitting. We might recognise these risks and hold back-up files at a separate location or make the project manager sign a tight contract, but we would gain no greater understanding of our project's chance of meeting the deadline by incorporating such risks into our model.
## Part II: Introduction
### Highlight (yellow) - Chapter 6: Probability mathematics and simulation > Page 118 · Location 2839
6.2 The Definition of "Probability"

Probability is a numerical measurement of the likelihood of an outcome of some random process. Randomness is the effect of chance and is a fundamental property of the system, even if we cannot directly measure it. It is not reducible through either study or further measurement, but may be reduced by changing the physical system. Randomness has been described as "aleatory uncertainty" and "stochastic variability". The concept of probability can be developed neatly from two different approaches:

Frequentist definition

The frequentist approach asks us to imagine repeating the physical process an extremely large number of times (trials) and then to look at the fraction of times that the outcome of interest occurs. That fraction is asymptotically (meaning as we approach an infinite number of trials) equal to the probability of that particular outcome for that physical process. So, for example, the frequentist would imagine that we toss a coin a very large number of times. The fraction of the tosses that come up heads is approximately the true probability of a single toss producing a head, and the more tosses we do the closer the fraction becomes to the true probability. So, for a fair coin, we should see the number of heads stabilise at around 50% of the trials as the number of trials gets truly huge. The philosophical problem with this approach is that one usually does not have the opportunity to repeat the scenario a very large number of times. How do we match this approach with, for example, the probability of it raining tomorrow, or you having a car crash?
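The frequentist idea of a stabilising fraction is simple to watch numerically; a short Python sketch of the running fraction of heads for a simulated fair coin:

```python
import numpy as np

rng = np.random.default_rng(5)

# Simulate a fair coin: 1 = head, 0 = tail
tosses = rng.integers(0, 2, 1_000_000)
running_fraction = np.cumsum(tosses) / np.arange(1, tosses.size + 1)

# The running fraction wanders early on, then settles close to 0.5
for n in (10, 1_000, 100_000, 1_000_000):
    print(n, running_fraction[n - 1])
```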
### Highlight (yellow) - Chapter 6: Probability mathematics and simulation > Page 118 · Location 2852
Axiomatic definition

The physicist or engineer, on the other hand, could look at the coin, measure it, spin it, bounce lasers off its surface, etc., until one could declare that, owing to symmetry, the coin must logically have a 50% probability of falling on either surface (for a fair coin, or some other value for an unbalanced coin, as the measurements dictated). Determining probabilities on the basis of deductive reasoning has a far broader application than the frequency approach because it does not require us to imagine being able to repeat the same physical process infinitely.

A third, subjective, definition

In this context, "probability" would be our measure of how much we believe something to be true. I'll use the term "confidence" instead of probability to make the separation between belief and real-world probability clear. A distribution of confidence looks exactly the same as a distribution of probability and must follow the same rules of complementation, addition, etc., which can easily lead to the two ideas being mixed up. Uncertainty is the assessor's lack of knowledge (level of ignorance) about the parameters that characterise the physical system that is being modelled. It is sometimes reducible through further measurement or study. Uncertainty has also been called "fundamental uncertainty", "epistemic uncertainty" and "degree of belief".
### Highlight (yellow) - Chapter 19: Project risk analysis > Page 473 · Location 11194
we will assume that a preliminary exercise has already been carried out to identify the various risks associated with the project. We will assume that a risk register has been drawn up (see Section 1.6) and that sufficient information has been gathered to be able adequately to quantify the probabilities associated with each risk and the size of potential impact on the tasks of the project.