Evaluating EHR Vendors
This section is based on, and at times uses the exact text of, Alemi F, Gustafson DH. Decision Analysis for Healthcare Managers, Health Administration Press: Ann Arbor, Michigan, 2007.
Introduction
This section introduces a flexible method for evaluating different electronic health records (EHRs). In this section, we assume that evaluation criteria are specified by a single individual as opposed to a committee. Of course, in reality requirements are set in a group process and with input from various people, including outside experts. Nevertheless, for ease of presentation we assume that a single individual is making the decisions about vendor selection. While we present the vendor selection process as a discussion between an analyst and a decision maker, the same process can be used for self-analysis: a decision maker can build a model of his or her own decisions without the help of an analyst or a consultant.
The procedures described here are based on Multi-Attribute Value models. The purpose of these models is to preserve a decision maker's preferences across a large number of criteria. EHRs differ in many ways. Judging which one is best for the organization requires keeping track of the performance of each system on a large number of criteria. To simplify the task, Multi-Attribute Value models allow the decision maker to evaluate the system on each criterion and then use a mathematical formula to integrate these evaluations into a single score that represents the value of the EHR to the decision maker.
Multi-Attribute Value models have received considerable attention from economists, engineers and psychologists (Savage 1972; Edwards 1974). A comprehensive and rather mathematical introduction to constructing value models is found in von Winterfeldt and Edwards (1986). This section focuses on instructions for making the models and ignores the axiomatic and mathematical foundations of Multi-Attribute Value models.
Value models quantify the decision maker's preferences. By this, we mean that value models assign numbers to options so that higher numbers reflect more preferred options. These models assume that the decision maker must select from several options and that the selection should depend on grading his or her preferences for the options. These preferences are quantified by examining the various attributes (characteristics, dimensions, criteria, or features) of the options. For example, if decision makers were choosing among different Electronic Health Record systems, the value of the different EHRs could be scored by examining such attributes as "compatibility with legacy systems," "potential impact on practice patterns" and "cost." First, each EHR would be scored on each attribute; this scoring is often called a single-attribute value function. Second, scores would be weighted by the relative importance of each attribute. Third, the scores for all attributes would be aggregated, often by using a weighted sum. Fourth, the EHR with the highest weighted score would be chosen. If each option is described in terms of n attributes A1, A2, ... , An, the option is assigned a score on each attribute, V(A1), V(A2), ... , V(An). The overall value of an option equals:
Value = Function [ V(A1), V(A2), ... , V(An) ]
In words, the overall value of an option is a function of the value of the option on each attribute.
Why Model Values?
Values (attitudes, preferences) play major roles in making management decisions. In organizations, decision making is often very complex and a product of collective action. Frequently, decisions must be made concerning issues on which few data exist, forcing managers to make decisions on the basis of opinions, not facts. Often there is no correct resolution to a problem, with all options having equally legitimate perspectives and values playing a major role in the final choice. There are many everyday decisions that involve value tradeoffs.
Examples include: Which software to purchase? Which vendor to contract with? What contractual relationships to enter into? Which components to include in the EHR system? In all these decisions, the manager has to trade off gains in one area against losses in others. Some EHRs can help reduce medication errors, but others lack the needed feature. Other EHRs cost more to purchase but less to operate. Some EHRs are easier for clinicians to operate; others can more easily connect to electronic health records in the clinicians' offices. In business, difficult decisions almost always involve trading off various benefits against each other. Most people acknowledge that a manager's decisions involve consideration of value tradeoffs. This is not a revelation. What is unusual is that decision analysts model these values and use numbers and formulas to express them. Some may wonder why the analyst needs to model and quantify value tradeoffs. The reasons for modeling a decision maker's values include the following:
Value models have been used extensively. Chatburn and Primiano (2001) used value models to examine large capital purchases such as the decision to purchase a ventilator. Value models have been used to model policymakers' priorities for evaluating standards for offshore oil discharges (von Winterfeldt 1980), energy alternatives (Keeney 1976), drug therapy options (Aschenbrenner and Kaubeck 1978), and family planning options (Beach et al. 1979).
Misleading Numbers?
Though value models allow us to quantify subjective concepts, the resulting numbers are rough estimates that should not be mistaken for precise measurements. It is important that managers do not read more into the numbers than they mean. Analysts must stress that the numbers in value models are intended to offer a consistent method of tracking, comparing, and communicating rough, subjective concepts and not to claim a false sense of precision. An important distinction is whether the model is to be used for rank ordering (ordinal) or rating the extent of preferences (interval). Some value models produce numbers that are only useful for rank ordering options. Thus, in evaluating vendors, what matters to some decision makers is which vendor best meets the organization's needs and not how much better one is than another. In these circumstances, a vendor with a score of 80 out of 100 may not be twice as good as a vendor with a score of 40 out of 100. All we can say is that the vendor with a score of 80 is better than the vendor with a score of 40, but not how much better. Furthermore, averaging such ordinal scores is meaningless. In contrast, value models that score on an interval scale show how much more preferable one option is than another. Now we can say with some confidence that the vendor with a score of 80 is twice as good as the vendor with a score of 40, and perhaps we may be willing to pay twice as much for the EHR from the vendor with the score of 80.
Furthermore, averaging interval scores is meaningful, and we can use these scores in other calculations. Numbers can also be used as a way of naming things (the so-called nominal scale). Nominal scales produce numbers that are neither ordinal nor interval. The International Classification of Disease assigns numbers to diseases, but these numbers are neither ordinal nor interval. The point is that there are different types of numbers that achieve different goals. In modeling a decision maker's values, single-attribute value functions must be measured on interval scales. If single attributes are measured on an interval scale, then these numbers can be added or multiplied to produce the overall score. If they are measured on an ordinal or nominal scale, one cannot calculate the overall score from the single-attribute values. Sometimes measuring single-attribute values on an interval scale is difficult. In these circumstances it may be necessary to ask the decision maker to state repeated ordinal preferences about small improvements in EHRs. The collection of a large number of these small preferences then constructs a near-interval scale. In contrast, overall scores for options need only have an ordinal property. When it comes to choosing an option, all most decision makers care about is which option has the highest rating and not by how much one option is scored higher than the others. Keep in mind that the purpose of quantification is not to be precise in numerical assessment. The analyst quantifies values of various attributes so that the calculus of mathematics can be used to keep track of them and produce an overall score that reflects the various attributes. Quantification allows us to use the logic embedded in numbers. In the end, model scores are a rough approximation of preferences. They are helpful not because they are precise but because they adequately track the contribution of each attribute to the overall decision.
Example of EHR Evaluations
Vendors are typically asked for five categories of information: product functions, corporate information, service, technology and price. Each of these categories includes a number of attributes. Cost might be broken into initial licensing fees, maintenance cost, and networking costs. Product function might have numerous other sub-categories (scheduling, billing, etc.), each leading to a different set of attributes. The California Healthcare Foundation (Forrester Research 2003) and the Leapfrog group (Kilbridge, Welebob, Classen, 2006) have each developed detailed procedures for vendor evaluations. These efforts are prescriptive, in the sense that they advise how practices should evaluate electronic health records. Others have taken a more descriptive route and examined how real practices make vendor decisions (Eden 2002). Throughout this section, we use the attributes identified in Eden's survey of physician practices to construct a model for vendor evaluations.
Steps in Modeling Values
Analyst: Can you recall a specific electronic health record that you think will be a poor choice for us?
Decision Maker: I have talked to a lot of vendors. They promise you the world, but then when you get the product you see that very few of the promises are actually delivered.
Analyst: Tell me about a system you liked and why.
Decision Maker: I liked the lab ordering system we currently have. It is easy to use and I am already familiar with it. I do not want to give that up. I do not want to have to learn everything from scratch.
Analyst: So far you have told me that you consider the following attributes as important: "vendors that have a reputation of promising only what they can deliver," "ease of use," and "compatibility with existing laboratory system." What other attributes are important in your selection of a vendor?
Decision Maker: Well, cost for sure, but not just the cost of purchasing. I want to know the cost of operating it and how it affects my efficiency in seeing patients.
As you may have noticed in this dialogue, the analyst started with tangible examples and used the terminology and words introduced by the decision maker to become more concrete. This dialogue is designed to reinforce to the decision maker that the analyst is listening to him. Therefore it is important to use as far as possible the words and terminology used by the decision maker.
In general, analysts are not there to express their own ideas. They are there to listen. But they can ask questions to clarify things or even to mention things overlooked by the decision maker, as long as it does not change the nature of the relationship between the analyst and the decision maker.
Arrange the attributes in a hierarchy. Some analysts suggest using a hierarchy to solicit and structure the attributes (Keeney and Raiffa 1976). For example, the decision maker suggested that costs are important but that there are different types of cost attributes: cost of purchase, cost of maintenance, and cost of impact on practice efficiency. These three costs can be sub-attributes under the cost category. The hierarchical structure promotes completeness and simplifies tracking many attributes.
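A hierarchy of this kind can be kept as a simple nested structure. The sketch below is only an illustration: the three cost sub-attributes come from the dialogue above, while the variable and function names (`hierarchy`, `leaf_attributes`) and the decision to leave the other attributes without sub-attributes are assumptions made for the example.

```python
# Hypothetical attribute hierarchy. The names under "cost" come from the
# dialogue; the nesting and the empty sub-trees are illustrative assumptions.
hierarchy = {
    "cost": {
        "cost of purchase": {},
        "cost of maintenance": {},
        "cost of impact on practice efficiency": {},
    },
    "ease of use": {},
    "compatibility with existing laboratory system": {},
    "vendor delivers what it promises": {},
}


def leaf_attributes(tree, path=()):
    """Return the full path of every lowest-level attribute in the tree."""
    leaves = []
    for name, children in tree.items():
        current = path + (name,)
        if children:
            # Descend into sub-attributes.
            leaves.extend(leaf_attributes(children, current))
        else:
            leaves.append(current)
    return leaves


for leaf in leaf_attributes(hierarchy):
    print(" > ".join(leaf))
```

Listing the leaves makes it easy to see at a glance which attributes will eventually need levels, value functions, and weights.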
Be careful about terminology. Always use the decision maker's terminology, even if you think a reformulation would help. Thus, if the decision maker refers to "efficiency," do not substitute "effectiveness." If the decision maker refers to "speed" do not substitute with "time to response." Such new terminology may confuse the conversation and create an environment where the analyst acts more like a decision maker, which can undermine the decision maker's confidence that he or she is being heard.
In general, the less esoteric prompts are more likely to produce the best responses, so formulate a few prompts and use the prompts that feel most natural for your task. Avoid jargon, including the use of our terminology (e.g. attribute, value function, aggregation rules, etc.).
Take notes, and do not interrupt. Have paper and pencil available, and write down the important points. Not only does this help the decision maker's recall, but it also helps you review matters while the decision maker is still available. Decision makers tend to list a few attributes, then focus attention on one or two. Actively listen to these areas of focus. When the decision maker is finished, review your notes for items that need elaboration. If you don't understand certain points, ask for examples, which are an excellent means of clarification.
Other approaches. Other, more statistical approaches to soliciting attributes are available, such as multidimensional scaling and factor analysis. However, we prefer the behavioral approach to soliciting attributes because it involves the decision maker more in the process and leads to greater acceptance of the final vendor selection model.
After soliciting a set of attributes, it is important to examine and, if necessary, revise them. Psychological research suggests that changing the framing of a question alters the response. Consider these two questions:
"What are the markers for good electronic health record?"
"What are the markers for a poor electronic health record?"
One question emphasizes the positives and the other the negatives. One would expect the list of negative attributes to be simply the absence of the positive ones. Researchers have found this to be untrue (for a review, see Nisbett and Ross 1980). Decision makers may identify entirely different attributes given the two prompts. This research suggests that value-laden prompts tap different parts of the memory and can evoke recall of different pieces of information. Evidence about the impact of questions on recall and judgment is substantial and well established (Hogarth 1975; Ericsson and Simon 1980; Snyder and Cantor 1979).
Several tests should be conducted to ensure that the solicitation process succeeded. The first test ensures that the listed attributes are exhaustive. The analyst uses the attributes to describe several hypothetical electronic health records. The decision maker should be able to rate the systems. If the decision makers need additional information before they can make a judgment, the analyst solicits new attributes until they have sufficient information to judge the system.
A second test checks that the attributes are not redundant by examining whether knowledge of one attribute implies knowledge of another. For example, the decision maker may consider "maintenance cost" and "yearly licensing fees" redundant if both have the same implications. In such cases, either the two attributes should be collapsed into one, or one must be dropped from the analysis.
A third test ensures that each attribute is important to the decision maker's judgment. You can test this by asking the decision maker to judge two hypothetical situations: one with the attribute at its lowest level and another with the attribute at its peak level. If the judgments are similar, the attribute may be ignored. For example, if the range of maintenance costs is so narrow as to be relatively unimportant in the larger decision, the analyst can ignore it.
Fourth, a series of tests examines whether the attributes are related or dependent (Keeney and Raiffa 1976; Keeney 1977). These words are much abused and variously defined. By independence we mean that in judging two different systems, changing a feature shared by these systems does not affect how the other features are judged. This type of independence is called preferential independence. There are many situations in which preferential independence does not hold. In predicting three-year risks of hospitalization, age and lifestyle may be dependent (Alemi et al. 1987). Among young adults, drinking may be a more important concern than cholesterol risks, while among older adults, cholesterol is the more important risk factor. Thus, the relative importance of cholesterol and drinking risks depends on the age of the patients being compared.
In many circumstances, preferential independence holds. It often holds even when decision makers complain that the attributes are dependent in other senses. When preferential independence holds, it is reasonable to break a complex judgment into its components. Or, to say it differently, with preferential independence, it is possible to find a formula that translates scores on several attributes into an overall score in such a manner as to preserve the decision maker's preferences. When preferential independence does not hold, it is often a sign that some underlying issue is poorly understood. In these circumstances, the analyst should query the decision maker further and revise the attributes to eliminate dependencies (Keeney 1980).
In Eden's 2002 survey of practices, the following attributes were identified as important when selecting a vendor:
The software appeared easy to use
Software appeared to improve one or more of the business processes in the practice process
The software provided the most value for cost
The software would help the practice perform processes needed to reach our long term business strategy
The vendor had many sites and was responsive to practice information needs during the selection process
There were strong testimonies from prior users
The software was already in use by other sites affiliated with this practice
Software was compatible with existing systems in the practice
Note that each of these attributes can be further broken down to other (lower level) attributes. For example the second attribute "Software appeared to improve business process" might be broken into the following sub-set of attributes:
Improved billing process
More accurate documents
Improved ability to analyze managed care costs
Improved scheduling process
Improved access to patient information at multiple sites
Reduced malpractice costs
Improved referral process
Reduced time for recording patient information
Improved communication
Improved documented quality
Quicker lab results
As the number of attributes in a model increases, the chances for preferential dependence also increase. The rule of thumb is that preferential dependencies are much more likely in value models with more than nine attributes. As you can see, the evaluation of electronic health records could easily involve 50 or more attributes.
Now it is time to identify the possible levels of each attribute. The analyst starts by deciding if the attributes are discrete or continuous. Attributes such as cost are continuous; attributes such as compatibility with existing systems are discrete. Continuous attributes may be expressed in terms of a few discrete levels, so that cost can be described in ranges and not actual cost values. There are three steps in identifying the levels of each attribute:
Define the range of the attribute by selecting the best and worst levels in the attribute
Define some intermediate levels
Fill in the other possible levels so that the listing of the levels is exhaustive (capable of covering all possible situations).
To define the range, the analyst must examine the best and the worst vendor on the single attribute. Thus, for the cost attribute, the range is set so that it will cover the lowest and highest cost system. Note that the lowest and highest possible costs are different from the range facing the decision maker at the time of the decision. The range of each attribute should be set so that all vendors under consideration can be rated on the attribute, but no wider than necessary for this task.
A typical error in obtaining the best and the worst levels is failing to describe these levels in detail. For example, in assessing the value of the attribute "ease of use", it is not helpful to define the levels as:
Easiest to use system
Hardest to use system
The adjectives easy to use and hard to use should be defined. It is best to avoid using adjectives in describing levels, as decision makers perceive words like easy, or best in different ways. The levels must be defined in terms of the underlying physical process measured in each attribute, and the descriptions must be connected to the nature of the attribute. Thus, a good level for the easy to use attribute might be "Entering the average patient encounter takes less than 2 minutes" and the worst might be "Entering the average patient encounter takes more than 5 minutes."
Next, ask the decision maker to define intermediate levels. These levels are often defined by asking for a level between the best and worst levels. In the example, this dialogue might occur:
Analyst: I understand that in some systems it may take less than 2 minutes or more than 5 minutes to enter information on an average patient encounter. What about other systems? Where do you think they would fall in the range between less than 2 minutes and more than 5 minutes?
Decision Maker: Well, a host of things can happen. But on average I think in many systems it may take about 3 minutes to enter an average patient.
It is not always possible to solicit all possible levels of an attribute from the decision maker interviews. In these circumstances, the analyst can fill in the gaps afterward by reading the literature or interviewing other experts. The levels specified by the first decision maker are used as markers for placing the remaining levels, so that the levels range from best to worst. In the example, a clinician on the project team reviewed the Decision Maker's suggestions and filled in a long list of intermediate levels.
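One way to check that a set of levels is exhaustive is to write it as a rule that assigns every possible measurement to exactly one level. The sketch below does this for the "ease of use" attribute. The 2-minute and 5-minute anchors come from the text; treating everything in between as a single intermediate level, and the function name `entry_time_level`, are assumptions made purely for illustration (the long list of intermediate levels filled in by the clinician is not reproduced here).

```python
def entry_time_level(minutes):
    """Map the average time to enter a patient encounter to a level.

    The best and worst levels use the anchors from the text; the single
    intermediate level is a simplifying assumption.
    """
    if minutes < 0:
        raise ValueError("entry time cannot be negative")
    if minutes < 2:
        return "less than 2 minutes"   # best level
    if minutes <= 5:
        return "2 to 5 minutes"        # assumed intermediate level
    return "more than 5 minutes"       # worst level


# Every non-negative time falls into exactly one level.
for observed in (1.5, 3, 5, 7.25):
    print(observed, "->", entry_time_level(observed))
```

Because the levels are defined on a measurable quantity rather than on adjectives such as "easy" or "hard," two raters observing the same system should place it at the same level.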
The analysis proceeds with the evaluation of single attribute value function, i.e. a scoring procedure that assigns the relative value of each level in a single attribute. A common method for doing so is to use the double-anchored estimation method (Kneppreth et al. 1974). This approach gets its name from the practice of selecting the best and worst levels first and rating the remaining levels according to these two "anchors." In this method, first the attribute levels are ranked, or, if the attribute is continuous, the most and least preferred levels are specified. Then the best and the worst levels are used as anchors for assessing the other levels.
For example, consider the attribute "Initial licensing fees per physician." It may have the following levels:
No cost (open software)
$5,000/physician
$10,000/physician
$20,000/physician
$30,000/physician
$40,000/physician
Note that the best and worst costs are assigned by convention a score of 0 and 100. The following interaction typifies the questioning for the double-anchored estimation method:
Several psychologists have questioned whether decision makers are systematically biased in assessing value. Yates and Jagacinski (1979) showed that using different anchors produced different value functions. For example, in assessing the value of money, Kahneman and Tversky (1979) showed that values associated with gains or losses are different from values related to the amount of monetary return. They argued that the value of money is judged according to the decision maker's current assets. Because value may depend on the anchors used, it is important to use anchors other than just the best and worst levels. Thus, we might ask: given that $5,000 per physician is rated at 10 and $30,000 per physician is rated at 90, where would you rate the remaining levels?
It is occasionally useful to change not only the anchors but also the assessment method. A number of other methods besides the double-anchored method exist. When a value is measured by two different methods, there may be inadvertent discrepancies; the analyst must ask the decision maker to resolve these differences. If the differences are small, they can be averaged out.
By convention, the single attribute value function must range from 0 to 100. Sometimes, decision makers refuse to assign the 0 value. In these circumstances, their estimated values should be revised to range from 0 to 100. The following formula shows how to obtain standardized value functions from estimates that do not range from 0 to 100:
$$\text{Standardized value} = 100 \times \frac{\text{Assigned value} - \text{Minimum assigned value}}{\text{Maximum assigned value} - \text{Minimum assigned value}}$$
For example, if the levels of the scheduling system attribute are rated as:
| Attribute | Attribute level | Rating | Standardized Rating |
| Scheduling system included | No reminder or scheduling system | 10 | 0 |
| | Scheduling only | 60 | 62 |
| | Appointment reminder only | 20 | 12 |
| | Both scheduling and appointment reminders | 90 | 100 |
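The standardization of the ratings in this table can be sketched in a few lines. The ratings are those shown in the table; the function name `standardize` is an assumption made for the example, and the rule it applies (subtract the minimum, divide by the range, multiply by 100) is the one implied by the standardized column.

```python
def standardize(ratings):
    """Rescale ratings so the lowest becomes 0 and the highest becomes 100."""
    lowest = min(ratings.values())
    highest = max(ratings.values())
    if highest == lowest:
        raise ValueError("ratings must span more than one value")
    return {
        level: 100 * (rating - lowest) / (highest - lowest)
        for level, rating in ratings.items()
    }


# Ratings taken from the table above.
ratings = {
    "No reminder or scheduling system": 10,
    "Scheduling only": 60,
    "Appointment reminder only": 20,
    "Both scheduling and appointment reminders": 90,
}

standardized = standardize(ratings)
for level, value in standardized.items():
    print(f"{level}: {value}")
```

The computed values for the two intermediate levels are 62.5 and 12.5; the table lists them as 62 and 12.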
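In this table, the maximum rating is 90 and the minimum rating is 10, so each level is standardized by subtracting the minimum, dividing by the range between the maximum and the minimum, and multiplying by 100. For example, for "Appointment reminder only" the value is:
$$100 \times \frac{20 - 10}{90 - 10} = 12.5$$
which the table lists as 12.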
In this step, the analysis proceeds by finding a way to aggregate the single attribute value functions into an overall score evaluated across all attributes. Note that the scoring convention has produced a situation in which the value of each attribute is somewhere between 0 and 100. The analyst must find an aggregation rule that differentially weights the various attributes.
The most obvious rule is the additive model. Assume that S represents the overall value of a system purchased from a vendor. If the system is described by a series of n attributes {A1, A2, . . . , Ai, . . . , An}, then using the additive rule, the overall value equals:
$$S = \sum_{i=1}^{n} W_i \, V_i(A_j)$$
where $V_i(A_j)$ is the value of the jth level in the ith attribute, $W_i$ is the importance weight associated with the ith attribute, and $\sum_i W_i = 1$.
Several other models are possible in addition to the additive model. But the additive form is the most commonly used method of aggregating across multiple attribute evaluations.
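As a minimal sketch of the additive rule, the function below multiplies each single-attribute value (on the 0 to 100 scale) by its weight and sums the results. The attribute names, weights, and vendor values are hypothetical numbers chosen only to show the arithmetic, and `additive_value` is an illustrative name, not part of any published tool.

```python
def additive_value(weights, values):
    """Overall score: sum over attributes of weight times single-attribute value."""
    if set(weights) != set(values):
        raise ValueError("weights and values must cover the same attributes")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(weights[a] * values[a] for a in weights)


# Hypothetical weights (sum to 1) and single-attribute values (0 to 100).
weights = {"cost": 0.3, "ease of use": 0.5, "compatibility": 0.2}
vendor_a = {"cost": 40, "ease of use": 90, "compatibility": 50}
vendor_b = {"cost": 80, "ease of use": 60, "compatibility": 70}

score_a = additive_value(weights, vendor_a)
score_b = additive_value(weights, vendor_b)
print(round(score_a, 1), round(score_b, 1))
```

With these made-up numbers, vendor B edges out vendor A (about 68 versus 67) even though vendor A is much easier to use, which shows how the weights let a deficit on one attribute be offset by gains on others.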
The analyst can estimate the weights for an additive value model in a number of ways. This section presents the method of rating the ratio of importance of the attributes. It is often useful to mix several approaches. Some analysts estimate weights by assessing the ratio of the importance of two attributes (Edwards 1977). The attributes are rank ordered, and the least important is assigned 10 points. Then the decision maker is asked to estimate the relative importance of the other attributes. There is no upper limit to the number of points other attributes can be assigned. For example, in estimating the weights for the three attributes (cost, impact on business process, and compatibility with current systems), the analyst and the decision maker might have the following discussion:
Analyst: Which of the three attributes is most important?
Decision Maker: Well, they are all important, but impact on the business process is the reason why we are getting an EHR, so I guess I would say that is most important.
Analyst: Which is the least important?
Decision Maker: They are all important, but among these three I think compatibility with existing systems is least important. I understand that if we want to have the features that are unique to a system, we may not be able to throw away the old systems and purchase a comprehensive system from the same vendor.
Analyst: If compatibility with existing systems is rated 10 in importance, how many times more important is cost?
Decision Maker: Cost is important. I would say maybe 3 times more.
Analyst: That would put cost at 30 if compatibility with existing systems is rated as 10. Now how many times more important is impact on business process than cost?
Decision Maker: I would go with 1.5 times more important.
In the dialogue above, the analyst first found the order of the attributes, then asked for the ratio of the weights of the attributes. Knowing the ratio of attributes allowed the analyst to estimate the attribute weights. If the model has only three attributes, the weights for the attributes can be obtained by solving the following three equations:
W(Cost) / W(Compatibility with existing systems) = 3
W(Impact on business processes) / W(Cost) = 1.5
W(Compatibility with existing systems) + W(Cost) + W(Impact on business processes) = 1
An easy way to solve these three equations is to assign the least important attribute a weight of 10. The next attribute will then have a weight of 30, and the one after that a weight of 45. The last equation requires us to divide the estimated weights by the sum of all three (85), yielding relative weights of 0.12, 0.35 and 0.53.
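The same calculation can be sketched in code. The ratios (3 and 1.5) and the 10-point starting value are those from the dialogue; the function name `weights_from_ratios` and the way the ratios are passed in are assumptions made for this illustration.

```python
def weights_from_ratios(least_important, ratios, base_points=10):
    """Convert importance ratios into weights that sum to 1.

    ratios is an ordered list of (attribute, compared_to, times_more_important),
    where compared_to has already been assigned points.
    """
    points = {least_important: float(base_points)}
    for attribute, compared_to, ratio in ratios:
        points[attribute] = points[compared_to] * ratio
    total = sum(points.values())
    return {attribute: p / total for attribute, p in points.items()}


weights = weights_from_ratios(
    "compatibility with existing systems",
    [
        ("cost", "compatibility with existing systems", 3),
        ("impact on business processes", "cost", 1.5),
    ],
)

for attribute, weight in weights.items():
    print(f"{attribute}: {weight:.2f}")
```

The points work out to 10, 30 and 45, and dividing by their sum of 85 reproduces the weights 0.12, 0.35 and 0.53 given in the text.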
Using the attributes in Eden's 2002 survey, we identified the following relative weights for the attributes:
| Attribute | Weight |
| The software appeared easy to use | 0.17 |
| Software appeared to improve one or more of the business processes in the practice process. | 0.17 |
| The software provided the most value for cost | 0.15 |
| The software would help the practice perform processes needed to reach our long term business strategy | 0.14 |
| The vendor had many sites and was responsive to our needs during the selection process | 0.12 |
| There were strong testimonies from prior users | 0.10 |
| The software was already in use by other sites affiliated with this practice | 0.09 |
| Software was compatible with existing practice systems in the practice | 0.08 |
Table 1: Relative Weights for Attributes
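The weights in Table 1 add up to 1.02 rather than 1, presumably because each was rounded to two decimals. The sketch below rescales them before scoring. The weights are those in Table 1, but the shortened attribute labels, the two vendors, and all of their single-attribute values are hypothetical and serve only to show how the table would be applied.

```python
# Weights from Table 1, keyed by shortened (hypothetical) attribute labels.
table1_weights = {
    "easy to use": 0.17,
    "improves business processes": 0.17,
    "value for cost": 0.15,
    "supports long term strategy": 0.14,
    "vendor sites and responsiveness": 0.12,
    "testimonies from prior users": 0.10,
    "in use at affiliated sites": 0.09,
    "compatible with existing systems": 0.08,
}

# The rounded weights sum to about 1.02, so rescale them to sum to 1.
total = sum(table1_weights.values())
normalized = {a: w / total for a, w in table1_weights.items()}


def overall_score(values):
    """Weighted sum of single-attribute values (each on a 0 to 100 scale)."""
    return sum(normalized[a] * values[a] for a in normalized)


# Hypothetical single-attribute values, listed in the same order as the weights.
attributes = list(table1_weights)
vendors = {
    "Vendor X": dict(zip(attributes, [80, 70, 60, 50, 90, 40, 0, 100])),
    "Vendor Y": dict(zip(attributes, [60, 90, 70, 80, 50, 60, 100, 20])),
}

ranking = sorted(vendors, key=lambda v: overall_score(vendors[v]), reverse=True)
for name in ranking:
    print(name, round(overall_score(vendors[name]), 1))
```

Rescaling does not change the rank order of the vendors; it only keeps the overall scores on the same 0 to 100 scale as the single-attribute values.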
One characteristic of this estimation method is that its emphasis on the ratio of the importance of the attributes leads to relatively extreme weighting compared to other approaches. Thus, some attributes may be judged critical, and others rather trivial. Other approaches, especially the direct magnitude process (assign a weight between 0 and 100), may judge all attributes as almost equally important.
In choosing a method to estimate weights, you should consider several trade‑offs. You can introduce errors by asking decision makers awkward and partially understood questions, but you can also cause error with an easier, but formally less justified, method. Our preference is to estimate weights in several ways and use the resulting differences to help decision makers think more carefully about their real beliefs. In doing so, the analysts usually start with a rank order technique, then move on to assess ratios, obtain a direct magnitude estimate, identify discrepancies, and finally ask the decision maker to resolve them.
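Identifying the discrepancies between two weighting methods can be done mechanically before going back to the decision maker. In the sketch below, the ratio-based weights are the ones derived earlier in this section (0.12, 0.35, 0.53); the direct magnitude estimates (70, 80, 90), the 0.10 tolerance, and the function names are assumptions invented for the example.

```python
def normalize(points):
    """Rescale raw importance points so they sum to 1."""
    total = sum(points.values())
    return {attribute: p / total for attribute, p in points.items()}


def discrepancies(method_a, method_b, tolerance=0.10):
    """Attributes whose weights differ by more than the tolerance."""
    return sorted(
        attribute
        for attribute in method_a
        if abs(method_a[attribute] - method_b[attribute]) > tolerance
    )


# Weights from the ratio method, as derived in the text.
ratio_weights = {
    "compatibility": 0.12,
    "cost": 0.35,
    "impact on business processes": 0.53,
}

# Hypothetical direct magnitude estimates on a 0 to 100 scale.
direct_weights = normalize({
    "compatibility": 70,
    "cost": 80,
    "impact on business processes": 90,
})

to_discuss = discrepancies(ratio_weights, direct_weights)
print(to_discuss)
```

With these invented estimates, the direct magnitude weights are nearly equal, and the attributes flagged for discussion are the least and most important ones, where the ratio method produced the more extreme weights.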
One note of caution: Some scientists have questioned whether decision makers can describe how they weight attributes. Nisbett and Wilson (1977) argued that directly assessed weights may not reflect a decision maker's true beliefs. Yi (2004) found that patients' decisions depended on the method chosen to assess their preferences. Other investigators have reviewed the literature and found that directly assessing the relative importance of attributes is accurate (John and Edwards 1978, Naglie et al. 1997). The debate is many years old. The only way to decide if the directly assessed weights reflect the decision maker's opinions is to look at how the resulting models perform. In a number of applications, value models based on directly assessed weights correlated quite well with the subject's judgments (Fischer 1979). The typical correlation is in the upper .80s, which is high in comparison to most social science correlations. This success confirms the accuracy (perhaps one should say adequacy) of the subjective assessment techniques for constructing value models.
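A check of this kind compares the model's scores for a set of systems with the decision maker's own overall ratings of the same systems. The sketch below computes a Pearson correlation by hand; both lists of numbers are hypothetical and were made up for the illustration, so the resulting coefficient says nothing about any real model.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("each sequence must contain some variation")
    return covariance / math.sqrt(spread_x * spread_y)


# Hypothetical data: model scores and the decision maker's holistic ratings
# for the same five systems.
model_scores = [35, 50, 62, 70, 88]
holistic_ratings = [40, 45, 70, 65, 90]

r = pearson(model_scores, holistic_ratings)
print(round(r, 2))
```

A high correlation suggests the model tracks the decision maker's judgments; a low one is a cue to revisit the attributes, value functions, or weights.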
While researchers know the importance of carefully evaluating value models, analysts often lack the time and resources to do this. Because of the importance of having confidence in the models and being able to defend the analytical methodology, we will present several ways of testing the adequacy of value models (Gustafson et al. 1980).
Most value models are devised to apply to a particular context, and they are not portable to other settings or uses. This is called "context dependence." In general, it is viewed as a liability, but this is not always the case. For example, the vendor evaluation model is built for one organization, and applying its results to another organization would not be reasonable.
The value model should require only available data for input. Relying on obscure data may increase the model's accuracy at the expense of practicality. Thus, the vendor evaluation model cannot ask for information that is not available publicly or through the vendors. According to the survey by Eden, the sources of information for evaluating vendors include the following:
View software demonstrations
Issue a Request For Proposal and get information from the vendor
Compare software options with the best in the field
Conduct interviews with prior users
Make a site visit
Develop a decision analysis
The Leapfrog Group and others have also added an approach of creating simulated patients and scoring the performance of the system on these patients.
The model should be simple to use. The Index of Medical Underservice (Health Services Research Group 1975) is a good example of the importance of simplicity. This index, developed to help the federal government set priorities for funding HMOs, community health centers, and health facility development programs, originally had nine variables, but the director of the sponsoring federal agency rejected it because of the number of variables. Because he wanted to be able to "calculate the score on the back of an envelope," the index was reduced to four variables. The simplified version performed as well as the one with nine variables; it was used for eight years to help set nationwide funding priorities. This example shows that simplicity does not always equal incompetence. Simplicity nearly always makes an index easy to understand and use.
When different people apply the value model to the same situation, they must arrive at the same scores, which is referred to as inter‑rater reliability. In the example, different people evaluating the vendors using the same attributes should arrive at the same conclusions. If reasonable people using a value model reach different conclusions, then one loses confidence in the model's usefulness as a systematic method of evaluation. Inter-rater reliability is tested by having different people rate the performance of each vendor on each attribute.
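As a rough illustration of such a test, the sketch below compares two raters' scores for the same vendors on a single attribute. The ratings are invented, and the choice of a Pearson correlation plus a mean absolute difference is an assumption made for this example; the text does not prescribe a particular agreement statistic.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def mean_abs_difference(x, y):
    """Average size of the disagreement between two raters."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)


# Hypothetical ratings of four vendors on one attribute (0 to 100 scale).
rater_1 = [20, 50, 60, 80]
rater_2 = [25, 45, 65, 75]

agreement = pearson(rater_1, rater_2)
gap = mean_abs_difference(rater_1, rater_2)

print(f"correlation between raters: {agreement:.2f}")
print(f"mean absolute difference:   {gap:.1f}")
```

The correlation shows whether the raters order the vendors similarly, while the mean absolute difference catches the case where one rater is consistently more generous than the other, which a correlation alone would miss.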
The value model should also seem reasonable to decision makers, a property called face validity. Otherwise, even if it is accurate, it may not be accepted. Clinicians who are unfamiliar with statistics will likely rely on their experience to judge the model, meaning that the variables, weights, and value scores must seem reasonable and practical to them. Face validity is tested by showing the model to a new set of decision makers and asking whether they understand it and whether it is conceptually reasonable.
One way to establish the validity of a model is to show that it simulates the judgment of the decision makers; if one trusts the judgment of the decision maker, then one should also consider the model valid (Fryback 1976). In this approach, the decision maker is asked to score several (perhaps 100) hypothetical vendor profiles described only by the variables included in the model. If the model accurately predicts the decision maker's judgments, confidence in the model increases.
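A minimal sketch of this check follows. It assumes an additive value model (a weighted sum of attribute ratings), and the weights, the five vendor profiles, and the decision maker's holistic ratings are all invented for illustration; a real test would use many more profiles.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def model_score(profile, weights):
    """Additive value model: weighted sum of the ratings on each attribute."""
    return sum(weights[name] * profile[name] for name in weights)


# Hypothetical weights elicited from the decision maker (sum to 1).
weights = {"compatibility": 0.5, "ease_of_use": 0.3, "cost": 0.2}

# Hypothetical vendor profiles, each rated 0 to 100 on every attribute.
profiles = [
    {"compatibility": 90, "ease_of_use": 80, "cost": 40},
    {"compatibility": 30, "ease_of_use": 60, "cost": 90},
    {"compatibility": 70, "ease_of_use": 20, "cost": 50},
    {"compatibility": 10, "ease_of_use": 30, "cost": 20},
    {"compatibility": 50, "ease_of_use": 90, "cost": 70},
]

# Hypothetical overall ratings the decision maker gave to the same profiles.
judgments = [80, 45, 55, 20, 60]

scores = [model_score(p, weights) for p in profiles]
fit = pearson(scores, judgments)

print("model scores:", [round(s, 1) for s in scores])
print(f"correlation with the decision maker's judgments: {fit:.2f}")
```

A high correlation suggests that the model reproduces the decision maker's overall judgments; a low one points to missing attributes, poorly estimated weights, or a scoring rule that does not match how the decision maker actually combines the criteria.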
A model is considered valid if several different ways of measuring it lead to the same finding. This method of establishing validity is referred to as construct validity. For example, if two different vendor evaluation methods lead to the same conclusions, then confidence in both methods increases.
No projects are assigned at this time for this section.
Advanced learners like you often need different ways of understanding a topic. Reading is just one way of understanding. Another way is through writing about what you have read. The enclosed assessment is designed to get you to think more about the concepts taught in this section.
| | Compatibility with existing systems | Ease of use | Cost |
| System A | 20 | 20 | 50 |
| System B | 50 | 50 | 70 |
| System C | 60 | 10 | 90 |
| System D | 80 | 0 | 60 |
Table 2: Ratings of 4 Vendors on 3 Criteria