Philosophical foundations
Positivism, observation and theory
Before addressing particular aspects of evaluation theory it
is important to locate the role of theory in evaluation within the broader set
of debates within the philosophy of science. The dominant school, much
criticised but of continuing influence in the way we understand the world, is
logical positivism. Despite being largely discredited in academic circles for
some 50 years, this school still holds sway in policy debates. It constitutes
the base model around which variants are positioned. With a history that
stretches back to Comte, Hume, Locke, Hobbes and Mill, positivism emerged
partly as a reaction to metaphysical explanations: the idea that there was an
'essence' of a phenomenon that could be distinguished from its appearance. At the heart of
positivism therefore, is a belief that it is possible to obtain objective
knowledge through observation and that such knowledge is verified by statements
about the circumstances in which such knowledge is true.
In the field of evaluation, House (1983) has discussed this
tradition under the label of objectivism: 'Evaluation information is considered
to be "scientifically objective". This objectivity is achieved by using
"objective" instruments like tests or questionnaires. Presumably,
results produced with these instruments are reproducible. The data are analysed
by quantitative techniques which are also "objective" in the sense
that they can be verified by logical inspection regardless of who uses the
techniques.' (House, 1983; p. 51).
House goes on to emphasise that part of the objectivist
tradition that he calls 'methodological individualism', found in Mill's work in
particular. Thus, repeated observation of individual phenomena is the way to
identify uniformity within a category of phenomena. This is one important
strand in the mainstream of explanations within the social and economic
sciences. It is the basis for reductionism: the belief that it is possible to
understand the whole by investigating its constituent parts. 'By methodological
individualism, I mean whatever methodologically useful doctrine is asserted in
the vague claim that social explanations should be ultimately reducible to
explanations in terms of people's beliefs, dispositions, and situations. [...]
It is a working doctrine of most economists, political scientists, and
political historians in North America and Britain.' (Miller, 1991; p.
749). In this world-view, explanations
rest on the aggregation of individual elements and their behaviours and
interactions. It is worth noting that this has been described as a 'doctrine'
as well as a methodological statement. It underpins many of the survey based
and economic models that are used in evaluation.
There is now widespread agreement that empirical work cannot rely only on
observations. There are difficulties in empirically observing the entirety of
any phenomenon; all description is partial and incomplete, with important
unobservable elements. 'Scientists must be understood as engaged in a
metaphysical project whose very rules are irretrievably determined by
theoretical conceptions regarding largely unobservable phenomena.' (Boyd, 1991;
p. 12). This is even more true for mechanisms, which, it is generally
recognised, can be imputed but not observed. As Boyd goes on to say, it is an
important fact, now universally accepted, that many or all of the central
methods of science are theory dependent.
This recognition of the theory dependence of
all scientific inquiry underpins the now familiar critiques of logical
positivism, even though there is considerable difference between the alternatives
that the critics of positivism advocate. The two most familiar critiques of
positivism are scientific realism and constructivism.
Scientific realism
Scientific realism, while acknowledging the limits of what
we can know about phenomena, asserts that theory describes real features of a
not fully observable world. Not all realists are the same, and the European
tradition, currently inspired mainly by the work of Pawson (Pawson and
Tilley, 1997; Pawson, 2002a and b), can be distinguished in various ways from US
realist thinking. For example, some prominent North American realists
commenting on Pawson and Tilley’s work have questioned the extent to which
realists need completely to reject experimental and quasi-experimental designs,
and suggest that more attention should be paid in the realist project to
values. This is especially important if, in addition to explanation, realists
are to influence decisions (Julnes et al., 1998). Nonetheless, this chapter
draws mainly on the work of Pawson and Tilley to describe the realist position
in evaluation. In some ways realism continues the positivist project: it too
seeks explanation and believes in the possibility of accumulating reliable
knowledge about the real world, albeit through different methodological
spectacles. According to Pawson and Tilley, it seeks to open the ‘black-box’
within programmes or policies to uncover the mechanisms that account for what
brings about change. It does so by situating such mechanisms in contexts and
attributing to contexts the key to what makes mechanisms work or not work. This
is especially important in domains such as VET where the evaluation objects are
varied and drawn from different elements into different configurations in
differentiated contexts.
“What we want to resist here is the notion that programs are
targeted at subjects and that as a consequence program efficacy is simply a
matter of changing the individual subject.” (Pawson and Tilley, 1997; p. 64).
Rather than accept a logic that sees programmes and policies
as simple chains of cause and effect, they are better seen as embedded in
multilayered (or stratified) social and organisational processes. Evaluators
need to focus on ‘underlying mechanisms’: those decisions or actions that lead
to change, which is embedded in a broader social reality. However these
mechanisms are not uniform or consistent even within a single programme.
Different mechanisms come into play in different contexts, which is why some
programmes or policy instruments work in some, but not all, situations.
Like all those interested in causal inference, realists are
also interested in making sense of patterns or regularities. These are not seen
at the level of some programme level aggregation but rather at the underlying
level where mechanisms operate. As Pawson and Tilley (1997; p. 71) note:
'regularity = mechanism + context'. Outcomes are the results of mechanisms
unleashed by particular programmes. It is the mechanisms that bring about
change, and any programme will probably rely on more than one mechanism, not
all of which may be evident to programme architects or policy-makers.
As Pawson and Tilley summarise the logic of realist explanation:
'The basic task of social inquiry is to explain interesting, puzzling, socially
significant regularities (R). Explanation takes the form of positing some
underlying mechanism (M) which generates the regularity and thus consists of
propositions about how the interplay between structure and agency has
constituted the regularity. Within realist investigation there is also
investigation of how the workings of such mechanisms are contingent and
conditional, and thus only fired in particular local, historical or
institutional contexts (C).' (Pawson and Tilley, 1997; p. 71).
Applying this logic to VET, we may note, for example, that
subsidies to increase work-based learning and CVT in firms sometimes lead to
greater uptake by the intended beneficiaries. The fact that positive outcomes
can be observed in, for example, only 30 % of cases need not lead to an
assessment of the programme as ineffective. We try rather to understand the
mechanisms and contexts which lead to success. Is the context one where firms
showing positive outcomes are in a particular sector or value chain or type of
region? Or is it more to do with the skill composition of the firms concerned?
Are the mechanisms that work in these contexts effective because a previous
investment has been made in work-based learning at the firm level, or is it
because of the local or regional training infrastructure? Which mechanisms are
at play and in what context:
(a) the competitive instincts of managers (mechanism), who fear that their
competitors will benefit (context) unless they also increase their CVT efforts?
(b) the demands of trade unions concerned about the professionalisation and
labour-market strength of their members (mechanism), sparked off by their
awareness of the availability of subsidies (context)?
(c) the increased effectiveness of the marketing efforts of training providers
(mechanism) made possible by the subsidies they have received (context)?
According to the realists, it is by examining and comparing
the mechanisms and contexts in which they operate in relation to observed
outcomes, that it becomes possible to understand success and describe it. For
Pawson and Tilley, all revolves around these CMO (context, mechanism, outcome)
configurations.
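For readers who find it helpful to see the CMO logic operationalised, the schema can be sketched as a simple data structure. This is an illustrative sketch only: the cases, contexts and mechanisms below are hypothetical inventions, not data from the text, and the grouping step merely mimics how an evaluator might compare configurations against observed outcomes.

```python
from collections import defaultdict

# Hypothetical CMO (context-mechanism-outcome) observations from an
# imagined CVT subsidy programme; each tuple is one evaluated case.
cases = [
    ("competitive sector", "managers' competitive instincts", True),
    ("competitive sector", "managers' competitive instincts", True),
    ("public sector", "managers' competitive instincts", False),
    ("unionised firms", "trade union demands", True),
    ("non-unionised firms", "trade union demands", False),
]

# Group outcomes by (context, mechanism) pair, so that success can be
# read off per configuration rather than per programme aggregate.
success_rates = defaultdict(list)
for context, mechanism, outcome in cases:
    success_rates[(context, mechanism)].append(outcome)

for (context, mechanism), outcomes in success_rates.items():
    rate = sum(outcomes) / len(outcomes)
    print(f"{mechanism} | {context}: {rate:.0%} positive")
```

The point of the sketch is only that the unit of analysis is the configuration, not the programme: the same mechanism succeeds in one context and fails in another, which is exactly the pattern the realist evaluator looks for.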
Policy-makers are then in a position to consider options
such as:
(a) focusing the programme more narrowly on beneficiaries
that are likely to change because of the mechanisms that work in the contexts
they inhabit;
(b) differentiating a programme and its instruments
more clearly to ensure that different
mechanisms that work in different contexts are adequately covered;
(c) seeking to
influence the contexts within which the programme aims to be effective. The
table below, taken from the concluding chapter of Pawson and Tilley’s book,
provides a brief summary of the realist position, in terms of eight ‘rules’
that are seen as encapsulating the key ideas of realistic enquiry and method.
Rules guiding realistic enquiry and method

Rule 1: Generative causation. Evaluators need to attend to how and why social programmes have the potential to cause change.

Rule 2: Ontological depth. Evaluators need to penetrate beneath the surface of the observable inputs and outputs of a programme.

Rule 3: Mechanisms. Evaluators need to focus on how the causal mechanisms which generate social and behavioural problems are removed or countered through the alternative causal mechanisms introduced in a social programme.

Rule 4: Contexts. Evaluators need to understand the contexts within which problem mechanisms are activated and in which programme mechanisms can be successfully fired.

Rule 5: Outcomes. Evaluators need to understand what are the outcomes of an initiative and how they are produced.

Rule 6: CMO configurations. In order to develop transferable and cumulative lessons from research, evaluators need to orient their thinking to context-mechanism-outcome pattern configurations (CMO configurations).

Rule 7: Teacher-learner processes. In order to construct and test context-mechanism-outcome pattern explanations, evaluators need to engage in a teacher-learner relationship with programme policy-makers, practitioners and participants.

Rule 8: Open systems. Evaluators need to acknowledge that programmes are implemented in a changing and permeable social world, and that programme effectiveness may thus be subverted or enhanced through the unanticipated intrusion of new contexts and new causal powers.

Source: Adapted from Pawson and Tilley (1997)
Constructivism
Constructivists deny the possibility of objective knowledge
about the world. They follow more in the tradition of Kant and other
continental European philosophers than the mainly Anglo-Saxon school that
underpins positivism and realism. It is only through the theorisations of the
observer that the world can be understood. 'Socially constructed causal and
metaphysical phenomena are, according to the constructivist, real. They are as
real as anything scientists can study ever gets. The impression that there is
some sort of socially unconstructed reality that is somehow deeper than the
socially constructed variety rests, the constructivist maintains, on a failure
to appreciate the theory-dependence of all our methods. The only sort of
reality any of our methods are good for studying is a theory dependent
reality.' (Boyd, 1991; p. 13). The way we know, whatever the instruments and
methods we use, is constructed by human actors or stakeholders.
According to Stufflebeam in his review of foundation models for 21st century
program evaluation: 'Constructivism rejects the existence of any ultimate
reality and employs a subjectivist epistemology. It sees knowledge gained as
one or more human constructions, uncertifiable, and constantly problematic and
changing. It places the evaluators and program stakeholders at the centre of
the inquiry process, employing all of them as the evaluation's "human
instruments". The approach insists that evaluators be totally ethical in
respecting and advocating for all the participants, especially the
disenfranchised.' (Stufflebeam, 2000a; pp. 71-72). The most
articulate advocates of constructivism in evaluation are Guba and Lincoln. They
have mapped out the main differences between constructivists and the
conventional position (as they label positivists) in their well-known text
Fourth generation evaluation (Guba and Lincoln, 1989). The highlights of this
comparison are summarised below:

According to Guba and Lincoln, when considering the purpose
of evaluations, one needs to distinguish both between merit and worth and
between summative and formative intent:
(a) a formative merit evaluation is one concerned with assessing the intrinsic
value of some evaluand with the intent of improving it; so, for example, a
proposed new curriculum could be assessed for modernity, integrity, continuity,
sequence, and so on, for the sake of discovering ways in which those
characteristics might be improved;
(b) a formative worth evaluation is one concerned with assessing the extrinsic
value of some evaluand with the intent of improving it; so, for example, a
proposed new curriculum could be assessed for the extent to which desired
outcomes are produced in some actual context of application, for the sake of
discovering ways in which its performance might be improved;
(c) a summative merit evaluation is one concerned with assessing the intrinsic
value of some evaluand with the intent of determining whether it meets some
minimal (or normative or optimal) standard for modernity, integrity, and so on.
A positive evaluation results in the evaluand being warranted as meeting its
internal design specifications;
(d) a summative worth evaluation is one concerned with assessing the extrinsic
value of some evaluand for use in some actual context of application. A
positive evaluation results in the evaluand being warranted for use in that
context. (Guba and Lincoln, 1989; pp. 189-190).
In practical terms, when considering what it is that the evaluator should do,
Guba and Lincoln start from the claims, concerns and issues that are identified
by stakeholders, 'people who are put at some risk by the evaluation'. It is
therefore necessary for evaluators to be 'responsive': 'One of the major tasks
for the evaluator is to conduct the evaluation in such a way that each group
must confront and deal with the constructions of all others, a process we shall
refer to as hermeneutic dialectic. [...] Ideally responsive evaluation seeks to
reach consensus on all claims, concerns and issues [...]' (Guba and Lincoln,
1989; p. 41).
A distinctive role of the evaluator, therefore, is to help
put together 'hermeneutic circles'. This is defined by Guba and Lincoln as a
process that brings together divergent views and seeks to interpret and
synthesise them, mainly to allow their mutual exploration by all parties (Guba
and Lincoln, 1989; p. 149). As Schwandt has argued from a postmodernist
standpoint, 'only through situated use in discursive practices or language
games do human actions acquire meaning' (Schwandt, 1997; p. 69). Applied to
evaluation, this position argues for the importance of the 'dialogic encounter'
in which evaluators are 'becoming partners in an ethically informed, reasoned
conversation about essentially contested concepts [...]' (Schwandt, 1997; p.
79). In more down-to-earth terms, Guba and
Lincoln emphasise the role of the evaluator to:
(a) prioritise those unresolved claims, concerns and issues of stakeholders
that have survived earlier rounds of dialogue orchestrated by the evaluator;
(b) collect information through a variety of means: collating the results of
other evaluations, reanalysing the information previously generated in dialogue
among stakeholders, and conducting further studies that may lead to the
'reconstruction' of understandings among stakeholders;
(c) prepare and carry out negotiations that, as far as possible and within the
resources available, resolve that which can be resolved and (possibly) identify
new issues that the stakeholders wish to take further in another evaluation
round.
So how might this be exemplified
in the VET domain? It should be noted that what follows does not fully conform
to Guba and Lincoln's vision of constructivist evaluation, largely because it
is situated in a larger scale socioeconomic policy context than many of their
own smaller scale case examples. It should also be noted that
constructivist thinking is, to some extent, relevant to many contemporary
evaluation challenges and the example below is intended to illustrate such
potential relevance. So, to apply this
logic to VET, constructivist thinking can be especially helpful where there is
a problem area with many stakeholders and the entire system will only be able
to progress if there is a broad consensus. For example, there may be a
political desire to become more inclusive and involve previously marginalised
groups in training opportunities. The problem is how to ensure that certain
groups such as women, young people and ethnic communities are given a higher
profile in VET. Here, the involvement of many stakeholders will be inevitable.
Furthermore the views of these stakeholders are more than data for the
evaluator: they are the determinants and shapers of possible action and change.
Unless the trainers, employers, advocacy groups, funding authorities and
employment services responsible for job-matching and the groups being targeted
cooperate, change will not occur. It is also likely that these stakeholders
hold vital information and insights into the past experience of similar
efforts; what went wrong and right and what could be done to bring about
improvements in the future. The evaluator might then follow much of the
constructivist logic outlined above:
(a) identify the different stakeholders who potentially have a stake in these
areas of concern;
(b) conduct a series of initial discussions to clarify what they know, what
they want and what their interests are;
(c) feed back to all stakeholders their own and each other's interests,
knowledge and concerns in a way that emphasises the similarities and
differences;
(d) clarify areas of agreement and disagreement and initiate discussions among
the stakeholders and their representatives to clarify areas of consensus and
continuing dissent;
(e) agree what other sources of information could help move the stakeholders
forward, perhaps by synthesising other available studies, perhaps by initiating
new studies;
(f) reach the best possible consensus about what should be done to improve VET
provision and participation for the groups concerned.
It is worth highlighting that the balance of activities within constructivist
evaluation is very different from both positivist and realist variants. It
emphasises the responsive, interactive, dialogic and 'orchestrating' role of
the evaluator, because the sources of data that are privileged are seen to
reside with stakeholders as much as with new studies and externally generated
data.