Philosophical foundations
Positivism, observation and theory
Before addressing particular aspects of evaluation theory it
is important to locate the role of theory in evaluation within the broader set
of debates within the philosophy of science. The dominant school, much
criticised but of continuing influence in the way we understand the world, is
logical positivism. Despite being largely discredited in academic circles for
some 50 years, this school still holds sway in policy debates. It constitutes
the base model around which variants are positioned. With a history that
stretches back to Comte, Hume, Locke, Hobbes and Mill, positivism emerged
partly as a reaction to metaphysical explanations: the idea that there was an
'essence' of a phenomenon that could be distinguished from its appearance. At the heart of
positivism therefore, is a belief that it is possible to obtain objective
knowledge through observation and that such knowledge is verified by statements
about the circumstances in which such knowledge is true.
In the field of evaluation, House (1983) has discussed this
tradition under the label of objectivism: 'Evaluation information is considered
to be "scientifically objective". This objectivity is achieved by using
"objective" instruments like tests or questionnaires. Presumably,
results produced with these instruments are reproducible. The data are analysed
by quantitative techniques which are also "objective" in the sense
that they can be verified by logical inspection regardless of who uses the
techniques.' (House, 1983; p. 51).
House goes on to emphasise that part of the objectivist
tradition that he calls 'methodological individualism', found in Mill's work in
particular. Thus, repeated observation of individual phenomena is the way to
identify uniformity within a category of phenomena. This is one important
strand in the mainstream of explanations within the social and economic
sciences. It is the basis for reductionism: the belief that it is possible to
understand the whole by investigating its constituent parts. 'By methodological
individualism, I mean whatever methodologically useful doctrine is asserted in
the vague claim that social explanations should be ultimately reducible to
explanations in terms of people's beliefs, dispositions, and situations. [...]
It is a working doctrine of most economists, political scientists, and
political historians in North America and Britain.' (Miller, 1991; p.
749). In this world-view, explanations
rest on the aggregation of individual elements and their behaviours and
interactions. It is worth noting that this has been described as a 'doctrine'
as well as a methodological statement. It underpins many of the survey based
and economic models that are used in evaluation.
There is now widespread agreement that empirical work cannot rely only on
observations. There are difficulties in empirically observing the entirety of
any phenomenon; all description is partial and incomplete, with important
unobservable elements. 'Scientists must be understood as engaged in a
metaphysical project whose very rules are irretrievably determined by
theoretical conceptions regarding largely unobservable phenomena.' (Boyd, 1991;
p. 12). This is even more true for mechanisms, which, it is generally
recognised, can be imputed but not observed. As Boyd goes on to say, it is an
important fact, now universally accepted, that many or all of the central
methods of science are theory dependent.
This recognition of the theory dependence of
all scientific inquiry underpins the now familiar critiques of logical
positivism, even though there is considerable difference between the alternatives
that the critics of positivism advocate. The two most familiar critiques of
positivism are scientific realism and constructivism.
Scientific realism
Scientific realism, while acknowledging the limits of what
we can know about phenomena, asserts that theory describes real features of a
not fully observable world. Not all realists are the same, and the European
tradition, currently inspired mainly by the work of Pawson (Pawson and
Tilley, 1997; Pawson, 2002a and b), can be distinguished in various ways from US
realist thinking. For example, some prominent North American realists
commenting on Pawson and Tilley’s work have questioned the extent to which
realists need completely to reject experimental and quasi-experimental designs,
and suggest that more attention should be paid in the realist project to
values. This is especially important if, in addition to explanation, realists
are to influence decisions (Julnes et al., 1998). Nonetheless, this chapter
draws mainly on the work of Pawson and Tilley to describe the realist position
in evaluation. In some ways realism continues the positivist project: it too
seeks explanation and believes in the possibility of accumulating reliable
knowledge about the real world, albeit through different methodological
spectacles. According to Pawson and Tilley, it seeks to open the ‘black-box’
within programmes or policies to uncover the mechanisms that account for what
brings about change. It does so by situating such mechanisms in contexts and
attributing to contexts the key to what makes mechanisms work or not work. This
is especially important in domains such as VET where the evaluation objects are
varied and drawn from different elements into different configurations in
differentiated contexts.
“What we want to resist here is the notion that programs are
targeted at subjects and that as a consequence program efficacy is simply a
matter of changing the individual subject.” (Pawson and Tilley, 1997; p. 64).
Rather than accept a logic that sees programmes and policies
as simple chains of cause and effect, they are better seen as embedded in
multilayered (or stratified) social and organisational processes. Evaluators
need to focus on ‘underlying mechanisms’: those decisions or actions that lead
to change, which is embedded in a broader social reality. However these
mechanisms are not uniform or consistent even within a single programme.
Different mechanisms come into play in different contexts, which is why some
programmes or policy instruments work in some, but not all, situations.
Like all those interested in causal inference, realists are
also interested in making sense of patterns or regularities. These are not seen
at the level of some programme level aggregation but rather at the underlying
level where mechanisms operate. As Pawson and Tilley (1997; p. 71) note:
'regularity = mechanism + context'. Outcomes are the results of mechanisms
unleashed by particular programmes. It is the mechanisms that bring about
change, and any programme will probably rely on more than one mechanism, not
all of which may be evident to programme architects or policy-makers.
As Pawson and Tilley summarise the logic of realist explanation:
'The basic task of social inquiry is to explain interesting, puzzling, socially
significant regularities (R). Explanation takes the form of positing some
underlying mechanism (M) which generates the regularity and thus consists of
propositions about how the interplay between structure and agency has
constituted the regularity. Within realist investigation there is also
investigation of how the workings of such mechanisms are contingent and
conditional, and thus only fired in particular local, historical or
institutional contexts (C).' (Pawson and Tilley, 1997; p. 71).
Applying this logic to VET, we may note, for example, that
subsidies to increase work-based learning and CVT in firms sometimes lead to
greater uptake by the intended beneficiaries. The fact that positive outcomes
can be observed in, for example, only 30 % of cases need not lead to an
assessment of the programme as ineffective. We try rather to understand the
mechanisms and contexts which lead to success. Is the context one where firms
showing positive outcomes are in a particular sector or value chain or type of
region? Or is it more to do with the skill composition of the firms concerned?
Are the mechanisms that work in these contexts effective because a previous
investment has been made in work-based learning at the firm level, or is it
because of the local or regional training infrastructure? Which mechanisms are
at play and in what context:
(a) the competitive instincts of managers (mechanism), who fear that their
competitors will benefit (context) unless they also increase their CVT efforts?
(b) the demands of trade unions concerned about the professionalisation and
labour-market strength of their members (mechanism), sparked off by their
awareness of the availability of subsidies (context)?
(c) the increased effectiveness of the marketing efforts of training providers
(mechanism) made possible by the subsidies they have received (context)?
According to the realists, it is by examining and comparing
the mechanisms and contexts in which they operate in relation to observed
outcomes, that it becomes possible to understand success and describe it. For
Pawson and Tilley, all revolves around these CMO (context, mechanism, outcome)
configurations.
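For readers who find it helpful to see the CMO logic operationalised, the schema can be sketched as a simple data structure. This is an illustrative sketch only: the cases, contexts and mechanisms below are hypothetical inventions, not data from the text, and the grouping step merely mimics how an evaluator might compare configurations against observed outcomes.

```python
from collections import defaultdict

# Hypothetical CMO (context-mechanism-outcome) observations from an
# imagined CVT subsidy programme; each tuple is one evaluated case.
cases = [
    ("competitive sector", "managers' competitive instincts", True),
    ("competitive sector", "managers' competitive instincts", True),
    ("public sector", "managers' competitive instincts", False),
    ("unionised firms", "trade union demands", True),
    ("non-unionised firms", "trade union demands", False),
]

# Group outcomes by (context, mechanism) pair, so that success can be
# read off per configuration rather than per programme aggregate.
success_rates = defaultdict(list)
for context, mechanism, outcome in cases:
    success_rates[(context, mechanism)].append(outcome)

for (context, mechanism), outcomes in success_rates.items():
    rate = sum(outcomes) / len(outcomes)
    print(f"{mechanism} | {context}: {rate:.0%} positive")
```

The point of the sketch is only that the unit of analysis is the configuration, not the programme: the same mechanism succeeds in one context and fails in another, which is exactly the pattern the realist evaluator looks for.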
Policy-makers are then in a position to consider options
such as:
(a) focusing the programme more narrowly on beneficiaries
that are likely to change because of the mechanisms that work in the contexts
they inhabit;
(b) differentiating a programme and its instruments
more clearly to ensure that different
mechanisms that work in different contexts are adequately covered;
(c) seeking to
influence the contexts within which the programme aims to be effective. The
table below, taken from the concluding chapter of Pawson and Tilley’s book,
provides a brief summary of the realist position, in terms of eight ‘rules’
that are seen as encapsulating the key ideas of realistic enquiry and method.
Rules guiding realistic enquiry and method

Rule 1: Generative causation. Evaluators need to attend to how and why social programmes have the potential to cause change.

Rule 2: Ontological depth. Evaluators need to penetrate beneath the surface of the observable inputs and outputs of a programme.

Rule 3: Mechanisms. Evaluators need to focus on how the causal mechanisms which generate social and behavioural problems are removed or countered through the alternative causal mechanisms introduced in a social programme.

Rule 4: Contexts. Evaluators need to understand the contexts within which problem mechanisms are activated and in which programme mechanisms can be successfully fired.

Rule 5: Outcomes. Evaluators need to understand what are the outcomes of an initiative and how they are produced.

Rule 6: CMO configurations. In order to develop transferable and cumulative lessons from research, evaluators need to orient their thinking to context-mechanism-outcome pattern configurations (CMO configurations).

Rule 7: Teacher-learner processes. In order to construct and test context-mechanism-outcome pattern explanations, evaluators need to engage in a teacher-learner relationship with programme policy-makers, practitioners and participants.

Rule 8: Open systems. Evaluators need to acknowledge that programmes are implemented in a changing and permeable social world, and that programme effectiveness may thus be subverted or enhanced through the unanticipated intrusion of new contexts and new causal powers.

Source: Adapted from Pawson and Tilley (1997)
Constructivism
Constructivists deny the possibility of objective knowledge
about the world. They follow more in the tradition of Kant and other
continental European philosophers than the mainly Anglo-Saxon school that
underpins positivism and realism. It is only through the theorisations of the
observer that the world can be understood. 'Socially constructed causal and
metaphysical phenomena are, according to the constructivist, real. They are as
real as anything scientists can study ever gets. The impression that there is
some sort of socially unconstructed reality that is somehow deeper than the
socially constructed variety rests, the constructivist maintains, on a failure
to appreciate the theory-dependence of all our methods. The only sort of
reality any of our methods are good for studying is a theory dependent
reality.' (Boyd, 1991; p. 13). The way we know, whatever the instruments and
methods we use, is constructed by human actors or stakeholders.
According to Stufflebeam in his review of foundation models for 21st century
program evaluation: 'Constructivism rejects the existence of any ultimate
reality and employs a subjectivist epistemology. It sees knowledge gained as
one or more human constructions, uncertifiable, and constantly problematic and
changing. It places the evaluators and program stakeholders at the centre of
the inquiry process, employing all of them as the evaluation's "human
instruments". The approach insists that evaluators be totally ethical in
respecting and advocating for all the participants, especially the
disenfranchised.' (Stufflebeam, 2000a; pp. 71-72). The most
articulate advocates of constructivism in evaluation are Guba and Lincoln. They
have mapped out the main differences between constructivists and the
conventional position (as they label positivists) in their well-known text
Fourth generation evaluation (Guba and Lincoln, 1989). The highlights of this
comparison are summarised below:

According to Guba and Lincoln, when considering the purpose
of evaluations, one needs to distinguish both between merit and worth and
between summative and formative intent:
(a) a formative merit evaluation is one concerned with assessing the intrinsic
value of some evaluand with the intent of improving it; so, for example, a
proposed new curriculum could be assessed for modernity, integrity, continuity,
sequence, and so on, for the sake of discovering ways in which those
characteristics might be improved;
(b) a formative worth evaluation is one concerned with assessing the extrinsic
value of some evaluand with the intent of improving it; so, for example, a
proposed new curriculum could be assessed for the extent to which desired
outcomes are produced in some actual context of application, for the sake of
discovering ways in which its performance might be improved;
(c) a summative merit evaluation is one concerned with assessing the intrinsic
value of some evaluand with the intent of determining whether it meets some
minimal (or normative or optimal) standard for modernity, integrity, and so on.
A positive evaluation results in the evaluand being warranted as meeting its
internal design specifications;
(d) a summative worth evaluation is one concerned with assessing the extrinsic
value of some evaluand for use in some actual context of application. A
positive evaluation results in the evaluand being warranted for use in that
context. (Guba and Lincoln, 1989; pp. 189-190).
In practical terms, when considering what it is that the evaluator should do,
Guba and Lincoln start from the claims, concerns and issues that are identified
by stakeholders, 'people who are put at some risk by the evaluation'. It is
therefore necessary for evaluators to be 'responsive': 'One of the major tasks
for the evaluator is to conduct the evaluation in such a way that each group
must confront and deal with the constructions of all others, a process we shall
refer to as hermeneutic dialectic. [...] Ideally responsive evaluation seeks to
reach consensus on all claims, concerns and issues [...]' (Guba and Lincoln,
1989; p. 41).
A distinctive role of the evaluator, therefore, is to help
put together 'hermeneutic circles'. This is defined by Guba and Lincoln as a
process that brings together divergent views and seeks to interpret and
synthesise them, mainly to allow their mutual exploration by all parties (Guba
and Lincoln, 1989; p. 149). As Schwandt has argued from a postmodernist
standpoint, 'only through situated use in discursive practices or language
games do human actions acquire meaning' (Schwandt, 1997; p. 69). Applied to
evaluation, this position argues for the importance of the 'dialogic encounter'
in which evaluators are 'becoming partners in an ethically informed, reasoned
conversation about essentially contested concepts [...]' (Schwandt, 1997; p.
79). In more down-to-earth terms, Guba and
Lincoln emphasise the role of the evaluator to:
(a) prioritise those unresolved claims, concerns and issues of stakeholders
that have survived earlier rounds of dialogue orchestrated by the evaluator;
(b) collect information through a variety of means: collating the results of
other evaluations, reanalysing the information previously generated in dialogue
among stakeholders, and conducting further studies that may lead to the
'reconstruction' of understandings among stakeholders;
(c) prepare and carry out negotiations that, as far as possible and within the
resources available, resolve that which can be resolved and (possibly) identify
new issues that the stakeholders wish to take further in another evaluation
round.
So how might this be exemplified
in the VET domain? It should be noted that what follows does not fully conform
to Guba and Lincoln's vision of constructivist evaluation, largely because it
is situated in a larger scale socioeconomic policy context than many of their
own smaller scale case examples. It should also be noted that
constructivist thinking is, to some extent, relevant to many contemporary
evaluation challenges and the example below is intended to illustrate such
potential relevance. So, to apply this
logic to VET, constructivist thinking can be especially helpful where there is
a problem area with many stakeholders and the entire system will only be able
to progress if there is a broad consensus. For example, there may be a
political desire to become more inclusive and involve previously marginalised
groups in training opportunities. The problem is how to ensure that certain
groups such as women, young people and ethnic communities are given a higher
profile in VET. Here, the involvement of many stakeholders will be inevitable.
Furthermore the views of these stakeholders are more than data for the
evaluator: they are the determinants and shapers of possible action and change.
Unless the trainers, employers, advocacy groups, funding authorities and
employment services responsible for job-matching and the groups being targeted
cooperate, change will not occur. It is also likely that these stakeholders
hold vital information and insights into the past experience of similar
efforts; what went wrong and right and what could be done to bring about
improvements in the future. The evaluator might then follow much of the
constructivist logic outlined above:
(a) identify the different stakeholders who potentially have a stake in these
areas of concern;
(b) conduct a series of initial discussions to clarify what they know, what
they want and what their interests are;
(c) feed back to all stakeholders their own and each other's interests,
knowledge and concerns in a way that emphasises the similarities and
differences;
(d) clarify areas of agreement and disagreement and initiate discussions among
the stakeholders and their representatives to clarify areas of consensus and
continuing dissent;
(e) agree what other sources of information could help move the stakeholders
forward, perhaps by synthesising other available studies, perhaps by initiating
new studies;
(f) reach the best possible consensus about what should be done to improve VET
provision and participation for the groups concerned.
It is worth highlighting that the balance of activities within constructivist
evaluation is very different from both positivist and realist variants. It
emphasises the responsive, interactive, dialogic and 'orchestrating' role of
the evaluator, because the sources of data that are privileged are seen to
reside with stakeholders as much as with new studies and externally generated
data.