CHAPTER I (continued)

Basis for and outline of the Refutation Approach to the problem of Speech Recognition, based on Popper's solution to the Problem of Induction (Hume's Problem)

1.5. Basis for an approach to the speech recognition problem

Debates in the philosophy of science have raised a variety of problems concerning the validity of Popper's solution to the problem of induction (e.g. see Schilpp 1974). For example, the test statements referred to above are singular existential statements (Popper 1980 pp. 100-103), and they must be inter-subjectively testable by observation (ibid). However, such basic statements are themselves open to testing, and hence the process is in principle open to infinite regress (ibid pp. 104-105).

Popper limits the scope of his argument to the logical problem of induction:

"..., the central issue of the logical problem of induction is the validity (truth or falsity) of universal laws relative to some 'given' test statements. I do not raise the question, 'How do we decide the truth or falsity of test statements?', that is, of singular descriptions of observable events. The latter question should not, I suggest, be regarded as part of the problem of induction, since Hume's question was whether we are justified in reasoning from experienced to unexperienced 'instances'. Neither Hume nor any other writer on the subject before me has to my knowledge moved on from here to the further questions: Can we take the 'experienced instances' for granted? And are they really prior to the theories? Although these further questions are some of those problems to which I was led by my solution to the problem of induction, they go beyond the original problem" (Popper 1981 p.8 emphases in the original).

One might think that it is precisely the problem of determining the truth or falsity of singular descriptions of events which is at issue when we consider the problem of perception, i.e. is it true that I am looking at what I think I am looking at? But that is not so. An individual's perceptions take place in the context of that individual's own internal model of the world. This model is the individual's theory of how the world is in general and how it is at the present moment. Each individual makes moment-to-moment hypotheses about what is happening in his or her environment according to this model. These hypotheses give rise to expectations regarding the sensory inputs to be received. It is the values of these sensory inputs which, within the individual, constitute the basic observation statements, and it is these values which either are or are not as they were predicted to be. If they are not as predicted, then there is something amiss with the person's model of the current environment, and it needs to be revised to take account of the unexpected values. That is, the problem of infinite regress does not occur if the theory purports to be a full explanation of everything that there is to be understood about the world (Duhem 1954 pp. 183-188).

The above arguments provide the logical justification for a method in automatic speech recognition here referred to as the "Refutation" approach.

1.6. Outline of the Refutation Approach to Speech Recognition

There are two necessary aspects to developing a description of a physical phenomenon for use as a recognition procedure.

The first is that it must be of a form easy for researchers to visualise and discuss. The second is that the description must have measurable consequences which may be tested by the machine.

Spectrogram readers describe speech spectrograms in terms of formant trajectories, onsets of voicing, times of release of plosives and so on. All these terms refer to phenomena which have definite times of occurrence and durations, and which occupy definite regions of the frequency spectrum. As is well known, the automatic identification of the correlates of those phenomena has been a hard problem to solve.

The Refutation approach cuts through this Gordian knot. The approach does not aim to provide a "template" or any other form of reference object that is to be compared with the signal to ascertain the similarity between it and the referent. Instead, the approach uses the hypothesised times and locations of these phenomena as sets of reference points from which to predict energy distributions to be tested by measurement.

That is, and this is the key concept to be grasped, the elements which we refer to as, for instance, formants are the perceptual correlates of complex theoretical mental processes which are endeavouring to make sense of the energy distributions in the spectrogram in terms of vocal tract characteristics and activity. The formants are defined by that theoretical mental endeavour and have no necessary physical existence outside of it. So there is no physical entity with which the signal can be compared for identification purposes. What there is instead is the set of measurable consequences of those theoretical mental processes, and it is these that are used to test hypothesised instances of those theoretical entities.

The Refutation Approach is as follows.

1. Describe a given sequence of vocal tract articulations in terms of entities such as onsets of voicing, formant midpoint trajectories, times of release of stops and plosives etc. In this description include limits on the values of the times of occurrence or frequencies occupied by these entities, for instance the limits on the interval between the release of /t/ and the onset of voicing.

2. Write a computer program to instantiate systematically, for every time t, every combination of variable values (within a given level of resolution/accuracy of representation) that falls within the prescribed limits. Each combination then constitutes a hypothesised instance of a particular articulatory sequence.

3. Define relations between sets of measurements to be made relative to the values of each hypothesis. These relations express the expectations arising out of the theory/description of the articulatory sequences under test.

4. Run the computer program over the data. The survival or otherwise of the hypotheses represents the times at which, to the best of the researcher's ability to describe the articulations in question, those articulations did or did not occur. The extent to which the results correspond with what actually happened is a measure of the success of the description and the quality of the testing.
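The four steps above can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions, not the procedure described in chapter 2: the spectrogram is a small hand-made grid of energy values, and the names (`VOT_FRAMES`, `F1_BINS`, `hypotheses`, `survives`, `run`), the limits, the `ratio` threshold and the three tested relations are all hypothetical choices made for the example.

```python
from itertools import product

# Step 1 (assumed description): limits on the variables of a /t/ + vowel
# sequence. The numbers are illustrative, not taken from the thesis.
VOT_FRAMES = range(2, 5)   # allowed release-to-voicing interval, in frames
F1_BINS = range(1, 4)      # allowed frequency bins for the first formant


def hypotheses(n_frames):
    """Step 2: yield every (release time, interval, formant bin) within the limits."""
    for t, vot, f1 in product(range(1, n_frames), VOT_FRAMES, F1_BINS):
        if t + vot < n_frames:      # the voicing onset must lie inside the data
            yield (t, vot, f1)


def survives(spec, hyp, ratio=2.0):
    """Step 3: check the energy relations predicted by one hypothesis.

    spec[frame][bin] is an energy value. The hypothesis is refuted if any
    one of the predicted relations fails.
    """
    t, vot, f1 = hyp
    onset = t + vot
    checks = (
        # The release frame carries a burst of energy relative to the frame before.
        sum(spec[t]) > ratio * sum(spec[t - 1]),
        # Energy in the formant bin at voicing onset exceeds that bin's
        # energy in every frame between release and onset.
        all(spec[onset][f1] > ratio * spec[k][f1] for k in range(t, onset)),
        # At voicing onset the formant bin is a local peak in frequency.
        spec[onset][f1] > max(spec[onset][f1 - 1], spec[onset][f1 + 1]),
    )
    return all(checks)


def run(spec):
    """Step 4: run every hypothesis over the data and keep the survivors."""
    return [h for h in hypotheses(len(spec)) if survives(spec, h)]


# Toy data: 8 frames x 5 frequency bins of background energy, a burst at
# frame 2 in the top bin, and voicing with a formant in bin 2 from frame 5.
spec = [[1.0] * 5 for _ in range(8)]
spec[2][4] = 9.0
for frame in (5, 6, 7):
    spec[frame][2] = 8.0

survivors = run(spec)
print(survivors)  # [(2, 3, 2)]
```

Note that nothing in the sketch is compared with a stored template. Each hypothesis only fixes reference points (a release time, an onset time, a frequency bin), and the tests are relations between measurements taken relative to those points.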

1.7. Structure of remaining chapters

An example Refutation-based procedure is described in chapter 2 and the results are discussed in chapter 3. Chapter 4 discusses techniques which in many ways anticipate some of the technical aspects of the Refutation approach. Chapter 5 compares the Refutation approach with past approaches to the problem of automatic speech recognition. Chapter 6 presents a description of a speaker-independent, isolated-word, Refutation-based digit recogniser. In conclusion, chapter 7 suggests areas for future research.




Page last modified: 11:57 Monday 7th. November 2011