further circumstances, like the required precision, the available instruments, and the style of documentation. As a consequence, the predominance of such situations leads to the illusion of a ‘single-step measurement’: it is broadly assumed that one need only name a single variable or observable and then everything is clear.
By way of contrast, there are situations in which the definition phase, the identification of the context, can no longer be skipped and, furthermore, requires some labour and a methodical procedure. If, for instance, the similarity or dissimilarity between two structures from a given set is to be quantified, this requires the formulation of a suitable graph grammar that generates at least all the structures under consideration (Gernert, 1996). As soon as such a graph grammar has been specified, the requested similarity measure is clear. Now, the crucial point is that a graph grammar with the required properties always exists, but it is not uniquely defined. The necessity to select exactly one graph grammar from the multitude of all suitable ones corresponds to the compulsory specification of the context, such as the goal pursued by that individual measurement of similarity.
A characteristic feature of perspective notions is the necessity of a two-step procedure in which a definition phase comes first. A mathematical treatment is possible, but it differs from the customary style: a measure is no longer supplied by a single formula, nor by several formulas, but by a procedure in which, in addition, a mathematical structure must and can be set up that accounts for the peculiarity of the perspective notion. The above-mentioned illusion of a single-step measurement, the expectation that problem data can simply be inserted into a couple of formulas, turns out to be a widespread tacit assumption⁴. It seems plausible that this tacit assumption is one of the reasons why the concept of pragmatic information is accepted so reluctantly and hesitantly.
4. A systematic description of cognitive processes
4.1. Overview
The formalism for the description of cognitive processes which is to be developed here may be of interest for cognitive science, too, but this is not the primary goal. Rather, the formalism is intended as an intermediate step; the central issue is the analogy with operator algebras in quantum theory (Section 5). We presuppose a cognitive system of any kind, that is, a system capable of performing processes which can be interpreted as learning, concluding, forgetting, etc.; this system may be a human individual, an animal, or a technical device. Formally, the system is characterized by
1. states, which can vary in time and can be represented by vectors from a finite-dimensional real vector space $\mathbb{R}^n$ with a fixed positive integer $n$, and
2. transitions, which lead from one state to another and which are represented by operators.
In this sense, an operator stands for an elementary state change within the underlying system, e.g. for a single act of learning. Five classes of operators are proposed as follows:
1. Learning: $L = \{L_1, L_2, \ldots\}$
2. Conclusion: $C = \{C_1, C_2, \ldots\}$
3. Valuation: $V = \{V_1, V_2, \ldots\}$
4. Revision: $R = \{R_1, R_2, \ldots\}$
5. Tentative: $T = \{T_1, T_2, \ldots\}$
Some well-known types of cognitive processes, like forgetting, concept formation, or problem-solving, will not be recognized in this list, but it will be shown later (Section 4.3) that some characteristic cognitive processes can be represented by suitable combinations of these elementary operators. At least a great majority of all cognitive processes can be represented in this way.
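The formal setup above can be pictured with a small Python sketch. It is only an illustration under stated assumptions: the type `Operator`, the table `KINDS`, and the toy operator `L1` are hypothetical names introduced here, and the concrete action chosen for `L1` says nothing about how learning works in a real system.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

State = Tuple[float, ...]  # a state vector in R^n

# The five operator classes named in the text.
KINDS = {"L": "learning", "C": "conclusion", "V": "valuation",
         "R": "revision", "T": "tentative"}


@dataclass(frozen=True)
class Operator:
    """An elementary state change, tagged with its class (hypothetical type)."""
    kind: str                       # one of the keys of KINDS
    name: str                       # label such as "L1"
    act: Callable[[State], State]   # assumed action on a state vector

    def __post_init__(self) -> None:
        if self.kind not in KINDS:
            raise ValueError(f"unknown operator class: {self.kind!r}")

    def __call__(self, state: State) -> State:
        return self.act(state)


# Toy operator, assumed for illustration: one act of learning
# raises the first component of the state by 1.
L1 = Operator("L", "L1", lambda s: (s[0] + 1.0,) + s[1:])

s0: State = (0.0, 2.0)   # a state in R^2
s1 = L1(s0)
print(KINDS[L1.kind], s0, "->", s1)
```

Tagging each operator with its class keeps the five classes distinguishable when operators from different classes are later combined.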
⁴ Descartes planned to present a universal method for the solution of problems, which can be roughly outlined as follows: “First, reduce any kind of problem to a mathematical problem. Second, reduce any kind of mathematical problem to a problem of algebra. Third, reduce any problem of algebra to the solution of a single equation.” (Polya, 1962). In quoting this, Polya immediately attaches his reservations concerning the validity and reach of this general rule.
4.2. The fundamental operators and their properties in detail
4.2.1. Operators in class L: learning
Any cognitive system has a system environment. The operators in L (learning) describe processes by which a system accepts information from outside. The reception of some new information can lead to a change of the system behaviour or to a modification of its internal structure such that its repertoire for future behaviour is extended.
If the underlying system $S$ is fixed, we can write $L_i$ instead of $L_i S$; and $L_i L_k S$, which means that $S$ first undergoes the operation $L_k$ and then $L_i$, can be abridged as $L_i L_k$. In the general case this composition of operators is not commutative: $L_i L_k \neq L_k L_i$. As an illustration, two different tasks of learning are contrasted. If ‘serious’ material is to be learned, that is, material with an internal structure, then the temporal order of its presentation can be relevant, whereas in extreme cases of rote-learning the temporal order of input operations may be irrelevant. This crucial feature of noncommutativity will be discussed later (Section 6.1).
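The contrast between structured material and rote-learning can be made concrete with a toy model. The sketch below is a minimal illustration, not the formalism itself: the two-component state and the four functions are assumptions chosen so that one pair of operators depends on the order of application and the other pair does not.

```python
from typing import Callable, Tuple

State = Tuple[float, float]
Op = Callable[[State], State]


def compose(*ops: Op) -> Op:
    """compose(X, Y)(s) applies Y first and then X, as in the notation X Y S."""
    def run(state: State) -> State:
        for op in reversed(ops):
            state = op(state)
        return state
    return run


# 'Serious' material: the advanced lesson only pays off on top of the basics.
def learn_basics(s: State) -> State:        # plays the role of L_k
    return (s[0] + 1.0, s[1])


def learn_advanced(s: State) -> State:      # plays the role of L_i
    return (s[0], s[1] + s[0])              # gain grows with the basics already known


# Rote-learning: two unrelated items, each stored independently.
def memorize_a(s: State) -> State:
    return (s[0] + 1.0, s[1])


def memorize_b(s: State) -> State:
    return (s[0], s[1] + 1.0)


s0: State = (0.0, 0.0)
basics_first = compose(learn_advanced, learn_basics)(s0)     # L_i L_k S
advanced_first = compose(learn_basics, learn_advanced)(s0)   # L_k L_i S
rote_ab = compose(memorize_a, memorize_b)(s0)
rote_ba = compose(memorize_b, memorize_a)(s0)
print(basics_first, advanced_first, rote_ab == rote_ba)
```

In this toy model the two orders of the structured lessons end in different states, while the two rote items end in the same state whichever comes first.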
4.2.2. Operations in class C: conclusion
Operations in class C, written as $C_i, C_k, \ldots$, denote conclusions derived from entries already existing within the system. We write $C_i C_k$ (short for $C_i C_k S$) for the fact that the conclusion $C_k$ is performed first, with the knowledge contributed by $C_k$ being stored in the system, and $C_i$ is performed afterwards.
It would be irrelevant from a logical viewpoint which of two possible conclusions is achieved first. Here, however, only ‘realistic’ systems are considered, such that labour, time, or energy consumed play a role, and hence commutativity can no longer be maintained. The new findings obtained by the conclusion operation $C_k$ can simplify the subsequent operation $C_i$ significantly, whereas no such reduction of labour may occur if both operators are applied in the inverse order. Therefore, in the general case, $C_i C_k \neq C_k C_i$. Which of the many possible operators in C will really be activated may be triggered by a process of valuation as described in the next section.
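The following sketch illustrates how the order of two conclusions can leave the resulting knowledge unchanged and still alter the effort spent. Everything in it is assumed for illustration: the `System` record, the fact names, and the cost constants are hypothetical, and the second operator is assumed to store only its final result, not its intermediate work.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class System:
    facts: FrozenSet[str]
    effort: int = 0      # accumulated labour in arbitrary units


LEMMA_COST = 3   # hypothetical cost of deriving the lemma from scratch
FINAL_COST = 1   # hypothetical cost of the last inference step


def conclude_lemma(s: System) -> System:       # plays the role of C_k
    if "lemma" in s.facts:
        return s
    return System(s.facts | {"lemma"}, s.effort + LEMMA_COST)


def conclude_theorem(s: System) -> System:     # plays the role of C_i
    if "theorem" in s.facts:
        return s
    # Without a stored lemma the same work is redone internally,
    # and only the theorem itself is stored afterwards.
    extra = 0 if "lemma" in s.facts else LEMMA_COST
    return System(s.facts | {"theorem"}, s.effort + extra + FINAL_COST)


s0 = System(frozenset({"axiom"}))
lemma_first = conclude_theorem(conclude_lemma(s0))     # C_i C_k S
theorem_first = conclude_lemma(conclude_theorem(s0))   # C_k C_i S
print(lemma_first.facts == theorem_first.facts,
      lemma_first.effort, theorem_first.effort)
```

Because the effort counter is part of the state, the two orders lead to different states although the set of stored facts is the same.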
4.2.3. Operations in class V: valuation
An important class of operations occurring within cognitive systems can be united under the term ‘valuation’. The class V includes the operators $V_i, V_k, \ldots$ The object of a single process of valuation can be
• a single item of knowledge already stored in the system,
• the present state of the system (when it is checked, e.g., whether a solution to a certain problem has already been found),
• a recent state change (caused e.g. by some new incoming information),
• a series of recent state changes (e.g. the result of a series of conclusion operations).
The result of an act of valuation can be
• a predicate, like ‘true’/‘false’, ‘relevant’/‘irrelevant’, etc.,
• a mathematical object, like a number, a vector, a matrix, a function, a network, or a system of relations,
• the identification of an item of information which fulfills given requirements.
There is a variety of reasons why valuation operators are necessary, and different purposes are pursued by them:
• In ‘realistic’ systems a distinction must be made between relevant and irrelevant parts of incoming information.
• In a similar way it must be decided whether the result of a series of conclusion operations is to be stored or not.
• Among the items of knowledge already present in the system, those must be identified and selected which are likely to fit a given task.
• It must be decided whether a certain strategy makes sense or by which different one it should be replaced.
• It must be recognized whether an operation called ‘revision’ (Section 4.2.4) or ‘tentative’ (Section 4.2.5) becomes necessary, or at least useful, and which will be the proper side conditions for such an operation.
Valuation means that an object is confronted with a predefined standard. In simple cases a procedure takes certain features of the given object as its input and supplies, e.g., an index, a score, or a Boolean value like ‘acceptable’/‘not acceptable’. In the general case, however, the outcome of a valuation process is not necessarily a single value. Rather, it can take on the shape of a vector, a matrix, a function, etc., which represents the discrepancies between the ideal standard and the real situation with respect to several criteria (deviation profile). The result of a valuation process may also point forward to actions to be taken.
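The difference between a single-valued verdict and a deviation profile can be sketched as follows. This is a minimal illustration: the function names, the criteria `accuracy` and `cost`, and the numbers are assumptions made up for the example.

```python
from typing import Dict


def deviation_profile(observed: Dict[str, float],
                      standard: Dict[str, float]) -> Dict[str, float]:
    """General case: one discrepancy per criterion (real situation minus standard)."""
    return {criterion: observed[criterion] - standard[criterion]
            for criterion in standard}


def acceptable(profile: Dict[str, float], tolerance: float) -> bool:
    """Simple case: collapse the whole profile into one Boolean value."""
    return all(abs(delta) <= tolerance for delta in profile.values())


# Hypothetical standard and observation.
standard = {"accuracy": 0.75, "cost": 10.0}
observed = {"accuracy": 0.5, "cost": 12.0}

profile = deviation_profile(observed, standard)
print(profile, acceptable(profile, tolerance=1.0))
```

The profile keeps the information on which criterion is violated and by how much, which a single Boolean value discards; this is what allows the result to point forward to actions to be taken.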
A necessary tool for many valuation processes is the measurement of the similarity or dissimilarity between two complex structures (see Section 3.2). For example, incoming information must be compared with the available information, in order to avoid redundant entries, but also with the information requirements of the system, in order to exclude irrelevant information. The search for some information that is likely to fit a certain task can be an issue of similarity measurement, too.
Just like the operators in L and C, the operators in V are noncommutative in the general case: $V_i V_k \neq V_k V_i$. If, e.g., $V_i$ has identified and selected a set of objects with required properties, then the subsequent operation $V_k$ can focus on just these objects; the overall effort or efficiency may depend on the order of both operators.
4.2.4. Operators in class R: revision
Every ‘realistic’, and hence finite, system has a limited ability to accept, to store, and to process information. Therefore such a system is forced to economize on these limited capacities. If new knowledge is permanently accepted and accumulated, if numerous results of conclusions are considered worth storing, then a revision of the underlying representation scheme will become compulsory from time to time in order to maintain an efficient usage of capacities.
The following examples show two different situations, but also two techniques for a transition to a new representation scheme:
1. If there is a simple finite graph with a relatively small number of edges, then it is reasonable to represent it by the list of its pairs of connected vertices. But if, step by step, new edges are inserted, then there will be a critical point beyond which it will be more advantageous to store the graph as its adjacency matrix.
2. A series of measurement data can be stored as a long list of pairs $(x_i, y_i)$, but also by a short string or code representing an approximating formula like $y = ax$ or $y = a \log x$.
In the first example the transitions from one representation to the other and back are reversible. By way of contrast, the second example stands for the frequent situation that a change of representation implies a loss of information: the original version can no longer be reconstructed from the ‘condensed’ form (in category theory the term ‘forgetful functors’ is used).
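Both examples can be tried out in a few lines of Python. The sketch below is an illustration under assumptions: the function names are hypothetical, the number of vertices is taken as known, and an ordinary least-squares fit through the origin stands in for whatever method would produce the approximating formula $y = ax$.

```python
from typing import List, Set, Tuple

Edge = Tuple[int, int]   # (u, v) with u < v, vertices numbered from 0


def to_adjacency(n: int, edges: Set[Edge]) -> List[List[int]]:
    """Reversible change of representation: edge list -> adjacency matrix."""
    matrix = [[0] * n for _ in range(n)]
    for u, v in edges:
        matrix[u][v] = matrix[v][u] = 1
    return matrix


def to_edge_list(matrix: List[List[int]]) -> Set[Edge]:
    """The way back: adjacency matrix -> edge list, nothing is lost."""
    n = len(matrix)
    return {(u, v) for u in range(n) for v in range(u + 1, n) if matrix[u][v]}


def condense(data: List[Tuple[float, float]]) -> float:
    """Lossy change: replace the data by the least-squares slope a of y = a*x."""
    return sum(x * y for x, y in data) / sum(x * x for x, _ in data)


edges = {(0, 1), (1, 2), (0, 3)}
round_trip = to_edge_list(to_adjacency(4, edges))

data = [(1.0, 2.0), (2.0, 4.5), (3.0, 5.5)]   # hypothetical measurements
a = condense(data)
restored = [(x, a * x) for x, _ in data]
print(round_trip == edges, restored == data)
```

The round trip through the adjacency matrix returns the original edge list, whereas the pairs rebuilt from the slope alone differ from the measured ones.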
A theoretical framework is supplied by a concept named belief revision, knowledge revision, or theory change, which is now pursued in an interdisciplinary effort in philosophy, logic, and computer science (Rott, 1996). Computer scientists mainly study the problem of how to revise a body of knowledge if updating must be performed under capacity restrictions (Fuhrmann and Morreau, 1991; Wrobel, 1994) and the problem of consistency maintenance in knowledge-based systems. Logic and philosophy, however, focus on the necessary modifications of an ensemble of propositions provoked by new, frequently incompatible information⁵.
The transition from a bulk of original data to a formula (the second example above) can be regarded as a primitive type of theory formation.
Under a unifying view we find a quasi-continuous transition from
• a merely ‘technical’ revision, which is enforced by capacity restrictions (the first example above) and permits conversions in both directions without any loss of information, at one end of a scale, to
• a fundamental revision of an established theory (a paradigm change) at the opposite end of that scale.
We can use the terms ‘weak revision’ and ‘strong revision’ for these two extreme cases, provided that the quasi-continuous transition between them will be kept in mind.
⁵ For a recent multidisciplinary overview of belief revision see Gabbay and Smets (1998).
The physical deletion of entries represented in the ‘old’ style will not be compulsory in all cases. In special cases it can make sense to store an entry both in the old and in the shorter new representation, such that both versions can be used alternatively (Section 4.3).
Here the operators in R, written as $R_i, R_k, \ldots$, denote a ‘revision’, that is, a transition to a different representation scheme. Apart from special cases⁶ these operators, again, are noncommutative: $R_i R_k \neq R_k R_i$. Each of the two operators can entail its specific loss of information, and the situation found by the operator acting later can have been significantly altered by the operator which had acted first.
4.2.5. Operators in class T: tentative
In many methods of heuristic problem-solving (heuristic programming, genetic algorithms, etc.) sequences of patterns or configurations are generated in a tentative manner. It is the purpose of these attempts to eventually find a configuration fitting given requirements, or to proceed from a preliminary solution to a better one (even if an optimum cannot always be guaranteed). Such patterns can be generated, apart from random processes, by sets of recursive rules, as for instance rewriting rules, graph grammars, shape grammars⁷, or the transition rules typical of genetic algorithms.
An example may be a possible heuristic approach to the travelling salesman problem. Given a list of cities together with the distances between each pair of them, an optimal route is to be found that visits each city exactly once and (as assumed here for the sake of simplicity) finally leads back to the starting-point. A tentative configuration may be a closed loop which, however, does not yet include all cities and hence must be expanded by a stepwise inclusion of further cities, or a closed loop through all cities which is not yet optimal, but can be improved by a series of local exchange operations.
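The two kinds of tentative configuration described above can be sketched with two standard heuristics that are swapped in here for illustration and are not named in the text: a cheapest-insertion step for expanding an incomplete loop, and a 2-opt style segment reversal as the local exchange operation. City names and coordinates are made up, and distances are assumed to be Euclidean rather than given as a table.

```python
import math
from itertools import combinations
from typing import Dict, List, Tuple

Coords = Dict[str, Tuple[float, float]]


def dist(coords: Coords, a: str, b: str) -> float:
    return math.dist(coords[a], coords[b])


def tour_length(tour: List[str], coords: Coords) -> float:
    """Length of the closed loop, including the leg back to the start."""
    return sum(dist(coords, tour[i], tour[(i + 1) % len(tour)])
               for i in range(len(tour)))


def insert_city(tour: List[str], city: str, coords: Coords) -> List[str]:
    """Expand an incomplete loop by one city, placed where the loop grows least."""
    def added_length(i: int) -> float:
        a, b = tour[i], tour[(i + 1) % len(tour)]
        return dist(coords, a, city) + dist(coords, city, b) - dist(coords, a, b)
    best = min(range(len(tour)), key=added_length)
    return tour[:best + 1] + [city] + tour[best + 1:]


def exchange_step(tour: List[str], coords: Coords) -> List[str]:
    """One local exchange: return the first segment reversal that is
    strictly shorter, or the tour unchanged if none is found."""
    current = tour_length(tour, coords)
    for i, j in combinations(range(len(tour)), 2):
        candidate = tour[:i + 1] + tour[i + 1:j + 1][::-1] + tour[j + 1:]
        if tour_length(candidate, coords) < current - 1e-12:
            return candidate
    return tour


# Hypothetical cities on the corners of a unit square.
square: Coords = {"A": (0, 0), "B": (1, 0), "C": (1, 1), "D": (0, 1)}
expanded = insert_city(["A", "B", "C"], "D", square)    # incomplete loop
improved = exchange_step(["A", "C", "B", "D"], square)  # crossing loop
print(expanded, improved, tour_length(improved, square))
```

Each call generates exactly one new configuration from the previous one, so a sequence of such calls is a sequence of tentative steps.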
Formally, the class T (tentative) consists of all operators $T_1, T_2, \ldots$; each operator stands for a process by which exactly one new pattern is generated. If two production rules of a graph grammar or a related system are applied one after another, the result strongly depends on the order of execution, and in the general case two operators in T are noncommutative: $T_i T_k \neq T_k T_i$.
4.3. The question of completeness
As mentioned before, there are well-known types of cognitive processes that cannot be found in the list of ‘elementary operators’ (Section 4.2). Of course, a proof of ‘completeness’, i.e. of the possibility to describe every cognitive process by a combination of operators of the five kinds proposed here, is excluded, but there is at least some plausibility that a significant part of the field is covered⁸.
For cognitive processes of some essential types it can be shown that such a representation is possible. Conclusions (C) are regularly necessary, and new information from outside (L) can intervene; hence operators in C or in L are not always explicitly addressed.
A ubiquitous type that should not be forgotten is forgetting. Some information stored in the system can be deleted as a consequence of a new valuation of its relevance. This can be a by-product of a revision, and the deletion of a single item may be considered a boundary case of revision. The operators in V and R are sufficient to represent forgetting.
⁶ Two operators $R_i$ and $R_k$ may be ‘independent’ if they modify disjoint sets of entries.
⁷ A survey of graph grammars and some applications can be found in Gernert (1997); for shape grammars see the monograph by Gips (1975).
⁸ No metaphysical assumptions are made here. In particular, it is not assumed that human beings can be completely described as ‘information-processing systems’ or something like that; nor is it intended to join the debate on limitations of computers. If some class of cognitive processes were identified that cannot be represented by combinations of the operators proposed here, this would be a positive result, too.
Three central types of cognitive processes (classification, concept formation, and pattern recognition) will be treated jointly. They are mutually connected, and they have in common that perspective notions and similarity measures (Section 3.2) play a dominant role. All objects that are assigned to the same class within a given classification scheme are connected by their ‘intra-class similarity’, and the same holds for all objects subsumed under the same concept. If no classification scheme is previously defined, as in a method termed ‘cluster analysis’, then solely the similarity measure and the given overall number of classes will steer the assignment of the objects to equal or different classes. To each of those classes a newly created term can be assigned afterwards; here again we find a hint of the affinity between classification and concept formation.
Pattern recognition can be understood as the identification of those structures which have a sufficient similarity with one or another element from a predefined set of ‘standard patterns’. For example, character recognition means that a single scribble is identified (if possible) with the best fitting letter from a given alphabet.
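Character recognition in this sense can be sketched as a search for the most similar standard pattern, with a threshold expressing ‘sufficient similarity’. The sketch rests on assumptions: the tiny bitmaps, the names `STANDARD`, `similarity`, and `recognize`, and the threshold value are hypothetical, and a simple cell-by-cell agreement score stands in for the graph-grammar-based similarity measure of Section 3.2.

```python
from typing import Dict, Optional, Tuple

Pattern = Tuple[str, ...]   # rows of a 3x3 bitmap, '#' = ink, '.' = blank

# Hypothetical set of standard patterns (a three-letter 'alphabet').
STANDARD: Dict[str, Pattern] = {
    "I": ("###", ".#.", "###"),
    "L": ("#..", "#..", "###"),
    "T": ("###", ".#.", ".#."),
}


def similarity(a: Pattern, b: Pattern) -> float:
    """Fraction of cells on which two bitmaps of equal size agree."""
    cells = [(x, y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b)]
    return sum(x == y for x, y in cells) / len(cells)


def recognize(scribble: Pattern, threshold: float = 0.8) -> Optional[str]:
    """Return the best fitting letter, or None if nothing is similar enough."""
    best = max(STANDARD, key=lambda letter: similarity(scribble, STANDARD[letter]))
    return best if similarity(scribble, STANDARD[best]) >= threshold else None


noisy_t: Pattern = ("##.", ".#.", ".#.")   # a 'T' with one missing cell
blank: Pattern = ("...", "...", "...")
print(recognize(noisy_t), recognize(blank))
```

Returning `None` below the threshold mirrors the qualification ‘if possible’: a scribble is not forced onto a letter it does not sufficiently resemble.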
Concept formation can essentially take on one of the following shapes:
1. As already mentioned, after a process of classification a characterizing notion can be assigned to each of the classes.
2. It can be recognized that part of the objects in a certain class have a characteristic feature in common and hence can be subsumed under a new notion. For example, some materials share a ‘medium conductivity’ and thus are named ‘semiconductors’.
3. If a variable remains constant in spite of variations of other variables or of the overall system state, then that special variable may be given a marked name. The most important example is the term ‘energy’, which became a scientific term through the discovery of the corresponding conservation principle.
To sum up, processes of classification, pattern recognition, and concept formation can be managed mainly by operators from V and R.
Processes of problem-solving show a permanent interplay of preliminary, tentative steps and valuations of the proposals generated in that way. There must be a chance that a series of tentative steps can be totally discarded and a new attempt can be made from a different starting-point or with a new series of tentative steps (backtracking). In this context, it can sometimes make sense to temporarily store some information in more than one representation simultaneously (e.g. in the original and a condensed version, see Section 4.2.4). Problem-solving processes can essentially be built up by operators from V and T.
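The interplay of tentative steps, valuation, and backtracking can be illustrated on a deliberately small task: picking numbers that add up to a target. The task, the function names, and the three verdict labels are assumptions made for this sketch, and the numbers are assumed to be positive integers.

```python
from typing import List, Optional


def valuate(partial: List[int], target: int) -> str:
    """Valuation of a proposal: compare the partial sum with the target."""
    total = sum(partial)
    if total == target:
        return "solved"
    return "promising" if total < target else "dead end"


def solve(numbers: List[int], target: int,
          partial: Optional[List[int]] = None, start: int = 0) -> Optional[List[int]]:
    """Depth-first search: tentative step, valuation, backtracking on failure."""
    partial = [] if partial is None else partial
    verdict = valuate(partial, target)
    if verdict == "solved":
        return partial
    if verdict == "dead end":
        return None                      # discard this series of tentative steps
    for i in range(start, len(numbers)):
        # Tentative step: extend the proposal by one further number.
        found = solve(numbers, target, partial + [numbers[i]], i + 1)
        if found is not None:
            return found
    return None                          # nothing worked from this starting-point


print(solve([8, 6, 3, 1], 9), solve([4, 6], 5))
```

In the first call the proposals 8 + 6 and 8 + 3 are valued as dead ends and discarded before 8 + 1 is accepted; the second call exhausts all starting-points and reports that no solution exists.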
4.4. Fundamental operators acting in parallel
For the sake of simplicity, parallel processes have not been addressed until now. Let $X, Y, Z, \ldots$ denote operators from any of the five classes introduced above (Section 4.2). If two operators $X$ and $Y$ act in parallel, this can again be understood as an operator⁹, and that operator will be written as $X + Y$. In contrast to the sequential composition, this composition is always commutative: $X + Y = Y + X$. Of course, a concrete implementation of a system with parallel processing would require regulations concerning the relative independence of the components working in parallel and their possible interactions, but this would not contribute to the purpose of this paper. Some relevant aspects will be discussed in Section 5.4.
5. The algebras A and A defined by classes of operators