ers who are unfamiliar with the area. Following that, the remainder of this section discusses the grouping
of ANM technologies into functional classes and the team approach.
3. The soft computing technologies
This section gives a cursory overview of the soft computing technologies: NNs, FL and GAs. The reader
is referred to the references for more detail on each of these technologies.
3.1. Neural networks

NNs (Bishop, 1995) are software programs that emulate the biological structure of the human brain and its associated neural complex, and are used for pattern classification, prediction and financial analysis, and control and optimization. The core of an NN is the neural processing unit, a representation of which is shown in Fig. 2.

Fig. 2. Neural processing unit.
The inputs to the neuron, x_j, are multiplied by their respective weights, w_j, and aggregated. The weight w_0 serves the same function as the intercept in a regression formula. The weighted sum is then passed through an activation function, F, to produce the output of the unit. Often, the activation function takes the form of the logistic function F(z) = (1 + e^{-z})^{-1}, where z = Σ_j w_j x_j, as shown in the figure.
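To make this concrete, a minimal Python sketch of the processing unit is given below; the numerical values are illustrative and not from the paper.

```python
import math

def neuron_output(x, w):
    """Single neural processing unit: weighted sum of the inputs
    passed through the logistic activation F(z) = 1 / (1 + e^(-z))."""
    z = sum(wj * xj for wj, xj in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))

# Illustrative values; x[0] = 1 acts as the bias (intercept) input,
# so w[0] plays the role of the regression intercept.
x = [1.0, 0.5, -0.2]
w = [0.1, 0.4, 0.3]
print(neuron_output(x, w))  # a value in (0, 1)
```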
NNs can be either supervised or unsupervised. The distinguishing feature of a supervised NN is that both its inputs and outputs are known, and its objective is to discover a relationship between the two. Insurance applications using supervised NNs include Tu (1993), who compared NNs and logistic regression models for predicting length of stay in the intensive care unit following cardiac surgery, and Brockett et al. (1994), who sought to improve the early warning signals associated with property-liability insurance company insolvency. The distinguishing feature of an unsupervised NN is that only the input is known and the goal is to uncover patterns in the features of the input data. Insurance applications involving unsupervised NNs include Jang (1997), who investigated insolvencies in the life insurance industry, and Brockett et al. (1998), who investigated automobile bodily injury claims fraud. The remainder of this section is devoted to an overview of supervised and unsupervised NNs.
3.1.1. Supervised neural networks

A sketch of the operation of a supervised NN is shown in Fig. 3. Since supervised learning is involved, the system will attempt to match a known output, such as firms that have become insolvent or claims which are fraudulent. The process begins by assigning random weights to the connections between each pair of neurons in the network. These weights represent the intensity of the connection between any two neurons and contain the memory of the network. Given the weights, the intermediate values (a hidden layer) and the output of the system are computed. If the output is optimal, the process is halted; if not, the weights are adjusted and the process is continued until an optimal solution is obtained or an alternate stopping rule is reached.
If the flow of information through the network is from the input to the output, it is known as a feedforward network. The NN is said to involve back-propagation since inadequacies in the output are fed back through the network so that the weights can be improved.
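The following skeleton sketches this train-until-adequate loop; the forward, adjust, and stopping_rule helpers are hypothetical placeholders supplied by the caller, and the demo values are invented.

```python
import random

def train_network(data, forward, adjust, stopping_rule, n_weights, seed=0):
    """Generic supervised-training skeleton. forward(w, x) computes the
    network output, adjust(w, x, error) returns updated weights, and
    stopping_rule(errors) decides when the outputs match well enough."""
    rng = random.Random(seed)
    w = [rng.uniform(-0.5, 0.5) for _ in range(n_weights)]  # random initial weights
    while True:
        errors = []
        for x, target in data:
            output = forward(w, x)      # compute hidden values and output
            error = target - output     # inadequacy in the output
            w = adjust(w, x, error)     # fed back to adjust the weights
            errors.append(error)
        if stopping_rule(errors):       # optimal output or alternate stopping rule
            return w

# Minimal usage with a linear "network" and a fixed error tolerance:
data = [([1.0, 0.0], 0.3), ([1.0, 1.0], 0.8)]
w = train_network(
    data,
    forward=lambda w, x: sum(wi * xi for wi, xi in zip(w, x)),
    adjust=lambda w, x, e: [wi + 0.1 * e * xi for wi, xi in zip(w, x)],
    stopping_rule=lambda errs: max(abs(e) for e in errs) < 1e-3,
    n_weights=2,
)
print(w)
```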
Fig. 3. The operation of a supervised NN.
Fig. 4. Three-layer neural network.
3.1.2. A three-layer neural network

An NN is composed of layers of neurons, an example of which is the three-layer NN depicted in Fig. 4. Extending the notation associated with Fig. 2, the first layer, the input layer, has three neurons labeled x_{0j}, j = 0, 1, 2; the second layer, the hidden processing layer, has three neurons labeled x_{1j}, j = 0, 1, 2; and the third layer, the output layer, has one neuron labeled x_{21}. There are two inputs, I1 and I2.

The neurons are connected by the weights w_{ijk}, where the subscripts i, j, and k refer to the ith layer, the jth node of the ith layer, and the kth node of the (i+1)th layer, respectively. Thus, for example, w_{021} is the weight connecting node 2 of the input layer (layer 0) to node 1 of the hidden layer (layer 1). It follows that the aggregation in the neural processing associated with the hidden neuron x_{11} results in z = x_{00} w_{001} + x_{01} w_{011} + x_{02} w_{021}, which is the input to the activation function.
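A small sketch of this aggregation for the hidden neuron x_11 follows; the weights and inputs are made up, and it is assumed (as the x_00 labeling suggests) that node 0 of each layer is a bias input fixed at 1.

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

# Layer-0 node values: x0[0] is the bias node (fixed at 1),
# x0[1] and x0[2] carry the inputs I1 and I2 (illustrative values).
x0 = {0: 1.0, 1: 0.7, 2: 0.3}

# Weights w[(i, j, k)]: from node j of layer i to node k of layer i+1
# (illustrative values).
w = {(0, 0, 1): 0.2, (0, 1, 1): -0.5, (0, 2, 1): 0.8}

# Aggregation for x_11: z = x00*w001 + x01*w011 + x02*w021.
z = sum(x0[j] * w[(0, j, 1)] for j in range(3))
x11 = logistic(z)  # z is the input to the activation function
print(x11)
```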
3.1.3. The learning rules

The weights of the network serve as its memory, and so the network “learns” when its weights are updated. The updating is done using a learning rule, a common example of which is the Delta rule (Shepherd, 1997, p. 15), under which the weight change is the product of a learning rate, which controls the speed of convergence, an error signal, and the value associated with the jth node of the ith layer. The choice of the learning rate is critical: if its value is too large, the error term may not converge at all, and if it is too small, the weight updating process may get stuck in a local minimum and/or be extremely time intensive.
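As a sketch, the Delta rule update for a single weight is simply the stated product; the numerical values below are illustrative.

```python
def delta_rule_update(w, learning_rate, error_signal, node_value):
    """Delta rule: the weight change is the product of the learning rate,
    the error signal, and the value of the node feeding the weight."""
    return w + learning_rate * error_signal * node_value

# Illustrative values: too large a learning rate risks non-convergence,
# too small a one makes training extremely slow.
print(delta_rule_update(w=0.4, learning_rate=0.05, error_signal=0.2, node_value=0.7))
```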
3.1.4. The learning strategy of a neural network
The characteristic feature of NNs is their ability to learn, and the strategy by which this takes place involves training, testing, and validation. Briefly, the clean and scrubbed data is randomly subdivided into three subsets: T1, which is used for training the network; T2, which is used for testing the stopping rule; and T3, which is used for validating the resulting network. For example, T1, T2 and T3 may be 50%, 25% and 25% of the database, respectively. The stopping rule reduces the likelihood that the network will become overtrained, by stopping the training on T1 when the predictive ability of the network, as measured on T2, is no longer improved.
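A minimal sketch of the 50/25/25 split and of one possible stopping rule (the patience window is an invented detail):

```python
import random

def split_data(records, seed=0):
    """Randomly subdivide the data into training (T1, 50%),
    testing (T2, 25%) and validation (T3, 25%) subsets."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    n = len(shuffled)
    return (shuffled[: n // 2],            # T1
            shuffled[n // 2 : 3 * n // 4], # T2
            shuffled[3 * n // 4 :])        # T3

def should_stop(t2_error_history, patience=5):
    """Stopping rule: halt training on T1 once the error measured on T2
    has not improved for `patience` consecutive epochs."""
    if len(t2_error_history) <= patience:
        return False
    best_so_far = min(t2_error_history[:-patience])
    return min(t2_error_history[-patience:]) >= best_so_far
```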
3.1.5. Unsupervised neural networks

This section discusses one of the most common unsupervised NNs, the Kohonen network (Kohonen, 1988), which often is referred to as a self-organizing feature map (SOFM). The purpose of the network is to emulate our understanding of how the brain uses spatial mappings to model complex data structures. Specifically, the learning algorithm develops a mapping from the input patterns to the output units that embodies the features of the input patterns.
In contrast to the supervised network, where the neurons are arranged in layers, in the Kohonen network they are arranged in a planar configuration and the inputs are connected to each unit in the network. The configuration is depicted in Fig. 5.
Fig. 5. Two-dimensional Kohonen network.
Fig. 6. Operation of a 2D Kohonen network.
As indicated, the Kohonen SOFM is a two-layered network consisting of a set of input units in the input layer and a set of output units arranged in a grid called a Kohonen layer. The input and output layers are totally interconnected and there is a weight associated with each link, which is a measure of the intensity of the link.

A sketch of the operation of an unsupervised NN is shown in Fig. 6.
The first step in the process is to initialize the parameters and organize the data. This entails setting the iteration index, t, to 0, the interconnecting weights to small positive random values, and the learning rate to a value smaller than but close to 1. Each unit has a neighborhood of units associated with it, and empirical evidence suggests that the best approach is to have the neighborhoods fairly broad initially and then to have them decrease over time. Similarly, the learning rate is a decreasing function of time.

Each iteration begins by randomizing the training sample, which is composed of P patterns, each of which is represented by a numerical vector. For example, the patterns may be composed of solvent and insolvent insurance companies and the input variables may be financial ratios. Until the number of patterns used, p, exceeds the number available, P, the patterns are presented to the units on the grid, each of which is assigned the Euclidean distance between its connecting weights and the value of the input. This distance is given by [Σ_j (x_j − w_{ij})²]^{0.5}, where w_{ij} is the connecting weight between the jth input unit and the ith unit on the grid and x_j is the input from unit j. The unit which is the best match to the pattern, the winning unit, is used to adjust the weights of the units in its neighborhood. The process continues until the number of iterations exceeds some predetermined value, T.
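A compact sketch of this training process follows; the square neighborhood and the linear decay schedules are illustrative choices, not the paper's specification.

```python
import math
import random

def train_sofm(patterns, grid_size=5, dim=3, T=100, lr0=0.9, seed=0):
    """Kohonen SOFM sketch: find the winning unit by Euclidean distance,
    then pull that unit and its grid neighbors toward the input pattern.
    The learning rate and neighborhood radius both shrink over time."""
    rng = random.Random(seed)
    # Small positive random weights for each unit on the grid.
    w = {(r, c): [rng.random() * 0.1 for _ in range(dim)]
         for r in range(grid_size) for c in range(grid_size)}
    for t in range(T):
        lr = lr0 * (1 - t / T)                             # decreasing learning rate
        radius = max(1, int(grid_size / 2 * (1 - t / T)))  # shrinking neighborhood
        rng.shuffle(patterns)                              # randomize the training sample
        for x in patterns:
            # Winning unit: smallest Euclidean distance to the input.
            win = min(w, key=lambda u: math.dist(w[u], x))
            for u in w:                                    # adjust the winner's neighborhood
                if abs(u[0] - win[0]) <= radius and abs(u[1] - win[1]) <= radius:
                    w[u] = [wi + lr * (xi - wi) for wi, xi in zip(w[u], x)]
    return w
```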
In the foregoing training process, the winning units in the Kohonen layer develop clusters of neighbors which represent the class types found in the training patterns. As a result, patterns associated with each other in the input space will be mapped onto output units which also are associated with each other. Since the class of each cluster is known, the network can be used to classify the inputs.
3.2. Fuzzy logic

FL⁶ was developed as a response to the fact that most of the parameters we encounter in the real world are not precisely defined. For example, a particular investor may have a “high risk capacity” or the rate of return on an investment might be “around 6%”; the first of these is known as a linguistic variable while the second is known as a fuzzy number. These concepts and the structure of an FL system are discussed in this section.
3.2.1. The structure of a fuzzy logic system

The essential structure of an FL system is depicted in the flow chart shown in Fig. 7, which was adapted from Von Altrock (1997, p. 37).

Fig. 7. An FL system.
⁶ Following Zadeh (1994, p. 192), in this paper the term FL is used in the broad sense where it is essentially synonymous with fuzzy set theory.
Fig. 8. Fuzzy set of clients with high risk capacity.
In the figure, numerical variables are the input of the system. These variables are passed through a fuzzification stage, where they are transformed to linguistic variables and subjected to inference rules. The linguistic results are then transformed by a defuzzification stage into numerical values which become the output of the system.
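A toy end-to-end sketch of this fuzzification, inference, and defuzzification flow; the membership shapes, the rule base, and the output values are invented solely for illustration.

```python
def fuzzify(x):
    """Map a numerical input to grades of membership in linguistic values."""
    return {"low": max(0.0, min(1.0, (50 - x) / 50)),
            "high": max(0.0, min(1.0, (x - 50) / 30))}

def infer(grades):
    """Toy rule base, e.g. 'if risk capacity is high, equity share is 80'."""
    return {80.0: grades["high"], 20.0: grades["low"]}

def defuzzify(conclusions):
    """Centroid-style defuzzification: membership-weighted average."""
    total = sum(conclusions.values())
    return sum(v * m for v, m in conclusions.items()) / total if total else 0.0

print(defuzzify(infer(fuzzify(65))))  # numerical output of the system
```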
3.2.2. Linguistic variables

A linguistic variable (Zadeh, 1975a,b, 1981) is a variable whose values are expressed as words or sentences. Risk capacity, for example, may be viewed both as a numerical value ranging over the interval [0, 100], and a linguistic variable that can take on values like high, not very high, and so on. Each of these linguistic values may be interpreted as a label of a fuzzy subset of the universe of discourse X = [0, 100], whose base variable, x, is the generic numerical value risk capacity. Such a set, an example of which is shown in Fig. 8, is characterized by a membership function, µ_high(x), which assigns to each object a grade of membership ranging between zero and one. In this case, which represents the set of clients with a high risk capacity, individuals with a risk capacity of 50, or less, are assigned a membership grade of zero and those with a risk capacity of 80, or more, are assigned a grade of one. Between those risk capacities, (50, 80), the grade of membership is fuzzy.
Fuzzy sets are implemented by extending many of the basic identities that hold for ordinary sets. Thus, for example, the union of fuzzy sets A and B is the smallest fuzzy set containing both A and B, and the intersection of A and B is the largest fuzzy set which is contained in both A and B.
Representative insurance papers involving linguistic variables include DeWit (1982), the first FL paper in the area, which dealt with individual underwriting, and Young (1993, 1996), who modeled the selection and rate-changing process in group health insurance.
3.2.3. Fuzzy numbers

The general characteristic of a fuzzy number (Zadeh, 1975a,b; Dubois and Prade, 1980) is represented in Fig. 9. This shape of fuzzy number is referred to as a “flat” fuzzy number; if m_2 were equal to m_3, it would be referred to as a “triangular” fuzzy number. The points m_j, j = 1, 2, 3, 4, and the functions f_j(y|M), j = 1, 2, M a fuzzy number, which are inverse functions mapping the membership function onto the real line, characterize the fuzzy number. As indicated, a fuzzy number is usually taken to be a convex fuzzy subset of the real line.
As one would anticipate, fuzzy arithmetic can be applied to fuzzy numbers. Using the extension principle (Zadeh, 1975a,b), the nonfuzzy arithmetic operations can be extended to incorporate fuzzy sets and fuzzy numbers. Briefly, if ∗ is a binary operation such as addition (+) or min (∧), the fuzzy number z, defined by z = x ∗ y, is given as a fuzzy set by

µ_z(w) = ∨_{u,v} [µ_x(u) ∧ µ_y(v)],  u, v, w ∈ R,

subject to the constraint that w = u ∗ v, where µ_x, µ_y, and µ_z denote the membership functions of x, y, and z, respectively, and ∨_{u,v} denotes the supremum over (u, v).
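A sketch of the extension principle for fuzzy addition on discrete supports, where the supremum reduces to a maximum over all (u, v) with w = u + v; the fuzzy numbers below are illustrative.

```python
from collections import defaultdict

def extend(mu_x, mu_y, op):
    """Extension principle on discrete fuzzy sets: for each w = op(u, v),
    mu_z(w) is the max over (u, v) of min(mu_x(u), mu_y(v))."""
    mu_z = defaultdict(float)
    for u, mx in mu_x.items():
        for v, my in mu_y.items():
            w = op(u, v)
            mu_z[w] = max(mu_z[w], min(mx, my))
    return dict(mu_z)

# "About 2" + "about 3" yields a fuzzy set concentrated around 5.
about_2 = {1: 0.5, 2: 1.0, 3: 0.5}
about_3 = {2: 0.5, 3: 1.0, 4: 0.5}
print(extend(about_2, about_3, lambda u, v: u + v))
```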
Representative insurance papers that focused on fuzzy numbers include Lemaire (1990), who showed how to compute a fuzzy premium for a pure endowment policy; Ostaszewski (1993), who extended Lemaire; and Cummins and Derrig (1997), who addressed the financial pricing of property-liability insurance contracts.
Fig. 9. A fuzzy number.
A large number of potential FL applications in insurance are mentioned in Ostaszewski (1993). Readers interested in a grand tour of the first 30 years of FL are urged to read the collection of Zadeh’s papers contained in Yager et al. (1987) and Klir and Yuan (1996).
3.3. Genetic algorithms

GAs are automated heuristics that perform optimization by emulating biological evolution. They are particularly well suited for solving problems that involve loose constraints, such as discontinuity, noise, high dimensionality, and multimodal objective functions. Examples of GA applications in the insurance area include Wendt (1995), who used a GA to build a portfolio efficient frontier (a set of portfolios with optimal combinations of risk and returns), and Tan (1997), who developed a flexible framework to measure the profitability, risk, and competitiveness of insurance products.
GAs can be thought of as an automated, intelligent approach to trial and error, based on principles of natural selection. In this sense, they are modern successors to Monte Carlo search methods. The flow chart in Fig. 10 gives a representation of the process.
As indicated, GAs are iterative procedures, where each iteration g represents a generation. The process starts with an initial population of solutions, P(0), which are randomly generated. From this initial population, the best solutions are “bred” with each other and the worst are discarded. The process ends when the termination criterion is satisfied.

Fig. 10. Flow chart of GA.

For a simple example, suppose that the problem is to find, by trial and error, the value of x, x = 0, 1, . . . , 31, which maximizes f(x), where f(x) is the output of a black box. Using the methodology of Holland (1975), an initial population of potential solutions {y_j | j = 1, . . . , N} would be randomly generated, where each solution would be represented in binary form. Thus, if 0 and 31 were in this initial population of solutions, they would be represented as 00000 and 11111, respectively.⁷

⁷ 31 = 1×2⁴ + 1×2³ + 1×2² + 1×2¹ + 1×2⁰.
A simple measure of the fitness of y_j is p_j = f(y_j) / Σ_j f(y_j), and the solutions with the highest p_j’s would be bred with one another.

There are three ways to develop a new generation of solutions: reproduction, crossover and mutation. Reproduction adds a copy of a fit individual to the next generation. In the previous example, reproduction would take place by randomly choosing a solution from the population, where the probability a given solution would be chosen depends on its p_j value. Crossover emulates the process of creating children, and involves the creation of new individuals (children) from two fit parents by a recombination of their genes (parameters). In the example, crossover would take place in two steps: first, the fit parents would be randomly chosen on the basis of their p_j values; second, there would be a recombination of their genes. If, for example, the fit parents were 11000 and 01101, crossover might result in the two children 11001 and 01100. Under mutation, there is a small probability that some of the gene values in the population will be replaced with randomly generated values. This has the potential effect of introducing good gene values that may not have occurred in the initial population or which were eliminated during the iterations. In this illustration, the process is repeated until the new generation has the same number of individuals, N, as the current one.
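A condensed sketch of this GA on 5-bit strings; the black-box f here is an invented stand-in, and the population size and mutation rate are illustrative.

```python
import random

def f(x):
    """Stand-in for the black box whose output is to be maximized."""
    return x * (31 - x)  # illustrative; peaks at x = 15 or 16

def run_ga(n=20, generations=50, p_mutate=0.02, seed=0):
    rng = random.Random(seed)
    pop = [rng.randint(0, 31) for _ in range(n)]   # random initial 5-bit solutions
    for _ in range(generations):
        fits = [f(y) for y in pop]
        total = sum(fits)
        # p_j = f(y_j) / sum_j f(y_j); fall back to uniform if all zero.
        weights = [fit / total for fit in fits] if total else None
        children = []
        while len(children) < n:                   # breed a full new generation
            a, b = rng.choices(pop, weights=weights, k=2)  # fitness-proportional parents
            cut = rng.randint(1, 4)                # crossover point inside the 5 bits
            mask = (1 << cut) - 1
            child = (a & ~mask) | (b & mask)       # recombine the parents' genes
            for bit in range(5):                   # mutation: rarely flip a gene
                if rng.random() < p_mutate:
                    child ^= 1 << bit
            children.append(child)
        pop = children
    return max(pop, key=f)

print(run_ga())  # a (near-)maximizer of f
```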
3.4. Hybrid systems

While the foregoing discussions focused on each technology separately, a natural evolution in soft computing has been the emergence of hybrid systems, where the technologies are used simultaneously. FL-based technologies can be used to design NNs or GAs, with the effect of increasing their capability to display good performance across a wide range of complex problems with imprecise data. Thus, for example, a fuzzy NN can be constructed where the NN possesses fuzzy signals and/or has fuzzy weights. Conversely,
FL can use technologies from other fields, like NNs or GAs, to deduce or to tune, from observed data, the
membership functions in fuzzy rules, and may also structure or learn the rules themselves.
4. Functional classes