are then used to select a likely model in a library of object models, also called indexing. The best match
between image attributes and model attributes is then found. Finally, the match is verified using some
decision procedure. The grouping, indexing, and matching steps essentially involve search procedures.
Bottom-up control fails, however, in more complex images containing multiple objects with occlusion and overlap, as well as in the case of poor quality images, in which noise creates spurious attributes. This is a very likely scenario for remotely sensed images. In this situation, top-down or hybrid control strategies are more useful. In the top-down approach, the hypothesis phase requires the organisation of models indexed by attributes so that, based on observed attributes, a small set of likely objects can be selected. The selected models are then used to recognise objects in the verification phase (Jain et al., 1995). A disadvantage of this approach is that
the model control necessary in some parts of the image is too strong for other parts; for example,
symmetry requirements imposed by the model could corrupt borders. In the hybrid approach, the two
strategies are combined to improve processing efficiency.
Attributes are grouped whenever the resulting attribute is more informative than individual attributes.
This process is also called perceptual organisation. Lowe (1985, 1990) addressed this grouping question in object recognition and came up with some objective criteria for grouping attributes; he looks for configurations of edge segments that are unlikely to happen by chance and are preserved under projection. Collinear and parallel edges are an example.
Zerroug and Nevatia (1993) utilise regularities in the projections of homogeneous generalised cylinders into 2-D. Most other researchers have developed ad hoc criteria for grouping, e.g., Steger et al. (1997) for road extraction, and Henricsson and Baltsavias (1997) for building extraction. It seems obvious that local context will play a large part in attribute grouping, since one would expect a particular arrangement of local attributes in relation to each other to define a local context.
General knowledge about occlusion, perspective, geometry and physical support is also necessary for the recognition task. Brooks (1981) built a geometric reasoning system called ACRONYM for object recognition. The system SIGMA by Matsuyama and Hwang (1985) includes a geometric reasoning expert. McGlone and Shufelt (1994) have incorporated projective geometry into their system for building extraction, while Lang and Förstner (1996) have developed polymorphic features for the development of procedures for building extraction.
Context plays a significant role in image understanding. In particular, relaxation labelling methods use local and global context to perform semantic labelling of regions and objects in an image. After
the segmentation phase, scene labelling should correspond with available scene knowledge and the labelling should be consistent. This problem is usually solved using constraint propagation: local constraints result in local consistencies, and by applying an iterative scheme, the local consistencies adjust to
global consistencies in the whole image. A full survey of relaxation labelling is available in Hancock and Kittler (1990). Discrete relaxation methods are oversimplified and cannot cope with incomplete or inaccurate segmentation.
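A minimal sketch of such iterative, probability-weighted constraint propagation, in the spirit of probabilistic relaxation, is given below. The label set, compatibility coefficients and initial probabilities are all invented for the example; published schemes differ in their exact update rule.

```python
# Toy probabilistic relaxation labelling: two neighbouring regions must be
# labelled 'water' or 'vegetation'. Compatibilities and initial probabilities
# are invented for illustration.

LABELS = ["water", "vegetation"]

# compat[a][b]: how compatible label a on one region is with label b on its neighbour.
compat = {
    "water":      {"water": 0.9, "vegetation": 0.3},
    "vegetation": {"water": 0.3, "vegetation": 0.9},
}

# Initial (noisy) label probabilities for two adjacent regions.
probs = [
    {"water": 0.6, "vegetation": 0.4},
    {"water": 0.2, "vegetation": 0.8},
]

def relax(probs, iterations=10):
    """Iteratively adjust each region's label probabilities using its neighbour's."""
    for _ in range(iterations):
        new = []
        for i, p in enumerate(probs):
            neighbour = probs[1 - i]  # the only neighbour in this toy scene
            # Contextual support for each label from the neighbour's current beliefs.
            support = {
                lab: sum(compat[lab][nl] * neighbour[nl] for nl in LABELS)
                for lab in LABELS
            }
            z = sum(p[lab] * support[lab] for lab in LABELS)  # normalising factor
            new.append({lab: p[lab] * support[lab] / z for lab in LABELS})
        probs = new
    return probs

final = relax(probs)
```

After a few iterations, the mutually compatible interpretation dominates: the weak initial preference of the first region is overturned by its neighbour's strong evidence, illustrating how local consistencies propagate towards a global one.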
Probabilistic relaxation works on the basis that a locally inconsistent but very probable global interpretation may be more valuable than a consistent but unlikely explanation; see Rosenfeld et al. (1976) for an early example of this approach.

To handle uncertainty at the matching stage, various evidence-based techniques have been used. Examples include systems which utilise Dempster–
Shafer theory (Wesley, 1986; Provan, 1990; Clarkson, 1992), reliability values (Haar, 1982), fuzzy logic (Levine and Nazif, 1985), the principle of least commitment (Jain and Haynes, 1982), confidence values (McKeown and Harvey, 1987), random closed sets (Quinio and Matsuyama, 1991) and Bayesian networks (Rimmey, 1993; von Kaenel et al., 1993; Sarkar and Boyer, 1994).
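As an illustration of the evidence-combination idea, Dempster's rule for pooling two mass functions can be sketched as follows. The frame of discernment and the mass values are invented for the example and do not come from any of the cited systems.

```python
# Dempster's rule of combination for a two-hypothesis frame {building, road}.
# Mass values are invented for illustration.
from itertools import product

FRAME = frozenset(["building", "road"])

# Two independent evidence sources assign mass to subsets of the frame;
# mass on FRAME itself represents "don't know".
m1 = {frozenset(["building"]): 0.6, FRAME: 0.4}
m2 = {frozenset(["building"]): 0.5, frozenset(["road"]): 0.2, FRAME: 0.3}

def combine(m1, m2):
    """Dempster's rule: intersect focal elements, renormalise away conflict."""
    combined = {}
    conflict = 0.0
    for (a, ma), (b, mb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + ma * mb
        else:
            conflict += ma * mb  # mass assigned to the empty set
    if conflict >= 1.0:
        raise ValueError("totally conflicting evidence")
    return {s: m / (1.0 - conflict) for s, m in combined.items()}

m = combine(m1, m2)
```

The renormalisation by the conflict mass is what distinguishes this from a simple product of probabilities, and is also the step most often criticised when the sources disagree strongly.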
4. Some examples of applications of modelling and representation
The applications of knowledge representation and modelling methods in machine vision, and photogrammetry and remote sensing, have incorporated most of the approaches described in the foregoing. The leaders in these applications have logically been researchers in machine vision. In the fields of photogrammetry and remote sensing, the approaches adopted have followed those in the field of computer vision, and have been adapted for the types of information being extracted. These applications demonstrate that there is a growing level of expertise in techniques of artificial intelligence amongst the researchers in photogrammetry and remote sensing. The evolution of these methods has been from rule-based systems to semantic networks and frames to
description logics. A review of some applications in machine vision, photogrammetry and remote sensing
in this section will demonstrate these trends.
4.1. Logic

The first researchers to advocate the use of logic as a representation in computer vision systems are Reiter and Mackworth (1989). In their paper, they
proposed a logical framework for depiction and interpretation of image and scene knowledge, as well
as a formal mapping between the two. They propose image axioms, scene axioms and depiction axioms,
whose logical model forms an interpretation of an image. They illustrate their approach using a simple
map-understanding system called Mapsee. The ap- plication is relatively limited, however, and newer
systems have not been reported. One reason could be the computational complexity. While logic provides
a consistent formalism to specify constraints, ad hoc search using logic is not efficient. Further, first-order logic (FOL) by itself is not good for representing uncertainty or incompleteness in data, which is in the nature of
image properties. The correspondence between image elements and scene objects is not usually one-to-one, and additional logical relations are necessary to model these. Matsuyama and Hwang (1990) adopt a logical framework in which new logical constants and axioms are generated dynamically.
4.2. Rule-based and production systems

Brooks (1981) developed ACRONYM, a model-based image understanding system for detecting 3-D objects, and tested it to extract aircraft in aerial images. 3-D models of aircraft are stored using a frame-based representation. Given an image to be
analysed, ACRONYM extracts line segments and obtains 2-D generalised cylinders. Rules encoding
geometric knowledge as well as knowledge of imaging conditions are used to generate expected 3-D
models of the scene, which are then matched against the frames to identify aircraft.
SIGMA (Matsuyama and Hwang, 1985) is an aerial image understanding system that uses frames to represent knowledge, and both top-down and bottom-up control schemes to extract features. It consists of three subsystems: the Geometric Reasoning Expert (GRE), Model Selection Expert (MSE), and Low Level Vision Expert (LLVE). Information passes from the GRE to the MSE, which then communicates with the LLVE. The frames in SIGMA use slots storing attributes of an object and its relationships to other objects. Based on the spatial knowledge in the frames, hypotheses are generated for objects and matched against image features. This is done by the MSE reasoning about the most likely appearance of an object and conveying this in image terms to the LLVE. This top-down selection of image attributes helps detect small attributes. The system was tested to extract houses and road segments from aerial images.
McKeown et al. (1985) present a rule-based system for the interpretation of airports in aerial images. It was based on about 450 rules, divided into six classes for: initialisation, region-to-interpretation for interpreting the original image fragments, local evaluation, consistency checks, functional area rules for grouping of image fragments into functional areas, and goal-generation rules for building the airport
model.
McKeown and Harvey (1987) present a system for aerial image interpretation, with rules compiled from standard knowledge sets, called schemata. They generated rules automatically from higher level modules, which made for better error-handling and more efficient execution. Their system contained about 100 schemata, each of which generated about five rules.
Strat and Fischler (1991) developed the knowledge-based system called 'Condor' for the recognition of terrain scenes based on context. Context is defined by rules within context sets at various levels. The context sets are not infallible, and hence, redundancy is built into them. The interpretation is based on three types of rules: candidate generation, candidate evaluation, and consistency determination. Candidates are compared in the evaluation process, which scores the relative likelihood that a candidate is an instance of a given class. The authors state that this division keeps the knowledge in manageable units.
Stilla et al. (1996) present a model-based system for automatic extraction of buildings from aerial images, in which objects to be recognised are modelled by production rules and depicted by a production set. The object model is both specific and generic. The specific model describes objects using a fixed topological structure, while the generic models are more general.
These systems illustrate that rule-based systems do not guarantee additivity of knowledge and consistency of reasoning. Breaking up a rule base into multiple rules of varying granularity makes the program less modular and more difficult to modify. Draper et al. (1989) suggest blackboard and schema-based architectures to handle this.
4.3. Blackboard systems

Nagao and Matsuyama (1980) first addressed the problem of scene understanding using the blackboard model, and applied it to aerial images of suburban areas, involving identification of cars, houses and roads. Their system consists of a global database (the blackboard) and a set of knowledge sources. The blackboard records data in a hierarchy consisting of elementary regions, characteristic regions and objects. The blackboard also stores a label picture, which links pixels in the original image to the regions in the database. Elementary regions are the result of an image segmentation process, and are characterised by grey-level, size and location in the image. Characteristic features of the regions are then extracted, resulting in the identification of elementary regions with the following attributes:
1. Large, homogeneous regions, based on region size.
2. Elongated regions, based on shape.
3. Regions in shadow, based on region brightness.
4. Regions capable of causing shadows, based on location of adjoining regions and the position of the sun.
5. Vegetation and water regions, from multispectral information.
6. High contrast texture regions, from textural information.

These properties are stored on the blackboard by
separate modules. The knowledge sources then identify a particular object, given the presence or absence of the characteristic features of various regions. Each knowledge source is a single rule, with a condition and a complex action part that performs various picture processing operations to detect the object.
For example, the knowledge source to detect a crop field would look like:

if large homogeneous region and vegetation region
and not water region and not shadow-making region
then perform crop field identification.
Each knowledge source identifies an object independently, and this might lead to conflicting identifications for the same region (for example, crop field and grassland). To solve this, the system automatically calculates a reliability value for each identification, and uses it to discard all but the most reliable.
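The single-rule knowledge sources and the reliability-based conflict resolution can be sketched as follows. The attribute names, the rules and the reliability values are all invented for illustration and are not taken from Nagao and Matsuyama's system.

```python
# Sketch of blackboard knowledge sources in the style described above: each
# source is one rule whose condition tests region attributes and whose action
# proposes an object label with a reliability value. All names and numbers
# here are invented for illustration.

blackboard = {
    "region_7": {"large_homogeneous": True, "vegetation": True,
                 "water": False, "shadow_making": False},
}

def crop_field_ks(attrs):
    # Condition part mirrors the crop-field rule quoted in the text.
    if (attrs["large_homogeneous"] and attrs["vegetation"]
            and not attrs["water"] and not attrs["shadow_making"]):
        return ("crop field", 0.8)  # (label, reliability)
    return None

def grassland_ks(attrs):
    if attrs["vegetation"] and not attrs["water"]:
        return ("grassland", 0.5)
    return None

KNOWLEDGE_SOURCES = [crop_field_ks, grassland_ks]

def interpret(blackboard):
    """Run every knowledge source; keep only the most reliable label per region."""
    result = {}
    for region, attrs in blackboard.items():
        candidates = [ks(attrs) for ks in KNOWLEDGE_SOURCES]
        candidates = [c for c in candidates if c is not None]
        if candidates:
            result[region] = max(candidates, key=lambda c: c[1])[0]
    return result
```

Here both sources fire on the same region, and the reliability comparison resolves the conflict in favour of the crop-field interpretation.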
Füger et al. (1994) present a blackboard-based data-driven system for analysis of man-made objects in aerial images. Generic object models are represented symbolically in the blackboard, an individual object being described by several attributes. The models are controlled by numerous parameters, which are determined by a closed-loop system using 'evolution strategies'.
Stilla (1995) presents a blackboard-based production system for image understanding, which is suitable for the structural analysis of complex scenes in aerial images. Starting with primitive objects, a target object can be composed step-by-step using productions repeatedly. The compositions of objects are recorded and represented by a derivation graph. The map is modelled as a set of straight lines. The results of map analysis are one or more target objects with their corresponding derivation graphs. Image analysis may also be performed identically, by segmenting the binary image and approximating contours by straight lines.
Blackboard systems in general tend to have a centralised control structure, so that efficiency becomes an issue. Also, blackboards assume that knowledge sources will be available when needed and then vanish, whereas in vision applications, they tend to persist as long as the image is being analysed.
4.4. Frames

Hanson and Riseman (1978) used frames as hypothesis generation mechanisms for vision systems. Knowledge about classes of objects was represented as frames, and slots represented binary geometric relations between classes of objects. Slots also contained production rules for instantiating other object frames. Thus, frames are used both for control and representation. Ikeuchi and Kanade (1988) used frames to represent aspects of 3-D objects. When exact object models are available, processing is top-down, but given weaker models and more exact data, processing is bottom-up. However, using frames for both control and representation hides the procedural behaviour of the system and destroys its temporal coordination (Draper et al., 1989). Other systems which use frames include ACRONYM, SIGMA and Nagao and Matsuyama's system, already described above.
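The dual use of frames for representation and control described above can be sketched schematically. The class, the slot names and the attached rule below are invented for illustration and are not drawn from any of the cited systems.

```python
# Schematic frame: slots hold attributes and binary relations (representation),
# and an attached production rule instantiates another frame (control).
# All names here are illustrative only.

class Frame:
    def __init__(self, name, slots=None):
        self.name = name
        self.slots = dict(slots or {})  # attribute and relation slots

# A 'house' frame whose 'adjacent-to' slot carries a binary geometric relation
# to another object class, and whose 'if-instantiated' slot holds a control
# rule that hypothesises a related object.
house = Frame("house", {
    "shape": "rectangular",
    "adjacent-to": "road",  # binary relation between object classes
    "if-instantiated": lambda scene: scene.append(Frame("driveway")),
})

scene = []
scene.append(house)
house.slots["if-instantiated"](scene)  # control: the rule spawns a hypothesis
```

The sketch also makes the cited criticism visible: the control behaviour is buried inside a slot value, so the order in which hypotheses appear is hidden from any external scheduler.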
4.5. Semantic networks

Nicolin and Gabler (1987) describe a system to analyse aerial images, using semantic nets to represent and interpret the image. The system consists of a Short Term Memory (STM), a Methodology Base (MB), and a Long Term Memory (LTM). The STM is conceptually equivalent to a blackboard and stores the partial interpretation of the image. The LTM stores the a priori knowledge of the scene and the domain-specific knowledge (i.e., the knowledge base). The system matches the contents of the STM against those of the LTM to produce an interpretation. This is accomplished using an inference mechanism that calls modules in the MB. The initial contents of the STM are established in a bottom-up way, and a model-driven phase generates and verifies the presence or absence of object attributes stored in the LTM.
Mayer (1994) has developed a semantic-network-based system for knowledge-based extraction of objects from digitised maps. The system is based on a combined semantic network and frames representation, as well as a combination of model-driven and data-driven control. The model is composed of three levels which generally correspond to the respective layers of bottom-up image processing:

1. The image layer, e.g., the digitised map,
2. Image-graph and graphics and text layers,
3. Semantic objects.

The semantic network is built up from the concept of 'part/part-of' elements in the graphs layer to the semantic objects, which comprise the 'specialisation/generalisation' relations between the graphics objects and the terrain objects. For example, an elongated area in the graphics objects layer is specialised into 'road-sides', 'pavements', 'road network', etc. Descriptions of other objects are not given, but the tests demonstrated the extraction of parcels and road networks. The frames are designed to analyse the various concepts and their properties. The object extraction is based on both model-driven and data-driven instantiation, with the initial search being based on a goal specified by the user. While the method is based on the extraction of well-defined information on maps, Mayer believes that the process should be useful for the extraction of information from images.
Tönjes (1996) has used semantic networks for modelling landscapes from overlapping aerial images. The output is a 3-D view of the terrain with appropriate representations of the vegetation. Tönjes states that semantic networks are suited to representing knowledge of structural objects. His semantic network is described by frames that include the relationships, attributes, and methods. The semantic net has three layers:

1. Sensor layer, which represents the segmentation layer, based on texture and stripes, as well as the image details;
2. Geometry and material layer, which represents the 3-D surface layer following the interpretation of the terrain cover from the sensor layer;
3. Scene layer, which contains the extracted objects.

The semantic network is established between components in the three layers. The relationship 'con-of' is the concrete realisation of an object in the image data; 'part-of' describes the decomposition of objects into parts; while 'is-a' is the specialisation of an object. The object descriptions are tracked through each layer for reconstruction, which is based on both data-driven as well as model-driven processes.
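These three relation types can be sketched as labelled edges in a small graph. The node names are invented for illustration; only the relation names follow the description above.

```python
# Minimal semantic network using the three relation types described above.
# Node names are invented for illustration.

edges = []  # (subject, relation, object) triples

def add(subj, rel, obj):
    edges.append((subj, rel, obj))

add("forest", "is-a", "vegetation")         # specialisation
add("forest", "part-of", "landscape")       # decomposition into parts
add("forest", "con-of", "textured_region")  # concrete realisation in image data

def related(subj, rel):
    """All objects linked to subj by relation rel."""
    return [o for s, r, o in edges if s == subj and r == rel]
```

Traversing 'con-of' edges downwards corresponds to the data-driven direction (image evidence supporting a scene object), while traversing 'is-a' and 'part-of' edges corresponds to the model-driven direction.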
Lang and Förstner (1996) have based their method of extraction of buildings on polymorphic mid-level features. The approach involves semantic modelling using a 'part-of' hierarchical representation. Relations between the parts have not yet been included. The hypothesis generation of the building is based on a combination of a data-driven model for the original generation of the vertices, and subsequent model-driven approaches for hypothesis generation of object interpretation and verification, using four building types as the models: flat roof, non-orthogonal flat roof, gable roof, and hip roof. The approach successfully extracts buildings.
Schilling and Vögtle (1996) have developed a procedure for updating digital map bases using existing map bases to aid the interpretation. The image is compared with the map to detect changes since the compilation of the map. New features are then analysed by semantic networks. Two networks are created, one for the scene and the other for the image, with the typical relationships established at different levels in the networks.
De Gunst (1996) has developed a combined data-driven and model-driven approach to recognising objects required for updating digital map data. The process is based on object-oriented models for road descriptions and a semantic network for the feature recognition, based on frames. The frames define such details as object relations, object definition, alternative object definitions and preprocessing relations. Road details include complex road junctions which are described by the knowledge base. This is a very detailed study involving several different types of road features. The success of the investigations varied significantly, demonstrating the difficulty in understanding such details.
Quint and Sties (1996) and Quint (1997) present a model-based system called MOSES to analyse aerial images, which uses semantic networks as a modelling tool. Models are automatically refined by using knowledge gained from topographical maps or GIS data. The generative model is the most general model containing common sense knowledge about the environment. Concepts in the generic models in the map and image domain are specialisations of the corresponding concepts in the generative model. A specific model is automatically generated by the system and is specific to the current scene; it is generated by combining the scene description obtained after map analysis with the generic model in the image domain. Initially, digitally available line segments are used for the structural analysis of the map, resulting in a structural description of the map scene. The scene description so obtained is then combined with the generic model in the image domain to yield the specific model, which will be used for image analysis. For structural analysis, image primitives (currently line segments and regions) serve as input. The analysis is model-driven, resulting in recognition of objects (parking places in the project). A merit function is used to guide search in the image analysis process.

To sum up, semantic networks have found wide acceptance and use in the interpretation of aerial images and digital maps.
4.6. Description logics

There are very few photogrammetric applications based on description logics. One such is Lange and Schröder's (1994) description-logic-based approach to the interpretation of changes in aerial images with respect to reference information extracted from a map. Knowledge about types of objects and types of possible changes is represented using a KL-ONE-like description logic (Brachman and Schmolze, 1985; Nebel, 1990), which permits the description of concepts in terms of necessary and sufficient conditions. Factual information about the scene and the interpretation is represented using the assertional component of the description logic. Geometric and topological constraints and relations between spatial objects are represented in the logic as object concepts and change concepts. An object is recognised to be an instance of an object concept after the image has been preprocessed and the attributes extracted. The definitions of the change concepts are used in exactly the same way to recognise changes. Search for instantiation is goal-directed, and uses a number of heuristics. The examples in the paper, however, seem to be based on artificial images.
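The instance-recognition step can be illustrated with a toy sketch: a concept defined by a set of conditions, applied to attribute vectors extracted from an image. The 'building' concept, its conditions and the attribute names are all invented for the example; KL-ONE-style languages are of course far richer than simple predicate lists.

```python
# Toy concept-based recognition: a concept is a set of necessary and
# sufficient attribute conditions, and an extracted object is classified as
# an instance if it satisfies all of them. Concept and attributes invented.

CONCEPTS = {
    "building": {
        "rectangular": lambda o: o["rectangularity"] > 0.8,
        "casts_shadow": lambda o: o["has_shadow"],
        "min_area": lambda o: o["area"] > 50.0,
    },
}

def instances_of(concept, objects):
    """Return the objects satisfying every condition of the concept definition."""
    conditions = CONCEPTS[concept].values()
    return [o for o in objects if all(c(o) for c in conditions)]

# Attribute vectors as they might come out of preprocessing (invented values).
extracted = [
    {"id": 1, "rectangularity": 0.9, "has_shadow": True, "area": 120.0},
    {"id": 2, "rectangularity": 0.4, "has_shadow": True, "area": 200.0},
]
```

Change concepts would be handled the same way, with conditions ranging over pairs of objects from the map and image interpretations rather than single attribute vectors.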
5. Conclusions