
4 Setting-up the Topology

This chapter describes how to set up the topology in Oracle Data Integrator. An overview of Oracle Data Integrator topology concepts and components is provided.

This chapter contains these sections:
■ Section 4.1, Introduction to the Oracle Data Integrator Topology
■ Section 4.2, Setting Up the Topology
■ Section 4.3, Managing Agents

4.1 Introduction to the Oracle Data Integrator Topology

The Oracle Data Integrator Topology is the physical and logical representation of the Oracle Data Integrator architecture and components.

This section contains these topics:
■ Section 4.1.1, Physical Architecture
■ Section 4.1.2, Contexts
■ Section 4.1.3, Logical Architecture
■ Section 4.1.4, Agents
■ Section 4.1.5, Languages
■ Section 4.1.6, Repositories

4.1.1 Physical Architecture

The physical architecture defines the different elements of the information system, as well as the characteristics of those elements that Oracle Data Integrator takes into account. Each type of database (Oracle, DB2, etc.), file format (XML, Flat File), or application software is represented in Oracle Data Integrator by a technology.

A technology handles formatted data. Therefore, each technology is associated with one or more data types that allow Oracle Data Integrator to generate data handling scripts.

The physical components that store and expose structured data are defined as data servers. A data server is always linked to a single technology. A data server stores information according to a specific technical logic, which is declared in physical schemas attached to this data server. Every database server, JMS message file, group of flat files, and so forth, that is used in Oracle Data Integrator must be declared as a data server. Every schema, database, JMS Topic, etc., used in Oracle Data Integrator must be declared as a physical schema.

Oracle Fusion Middleware Developer's Guide for Oracle Data Integrator

Finally, the physical architecture includes the definition of the Physical Agents. These are the Java software components that run Oracle Data Integrator jobs.
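The relationships described in this section (a technology with its data types, a data server linked to exactly one technology, and physical schemas attached to a data server) can be pictured with a small data model. The sketch below is illustrative only: the class names, the declare_schema method, and the sample server and schema names are hypothetical and are not part of any Oracle Data Integrator API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Technology:
    """A type of database, file format, or application, with its data types."""
    name: str
    data_types: tuple[str, ...]


@dataclass
class PhysicalSchema:
    """A schema, database, JMS Topic, etc., attached to one data server."""
    name: str
    # Back-reference excluded from repr/eq to avoid walking the cycle.
    data_server: "DataServer" = field(repr=False, compare=False)

    @property
    def technology(self) -> Technology:
        # A physical schema inherits its technology from its data server.
        return self.data_server.technology


@dataclass
class DataServer:
    """A component that stores structured data; linked to a single technology."""
    name: str
    technology: Technology
    physical_schemas: list[PhysicalSchema] = field(default_factory=list)

    def declare_schema(self, schema_name: str) -> PhysicalSchema:
        """Declare a physical schema on this data server (hypothetical helper)."""
        schema = PhysicalSchema(schema_name, self)
        self.physical_schemas.append(schema)
        return schema


# Hypothetical sample objects, for illustration only.
oracle = Technology("Oracle", ("VARCHAR2", "NUMBER", "DATE"))
server = DataServer("ORCL_DEV", oracle)
schema = server.declare_schema("ACCOUNTING_SAMPLE")
print(schema.technology.name)  # Oracle
```

Holding the technology on the data server, rather than on each schema, mirrors the rule that a data server is always linked to a single technology.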

4.1.2 Contexts

Contexts bring together components of the physical architecture (the real architecture of the information system) with components of the Oracle Data Integrator logical architecture (the architecture on which the user works). For example, contexts may correspond to different execution environments (Development, Test and Production) or different execution locations (Boston Site, New-York Site, and so forth) where similar physical resources exist.

Note that during installation the default GLOBAL context is created.

4.1.3 Logical Architecture

The logical architecture allows a user to identify as a single Logical Schema a group of similar physical schemas (that is, schemas containing datastores that are structurally identical) that are located in different physical locations. Logical Schemas, like their physical counterparts, are attached to a technology.

Contexts allow logical schemas to be resolved into physical schemas. In a given context, one logical schema resolves to a single physical schema.

For example, the Oracle logical schema Accounting may correspond to two Oracle physical schemas:
■ Accounting Sample used in the Development context
■ Accounting Corporate used in the Production context

These two physical schemas are structurally identical (they contain accounting data), but are located in different physical locations. These locations are two different Oracle schemas (Physical Schemas), possibly located on two different Oracle instances (Data Servers).

All the components developed in Oracle Data Integrator are designed on top of the logical architecture. For example, a data model is always attached to a logical schema, and data flows are defined with this model. By specifying a context at run-time, the model's logical schema resolves to a single physical schema, and the data contained in this schema in the data server can be accessed by the integration processes.
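The resolution rule described in this section (in a given context, one logical schema resolves to a single physical schema) behaves like a lookup keyed on the pair of context and logical schema. The following is a minimal sketch under that assumption. The Topology class and its map_schema and resolve methods are hypothetical names chosen for illustration; they do not represent how Oracle Data Integrator stores or resolves these mappings internally.

```python
class Topology:
    """Toy registry mapping (context, logical schema) to a physical schema."""

    def __init__(self) -> None:
        self._mappings: dict[tuple[str, str], str] = {}

    def map_schema(self, context: str, logical_schema: str,
                   physical_schema: str) -> None:
        """Associate a logical schema with one physical schema in a context."""
        key = (context, logical_schema)
        if key in self._mappings:
            # One logical schema resolves to a single physical schema
            # per context, so a second mapping is rejected.
            raise ValueError(
                f"{logical_schema!r} is already mapped in context {context!r}")
        self._mappings[key] = physical_schema

    def resolve(self, context: str, logical_schema: str) -> str:
        """Return the physical schema for a logical schema at run-time."""
        try:
            return self._mappings[(context, logical_schema)]
        except KeyError:
            raise LookupError(
                f"{logical_schema!r} has no physical schema "
                f"in context {context!r}") from None


# The Accounting example from the text above.
topology = Topology()
topology.map_schema("Development", "Accounting", "Accounting Sample")
topology.map_schema("Production", "Accounting", "Accounting Corporate")

print(topology.resolve("Development", "Accounting"))  # Accounting Sample
print(topology.resolve("Production", "Accounting"))   # Accounting Corporate
```

Because the components are designed against the logical schema name only, switching the context argument is enough to point the same design at a different physical location.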

4.1.4 Agents

Oracle Data Integrator run-time Agents orchestrate the execution of jobs. These agents are Java components.

The run-time agent functions as a listener and a scheduler agent. The agent executes jobs on demand (model reverses, packages, scenarios, interfaces, and so forth), for example when the job is manually launched from a user interface or from a command line. The agent is also used to start the execution of scenarios according to a schedule defined in Oracle Data Integrator.

Third party scheduling systems can also trigger executions on the agent. See Section 20.9.2, Scheduling a Scenario or a Load Plan with an External Scheduler for more information.

Typical projects will require a single Agent in production; however, Section 4.3.3, Load Balancing Agents describes how to set up several load-balanced agents.

Agent Lifecycle

The lifecycle of an agent is as follows:
1. When the agent starts, it connects to the master repository.
2. Through the master repository, it connects to any work repository attached to the master repository and performs the following tasks at startup:
   ■ Clean stale sessions in each work repository. These are the sessions left incorrectly in a running state after an agent or repository crash.
   ■ Retrieve its list of scheduled scenarios in each work repository, and compute its schedule.
3. The agent starts listening on its port.
   ■ When an execution request arrives on the agent, the agent acknowledges this request and starts the session.
   ■ The agent starts sessions according to the schedule.
   ■ The agent is also able to process other administrative requests in order to update its schedule, stop a session, respond to a ping, or clean stale sessions. The standalone agent can also process a stop signal to terminate its lifecycle.

Refer to Chapter 20, Running Integration Processes for more information about a session lifecycle.

Agent Features

Agents are not data transformation servers. They do not perform any data transformation, but instead only orchestrate integration processes. They delegate data transformation to database servers, operating systems, or scripting engines.

Agents are multi-threaded lightweight components. An agent can run multiple sessions in parallel. When declaring a physical agent, it is recommended that you adjust the maximum number of concurrent sessions it is allowed to execute simultaneously from a work repository. When this maximum number is reached, any new incoming session is queued by the agent and executed later, when other sessions have terminated. If you plan to run multiple parallel sessions, you can consider load balancing executions as described in Section 4.3.3, Load Balancing Agents.

Standalone and Java EE Agents

The Oracle Data Integrator agents exist in two flavors: standalone agent and Java EE agent.

A standalone agent runs in a separate Java Virtual Machine (JVM) process. It connects to the work repository and to the source and target data servers via JDBC. Standalone agents can be installed on any server with a Java Virtual Machine installed. This type of agent is more appropriate when you need to use a resource that is local to one of your data servers (for example, the file system or a loader utility installed with the database instance), and you do not want to install a Java EE application server on this machine.

A Java EE agent is deployed as a web application in a Java EE application server (for example, Oracle WebLogic Server). The Java EE agent can benefit from all the features of the application server (for example, JDBC data sources or clustering for Oracle WebLogic Server). This type of agent is more appropriate when there is a need for centralizing the deployment and management of all applications in an enterprise application server, or when you have requirements for high availability.
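The effect of the maximum number of concurrent sessions described under Agent Features can be illustrated with a small single-threaded simulation. This is a sketch only: the AgentSimulator class, its method names, and the session names are hypothetical, and the first-in, first-out order in which queued sessions are started is an assumption made for the example, since the text above does not state the order. It does not use or reproduce the actual agent implementation.

```python
from collections import deque


class AgentSimulator:
    """Toy model of an agent with a cap on concurrently running sessions."""

    def __init__(self, max_sessions: int) -> None:
        if max_sessions < 1:
            raise ValueError("max_sessions must be at least 1")
        self.max_sessions = max_sessions
        self.running: set[str] = set()
        self.queued: deque[str] = deque()

    def submit(self, session: str) -> str:
        """Start the session if a slot is free, otherwise queue it."""
        if len(self.running) < self.max_sessions:
            self.running.add(session)
            return "running"
        self.queued.append(session)
        return "queued"

    def finish(self, session: str) -> None:
        """Mark a session as terminated and start the next queued one, if any."""
        self.running.remove(session)
        if self.queued:
            # Assumption: queued sessions start in arrival (FIFO) order.
            self.running.add(self.queued.popleft())


agent = AgentSimulator(max_sessions=2)
first = agent.submit("session-1")
second = agent.submit("session-2")
third = agent.submit("session-3")
print(first, second, third)   # running running queued

agent.finish("session-1")     # frees a slot, so session-3 starts
print(sorted(agent.running))  # ['session-2', 'session-3']
```

A low maximum keeps the load on the agent host predictable at the cost of longer waits for queued sessions, which is why load balancing is suggested when many parallel sessions are expected.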