Using Data Flow Diagrams
Study Program: Manajemen Bisnis Telekomunikasi & Informatika. Course: Systems Analysis and Design. By: Yudi Priyadi. Using Data Flow Diagrams SOURCE: Systems Analysis and Design, 9e
Kendall & Kendall, Copyright © 2014 Pearson Education, Inc. Publishing as Prentice Hall
Learning Objectives Comprehend the importance of using logical and physical data flow diagrams (DFDs) to graphically depict the movement of data for humans and systems in an organization.
Create, use, and explode logical DFDs to capture and analyze the current system through parent and child levels.
Develop and explode logical DFDs that illustrate the proposed system. Produce physical DFDs based on logical DFDs you have developed. Understand and apply the concept of partitioning of physical DFDs.
Data Flow Diagrams Graphically characterize data processes and flows in a business system Depict:
System inputs
Processes
Outputs
Data flow diagram symbols Data flow diagram levels Creating data flow diagrams Physical and logical data flow diagrams
Partitioning Communicating using data flow diagrams
Major Topics
Freedom from committing to the technical implementation too early Understanding of the interrelatedness of systems and subsystems Communicating current system knowledge to users Analysis of the proposed system
Advantages of the Data Flow Approach
Basic Symbols A double square for an external entity An arrow for movement of data from one point to another A rectangle with rounded corners for the occurrence of a transforming process
An open-ended rectangle for a data store
The Four Basic Symbols Used in Data Flow Diagrams, Their Meanings, and Examples
(Figure 7.1) Represent another department, a business, a person, or a machine
A source or destination of data, outside the boundaries of the system
Should be named with a noun External Entities
Shows movement of data from one point to another Described with a noun Arrowhead indicates the flow direction
Represents data about a person, place, or thing Data Flow
Denotes a change in or transformation of data Process
Represents work being performed in the system Naming convention: Assign the name of the whole system when naming a high-level process
To name a major subsystem attach the word subsystem to the name
Use the form verb-adjective-noun for detailed processes
A depository for data that allows examination, addition, and retrieval of data
Data Store Named with a noun, describing the data Data stores are usually given a unique reference number, such as D1, D2, D3
Represents a:
Database Computerized file
Steps in Developing Data Flow Diagrams (Figure 7.2)
The highest level in a data flow diagram
Contains only one process, representing the entire system The process is given the number 0 All external entities, as well as major data flows are shown
Creating the Context Diagram The data flow diagram must have at least one process There must not be any freestanding objects A process must have both an input and an output data flow
A data store must be connected to at least one process External entities should not be connected to one another
Basic Rules
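These basic rules can be checked mechanically. A minimal Python sketch of such a check (the node names, kinds, and flows below are invented for illustration, not taken from the slides):

```python
# Hypothetical sketch: a DFD as a node-kind map plus an edge list,
# validated against the basic rules above.

KIND = {  # node name -> symbol kind
    "Customer": "entity",
    "Process Order": "process",
    "Orders": "store",
}
FLOWS = [("Customer", "Process Order"), ("Process Order", "Orders")]

def rule_violations(kind, flows):
    errors = []
    connected = {n for f in flows for n in f}
    # Rule: no freestanding objects
    for node in kind:
        if node not in connected:
            errors.append(f"freestanding object: {node}")
    # Rule: a process must have both an input and an output data flow
    for node, k in kind.items():
        if k == "process":
            has_in = any(dst == node for _, dst in flows)
            has_out = any(src == node for src, _ in flows)
            if not (has_in and has_out):
                errors.append(f"process lacks input or output: {node}")
    # Rule: entities must not connect to one another, and data stores
    # must connect only through a process
    for src, dst in flows:
        if "process" not in (kind[src], kind[dst]):
            errors.append(f"illegal direct flow: {src} -> {dst}")
    return errors

print(rule_violations(KIND, FLOWS))  # -> []
```

Running the same check on a diagram that connects two external entities directly would report an illegal flow.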
Representation of Data Flow and Information
Concept of Data Flow Diagram
IPO: Input → Process → Output
- Data → Process → Information
- A data flow diagram is a graphical technique that depicts information flow and the transforms that are applied as data moves from input to output.
- DFDs use four basic symbols that represent processes, data flows, data stores, and entities
Gane and Sarson symbol set
Yourdon symbol set
Symbols for DFD, Yourdon notation (source: www.yourdon.com)
External Entity (Interactor): an external source or destination of data
Process: an action on data
Data Store: storage of data
Data Flow: a transfer of data
Context Diagrams (incomplete)
A context diagram is a top level (also known as Level 0) data flow diagram.
It only contains one process node (process 0) that generalizes the function of the entire system in relationship to external entities. Creating a Set of DFDs
Create a graphical model of the information system based on your fact-finding results
Performing three main tasks
Step 1: Draw a context diagram
Step 2: Draw a DFD level 1
Step 3: Draw the lower-level diagrams
- Drawing Guidelines
1. Draw the context diagram so it fits on one page
2. Use the name of the information system as the process name in the context diagram
3. Use unique names within each set of symbols
4. Do not cross lines
5. Provide a unique name and reference number for each process
6. Obtain user input and feedback
Draw a Context Diagram
External Entity External entity represents the sources and destination of data created by the system.
External entity represents the immediate interface of the system with the external world.
When an external source of data is also a destination for data, a loop or an occurrence number may be used.
When the destination or use of data created by a process is not known, the flow simply points outside the system; similarly, data flows may originate from "nowhere".
Process Boxes Each process box in a DFD describes an action on data
The Identifier. A number indicating the sequence of the process. The Action. A verb specifying the action performed on the data.
The Actor or Place. A noun indicating who performs the action or where it is performed.
Data Flow Arrows Data flow arrows link all the process boxes and data stores in DFDs.
Data flows should be labeled, except when data flows into and out of simple files.
DFDs show only the flow of data, not materials.
A DFD depicts information flow without explicit representation of procedural logic (e.g., conditions or loops).
Data Store Rectangles Data stores can be manual files or computer files. The type of file is not indicated.
The flow is labeled only when the data store is altered; a simple access is not labeled.
A data store is never the direct recipient of unprocessed data from external sources or from other data stores, nor is data from a data store ever delivered directly to an external source. There must be a process step in between.
Examples of Data Stores: Read, Read/Write, and Write. A data item is created, deleted, or updated in a data store by a process.
Rules for Constructing DFD
DFD Not Allowed Flows
DFD Not Allowed Flows If part of our system If not part of our flow ignore
Only one direction of flow between processes Data Flows
Joins & forks allowed only if exactly the same data Data Flows
Cannot go directly back to the process it leaves Data Flows
- Incorrect if .........
- Data which moves together should be shown in a single data flow
(Figure: the incorrect version shows separate "itemised calls" and "invoice" flows from the Telephone Company to the Pay Invoice process; the correct version shows a single "itemised calls and invoice" flow, with "invoice payment" flowing back.)
DFD Rules
Incorrect Correct
DFD Rules Correct Incorrect
DFD Rules
Incorrect Correct
Level 0 CD (figure: Origin #1, Origin #2, Destination 3, and Destination 4 connected to a single process by flows a, b, c, and z; Explanation: a: ......... b: ...........)
Level 1 DFD (figure: processes 1, 3, and 4; flows a, b, c, d, e, f, i, n, p, and z; data stores dtstore1 and dtstore2; entities Origin #1, Origin #2, Destination 3, and Destination 4)
Level 2 DFD (figure: explosion of process 4, with data store dtstore3 and Origin #2)
- through different levels
The conservation of input and output flows Balancing
A balanced DFD fragment (figure; source: www.yourdon.com)
Example
Example of Context Diagram (Let’s check with the rules together ^^ ...)
Example of DFD Level 1 (Let’s check with the rules together ^^ ...)
Context Diagram (Let’s check with the rules together ^^ ...)
(Figure: Campaign Management System context diagram, with external entities Campaign Manager, Campaign Staff, Accountant, and Client, and flows including Staff Assignment, Payment, Staff Grade, Budget, Advert, Completion, Client Contact, Contact Staff, and Concept Note. © 2010 Bennett, McRobb and Farmer)
Top Level Diagram (Level 1) (Figure: processes 1. Record Clients, 2. Plan and Manage Campaigns, 3. Prepare Adverts, 4. Maintain Staff, 5. Manage Adverts, and 6. Browse Concept Notes, with data stores Clients, Campaigns, Adverts, and Staff Members) (Let's check with the rules together ^^ ...)
Level 2 Diagram (Figure: processes 5.1 Set Client Contact and 5.2 Set Advert Completion Date. © 2010 Bennett, McRobb and Farmer) (Let's check with the rules together ^^ ...)
Context Diagram (Figure 7.3)
Drawing Diagram 0
The explosion of the context diagram May include up to nine processes Each process is numbered Major data stores and all external entities are included
Start with the data flow from an entity on the input side Work backward from an output data flow Examine the data flow to or from a data store Analyze a well-defined process Note Greater Detail in Diagram 0 (Figure 7.3)
Data flow diagrams are built in layers The top level is the context level Each process may explode to a lower level The lower level diagram number is the same as the parent process number
Processes that do not create a child diagram are called primitive Data Flow Diagram Levels Creating Child Diagrams Each process on diagram 0 may be exploded to create a child diagram
A child diagram cannot produce output or receive input that the parent process does not also produce or receive
The child process is given the same number as the parent process
Process 3 would explode to Diagram 3
Entities are usually not shown on the child diagrams below Diagram 0
If the parent process has data flow connecting to a data store, the child diagram may include the data store as well When a process is not exploded, it is called a primitive process Differences between the Parent Diagram (above) and the Child Diagram (below) (Figure 7.4)
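The rule that a child diagram cannot produce output or receive input that its parent process does not also produce or receive can be sketched as a simple balancing check (process and flow names below are hypothetical):

```python
# Hypothetical sketch of vertical balancing between a parent process
# and its child (exploded) diagram.

def is_balanced(parent_inputs, parent_outputs, child_flows):
    """child_flows: (source, destination, label) triples; flows crossing
    the child diagram's boundary use the special node name 'BOUNDARY'."""
    child_in = {lbl for src, dst, lbl in child_flows if src == "BOUNDARY"}
    child_out = {lbl for src, dst, lbl in child_flows if dst == "BOUNDARY"}
    return child_in == set(parent_inputs) and child_out == set(parent_outputs)

# Say parent process 3 receives "order" and emits "invoice"; its child
# Diagram 3 must receive and emit exactly the same flows.
child = [
    ("BOUNDARY", "3.1 Validate Order", "order"),
    ("3.1 Validate Order", "3.2 Price Order", "valid order"),
    ("3.2 Price Order", "BOUNDARY", "invoice"),
]
print(is_balanced(["order"], ["invoice"], child))  # -> True
```

An unbalanced decomposition, one of the errors listed later in Figure 7.5, is exactly the case where this check fails.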
Forgetting to include a data flow or pointing an arrow in the wrong direction
Connecting data stores and external entities directly to each other Incorrectly labeling processes or data flow Data Flow Diagrams Error Summary
Including more than nine processes on a data flow diagram Omitting data flow
Creating unbalanced decomposition (or explosion) in child diagrams Checking the Diagrams for Errors (Figure 7.5) Forgetting to include a data flow or pointing an arrow in the wrong direction
(continued Figure 7.5) Checking the Diagrams for Errors
Connecting data stores and external entities directly to each other
Typical Errors that Can Occur in a Data Flow Diagram (Payroll Example)
(continued Figure 7.5) Logical Focuses on the business and how the business operates Not concerned with how the system will be constructed Describes the business events that take place and the data required and produced by each event
Logical and Physical Data Flow Diagrams Physical
Shows how the system will be implemented
Depicts the system
Features Common to Logical and Physical Data Flow Diagrams (Figure 7.7)
The Progression of Models from Logical to Physical (Figure 7.8)
(Figure 7.9) Logical Data Flow Diagram Example
(Figure 7.9) Physical Data Flow Diagram Example
Better communication with users
More stable systems Better understanding of the business by analysts Flexibility and maintenance Elimination of redundancy and easier creation of the physical model
Developing Logical Data Flow Diagrams
Clarifying which processes are performed by humans and which are automated Describing processes in more detail Sequencing processes that have to be done in a particular order Identifying temporary data stores Specifying actual names of files and printouts
Developing Physical Data Flow Diagrams
Physical Data Flow Diagrams Contain Many Items Not Found in Logical Data Flow Diagrams (Figure 7.10)
CRUD Matrix The acronym CRUD is often used for
Create
Read
Update
Delete These are the activities that must be present in a system for each master file
A CRUD matrix is a tool to represent where each of these processes occurs in a system
(Figure 7.11) CRUD Matrix
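As an illustration, a CRUD matrix can be accumulated from a list of process-to-file accesses; the process and master-file names below are invented, not from the figure:

```python
# Minimal sketch of a CRUD matrix: cells keyed by (process, file) record
# which of Create/Read/Update/Delete operations each process performs.

from collections import defaultdict

def build_crud_matrix(accesses):
    """accesses: (process, file, operation) triples, operation in 'CRUD'."""
    matrix = defaultdict(str)
    for process, file, op in accesses:
        assert op in "CRUD"
        if op not in matrix[(process, file)]:
            matrix[(process, file)] += op
    return dict(matrix)

accesses = [
    ("Add Customer", "CUSTOMER MASTER", "C"),
    ("Take Order", "CUSTOMER MASTER", "R"),
    ("Take Order", "ORDER FILE", "C"),
    ("Change Address", "CUSTOMER MASTER", "U"),
]
matrix = build_crud_matrix(accesses)
print(matrix[("Take Order", "CUSTOMER MASTER")])  # -> R
```

Scanning the finished matrix column by column shows at a glance whether each master file has all four activities present somewhere in the system.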
An input flow from an external entity is sometimes called a trigger because it starts the activities of a process
Events cause the system to do something and act as a trigger to the system
An approach to creating physical data flow diagrams is to create a data flow diagram fragment for each unique system event
Event Modeling and Data Flow Diagrams
An event table is used to create a data flow diagram by analyzing each event
and the data used and produced by the event Every row in an event table represents a data flow diagram fragment and is used to create a single process on a data flow diagram
Event Response Tables
An Event Response Table for an Internet Storefront (Figure 7.12)
Data Flow Diagrams for the First Three Rows of the Internet Storefront Event Response Table (Figure 7.13)
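The rule that every event-table row becomes a single process on a DFD fragment can be sketched as follows (the events are invented examples in the spirit of the Internet storefront table, not the actual figure rows):

```python
# Hypothetical sketch: turning event response table rows into DFD
# fragments, one process per row.

def fragments_from_event_table(rows):
    """rows: dicts with event, trigger (input flow), response (output flow).
    Each row yields one numbered process on a data flow diagram."""
    fragments = []
    for number, row in enumerate(rows, start=1):
        fragments.append({
            "process": f"{number}. {row['event']}",
            "input": row["trigger"],
            "output": row["response"],
        })
    return fragments

rows = [
    {"event": "Customer places order", "trigger": "order details",
     "response": "order confirmation"},
    {"event": "Customer checks status", "trigger": "order inquiry",
     "response": "order status"},
]
frags = fragments_from_event_table(rows)
print(frags[0]["process"])  # -> 1. Customer places order
```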
Use Cases and Data Flow Diagrams Each use case defines one activity and its trigger, input, and output Allows the analyst to work with users to understand the nature of the processes and activities and then create a single data flow diagram fragment
Partitioning Data Flow Diagrams Partitioning is the process of examining a data flow diagram and determining how it should be divided into collections of manual procedures and computer programs
A dashed line is drawn around a process or group of processes that should be placed in a single computer program
Different user groups Timing Similar tasks
Efficiency Consistency of data Security
Reasons for Partitioning
Improves the way humans use the site Improves speed of processing Ease of maintaining the site
Partitioning Websites
Use unexploded data flow diagrams early when ascertaining information requirements
Meaningful labels for all data components Communicating Using Data Flow Diagrams
Data flow diagrams Structured analysis and design tools that allow the analyst to comprehend the system and subsystems visually as a set of interrelated data flows
DFD symbols
Rounded rectangle
Double square An arrow Open-ended rectangle
Summary
Creating the logical DFD Context-level data flow diagram
Level 0 logical data flow diagram Child diagrams
Creating the physical DFD
Create from the logical data flow diagram Partitioned to facilitate programming
Partitioning data flow diagrams
Whether processes are performed by different user groups
Processes execute at the same time Processes perform similar tasks
Batch processes can be combined for efficiency
Learning Objectives Understand database concepts. Use normalization to efficiently store data in a database. Use databases for presenting data. Understand the concept of data warehouses. Comprehend the usefulness of publishing databases to the Web.
Understand the relationship of business intelligence to data warehouses, big
data, business analytics and text analytics in helping systems and people make
decisions. Databases Normalization Key design Using the database Data warehouses Data mining Business intelligence
Major Topics
Data Storage There are two approaches to the storage of data in a computer-based system:
Store the data in individual files, each unique to a particular application
Store data in a database
A database is a formally defined and centrally controlled store of data intended for use in many different applications The data must be available when the user wants to use them
The data must be accurate and consistent Efficient storage of data as well as efficient updating and retrieval It is necessary that information retrieval be purposeful
Effectiveness objectives of the database: Ensuring that data can be shared among users for a variety of applications Maintaining data that are both accurate and consistent Ensuring data required for current and future applications will be readily available
Allowing the database to evolve as the needs of the users grow
Allowing users to construct their personal view of the data without concern for the way the data are physically stored
Databases
Reality, Data, and Metadata Reality
The real world
Data
Collected about people, places, or events in reality and eventually stored in a file or database Metadata
Information that describes data Reality, Data, and Metadata (Figure 13.1)
Entities Any object or event about which someone chooses to collect data
May be a person, place, or thing May be an event or unit of time
Entity Subtype An entity subtype is a special one-to-one relationship used to represent additional attributes, which may not be present on every record of the first entity
This eliminates null fields stored on database tables
For example, students who have internships: rather than storing internship fields on every STUDENT MASTER record, an entity subtype holds them only for the students who have one.
Relationships Relationships
One-to-one
One-to-many
Many-to-many
A single vertical line represents one A crow’s foot represents many Entity-Relationship Diagrams Associations (Figure 13.2, Part 1)
Entity-Relationship Diagrams Associations (Figure 13.2, Part 2)
Entity-Relationship Diagrams Associations (Figure 13.2, Part 3)
Entity-relationship (E-R) diagrams can show one-to-one, one-to-many, or many-to-many associations
(Figure 13.3) Entity-Relationship Symbols and Their Meanings The Entity-Relationship Diagram for Patient Treatment (Figure 13.4) Attributes can be listed alongside the entities. The key is underlined.
Attributes, Records, and Keys Attributes represent some characteristic of an entity Records are a collection of data items that have something in common with the entity described
Keys are data items in a record used to identify the record Key Types
Key types are: Primary key
—a unique attribute for the record Candidate key
—an attribute, or collection of attributes, that could serve as a primary key Secondary key
—a key that may not be unique, used to select a group of records
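A small illustration of the three key types on hypothetical STUDENT records (field names and values are invented):

```python
# Sketch: primary, candidate, and secondary keys on in-memory records.

students = [
    {"student_id": "S001", "ssn": "111-22-3333", "name": "Ana",  "major": "MIS"},
    {"student_id": "S002", "ssn": "444-55-6666", "name": "Budi", "major": "MIS"},
]

# Primary key: student_id uniquely identifies each record.
ids = [s["student_id"] for s in students]
assert len(ids) == len(set(ids))

# Candidate key: ssn is also unique, so it could have served as the primary key.
ssns = [s["ssn"] for s in students]
assert len(ssns) == len(set(ssns))

# Secondary key: major need not be unique; it selects a group of records.
mis_students = [s for s in students if s["major"] == "MIS"]
print(len(mis_students))  # -> 2
```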
Data about the data in the file or database Describe the name given and the length assigned each data item Also describe the length and composition of each of the records
Metadata
Metadata (Figure 13.7)
Metadata includes a description of what the value of each data item looks like.
A file contains groups of records used to provide information for operations, planning, management, and decision making
Files can be used for storing data for an indefinite period of time, or they can be used to store data temporarily for a specific purpose
Files Master file Table file
Transaction file Report file
File Types
Master and Table Files Master files:
Contain records for a group of entities
Contain all information about a data entity
Table files: Contain data used to calculate more data or performance measures Usually read-only by a program Transaction and Report Files Transaction records:
Used to enter changes that update the master file and produce reports Report files:
Used when it is necessary to print a report when no printer is available
Useful because users can take files to other computer systems and output to specialty devices Relational Databases A database is intended to be shared by many users
There are three structures for storing database files:
Relational database structures Hierarchical database structures Network database structures Database Design (Figure 13.8) Database design includes synthesizing user reports, user views, and logical and physical designs
Relational Data Structure (Figure 13.9)
In a relational data structure, data are stored in many tables. Normalization Normalization is the transformation of complex user views and data
stores to a set of smaller, stable, and easily maintainable data structures
The main objective of the normalization process is to simplify all the complex data items that are often found in user views Normalization of a Relation Is Accomplished in Three Major Steps (Figure 13.10)
Shows data associations of data elements Each entity is enclosed in an ellipse Arrows are used to show the relationships
Data Model Diagrams
Drawing Data Model (Figure 13.13) Drawing data model diagrams for data associations sometimes helps analysts appreciate the complexity of data storage.
First Normal Form (1NF) Remove repeating groups
The primary key with repeating group attributes are moved into a new table
When a relation contains no repeating groups, it is in first normal form
The Original Unnormalized Relation (Figure 13.16)
The original unnormalized relation SALES-REPORT is separated into two relations, SALESPERSON (3NF) and SALESPERSON-CUSTOMER (1NF).
Second Normal Form (2NF) Remove any partially dependent attributes and place them in another relation
A partial dependency is when the data are dependent on a part of a primary key
A relation is created for the data that are only dependent on part of the key and another for data that are dependent on both parts
Second Normal Form (Figure 13.18) The relation SALESPERSON-CUSTOMER is separated into a relation called CUSTOMER-WAREHOUSE (2NF) and a relation called SALES (1NF).
Third Normal Form (3NF) Must be in 2NF Remove any transitive dependencies A transitive dependency is when nonkey attributes are dependent not only on the primary key, but also on a nonkey attribute
(Figure 13.20)
Third Normal Form
The relation CUSTOMER-WAREHOUSE is separated into two relations called CUSTOMER (1NF) and WAREHOUSE (1NF).
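The three steps can also be traced in code. The sketch below follows the spirit of the SALES-REPORT example, but the field names and data are invented, not taken from the figures:

```python
# Sketch of normalization: SALES-REPORT holds a repeating group of
# customer rows per salesperson, and warehouse depends only on customer.

unnormalized = {
    ("S10", "Jones"): [  # key: (salesperson number, salesperson name)
        {"customer": "C1", "warehouse": "W2", "sales": 500},
        {"customer": "C2", "warehouse": "W1", "sales": 250},
    ],
}

# 1NF: move the repeating group into its own relation, carrying the key.
salesperson = [{"sp_no": k[0], "sp_name": k[1]} for k in unnormalized]
sp_customer = [dict(sp_no=k[0], **row) for k, rows in unnormalized.items()
               for row in rows]

# 2NF: sales depends on the full key (sp_no, customer), but warehouse
# depends only on customer — a partial dependency — so it splits off.
sales = [{"sp_no": r["sp_no"], "customer": r["customer"], "sales": r["sales"]}
         for r in sp_customer]
customer_warehouse = {r["customer"]: r["warehouse"] for r in sp_customer}

# 3NF: no transitive dependencies remain here; if customer_warehouse also
# carried a warehouse-location field (dependent on warehouse, a nonkey
# attribute), that transitive dependency would split off into WAREHOUSE.
print(customer_warehouse["C1"])  # -> W2
```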
(Figure 13.22) Al S. Well Hydraulic Company E-R Diagram Using the Entity-Relationship Diagram to Determine Record Keys When the relationship is one-to-many, the primary key of the file at the one end of the relationship should be contained as a foreign key on the file at the many end of the relationship
A many-to-many relationship should be divided into two one-to- many relationships with an associative entity in the middle Guidelines for Master File/Database Relation Design Each separate data entity should create a master database table
A specific data field should exist on one master table Each master table or database relation should have programs to create, read, update, and delete the records
Entity integrity Referential integrity Domain integrity
Integrity Constraints The primary key cannot have a null value If the primary key is a composite key, none of the fields in the key can contain a null value
Entity Integrity
Referential integrity governs the nature of records in a one-to-many relationship
Referential integrity means that all foreign keys in the many table (the child table) must have a matching record in the parent table
Referential Integrity Referential integrity implications:
You cannot add a record in the child (many) table without a matching record in the parent table
You cannot change a primary key that has matching child table records You cannot delete a record that has child records Implemented in two ways: A restricted database updates or deletes a key only if there are no matching child records
A cascaded database will delete or update all child records when a parent record is deleted or changed
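These implications can be demonstrated with SQLite, which enforces foreign keys on request; the table and column names below are invented for illustration:

```python
# Runnable sketch of restricted referential integrity in SQLite.

import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # SQLite checks FKs only when asked
con.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
con.execute("""CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customer(id) ON DELETE RESTRICT)""")
con.execute("INSERT INTO customer VALUES (1, 'Ana')")
con.execute("INSERT INTO orders VALUES (10, 1)")

# You cannot add a child record without a matching parent record...
try:
    con.execute("INSERT INTO orders VALUES (11, 99)")
except sqlite3.IntegrityError as e:
    print("insert rejected:", e)

# ...and a restricted delete refuses to remove a parent that has children.
try:
    con.execute("DELETE FROM customer WHERE id = 1")
except sqlite3.IntegrityError as e:
    print("delete rejected:", e)
```

Declaring the foreign key with ON DELETE CASCADE instead would make SQLite delete the child order rows along with the parent, i.e., the cascaded behavior described above.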
Domain integrity rules are used to validate the data
Domain integrity has two forms:
Check constraints, which are defined at the table level Rules, which are defined as separate objects and can be used within a number of fields
Domain Integrity
Data redundancy Insert anomaly Deletion anomaly
Anomalies Data Redundancy When the same data is stored in more than one place in the
database
Solved by creating tables that are in third normal form Insert Anomaly
Occurs when the entire primary key is not known and the database
cannot insert a new record, which would violate entity integrity
Can be avoided by using a sequence number for the primary key Deletion Anomaly
Happens when a record is deleted that results in the loss of other related data
Update Anomaly When a change to one attribute value causes the database to either contain inconsistent data or causes multiple records to need changing
Choose a relation from the database Join two relations together Project columns from the relation Select rows from the relation Derive new attributes Index or sort rows Calculate totals and performance measures Present data
Retrieving and Presenting Database Data
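These retrieval steps map directly onto SQL. A minimal sketch using invented tables and data:

```python
# Sketch: join, project, select, derive, sort, and total with SQLite.

import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE customer (id INTEGER, name TEXT)")
con.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
con.executemany("INSERT INTO customer VALUES (?, ?)", [(1, "Ana"), (2, "Budi")])
con.executemany("INSERT INTO orders VALUES (?, ?)",
                [(1, 100.0), (1, 50.0), (2, 80.0)])

rows = con.execute("""
    SELECT c.name, SUM(o.amount) AS total      -- project columns + derive
    FROM customer c JOIN orders o              -- join two relations
         ON c.id = o.customer_id
    WHERE o.amount > 0                         -- select rows
    GROUP BY c.name ORDER BY c.name            -- totals + sort
""").fetchall()
print(rows)  # -> [('Ana', 150.0), ('Budi', 80.0)]
```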
Data warehouses are used to organize information for quick and effective queries
In the data warehouse, data are organized around major subjects
Data in the warehouse are stored as summarized rather than detailed raw data
Data in the data warehouse cover a much longer time frame than in a traditional transaction-oriented database
Data warehouses are organized for fast queries
Data Warehouses and Database Differences
Data warehouses are usually optimized for answering complex queries, known as OLAP
Data warehouses allow for easy access via data-mining software
Data warehouses include multiple databases that have been processed so that data are uniformly defined
Data warehouses usually include data from outside sources
Online Analytic Processing Software
Online analytic processing (OLAP) is meant to answer decision makers’ complex questions by defining a multidimensional database
Statistical analysis
Decision trees
Neural networks
Intelligent agents
Fuzzy logic Data-Mining Decision Aids
Associations —patterns that occur together
Sequences —patterns of actions that take place over a period of time
Clustering —patterns that develop among groups of people
Trends —the patterns that are noticed over a period of time
Data-Mining Patterns Data Mining (Figure 13.27) Data mining collects personal information about customers in an effort to be more specific in interpreting and anticipating their preferences
Costs may be too high to justify Has to be coordinated Ethical aspects
Data-Mining Problems
Business Intelligence (BI)
Business intelligence is a decision support system (DSS) for organizational decision makers
It is composed of features that gather and store data
It uses knowledge management approaches combined with analysis
This becomes input to decision makers’ decision-making processes
Business intelligence is built around processing large volumes of data
Big data is when data sets become too large or too complex to be handled with traditional tools or within traditional databases or data warehouses
Big data is a strategy that permits organizations to cope with ever-increasing amounts of data from a myriad of sources
Human generated
Five prominent methods are used for analyzing business intelligence
Slice-and-dice drilldown
Ad hoc queries
Real-time analysis
Forecasting
Scenarios Analyzing Business Intelligence Text Analytics Text analytics is a way to structure the unstructured
Turning qualitative material into quantitative material The broader view is to tap into qualitative unstructured data that can be of use to decision makers who must recommend courses of action to their organizations that are backed by data Text Analytics Sources
Sources of big data for text analytics include unstructured, qualitative, or “soft,” data generated through:
Blogs
Chat rooms Questionnaires using open-ended questions Online discussions conducted on the Web
Social media such as Twitter Facebook
Other Web-generated dialogs between customers and an organization
Storing data
Individual files
Database Reality, data, metadata
Conventional files
Type
Organization
Database Relational
Hierarchical Summary
E-R diagrams
Normalization
First normal form Second normal form Third normal form
Data warehouse Data mining