BUSINESS STATISTICS 511 Year 1 Semester (1)

MANAGEMENT SCIENCES

QUALIFICATION TITLE: Year 1 Semester 2

BACHELOR OF COMMERCE LEARNER GUIDE

S: MARKETING MANAGEMENT 511 (1 SEMESTER) PREPARED ON BEHALF OF

TRAINING & BUSINESS COLLEGE (PTY) LTD

AUTHOR: Dr. Lawrence Lekhanya

EDITOR: Mr. Simbrashe Magwagwa

FACULTY HEAD: Prof. Rosh Maharaj

Copyright © 2013

Training & Business College (Pty) Ltd

Registration Number: 2000/000757/07

All rights reserved; no part of this publication may be reproduced in any form or by any means, including photocopying machines, without the written permission of the Institution.

Previously

BUSINESS ADMINISTRATION, MANAGEMENT & COMMERCIAL SCIENCES

LEARNER GUIDE MODULE: BUSINESS STATISTICS 511 (1 ST SEMESTER)

Copyright © 2017 Richfield Graduate Institute of Technology (Pty) Ltd Registration Number: 2000/000757/07

All rights reserved; no part of this publication may be reproduced in any form or by any means, including photocopying

machines, without the written permission of the Institution.

TABLE OF CONTENTS

TOPICS Section A: Preface

1. Welcome

2. Title of Modules

3. Purpose of Module

4. Learning Outcomes

5. Method of Study

6. Lectures and Tutorials

7. Notices

8. Prescribed & Recommended Material

9. Assessment & Key Concepts in Assignments and Examinations

10. Specimen Assignment Cover Sheet

11. Work Readiness Programme

12. Work Integrated Learning

Section B: TOPIC 1: INTRODUCTION TO DESCRIPTIVE STATISTICS

1.1 What Is Statistics?

1.2 Descriptive Statistics

1.3 Inferential Statistics

1.6 Summation Notation

1.7 Measurement Scales

Assessment questions

TOPIC 2: DESCRIBING UNIVARIATE DATA

2.1 Central Tendency

2.7 Semi-Interquartile Range

2.9 Standard Deviation

2.10 Shape Of The Distribution

2.13 Types Of Graphs

Assessment questions

TOPIC 3: CORRELATION AND SIMPLE LINEAR REGRESSION ANALYSIS

3.1 Scatter Plots

3.2 Introduction To Pearson's Correlation

3.3 Regression Analysis

Assessment questions

TOPIC 4: INTRODUCTION TO PROBABILITY

4.1 Simple Probability

4.2 Conditional Probability

4.3 Probability Of A And B

4.4 Probability Of A Or B

TOPIC 5: DISCRETE PROBABILITY DISTRIBUTION

5.1 Permutations And Combinations

5.2 Binomial Probability Distribution

5.3 The Poisson Distribution

Assessment questions

TOPIC 6: CONTINUOUS PROBABILITY DISTRIBUTION

6.1 What Is A Normal Distribution

6.2 The Standard Normal Distribution

6.3 Converting To Percentiles And Back

6.4 Area Under Portions Of The Curve

Assessment Questions

TOPIC 7: ADDENDUM 511 (A): REVISION QUESTIONS

TOPIC 8: ADDENDUM 511 (B): TYPICAL EXAMINATION QUESTIONS

1. WELCOME

Welcome to the Faculty of Business, Economics & Management Sciences at Richfield Graduate Institute of Technology (Pty) Ltd. We trust you will find the contents and learning outcomes of this module both interesting and insightful as you begin your academic journey and eventually your career in the business world.

This section of the study guide is intended to orientate you to the module before the commencement of formal lectures.

The following lectures will focus on the study units described.

SECTION A: WELCOME & ORIENTATION

Study unit 1: Orientation Programme (Lecture 1)
Introducing academic staff to the students by academic head. Introduction of institution policies.

Study unit 2: Orientation of Students to Library and Student Facilities (Lecture 2)
Introducing students to physical structures. Issuing of foundation learner guides and necessary learning material.

Study unit 3: Distribution and Orientation of Business Statistics Learner Guides, Textbooks and Prescribed Materials (Lecture 3)

Study unit 4: Discussion on the Objectives and Outcomes of Business Statistics 511 (Lecture 4)

Study unit 5: Orientation and guidelines to completing Assignments (Lecture 5)
Review and Recap of Study units 1-4

Section B: Business Statistics 511 (1st Semester)

2. TITLE OF MODULES, COURSE, CODE, NQF LEVEL, CREDITS & MODE OF DELIVERY

Semester

Title of Module

Business Statistics 511

Code

BUS_511

NQF level

Credits

Mode of delivery

Contact/Distance

3. PURPOSE OF THE MODULES

This introductory course covers the concepts and techniques of exploratory data analysis, frequency distributions, central tendency and variation, probability, sampling, inference, regression and correlation. Students will be exposed to these topics and to how each applies to, and can be used in, the business environment. Students will master problem solving using both manual computations and statistical software.

4. LEARNING OUTCOMES

On completion of these modules the student will be able to:

• Appreciate the role of statistics in management decision making.
• Develop an intuitive understanding of the techniques by giving an explanation for each method and interpretation of the solutions.
• Have a general understanding of basic probability concepts.
• Understand the statistical measures which condense and describe the characteristics of raw data.

5. METHOD OF STUDY

The sections that have to be studied are indicated under each topic. These form the basis for tests, assignments and examination. To be able to do the activities and assignments for this module, to achieve the learning outcomes and ultimately to be successful in the tests and examination, you will need an in-depth understanding of the content of these sections in the learning guide and prescribed book. In order to master the learning material, you must accept responsibility for your own studies. Learning is not the same as memorizing. You are expected to show that you understand and are able to apply the information. Use will also be made of lectures, tutorials, case studies and group discussions to present this module.

6. LECTURES AND TUTORIALS

Students must refer to the notice boards on their respective campuses for details of the lecture and tutorial time tables. The lecturer assigned to the module will also inform you of the number of lecture periods and tutorials allocated to a particular module. Prior preparation is required for each lecture and tutorial. Students are encouraged to actively participate in lectures and tutorials in order to ensure success in tests, assignments and examinations.

7. NOTICES

All information pertaining to this module, such as test dates, lecture and tutorial timetables, assignments and examinations, will be displayed on the notice board located on your campus. Students must check the notice board on a daily basis.

Should you require any clarity, please consult your lecturer, or programme manager, or administrator on your respective campus.

8. PRESCRIBED & RECOMMENDED MATERIAL

8.1 Prescribed Material

The prescribed textbook for this module is: Wegner, T. 2016. Applied Business Statistics: Methods and Excel-based Applications. 4th ed. Cape Town: Juta.

Business Statistics 511 takes a well-balanced approach: it not only informs and educates you about the theoretical background required in the business world, but also has a strong practical component. The practical syllabus is closely aligned with the management principles and standards currently employed by many enterprises.

8.2 Recommended Material

Willemse, I. and Nyelisani, P. 2015. Statistical Methods and Calculation Skills. 4th ed. Cape Town: Juta & Company Ltd.

Your lecturer will provide you with a list of additional recommended material as the module progresses.

8.3 Independent Research:

The student is encouraged to undertake independent research with emphasis on the presentation and interpretation of the data collected.

8.4 Library Infrastructure

The following services are available to you:

• Each campus keeps a limited quantity of the recommended reading titles and a larger variety of similar titles which you may borrow. Please note that students are required to purchase the prescribed materials.

• Arrangements have been made with municipal, state and other libraries to stock our recommended reading and similar titles. You may use these on their premises or borrow them if available. It is your responsibility to keep all library books safe.

• RGI has also allocated one library period per week to assist you with your formal research under professional supervision.

• RGI has dedicated electronic libraries for use by its students. The computer laboratories, when not in use for academic purposes, may also be used for research purposes. Booking is essential for all electronic library usage.

9. ASSESSMENT

Final Assessment for this module will comprise two CA tests, an assignment and an examination. Your lecturer will inform you of the dates, times and the venues for each of these. You may also refer to the notice board on your campus or the Academic Calendar which is displayed in all lecture rooms.

9.1 CA Tests

There are two compulsory tests for each module (in each semester).

9.2 Assignment

There is one compulsory assignment for each module in each semester. Your lecturer will inform you of the Assessment questions at the commencement of this module. It is therefore necessary to study on an ongoing basis.

9.3 Examination

There is one two-hour examination for each module. Make sure that you diarize the correct date, time and venue. The examinations FACULTY will notify you of your results once all administrative matters are cleared and fees are paid up.

The examination may consist of multiple choice questions, short questions and essay type questions. This requires you to be thoroughly prepared as all the content matter of lectures, tutorials, all references to the prescribed text and any other additional documentation/reference materials is examinable in both your tests and the examinations.

The examination FACULTY will make available to you the details of the examination (date, time and venue) in due course. You must be seated in the examination room 15 minutes before the commencement of the examination. If you arrive late, you will not be allowed any extra time. Your student registration card must be in your possession at all times.

9.4 Final Assessment

The final assessment for this module will be weighted as follows:

CA Test 1 + CA Test 2 + Assignment = 40% Examination

9.5 Key Concepts in Assignments and Examinations

In assignment and examination questions you will notice certain key concepts (i.e. words/verbs) which tell you what is expected of you. For example, you may be asked in a question to list, describe, illustrate, demonstrate, compare, construct, relate, criticize, recommend or design particular information/aspects/factors/situations. To help you know exactly what these key concepts or verbs mean, and therefore what is expected of you, we present the following taxonomy by Bloom, explaining the concepts and stating the level of cognitive thinking that these refer to.

Competence: Knowledge
Skills demonstrated: observation and recall of information; knowledge of dates, events, places; knowledge of major ideas; mastery of subject matter.
Question cues: list, define, tell, describe, identify, show, label, collect, examine, tabulate, quote, name, who, when, where, etc.

Competence: Comprehension
Skills demonstrated: understanding information; grasp meaning; translate knowledge into new context; interpret facts, compare, contrast; order, group, infer causes; predict consequences.
Question cues: summarize, describe, interpret, contrast, predict, associate, distinguish, estimate, differentiate, discuss, extend.

Competence: Application
Skills demonstrated: use information; use methods, concepts, theories in new situations; solve problems using required skills or knowledge.
Question cues: apply, demonstrate, calculate, complete, illustrate, show, solve, examine, modify, relate, change, classify, experiment, discover.

Competence: Analysis
Skills demonstrated: seeing patterns; organization of parts; recognition of hidden meanings; identification of components.
Question cues: analyze, separate, order, explain, connect, classify, arrange, divide, compare, select, explain, infer.

Competence: Synthesis
Skills demonstrated: use old ideas to create new ones; generalize from given facts; relate knowledge from several areas; predict, draw conclusions.
Question cues: combine, integrate, modify, rearrange, substitute, plan, create, design, invent, what if?, compose, formulate, prepare, generalize, rewrite.

Competence: Evaluation
Skills demonstrated: compare and discriminate between ideas; assess value of theories, presentations; make choices based on reasoned argument; verify value of evidence; recognize subjectivity.
Question cues: assess, decide, rank, grade, test, measure, recommend, convince, select, judge, explain, discriminate, support, conclude, compare, summarize.

10. SPECIMEN ASSIGNMENT COVER SHEET

BUSINESS ADMINISTRATION, MANAGEMENT & COMMERCIAL SCIENCES

BUSINESS STATISTICS 511

ASSIGNMENT COVER SHEET

1 ST SEMESTER ASSIGNMENT

Name & Surname: ______________________________ ICAS No: _________________ Qualification: ______________________ Semester: _____ Module Name: __________________________ Specialization: _____________________

Date Submitted: ___________

QUESTION NUMBER

MARK ALLOCATION

EXAMINER MARKS MODERATOR MARKS

TOTAL

Examiner’s Comments:

Moderator’s Comments:

Signature of Examiner: Signature of Moderator:

The purpose of an assignment is to ensure that the student is able to:

• make informed decisions based on data
• correctly apply a variety of statistical procedures and tests
• know the uses, capabilities and limitations of various statistical procedures
• interpret the results of statistical procedures and tests

Instructions and guidelines for writing assignments

1. Use the correct cover page provided by the institution.

2. All essay type assignments must include the following:

2.1 Table of contents

2.2 Introduction

2.3 Main body with subheadings

2.4 Conclusions and recommendations

2.5 Bibliography

3. The entire assignment must be a minimum of 5 pages long, preferably typed in font size 12.

3.1 The quality of work submitted is more important than the number of assigned pages.

4. Copying is a serious offence which attracts a severe penalty and must be avoided at all costs. If any student transgresses this rule, the lecturer will retain the assignments and ask the affected students to resubmit a new assignment which will be capped at 50%.

5. Use the Harvard referencing method.

ASSESSMENT CRITERIA

When the final mark is calculated the following criteria must be taken into account:

1. READING AND KNOWLEDGE OF SUBJECT MATTER

• Wide reading and comprehensive knowledge in the application of theory.

2. UNDERSTANDING, ANALYSIS AND ARGUMENT

• Complete and perceptive awareness of issues and clear grasp of their wider significance.
• Clear evidence of independent thought and ability to defend a position logically and convincingly.

3. ORGANISATION AND PRESENTATION

• Careful thought given to arrangement and development of material and argument.
• Good English with appropriate referencing and comprehensive bibliography.

ASSIGNMENT GUIDELINES

The purpose of an assignment is to ensure that the student is able to:

• Interpret, convert and evaluate text.
• Have a sound understanding of key fields, viz. principles and theories, rules and concepts, and an awareness of how these relate to cognate areas.
• Solve unfamiliar problems using correct procedures and corrective actions.
• Investigate and critically analyse information and report thereon.
• Present information using Information Technology.
• Present and communicate information reliably and coherently.
• Develop information retrieval skills.
• Use methods of enquiry and research in a disciplined field.

ASSESSMENT CRITERIA

When the final mark is allocated the above criteria must be taken into account:

A. Content and relevance: Has the student answered the question?

B. Research (a minimum of ten sources is recommended): reference books, Internet, newspapers, textbooks.

C. Presentation: introduction, body, conclusion, paragraphs, neatness, integration, grammar and spelling, page numbering, diagrams, tables, graphs, bibliography.

NB: All Assignments are compulsory as they form part of continuous assessment that counts towards the final mark

11. WORK READINESS PROGRAMME (WRP)

In order to prepare students for the world of work, a series of interventions over and above the formal curriculum are implemented concurrently. These include:

• Soft skills
• Employment skills
• Life skills
• End-User Computing (if not included in your curriculum)

The outline below lists some of the key concepts for Work Readiness that will be included in your timetable.

WORK READINESS PROGRAMME

SOFT SKILLS
• Time Management
• Working in Teams
• Problem Solving Skills
• Attitude & Goal Setting
• Etiquette & Ethics
• Communication Skills

LIFE SKILLS
• Manage Personal Finance
• Driving Skills
• Basic Life Support & First Aid
• Entrepreneurial Skills
• Counselling Skills

EMPLOYMENT SKILLS
• CV Writing
• Interview Skills
• Presentation Skills
• Employer / Employee Relationship

END USER COMPUTING
• Email & E-Commerce
• Spreadsheets
• Database
• Presentation
• Office Word

It is in your interest to attend these workshops, complete the Work Readiness Log Book and prepare for the Working World.

12. WORK INTEGRATED LEARNING (WIL)

Work Integrated Learning forms a core component of the curriculum for the completion of this programme. All modules which form part of this qualification will be assessed in an integrated manner towards the end of the programme or after completion of all other modules.

Prerequisites for placement with employers will include:

• Completion of all tests & assignment
• Success in examination
• Payment of all arrear fees
• Return of library books, etc.
• Completion of the Work Readiness Programme.

Students will be fully inducted on the Work Integrated Learning Module, the Workbooks & assessment requirements before placement with employers.

The partners in Work Readiness Programme (WRP) include:

SECTION B LEARNER GUIDE

MODULE: BUSINESS STATISTICS 511, 1st SEMESTER

TOPIC 1: INTRODUCTION TO DESCRIPTIVE STATISTICS
TOPIC 2: DESCRIBING UNIVARIATE DATA
TOPIC 3: CORRELATION AND SIMPLE LINEAR REGRESSION ANALYSIS
TOPIC 4: INTRODUCTION TO PROBABILITY
TOPIC 5: DISCRETE PROBABILITY DISTRIBUTION
TOPIC 6: CONTINUOUS PROBABILITY DISTRIBUTION
ADDENDUM 511 (A): REVISION QUESTIONS
ADDENDUM 511 (B): TYPICAL EXAMINATION QUESTIONS

TOPIC 1: INTRODUCTION TO DESCRIPTIVE STATISTICS (Lecture 8)

1.1 What Is Statistics?
1.2 Descriptive Statistics
1.3 Inferential Statistics
1.6 Summation Notation
1.7 Measurement Scales
Assessment questions

TOPIC 2: DESCRIBING UNIVARIATE DATA (Lectures 9-20)

2.1 Central Tendency
2.7 Semi-Interquartile Range
2.8 Variance
2.9 Standard Deviation
2.10 Shape Of The Distribution
2.11 Skewness
2.12 Kurtosis
2.13 Types Of Graphs
2.16 Assessment questions
2.17 Central Tendency
2.18 Mean
Assessment questions

TOPIC 3: CORRELATION AND SIMPLE LINEAR REGRESSION ANALYSIS (Lectures 21-25)

3.1 Scatter Plots
3.2 Introduction To Pearson's Correlation
3.3 Regression Analysis
Assessment questions

TOPIC 4: INTRODUCTION TO PROBABILITY (Lectures 32-35)

4.1 Simple Probability
4.2 Conditional Probability
4.3 Probability Of A And B
4.4 Probability Of A Or B

TOPIC 5: DISCRETE PROBABILITY DISTRIBUTION (Lectures 36-37)

5.1 Permutations And Combinations
5.2 Binomial Probability Distribution
5.3 The Poisson Distribution
Assessment questions

TOPIC 6: CONTINUOUS PROBABILITY DISTRIBUTION (Lectures 38-41)

6.1 What Is A Normal Distribution
6.2 The Standard Normal Distribution
6.3 Converting To Percentiles And Back
6.4 Area Under Portions Of The Curve
Assessment questions

TOPIC 7: ADDENDUM 511 (A): REVISION QUESTIONS

TOPIC 8: ADDENDUM 511 (B): TYPICAL EXAMINATION QUESTIONS

The following are guide icons that will be used throughout this learner guide:

Icon

Description

Learning Outcomes

Study

Read

Writing Activity

Think Point

Research

Glossary

Key Points

Review Question

Case Study

Bright Idea

Problem(s)

Multimedia Resource

Web Resource

TOPIC 1

1. INTRODUCTION TO DESCRIPTIVE STATISTICS

Learning Outcomes:

• In this topic you will learn about the term ‘statistics’ and its use.
• You will gain knowledge of the two types of statistics, namely descriptive and inferential.
• You will develop an ability to use variables and parameters.
• You will learn about the different measurement scales: nominal, ordinal, interval, and ratio.

1.1 WHAT IS STATISTICS?

The word "statistics" is used in several different senses. In the broadest sense, "statistics" refers to a range of techniques and procedures for analyzing data, interpreting data, displaying data, and making decisions based on data. This is what courses in "statistics" generally cover.

In a second usage, a "statistic" is defined as a numerical quantity (such as the mean) calculated from a sample. Such statistics are used to estimate parameters.

The term "statistics" sometimes refers to calculated quantities regardless of whether or not they are from a sample. For example, one might ask about a baseball player's statistics and be referring to his or her batting average, runs batted in, number of home runs, etc. Or, "government statistics" can refer to any numerical indexes calculated by a governmental agency.

Although the different meanings of “statistics” have the potential for confusion, a careful consideration of the context in which the word is used should make its intended meaning clear.

1.2 DESCRIPTIVE STATISTICS

One important use of statistics is to summarize a collection of data in a clear and understandable way. For example, assume a psychologist gave a personality test measuring shyness to all 2500 students attending a small college. How might these measurements be summarized? There are two basic methods: numerical and graphical. Using the numerical approach one might compute statistics such as the mean and standard deviation. These statistics convey information about the average degree of shyness and the degree to which people differ in shyness. Using the graphical approach one might create a stem and leaf display and a box plot. These plots contain detailed information about the distribution of shyness scores.

Graphical methods are better suited than numerical methods for identifying patterns in the data. Numerical approaches are more precise and objective. Since the numerical and graphical approaches complement each other, it is wise to use both.
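The two approaches can be tried side by side on a small data set. The sketch below is illustrative only: the twelve shyness scores and the helper name `stem_and_leaf` are assumptions made for this example, not data from the study described above. It computes the mean and the sample standard deviation, then prints a simple stem and leaf display of the same scores.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical shyness scores for 12 students (illustrative data only).
scores = [12, 15, 21, 23, 23, 27, 30, 31, 34, 38, 42, 45]


def stem_and_leaf(values):
    """Group each value's units digit (leaf) under its tens digit (stem)."""
    stems = defaultdict(list)
    for value in sorted(values):
        stems[value // 10].append(value % 10)
    return dict(stems)


# Numerical summary.
average = mean(scores)
spread = stdev(scores)  # sample standard deviation (divides by n - 1)

# Graphical summary.
display = stem_and_leaf(scores)

print(f"mean = {average:.2f}, standard deviation = {spread:.2f}")
for stem, leaves in display.items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```

The numerical summary gives two precise figures, while the display shows where the scores cluster; each tells you something the other does not.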

1.3 INFERENTIAL STATISTICS

Inferential statistics are used to draw inferences about a population from a sample. Consider an experiment in which 10 subjects who performed a task after 24 hours of sleep deprivation scored 12 points lower than 10 subjects who performed after a normal night's sleep. Is the difference real or could it be due to chance? How much larger could the real difference be than the 12 points found in the sample? These are the types of questions answered by inferential statistics.

There are two main methods used in inferential statistics: estimation and hypothesis testing. In estimation, the sample is used to estimate a parameter and a confidence interval about the estimate is constructed.

In the most common use of hypothesis testing, a "straw man" null hypothesis is put forward and it is determined whether the data are strong enough to reject it. For the sleep deprivation study, the null hypothesis would be that sleep deprivation has no effect on performance.

(Population: A population consists of an entire set of objects, observations, or scores that have something in common. For example, a population might be defined as all males between the ages of 15 and 18.

Some populations are only hypothetical. Consider an experimenter interested in the possible effectiveness of a new method of teaching reading. He or she might define a population as the reading achievement scores that would result if all six-year-olds in the US were taught with this new method. The population is hypothetical in the sense that there does not exist a group of students who have been taught using the new method; the population consists of the scores that would be obtained if they were taught with this method.

The distribution of a population can be described by several parameters such as the mean and standard deviation. Estimates of these parameters taken from a sample are called statistics.

Sample: A sample is a subset of a population. Since it is usually impractical to test every member of a population, a sample from the population is typically the best approach available.)
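The difference between a parameter and a statistic can be seen in a short simulation. This is a minimal sketch under stated assumptions: the population of 1,000 reading scores is generated artificially (centred on 60 with a spread of 10), and the seed 511 and the sample size of 50 are arbitrary choices for illustration.

```python
import random
from statistics import mean

# Hypothetical population: reading scores for 1,000 learners (made-up data).
rng = random.Random(511)  # fixed seed so that the run is repeatable
population = [rng.gauss(60, 10) for _ in range(1000)]

# Parameter: computed from every member of the population.
population_mean = mean(population)

# Statistic: computed from a random sample of 50 members.
sample = rng.sample(population, 50)
sample_mean = mean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

The sample mean will usually be close to, but not exactly equal to, the population mean. Inferential statistics deals with how large that gap is likely to be.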

1.4 VARIABLES

A variable is any measured characteristic or attribute that differs for different subjects. For example, if the weight of 30 subjects were measured, then weight would be a variable.

Quantitative and Qualitative

Variables can be quantitative or qualitative. Qualitative variables are sometimes called "categorical variables". Quantitative variables are measured on an ordinal, interval, or ratio scale; qualitative variables are measured on a nominal scale. If five-year-old subjects were asked to name their favourite colour, then the variable would be qualitative. If the time it took them to respond were measured, then the variable would be quantitative.

Independent and Dependent variable

When an experiment is conducted, some variables are manipulated by the experimenter and others are measured from the subjects. The former variables are called "independent variables" or "factors"; the latter are called "dependent variables" or "dependent measures." For example, consider a hypothetical experiment on the effect of drinking alcohol on reaction time: Subjects drank water, one beer, three beers, or six beers and then had their reaction times to the onset of a stimulus measured. The independent variable would be the number of beers drunk (0, 1, 3, or 6) and the dependent variable would be reaction time.

Continuous and Discrete variable

Some variables (such as reaction time) are measured on a continuous scale. There are an infinite number of possible values these variables can take on.

Other variables can only take on a limited number of values. For example, if a dependent variable were a subject's rating on a five-point scale where only the values 1, 2, 3, 4, and 5 were allowed, then only five possible values could occur. Such variables are called "discrete" variables.

Nominal: Nominal measurement consists of assigning items to groups or categories. No quantitative information is conveyed and no ordering of the items is implied. Nominal scales are therefore qualitative rather than quantitative. Religious preference, race, and sex are all examples of nominal scales. Frequency distributions are usually used to analyze data measured on a nominal scale. The main statistic computed is the mode. Variables measured on a nominal scale are often referred to as categorical or qualitative variables.

Ordinal: Measurements with ordinal scales are ordered in the sense that higher numbers represent higher values. However, the intervals between the numbers are not necessarily equal. For example, on a five-point rating scale measuring attitudes toward gun control, the difference between a rating of 2 and a rating of 3 may not represent the same difference as the difference between a rating of 4 and a rating of 5. There is no "true" zero point for ordinal scales since the zero point is chosen arbitrarily. The lowest point on the rating scale in the example was arbitrarily chosen to be 1. It could just as well have been 0 or -5.

Interval: On interval measurement scales, one unit on the scale represents the same magnitude on the trait or characteristic being measured across the whole range of the scale. For example, if anxiety were measured on an interval scale, then a difference between a score of 10 and a score of 11 would represent the same difference in anxiety, as would a difference between a score of 50 and a score of 51. Interval scales do not have a "true" zero point, however, and therefore it is not possible to make statements about how many times higher one score is than another. For the anxiety scale, it would not be valid to say that a person with a score of 30 was twice as anxious as a person with a score of 15.

Ratio: Ratio scales are like interval scales except they have true zero points. A good example is the Kelvin scale of temperature. This scale has an absolute zero. Thus, a temperature of 300 Kelvin is twice as high as a temperature of 150 Kelvin.
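The contrast between a ratio scale and an interval scale can be checked with the temperatures quoted above. The sketch below converts the two Kelvin temperatures to degrees Celsius, which is used here as an assumed example of an interval scale (its zero point is a convention, not an absolute zero). The function name `celsius_from_kelvin` is introduced only for this illustration.

```python
def celsius_from_kelvin(kelvin):
    """Convert a temperature in Kelvin to degrees Celsius."""
    return kelvin - 273.15


hot_k, cold_k = 300.0, 150.0
hot_c = celsius_from_kelvin(hot_k)
cold_c = celsius_from_kelvin(cold_k)

# Ratios are meaningful on the Kelvin scale because it has a true zero.
ratio_kelvin = hot_k / cold_k

# The same two temperatures give a different, meaningless ratio in Celsius.
ratio_celsius = hot_c / cold_c

# Differences, by contrast, agree on both scales.
diff_kelvin = hot_k - cold_k
diff_celsius = hot_c - cold_c

print(f"Kelvin ratio:  {ratio_kelvin:.3f}")
print(f"Celsius ratio: {ratio_celsius:.3f}")
print(f"Differences:   {diff_kelvin:.2f} K and {diff_celsius:.2f} degrees C")
```

Only the Kelvin ratio supports the statement "twice as high"; the differences are equally valid on either scale, which is exactly what an interval scale guarantees.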

1.5 PARAMETERS

A parameter is a numerical quantity measuring some aspect of a population of scores. For example, the mean is a measure of central tendency.

Greek letters are used to designate parameters. Listed below are several parameters of great importance in statistical analyses together with the Greek symbol that represents each one. Parameters are rarely known and are usually estimated by statistics computed in samples. To the right of each Greek symbol is the symbol for the associated statistic used to estimate it from a sample.

Quantity: Parameter, Statistic
Standard deviation: σ, s
Proportion: π, p
Correlation: ρ, r

Central tendency: Measures of central tendency are measures of the location of the middle or the centre of a distribution. The definition of "middle" or "centre" is purposely left somewhat vague so that the term "central tendency" can refer to a wide variety of measures. The mean is the most commonly used measure of central tendency. The following measures of central tendency are discussed in this text:

• Mean
• Median
• Mode

1.6 SUMMATION NOTATION

The Greek letter Σ (a capital sigma) is used to designate summation. For example, suppose an experimenter measured the performance of four subjects on a memory task. Subject 1's score will be referred to as $X_1$, Subject 2's as $X_2$, and so on.

The scores are shown below:

Subject   Score
1         $X_1 = 7$
2         $X_2 = 6$
3         $X_3 = 5$
4         $X_4 = 8$

The way to use the summation sign to indicate the sum of all four X's is:

$$\sum_{i=1}^{4} X_i$$

This notation is read as follows: Sum the values of X from $X_1$ through $X_4$. The index i (shown just under the Σ sign) indicates which values of X are to be summed. The index i takes on values beginning with the value to the right of the "=" sign (1 in this case) and continues sequentially until it reaches the value above the Σ sign (4 in this case). Therefore i takes on the values 1, 2, 3, and 4, and the values of $X_1$, $X_2$, $X_3$, and $X_4$ are summed (7 + 6 + 5 + 8 = 26).

In order to make formulas more general, variables can be used with the summation notation. For example,

$$\sum_{i=1}^{N} X_i$$

means to sum up values of X from 1 to N where N can be any number but usually indicates the sample size.

Often an abbreviated form of the summation notation is used. For example, ΣX means to sum all the values of X. When only a subset of the values of X is to be summed then the full version is required. Thus, the sum of all elements of X except the first and the last (the N'th) would be indicated as:

$$\sum_{i=2}^{N-1} X_i$$

which would be read as the sum of X with i going from 2 to N-1.

Some formulas require that each number be squared before the numbers are summed. This is indicated by:

and is equal to 7 2 +6 2 +5 2 +8 2 = 174.

The abbreviated version is simply ΣX². It is very important to note that it makes a big difference whether the numbers are squared first and then summed, or summed first and then squared. The symbol (ΣX)² indicates that the numbers should be summed first and then squared. For the present example, this equals:

(7 + 6 + 5 + 8)² = 26² = 676. This, of course, is quite different from 174.
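The difference between these sums can be checked with a few lines of Python. This is only an illustrative sketch using the four scores from this section (7, 6, 5, 8); the variable names are chosen for the example and are not part of the guide.

```python
# Scores of the four subjects from the example in this section.
scores = [7, 6, 5, 8]

sum_x = sum(scores)                           # ΣX: add the scores
sum_x_squared = sum(x ** 2 for x in scores)   # ΣX²: square each score, then add
square_of_sum = sum(scores) ** 2              # (ΣX)²: add the scores, then square

# Sum of all elements except the first and the last (i from 2 to N-1).
inner_sum = sum(scores[1:-1])

print(sum_x, sum_x_squared, square_of_sum, inner_sum)
```

Note that Python lists are indexed from 0, so the slice `scores[1:-1]` corresponds to i running from 2 to N − 1 in the summation notation.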

Sometimes a formula requires that the sum of cross products be computed. For instance, if 3 subjects were each tested twice, they might each have a score on X and on Y.

Subject   X   Y
1         2   3
2         1   6
3         4   5

The sum of cross products, (2 × 3) + (1 × 6) + (4 × 5) = 32, can be represented in summation notation simply as ΣXY.
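As a small illustration, the sum of cross products can be computed by pairing each subject's X score with the same subject's Y score. The sketch below uses the values that appear in the products above; the list names are assumptions made for the example.

```python
# X and Y scores for the three subjects, read off from the cross products above.
x_scores = [2, 1, 4]
y_scores = [3, 6, 5]

# ΣXY: multiply each subject's X by that subject's Y, then add the products.
sum_xy = sum(x * y for x, y in zip(x_scores, y_scores))

print(sum_xy)
```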

Basic Theorems

The following data will be used to illustrate the theorems:

X   Y
3   8
2   3
4   1

Σ(X + Y) = ΣX + ΣY

Σ(X + Y) = 11 + 5 + 5 = 21
ΣX = 3 + 2 + 4 = 9
ΣY = 8 + 3 + 1 = 12
ΣX + ΣY = 9 + 12 = 21

ΣaX = aΣX (a is a constant). For example, let a = 2.

ΣaX = (2)(3) + (2)(2) + (2)(4) = 18
aΣX = (2)(9) = 18

Σ(X − M)² = ΣX² − (ΣX)²/N (N is the sample size, 3 in this case, and M is the mean, which is also equal to 3 in this case).

Σ(X − M)² = (3 − 3)² + (2 − 3)² + (4 − 3)² = 2
ΣX² = 3² + 2² + 4² = 29
(ΣX)²/N = 9²/3 = 27
ΣX² − (ΣX)²/N = 29 − 27 = 2
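The three theorems can be verified numerically. The following is a minimal sketch using the data from this section (X = 3, 2, 4 and Y = 8, 3, 1, with a = 2); the variable names are illustrative only.

```python
import math

x = [3, 2, 4]
y = [8, 3, 1]
a = 2
n = len(x)
mean_x = sum(x) / n  # M, the mean of X

# Theorem: Σ(X + Y) = ΣX + ΣY
sum_of_pairs = sum(xi + yi for xi, yi in zip(x, y))
sum_x_plus_sum_y = sum(x) + sum(y)

# Theorem: ΣaX = aΣX
sum_ax = sum(a * xi for xi in x)
a_times_sum_x = a * sum(x)

# Theorem: Σ(X - M)² = ΣX² - (ΣX)²/N
sum_sq_dev = sum((xi - mean_x) ** 2 for xi in x)
shortcut = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n

print(sum_of_pairs, sum_x_plus_sum_y)
print(sum_ax, a_times_sum_x)
print(sum_sq_dev, shortcut)
```

The last identity is useful in practice because the right-hand side needs only ΣX and ΣX², without first subtracting the mean from every score.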

1.7 MEASUREMENT SCALES

Measurement is the assignment of numbers to objects or events in a systematic fashion. Four levels of measurement scales are commonly distinguished: nominal, ordinal, interval, and ratio.

There is a relationship between the level of measurement and the appropriateness of various statistical procedures.

For example, it would be silly to compute the mean of nominal measurements. However, the appropriateness of statistical analyses involving means for ordinal level data has been controversial. One position is that data must be measured on an interval or a ratio scale for the computation of means and other statistics to be valid. Therefore, if data are measured on an ordinal scale, the median but not the mean can serve as a measure of central tendency.

The arguments on both sides of this issue will be examined in the context of a hypothetical experiment designed to determine whether people prefer to work with colour or with black and white computer displays. Twenty subjects viewed black and white displays and 20 subjects viewed colour displays.

Displays were rated on a 7 point scale where a 1 was the lowest rating and a 7 was the highest rating. This rating scale is only an ordinal scale since there is no assurance that the difference between a rating of 1 and a rating of 2 represents the same degree of difference in preference as the difference between a rating of 5 and a rating of 6.

The mean rating of the colour display was 5.5 and the mean rating of the black and white display was 3.9. The first question the experimenter would ask is how likely it is that this big a difference between means could have occurred just because of chance factors such as which subjects saw the black and white display and which subjects saw the colour display. Standard methods of statistical inference can answer this question. Assume these methods led to the conclusion that the difference was not due to chance but represented a "real" difference in means. Does the fact that the rating scale was ordinal instead of interval have any implications for the validity of the statistical conclusion that the difference between means was not due to chance?

The answer is an unequivocal "NO." There is really no room for argument here. What can be questioned, however, is whether it is worth knowing that the mean rating of colour displays is higher than the mean rating of black and white displays.

The argument that it is not worth knowing assumes that means of ordinal data are meaningless. Supporting the notion that means of ordinal data are meaningless is the fact that examples (see below) can be made up showing that a difference between means on an ordinal scale can be in the opposite direction of what they would have been if the "true" measurement scale had been used.

If means of ordinal data are meaningless, why should anyone care whether the difference between two meaningless quantities (the two means) is due to chance or not? Naturally enough, the answer lies in challenging the proposition that means of ordinal data are meaningless. There are two counter arguments to the example showing that using an ordinal scale can reverse the direction of the difference between means.

The first is philosophical and challenges the validity of the notion that there is some unseen "true" measurement scale that is only being approximated by the rating scale. The second counter argument accepts the notion of an underlying scale but considers the examples to be very contrived and unlikely to occur in real data. Measurement scales used in behavioral research are invariably somewhere between ordinal and interval scales.

There are some cases where one can validly argue that the use of an ordinal instead of a ratio scale seriously distorts the conclusions. Consider an experiment designed to determine whether 5-year old children are more distractible than 10-year old children.

Children of both ages perform a memory task once with and once without distraction. The means are given below:

Age group       Distraction   No Distraction
5-year-olds     3             6
10-year-olds    8             12

It looks as though the 10-year olds are more distractible since distraction cost them 4 points but only cost the 5-year olds 3 points. However, it might be that a change from 3 to 6 represents a larger difference than a change from 8 to 12. Consider that the performance of 5-year olds dropped 50% from distraction but the performance of 10-year olds dropped only 33%.

Which age group is "really" more distractible? Unfortunately, there is no clearly right or wrong answer. If proportional change is considered, then 5-year olds are more distractible; if the amount of change is considered then 10-year olds are more distractible. Keep in mind that statistical conclusions are not affected by the choice of measurement scale even though the all-important interpretation of these conclusions can be.
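The two ways of reading the same means can be made explicit with a short calculation. The sketch below uses the means quoted in the text (3 and 6 for the 5-year-olds, 8 and 12 for the 10-year-olds); the function names are hypothetical and chosen only for this illustration.

```python
# Mean memory scores quoted in the text for each age group.
mean_scores = {
    "5-year-olds": {"no_distraction": 6, "distraction": 3},
    "10-year-olds": {"no_distraction": 12, "distraction": 8},
}

def absolute_drop(scores):
    """Points lost when distraction is introduced."""
    return scores["no_distraction"] - scores["distraction"]

def proportional_drop(scores):
    """Points lost as a fraction of the no-distraction score."""
    return absolute_drop(scores) / scores["no_distraction"]

for group, scores in mean_scores.items():
    print(group, absolute_drop(scores), f"{proportional_drop(scores):.0%}")
```

Both calculations are arithmetically correct; which one answers the research question depends on how the measurement scale is interpreted, not on the arithmetic.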

In this example, a statistical test could validly rule out chance as an explanation of the finding that 10-year olds lost more points from distraction than did 5-year olds. However, the statistical test will not reveal whether a greater drop necessarily means 10-year olds are more distractible. So the conclusion that distraction costs 10-year olds more points than it costs 5-year olds is valid. The interpretation depends on measurement issues.

In summary, statistical analyses provide conclusions about the numbers entered into them. Relating these conclusions to the substantive research issues depends on the measurement operations.

Examples: Assume there were a "true" measurement scale for job satisfaction and that it maps onto a 7-point rating scale as follows:

"True" scale   7-point scale
1-5            1
6-40           2
41-42          3
43-75          4
76-90          5
91-94          6
95-100         7

Thus if someone's "true" job satisfaction were 55 he or she would have a rated score of 4. Now consider the following two sets of job satisfaction scores:

[Table: True Scale scores and Ratings for Group A and Group B]

On the "true" scale, the mean for Group B is 61.8, which is much higher than the mean of 48.2 for Group A. However, on the 7-point rating scale, the mean for B is only 3.8, which is lower than the mean for A of 4.2.

Problems:

1. A teacher wishes to know whether the males in his/her class have more favorable attitudes toward gun control than do the females. All students in the class are given a questionnaire about gun control and the mean responses of the males and the females are compared. Is this an example of descriptive or inferential statistics?

2. A medical researcher is testing the effectiveness of a new drug for treating Parkinson's disease. Ten subjects with the disease are given the new drug and 10 are given a placebo. Improvement in symptomology is measured. What would be the roles of descriptive and inferential statistics in the analysis of these data?

3. What are the advantages and disadvantages of graphical as opposed to numerical approaches to descriptive statistics?

4. Distinguish between random and stratified sampling.

5. A study is conducted to determine whether people learn better with spaced or massed practice. Subjects volunteer from an introductory psychology class. The first 10 subjects who volunteer are assigned to the massed-practice condition; the next 10 are assigned to the spaced-practice condition. Discuss the consequences and seriousness of each of the following two kinds of non-random sampling: (1) Subjects are not randomly sampled from some specified population and (2) subjects are not randomly assigned to conditions. In general, which type of non-random sampling is more serious?

6. Define independent and dependent variables.

7. Categorize the following variables as being qualitative or quantitative:
- Response time
- Rating of job satisfaction
- Favorite color
- Occupation aspired to
- Number of words remembered

8. Specify the level of measurement used for the items in Question 7.

9. Categorize the variables in Question 7 as being continuous or discrete.

10. Are Greek letters used for statistics or for parameters?

11. When would the mean score of a class on a final exam be considered a statistic? When would it be considered a parameter?

12. An experiment is conducted to examine the effect of punishment on learning speed in rats. What are the independent and dependent variables?

13. For the numbers 1, 2, 4, 8, compute ΣX, ΣX², and (ΣX)².

14. ΣX = 7 and ΣX² = 21. A new variable Y is created by multiplying each X by 3. What are ΣY and ΣY² equal to?

For additional reading on this topic, students should refer to the recommended textbook for Business Statistics: Applied Business Statistics: Methods and Excel-Based Applications (3rd edition) by Trevor Wegner (page 63).

TOPIC 2

2. DESCRIBING UNIVARIATE DATA

Learning Outcomes:

• In this topic you will learn what central tendency is, as well as its measures.
• You will gain knowledge about the shape of distributions, graphs, ranges, etc.

2.1 CENTRAL TENDENCY

Measures of central tendency are measures of the location of the middle or the centre of a distribution. The definition of "middle" or "centre" is purposely left somewhat vague so that the term "central tendency" can refer to a wide variety of measures. The mean is the most commonly used measure of central tendency. The following measures of central tendency are discussed in this text:

• Mean
• Median
• Mode

2.2 MEAN

Arithmetic Mean

The arithmetic mean is what is commonly called the average. When the word "mean" is used without a modifier, it can be assumed that it refers to the arithmetic mean. The mean is the sum of all the scores divided by the number of scores. The formula in summation notation is μ = ΣX/N, where μ is the population mean and N is the number of scores. If the scores are from a sample, then the symbol M refers to the mean and n refers to the sample size. The formula for M is the same as the formula for μ.

The mean is a good measure of central tendency for roughly symmetric distributions but can be misleading in skewed distributions since it can be greatly influenced by extreme scores. Therefore, other statistics such as the median may be more informative for distributions such as reaction time or family income that are frequently very skewed.

The sum of squared deviations of scores from their mean is lower than the sum of their squared deviations from any other number.

For normal distributions, the mean is the most efficient and therefore the least subject to sample fluctuations of all measures of central tendency.

The formal definition of the arithmetic mean is µ = E[X] where μ is the population mean of the variable X and E[X] is the expected value of X.
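The least-squares property of the mean stated above can be illustrated numerically. The sketch below assumes the scores 1, 2, 3, and 10 (used elsewhere in this topic) and a handful of arbitrarily chosen comparison values; it demonstrates the property for these values rather than proving it.

```python
scores = [1, 2, 3, 10]
mean = sum(scores) / len(scores)  # M = ΣX / n

def sum_squared_deviations(values, centre):
    """Sum of squared deviations of the values from a chosen centre."""
    return sum((v - centre) ** 2 for v in values)

ssd_at_mean = sum_squared_deviations(scores, mean)

# Arbitrary alternative centres, chosen only for illustration.
candidates = [2.5, 3, 3.9, 4.1, 5]
ssd_elsewhere = {c: sum_squared_deviations(scores, c) for c in candidates}

print(mean, ssd_at_mean)
print(ssd_elsewhere)
```

Every alternative centre gives a larger sum of squared deviations than the mean does, including the values 3.9 and 4.1 that lie very close to it.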

Geometric Mean

The geometric mean is the nth root of the product of the scores. Thus, the geometric mean of the scores: 1, 2, 3, and 10 is the fourth root of 1 x 2 x 3 x 10 which is the fourth root of 60 which equals 2.78.

The formula can be written as:

$$GM = \sqrt[n]{x_1 \times x_2 \times x_3 \times \cdots \times x_n} = \left(\prod X\right)^{1/n}$$

where ΠX means to take the product of all the values of X and n is the number of scores.

The geometric mean can also be computed by:

1. Taking the logarithm of each number

2. Computing the arithmetic mean of the logarithms

3. Raising the base used to take the logarithms to the arithmetic mean.

The table below illustrates this method using natural logarithms.

X      Ln(X)
1      0.000000
2      0.693147
3      1.098612
10     2.302585

Arithmetic mean of Ln(X) = 1.024
Geometric mean = EXP[1.024] = 2.78

The base of natural logarithms is 2.718. The expression EXP[1.024] means that 2.718 is raised to the power 1.024. Ln(X) is the natural log of X.

Naturally, you get the same result using logs to base 10, as shown below.

X      Log(X)
1      0.00000
2      0.30103
3      0.47712
10     1.00000

Arithmetic mean of Log(X) = 0.44454
Geometric mean = 10 raised to the power 0.44454 = 2.78
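The three routes to the geometric mean (the nth root of the product, natural logarithms, and base-10 logarithms) can be compared directly. This is a minimal sketch for the scores 1, 2, 3, and 10, using only Python's standard `math` module; the variable names are illustrative.

```python
import math

scores = [1, 2, 3, 10]
n = len(scores)

# Method 1: nth root of the product of the scores.
gm_root = math.prod(scores) ** (1 / n)

# Method 2: mean of the natural logs, then raise e to that mean.
mean_ln = sum(math.log(x) for x in scores) / n
gm_ln = math.exp(mean_ln)

# Method 3: mean of the base-10 logs, then raise 10 to that mean.
mean_log10 = sum(math.log10(x) for x in scores) / n
gm_log10 = 10 ** mean_log10

print(round(gm_root, 2), round(gm_ln, 2), round(gm_log10, 2))
print(round(mean_ln, 3), round(mean_log10, 5))
```

The logarithm route is often preferred for long lists of numbers because a running product can become too large or too small to represent accurately, whereas a sum of logarithms stays manageable.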

If any one of the scores is zero then the geometric mean is zero. The geometric mean does not make sense if any scores are less than zero. The geometric mean is less affected by extreme values than is the arithmetic mean and is useful as a measure of central tendency for some positively skewed distributions.

The geometric mean is an appropriate measure to use for averaging rates. For example, consider a stock portfolio that began with a value of $1,000 and had annual returns of 13%, 22%, 12%, -5%, and -13%. The table below shows the value after each of the five years.

Year   Return   Value
1      13%      $1,130
2      22%      $1,379
3      12%      $1,544
4      -5%      $1,467
5      -13%     $1,276

The question is how to compute the average annual rate of return. The answer is to compute the geometric mean of the returns. Instead of using the percentages, each return is represented as a multiplier indicating how much higher the value is after the year. This multiplier is 1.13 for a 13% return and 0.95 for a 5% loss. The multipliers for this example are 1.13, 1.22, 1.12, 0.95, and 0.87. The geometric mean of these multipliers is 1.05. Therefore, the average annual rate of return is 5%. The following table shows how a portfolio gaining 5% a year would end up with the same value ($1,276) as the one shown above.

Year   Return   Value
1      5%       $1,050
2      5%       $1,103
3      5%       $1,158
4      5%       $1,216
5      5%       $1,276
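The portfolio calculation can be reproduced step by step. The sketch below assumes the starting value of $1,000 and the five annual returns given in the text; the variable names are chosen for this example only.

```python
import math

start_value = 1000.0
annual_returns = [0.13, 0.22, 0.12, -0.05, -0.13]

# Convert each return into a multiplier (e.g. 13% -> 1.13, -5% -> 0.95).
multipliers = [1 + r for r in annual_returns]

# Track the portfolio value at the end of each year.
value = start_value
year_end_values = []
for m in multipliers:
    value *= m
    year_end_values.append(value)

# Geometric mean of the multipliers gives the average growth factor per year.
geometric_mean = math.prod(multipliers) ** (1 / len(multipliers))
average_annual_return = geometric_mean - 1

# A portfolio growing at the average rate every year reaches the same end value.
equivalent_end_value = start_value * geometric_mean ** len(multipliers)

print([round(v) for v in year_end_values])
print(round(geometric_mean, 2), round(equivalent_end_value))
```

Using the arithmetic mean of the five returns instead (5.8%) would overstate the growth, which is why the geometric mean is the appropriate average here.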

Harmonic Mean

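The harmonic mean is used to take the mean of sample sizes. If there are k samples with sizes n₁, n₂, ..., nₖ, then the harmonic mean is defined as:

$$H = \frac{k}{\frac{1}{n_1} + \frac{1}{n_2} + \cdots + \frac{1}{n_k}}$$

For the numbers 1, 2, 3, and 10, the harmonic mean is:

$$H = \frac{4}{\frac{1}{1} + \frac{1}{2} + \frac{1}{3} + \frac{1}{10}} = 2.069$$

This is less than the geometric mean of 2.78 and the arithmetic mean of 4.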
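The ordering of the three means for the numbers 1, 2, 3, and 10 can be confirmed with a short calculation. This is an illustrative sketch; the helper function `harmonic_mean` is written out here for clarity, and Python's standard `statistics.harmonic_mean` is used only as a cross-check.

```python
import math
import statistics

values = [1, 2, 3, 10]

def harmonic_mean(numbers):
    """Number of values divided by the sum of their reciprocals."""
    return len(numbers) / sum(1 / x for x in numbers)

h_mean = harmonic_mean(values)
g_mean = math.prod(values) ** (1 / len(values))
a_mean = sum(values) / len(values)

# Cross-check against the standard library implementation.
library_h_mean = statistics.harmonic_mean(values)

print(round(h_mean, 3), round(g_mean, 2), a_mean)
```

The helper assumes every value is positive; a value of zero would cause a division by zero.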

Sampling fluctuation: Sampling fluctuation refers to the extent to which a statistic takes on different values with different samples. That is, it refers to how much the statistic's value fluctuates from sample to sample.

A statistic whose value fluctuates greatly from sample to sample is highly subject to sampling fluctuation.

2.3 MEDIAN

The median is the middle of a distribution: half the scores are above the median and half are below the median. The median is less sensitive to extreme scores than the mean and this makes it a better measure than the mean for highly skewed distributions. The median income is usually more informative than the mean income, for example.

The sum of the absolute deviations of each number from the median is lower than is the sum of absolute deviations from any other number.

The mean, median, and mode are equal in symmetric distributions that have a single mode. The mean is higher than the median in positively skewed distributions and lower than the median in negatively skewed distributions.

Computation of Median

When there is an odd number of numbers, the median is simply the middle number. For example, the median of 2, 4, and 7 is 4. Remember to sort the data values in ascending order first, and then find the median.

When there is an even number of numbers, the median is the mean of the two middle numbers. Thus, the median of the numbers 2, 4, 7, 12 is (4+7)/2 = 5.5.
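The two rules for computing the median can be combined into one small function. The sketch below is illustrative only (the function name `median` is chosen for this example; Python's standard `statistics.median` does the same job) and sorts the data first, as the text recommends.

```python
def median(values):
    """Return the median of a non-empty list of numbers."""
    if not values:
        raise ValueError("median requires at least one value")
    ordered = sorted(values)          # always sort in ascending order first
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]        # odd count: the middle number
    # even count: mean of the two middle numbers
    return (ordered[middle - 1] + ordered[middle]) / 2

print(median([7, 2, 4]))
print(median([12, 4, 2, 7]))
```

The inputs are deliberately given out of order to show that the sorting step is what makes the "middle" position meaningful.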

2.4 MODE

The mode is the most frequently occurring score in a distribution and is used as a measure of central tendency. The advantage of the mode as a measure of central tendency is that its meaning is obvious. Further, it is the only measure of central tendency that can be used with nominal data.

The mode is greatly subject to sample fluctuations and is therefore not recommended to be used as the only measure of central tendency. A further disadvantage of the mode is that many distributions have more than one mode. These distributions are called "multimodal."
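Because a distribution can have more than one mode, a routine for finding the mode should be able to return several values. The sketch below uses made-up data (the colour list and the scores are hypothetical, chosen only for illustration) and also shows that the mode works with nominal data.

```python
from collections import Counter

def modes(values):
    """Return every value that occurs with the highest frequency."""
    counts = Counter(values)
    highest = max(counts.values())
    return [value for value, count in counts.items() if count == highest]

# Hypothetical nominal data: two colours tie for most frequent (multimodal).
favourite_colours = ["red", "blue", "red", "green", "blue"]

# Hypothetical numeric scores with a single mode.
scores = [2, 4, 4, 7]

print(modes(favourite_colours))
print(modes(scores))
```

The function assumes a non-empty list; with no data there is no mode to report.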

In a normal distribution, the mean, median, and mode are identical.

Summary: