Statistics 2

(1)

STATISTICS 2

Hotniar Siringoringo

Lembaga Penelitian (Research Institute)

Campus D, Building 4, 1st floor

http://staffsite.gunadarma.ac.id/hotniars

[email protected]

(2)

Statistics 1

• Role of Statistics in Data Analysis

• Statistical Terms

• Frequency Distribution

• Central Tendency and Measures of Variation

• Probability and Random Variables

• Probability Distribution

(3)

Statistics 2

• Sampling Distribution

• Confidence Interval

• Hypothesis Testing

• Statistical Inference Based on Two Samples

• Simple Linear Regression

• Multiple Regression

(4)

Books

1. Bowerman, Bruce and O’Connell, Richard T. 1997. Applied Statistics: Improving Business Processes. Irwin Professional Publishing, USA

2. Walpole, Ronald E. 2011. Probability & Statistics for Engineers and Scientists. Prentice Hall.

(5)

SAMPLING

(6)

Population and Sample

• Population: the totality of all objects or individuals with specific, clear, and complete characteristics that are to be studied.

• Sample: a part of the population, taken by a defined procedure, that also has specific, clear, and complete characteristics and is regarded as able to represent the population.

(7)

Symbols for Parameters and Statistics

Quantity                  Parameter (population)   Statistic (sample)
Mean                      μ                        x̄
Variance                  σ²                       s²
Standard deviation        σ                        s
Number of observations    N                        n
Proportion                P                        p

(8)

• A sampling distribution is the theoretical (probability) distribution of a statistic (a sample characteristic) over all possible samples of a fixed size n, which is then generalized to the population.

• The sampling distribution makes it possible to estimate the probability of a particular sample result for that statistic.

• It acts as a bridge: through the sampling distribution, the characteristics of the population can be inferred.

(9)

Sampling Distribution

• The distribution of statistics such as the mean, standard deviation, or proportion that can arise from samples.

• In general, the information needed to characterize a distribution adequately includes:

– Measures of central tendency (mean, median, mode)

– Measures of dispersion (range, standard deviation)

– The shape of the distribution

• The general strategy of inferential statistics is to move from the sample to the population through the sampling distribution.

(10)

Types of Sampling Distribution

1. Sampling distribution of the mean
2. Sampling distribution of the proportion
3. Other sampling distributions

• Sampling distribution of the mean: the distribution of the arithmetic means of all possible random samples of size n drawn from a population.

(11)

• Sampling distribution of the proportion: the distribution of the proportions of all possible random samples of size n drawn from a population.

• Sampling distribution of differences/sums:

– There are two populations.

– For every sample of size $n_1$ from the first population, a statistic $S_1$ is computed. This yields a sampling distribution of $S_1$ with mean $\mu_{S_1}$ and standard deviation $\sigma_{S_1}$.

– For every sample of size $n_2$ from the second population, a statistic $S_2$ is computed. This yields a sampling distribution of $S_2$ with mean $\mu_{S_2}$ and standard deviation $\sigma_{S_2}$.

(12)

Sampling Distribution of the Mean

a. Sampling from a finite population

1. For sampling without replacement, or n/N > 5%:

$\mu_{\bar{x}} = \mu, \qquad \sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}} \sqrt{\dfrac{N-n}{N-1}}$

2. For sampling with replacement, or n/N ≤ 5%:

$\mu_{\bar{x}} = \mu, \qquad \sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$

(13)

A shop has 5 employees, A, B, C, D, and E, with hourly wages of 2, 3, 3, 4, and 5. Treating these wages as the population and sampling without replacement, determine:

a. The means of all samples of 2 elements

b. The mean of the sample means

c. The standard deviation of the sample means

The number of possible samples is

$C_2^5 = \dfrac{5!}{2!\,(5-2)!} = 10$

(14)

b. Mean of the sample means: $\mu_{\bar{x}} = \mu = \dfrac{2+3+3+4+5}{5} = 3.4$

c. Standard deviation of the sample means:

$\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}} \sqrt{\dfrac{N-n}{N-1}} = \dfrac{1.02}{\sqrt{2}} \sqrt{\dfrac{5-2}{5-1}} = 0.62$
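The result can be checked by brute force. The sketch below enumerates every sample of size 2 from the five wages and compares the standard deviation of the sample means with the finite-population formula. The variable names are illustrative and not part of the slides.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, pstdev

wages = [2, 3, 3, 4, 5]   # hourly wages of employees A..E, treated as the population
n = 2                     # sample size, drawn without replacement
N = len(wages)

# Every possible sample of size 2 and its mean (C(5, 2) = 10 samples)
sample_means = [mean(s) for s in combinations(wages, n)]

mu = mean(wages)          # population mean
sigma = pstdev(wages)     # population standard deviation (divides by N)

mean_of_means = mean(sample_means)
sd_of_means = pstdev(sample_means)

# Formula with the finite population correction factor
sd_formula = sigma / sqrt(n) * sqrt((N - n) / (N - 1))

print(len(sample_means), mean_of_means, round(sd_of_means, 4), round(sd_formula, 4))
```

The enumerated standard deviation and the formula agree, which is the point of the correction factor when sampling without replacement from a small population.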

(15)

Sampling Distribution of the Mean

• Sampling theorem for a normally distributed population:

If random samples of size n are drawn repeatedly from a normally distributed population with mean μ and standard deviation σ, then the sampling distribution of the sample mean is normal with mean μ and standard deviation

$\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$

(16)

Sampling Distribution

(17)

Sampling Distribution

(18)

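b. Sampling from an infinite population

$\mu_{\bar{x}} = \mu$ and $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$

c. Using the normal distribution table for the sampling distribution of the mean

1. For a finite population, or n/N > 5%:

$Z = \dfrac{\bar{X} - \mu}{\dfrac{\sigma}{\sqrt{n}} \sqrt{\dfrac{N-n}{N-1}}}$

2. For an infinite population, or n/N ≤ 5%:

$Z = \dfrac{\bar{X} - \mu}{\sigma / \sqrt{n}}$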

(19)

PROBLEM

• Workers' hourly wages have a mean of Rp 500 and a standard deviation of Rp 60. What is the probability that the mean wage of a random sample of 50 workers lies between Rp 510 and Rp 520?

Given:

μ = 500; σ = 60; n = 50; $\bar{X}$ = 510 and 520

(20)

For $\bar{X} = 510$: $Z = \dfrac{510 - 500}{60/\sqrt{50}} = 1.18$. For $\bar{X} = 520$: $Z = \dfrac{520 - 500}{60/\sqrt{50}} = 2.36$.

P(1.18 < Z < 2.36) = P(0 < Z < 2.36) − P(0 < Z < 1.18)

= 0.4909 − 0.3810 = 0.1099
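The same probability can be computed without a table. This is a minimal sketch that builds the standard normal CDF from `math.erf`; the helper name `normal_cdf` is an assumption of this example, not something the slides define.

```python
from math import erf, sqrt

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

mu, sigma, n = 500.0, 60.0, 50
se = sigma / sqrt(n)                 # standard error of the sample mean

z_low = (510 - mu) / se
z_high = (520 - mu) / se
prob = normal_cdf(z_high) - normal_cdf(z_low)

print(round(z_low, 2), round(z_high, 2), round(prob, 4))
```

The value differs slightly in the last digits from the table-based answer because the table rounds Z to two decimals.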

(21)

Sampling Distribution of the Proportion

• The sampling distribution of the proportion is the distribution of the proportions of all possible random samples of size n drawn from a population. Examples:

• The proportion of villages that succeed after receiving program assistance

• The difference between poor and wealthy residents' views on the construction of a mall, measured by the proportion who approve

(22)

Sampling Distribution of the Proportion

• The population proportion is $P = \dfrac{X}{N}$

• The sample proportion is $p = \dfrac{X}{n}$

1. For sampling with replacement, or when the population is large relative to the sample, i.e. n/N ≤ 5%:

$\mu_p = P, \qquad \sigma_p = \sqrt{\dfrac{P(1-P)}{n}}$

(23)

2. For sampling without replacement, or when the population is small relative to the sample, i.e. n/N > 5%:

$\mu_p = P, \qquad \sigma_p = \sqrt{\dfrac{P(1-P)}{n}} \sqrt{\dfrac{N-n}{N-1}}$

(24)

A shop has 6 employees: A, B, and C enjoy reading, while X, Y, and Z do not. If a sample of 4 employees is drawn from the 6 without replacement, determine:

a. The number of possible samples

b. The sampling distribution of the proportion

c. The mean and standard deviation of the sampling distribution of the proportion

Answer:

(25)

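Other Sampling Distributions

a. Sampling distribution of the difference between two means

1. Mean: $\mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2$

2. Standard deviation: $\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$

3. For $n_1$ and $n_2$ with $n_1, n_2 > 30$: $Z = \dfrac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sigma_{\bar{x}_1 - \bar{x}_2}}$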

(26)

• Suppose the mean income of managers is Rp 50,000 with a standard deviation of Rp 15,000, and the mean income of employees is Rp 12,000 with a standard deviation of Rp 1,000. A random sample of 40 managers and a random sample of 150 employees are drawn. Determine:

a. The difference between the sample mean incomes

b. The standard deviation of the difference between the sample mean incomes

c. The probability that the difference between the mean incomes of managers and ordinary employees is more than Rp 35,000

Given:

$\mu_1$ = 50,000; $\sigma_1$ = 15,000; $n_1$ = 40; $\mu_2$ = 12,000; $\sigma_2$ = 1,000; $n_2$ = 150
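The three parts follow directly from the formulas for the difference between two means. The sketch below is one way to work them out; `normal_cdf` is a helper written for this example, and the slides do not state the numerical answers, so treat the printed values as a check rather than as the official solution.

```python
from math import erf, sqrt

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

mu1, sigma1, n1 = 50_000, 15_000, 40    # managers
mu2, sigma2, n2 = 12_000, 1_000, 150    # employees

mean_diff = mu1 - mu2                                   # part a
sd_diff = sqrt(sigma1**2 / n1 + sigma2**2 / n2)         # part b

# part c: P(difference of sample means > 35,000)
z = (35_000 - mean_diff) / sd_diff
prob_more = 1.0 - normal_cdf(z)

print(mean_diff, round(sd_diff, 2), round(z, 2), round(prob_more, 4))
```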

(27)

b. Sampling distribution of the difference between two proportions

1. Mean: $\mu_{p_1 - p_2} = P_1 - P_2$

2. Standard deviation: $\sigma_{p_1 - p_2} = \sqrt{\dfrac{P_1(1-P_1)}{n_1} + \dfrac{P_2(1-P_2)}{n_2}}$

3. For $n_1$ and $n_2$ with $n_1, n_2 \ge 30$: $Z = \dfrac{(p_1 - p_2) - (P_1 - P_2)}{\sigma_{p_1 - p_2}}$, where $p_1 = \dfrac{X_1}{n_1}$ and $p_2 = \dfrac{X_2}{n_2}$

(28)

Sampling Methods

• Sampling is a way of collecting data that takes only some of the elements of the population.

• Reasons for choosing this approach:

1. The objects of study are homogeneous
2. The objects of study are easily damaged
3. Savings in cost and time
4. Accuracy considerations
5. The size of the population
6. Economic factors

(29)

Sampling techniques fall into two broad groups:

1. Probability sampling (random sample)

With this technique, the researcher can state the degree of confidence in a sample. In addition, the discrepancy between the sample statistic and the population parameter it estimates can be assessed.

2. Non-probability sampling (non-random sample)

With non-probability sampling, the deviation of the sample value from its population value cannot be measured. Measuring this deviation is one form of statistical testing.

Deviations that arise from questionnaire design, errors by data collectors, and errors by data processors are called non-sampling errors.

(30)

Random sampling:

1. Simple random sampling

2. Stratified random sampling

3. Multistage random sampling

4. Systematic random sampling

5. Cluster random sampling

(31)

Non-random sampling

1. Accidental sampling

2. Quota sampling

3. Purposive sampling

4. Snowball sampling

(32)

Simple Random Sampling

1. Construct the sampling frame

2. Select the sample by drawing lots or by using a random number table

(33)

• Sampling frame → a list of the units of the population from which the sample will be drawn.

• Sampling unit → the smallest unit of the population from which the sample will be drawn.

• Sample design → covers how the sample is drawn and how the sample size is determined.

• Random → a way of drawing a sample in which every unit in the population has an equal chance of being selected.

(34)

1. Simple random sampling

→ The simplest and easiest design, but it has requirements: the population must be homogeneous or nearly so, and the number of subjects or units of analysis must already be identified.

(35)

• Advantages

1. High precision, and every sampling unit has the same probability of being selected

2. The sampling error can be determined quantitatively

• Disadvantages

If there is no sampling frame, or the population is scattered or spread over a very wide area with inadequate infrastructure, random sampling is difficult to carry out or requires very large amounts of labor, time, and money.

(36)

• Procedure

1. Make a list of all sampling units, arranged and numbered consecutively.

2. Write every sampling unit on a roll of paper or a token of the same shape, size, and color, then put them in a box and mix thoroughly.

3. Draw as many rolls or tokens as the desired sample size, then match them against the numbered list of sampling units.
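The lottery procedure above has a direct software analogue. This is a minimal sketch using Python's standard `random` module; the function name `simple_random_sample`, the frame of 20 numbered units, and the seed are all illustrative assumptions.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n units without replacement; every unit is equally likely to be chosen."""
    rng = random.Random(seed)      # a seed makes the draw reproducible
    return rng.sample(frame, n)

# Step 1: numbered list of sampling units (hypothetical frame of 20 units)
frame = list(range(1, 21))

# Steps 2-3: "mix the box" and draw 5 units
sample = simple_random_sample(frame, 5, seed=42)
print(sample)
```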

(37)

Systematic Random Sampling

1. Construct the sampling frame

2. Determine the interval: $a = \dfrac{\text{population size}}{\text{sample size}}$

3. Select the first sample unit, $n_1$, by drawing lots or using a random number table

4. Select the next units at $n_1 + a$, $n_1 + 2a$, and so on until all sample units have been selected
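The steps of systematic sampling can be sketched as follows. The function name `systematic_sample` and the frame of 100 units are assumptions for illustration, and the interval is rounded down when the population size is not a multiple of the sample size.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Pick a random start, then take every a-th unit from the frame."""
    a = len(frame) // n            # interval = population size / sample size
    rng = random.Random(seed)
    start = rng.randrange(a)       # index of the first unit, chosen at random
    return [frame[start + k * a] for k in range(n)]

frame = list(range(1, 101))        # hypothetical frame of 100 numbered units
sample = systematic_sample(frame, 10, seed=1)
print(sample)
```

Only the first unit is random; every later unit is fixed by the interval, which is why a periodic pattern in the frame can bias the result.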

(38)

2. Stratified random sampling

→ This design is used for a heterogeneous population that contains several groups or classes (strata) of subjects with clear boundaries between them.

(39)

• Advantages:

→ Higher precision, with a smaller standard deviation than simple random sampling.

• Disadvantages:

- The condition of the population must be known, and often it is not

- It is difficult to form homogeneous groups

(40)

Steps of a stratified design:

1. Divide the subjects of the population into several strata whose members have the same or nearly the same characteristics

2. Make a list of the subjects in each stratum (sub-population)

3. Select the sample subjects from each sub-population using simple (pure) random sampling or another technique

(41)

3. Multistage random sampling

→ A sample selection technique that combines two or more sample designs at once

• Advantages:

1. Relatively small variance for the cost per unit

2. Better control of non-sampling error

3. Repeat studies cost relatively little

4. Coverage of the study is easier to control

(42)

• Disadvantages:

→ With large primary sampling units (PSUs), the population is represented less well, whereas small PSUs can be used only when the individuals in the population are not dispersed.

(43)

Steps of multistage random sampling

1. Carry out the steps of a cluster design (divide the area into clusters, set the number of clusters, and randomize the clusters)

2. Make a list of the subjects in all clusters selected as sample clusters

3. Select as many sample subjects as required from that list using a random technique (randomization of subjects)

(44)

4. Systematic random sampling

→ Used when random sampling is carried out sequentially at a fixed interval

→ The interval (i) is obtained by dividing the population size (N) by the desired sample size (n), or $i = N/n$

(45)

Advantages:

1. A sampling frame is not strictly required, because the list of respondents can be compiled while the sample is being drawn

2. The method is relatively easy and can be carried out by field staff

3. The method is very practical when the population is held as a card file

4. The variation is smaller than with other methods

5. It takes less time and money than simple random sampling

(46)

Disadvantages:

1. Not every sampling unit has the same chance of being selected

2. If the population shows a particular trend or pattern, systematic random sampling becomes less suitable

(47)

5. Cluster random sampling

→ A cluster is a group of subjects or units of analysis that are geographically close to one another.

The advantage of this method is that it does not require a list of the population, so transportation costs are saved.

The disadvantage is that estimation is difficult.

(48)

Three ways of drawing a sample that are not random:

a. Purposive sampling. The sample is drawn by looking for the desired elements in data that already exist.

b. Accidental sampling. The sample is drawn according to need only, with no planning or special consideration. Units are taken as they happen to be available, without being planned in advance.

c. Quota sampling. The size and the criteria of the sample are fixed in advance.

(49)

Determining the Number of Possible Samples

1. Sampling with replacement → $N^n$

Example: for a population of size 4 with members A, B, C, D and samples of size 2, the number of possible samples is $4^2 = 16$.

2. Sampling without replacement → $C_n^N = \dfrac{N!}{n!\,(N-n)!}$

(50)

Example:

For a population of size 5 with members A, B, C, D, E and samples of size 2, the number of possible samples is

$C_2^5 = \dfrac{5!}{2!\,(5-2)!} = 10$

(51)

(52)

Example Problems

1. Light bulbs produced by PHILLIPS have a mean life of 1,600 hours with a standard deviation of 225 hours, while light bulbs produced by SHELL have a mean life of 1,400 hours with a standard deviation of 150 hours. If a random sample of 150 bulbs of each brand is tested, determine:

a. The difference between the mean lives of the bulbs

b. The standard deviation of the difference between the mean lives

c. The probability that the PHILLIPS bulbs have a mean life at least 175 hours longer than the SHELL bulbs

d. The probability that the difference between the mean lives of the PHILLIPS and SHELL bulbs is more than 160 hours

(53)

2. Four percent of the goods in warehouse A are defective and nine percent of the goods in warehouse B are defective. If a random sample of 150 items is taken from warehouse A and 200 items from warehouse B, determine:

a. The mean of the difference between the two sample proportions

b. The standard deviation of the difference between the two sample proportions

c. The probability that the percentage of defective goods in warehouse A is 3% greater than in warehouse B

(54)

A medical clinic specializes in treating patients with allergies. Many of the clinic's patients must receive allergy shots on a regular basis. The administrator of the clinic wishes to study (and eventually to reduce) the time it takes patients to get their shots. When receiving a shot, a patient must:

• Check in with a receptionist

• Wait for a nurse

• Have the shot administered

• Wait for a period of at least 15 minutes in case of an adverse reaction to the shot

(55)

• Have a nurse check the patient for signs of reactions, and receive the nurse's permission to check out

• Check out at the receptionist's desk.

In order to study the process, the clinic administrator decides to observe patients' treatment times on a typical day. The administrator selects a day when a typical patient load is expected and when no unusual delays are anticipated. On the chosen day, the treatment time for each patient receiving an allergy shot is measured and recorded. By the end of the day, 201 patients have received a shot.

(56)

• Suppose the average treatment time for the 201 patients is 30 minutes with a standard deviation of 3.47 minutes. A histogram of the data appears bell-shaped and symmetrical, so the population of 201 treatment times appears to be approximately normally distributed.

• Furthermore, the administrator wishes to monitor the effectiveness of the treatment process on a daily basis: choose 5

(57)

Examples

1. A chain of audio/video equipment discount stores employs 36 salespeople. Daily dollar sales for individual salespersons employed by the chain have a mound-shaped distribution with a mean of $2,000 and a standard deviation equal to $300.

a. Suppose that the chain's management decides to implement an incentive program that awards a daily bonus to any salesperson who achieves a daily sales figure that exceeds $2,150. Calculate the probability that an individual salesperson will earn the bonus on any particular day.

(58)

b. Suppose that (as an alternative) the chain's management decides to award a daily bonus to the entire sales force of 36 salespeople if all 36 achieve an average daily sales figure that exceeds $2,150. Calculate the probability that average daily sales for the entire sales force will exceed $2,150 (and, therefore, that the entire sales force will earn the bonus) on any particular day.

c. Intuitively, would it be more difficult for an individual to achieve a daily sales figure that exceeds $2,150, or for the entire sales force to achieve an average sales figure that exceeds $2,150? Are the probabilities you computed in parts a and b consistent with your intuition?

(59)

Solution

Answer: μ = 2000; σ = 300; n = 36

a. $P(x > 2150) = P\left(z > \dfrac{2150 - 2000}{300}\right) = P(z > 0.5)$

b. $P(\bar{x} > 2150) = P\left(z > \dfrac{2150 - 2000}{300/\sqrt{36}}\right) = P(z > 3)$
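The two tail probabilities in the solution can be evaluated numerically. This sketch assumes individual sales are close enough to normal for the z calculation in part a, as the solution itself does; `normal_cdf` is a helper defined here for illustration.

```python
from math import erf, sqrt

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

mu, sigma, n = 2000.0, 300.0, 36
target = 2150.0

# a. one salesperson: spread is sigma
z_individual = (target - mu) / sigma
p_individual = 1.0 - normal_cdf(z_individual)

# b. average of 36 salespeople: spread is sigma / sqrt(n)
z_team = (target - mu) / (sigma / sqrt(n))
p_team = 1.0 - normal_cdf(z_team)

print(z_individual, round(p_individual, 4))
print(z_team, round(p_team, 5))
```

The team average is far less likely to exceed the target than an individual is, because averaging shrinks the standard deviation by a factor of $\sqrt{36} = 6$.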

(60)

Cases

1. A resort hotel is trying to improve service by reducing variation in the time it takes to clean and prepare rooms. In order to study the situation, five rooms are selected each day for 25 consecutive days, and the time required to clean and prepare each room is recorded. The data obtained are given below:

a. Suppose the hotel wishes to use an $\bar{x}$ chart to monitor the room cleaning and preparation process. Also suppose that, when the process is in statistical control, the process mean is μ = 16 minutes and the process standard deviation is σ = 1.2 minutes. Find the center line and the upper and lower control limits for the chart.
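A minimal sketch of part a follows. It assumes the conventional 3-sigma limits for an x-bar chart, $\mu \pm 3\sigma/\sqrt{n}$; the case text does not state the multiplier, so the factor 3 is an assumption of this example.

```python
from math import sqrt

mu = 16.0        # in-control process mean (minutes)
sigma = 1.2      # in-control process standard deviation (minutes)
n = 5            # rooms sampled per day
k = 3            # ASSUMPTION: conventional 3-sigma control limits

sigma_xbar = sigma / sqrt(n)          # standard deviation of the daily sample mean

center_line = mu
ucl = mu + k * sigma_xbar             # upper control limit
lcl = mu - k * sigma_xbar             # lower control limit

print(round(lcl, 2), center_line, round(ucl, 2))
```

Each day's mean of five cleaning times would then be plotted against these three lines.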

(61)

Cases

b. What assumption have you made in calculating the control limits of part a? How can you verify that this assumption is reasonable?

c. Plot the sample means of the data versus the chart center line and control limits. Are any of the sample means outside the control limits on the resulting chart?

Day 1

1 2 3 4 5 1 2 3 4 5

Time 13 12.7 11.9 12.1 11.9 13.0 11.1 10.1 12.1 12

(62)

Solution

• _{95.45% :} • _97.

(63)

Day   1     2     3     4     5     6     7     8     9     10
1     15.6  15.0  16.4  14.2  16.4  14.9  17.9  14.0  17.6  14.6
2     14.3  14.8  15.1  14.8  16.3  17.2  17.9  17.7  16.5  14.0
3     17.7  16.8  15.7  17.3  17.6  17.2  14.7  16.9  15.3  14.7
4     14.3  16.9  17.3  15.0  17.9  15.3  17.0  14.0  14.5  16.9
5     15.0  17.4  16.6  16.4  14.9  14.1  14.5  14.9  15.1  14.2

(64)

Day   11    12    13    14    15    16    17    18    19    20
1     14.6  15.3  17.4  15.3  14.8  16.1  14.2  14.6  15.9  16.2
2     15.5  15.3  14.9  16.9  15.1  14.6  14.7  17.2  16.5  14.8
3     15.9  15.9  17.7  17.9  16.6  17.5  15.3  16.0  16.1  14.8
4     14.8  15.0  16.6  17.2  16.3  16.9  15.7  16.7  15.0  15.0
5     14.2  17.8  14.7  17.5  14.5  17.7  14.3  16.3  17.8  15.3

(65)

Day   21    22    23    24    25
      16.3  15.0  16.4  16.6  17.0
      15.3  17.6  15.9  15.1  17.5
      14.0  14.5  16.7  14.1  17.4
      17.4  17.5  15.7  17.4  16.2
      14.5  17.8  16.9  17.8  17.9

(66)

Case

2. A company is using a control chart to monitor an electrical characteristic. The desired mean measurement for this characteristic is 1,000 and the standard deviation of individual measurements of this characteristic is 12. The company takes nine readings of this characteristic every hour, computes the average of the nine readings, and plots this average as a point on the control chart. The control limits for this chart have been set at 990 and 1,010. We will assume that measurements of the electrical characteristic are normally distributed:

a. How many standard deviations of the average have the upper and lower control limits been set above and below the desired mean measurement for this characteristic?

b. If the mean and standard deviation of the electrical characteristic are at their desired levels, what is the probability that an hourly average of nine readings will be outside the established control limits?

(67)

Example

• A food processing company wishes to assess whether p, the proportion of all current purchasers who would stop buying the cheese spread if the new spout were used, is less than 0.10. Suppose that 1,000 purchasers are selected randomly and asked the question, and 63 of them say that they would stop buying the cheese spread if the new spout were used.

(68)

• The interval is:

$\left[\, p \pm 3\sqrt{\dfrac{p(1-p)}{n}} \,\right] = \left[\, 0.10 \pm 3\sqrt{\dfrac{0.10\,(1 - 0.10)}{1000}} \,\right] = [\,0.0715,\; 0.1285\,]$

• Since the interval doesn't contain 0 or 1, the sample size n is large enough to assume that the sampling distribution of $\hat{p}$ is approximately a normal distribution with mean $\mu_{\hat{p}} = p = 0.1$ and standard deviation

$\sigma_{\hat{p}} = \sqrt{\dfrac{p(1-p)}{n}} = \sqrt{\dfrac{0.1\,(0.9)}{1000}} = 0.0094868$

• So that

$P(\hat{p} \le 0.063) = P\left(z \le \dfrac{0.063 - 0.10}{\sqrt{\dfrac{p(1-p)}{n}}}\right) = P(z \le -3.9)$

(69)

1. If all possible samples of size 16 are drawn from a normal population with mean 50 and standard deviation 5, what is the probability that a sample mean will fall in the interval from $\mu_{\bar{x}} - 1.9\,\sigma_{\bar{x}}$ to $\mu_{\bar{x}} - 0.4\,\sigma_{\bar{x}}$? Assume that the sample means can be recorded to any degree of accuracy.

(70)

Solution

• Given: n = 16; μ = 50; σ = 5; normal distribution

• Asked: $P(\mu_{\bar{x}} - 1.9\,\sigma_{\bar{x}} < \bar{x} < \mu_{\bar{x}} - 0.4\,\sigma_{\bar{x}})$

• Answer:

$\mu_{\bar{x}} = \mu = 50; \qquad \sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}} = \dfrac{5}{\sqrt{16}} = \dfrac{5}{4} = 1.25$

$P\left(50 - 1.9\cdot\dfrac{5}{4} < \bar{x} < 50 - 0.4\cdot\dfrac{5}{4}\right) = P(47.625 < \bar{x} < 49.5)$

$= P\left(\dfrac{47.625 - 50}{1.25} < z < \dfrac{49.5 - 50}{1.25}\right) = P(-1.9 < z < -0.4)$

(71)

CONFIDENCE

INTERVAL

(72)

Terms

• Decision space: the set of all possible estimate values that an estimator can take

• Unbiased estimator: a statistic $\hat{\Theta}$ is said to be an unbiased estimator of the parameter θ if $\mu_{\hat{\Theta}} = E(\hat{\Theta}) = \theta$

• Confidence interval:

• Point estimation:

• Parameter: p, μ, σ

(73)

Estimation (Continued)

• Estimation of a population mean: large-sample case

– Point estimate for a population mean: $\bar{x}$

– Large-sample (1−α)100% confidence interval for a population mean (use the fact that, for a sufficiently large sample size n ≥ 30, the sampling distribution of the sample mean, $\bar{x}$, is approximately normal)

(74)

Estimation (Continued)

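• 100(1−α)% confidence interval for the mean of a normally distributed population when the sample size n is large:

$\left[\, \bar{x} \pm z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}} \,\right] = \left[\, \bar{x} - z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}},\; \bar{x} + z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}} \,\right]$

In the case σ is unknown:

$\left[\, \bar{x} \pm z_{\alpha/2}\,\dfrac{s}{\sqrt{n}} \,\right] = \left[\, \bar{x} - z_{\alpha/2}\,\dfrac{s}{\sqrt{n}},\; \bar{x} + z_{\alpha/2}\,\dfrac{s}{\sqrt{n}} \,\right]$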

(75)

Estimation (Continued)

E.g. A company that produces and markets trash bags has developed an improved 30-gallon bag. The new bag is produced using a specially formulated plastic that is both stronger and more biodegradable than previously used plastics, and the company wishes to evaluate the strength of this bag. The breaking strength of a trash bag is considered to be the amount (in pounds) of a representative trash mix that, when loaded into a bag that is suspended in the air, will cause the bag to sustain significant damage.

(76)

E.g. Large sample

The company has decided to carry out a 40-hour pilot production run of the new bags. Each hour, at a randomly selected time during the hour, a bag is taken off the production line. The bag is then subjected to a breaking strength test. The 40 breaking strengths obtained during the pilot production run are given below:

48.5  52.5  50.7  49.4  52.3  47.5  48.2  51.9  53.5  50.9
51.5  52.0  50.5  49.8  49.0  48.8  50.3  50.0  51.7  46.8
49.6  50.8  53.2  51.3  51.0  53.0  51.1  49.3  48.3  50.9
52.6  54.0  50.6  49.9  51.2  49.2  50.2  50.1  49.5  51.4

(77)

x_i     x_i^2      x_i     x_i^2      x_i     x_i^2
48.50   2352.25    51.50   2652.25    49.60   2460.16
52.50   2756.25    52.00   2704.00    50.80   2580.64
50.70   2570.49    50.50   2550.25    53.20   2830.24
49.40   2440.36    49.80   2480.04    51.30   2631.69
52.30   2735.29    49.00   2401.00    51.00   2601.00
47.50   2256.25    48.80   2381.44    53.00   2809.00
48.20   2323.24    50.30   2530.09    51.10   2611.21
51.90   2693.61    50.00   2500.00    49.30   2430.49
53.50   2862.25    51.70   2672.89    48.30   2332.89
50.90   2590.81    46.80   2190.24    50.90   2590.81

(78)

x_i     x_i^2

52.60 2766.76

54.00 2916.00

50.60 2560.36

49.90 2490.01

51.20 2621.44

49.20 2420.64

49.30 2430.49

48.30 2332.89

50.90 2590.81

(79)

Confidence interval for breaking strength

$\left[\, 50.5477 - 1.96\,\dfrac{1.64493}{\sqrt{40}},\; 50.5477 + 1.96\,\dfrac{1.64493}{\sqrt{40}} \,\right]$
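The endpoints of this interval can be evaluated with a small helper. This is a sketch only: the function name `z_interval` is hypothetical, and the inputs are the summary values shown on the slide (mean 50.5477, standard deviation 1.64493, n = 40, z = 1.96).

```python
from math import sqrt

def z_interval(xbar, sd, n, z):
    """Large-sample confidence interval: xbar +/- z * sd / sqrt(n)."""
    margin = z * sd / sqrt(n)
    return xbar - margin, xbar + margin

# Summary values as given on the slide
lower, upper = z_interval(xbar=50.5477, sd=1.64493, n=40, z=1.96)
print(round(lower, 4), round(upper, 4))
```

The same helper covers the known-σ case by passing the population standard deviation as `sd`.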

(80)

E.g. Large sample

From previous examination, it is known that breaking strength is normally distributed. It is also known that the population standard deviation is 1.598.

Solution:

$\left[\, 50.5477 - 1.96\,\dfrac{1.598}{\sqrt{40}},\; 50.5477 + 1.96\,\dfrac{1.598}{\sqrt{40}} \,\right]$

(81)

Exercise

1. The mean and standard deviation of a sample of 100 bank customer waiting times are 5.46 and 2.475 minutes, respectively.

a. Calculate 95% and 99% confidence intervals for the population mean

b. Using the 95% confidence interval, can the bank manager be 95% confident that the population mean is less than 6 minutes?

c. Using the 99% confidence interval, can the bank manager be 99% confident that the population mean is less than 6

(82)

Estimation error and sample size

• If $\bar{x}$ is used to estimate μ, it is believed with (1−α)100% confidence that its error will not go above $e = z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}}$

• With e = error, the required sample size is $n = \left(\dfrac{z_{\alpha/2}\,\sigma}{e}\right)^2$

(83)

Example

A random sample of 36 final-semester students has a mean GPA of 2.6 and a standard deviation of 0.3. How large a sample should be drawn if we want to be 95% confident that the estimate does not deviate from the true mean by more than 0.05?
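The sample-size formula $n = (z_{\alpha/2}\,\sigma / e)^2$ can be applied to this example as follows. The sketch uses the sample standard deviation 0.3 in place of σ and rounds up, since a fractional observation is not possible; the function name `required_sample_size` is an assumption.

```python
from math import ceil

def required_sample_size(z, sd, error):
    """Smallest n for which z * sd / sqrt(n) does not exceed the allowed error."""
    return ceil((z * sd / error) ** 2)

# 95% confidence -> z = 1.96; s = 0.3 stands in for sigma; allowed error e = 0.05
n_required = required_sample_size(z=1.96, sd=0.3, error=0.05)
print(n_required)
```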

(84)

Estimation (Continued)

• Estimation of a population mean: small-sample case (n < 30)

– Problems arise for small sample sizes. Assumption: the population has an approximately normal distribution.

– (1−α)100% confidence interval using the t distribution

(85)

Estimation (Continued)

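• 100(1−α)% confidence interval for the mean of a normally distributed population when the sample size n is small (< 30):

$\left[\, \bar{x} \pm t_{\alpha/2}\,\dfrac{s}{\sqrt{n}} \,\right] = \left[\, \bar{x} - t_{\alpha/2}\,\dfrac{s}{\sqrt{n}},\; \bar{x} + t_{\alpha/2}\,\dfrac{s}{\sqrt{n}} \,\right]$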

(86)

E.g.

A survey of 20 households in a small city was conducted in order to estimate education expenditure. The data collected are shown in the table below:

Household           1     2     3     4      5      6     7     8     9     10
Cost (million Rp)   2.30  4.50  4.00  5.00   3.80   7.20  6.25  5.75  6.70  7.80

Household           11    12    13    14     15     16    17    18    19    20
Cost (million Rp)   6.80  5.30  8.00  15.10  13.20  4.50  2.00  4.70  5.75  10.10

a. Determine the point estimate of the mean yearly education expenditure per household

b. Construct a 95% confidence interval for the mean, assuming education expenditure is normally distributed

(87)

Solution

a. Mean estimate for education cost:

$\hat{\mu} = \bar{x} = 6.44$

b. 95% confidence interval:

$\dfrac{s}{\sqrt{n}} = \dfrac{3.275422}{\sqrt{20}} = 0.732407; \qquad t_{(0.05/2;\; db = 19)} = 2.093$

$6.44 - 2.093\,(0.732) < \mu_x < 6.44 + 2.093\,(0.732)$

$4.905 < \mu_x < 7.970$
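The figures in this solution can be reproduced from the raw household data. The sketch below takes the critical value t = 2.093 (19 degrees of freedom) from the slide rather than computing it, because the Python standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, stdev

# Education cost per household (million Rp), households 1..20
cost = [2.30, 4.50, 4.00, 5.00, 3.80, 7.20, 6.25, 5.75, 6.70, 7.80,
        6.80, 5.30, 8.00, 15.10, 13.20, 4.50, 2.00, 4.70, 5.75, 10.10]

n = len(cost)
xbar = mean(cost)             # point estimate of the population mean
s = stdev(cost)               # sample standard deviation (divides by n - 1)
se = s / sqrt(n)              # standard error of the mean

t_crit = 2.093                # t(0.025; df = 19), value taken from the slide
lower = xbar - t_crit * se
upper = xbar + t_crit * se

print(round(xbar, 2), round(s, 6), round(se, 6))
print(round(lower, 3), round(upper, 3))
```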

(88)

Determining sample size

For a 100(1−α) percent confidence interval for μ with error bound B:

$z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}} = B = \text{error bound}, \qquad \text{so} \qquad n = \left(\dfrac{z_{\alpha/2}\,\sigma}{B}\right)^2$

In the case σ is unknown, use a preliminary sample.

• When n is large enough, s replaces σ

• When n is small, s replaces σ and the t distribution replaces z.

(89)

E.g.

1. Consider a population having a standard deviation equal to 10. We wish to estimate a 95.44 percent confidence interval for the mean of this population with an error bound equal to 1. Determine the required sample size.

2. Suppose now that we take a random sample of the size determined in no. 1. If we obtain a sample mean equal to 295, calculate the 95.44 percent confidence interval for the population mean. What is the interval's error bound?

(90)

Two Independent Samples

Estimating the Difference Between Population Means (µ₁ − µ₂)

Point estimate:

$\hat{\mu}_1 - \hat{\mu}_2 = \bar{x}_1 - \bar{x}_2$

With standard error:

a. In the case the first ($\sigma_1^2$) and second ($\sigma_2^2$) population variances are available, then

$\sigma_{(\bar{x}_1 - \bar{x}_2)} = \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$

(91)

Two Independent Samples

b. Population variances are unknown but assumed equal:

$s_{(\bar{x}_1 - \bar{x}_2)} = s_g \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}, \qquad s_g^2 = \dfrac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{(n_1 - 1) + (n_2 - 1)}$

c. Population variances are unknown and not equal:

$s_{(\bar{x}_1 - \bar{x}_2)} = \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$

(92)

Confidence Interval

(1−α)100% confidence interval for $\mu_1 - \mu_2$:

a. $\sigma_1$ and $\sigma_2$ known:

$(\bar{x}_1 - \bar{x}_2) - z_{\alpha/2}\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + z_{\alpha/2}\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$

b. $\sigma_1$ and $\sigma_2$ unknown but assumed equal:

$(\bar{x}_1 - \bar{x}_2) - t_{(\alpha/2;\,v)}\, s_{joint}\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + t_{(\alpha/2;\,v)}\, s_{joint}\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}$

where $s_{joint}^2 = \dfrac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2}$ and $v = n_1 + n_2 - 2$

(93)

c. $\sigma_1$ and $\sigma_2$ are unknown and assumed not equal:

$(\bar{x}_1 - \bar{x}_2) - t_{(\alpha/2;\,v)}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + t_{(\alpha/2;\,v)}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$

$v = \dfrac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}$

(94)

Two dependent samples

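(1−α)100% confidence interval for $\mu_D = \mu_1 - \mu_2$ with paired observations:

$\bar{d} - t_{\alpha/2}\,\dfrac{s_d}{\sqrt{n}} < \mu_D < \bar{d} + t_{\alpha/2}\,\dfrac{s_d}{\sqrt{n}}$

$\bar{d}$: mean of the differences

$s_d$: standard deviation of the differences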

(95)

Example

A study aims to verify the effect of a new diet that is claimed to reduce a person's body weight by 4.5 kg in 2 weeks. Seven women tried the diet. Their body weights before and after following the diet are shown in the table below:

Subject   1     2     3     4     5     6     7
Before    58.5  60.3  61.7  69.0  64.0  62.6  56.7
After     60.0  54.9  58.1  62.1  58.5  59.9  54.4
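The paired differences for this table, and a confidence interval for the mean weight loss, can be sketched as follows. The slides do not state a confidence level for this example, so the 95% level and the critical value t = 2.447 (6 degrees of freedom, from a standard t table) are assumptions of this sketch.

```python
from math import sqrt
from statistics import mean, variance

before = [58.5, 60.3, 61.7, 69.0, 64.0, 62.6, 56.7]
after = [60.0, 54.9, 58.1, 62.1, 58.5, 59.9, 54.4]

# Paired differences d_i = before_i - after_i (positive means weight lost)
d = [b - a for b, a in zip(before, after)]

n = len(d)
d_bar = mean(d)               # mean difference
s_d2 = variance(d)            # sample variance of the differences
s_d = sqrt(s_d2)

t_crit = 2.447                # ASSUMPTION: 95% level, t(0.025; df = 6)
margin = t_crit * s_d / sqrt(n)
lower, upper = d_bar - margin, d_bar + margin

print(round(sum(d), 1), round(d_bar, 2), round(s_d2, 5))
print(round(lower, 3), round(upper, 3))
```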

(96)

Solution

Subject   1     2      3      4      5      6     7     Total
Before    58.5  60.3   61.7   69.0   64.0   62.6  56.7
After     60.0  54.9   58.1   62.1   58.5   59.9  54.4
d_i       -1.5  5.4    3.6    6.9    5.5    2.7   2.3   24.9
d_i^2     2.25  29.16  12.96  47.61  30.25  7.29  5.29  134.81

$\bar{d} = \dfrac{\sum d_i}{n} = \dfrac{24.9}{7} = 3.56; \qquad s_d^2 = \dfrac{n \sum d_i^2 - \left(\sum d_i\right)^2}{n\,(n-1)} = \dfrac{7\,(134.81) - (24.9)^2}{7\,(7-1)} = 7.70619$

(97)

E.g.

Two companies in the cardboard industry compete, and each claims to be the best in the area. A researcher who wants to find out which one is best tests the strength of 10 cardboard sheets from each company. The data are listed below:

– Estimate the difference in cardboard strength, and calculate the standard error.

– Construct a 95% confidence interval for the difference.

A   30  35  50  45  60  25  45  45  50  40
B   50  60  55  40  65  60  65  65  50  55

(98)

Solution

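Mean difference estimator and standard error

$\bar{x}_1 = \dfrac{30 + 35 + \dots + 40}{10} = 42.5; \qquad s_1^2 = \dfrac{n \sum x_i^2 - \left(\sum x_i\right)^2}{n\,(n-1)} = \dfrac{10\,(19025) - (425)^2}{10\,(9)} = 106.94$

$\bar{x}_2 = \dfrac{50 + 60 + \dots + 55}{10} = 56.5; \qquad s_2^2 = \dfrac{n \sum x_i^2 - \left(\sum x_i\right)^2}{n\,(n-1)} = \dfrac{10\,(32525) - (565)^2}{10\,(9)} = 66.94$

$\hat{\mu}_1 - \hat{\mu}_2 = \bar{x}_1 - \bar{x}_2 = 42.5 - 56.5 = -14$

$s_{(\bar{x}_1 - \bar{x}_2)} = \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}} = \sqrt{\dfrac{106.94}{10} + \dfrac{66.94}{10}} = \sqrt{\dfrac{173.88}{10}} = 4.17$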

(99)

95% Confidence Interval

$(\bar{x}_1 - \bar{x}_2) \pm t_{(0.05/2;\; db_{eff})}\, s_{(\bar{x}_1 - \bar{x}_2)} = -14 \pm 2.110\,(4.17) = -14 \pm 8.7987$, which gives $[\,-22.7987;\; -5.2013\,]$

$db_{eff} = \dfrac{\left(s_1^2/n_1 + s_2^2/n_2\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}} = \dfrac{\left(10.34^2/10 + 8.18^2/10\right)^2}{\dfrac{(10.34^2/10)^2}{9} + \dfrac{(8.18^2/10)^2}{9}} = 17.10 \approx 17$
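The standard error and the effective degrees of freedom can be recomputed from the summary statistics of the two cardboard samples. This sketch uses the sample variances 106.94 and 66.94 from the solution; the function name `welch_se_and_df` is an assumption, and the critical t value is left out because it has to come from a table.

```python
from math import sqrt

def welch_se_and_df(var1, n1, var2, n2):
    """Standard error and effective degrees of freedom for unequal variances."""
    a = var1 / n1
    b = var2 / n2
    se = sqrt(a + b)
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return se, df

# Sample variances and sizes for companies A and B, as given in the solution
se, df_eff = welch_se_and_df(var1=106.94, n1=10, var2=66.94, n2=10)
print(round(se, 2), round(df_eff, 2))
```

The degrees of freedom are rounded down to a whole number before looking up the t value.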

(100)

Two Dependent Samples

Mean estimator for dependent populations (µ_d)

Pairs            1     2     3     …   n
Sample 1 (X₁)    x₁₁   x₁₂   x₁₃   …   x₁ₙ
Sample 2 (X₂)    x₂₁   x₂₂   x₂₃   …   x₂ₙ
D = (X₁ − X₂)    d₁    d₂    d₃    …   dₙ

$\hat{\mu}_d = \bar{d} = \dfrac{\sum_{i=1}^{n} d_i}{n}, \qquad s_d^2 = \dfrac{\sum_{i=1}^{n} (d_i - \bar{d})^2}{n - 1}, \qquad \text{and} \quad d_i = x_{1i} - x_{2i}$

(1)

$b_1 = \dfrac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left(\sum x_i\right)^2}$

$b_0 = \bar{y} - b_1\,\bar{x}$

(2)

Example

• Internet use is influenced by people's income. Based on historical data, the number of internet users at various levels of per capita income is shown in the following table:

Income (million Rp)       1.0  1.2  1.1  1.5  0.9  1.3  1.1
Users (thousand people)   75   80   80   85   70   80   79

Income (million Rp)       1.3  1.2  2.5  2.0  1.9  1.8  1.5
Users (thousand people)   79   89   99   90   87   85   80

Income (million Rp)       1.4  1.5  1.2  1.1  1.0  1.5  1.9
Users (thousand people)   85   89   79   76   73   80   89
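A simple linear regression of users on income can be fitted to this table with the usual least-squares formulas, $b_1 = \dfrac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2}$ and $b_0 = \bar{y} - b_1\bar{x}$. The sketch below is one way to do the arithmetic; the names `b0` and `b1` are this example's notation for the intercept and slope.

```python
# Income per capita (million Rp) and number of internet users (thousand people)
income = [1.0, 1.2, 1.1, 1.5, 0.9, 1.3, 1.1,
          1.3, 1.2, 2.5, 2.0, 1.9, 1.8, 1.5,
          1.4, 1.5, 1.2, 1.1, 1.0, 1.5, 1.9]
users = [75, 80, 80, 85, 70, 80, 79,
         79, 89, 99, 90, 87, 85, 80,
         85, 89, 79, 76, 73, 80, 89]

n = len(income)
sum_x = sum(income)
sum_y = sum(users)
sum_xy = sum(x * y for x, y in zip(income, users))
sum_x2 = sum(x * x for x in income)

# Least-squares slope and intercept
b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
b0 = sum_y / n - b1 * sum_x / n

# Residuals of the fitted line; they should sum to (numerically) zero
residuals = [y - (b0 + b1 * x) for x, y in zip(income, users)]

print(round(b0, 3), round(b1, 3))
```

A positive slope means that the fitted number of users rises with income, in line with the premise of the example.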