www.ijemr.net

ISSN (ONLINE): 2250-0758, ISSN (PRINT): 2394-6962
Volume-6, Issue-4, July-August 2016
International Journal of Engineering and Management Research
Page Number: 327-332

Partial Least Square Structural Equation Modeling (PLS-SEM) with
Binary Data (Case Study: Knowledge Creation on Dairy Cooperative in
Indonesia)
Riwi Dyah Pangesti¹, I Made Sumertajaya², Anggraini Sukmawati³
¹,²Department of Statistics, Faculty of Mathematics and Natural Science, Bogor Agricultural University, Bogor,
INDONESIA
³Department of Management, Faculty of Economics and Management, Bogor Agricultural University, Bogor,
INDONESIA

ABSTRACT
Structural Equation Modeling (SEM) is an analysis method that
consists of two models: the measurement model and the structural
model. SEM assumes a multivariate normal distribution and a
relatively large sample size. In some cases the data do not meet
these assumptions, so some handling is required. In this study, the
handling is done by using the Partial Least Square (PLS) approach.
Furthermore, the Likert scale is converted into a binary scale.
Finally, the model of knowledge creation in Indonesian dairy
cooperatives is compared between PLS-SEM analysis of the Likert
scale questionnaire data and of the same data converted into binary
categories. The authors use the knowledge creation data as an
example application of PLS-SEM. The results show that the binary
data are no worse than the Likert scale data: the R-square,
f-square, Q-square, RMSEA, SRMR, NFI, and GFI values of the two
models are not much different. Likewise, the Composite Reliability
and Cronbach's Alpha values were good. Based on the t-statistic
values, only 14 of the 24 indicators were valid on the Likert scale,
whereas 21 indicators were valid on the binary scale. Thus, the
questionnaire can be constructed using the binary scale.

Keywords: PLS-SEM, Binary Data, Likert Scale

I. INTRODUCTION

Unobserved variables are variables that cannot be observed
directly. Yet, they can be measured through indicators (observed
variables) which reflect those unobserved variables. Indicators
should be explainable theoretically, be logically acceptable, and
have a high degree of validity and reliability.
Unobserved variables are measured in questionnaire format, with
the indicators taking the form of question items for each
construct. Most questionnaires apply the Likert scale. However, it
has a disadvantage, as respondents have to choose one of many
options. These multiple choices have several impacts: respondents
become confused and reluctant to fill in the questionnaire (human
error); respondents avoid extreme answers and only want to satisfy
the researchers (answers that are not firm); the data are biased
because respondents are more concerned with values prevailing in
society than with their real condition (answers that are not
honest); and validity may be difficult to demonstrate. A solution
to these problems is a questionnaire applying the Guttman scale,
which provides only two answer choices: yes/no, agree/disagree,
right/wrong, or other opposing options. The Guttman scale makes it
easier for respondents to fill out questionnaires, because they
only have to choose one of two options, which produces a firm
answer.
When indicators are used to measure an unobserved variable, two
main problems arise: a measurement problem and a problem due to
causal relationships between variables [1]. A statistical technique
that can solve these problems is Structural Equation Modeling
(SEM), which can measure or analyze patterns of relationships and
influence simultaneously, either directly or indirectly. It can
also assess whether the indicators (observed variables) are valid
and reliable (Mattjik [1], Latan [2], Ghozali [3]).
SEM requires a sample size 10 times larger than the number of
indicators, or more than 100 units of observation, and the data
should follow a multivariate normal distribution (Ghozali [3], Jaya
[4]). A newer approach to this modelling, Partial Least Square
(PLS), is intended for relatively small data sets with a non-normal
distribution. PLS is a powerful method of SEM analysis since it can
be applied to all scales of data and does not require the
multivariate normality assumption or a large sample size. PLS can
be applied to both reflective and formative indicators. PLS can
also be used to build a relationship that has no theoretical basis
yet (Latan [2], Ghozali [3], Jaya [4] citing Wold, 1982).

Copyright © 2016. Vandana Publications. All Rights Reserved.

Dairy cooperatives in Indonesia are modern organizations that
follow the principles of management in carrying out their
functions. They collect cow milk from farmers and distribute it
directly to the Dairy Processing Industry (IPS). The cooperative is
a bridge between farmers and the IPS; thus it is necessary to
develop Knowledge Creation.
Sustainable Knowledge Creation is intended to improve the
performance and output of the cooperative and is expected to be an
extremely important source of innovation. Indicators affecting
Knowledge Creation are personal skills development, tacit knowledge
sharing, conceptualization, crystallization, assessment and
dissemination of knowledge (Purwanto [5]; Rahayu [6] in Sangkala,
2007).
The five enabling factors of Knowledge Creation in an organization
are Shared Vision, Conversational Management, Driving Knowledge
Mobilization, Provision of a Conducive Environment, and Internal
Knowledge Dissemination (Purwanto [5] in Irsan 2005 in Von Krogh,
2000).
This study applied PLS-SEM in modeling Knowledge Creation in dairy
cooperatives in Indonesia. Although research on PLS-SEM has used
many kinds of non-binary data, PLS-SEM has not yet been applied to
binary data. Therefore, this study focused on the analysis of
PLS-SEM on binary data. A study using binary data is expected to be
useful for the preparation of questionnaires. In addition, it is
expected that the measurement of attitude scales need not use the
Likert scale but can use the Guttman scale instead. The preparation
of such questionnaires is expected to be easier and more efficient.

II. METHODOLOGY

PLS-SEM modelling was performed on the Likert scale and binary data
through the following steps:
Conceptualization of the model, including the design of the
structural model and measurement model
The design of the structural model was based on the problem
formulation or the research hypothesis. In this research, the
unobserved variables were divided into two groups, exogenous and
endogenous. The exogenous unobserved variable was Shared Vision
($\xi_1$), while the endogenous unobserved variables were Internal
Knowledge Dissemination ($\eta_1$), Dairy Cooperatives Knowledge
($\eta_2$), and Knowledge Creation ($\eta_3$). Previous studies
revealed that $\eta_1$ was influenced by $\xi_1$, $\eta_2$ was
influenced by $\xi_1$ and $\eta_1$, while $\eta_3$ was influenced
by $\eta_2$.
The design of the measurement model is very important in PLS-SEM
modeling since it determines whether each indicator is reflective
or formative. In this study, all indicators were assumed to be
reflective.
Constructing the path diagram

After the structural model and measurement model had been designed,
the path diagram of the models was constructed as shown in
Figure 1.

Figure 1: Path Diagram of Knowledge Creation Model
Conversion of path diagram to equation system
Measurement Models
\[
\begin{aligned}
x_k &= \lambda_{x_k}\,\xi_1 + \delta_k, & k &= 1, \dots, 4 \\
y_k &= \lambda_{y_k}\,\eta_1 + \varepsilon_k, & k &= 1, \dots, 4 \\
y_k &= \lambda_{y_k}\,\eta_2 + \varepsilon_k, & k &= 5, \dots, 8 \\
y_k &= \lambda_{y_k}\,\eta_3 + \varepsilon_k, & k &= 9, \dots, 20
\end{aligned}
\]
Structural Models
\[
\begin{aligned}
\eta_1 &= \gamma_{11}\,\xi_1 + \zeta_1 \\
\eta_2 &= \gamma_{12}\,\xi_1 + \beta_{12}\,\eta_1 + \zeta_2 \\
\eta_3 &= \beta_{23}\,\eta_2 + \zeta_3
\end{aligned}
\]
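As a minimal sketch, the equation system can be encoded as plain Python data structures before any estimation is attempted. The abbreviations SV, IKD, DCK and KC are the ones used in the paper's tables; the names `blocks`, `paths`, `exogenous` and `inner_matrix` are illustrative assumptions, not part of the authors' software.

```python
# Indicator blocks for each unobserved variable (4 + 4 + 4 + 12 indicators).
blocks = {
    "SV": [f"x{k}" for k in range(1, 5)],    # xi_1, Shared Vision
    "IKD": [f"y{k}" for k in range(1, 5)],   # eta_1, Internal Knowledge Dissemination
    "DCK": [f"y{k}" for k in range(5, 9)],   # eta_2, Dairy Cooperatives Knowledge
    "KC": [f"y{k}" for k in range(9, 21)],   # eta_3, Knowledge Creation
}

# Structural paths: each endogenous variable maps to its predictors.
paths = {"IKD": ["SV"], "DCK": ["SV", "IKD"], "KC": ["DCK"]}


def exogenous(blocks, paths):
    """Unobserved variables that are never on the left-hand side of a path."""
    return [name for name in blocks if name not in paths]


def inner_matrix(blocks, paths):
    """0/1 matrix with entry [i][j] = 1 when variable j predicts variable i."""
    names = list(blocks)
    index = {name: i for i, name in enumerate(names)}
    matrix = [[0] * len(names) for _ in names]
    for target, predictors in paths.items():
        for predictor in predictors:
            matrix[index[target]][index[predictor]] = 1
    return names, matrix


names, D = inner_matrix(blocks, paths)
n_indicators = sum(len(items) for items in blocks.values())
print(exogenous(blocks, paths), n_indicators)
```

The indicator count printed by the sketch should match the 24 indicators mentioned in the abstract, which is a quick consistency check on the block definitions.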

Estimation: weights, loadings, and means and constants
Step 1: weight estimate
One function of PLS is to estimate the weights used to create the
scores of the unobserved variables (Ghozali [3], Jaya [4]).
\[ \hat{\xi}_b = \sum_{k} w_{kb}\, x_{kb} \tag{1} \]
\[ \hat{\eta}_i = \sum_{k} w_{ki}\, y_{ki} \tag{2} \]
where $w_{kb}$ is the weight of indicator $k$ used to form the
estimate of the unobserved variable $\xi_b$, and $w_{ki}$ is the
weight of indicator $k$ used to form the estimate of the unobserved
variable $\eta_i$.
The estimate of each unobserved variable is thus a linear aggregate
of its indicators, with the weights obtained by the PLS method.
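The weighted-aggregate idea can be illustrated with a short NumPy sketch for a single reflective block. It alternates between forming the score as a weighted sum of standardized indicators and updating each weight as the covariance of the indicator with the score divided by the variance of the score, stopping when the relative change in the weights is at most 1e-5, the limit used in this step. This is a simplification under stated assumptions: the block is iterated in isolation (which reduces to a power iteration toward the first principal component), whereas a full PLS-SEM run also uses information from neighbouring unobserved variables. The function name `mode_a_weights` and the synthetic data are hypothetical.

```python
import numpy as np


def standardize(a):
    """Center and scale to unit (population) standard deviation."""
    return (a - a.mean(axis=0)) / a.std(axis=0)


def mode_a_weights(X, tol=1e-5, max_iter=500):
    """Iterate reflective-indicator weights for one block in isolation."""
    X = standardize(np.asarray(X, dtype=float))
    w = np.ones(X.shape[1])                      # arbitrary starting weights
    for _ in range(max_iter):
        score = standardize(X @ w)               # weighted aggregate of indicators
        w_new = X.T @ score / (score @ score)    # cov(x_k, score) / var(score)
        rel_change = np.abs(w_new - w) / np.maximum(np.abs(w), 1e-12)
        w = w_new
        if rel_change.max() <= tol:              # relative-change stopping rule
            break
    return w, standardize(X @ w)


# Synthetic one-factor data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
latent = rng.normal(size=200)
X = latent[:, None] * np.array([0.8, 0.7, 0.6, 0.9]) + 0.5 * rng.normal(size=(200, 4))
weights, scores = mode_a_weights(X)
print(np.round(weights, 3))
```

Because both the indicators and the score are standardized, each converged weight equals the correlation between its indicator and the score.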
For the reflective indicators of an exogenous variable, the weight
$w_{kb}$ is the regression coefficient of $\xi_b$, where $\xi_b$ is
a standardized variable, in the following equation:
\[ x_{kb} = w_{kb}\,\xi_b + e_{kb} \tag{3} \]
Thus, estimation by the OLS method minimizes $\sum e_{kb}^2$.
Equation (3) can be expressed as:

\[ e_{kb} = x_{kb} - w_{kb}\,\xi_b \tag{4} \]
\[ \sum e_{kb}^2 = \sum \left( x_{kb} - w_{kb}\,\xi_b \right)^2 \tag{5} \]
Minimizing $\sum e_{kb}^2$ by differentiating it with respect to
$w_{kb}$ gives:
\[ w_{kb} = \frac{\operatorname{cov}(x_{kb}, \xi_b)}{\operatorname{var}(\xi_b)} \tag{6} \]
The weight estimation steps for the endogenous variables are
similar. The iteration process finishes when it has converged,
with the following limit, where $s$ indexes the iterations:
\[ \left| \frac{w_{kb}^{(s+1)} - w_{kb}^{(s)}}{w_{kb}^{(s)}} \right| \le 10^{-5} \]
For formative indicators, the weight $w_{kb}$ is a multiple
regression coefficient of $\xi_b$. The estimation of the
coefficient vector $w_k$ then follows the same process as in
multiple regression.
Step 2: path and loading estimates
The path estimates are the coefficients linking the unobserved
variables to one another, while the loadings are the coefficients
linking the unobserved variables to their indicators.
Step 3: mean and location parameter (constant) estimates
The third-step estimates are based on the original data matrix,
the weight estimates from step 1, and the path estimates from
step 2. The location parameter is the constant $b_{k0}$ for the
endogenous unobserved variables, while the mean is for the
exogenous unobserved variables.
Comparing the Knowledge Creation Model on Likert Scale and Binary
Data
The comparison between PLS-SEM on Likert scale data and PLS-SEM on
binary data was based on goodness-of-fit testing of the measurement
model (AVE, Composite Reliability, and Cronbach's Alpha) and the
structural model (R-square, f-square, Q2, Chi-Square, RMSEA, SRMR,
NFI, and GoF).
Measurement Model
TABLE 1
FIT MEASURES OF THE MEASUREMENT MODEL
Variables   AVE              Composite Reliability   Cronbach's Alpha
            Likert  Binary   Likert  Binary          Likert  Binary
SV          0.648   0.369    0.88    0.677           0.824   0.543
IKD         0.648   0.589    0.846   0.81            0.734   0.669
DCK         0.387   0.416    0.79    0.81            0.703   0.726
KC          0.648   0.303    0.857   0.817           0.823   0.77

The AVE value describes how much of the variance associated with
an unobserved variable is explained by the measurement model. An
AVE value of at least 0.5 means that the model is good enough [3].
The Likert scale generated better AVE values than the binary data
for every unobserved variable except DCK.
Composite reliability shows the consistency of the indicators in
measuring an unobserved variable. A composite reliability value
> 0.7 shows that the indicators can reflect the unobserved
variable [3]. Overall, both the Likert scale and binary data
resulted in good Composite Reliability values, as did the
Cronbach's Alpha values.
Based on the AVE, Composite Reliability, and Cronbach's Alpha
criteria, it can be concluded that the binary scale was as good as
the Likert scale.
Structural Model
The goodness-of-fit test of the structural model was based on the
R-square, f-square, Q-square, and Goodness of Fit Index (GoF)
values. R-square and f-square values close to 1 indicate a model
that fits the data better. R-square was used to measure the
strength of the relationship among the endogenous variables, while
f-square was used to measure the strength of the relationship
between the exogenous unobserved variables and the endogenous
unobserved variables [3]. The criteria are as follows:
TABLE 2
CRITERIA FOR R-SQUARE AND F-SQUARE
R-square   f-square   Note
0.67       0.35       Strong
0.33       0.15       Medium
0.19       0.02       Low

Tables 3 and 4 show the R-square and f-square values,
respectively, on the Likert scale data and the binary scale data.
TABLE 3
R-SQUARE VALUES ON LIKERT SCALE DATA AND BINARY SCALE DATA
Variable   Likert   Note     Binary   Note
IKD        0.024    Low      0.007    Low
DCK        0.302    Medium   0.215    Low
KC         0.319    Medium   0.149    Low
All R-square values on the binary scale data were low. On the
Likert scale, however, only the IKD variable had a low R-square
value, whereas the other variables had medium values. This shows
that the Likert scale data provided a better model than the binary
scale data.


TABLE 4
F-SQUARE VALUES ON LIKERT SCALE DATA AND BINARY SCALE DATA
Relationship   Likert   Note     Binary   Note
SV vs IKD      0.025    Low      0.007    Low
SV vs DCK      0.079    Low      0.067    Low
IKD vs DCK     0.297    Medium   0.187    Medium
DCK vs KC      0.469    Strong   0.175    Medium
The goodness criteria based on the f-square values indicated that
the Likert scale data were better than the binary scale data.
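The paper reports f-square values without stating the formula. Assuming the usual effect-size definition, f² = (R² with the predictor − R² without it) / (1 − R² with the predictor), the values for endogenous variables with a single predictor can be reproduced directly from the reported R-square values, because removing the only predictor leaves an R-square of zero. The function name `f_square` is illustrative, and the definition is an assumption about what the authors' software computed.

```python
def f_square(r2_included, r2_excluded=0.0):
    """Effect size of a predictor from R-square with and without it."""
    return (r2_included - r2_excluded) / (1 - r2_included)


# Single-predictor cases, using the reported R-square values.
f2_sv_ikd_likert = f_square(0.024)   # SV -> IKD, Likert
f2_dck_kc_likert = f_square(0.319)   # DCK -> KC, Likert
f2_sv_ikd_binary = f_square(0.007)   # SV -> IKD, binary
f2_dck_kc_binary = f_square(0.149)   # DCK -> KC, binary
print(f2_sv_ikd_likert, f2_dck_kc_likert, f2_sv_ikd_binary, f2_dck_kc_binary)
```

The results agree with the reported f-square values to within rounding of the published R-square figures. The two paths into DCK cannot be checked this way, since the R-square of DCK with one predictor removed is not reported.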
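Model evaluation using the Q2 value was obtained through the
following calculation:
\[
\begin{aligned}
Q^2_{\text{likert}} &= 1 - \left[ (1 - R^2_{IKD})(1 - R^2_{DCK})(1 - R^2_{KC}) \right] \\
&= 1 - [(0.976)(0.698)(0.681)] \\
&= 0.536
\end{aligned}
\]
\[
\begin{aligned}
Q^2_{\text{binary}} &= 1 - \left[ (1 - R^2_{IKD})(1 - R^2_{DCK})(1 - R^2_{KC}) \right] \\
&= 1 - [(0.993)(0.785)(0.851)] \\
&= 1 - 0.663 \\
&= 0.337
\end{aligned}
\]
A Q2 value higher than zero means that the model has predictive
relevance, and vice versa. Both Q2 values were above zero, so both
models had predictive relevance: the Likert model could explain
more than 50% of predictive relevance (0.536), while the binary
model could explain about 34% (0.337).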
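The Q2 arithmetic can be checked with a few lines of Python. This is a minimal sketch of the formula as written in this section, Q2 = 1 − Π(1 − R²), applied to the R-square values of Table 3; the function name `q_square` is illustrative.

```python
from math import prod


def q_square(r_squares):
    """Q2 = 1 - product of (1 - R2) over the endogenous variables."""
    return 1 - prod(1 - r2 for r2 in r_squares)


r2_likert = {"IKD": 0.024, "DCK": 0.302, "KC": 0.319}
r2_binary = {"IKD": 0.007, "DCK": 0.215, "KC": 0.149}

q2_likert = q_square(r2_likert.values())
q2_binary = q_square(r2_binary.values())
print(round(q2_likert, 3), round(q2_binary, 3))
```

With the Table 3 values the function returns about 0.536 for the Likert model and about 0.337 for the binary model; for the binary data the bracketed product itself is about 0.663.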
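The structural model in PLS-SEM was evaluated by the GoF, obtained
from the following calculation (Tenenhaus [7], Hussein [8],
Henseler [9]).
\[
\begin{aligned}
GoF_{\text{likert}} &= \sqrt{\overline{AVE} \times \overline{R^2}} \\
&= \sqrt{0.583 \times 0.215} \\
&= 0.354
\end{aligned}
\]
\[
\begin{aligned}
GoF_{\text{binary}} &= \sqrt{\overline{AVE} \times \overline{R^2}} \\
&= \sqrt{0.419 \times 0.124} \\
&= 0.228
\end{aligned}
\]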
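The GoF calculation can be reproduced from the AVE values of Table 1 and the R-square values of Table 3. The sketch below assumes the averages are simple arithmetic means over the four unobserved variables (AVE) and the three endogenous variables (R-square), which matches the 0.583, 0.215, 0.419 and 0.124 figures used in the calculation; the function name `gof` is illustrative.

```python
from math import sqrt
from statistics import mean


def gof(ave_values, r2_values):
    """Goodness of Fit index: sqrt(mean AVE * mean R-square)."""
    return sqrt(mean(ave_values) * mean(r2_values))


ave_likert = [0.648, 0.648, 0.387, 0.648]   # SV, IKD, DCK, KC
ave_binary = [0.369, 0.589, 0.416, 0.303]
r2_likert = [0.024, 0.302, 0.319]           # IKD, DCK, KC
r2_binary = [0.007, 0.215, 0.149]

gof_likert = gof(ave_likert, r2_likert)
gof_binary = gof(ave_binary, r2_binary)
print(round(gof_likert, 3), round(gof_binary, 3))
```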
Both GoF Likert and GoF binary could be regarded as
moderate, as their values ranged between 0.2
0.33
0.05-0.08
> 0.90

t-Statistics Value
The estimated path coefficients in the structural model should be
significant. Significance can be assessed through the
bootstrapping procedure using the t-statistics values. Tables 6
and 7 show the t-statistics values for the measurement model and
the structural model, respectively.
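The bootstrapping idea can be sketched as follows: resample the observations with replacement, re-estimate the coefficient on each resample, and divide the original estimate by the standard deviation of the bootstrap estimates. The sketch uses a simple correlation between one indicator and a score as a stand-in coefficient; in an actual PLS-SEM analysis the whole model would be re-estimated on every resample. The function name `bootstrap_t`, the number of resamples, and the synthetic data are all assumptions for illustration, and the section does not state which cutoff was applied to the t-statistics (1.96 is the conventional two-sided 5% value).

```python
import numpy as np


def bootstrap_t(x, y, n_boot=500, seed=0):
    """Bootstrap t-statistic for the correlation between x and y."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    estimate = np.corrcoef(x, y)[0, 1]
    boot = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)          # resample rows with replacement
        boot[b] = np.corrcoef(x[idx], y[idx])[0, 1]
    std_error = boot.std(ddof=1)                  # bootstrap standard error
    return estimate, estimate / std_error


# Synthetic score and indicator (hypothetical, for illustration only).
rng = np.random.default_rng(1)
score = rng.normal(size=100)
indicator = 0.6 * score + rng.normal(scale=0.8, size=100)
estimate, t_stat = bootstrap_t(score, indicator)
print(round(estimate, 3), round(t_stat, 2), abs(t_stat) > 1.96)
```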
TABLE 6
T-STATISTICS VALUE OF MEASUREMENT
MODEL
t-statistics value
Latent
Relationship
Likert
Binary
Variables
Scale
Scale
1.524
1.363
SV1