Negative and Cyclic Association Rules
Presented by
Saurabh Singh Mathuriya (2013MCS2581)
Under the guidance of
Dr. S. K. Gupta
Department of Computer Science & Engineering
Negative Association Rules
In contrast to positive association rules, negative association rules describe relationships between itemsets in which the occurrence of some itemsets is characterized by the absence of others.
Need of Negative AR:
• Find the sets of items that do not appear together in transactions.
Unexpected and exceptional patterns are referred to as exceptions of rules in positive association mining.
For example, while ‘bird(x) ⇒ flies(x)’ is a well-known fact, an exceptional rule is ‘bird(x), penguin(x) ⇒ ¬flies(x)’.
Such an exception indicates an unexpected pattern, can involve negative terms, and is therefore treated as a special case of a negative rule.
Comparison with Positive AR
Positive association rules consider only items enumerated in transactions, e.g., people buying milk & wheat bread together.
Negative association rules might also consider the same items, but in addition consider negated items (i.e., items absent from transactions), e.g., people buying milk & bread together but not a cold drink.
What is a Negative Association Rule?
If X and Y are sets of items, then {X} ⇒ {¬Y} is a negative association rule (NAR).
• It means itemsets X and Y are negatively correlated.
• In most cases where X is present, Y is absent.
What is Negative AR contd.
A negative rule A ⇒ ¬B also has a measure of its strength, conf, defined as the ratio supp(A ∪ ¬B)/supp(A).
Support-confidence framework for negative rules
— A and B are disjoint itemsets, i.e., A ∩ B = ∅;
— supp(A) ≥ minsup, supp(B) ≥ minsup;
— supp(A ⇒ ¬B) = supp(A ∪ ¬B);
— conf(A ⇒ ¬B) = supp(A ∪ ¬B)/supp(A) ≥ minconf.
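A minimal Python sketch of this framework follows; the helper, toy transactions, and thresholds are illustrative assumptions, not from the slides.

# Sketch: checking the support-confidence framework for A => ¬B
# directly against a (toy) transaction list.

def supp(itemset, transactions):
    """Fraction of transactions containing every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def negative_rule_holds(A, B, transactions, minsup=0.3, minconf=0.7):
    """True iff A => ¬B satisfies the framework conditions above."""
    if A & B:                                    # A and B must be disjoint
        return False
    if supp(A, transactions) < minsup or supp(B, transactions) < minsup:
        return False
    s_neg = supp(A, transactions) - supp(A | B, transactions)  # supp(A ∪ ¬B)
    return s_neg >= minsup and s_neg / supp(A, transactions) >= minconf

# Toy data: most milk buyers do not buy a cold drink.
T = [{"milk", "bread"}, {"milk", "bread"}, {"milk"},
     {"milk", "cold-drink"}, {"bread"}, {"cold-drink"}]
print(negative_rule_holds({"milk"}, {"cold-drink"}, T))   # True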
Negative Rules Forms
A ⇒ ¬B
- Ex: people of age < 30 (less than 30) would NOT buy a sedan.
¬A ⇒ B
- Ex: people of age > 30 (NOT less than 30) would buy a sedan.
¬A ⇒ ¬B
- Ex: people of age > 30 (NOT less than 30) would NOT buy a sedan.
(Note that although a rule of the form ¬X → ¬Y contains negative elements, it is logically equivalent to the positive association rule Y → X, its contrapositive. Therefore it is not considered a negative association rule.)
Assumption 1: The minimum support is 30% and
minimum confidence is 70%.
Assumption 2: The numeric attribute AGE ranges from
18 to 70 and is quantized into two groups - less than
thirty and over thirty.
The rule that satisfies both the minimum support and minimum confidence criteria is “{age < 30} → {coupe}”, whose confidence is 75%. A negative association rule also exists: “{age > 30} → {not purchasing coupe}”, which has a confidence of 83.3%. For the purpose of identifying purchase patterns, the latter clearly has better predictive ability.
The preceding example illustrates that
negative association rules are as important
as positive ones.
Confidence of Negative AR
• To avoid counting negated itemsets directly, we can compute their measures from the positive supports: supp(A ∪ ¬B) = supp(A) − supp(A ∪ B), and hence conf(A ⇒ ¬B) = 1 − conf(A ⇒ B).
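As a sketch, inclusion-exclusion gives the measures of all three negative forms from positive supports alone; the base supports below are hypothetical values chosen so the results match the 75% and 83.3% confidences quoted in the coupe example above.

# Sketch: measures of all three negative forms from positive supports only,
# using inclusion-exclusion (no extra pass over the transactions needed).
def negative_form_measures(sA, sB, sAB):
    """sA = supp(A), sB = supp(B), sAB = supp(A ∪ B)."""
    s1 = sA - sAB                # supp(A ∪ ¬B)
    s2 = sB - sAB                # supp(¬A ∪ B)
    s3 = 1 - sA - sB + sAB       # supp(¬A ∪ ¬B)
    return {
        "A => ¬B":  (s1, s1 / sA),
        "¬A => B":  (s2, s2 / (1 - sA)),
        "¬A => ¬B": (s3, s3 / (1 - sA)),
    }

# Hypothetical supports consistent with the coupe example:
# conf(age<30 => coupe) = 0.3/0.4 = 75%,
# conf(age>30 => ¬coupe) = 0.5/0.6 ≈ 83.3%.
print(negative_form_measures(sA=0.4, sB=0.4, sAB=0.3))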
Locality of Similarity (LOS)
We cannot simply mine positive rules with small support and confidence values, because that would result in many uninteresting rules.
To eliminate unwanted rules and focus on potentially interesting ones, we predict possibly interesting negative ARs by incorporating domain knowledge of the data sets.
We use a taxonomy T for this, which consists of vertices and directed edges.
Every vertex is a class; a vertex with in-degree 0 is the most general class, and a vertex with out-degree 0 is a most specific class.
LOS contd.
Taxonomy T consists of vertices and directed edges; each vertex represents a class.
The semantics of the vertical relationship is that lower-level vertex values are instances of the values of their immediate-predecessor vertices, i.e., the is-a relationship. The vertical relationship is used to discover generalized association rules.
The semantics of the horizontal relationship is that vertices on the same level having the same immediate predecessor (siblings, to borrow rooted-tree terminology) encapsulate similarity among classes.
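A minimal sketch of such a taxonomy, represented simply as parent → children edges (the parent class names and helper are illustrative assumptions; the leaf items come from the example on a later slide): siblings under the same immediate predecessor form one LOS.

# Sketch: taxonomy as parent -> children edges; an LOS is a sibling set.
taxonomy = {
    "Computer": ["Desktop", "Notebook", "Parts"],
    "Desktop":  ["IBM Aptiva", "Compaq Deskpro"],
}

def los_of(item, taxonomy):
    """LOS of `item`: all children of its immediate predecessor vertex."""
    for children in taxonomy.values():
        if item in children:
            return set(children)
    return {item}   # a root (in-degree 0) vertex forms its own LOS

print(los_of("IBM Aptiva", taxonomy))  # {'IBM Aptiva', 'Compaq Deskpro'}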
LOS Contd.
Items belonging to the same LOS tend to participate in
similar association rules. This is generally true because
members of such groups tend to have similar activity
patterns.
For example, in a retail database, the instances are items involved in transactions and the participants are customers. If customers had no brand preference, the purchase probability of each item would be evenly distributed over all brands.
LOS can be extended to different levels under the same parent node. For instance, it is more reasonable to put ‘IBM Aptiva’, ‘Compaq Deskpro’, ‘Notebook’, and ‘Parts’ into one LOS when viewing the database at a more abstract level.
Intuitively, siblings are in the same LOS.
Discovering Negative Rules
To qualify as a negative rule, a candidate must satisfy two conditions:
- First, there must be a large deviation between the estimated and the actual confidence; this deviation is captured by the similarity measure (SM).
- Second, its support and confidence must exceed the required minima.
Pruning
In constructing candidate negative rules, an equivalent or similar pair may be generated.
Another redundancy exists when items come from one LOS and all the resulting rules are sibling rules.
The pruning step keeps either all the positive rules or all the negative rules that have high confidence.
An example is the pruning between the rules “Female → BuyHat” and “¬Male → BuyHat”.
Algorithm
// Finding all positive rules
1. FreqSet1 = {frequent 1-itemsets}
2. Find all positive rules
// Generate negative rules
3. Delete from taxonomy T all items that are not frequent
4. for all positive rules r
5.   TmpRuleSet = genNegCan(r)
6.   for all rules tr in TmpRuleSet
7.     if SM(tr.conf, t.conf) > minconf
8.       Rule = Rule ∪ { Neg(tr) | Neg(tr).supp > minsup, Neg(tr).conf > minconf }
9.     endif
10. endfor
// Pruning
11. Prune results that have the same meaning
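A rough Python rendering of steps 4–11; the slides do not define genNegCan, SM, or the rule record, so everything below is an assumed shape, with SM taken to be the relative deviation between the estimated and the actual confidence.

# Sketch of steps 4-11. `gen_neg_can(r)` is assumed to yield candidate
# rules carrying an LOS-estimated confidence (est_conf) and an actual
# confidence (conf); `negate(tr)` is assumed to build the negative rule.

def SM(est_conf, actual_conf):
    """Assumed similarity measure: relative deviation of the confidences."""
    return abs(est_conf - actual_conf) / max(est_conf, 1e-9)

def find_negative_rules(positive_rules, gen_neg_can, negate,
                        minsup, minconf):
    result = []
    for r in positive_rules:                        # step 4
        for tr in gen_neg_can(r):                   # steps 5-6
            if SM(tr.est_conf, tr.conf) > minconf:  # step 7
                neg = negate(tr)                    # step 8
                if neg.supp > minsup and neg.conf > minconf:
                    result.append(neg)
    return prune_same_meaning(result)               # step 11

def prune_same_meaning(rules):
    """Placeholder for step 11: drop equivalent rules, e.g.
    'Female -> BuyHat' vs '¬Male -> BuyHat'."""
    return rules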
Conclusion
Given the number of positive rules P and the average size of the LOS L, the complexity of the algorithm is O(P × L).
The complexity does not depend on the number of transactions, since it is assumed that the supports of itemsets have already been counted and stored for use in this as well as other mining applications.
The complexity of discovering positive rules depends not only on the number of transactions, but also on the sizes of the attribute domains and the number of attributes.
The overall complexity of finding negative ARs is therefore proportional to that of discovering positive rules. Performance is also affected by the choice of minimum support.
Applications
Mining large databases
- Helps to limit the search space in huge databases by combining the known positive associations with negative rules, based on domain knowledge.
- Example: the positive association of buying milk & bread can be combined with NOT buying a bottle of beer.
Limitations
First, we cannot simply pick threshold values for support and confidence that are guaranteed to be effective in sifting both positive and negative rules; if they are not chosen appropriately, an impractical volume of negative rules results, which might impact performance.
- For example, if there are 50,000 items in a store, then the number of possible item combinations is 2^50,000, and a majority of them will never appear together even once in the entire database. If the absence of a certain item combination is taken to mean negative association, then we can generate millions of negative association rules. However, most of these rules are likely to be extremely uninteresting.
- Solution: there is a need to explicitly find only the interesting negative rules, e.g., by incorporating domain knowledge as described earlier.
Cyclic Association Rules
Some itemsets occur periodically, after a certain period of time.
A cyclic rule meets the minimum confidence and support at regular time intervals; it need not hold for the entire transactional database.
Overview
Step 1: The dataset is divided into time segments.
Step 2: Existing methods are used to discover frequent itemsets in each segment.
Step 3: Pattern-matching algorithms are applied to detect cycles in the association rules.
Step 4: Techniques called cycle pruning and cycle skipping allow us to significantly reduce the amount of wasted work performed during the data mining process.
Problem Definition
We denote the i-th time unit, i ≥ 0, by t_i. That is, t_i corresponds to the time interval [i·t, (i+1)·t), where t is the unit of time.
We denote the set of transactions executed in t_i by D[i].
The support of an itemset X in D[j] is the fraction of transactions in D[j] that contain the itemset.
The confidence of a rule X → Y in D[j] is the fraction of transactions in D[j] containing X that also contain Y.
An association rule X → Y holds in time unit t_j if the support of X ∪ Y in D[j] is at least supmin and the confidence of X → Y in D[j] is at least conmin.
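A short sketch of these definitions (a segment is a list of transaction itemsets; the function names are mine):

# Sketch: per-segment support and confidence, per the definitions above.
def supp_in(X, segment):
    """Support of itemset X in D[j]: fraction of transactions containing X."""
    return sum(1 for t in segment if X <= t) / len(segment)

def rule_holds_in(X, Y, segment, supmin, conmin):
    """Does X -> Y hold in this time unit t_j?"""
    s_xy = supp_in(X | Y, segment)
    s_x  = supp_in(X, segment)
    return s_xy >= supmin and s_x > 0 and s_xy / s_x >= conmin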
Problem Definition contd.
A cycle c is a tuple (l, o) consisting of a length l and an offset o (the first time unit in which the cycle occurs), 0 ≤ o < l. We say that an association rule has a cycle c = (l, o) if the association rule holds in every l-th time unit starting with time unit t_o.
For example, if the unit of time is an hour and “Tea ⇒ Biscuit” holds during the interval 7AM-8AM every day (i.e., every 24 hours), then “Tea ⇒ Biscuit” has the cycle (24, 7).
A cycle (l_i, o_i) is a multiple of another cycle (l_j, o_j) if l_j divides l_i and o_j = o_i mod l_j holds.
Problem Definition contd.
A time unit t_i is said to be “part of cycle c” or to “participate in cycle c” if o = i mod l holds.
Example: if the binary sequence 001100010101 represents the association rule X → Y, then X → Y holds in D[2], D[3], D[7], D[9], and D[11]. In this sequence, (4, 3) is a cycle.
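A brute-force sketch of cycle detection over such a binary hold-sequence (the sequential algorithm uses pattern matching; this naive version simply checks every (l, o) pair):

# Sketch: find all cycles (l, o) of a rule from its binary hold-sequence.
def find_cycles(holds, lmin=2, lmax=None):
    """holds[i] == 1 iff the rule holds in D[i]; a cycle (l, o) requires
    the rule to hold at every time unit i with i mod l == o."""
    n = len(holds)
    lmax = lmax if lmax is not None else n // 2
    cycles = []
    for l in range(lmin, lmax + 1):
        for o in range(l):
            if all(holds[i] for i in range(o, n, l)):
                cycles.append((l, o))
    return cycles

seq = [0, 0, 1, 1, 0, 0, 0, 1, 0, 1, 0, 1]   # 001100010101 from above
print(find_cycles(seq))                       # contains (4, 3)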
Modifying existing algorithm
The existing algorithms for discovering association rules cannot be applied directly.
One option is to extend the set of items with time attributes and then generate the rules.
The Sequential Algorithm
Step 1: Finding association rules (in each time segment)
– Maximal frequent itemsets are generated.
– Association rules are generated from the large itemsets.
Step 2: Cycle detection
– By a pattern-matching algorithm.
The complexity of the cycle-detection phase has an upper bound of O(r · n · lmax), where
- r is the number of rules detected,
- n is the number of segments,
- lmax is the maximum length of a cycle of interest.
The Sequential Algorithm contd.
Cycle Pruning, Cycle Skipping, and Cycle Elimination
The major portion of the running time of the sequential algorithm is spent calculating the support for itemsets.
A cycle of the rule X → Y is a multiple of a cycle of the itemset X ∪ Y.
Cycle skipping:
If time unit t_i is not part of a cycle of an itemset X, then there is no need to calculate the support for X in time segment D[i].
Cycle pruning:
If an itemset X has a cycle (l, o), then every subset of X also has the cycle (l, o).
Cycle elimination:
If the support for an itemset X is below the minimum support threshold supmin in time segment D[i], then X cannot have any of the cycles (j, i mod j), lmin ≤ j ≤ lmax.
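A sketch of how cycle skipping and cycle elimination interact during support counting; the candidate_cycles layout is an assumption, and supp_in is the per-segment support helper sketched earlier.

# Sketch: cycle skipping + cycle elimination while scanning segment D[i].
# candidate_cycles maps each itemset to the cycles it could still have.
def process_segment(i, segment, candidate_cycles, supmin):
    for X, cycles in candidate_cycles.items():
        # Cycle skipping: count support in D[i] only if t_i participates
        # in at least one surviving candidate cycle of X.
        if not any(i % l == o for (l, o) in cycles):
            continue
        if supp_in(X, segment) < supmin:
            # Cycle elimination: X cannot have any cycle (j, i mod j).
            candidate_cycles[X] = {(l, o) for (l, o) in cycles
                                   if i % l != o}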