AprioriTID algorithm

首页 > 代码库 > AprioriTID algorithm

2024-07-28 17:38:01 219人阅读

What is AprioriTID?

AprioriTID is an algorithm for discovering frequent itemsets (groups of items appearing frequently) in a transaction database. It was proposed by Agrawal & Srikant (1993).

AprioriTID is a variation of the Apriori algorithm. It was proposed in the same article as Apriori as an alternative implementation of Apriori. It produces the same output as Apriori. But it uses a different mechanism for counting the support of itemsets.

比较Apriori与AprioriTID如下：

数据结构方面：

在Apriori算法中，首先利用HashMap<Integer,Integer>存储每个项与其出现的次数之间的映射关系，取出频繁项构成List集合：frequent1. 将此List集合作为生成k=2时候选项的输入。　　

除了k=1外，其余k值的每个候选项存储在每个Itemset类的对象中，由List<Itemset>集合candidates统一存储。Itemset类中拥有存、取候选项，存储候选项支持度(support)的各函数。全部的频繁项集对象由List<Itemset>集合level存储。（level自然作为k>2时生成候选项函数的输入）

在AprioriTID算法中，用HashMap<Integer,Set<Integer>>存储每个项item与其出现的位置(transaction ID)之间的映射关系，从k=1时，直接将频繁项集存储在Itemset对象中(在对象中有集合存储TID)，并用List<Itemset>集合level存储各Itemset对象。Itemset类中增添了transaction ID集合，保存项集所对应的transaction ID。

在算法方面：

在AprioriTID算法中，当k>=2时，依旧通过we compare items of itemset1 and itemset2.If they have all the same k-1 items and the last item of itemset1 is smaller than the last item of itemset2, we will combine them to generate a candidate来生成候选项集。查看结合在一起的候选集的共同的tid(common tids)，当common tids中元素个数满足minsup则结合在一起的候选集为频繁项，(相比apriori效率提高了一些，apriori是将候选项不断与transaction作比较，计算各候选项支持度)保存频繁项和其对应的common tids到Itemset对象中，统一由List<Itemset>集合candidates存储，通过saveItemset()函数保存频繁项集之后，candidates作为下一次计算k+1时频繁项的输入。　　

AprioriTID algorithm

声明：以上内容来自用户投稿及互联网公开渠道收集整理发布，本网站不拥有所有权，未作人工编辑处理，也不承担相关法律责任，若内容有误或涉及侵权可进行投诉：投诉/举报工作人员会在5个工作日内联系你，一经查实，本站将立刻删除涉嫌侵权内容。

联系
我们

首页 > 代码库 > AprioriTID algorithm

AprioriTID algorithm

看完仍有疑问？有类似问题直接问程序猿