Design and Analysis of JIC Algorithm on Big Data
Abstract
Computers have significantly impacted various fields, but managing vast amounts of information remains challenging. Artificial intelligence helps machines make data-driven decisions, yet large datasets still pose difficulties for researchers. To tackle these challenges and gain deep insights from large datasets, we proposed a novel technique that uses Geometric Progression series numbers (GPLN) to label singleton frequent itemsets and Cumulative Geometric Progression series numbers (CGPLN) to label itemsets containing multiple frequent items. Initially, the algorithm used 2 as the constant 'r' for generating the series, but this proved inadequate for large datasets. This paper therefore proposes the Jagged Itemset Counting (JIC) algorithms 1 and 2, which reduce the value of 'r' and introduce dotted pairs for CGPLN labels to represent frequent itemsets. The redefined methodology requires two passes over the transaction database: the first pass performs pre-processing, identifies singleton (1-k) frequent itemsets, determines GPLNs, and partitions the dataset. The partitions are then processed sequentially: JIC-Algorithm-1 is applied to the first partition and JIC-Algorithm-2 to the remaining partitions. The n-k frequent itemsets from all partitions are combined using the Join algorithm, together with the 1-k itemsets identified earlier. For small and medium-sized datasets, the JIC methodology outperforms the Apriori and Eclat algorithms, achieving better execution times even at low support thresholds. In Big Data scenarios, where FP-Growth and Eclat struggled, the proposed methodology excelled in execution time, main memory consumption, and disk memory utilization.
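To make the labelling idea concrete, the sketch below illustrates GPLN labels for singleton frequent items and a summed CGPLN label for a multi-item itemset, assuming r = 2 and a first term of 1 as in the original scheme; the item names and function names are illustrative and not taken from the paper. It is a minimal sketch of the labelling step only, not of the JIC algorithms or dotted pairs introduced for reduced values of 'r'.

```python
# Minimal sketch of GPLN/CGPLN labelling, assuming r = 2 and a first term of 1.
# Function and item names are illustrative, not from the paper.

def gpln_labels(frequent_items, r=2, first_term=1):
    """Assign each singleton frequent item a Geometric Progression number:
    first_term, first_term*r, first_term*r^2, ..."""
    return {item: first_term * r ** i for i, item in enumerate(sorted(frequent_items))}

def cgpln(itemset, labels):
    """Label a multi-item itemset with the cumulative (summed) GPLNs of its members."""
    return sum(labels[item] for item in itemset)

if __name__ == "__main__":
    items = ["bread", "butter", "milk", "eggs"]
    labels = gpln_labels(items)
    print(labels)                           # {'bread': 1, 'butter': 2, 'eggs': 4, 'milk': 8}
    print(cgpln({"bread", "milk"}, labels)) # 1 + 8 = 9
```

With r = 2 every itemset maps to a distinct sum (a bitmask-like encoding), e.g. {bread, milk} is the only itemset whose CGPLN is 9; handling a reduced 'r', where the paper introduces dotted pairs for CGPLN labels, is beyond this sketch.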