Implementation of a Text-Based Plagiarism Detection System
Abstract
Plagiarism, the act of taking another person's work without acknowledgment, is a serious challenge in the academic world. Scientific work, a common target of plagiarism, is increasingly shaped by information technology. This research implements a text-based plagiarism detection system and compares the similarity levels produced by the Cosine Similarity and Jaccard Similarity algorithms against Winnowing for text similarity detection, across N-gram values of 3, 5, and 7. Testing was carried out in the Python programming language, with its supporting libraries, on a dataset of 20 sentences. The test results show that Cosine Similarity is better at detecting similarities between texts. Accuracy analysis with a confusion matrix yields an accuracy of 50%. Across the different N-gram variations, Cosine Similarity has a total similarity performance of 15.89% and an average of 0.26%, while Jaccard Similarity has a total of 13.59% and an average of 0.23%. Although Cosine Similarity achieves higher accuracy than Jaccard Similarity, its stability does not reach 100%.
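As an illustration of the comparison described in the abstract, the following minimal Python sketch computes Cosine Similarity and Jaccard Similarity over character N-grams and builds simplified Winnowing fingerprints. It is not the authors' implementation. The function names, the whitespace and lowercase normalization, the CRC32 hash, the window size of 4, and the two sample sentences are all assumptions made for this example.

```python
import math
import zlib
from collections import Counter


def char_ngrams(text, n):
    """Overlapping character n-grams of the text, lowercased, whitespace removed (assumed normalization)."""
    cleaned = "".join(text.lower().split())
    return [cleaned[i:i + n] for i in range(len(cleaned) - n + 1)]


def cosine_similarity(a, b, n):
    """Cosine of the angle between the n-gram count vectors of two texts."""
    va, vb = Counter(char_ngrams(a, n)), Counter(char_ngrams(b, n))
    dot = sum(va[g] * vb[g] for g in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def jaccard_similarity(a, b, n):
    """Size of the intersection of the two n-gram sets divided by the size of their union."""
    sa, sb = set(char_ngrams(a, n)), set(char_ngrams(b, n))
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def winnow(text, n, window=4):
    """Simplified Winnowing: hash every n-gram and keep the minimum hash of each sliding window.

    The window size and the CRC32 hash are assumptions; positions are not recorded.
    """
    hashes = [zlib.crc32(g.encode("utf-8")) for g in char_ngrams(text, n)]
    if not hashes:
        return set()
    if len(hashes) <= window:
        return {min(hashes)}
    return {min(hashes[i:i + window]) for i in range(len(hashes) - window + 1)}


if __name__ == "__main__":
    # Hypothetical sample sentences, not taken from the study's dataset.
    s1 = "Plagiarism is a serious challenge in the academic world"
    s2 = "Plagiarism is a serious problem in academic work"
    for n in (3, 5, 7):
        f1, f2 = winnow(s1, n), winnow(s2, n)
        shared = len(f1 & f2) / len(f1 | f2) if f1 | f2 else 0.0
        print(n, cosine_similarity(s1, s2, n), jaccard_similarity(s1, s2, n), shared)
```

Larger N-gram values make each gram more specific, so scores for paraphrased text generally fall as N increases. That is one reason a study would compare N = 3, 5, and 7.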
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.