OUTLIER DETECTION USING K-MEANS CLUSTERING

Naniek Widyastuti

Authors

Naniek Widyastuti, Institut Sains & Teknologi AKPRIND Yogyakarta

Keywords:

clustering, outliers, k-means

Abstract

Outlier detection is important and has many applications, including identifying disruption and congestion in computer networks, criminal activity in e-commerce, credit card fraud, and other suspicious activities. This paper discusses outlier detection using the k-means clustering method. The number of clusters is treated as a parameter and increased incrementally until small clusters emerge, and these small clusters are then regarded as outliers. Finally, illustrations are given of how the method is applied to several data sets.
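The incremental procedure described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names `kmeans` and `detect_outliers`, the deterministic farthest-point seeding, the `small_size` threshold that defines a "small" cluster, and the `max_k` cap are all hypothetical choices, since the abstract does not specify them.

```python
import math

def kmeans(points, k, iters=100):
    """Plain Lloyd's k-means; returns one cluster label per point.

    Assumption: deterministic farthest-point seeding, so results are repeatable.
    """
    centers = [points[0]]
    while len(centers) < k:
        # Next seed: the point farthest from all centers chosen so far.
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = None
    for _ in range(iters):
        # Assign every point to its nearest center.
        new_labels = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels

def detect_outliers(points, max_k=10, small_size=2):
    """Increase k until some cluster has at most small_size members.

    Returns (k, indices of points in small clusters), or (None, []) if no
    small cluster appears up to max_k. Both thresholds are assumptions.
    """
    for k in range(2, max_k + 1):
        labels = kmeans(points, k)
        small = {j for j in set(labels) if labels.count(j) <= small_size}
        if small:
            return k, [i for i, lab in enumerate(labels) if lab in small]
    return None, []

# Two tight groups plus one isolated point at index 8.
data = [(0, 0), (0, 1), (1, 0), (1, 1),
        (100, 100), (100, 101), (101, 100), (101, 101),
        (30, 30)]
k, outliers = detect_outliers(data)
print(k, outliers)  # prints: 3 [8]
```

With k = 2 the isolated point is absorbed into the nearer group, so no small cluster exists. At k = 3 it forms a cluster of its own and is reported as an outlier. The result depends on the chosen `small_size`, which would need tuning for real data.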


