Dspace @ IIM Kozhikode

SVDD Variants for Anomaly Detection with Implementations using Hadoop & Spark

dc.contributor.author Rekha, A G
dc.date.accessioned 2022-12-16T07:47:11Z
dc.date.available 2022-12-16T07:47:11Z
dc.date.issued 2016
dc.identifier.uri http://dspace.iimk.ac.in:80/xmlui/handle/2259/1082
dc.description Research Advisory Committee: Prof. Mohammed Shahid Abdulla (Chairperson), Prof. Asharaf S (Member), Prof. Saji Gopinath (Member). A hard copy of the thesis is available in the library; please contact the help desk for reference. en_US
dc.description.abstract Big data analytics facilitates better-informed business decisions through the analysis of large data sets that remain unexploited by traditional business intelligence systems. ‘Big Data’ as input enhances the inferential power of established algorithms, but it challenges even state-of-the-art computation and analysis methods. Although machine learning offers a way to overcome these problems, its current techniques must be improved to deal with Big Data. Another drawback of big data analytics is its greater focus on aggregates than on outliers, even though in many situations the insights gathered from outliers could be of more significance. In light of this, the focus of this work is on developing machine learning techniques that make outlier detection practical on large business datasets. For over a decade, the Support Vector Data Description (SVDD) technique has shown good predictive accuracy on a wide range of outlier detection tasks, and it has also been adapted to numerous business problems. Inspired by this trend, this thesis explores the scalability problems associated with SVDD and tries to address them. Three approaches, namely LT-SVDD, ELT-SVDD, and PELT-SVDD, are proposed. The feasibility of these methods was assessed through a set of experiments on synthetic as well as benchmark data sets, many of which showed an order-of-magnitude advantage in running time. The application of these methods to three real-world business problems is also demonstrated. This work contributes to the support vector literature by establishing these methods as efficient for outlier detection on large data sets. en_US
dc.language.iso en en_US
dc.publisher Indian Institute of Management Kozhikode en_US
dc.subject Anomaly Detection en_US
dc.subject Big Data en_US
dc.subject Support Vector Data Description en_US
dc.title SVDD Variants for Anomaly Detection with Implementations using Hadoop & Spark en_US
dc.type Thesis en_US


This item appears in the following Collection(s)

  • Theses [5]
    Theses submitted in the area of Computer Science/ Information Technology.
