Know Your Data: Part 2

Summary: To build an effective learning model, you must understand the quality issues that exist in data and how to detect and deal with them. In general, data quality issues fall into four major categories.

 


Noise

Many say that if there were no noise in data, data mining would be too easy. Noise in data represents a modification of the original values. Prof. Jeff M. Phillips of the University of Utah defines the main causes of noise in data as listed below.

......

Outliers

As the name implies, outliers are data objects that are considerably different from most of the other data objects. The object pointed out in the image below has different (X, Y) attributes from all the other data objects and hence qualifies as an outlier.

.....
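As a rough illustration of detecting a point whose (X, Y) attributes differ from the rest, here is a minimal sketch using the interquartile range (IQR) rule on each attribute. The IQR rule, the function name `iqr_outliers`, the threshold `k=1.5`, and the sample points are all assumptions made for this example; the article excerpt does not say which detection method it uses.

```python
import statistics


def iqr_outliers(points, k=1.5):
    """Return points whose X or Y lies outside [Q1 - k*IQR, Q3 + k*IQR].

    Hypothetical helper for illustration; k=1.5 is a common convention,
    not a value taken from the article.
    """
    def bounds(values):
        # statistics.quantiles with n=4 returns the three quartile cut points
        q1, _, q3 = statistics.quantiles(values, n=4)
        iqr = q3 - q1
        return q1 - k * iqr, q3 + k * iqr

    x_lo, x_hi = bounds([x for x, _ in points])
    y_lo, y_hi = bounds([y for _, y in points])
    return [
        (x, y)
        for x, y in points
        if not (x_lo <= x <= x_hi and y_lo <= y <= y_hi)
    ]


# Made-up sample: five points clustered near (1, 1) and one far away
points = [(1.0, 1.2), (1.1, 0.9), (0.9, 1.0), (1.2, 1.1), (1.0, 1.0), (8.0, 9.0)]
outliers = iqr_outliers(points)
print(outliers)
```

On this sample only the point far from the cluster is flagged. Checking each attribute separately is a simplification: it can miss points that are unusual only in combination, where a distance-based or density-based method may work better.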

Missing values

It is quite possible for data objects to be missing one or more attribute values.

.....
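To make the idea of missing attribute values concrete, the sketch below counts missing values per attribute and then shows two common ways of handling them: dropping incomplete objects and filling gaps with the attribute mean. These strategies, the helper names, and the sample records are assumptions for illustration; the excerpt does not state which strategies the full article recommends.

```python
from statistics import mean

# Made-up records; None marks a missing attribute value
records = [
    {"age": 34, "income": 52000.0},
    {"age": None, "income": 61000.0},
    {"age": 29, "income": None},
    {"age": 41, "income": 58000.0},
]


def missing_counts(rows):
    """Count how many rows are missing each attribute."""
    counts = {}
    for row in rows:
        for key, value in row.items():
            counts[key] = counts.get(key, 0) + (value is None)
    return counts


def drop_incomplete(rows):
    """Keep only rows in which every attribute is present."""
    return [row for row in rows if all(v is not None for v in row.values())]


def impute_mean(rows):
    """Replace each missing value with the mean of the observed values."""
    keys = rows[0].keys()
    means = {
        key: mean(row[key] for row in rows if row[key] is not None)
        for key in keys
    }
    return [
        {key: means[key] if row[key] is None else row[key] for key in keys}
        for row in rows
    ]


counts = missing_counts(records)
complete = drop_incomplete(records)
filled = impute_mean(records)
```

Dropping rows is simple but discards information, while mean imputation keeps every object at the cost of shrinking the attribute's variance. Which trade-off is acceptable depends on how much data is missing and why.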

Duplicate data

.....

Full Text: kdnuggets

 


 

