
Data Mining

December 26, 2017


...

Note: See Missing value transformation for further details on handling missing values.

- PCA for variable reduction:

We removed the uninformative variables and replaced the missing values, but many attributes still remained and needed to be reduced. For this purpose, we applied principal component analysis (PCA) to sets of related attributes, so that each set could be represented by a smaller number of component vectors in place of the original variables.

PCA categories: We divided the attributes into 11 categories for PCA.
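The reduction step can be illustrated with a minimal, dependency-free Python sketch. It is not the RapidMiner process used in this analysis. The helper names (`standardize`, `covariance`, `top_components`, `project`) and the toy values are invented for illustration, and standardizing the columns first is an assumption. Only the column names AVGGIFT, LASTGIFT and NGIFTALL come from the dataset. The sketch scales each column, builds the covariance matrix, and extracts the leading components by power iteration with deflation.

```python
import math
import random


def standardize(rows):
    """Center each column on 0 and scale it to unit sample variance."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    # A constant column has std 0; fall back to 1.0 to avoid dividing by zero.
    stds = [
        math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1)) or 1.0
        for j in range(d)
    ]
    return [[(r[j] - means[j]) / stds[j] for j in range(d)] for r in rows]


def covariance(rows):
    """Sample covariance matrix of rows that are already centered."""
    n, d = len(rows), len(rows[0])
    return [
        [sum(r[a] * r[b] for r in rows) / (n - 1) for b in range(d)]
        for a in range(d)
    ]


def top_components(cov, k, iters=500):
    """Return the k largest (eigenvalue, unit eigenvector) pairs of cov."""
    d = len(cov)
    cov = [row[:] for row in cov]  # work on a copy; deflation mutates it
    rng = random.Random(0)         # fixed seed keeps the sketch repeatable
    pairs = []
    for _ in range(k):
        v = [rng.random() + 0.1 for _ in range(d)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iters):
            w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        # Rayleigh quotient v^T C v gives the eigenvalue for the unit vector v.
        eigval = sum(
            v[a] * sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)
        )
        pairs.append((eigval, v))
        # Deflation: remove the component just found before the next pass.
        for a in range(d):
            for b in range(d):
                cov[a][b] -= eigval * v[a] * v[b]
    return pairs


def project(rows, vectors):
    """Score each row on each component (one new column per component)."""
    return [[sum(x * w for x, w in zip(r, v)) for v in vectors] for r in rows]


# Invented toy rows; columns stand for AVGGIFT, LASTGIFT, NGIFTALL.
toy = [
    [10.0, 12.0, 3.0],
    [20.0, 18.0, 8.0],
    [15.0, 16.0, 1.0],
    [30.0, 33.0, 6.0],
    [25.0, 24.0, 2.0],
    [5.0, 6.0, 9.0],
]

scaled = standardize(toy)
pairs = top_components(covariance(scaled), k=2)
scores = project(scaled, [vec for _, vec in pairs])

# After standardizing, the total variance equals the number of columns.
explained = [eigval / len(toy[0]) for eigval, _ in pairs]
print("share of variance per component:", explained)
```

Each row of `scores` holds the two derived values that would stand in for the three original columns of that record.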

- Giving history: We selected these variables for PCA because a donor's giving history can be a contributing factor in a model that predicts potential donors. The RapidMiner model is shown in the appendix, and the parameters we used for the PCA are given in the appendix table.

Attributes input to the Giving History PCA

AVGGIFT

CARDGIFT

LASTGIFT

MAXRDATE

MINRAMNT

MINRDATE

NEXTDATE

NGIFTALL

RAMNTALL

TIMELAG

- Promotion history: We included the promotion history attributes so that past promotions and the donors' responses to them could be used in our predictive modelling. The parameters we used for the PCA are given in the appendix table.

Attributes input to the Promo_History PCA

CARDPM12

CARDPROM

MAXADATE

MINRDATE

NUMPRM12

NUMPROM

- Neighborhood attributes: Of the 480 attributes in total, 286 described characteristics of the donor’s neighborhood. To derive meaningful insight from this segment while reducing the number of variables, we performed PCA on this category to capture the combined effect of many attributes in a few derived variables. The parameters we used for the PCA are given in the appendix table.

AGE907

POP901

CHIL1

POP902

CHIL2

POP903

CHIL3

POP90C1

AGEC1

POP90C2

AGEC2

POP90C3

AGEC3

POP90C4

AGEC4

POP90C5

AGEC5

AGE901

AGEC6

AGE902

AGEC7

AGE903

CHILC1

AGE904

CHILC2

AGE905

CHILC3

AGE906

CHILC4

CHILC5

MARR4

HHAGE1

HHP1

HHAGE2

HHP2

HHAGE3

HHN4

HHN1

HHN5

HHN2

HHN6

HHN3

MARR1

MARR2

MARR3
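How many components were kept for a large group like this one is not stated here, since the parameters are in the appendix table. One common rule, sketched below, keeps the smallest number of components whose cumulative share of variance reaches a threshold. The function name `components_for_threshold`, the 0.95 default, and the eigenvalues in the example are all hypothetical and not taken from this analysis.

```python
def components_for_threshold(eigenvalues, threshold=0.95):
    """Smallest number of leading components whose cumulative share of
    variance reaches `threshold` (hypothetical helper, for illustration)."""
    if not 0.0 < threshold <= 1.0:
        raise ValueError("threshold must be in (0, 1]")
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("eigenvalues must sum to a positive value")
    cumulative = 0.0
    for k, eigval in enumerate(ordered, start=1):
        cumulative += eigval
        if cumulative / total >= threshold:
            return k
    # Reached only if rounding keeps the final ratio just below the threshold.
    return len(ordered)


# Invented eigenvalues for a five-attribute group (not real results).
example_eigenvalues = [4.0, 3.0, 2.0, 0.5, 0.5]
kept = components_for_threshold(example_eigenvalues, threshold=0.8)
print("components kept:", kept)
```

A lower threshold keeps fewer derived variables at the cost of discarding more of the group's variance, so the value is a modelling choice rather than a fixed constant.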

- Donor interest in other types of mail order: We considered these attributes because they describe the donor's interests and responses to different types of mail order offers. Because this information can help us build a better model, we applied PCA to these attributes. The parameters we used for the PCA are given in the appendix table.

Attributes input to the PCA Response analysis

MAGFAML

MAGFEM

MAGMALE

MBBOOKS

MBCOLECT

MBCRAFT

MBGARDEN

PUBCULIN

PUBDOITY

PUBGARDN

PUBHLTH

PUBNEWFN

PUBOPP

PUBPHOTO

- Military history: We applied PCA to the attributes describing the donor’s military history.

MALEMILI

MALEVET

VIETVETS

WWIIVETS

LOCALGOV

STATEGOV

FEDGOV

- Ethnicity: Based on the neighborhood population data, we applied PCA to the ethnicity variables ETH1-ETH16.

- Housing PCA: Based on the neighborhood population's housing information.

- Income

...
