How to read a PCA biplot in R


I. How to read a PCA biplot in R

# The main function for principal component analysis in R is princomp()
# princomp():  run a PCA from either the correlation matrix or the covariance matrix
# summary():   extract the principal-component summary
# loadings():  show the loadings from a PCA or factor analysis
# predict():   compute the principal-component scores
# screeplot(): draw the scree plot
# biplot():    plot the observations on the principal components together with the directions of the original axes

3. Example
# Heights, weights, chest circumferences, and sitting heights of 30 secondary-school
# students; run a principal component analysis on the four body measurements.
# 1. Load the raw data (the source truncates here: only X1, height, is given)
test <- data.frame(
  X1 = c(148, 139, 160, 149, 159, 142, 153, 150, 151, 139, 140, 161, 158, 140, 137,
         152, 149, 145, 160, 156, 151, 147, 157, 147, 157, 151, 144, 141, 139, 148),
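Assuming the remaining columns X2 (weight), X3 (chest circumference), and X4 (sitting height) are filled in the same way, a minimal sketch of the rest of the workflow with the functions listed above:

pr <- princomp(test, cor = TRUE)  # PCA from the correlation matrix
summary(pr, loadings = TRUE)      # variance explained per component, plus the loadings
screeplot(pr, type = "lines")     # scree plot of the component variances
predict(pr)                       # principal-component scores for each student
biplot(pr)                        # observations and original-variable directions on one plot

Reading the biplot: each point is one student placed by its scores on the first two components, and each arrow is one of the original variables, pointing in the direction in which that variable increases. Arrows pointing the same way suggest positively correlated variables, and a point lying far out along an arrow scores high on that variable.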

II. How to get confidence intervals from the JM package in R

Pass interval = "confidence" to predict().
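For instance, a minimal sketch built on the aids data that ships with JM (argument names as documented in ?predict.jointModel; the model specification here is an assumption for illustration):

library(JM)   # also loads nlme and survival

# longitudinal submodel: CD4 trajectory over time
fitLME <- lme(CD4 ~ obstime + obstime:drug, random = ~ obstime | patient, data = aids)
# survival submodel on the one-row-per-patient data; x = TRUE is required
fitSURV <- coxph(Surv(Time, death) ~ drug, data = aids.id, x = TRUE)
fitJM <- jointModel(fitLME, fitSURV, timeVar = "obstime")

# marginal predictions of the longitudinal outcome with pointwise confidence intervals
DF <- data.frame(obstime = seq(0, 18, length.out = 25),
                 drug = factor("ddC", levels = levels(aids$drug)))
preds <- predict(fitJM, newdata = DF, interval = "confidence", returnData = TRUE)
head(preds)   # adds columns such as pred, low, upp to DF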

/* Selection sort: repeatedly find the minimum of the unsorted
   portion and swap it into place. */
void select_sort(int *a, int n)
{
    int i, j, min, t;

    for (i = 0; i < n - 1; i++) {
        min = i;                  /* index of the smallest element in a[i..n-1] */
        for (j = i + 1; j < n; j++)
            if (a[min] > a[j])
                min = j;
        if (min != i) {           /* swap it into position i */
            t = a[min];
            a[min] = a[i];
            a[i] = t;
        }
    }
}

III. The Beauty of Data Analysis: Decision Trees in R


Implementing decision trees in R

1. Prepare the data


> install.packages("tree")
> library(tree)
> library(ISLR)
> attach(Carseats)
> High = ifelse(Sales <= 8, "No", "Yes")   # derive a High/not-High label from Sales
> Carseats = data.frame(Carseats, High)    # add the High column to the data set
> fix(Carseats)

2. Grow the tree


> tree.carseats=tree(High~.-Sales,Carseats)

> summary(tree.carseats)


# the training error rate is 9%

Classification tree:

tree(formula = High ~ . - Sales, data = Carseats)

Variables actually used in tree construction:

[1] "ShelveLoc"   "Price"       "Income"      "CompPrice"   "Population"
[6] "Advertising" "Age"         "US"

Number of terminal nodes: 27

Residual mean deviance: 0.4575 = 170.7 / 373

Misclassification error rate: 0.09 = 36 / 400
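For a classification tree the deviance reported here is −2 Σ_m Σ_k n_mk log(p̂_mk), summed over terminal nodes m and classes k, and the residual mean deviance divides that total by n − |T0|, the number of observations minus the number of terminal nodes: 170.7 / (400 − 27) = 170.7 / 373 ≈ 0.4575. A smaller value means the tree fits the training data more closely.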

3. Plot the tree


> plot(tree.carseats)
> text(tree.carseats, pretty = 0)

4. Test error


# prepare the training and test data
# We begin by using the sample() function to split the observations into two
# halves, selecting a random subset of 200 of the original 400 observations.

> set.seed(1)
> train = sample(1:nrow(Carseats), 200)
> Carseats.test = Carseats[-train, ]
> High.test = High[-train]

# fit the tree on the training data only
> tree.carseats = tree(High ~ . - Sales, Carseats, subset = train)
# estimate the test error by predicting on the held-out half;
# predict() is a generic function for predictions from various fitted models
> tree.pred = predict(tree.carseats, Carseats.test, type = "class")
> table(tree.pred, High.test)

         High.test
tree.pred No Yes
      No  86  27
      Yes 30  57

> (86+57)/200   # fraction of test observations classified correctly
[1] 0.715

5. Prune the tree


# Next, we consider whether pruning the tree might lead to improved results.
# The function cv.tree() performs cross-validation to determine the optimal
# level of tree complexity; cost-complexity pruning is used to select a
# sequence of trees for consideration.
# For regression trees, only the default, deviance, is accepted. For
# classification trees, the default is deviance and the alternative is
# misclass (the number of misclassifications, or total loss).
# We use the argument FUN = prune.misclass to indicate that the
# classification error rate, rather than the default deviance, should guide
# the cross-validation and pruning process.
# For a regression tree the analogous plot would be:
#   plot(cv.boston$size, cv.boston$dev, type = "b")

> set.seed(3)
> cv.carseats = cv.tree(tree.carseats, FUN = prune.misclass, K = 10)

# cv.tree() reports the number of terminal nodes of each tree considered
# (size), the corresponding error rate (dev), and the value of the
# cost-complexity parameter used (k, which corresponds to alpha).

> names(cv.carseats)
[1] "size"   "dev"    "k"      "method"
> cv.carseats
$size     # the number of terminal nodes of each tree considered
[1] 19 17 14 13  9  7  3  2  1
$dev      # the corresponding cross-validation error rate
[1] 55 55 53 52 50 56 69 65 80
$k        # the value of the cost-complexity parameter used
[1]      -Inf  0.0000000  0.6666667  1.0000000  1.7500000  2.0000000  4.2500000
[8]  5.0000000 23.0000000
$method   # misclass for a classification tree
[1] "misclass"
attr(, "class")
[1] "prune"         "tree.sequence"


# plot the error rate against tree size to see which size is best
> plot(cv.carseats$size, cv.carseats$dev, type = "b")

# Note that, despite the name, dev corresponds to the cross-validation error
# rate here. The tree with 9 terminal nodes has the lowest cross-validation
# error rate, with 50 errors. We can plot the error rate as a function of
# both size and k.

> prune.carseats = prune.misclass(tree.carseats, best = 9)
> plot(prune.carseats)
> text(prune.carseats, pretty = 0)

# compute the test error again to see how the pruned tree does on the test set
> tree.pred = predict(prune.carseats, Carseats.test, type = "class")
> table(tree.pred, High.test)

         High.test
tree.pred No Yes
      No  94  24
      Yes 22  60

> (94+60)/200   # pruning raises test accuracy from 71.5% to 77%
[1] 0.77
