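Exploring data with charts and tables is a necessary early step in any data analysis. Descriptive statistics offers histograms, scatter plots, and similar tools for continuous data; for categorical data, bar charts and cross-tabulations do the job. The "PivotTable" in Excel is really just an interactive cross-tabulation. In R, functions such as table() produce these results easily, while more complex tasks call for other functions or packages. This example first uses the iris data set to demonstrate some basic functions, then introduces the more powerful reshape package.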
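As a small illustration of the cross-tabulation mentioned above, the sketch below applies table() to iris. The size grouping (the 5.8 cutoff and the labels "short" and "long") is an assumption made up for this example and is not part of the original analysis.

```r
# Hypothetical grouping: split sepal length at 5.8 (an arbitrary cutoff)
size <- ifelse(iris$Sepal.Length > 5.8, "long", "short")

# Cross-tabulate species against the size group
tab <- table(Species = iris$Species, Size = size)
print(tab)

# addmargins() appends row and column totals to the table
print(addmargins(tab))
```

The second call shows the kind of marginal totals that a pivot table would display alongside the counts.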
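The iris data set has five variables: Species gives the species of iris, and the other four are the length and width of the sepals and petals. You can inspect a few rows of the raw data with head(iris). Our first task is to compute the mean of each of the four measurements for each species. The functions involved are tapply, by, and aggregate. This article touches on each of them.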
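After attaching the data, try tapply first. You will find that it accepts only one variable at a time, so covering all four variables would probably require a loop. Setting it aside, try by: it accepts several variables at once, but it returns a list, which then has to be combined with do.call, and that is somewhat tedious. The most convenient and friendly option is aggregate, which returns a data frame directly and also lets you specify the grouping factor with a formula.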
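The loop over the four variables that tapply would need can be written compactly with sapply. This is only a sketch for illustration; the name means_by_species is introduced here and does not come from the original.

```r
# Apply tapply to each of the four numeric columns in turn;
# sapply collects the per-species means into a 3 x 4 matrix
means_by_species <- sapply(iris[, 1:4], function(col) {
  tapply(col, iris$Species, mean)
})
print(means_by_species)
```

Rows are species and columns are the four measurements, which is the same layout that aggregate produces as a data frame.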
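attach(iris)   # put the columns of iris on the search path
names(iris)
# tapply handles one variable at a time
tapply(X=Sepal.Length,INDEX=Species,FUN=mean)
# by handles several variables but returns a list; colMeans is used because
# mean() no longer accepts a data frame in current versions of R
temp <- by(data=iris[,1:4],INDICES=Species,FUN=colMeans)
do.call(rbind,temp)
# aggregate returns a data frame directly
aggregate(x=iris[,1:4],by=list(Species),FUN=mean)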
aggregate(.~ Species, data = iris, mean)
The aggregate function already does well, but it is not quite powerful enough: for instance, it cannot produce the marginal values of a table directly. So it is time to bring on the stars of the show, the two workhorses of the reshape package, melt and cast. They are usually used together: melt "melts" the raw data into a long-format structure, and cast "recasts" the melted data into a new shape. With these two functions, nearly every summarization problem can be handled in a uniform way.
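Take the same problem again. First load the reshape package, then melt the data with melt. The id argument names Species as the identifier variable, and the measure argument names the measured variables, that is, the variables to be melted (id and measure are partial matches for the full argument names id.vars and measure.vars). Only id is given here, so every variable in the raw data not listed in id is treated as a measured variable. You can see that the new data set iris.melt is simply the stacked form of the data. Next, recast it with cast. cast accepts a formula: variables to the left of the tilde are laid out down the rows, and variables to the right are laid out across the columns. The margins argument sets the direction of the marginal summary; here "grand_row" adds a summary row computed down each column. To include only two of the species in the calculation, use the subset argument.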
library(reshape)
iris.melt <- melt(iris,id='Species')
cast(Species~variable,data=iris.melt,mean,margins="grand_row")
cast(Species~variable,data=iris.melt,mean,
subset=Species %in% c('setosa','versicolor'),
margins='grand_row')
The author of the reshape package is also the developer of ggplot2. Something of a perfectionist, he refactored the code five years after reshape was released and published the new reshape2 package. The features of the new package are:
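  Improved algorithms, giving better computation speed and memory use;
  dcast and acast replace the original cast function;
  Margins are specified by variable name;
  Some features of cast were removed, because he judged that the plyr package handles them better;
  Every function in the melt family gained an argument for handling missing values.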
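To make two of the points above concrete, here is a minimal sketch of the earlier iris summary redone with reshape2. It assumes reshape2 is installed; the name res is introduced only for this example.

```r
library(reshape2)

# Melt iris into long format, keeping Species as the identifier variable
iris.melt <- melt(iris, id.vars = "Species")

# dcast replaces cast; the margin is requested by variable name,
# which adds a summary row across all species
res <- dcast(iris.melt, Species ~ variable, mean, margins = "Species")
print(res)
```

dcast returns a data frame, while acast would return a matrix or array for the same formula.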
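Next, we use the diamonds data to complete a slightly more complex task: computing and comparing the average price per carat of diamonds for each combination of cut and clarity. First load the reshape2 and ggplot2 packages and take the variables we need. Melt the raw data with cut, color, and clarity as the identifier variables, then recast it with dcast to get the summary. Finally, compute the unit price (here, mean price divided by mean carat within each group) and show the result as a bar chart.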
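library(reshape2)
library(ggplot2)
library(plyr)   # provides the .() helper used in the subset argument
data <- diamonds[1:7]
data.melt <- melt(data,id=c('cut','color','clarity'))
# mean price and mean carat for each cut/clarity combination
diam.sum <- dcast(data.melt,cut+clarity~variable,
subset=.(variable %in% c('price','carat')),mean)
diam.sum$average <- diam.sum$price/diam.sum$carat
p <- ggplot(diam.sum,aes(cut,average,fill=clarity))
# stat='identity' plots the precomputed values; the default stat counts rows
p + geom_bar(stat='identity',position='dodge')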
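As a cross-check on the melt and dcast result, the same group means can be computed with base aggregate(). This is a sketch rather than part of the original workflow; the name chk is introduced here, and ggplot2 is loaded only because it supplies the diamonds data.

```r
library(ggplot2)  # only needed for the diamonds data set

# Mean price and mean carat for each cut/clarity combination
chk <- aggregate(cbind(price, carat) ~ cut + clarity,
                 data = diamonds, FUN = mean)

# Unit price defined as mean price divided by mean carat, as in the text
chk$average <- chk$price / chk$carat

# Show the groups with the highest unit price first
print(head(chk[order(-chk$average), ]))
```

The values in chk$average should agree with diam.sum$average for matching cut and clarity groups.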
Besides the reshape package, R also has functions such as stack, unstack, and reshape that can do similar work, but for sheer power the first choice is still the melt and cast duo from the reshape package.
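For comparison, here is a short sketch of the base stack and unstack functions on the four numeric columns of iris. The names long and wide are introduced for this example only.

```r
# stack() puts the four columns into one "values" column,
# with "ind" recording which column each value came from
long <- stack(iris[, 1:4])
str(long)

# unstack() reverses the operation and restores the four columns
wide <- unstack(long)
print(head(wide))
```

Unlike melt, stack carries no identifier column, so Species would have to be added back by hand.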