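"Reference URL 1" points out that if you only do integer arithmetic (both the computation and the results use integers only), there is no need to use "double" (8 bytes); use the smaller "integer" (4 bytes) instead. Use storage.mode(x) to check how an object's values are stored and storage.mode(x) <- to change it; use format(object.size(a), units = 'auto') to see how much memory an object occupies (an open question here: how much space does each integer actually take up in R?).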
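A minimal sketch of the checks described above, using only base R. The variable names are made up for illustration. On a typical 64-bit build the integer vector should come out at roughly half the size of the double vector (about 4 bytes per element plus a small fixed header), but it is worth running on your own machine rather than taking that on trust.

```r
# Numeric literals are stored as double unless told otherwise.
x <- c(1, 2, 3)
mode_before <- storage.mode(x)      # "double"

# Switch the storage mode in place; the values are whole numbers,
# so nothing is lost in the conversion.
storage.mode(x) <- "integer"
mode_after <- storage.mode(x)       # "integer"

# Compare the footprint of one million doubles and one million integers.
a <- numeric(1e6)
b <- integer(1e6)
size_double  <- object.size(a)
size_integer <- object.size(b)
format(size_double,  units = "auto")
format(size_integer, units = "auto")
```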
The gc() function deserves a mention: it reports current memory usage. Likewise, after removing large objects, call gc() to release the memory they occupied.
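As a rough illustration of that advice (the object name and size are arbitrary), gc() returns a small matrix whose "Vcells" row tracks vector memory, so the effect of removing a large object can be seen by comparing two calls:

```r
# Allocate something large: 2000 x 2000 doubles is about 32 MB.
big <- matrix(0, nrow = 2000, ncol = 2000)

before <- gc()   # memory usage while `big` is still alive

rm(big)          # drop the only reference to the matrix
after <- gc()    # collect, then report usage again

# Vector cells in use, before and after the cleanup.
before["Vcells", "used"]
after["Vcells", "used"]
```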
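The author of "Reference URL 2" notes that, when working with large matrices, you should avoid cbind, rbind and the like as far as possible, because they force memory to be reallocated again and again. "For a matrix that keeps growing, define one large matrix up front and then fill it in step by step" and "remember to remove intermediate objects".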
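The two approaches can be put side by side in a small sketch (sizes and names are illustrative only). Both loops build the same matrix; the first reallocates on every step, the second allocates once.

```r
n <- 1000

# Growing with rbind: every step allocates a new, larger matrix and copies.
grow <- NULL
for (i in seq_len(n)) grow <- rbind(grow, c(i, i^2))

# Preallocating: one allocation up front, then fill row by row.
pre <- matrix(NA_real_, nrow = n, ncol = 2)
for (i in seq_len(n)) pre[i, ] <- c(i, i^2)

# Same contents either way.
same <- identical(dim(grow), dim(pre)) && all(grow == pre)

# Remove the intermediate object once it is no longer needed.
rm(grow)
```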
Use the bigmemory family: bigmemory, biganalytics, synchronicity, bigtabulate and bigalgebra, together with biglm.
Using the bigmemory package:
1. Creating a big.matrix object
bigmemory uses C++ data structures to "mimic" R's matrix.
When writing a file in a large data format, you can create a filebacked.big.matrix first.
big.matrix(nrow, ncol, type = options()$bigmemory.default.type, init = NULL, dimnames = NULL, separated = FALSE, backingfile = NULL, backingpath = NULL, descriptorfile = NULL, shared = TRUE)
filebacked.big.matrix(nrow, ncol, type = options()$bigmemory.default.type, init = NULL, dimnames = NULL, separated = FALSE, backingfile = NULL, backingpath = NULL, descriptorfile = NULL)
as.big.matrix(x, type = NULL, separated = FALSE, backingfile = NULL, backingpath = NULL, descriptorfile = NULL, shared=TRUE)
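A minimal sketch of the three constructors, assuming the bigmemory package is installed. The file names "demo.bin" and "demo.desc" and the temporary directory are hypothetical, and passing the full path of the descriptor file to attach.big.matrix is assumed to behave as it does in recent bigmemory releases.

```r
library(bigmemory)

# In-memory big.matrix: 4-byte integers, every element pre-filled with 0.
x <- big.matrix(nrow = 3, ncol = 2, type = "integer", init = 0)
x[1, ] <- c(1L, 2L)
x_contents <- x[, ]          # x[, ] pulls the data out as an ordinary matrix

# Convert an existing R matrix.
m <- as.big.matrix(matrix(1:6, nrow = 3))

# File-backed big.matrix stored in a fresh temporary directory.
backing_dir <- tempfile("bigmemory-demo")
dir.create(backing_dir)
fb <- filebacked.big.matrix(nrow = 3, ncol = 2, type = "double", init = 0,
                            backingfile = "demo.bin",
                            backingpath = backing_dir,
                            descriptorfile = "demo.desc")
fb[2, 2] <- 3.5

# Re-attach through the descriptor file, as one would in a new R session.
fb2 <- attach.big.matrix(file.path(backing_dir, "demo.desc"))
reattached_value <- fb2[2, 2]
```

Both fb and fb2 point at the same backing file, so a write through one is visible through the other.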
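Usage notes:
big.matrix stores data in one of two ways. The first is the big.matrix default, which is worth trying if you have plenty of memory. The second is filebacked.big.matrix, which keeps the data in backing files on disk (file-backings) and also requires a descriptor file;
"init" is the initial value of the matrix. If set, every element is filled with that value in advance; if it is left as NULL (the default), no fill is performed, so do not rely on the contents until you have written to them
"type" is the storage format of the atomic elements in a big.matrix. The default is "double" (8 bytes); it can be changed to "integer" (4 bytes), "short" (2 bytes) or "char" (1 byte). Note: the package does not support storing character strings; type = "char" refers to a single-byte ASCII code.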
When a big.matrix is very large, avoid rownames and colnames (bigmemory also forbids accessing elements by name), because they consume a lot of memory. If you really must change them, set options(bigmemory.allow.dimnames=TRUE) first and then assign colnames and rownames.
Typing x at the prompt (where x is a big.matrix) prints only a description of x, not its contents. So be careful with x[ , ], which prints the entire matrix;
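If a big.matrix has a great many columns, it is better to store its transpose; alternatively (not recommended), set the "separated" argument to TRUE so that each column is stored separately. Otherwise the data is stored in R's usual column-major layout.
If you create a filebacked.big.matrix, you need to specify the name and path of the backingfile, plus a descriptorfile. Several big.matrix objects may correspond to one and the same descriptorfile: if the data behind the descriptorfile changes, every big.matrix attached to it changes as well, and likewise a change made through one big.matrix is reflected in what the descriptorfile points to. If you want to keep a change separate, you need to create a new filebacked.big.matrix. attach.big.matrix(descriptorfile or describe(big.matrix)) attaches the matrix described by a descriptorfile to a big.matrix object. This function is very handy: after you create a filebacked.big.matrix, save R and quit, the matrix object you created is gone, and you need to call attach.big.matrix again to get it back.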
2. Conditional filtering on specific elements in the columns of a big.matrix
It is not constrained by memory, and it is more flexible than the traditional which (nice!)
mwhich(x, cols, vals, comps, op = 'AND')
x: either a big.matrix or an ordinary R object;
cols: the columns to test
vals: the cutoff values; two can be given for one column, e.g. c(1, 2)
comps: 'eq' (==), 'neq' (!=), 'le' (<=), 'lt' (<), 'ge' (>=) and 'gt' (>)
op: "AND" or "OR"
NA, Inf and -Inf can be compared directly
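The arguments above can be sketched on a tiny matrix, assuming bigmemory is installed. The data and cutoffs are invented for illustration; the list form of vals and comps follows the style of the package's own examples.

```r
library(bigmemory)

# Column 1 holds 1..4, column 2 holds 5..8, and so on.
x <- as.big.matrix(matrix(1:20, nrow = 4))

# Rows where column 1 > 2 AND column 2 < 8.
both <- mwhich(x, cols = c(1, 2), vals = list(2, 8),
               comps = list("gt", "lt"), op = "AND")

# Same tests combined with OR.
either <- mwhich(x, cols = c(1, 2), vals = list(2, 8),
                 comps = list("gt", "lt"), op = "OR")

# Two cutoffs on one column give a range test: 2 <= column 1 <= 3.
in_range <- mwhich(x, cols = 1, vals = list(c(2, 3)),
                   comps = list(c("ge", "le")), op = "AND")
```

Each call returns row indices, which can then be used as x[both, ] to pull out the matching rows.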
3. Other functions in bigmemory
nrow, ncol, dim, dimnames, tail, head and typeof carry over from base
big.matrix, is.big.matrix, as.big.matrix, attach.big.matrix, describe, read.big.matrix, write.big.matrix, sub.big.matrix and is.sub.big.matrix are the operations specific to big.matrix objects and files; filebacked.big.matrix, is.filebacked (tests whether a big.matrix is backed by a file on disk) and flush (writes pending changes of a file-backed matrix out to its backing file) are the operations for file-backed big.matrix objects.
mwhich extends which from base, morder extends order, and mpermute reorders a matrix according to a given ordering or column, but it modifies the original object in place; this is done to avoid running out of memory
To copy a big.matrix object, use deepcopy(x, cols = NULL, rows = NULL, y = NULL, type = NULL, separated = NULL, backingfile = NULL, backingpath = NULL, descriptorfile = NULL, shared=TRUE)
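A short sketch of why deepcopy matters, assuming bigmemory is installed (variable names are illustrative). A big.matrix behaves like a reference, so plain assignment gives a second name for the same data, whereas deepcopy produces an independent matrix.

```r
library(bigmemory)

x <- big.matrix(nrow = 2, ncol = 2, type = "integer", init = 0)

alias <- x            # plain assignment: both names point at the same data
copy  <- deepcopy(x)  # independent copy with its own storage

x[1, 1] <- 5L

alias_value <- alias[1, 1]   # follows the change made through x
copy_value  <- copy[1, 1]    # still holds the original value
```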
Using the biganalytics package
biganalytics mainly extends basic functions from base; the main ones are max, min, prod, sum, range, colmin, colmax, colsum, colprod, colmean, colsd, colvar, summary and apply (which works over rows or over columns only, not both at once), among others
Its most distinctive feature is clustering with bigkmeans
For the rest, biglm.big.matrix and bigglm.big.matrix, see Lumley's biglm package.
Using the bigtabulate package
Getting past the limits with parallel computing:
Use the doMC family: doMC, doSNOW, doMPI, doRedis, doSMP and the foreach package.
Using the foreach package
foreach(..., .combine, .init, .final=NULL, .inorder=TRUE, .multicombine=FALSE, .maxcombine=if (.multicombine) 100 else 2, .errorhandling=c('stop', 'remove', 'pass'), .packages=NULL, .export=NULL, .noexport=NULL, .verbose=FALSE)
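The distinguishing feature of foreach is that it can run in parallel, e.g. on NetWorkSpace and snow?
%do% runs the tasks strictly in sequence (so it is not parallel); %dopar% runs the tasks in parallel
...: the iteration variables, which determine the number of iterations;
.combine: how the results are combined. The default is a list; "c" returns a vector; cbind and rbind return a matrix; "+" and "*" return the element-wise sum or product of the results
.init: the first argument passed to the .combine function
.final: a function applied to the combined result to produce the value that is returned
.inorder: TRUE returns the results in the same order as the input (for when the order of the results matters); FALSE returns them in no particular order (which can be faster). Set it to FALSE only when the order of the results does not matter.
.multicombine: controls how many arguments the .combine function receives. The default, FALSE, means it takes 2 arguments; TRUE allows it to take more
.maxcombine: the maximum number of arguments passed to .combine
.errorhandling: how to handle an error that occurs inside the loop
.packages: the packages that the %dopar% computation depends on (%do% ignores this option).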
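getDoParWorkers( ): reports how many cores (workers) are registered; used together with registerDoMC( ) from the doMC package
getDoParRegistered( ): checks whether a doPar backend is registered; returns FALSE if none is
getDoParName( ): the name of the registered doPar backend
getDoParVersion( ): the version of the registered doPar backend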
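A few of the arguments above in action, as a sketch that needs only the foreach package. registerDoSEQ() is used here as a sequential stand-in so that %dopar% runs anywhere; on a multi-core Unix machine one would register doMC (or another backend from the list above) instead. The loop bodies are toy examples.

```r
library(foreach)

# Sequential stand-in for a parallel backend such as doMC.
registerDoSEQ()
registered <- getDoParRegistered()
workers    <- getDoParWorkers()

# .combine = c collapses the per-iteration results into a vector.
squares <- foreach(i = 1:5, .combine = c) %dopar% i^2

# .init is the first argument handed to the .combine function.
total <- foreach(i = 1:3, .combine = "+", .init = 100) %do% i

# .final post-processes the combined result.
summed <- foreach(i = 1:3, .combine = c, .final = sum) %do% i

# .errorhandling = "remove" drops the iterations that fail.
kept <- foreach(i = 1:3, .combine = c, .errorhandling = "remove") %do% {
  if (i == 2) stop("failed on purpose")
  i
}
```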
===================================================
# foreach can be given several iteration variables, but it only runs for the length of the shortest one
> foreach(a = 1:10, b = rep(10, 3)) %do% (a*b)
[[1]]
[1] 10
[[2]]
[1] 20
[[3]]
[1] 30
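# In foreach, .combine = "+" or "*" acts on the results as if they had been bound together first; that is, when the expression returns a vector, the vectors are added or multiplied element by element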
> foreach(i = 1:4, .combine = "+") %do% 2
[1] 8
> foreach(i = 1:4, .combine = "rbind") %do% rep(2, 5)
         [,1] [,2] [,3] [,4] [,5]
result.1    2    2    2    2    2
result.2    2    2    2    2    2
result.3    2    2    2    2    2
result.4    2    2    2    2    2
> foreach(i = 1:4, .combine = "+") %do% rep(2, 5)
[1] 8 8 8 8 8
> foreach(i = 1:4, .combine = "*") %do% rep(2, 5)
[1] 16 16 16 16 16
=============================================
Using the iterators package
iterators exists to supply loop variables to foreach. Every iterator you define fixes both the "number of iterations" and the "value returned on each iteration", which makes it a natural fit for foreach.
iter(obj, ...): accepts an iter, vector, matrix, data.frame or function.
nextElem(obj, ...): takes an iter object and returns its next value.
Taking a matrix as an example,
iter(obj, by=c('column', 'cell', 'row'), chunksize=1L, checkFunc=function(...) TRUE, recycle=FALSE, ...)
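by: what to iterate over. For both matrix and data.frame the default is "column" (the first choice in the signature); "cell" returns the elements one at a time in column order (so for "cell", chunksize can only be left at its default of 1)
chunksize: how many units (as set by by) each call to nextElem returns. If fewer than that remain, all of the remainder is returned.
checkFunc=function(...) TRUE: checkFunc is called on each candidate value; if it returns TRUE the value is returned, otherwise it is skipped.
recycle: whether, once nextElem has reached the end ("Error: StopIteration"), the iterator should start over from the beginning.
Taking a function as an example
iter(function()rnorm(1)) can be called with nextElem indefinitely, but iter(rnorm(1)) yields a value only once.
More interesting still, if the object is itself an iter, i.e. test1 <- iter(obj); test2 <- iter(test1), then the two objects are linked and advance together.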
==============================================
> a
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    5    9   13   17
[2,]    2    6   10   14   18
[3,]    3    7   11   15   19
[4,]    4    8   12   16   20
> i2 <- iter(a, by = "row", chunksize=3)
> nextElem(i2)
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    5    9   13   17
[2,]    2    6   10   14   18
[3,]    3    7   11   15   19
> nextElem(i2) # second call: only 1 row is left, so all of it is returned
     [,1] [,2] [,3] [,4] [,5]
[1,]    4    8   12   16   20
> i2 <- iter(a, by = "column", checkFunc=function(x) sum(x) > 50)
> nextElem(i2)
     [,1]
[1,]   13
[2,]   14
[3,]   15
[4,]   16
> nextElem(i2)
     [,1]
[1,]   17
[2,]   18
[3,]   19
[4,]   20
> nextElem(i2)
Error: StopIteration
> colSums(a)
[1] 10 26 42 58 74
> testFun <- function(x){return(x+2)}
> i2 <- iter(function()testFun(1))
> nextElem(i2)
[1] 3
> nextElem(i2)
[1] 3
> nextElem(i2)
[1] 3
> i2 <- iter(testFun(1))
> nextElem(i2)
[1] 3
> nextElem(i2)
Error: StopIteration
> i2 <- iter(testFun(1))
> i3 <- iter(i2)
> nextElem(i3)
[1] 3
> nextElem(i2)
Error: StopIteration
============================================
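Since iterators is meant to feed foreach, the session above can be rounded off with a sketch that uses an iterator as the loop variable. It assumes the foreach and iterators packages are installed; the matrix matches the one in the transcript and the other names are illustrative.

```r
library(foreach)
library(iterators)

a <- matrix(1:20, nrow = 4)

# Iterate over the rows of `a` and sum each one.
row_sums <- foreach(r = iter(a, by = "row"), .combine = c) %do% sum(r)

# recycle = TRUE starts over instead of raising StopIteration.
it <- iter(1:2, recycle = TRUE)
cycled <- c(nextElem(it), nextElem(it), nextElem(it))

# Wrapping an iterator in iter() gives a linked iterator, not a fresh one.
src    <- iter(1:3)
linked <- iter(src)
first  <- nextElem(linked)   # consumes the first element
second <- nextElem(src)      # so src continues from the second
```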
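The iterators package also includes
irnorm(..., count), irunif(..., count), irbinom(..., count), irnbinom(..., count) and irpois(..., count), built-in tools for generating iterators. They draw N random values from the normal, uniform, binomial, negative binomial and Poisson distributions respectively, count times over. For the negative binomial distribution, the probability mass function can be read in terms of rolling a die: if each roll shows a 3 with probability p, it gives the probability that the r-th 3 turns up exactly on roll r+k, i.e. choose(k + r - 1, k) * p^r * (1 - p)^k.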
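icount(count) creates an iterator over 1:count; if count is not given, it counts upward without end (1:Inf)
icountn(vn) is rather fun. vn is a numeric vector (fractional values are rounded up to the next integer, e.g. 2.3 --> 3). The number of iterations is prod(vn) after rounding; in each vector returned, every element starts at 1 and never exceeds the corresponding entry of vn, with the leftmost element changing fastest.
idiv(n, ..., chunks, chunkSize) returns the lengths of the pieces that 1:n is cut into. "chunks" and "chunkSize" cannot both be given: "chunks" is the number of pieces (lengths from largest to smallest), and "chunkSize" is the maximum length of a piece (lengths again from largest to smallest)
iapply(X, MARGIN): much like apply; in MARGIN, 1 means rows and 2 means columns
isplit(x, f, drop=FALSE, ...): splits the matrix according to the given f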
=============================================
> i2 <- icountn(c(3.4, 1.2))
> nextElem(i2)
[1] 1 1
> nextElem(i2)
[1] 2 1
> nextElem(i2)
[1] 3 1
> nextElem(i2)
[1] 4 1
> nextElem(i2)
[1] 1 2
> nextElem(i2)
[1] 2 2
> nextElem(i2)
[1] 3 2
> nextElem(i2)
[1] 4 2
> nextElem(i2)
Error: StopIteration
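The counting and dividing helpers can also be driven from foreach rather than by hand with nextElem. This is a sketch under the assumption that foreach and iterators are installed; the numbers are arbitrary and the assertions one might make are deliberately limited to totals and counts.

```r
library(foreach)
library(iterators)

# icount(3) yields 1, 2, 3.
counted <- foreach(i = icount(3), .combine = c) %do% i

# icountn(c(2, 3)) yields prod(c(2, 3)) = 6 index vectors, one per row here.
grid <- foreach(v = icountn(c(2, 3)), .combine = rbind) %do% v

# idiv splits 10 into 3 pieces; the piece lengths add back up to 10.
pieces <- foreach(n = idiv(10, chunks = 3), .combine = c) %do% n

# With chunkSize, no piece is longer than the given maximum.
capped <- foreach(n = idiv(10, chunkSize = 4), .combine = c) %do% n
```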