Monday, 9 March 2009

R script to look at data with a lot of columns

I had some performance data to analyse that had over 1000 columns. Excel cannot load such a file, so I resorted to tools that mathematicians and scientists use, in this case the following R script. There are some interesting twists. For example, when a column name contains a character that R does not recognise (a space, "(", ")", etc.), you need to replace that character with "." when you refer to the column. Anyway, this script is very useful for analysing large CSV files.
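To make the "." twist concrete, here is a minimal sketch. It assumes the default behaviour of read.csv (check.names = TRUE), which passes header names through make.names. The CSV text, the column names and the variable names perf, sstr and results are made up for illustration and are not taken from the real data file.

```r
# Hypothetical CSV content with awkward header names (not the original data).
csv_text <- "Time,CPU Busy (%),Disk Read/s\n1,10,100\n2,20,200"

# With the default check.names = TRUE, read.csv() rewrites the headers so that
# spaces, brackets, "%" and "/" each become ".".
perf <- read.csv(text = csv_text, header = TRUE, sep = ",")
print(names(perf))

# Search strings therefore use "." where the file had a space or bracket.
# fixed = TRUE treats the pattern as plain text rather than a regular expression.
sstr <- c("CPU.Busy", "Disk.Read")
results <- lapply(sstr, grep, x = names(perf), fixed = TRUE)
print(results)
```

Each element of results holds the column numbers whose names contain the matching search string, which can then be used to index the data frame.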

data <- read.csv(file="C:/Data/Work/EXposure+IT/ITBermuda/NetApp/PartnerRe_000005.csv",header=TRUE,sep=",")
colnum <- c(116,137,97,118)

sstr <- c(

results <- list()
# find the columns whose names match each search string
for(i in 1:length(sstr)){
results[[i]] <- grep(sstr[i],names(data))
}

# make a time series
#options(digits.secs = 3)
#datetime <- as.character(data[,1])
#datetime <- strftime(datetime, format="%m/%d/%Y %H:%M:%S")
time <- seq(1,1565)

for(i in 1:length(results)){