Kubke Lab:Research/ABR/Notebook/2013/11/06

From OpenWetWare

Hearing development in barn owls Main project page

General Entries

  • Insert content here...

Personal Entries

Fabiana

  • From file 2013-11-06-MFK.Rmd in Sandbox

Trying to organise file management

Need to:

  • open the folder, grab the list of files
  • separate log files from text files
  • put them in a data.frame with one column having txt files and another having log files
  • make sure that the names of txt and log files actually match, and put NA where one of the file pairs is missing
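The four steps above could be sketched roughly as follows. This is only a sketch under assumptions: it runs on a temporary folder with made-up file names, the helper name pair_files is hypothetical, and it assumes that a txt file and a log file belong together when their names are identical once the final extension is removed. A stray file such as WS_FTP.log would simply show up as a log with no txt partner.

```r
# Hypothetical helper: pair txt and log files found in one folder.
pair_files <- function(folder) {
  files <- list.files(folder)
  # separate log files from txt files (case-insensitive, extension at end of name)
  logfiles <- grep("\\.log$", files, ignore.case = TRUE, value = TRUE)
  txtfiles <- grep("\\.txt$", files, ignore.case = TRUE, value = TRUE)
  # key = file name without its final extension (assumed matching rule)
  logs <- data.frame(key = tools::file_path_sans_ext(logfiles), log = logfiles,
                     stringsAsFactors = FALSE)
  txts <- data.frame(key = tools::file_path_sans_ext(txtfiles), txt = txtfiles,
                     stringsAsFactors = FALSE)
  # all = TRUE keeps unmatched files and fills the missing partner with NA
  merge(txts, logs, by = "key", all = TRUE)
}

# Demonstration on a temporary folder with made-up file names
demo_dir <- tempfile("owl_demo")
dir.create(demo_dir)
invisible(file.create(file.path(demo_dir, c("222L01.LOG", "222L01.TXT",
                                            "222L02.LOG", "222L03.TXT"))))
paired <- pair_files(demo_dir)
print(paired)
```

Using merge() with all = TRUE means a missing file on either side ends up as NA, so a log without a txt is visible too.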



1) Change directory to case #
2) get directory list
3) subset files with log onto one column
4) subset files with txt onto a second column

Structure of files is different for different case numbers:

Owl189 -> 189###.LOG, 189###.TXT (99 objects: 98 files + WS_FTP.log)
Owl222 -> 222###.LOG, 222###.TXT (31 objects: 30 files + WS_FTP.log)
Owl224 -> 224###.LOG, 224###.TXT (39 objects: 38 files + WS_FTP.log)
Owl229 -> 229###.LOG, 229###.TXT (34 objects: 33 files + WS_FTP.log)
Owl230 -> 230###.LOG, 230###.txt (333 objects, ????)
Owl233 -> 233###.ABR.log, 233###.ABR.txt (311 objects)
Owl335 -> 335###.ABR.log, 335###.ABR.txt (172 objects)
Owl336 -> 336###.abr.log, 336###.abr.txt (20 objects)
Owl416 -> 416###.abr.log, 416###.abr.txt (358 objects)
Owl419 -> 419###.abr.log, 419###.abr.txt (396 objects)
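Because the extensions differ between cases (upper or lower case, with or without an extra .ABR part), one option is to reduce every name to a common stem before comparing. The sketch below uses made-up file names that follow the patterns listed above; the regular expression is an assumption about what counts as the extension, not something checked against the real folders.

```r
# Made-up names following the naming patterns listed above
examples <- c("189L01.LOG", "230L01.txt", "233L01.ABR.log", "336L01.abr.txt")

# Strip an optional ".abr" part plus the final ".log"/".txt", ignoring case
stem <- sub("(\\.abr)?\\.(log|txt)$", "", examples, ignore.case = TRUE)
# File type in lower case, taken from the text after the last dot
type <- tolower(sub("^.*\\.", "", examples))

naming <- data.frame(file = examples, stem = stem, type = type,
                     stringsAsFactors = FALSE)
print(naming)
```

With the stem and type in separate columns, the same matching code could be used for every case number regardless of how its files were named.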

Missing txt files in
Owl 189 (49 log, 49 txt)
Owl 222 (15 log, 15 txt, one txt too small size)
Owl 224 (19 log, 19 txt, one txt too small size)
Owl 229 (18 log, 15 txt)
Owl 230 (several 0 KB files and several small txt files) Folder 230(check has 206 files?)
Owl 233 (several 0KB files and small txt files) (folder 233(P)only has 64 objects)
Owl 335 (several 0KB files and small txt files)
Owl 336 (10 log files, 10 txt)
Owl 416 (several empty files)
Owl 419 (several empty files)
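The empty and undersized files noted above could be flagged automatically from the file sizes. A minimal sketch, run on a temporary folder with made-up files: the 100-byte cut-off (min_bytes) is an arbitrary assumption and would need to be chosen by looking at what a complete txt file normally weighs.

```r
# Temporary folder with three made-up txt files of different sizes
demo_dir <- tempfile("owl_sizes")
dir.create(demo_dir)
writeLines(rep("0.123", 200), file.path(demo_dir, "229L01.TXT"))  # normal-sized file
writeLines("0.123", file.path(demo_dir, "229L02.TXT"))            # suspiciously small
invisible(file.create(file.path(demo_dir, "229L03.TXT")))         # empty (0 bytes)

paths <- list.files(demo_dir, full.names = TRUE)
sizes <- file.info(paths)$size   # size in bytes
min_bytes <- 100                 # assumed cut-off, not taken from the data

status <- ifelse(sizes == 0, "empty",
                 ifelse(sizes < min_bytes, "small", "ok"))
report <- data.frame(file = basename(paths), bytes = sizes, status = status,
                     stringsAsFactors = FALSE)
print(report)
```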

testing with Owl222

# basedir <- getwd()
# enter case to analyse:
# newdir <- readline('enter case number: ')
# create dir name as basedir\Analysis\datafiles\case:
# basedir <- getwd(); casedir <- paste(basedir, newdir, sep = '/')
# setwd(casedir)

or

# dir.create(file.path(basedir, casedir), showWarnings = FALSE)
# setwd(file.path(basedir, casedir))

1) Need to get files list
2) Need to separate files as [whatever].log in one column and [whatever].txt in another.
3) Somehow I need to know if the [whatever] parts on the same line do not match.

Back to testing on folder Owl222
Owl222 -> 222###.LOG, 222###.TXT (31 objects: 30 files + WS_FTP.log)

setwd("~/Dropbox/OrisABR/Analysis/datafiles/OWL222")
files <- dir()
head(files)
## [1] "222L01.LOG" "222L01.TXT" "222L02.LOG" "222L02.TXT" "222L03.LOG"
## [6] "222L03.TXT"
# need to separate the txt from the logfiles
# [Ll] matches either case; a "|" inside [] is a literal character, not "or"
log <- regexpr("(.*)[Ll][Oo][Gg]", files)
logfiles <- regmatches(files, log)
length(logfiles)
## [1] 16
print(logfiles)
##  [1] "222L01.LOG" "222L02.LOG" "222L03.LOG" "222L04.LOG" "222L05.LOG"
##  [6] "222L06.LOG" "222L07.LOG" "222L08.LOG" "222L09.LOG" "222L0A.LOG"
## [11] "222L0B.LOG" "222L0C.LOG" "222L0D.LOG" "222L0E.LOG" "222R18.LOG"
## [16] "WS_FTP.LOG"

txt <- regexpr("(.*)[Tt][Xx][Tt]", files)
txtfiles <- regmatches(files, txt)
length(txtfiles)
## [1] 15
print(txtfiles)
##  [1] "222L01.TXT" "222L02.TXT" "222L03.TXT" "222L04.TXT" "222L05.TXT"
##  [6] "222L06.TXT" "222L07.TXT" "222L08.TXT" "222L09.TXT" "222L0A.TXT"
## [11] "222L0B.TXT" "222L0C.TXT" "222L0D.TXT" "222L0E.TXT" "222R18.TXT"

Now I need to put those into a single data frame, making sure that the file names are matched for log and txt. So I am trying to compare the first 6 characters of each name.

I can assume that if I do not have a txt file, it is irrelevant whether I have a log file or not. So I can step through txtfiles line by line, look for the match in logfiles, and dump each pair into a data frame where column 1 holds the txt files and column 2 the log files. If a log file is missing, I can put an NA there.
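One way this txt-driven matching might look without an explicit loop is with match(). The file names below are a made-up subset in which one log file is deliberately missing, and the comparison on the first 6 characters (ignoring case) is the assumption stated above.

```r
txtfiles <- c("222L01.TXT", "222L02.TXT", "222R18.TXT")   # made-up example
logfiles <- c("222L01.LOG", "222R18.LOG", "WS_FTP.LOG")   # no log for 222L02

# compare on the first 6 characters, ignoring case
txtkey <- toupper(substr(txtfiles, 1, 6))
logkey <- toupper(substr(logfiles, 1, 6))

# match() returns the position of each txt key in logkey, or NA if absent;
# indexing with NA gives NA, so a missing log file becomes NA automatically
casefiles <- data.frame(traces = txtfiles,
                        headers = logfiles[match(txtkey, logkey)],
                        stringsAsFactors = FALSE)
print(casefiles)
```

Because the txt files drive the lookup, a log file with no txt partner (such as WS_FTP.LOG here) is dropped without any special handling.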

n <- length(txtfiles)
i <- 1
traces <- txtfiles[1:n]
headers <- logfiles[1:n]
casefiles <- data.frame(traces, headers)

# while(i<n+1){ get traces[i]

# extract the first 6 characters (any characters) at the beginning of each
# string in the txt files (traces): '^.{6}'
test <- regexpr("^.{6}", traces)
test2 <- regmatches(traces, test)
print(test2)
##  [1] "222L01" "222L02" "222L03" "222L04" "222L05" "222L06" "222L07"
##  [8] "222L08" "222L09" "222L0A" "222L0B" "222L0C" "222L0D" "222L0E"
## [15] "222R18"

# look for a match in logfiles (headers)
test3 <- regexpr("^.{6}", headers)
test4 <- regmatches(headers, test3)
print(test4)
##  [1] "222L01" "222L02" "222L03" "222L04" "222L05" "222L06" "222L07"
##  [8] "222L08" "222L09" "222L0A" "222L0B" "222L0C" "222L0D" "222L0E"
## [15] "222R18"

# i=i+1 }

Grab the first element of test2 and move down through test4 until I find a match. When I do, write the pair into casefiles$traces and casefiles$headers. But I need to add back the parts of the strings that I stripped, so the file names should come not from test2 and test4 but from the full file names stored in traces and headers (using i and j for location). Perhaps I can do the regexpr/regmatches on the individual elements rather than creating a new vector? Write txtfiles[i] onto casefiles$txt and casefiles$log.
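The loop described above might be sketched like this, doing the regexpr/regmatches step on one file name at a time. The file names are made up, and the column names txt and log are simply the ones mentioned in the note; this is one possible reading of the plan, not a tested solution for the real folders.

```r
traces  <- c("222L01.TXT", "222L02.TXT", "222R18.TXT")   # made-up example
headers <- c("222L01.LOG", "222R18.LOG", "WS_FTP.LOG")   # no log for 222L02

# start with every log entry set to NA, then fill in the matches found
casefiles <- data.frame(txt = traces, log = NA_character_,
                        stringsAsFactors = FALSE)

for (i in seq_along(traces)) {
  # first 6 characters of this one txt file name
  key_txt <- regmatches(traces[i], regexpr("^.{6}", traces[i]))
  for (j in seq_along(headers)) {
    key_log <- regmatches(headers[j], regexpr("^.{6}", headers[j]))
    # length check guards against names shorter than 6 characters
    if (length(key_txt) == 1 && identical(key_txt, key_log)) {
      casefiles$log[i] <- headers[j]   # store the full file name, not the key
      break                            # stop at the first match
    }
  }
}
print(casefiles)
```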

Andy

  • Enter content here

Oris

  • Enter content here

