-
Notifications
You must be signed in to change notification settings - Fork 14
Scripts for review #2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
awesome function! The way you dealt with the qualifiers was especially good
R/Sharis/Ameriflux_functions.R
Outdated
|
||
|
||
# Initialize data frame | ||
table <- data.frame(attributeName = rep("NA", n), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a function in the base R package called table
that creates table objects. You generally want to avoid naming variables after functions (although the table
function is a bit obscure).
You could change this something like att_table
R/Sharis/Ameriflux_functions.R
Outdated
|
||
for (i in seq_len(n)) { | ||
# add attribute name | ||
table$attributeName[i] = colnames(data)[i] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can move colnames(data) outside the for loop so it only runs once rather than n times.
Something like col_names <- colnames(data). Then you can reference col_names[i] in the for loop.
R/Sharis/Ameriflux_functions.R
Outdated
table$attributeName[i] = colnames(data)[i] | ||
|
||
## check if the name has a qualifier at the end | ||
if (any(endsWith(colnames(data)[i], suffix = qualifiers))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same comment here about moving colnames(data) outside the for loop.
R/Sharis/Ameriflux_functions.R
Outdated
main_label<- substr(colnames(data)[i], 1, nchar(colnames(data)[i])-len) | ||
|
||
# get definition for main label | ||
main_def <- attributes$uniqueAttributeDefinition[attributes$uniqueAttributeLabel == main_label] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
at the beginning of your function you could specify colnames(attributes) <- ("category", "label", "definition", "unit", "SI_unit")
and get rid "uniqueAttribute" everywhere in your code.
The benefit is making your code more readable - also not your fault, the column names from that csv aren't great.
R/Sharis/Ameriflux_functions.R
Outdated
|
||
# case if there is no qualifier | ||
} else { | ||
table$attributeDefinition[i] = attributes$uniqueAttributeDefinition[attributes$uniqueAttributeLabel == colnames(data)[i]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove extra spaces by the ==
R/Sharis/Ameriflux_functions.R
Outdated
} else { | ||
table$measurementScale[i] = "ratio" | ||
table$domain[i] = "numericDomain" | ||
table$numberType[i] <- "real" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most style guides advocate using <-
as the assignment operator, and using =
only inside function calls.
If you're interested in reading more this blog post is pretty good https://renkun.me/2014/01/28/difference-between-assignment-operators-in-r/ .
@@ -0,0 +1,92 @@ | |||
generate_attributes_table <- function(csv_file_path, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you'd like you can document what this function does. Check out the multiline comments (#' is the multiline comment symbol in R) at the top of this function: https://github.com/NCEAS/datamgmt/blob/master/R/guess_member_node.R
``` | ||
|
||
## Data Objects | ||
### Adding data tables for a whole folder of files with the same attributes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This section is so awesome! I just had to do this but forgot you wrote it up until I mostly finished. In any case, interesting to see slightly different ways of doing the same thing!
No description provided.