Validate gzip AND txt files found in conc/raw #157
@@ -25,9 +25,9 @@ def review_logs(log_file)

 conc_dir = Settings.concordance_path

-raw_gzip_files = Dir.glob("#{conc_dir}/raw/*txt.gz").sort
+raw_files = Dir.glob("#{conc_dir}/raw/*txt*").sort
Any concerns about this matching e.g. "this_is_not_a_txt_file"? I figure we control what goes in that directory, so it shouldn't be a big issue?
No more than it matching "junk_that_shouldnt_be_in_this_directory.txt.gz". If it gets garbage it will end up erroring out anyway.
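To make the trade-off above concrete, here is a minimal sketch comparing the old and new glob patterns. It uses `File.fnmatch?` on bare names rather than `Dir.glob` on a real directory, which is an assumption made so the example runs without any files on disk. The two dated file names are hypothetical; the other two are the names quoted in this thread.

```ruby
# Names to test against the patterns. The first two are hypothetical
# examples of well-formed raw files; the last two are quoted from the review.
NAMES = [
  "20200101_concordance.txt.gz",
  "20200102_concordance.txt",
  "this_is_not_a_txt_file",
  "junk_that_shouldnt_be_in_this_directory.txt.gz"
].freeze

# Return the names that match a shell-style glob pattern.
def matching(pattern, names = NAMES)
  names.select { |name| File.fnmatch?(pattern, name) }
end

old_matches = matching("*txt.gz") # pattern before this change
new_matches = matching("*txt*")   # pattern after this change

puts "matched only by the new pattern: #{(new_matches - old_matches).inspect}"
```

Both patterns let the junk `.txt.gz` name through, and the new one additionally picks up plain `.txt` files along with any name that merely contains `txt`, so the later date check is what actually filters out bad names.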
-file_names_split = raw_gzip_files.map {|fname| fname.split("/").last.split("_").first }
+file_names_split = raw_files.map {|fname| fname.split("/").last.split("_").first }
 raw_dates = file_names_split.select {|d| d =~ /^\d+$/ }
It might be clearer to use a regex here. If nothing else, it would be worth adding a comment on what filename format this expects.
Added a comment documenting the expected file format.
And now with regex instead of split("_").first
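For readers without access to the updated commit, a regex-based version might look roughly like the sketch below. This is not the code from the PR: the pattern, the constant `RAW_FILE_PATTERN`, the helper `extract_raw_dates`, and the assumed filename format (leading digits, an underscore, then a name ending in `.txt` or `.txt.gz`) are all hypothetical.

```ruby
# Assumed (hypothetical) filename format: <digits>_<anything>.txt or .txt.gz,
# e.g. "20200101_concordance.txt.gz". The digits are captured as the date.
RAW_FILE_PATTERN = /\A(\d+)_.*\.txt(\.gz)?\z/

# Extract the leading date string from each path whose base name matches
# the assumed format; paths that do not match are dropped.
def extract_raw_dates(paths)
  paths.map { |path| RAW_FILE_PATTERN.match(File.basename(path)) }
       .compact
       .map { |match| match[1] }
end

paths = [
  "/data/conc/raw/20200101_concordance.txt.gz", # hypothetical path
  "/data/conc/raw/20200102_concordance.txt",    # hypothetical path
  "/data/conc/raw/this_is_not_a_txt_file",
  "/data/conc/raw/junk_that_shouldnt_be_in_this_directory.txt.gz"
]

puts extract_raw_dates(paths).inspect
```

Anchoring the match to the whole base name documents the expected format in one place and rejects stray files before any date parsing happens.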
41fa652 to 4c6dfe9
4c6dfe9 to 665645d
No description provided.