Skip to content
This repository has been archived by the owner on Jan 6, 2023. It is now read-only.

Resort to lstat for FS not supporting dirent.d_type #25

Merged
merged 1 commit into from
Nov 16, 2016

Conversation

rojkov
Copy link
Contributor

@rojkov rojkov commented Jun 10, 2016

According to POSIX.1 only d_name and d_ino fields of struct
dirent are standardized. d_type isn't always correctly set on file
systems like XFS. In such cases it makes sense to resort to
lstat(). Otherwise a user has hard time figuring out what's
wrong with her setup.

Signed-off-by: Dmitry Rozhkov [email protected]


if (entry->d_type == DT_DIR) {
iterate_directory(manifest, pathprefix, file->filename, do_hash);
} else if (entry->d_type == DT_UNKNOWN) { /* fall back to lstat() */
struct stat sb;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

populate_file_struct() calls lstat(); why do we need to call it again here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't notice this populate_file_struct(). Why it's needed there then? I presumed it's called for every file in a separate thread (see get_hash()) to make these computations parallel. Otherwise there's no need for the DT_DIR optimization in the first place.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well it's populating the 'file' struct with the lstat details. That being the case, wouldn't it be easier to just call

if (S_ISDIR(file.st_mode)) {

and remove the new call to lstat?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By referring to get_hash() I wanted to say that in the current code populate_file_struct() is called twice for every file: the first time it's called sequentially and then it's called in the parallel threads again. Which is probably not what you want. AFAIU this

if (entry->d_type == DT_DIR) {

check is an optimization postponing lstat() calls to the parallel threads.

So there are two solutions:

  1. use if (S_ISDIR(file.st_mode)) {, remove the new call to lstat() and remove populate_file_struct() from get_hash(). This is easier to understand, but suboptimal since lstat() is called sequentially for all files.
  2. drop the first call to populate_file_struct(). In this case the second lstat() is called only as a fallback for rare cases where DT_DIR is not supported.

@bradTpeters
Copy link

I think option 2 is indeed the best approach; please revise your patch and resubmit and I'll +1

According to POSIX.1 only d_name and d_ino fields of struct
dirent are standardized. d_type isn't always correctly set on file
systems like XFS. In such cases it makes sense to resort to
lstat(). Otherwise a user has hard time figuring out what's
wrong with her setup.

Also remove redundant populate_file_struct() as it's called
again in parallel threads.

Signed-off-by: Dmitry Rozhkov <[email protected]>
@rojkov
Copy link
Contributor Author

rojkov commented Jul 27, 2016

  • removed redundant call to populate_file_struct();
  • rebased and force pushed.

@phmccarty
Copy link
Contributor

Looks good to me. +1

@tmarcu tmarcu merged commit 983b17d into clearlinux:master Nov 16, 2016
@phmccarty
Copy link
Contributor

For the record:

In master, this change was reverted due to test suite failures. It caused file types to be calculated for Manifest.full, but not for bundle manifests.

@phmccarty
Copy link
Contributor

phmccarty commented Nov 16, 2016

The performance improvements from the earlier discussion on this PR are still worth considering.

In the interim, @tmarcu pushed 74c8f86 to address the d_type portability issue.

@pohly
Copy link
Contributor

pohly commented Nov 17, 2016

@phmccarty with performance improvements you mean removing the extra populate_file_struct()? Should this be tracked in a new issue?

@phmccarty
Copy link
Contributor

@pohly Exactly. I opened #39 for tracking.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants