Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement loadtxt and savetxt #23

Merged
merged 10 commits into from
Dec 21, 2019
Merged

Implement loadtxt and savetxt #23

merged 10 commits into from
Dec 21, 2019

Conversation

certik
Copy link
Member

@certik certik commented Dec 19, 2019

This also includes a minimal CMake build system. We can improve the
build system in further PRs.

Fixes #16.

So that it does not conflict with CMake generated makefiles.
To use them, execute make with:

    make -f Makefile.manual
This also includes a minimal CMake build system. We can improve the
build system in further PRs.

Fixes fortran-lang#16.
@certik certik mentioned this pull request Dec 19, 2019
@milancurcic
Copy link
Member

Thanks @certik. Should we make feature PRs to go into feature branches, for example feature/loadtxt?

@certik
Copy link
Member Author

certik commented Dec 19, 2019

My own preference is to simply send PRs against master. That seems to scale really well even for big projects and it's simple for newcomers to understand. But if others feel we should use another workflow, I am happy to adjust. How would the feature branch work? Would it be long lived?

What I was thinking is simply merging to master and we would use master as the latest version, that way people can easily test it out etc.

However, one thing that I am not sure if we should have an "experimental" section, perhaps src/experimental. And put new features there first. The module would be called stdlib_x_io, or something like that, to show that it is experimental (i.e. freshly implemented) and we need to gain experience actually using it to see if we are willing to commit to (forever) maintain backwards compatible API.

@jvdp1
Copy link
Member

jvdp1 commented Dec 19, 2019

This also includes a minimal CMake build system. We can improve the
build system in further PRs.

Fixes #16.

Nice.
Regarding extension to other types than dp (e.g. float, integer4/8,...), would it be beter using data polymorphism (unlimited polymorphic) or repeating the procedures and overloading (maybe a question for the issues)?

@certik
Copy link
Member Author

certik commented Dec 19, 2019

@jvdp1 good point, I think the double precision is the most common, so we can start with that. We either have to repeat it by hand, or use some templated systems (there are a few) that generate it for us.

@milancurcic
Copy link
Member

However, one thing that I am not sure if we should have an "experimental" section, perhaps src/experimental. And put new features there first. The module would be called stdlib_x_io, or something like that, to show that it is experimental (i.e. freshly implemented) and we need to gain experience actually using it to see if we are willing to commit to (forever) maintain backwards compatible API.

Could a git branch (devel or even experimental) be used for this purpose, like @zbeekman suggested in #5?

@jvdp1
Copy link
Member

jvdp1 commented Dec 19, 2019

@certik I often use single-precision ;) This could be also an issue for other modules (e.g. linalg?). So it would be maybe good to discuss it broadly at some point. I never work with templated systems, but I am not against. It would be probably more efficient (easier?) than using unlimited polymorphism.

@certik
Copy link
Member Author

certik commented Dec 19, 2019

@milancurcic it can. But things can stay experimental for quite some time (I can easily see something being experimental for a year). And managing two branches becomes painful. For example, with this PR, it requires some infrastructure setup (CMake, etc.), so if we put it in an experimental branch, we have to redo the CMake setup in another PR which will need it also. We can extract the CMake stuff, put into master, and try to keep the experimental branch small. But it's still going to be quite some work. Then how about releases? If everything is in master, we just need to create one release tarball and everybody can test it out. If we have an experimental branch, do we release two tarballs? We can, but it feels like a lot of administrative overhead.

I was thinking of doing something like C++ does:

https://en.cppreference.com/w/cpp/experimental

As an example: https://en.cppreference.com/w/cpp/experimental/parallelism_2, the experimental (new) feature is simply in an experimental/... header file, but my understanding is that it is part of the main standard library (e.g., part of "master").

@certik
Copy link
Member Author

certik commented Dec 19, 2019

I would suggest to develop like Microsoft develops the C++ standard library. Here is their main repo:

https://github.com/microsoft/STL

Only one branch (master). The experimental features are in master, in the experimental directory, e.g.:

https://github.com/microsoft/STL/blob/28ec9a32952e0d7443936f8d5ae5d675ba6cf65c/stl/inc/experimental/deque

Here is an example of a PR, against master, with an experimental feature:

microsoft/STL#361

It's simple, it's proven, it works for C++ and Microsoft. And then if we need to make some adjustment to a proven workflow, we can.

@certik
Copy link
Member Author

certik commented Dec 19, 2019

@jvdp1 good point, we should implement single precision version also in this PR.

@milancurcic
Copy link
Member

@certik Got it, I didn't consider all the separate infrastructure that would be needed, and indeed that would be a pain in a separate branch. Having a separate repo for this would also be a pain I think. Separate directory as you suggest seems reasonable.

src/stdlib_io.f90 Outdated Show resolved Hide resolved
@certik
Copy link
Member Author

certik commented Dec 19, 2019

Ok, so let's just use master with a separate directory called experimental, as the C++ stdlib. How should we rename the module? Because in C++, you import as #include <experimental/deque>, so in your code you know you are using an experimental API. In Fortran, we do not import using the path, so we need to rename the module (until j3-fortran/fortran_proposals#86 is implemented). Here are some ideas how to rename it:

  • stdlib_x_io
  • x_stdlib_io
  • stdlib_io_x
  • stdlib_experimental_io

The last one stdlib_experimental_io is probably in line with the C++ idea, i.e. #include <experimental/deque> would correspond to stdlib_experimental_deque. Also by spelling experimental fully would be consistent with the name of the directory.

src/stdlib_io.f90 Outdated Show resolved Hide resolved
logical function whitechar(char) ! white character
! returns .true. if char is space (32) or tab (9), .false. otherwise
character, intent(in) :: char
if (iachar(char) == 32 .or. iachar(char) == 9) then
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stdlib_ascii module will be useful here to not use literal constants. Minot nitpick as ascii constants won't change any time soon, but nevertheless.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I intentionally didn't expose whitechar as public, as we might want to change the API. Once we implement stdlib_string we can put all these in it and polish it up.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a "circular dependency" here. I would like to submit a pull request for stdlib_ascii but I was waiting to have some CMake machinery set up and so I could use assert in the unit tests.

Copy link
Member

@ivan-pi ivan-pi Dec 21, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stdlib_ascii module will be useful here to not use literal constants. Minot nitpick as ascii constants won't change any time soon, but nevertheless.

Internally in stdlib_ascii I am also using both literal character and hexadecimal constants for the symbols in the ascii table. I see no other portable way. Of course compiler vendors targeting specific processors with other default collating sequences could implement their own low-level versions. I guess another option would be to hack something up using the transfer intrinsic and bit-mask operations, but I see no benefit.

Edit: probably I misunderstood your comment, which was implying to use something like char == ascii_tab .and. char == ascii_space instead of cryptic ascii sequence integers. The stdlib_ascii module will have a is_blank function, which can be used instead of whitechar.

src/stdlib_io.f90 Outdated Show resolved Hide resolved
@certik
Copy link
Member Author

certik commented Dec 20, 2019

Let's keep the discussion going, so that we can merge this.

Should I move the io module into: stdlib_experimental_io per the discussion above?

@certik
Copy link
Member Author

certik commented Dec 20, 2019

@milancurcic please let me know your opinion regarding stdlib_experimental_io, see above.

@certik
Copy link
Member Author

certik commented Dec 21, 2019

I moved all new code to stdlib_experimental in 559bfd7. If feels right, because now we don't need to get everything 100% right in each PR. We just need to get it mostly right, and then the rest we can improve collaboratively with subsequent PRs, and get some real world usage, until we are all convinced that the functionality is rock solid. Then we can move it to stdlib_io from experimental.

@certik
Copy link
Member Author

certik commented Dec 21, 2019

I also just added a single precision version.

@milancurcic, @marshallward, @jvdp1 would you mind giving it another review please?

Comment on lines +22 to +23
real(dp), allocatable :: tmp(:,:)
call dloadtxt(filename, tmp)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

*This implies a additional copy of the array d (in dp). This could be quite inefficient for large files/arrays.
*This way could be also difficult to generalize for quad precision qp.

I implemented a more general solution and extended it to qp. How could I propose these changes?

@certik
Copy link
Member Author

certik commented Dec 21, 2019 via email

@certik
Copy link
Member Author

certik commented Dec 21, 2019

The alternative workflow is that we merge this PR, and you simply send a PR with improvements against master. In fact I think I would prefer that --- simpler and it scales better, as there are a lot more improvements that we need to make:

  • you can work on better single and quadruple support
  • I can work on setting up a CI
  • somebody else can work on the ascii module, ...

@milancurcic would you be ok with merging this PR now so that others can send subsequent PRs?

@jvdp1
Copy link
Member

jvdp1 commented Dec 21, 2019

The alternative workflow is that we merge this PR, and you simply send a PR with improvements against master. In fact I think I would prefer that --- simpler and it scales better, as there are a lot more improvements that we need to make:

  • you can work on better single and quadruple support
  • I can work on setting up a CI
  • somebody else can work on the ascii module, ...

@certik , @milancurcic At this point, I think it is the easiest solution indeed. @ivan-pi ? is maybe also waiting on a first implementation for the ascii module

@milancurcic
Copy link
Member

I like the idea of implementations going into the staging area like experimental and getting further work there before becoming part of the "stable" stdlib. stdlib_experimental_io is okay with me.

To move faster, I also agree that we can merge PRs into stdlib/experimental. There will be quite a few things that we'll want to fix or change, and this could keep the PR from being merged and slowing down next contributions that depend on it (but should really be their own PRs).

I think that we can merge PRs into stdlib/experimental as soon as:

  • Community agrees on the API (that's why better keep PRs small);
  • Tests pass

Basically, treat is as an MVP -- minimum viable product. Then we can refine with additional PRs, but other contributions that depend on this can be made because the code is merged in master.

I will give this another review now.

@milancurcic
Copy link
Member

The only outstanding issue is that the test data (in src/tests/loadtxt/) are not copied or linked to where the test executables are built with CMake, if they're built in a separate directory. For example, my default habit was to do

mkdir build
cd build
cmake ..
make
ctest

In which case the tests fail because test data is not there.

We should somehow either ensure that test data is next to the test executables, or put an instruction in README.md that the library must be built in the top-level directory.

I will have a few suggestions for changes after merge, which can be one or more new PRs.

Copy link
Member

@milancurcic milancurcic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to go. We can build with cmake from top-level directory for now and figure out the test data can be best kept accessible from executables later.

@certik certik merged commit bee64c5 into fortran-lang:master Dec 21, 2019
@certik
Copy link
Member Author

certik commented Dec 21, 2019

@jvdp1 it's merged! Go ahead and submit PRs against master now (into an experimental module).

@certik certik deleted the loadtxt branch December 21, 2019 22:11
@certik
Copy link
Member Author

certik commented Dec 21, 2019

@milancurcic what you wrote in #23 (comment) is I think exactly how we should do it. Merging to experimental modules can be treated similarly as merging to any other opensource project --- the community must agree on the API, tests must pass and it must pass review. It doesn't have the be 100% ready, as in this PR --- but we must be able to get to the 100% solution by sending subsequent PRs.

Once we get to the 100% solution in experimental, we'll have to figure out another workflow how to move it from experimental into main. For now our workflow is enough to keep going.

Thanks for the review!

@certik certik mentioned this pull request Dec 30, 2019
jvdp1 pushed a commit to jvdp1/stdlib that referenced this pull request Oct 2, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

loadtxt and savetxt
5 participants