Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

main,win32: introduce an option replacing backslashes in input file names with slashes #2199

Merged

Conversation

masatake
Copy link
Member

(TODO: documentation)

Suggested by @k-takata in #1325.

Signed-off-by: Masatake YAMATO [email protected]

@masatake
Copy link
Member Author

Let's see how Appveyor says.

@masatake masatake changed the title main,win32: introduce an option replacing backslashes in input file names with slashes [WIP] main,win32: introduce an option replacing backslashes in input file names with slashes Sep 18, 2019
@coveralls
Copy link

coveralls commented Sep 18, 2019

Coverage Status

Coverage increased (+0.001%) to 86.178% when pulling 6616651 on masatake:use-slash-as-filename-separator into fed9107 on universal-ctags:master.

@masatake masatake force-pushed the use-slash-as-filename-separator branch 2 times, most recently from 151fce1 to 931a12f Compare September 19, 2019 02:57
@k-takata
Copy link
Member

How about making this as a default when --output-format=u-ctags?
(Is it possible to change the default depending on the output format?)

I feel the option name --use-slash-as-filename-separator is a bit long. (But I don't have a good alternative idea.)

@masatake
Copy link
Member Author

@k-takata, I need your help.
Can we use '/' as a file name separator on all windows + alpha environment?
Here alpha means cygwin, mingw, and so on.

I wonder whether I can remove UNIX_PATH_SEPARATOR or not. In our code, UNIX_PATH_SEPARATOR is used in ifdef condition. However, it is not defined anywhere.

I inspected Exuberant-ctags, and I found it is defined when ctags is built on OS-2.
Some years ago, I removed makefiles related to the minor platform.
From that point of view, UNIX_PATH_SEPARATOR is just garbage.

Quoted from our code:

#ifndef PATH_SEPARATOR
# if defined (MSDOS_STYLE_PATH)
#  define PATH_SEPARATOR '\\'
# else
#  define PATH_SEPARATOR '/'
# endif
#endif

#if defined (MSDOS_STYLE_PATH) && defined (UNIX_PATH_SEPARATOR)
# define OUTPUT_PATH_SEPARATOR	'/'
#else
# define OUTPUT_PATH_SEPARATOR	PATH_SEPARATOR
#endif

I wonder I can remove && defined (UNIX_PATH_SEPARATOR).

This is not directly related to this pull request.

I got this question during developing the code proposed here.

@masatake
Copy link
Member Author

How about making this as a default when --output-format=u-ctags?
(Is it possible to change the default depending on the output format?)

I feel the option name --use-slash-as-filename-separator is a bit long. (But I don't have a good alternative idea.)

I was thinking a much shorter name like... -/ or -s.
Using / in --output-format=u-ctags format is attractive.
However, I think we should solve this issue in one more higher layer because users may want to use / in other formats: JSON, etags and xref output.

@masatake
Copy link
Member Author

@k-takata, what I wrote misses the point. The issue about backslash is introduced in u-ctags format.
So, what you wrote makes sense.

@k-takata
Copy link
Member

Can we use '/' as a file name separator on all windows + alpha environment?
Here alpha means cygwin, mingw, and so on.

Of course, cygwin is a linux-like environment, thus it uses /.
On normal Windows environment, Win32 API and C runtime library support both \ and /.
So, if an application doesn't restrict a separator to \, it should be able to use /. It's totally depends on each application. (But I think most applications support /.)

The issue about backslash is introduced in u-ctags format.

Yes, so I think that an application which explicitly supports u-ctags format should support / even on Windows.

@masatake masatake force-pushed the use-slash-as-filename-separator branch 2 times, most recently from f25c777 to 3efaed1 Compare September 19, 2019 20:26
{
#ifdef WIN32
if (Option.useSlashAsFilenameSeparator == FILENAME_SEP_UNSET)
Option.useSlashAsFilenameSeparator = FILENAME_SEP_USE_SLASH;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that casting is needed to remove const or need to define OPTION_WRITE before including "options_p.h".

Suggested change
Option.useSlashAsFilenameSeparator = FILENAME_SEP_USE_SLASH;
((optionValues *)&Option)->useSlashAsFilenameSeparator = FILENAME_SEP_USE_SLASH;

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you.
However, I will take bit non-efficient technique as the first trial.
I remember the following commit when I see code like #define SOMETHING const.

commit 585d60d5a4ea4c6f2e96c9cd65dae87fe52d25c2
Author: Masatake YAMATO <[email protected]>
Date:   Tue Apr 29 16:13:57 2014 +0900

    Remove const modifier from File global variable declared in read.h
    
    This is a temorary bug fix.
    
    const modifier is added to `File' global variable declared in read.h.
    The intent is `File' should be modifiable only from read.c.
    From the outside the variable can be read.
    
    However, it causes the bug if -O[23] option is used when compiling.
    I found it in fortran.c.
    
      read_File();
      read_c_func_modifying_File();
      read_File();
    
    Though a field in `File' is modified in read_c_func_modifying_File.
    However, here gcc treats File as constant. gcc assigns the field to a
    CPU register. So the 2nd read_File cannot access to the value of the
    field updated by read_c_func_modifying_File.

Fixing the above issue took so much time.

@masatake masatake force-pushed the use-slash-as-filename-separator branch from 3efaed1 to 63ae305 Compare September 20, 2019 04:41
@masatake
Copy link
Member Author

Instead of struggling with the const macro, I introduce one more method to writer that allows the writer overrides the way to handle separators. The method is called each time when writing a tag. So it is not efficient. I will measure the performance of the new code with universal-ctags/codebase. If we find serious degradation, I will introduce function static variable returned from the method.

@masatake masatake force-pushed the use-slash-as-filename-separator branch from 63ae305 to eb3eb02 Compare September 20, 2019 05:17
@k-takata
Copy link
Member

https://ci.appveyor.com/project/universalctags/ctags/builds/27542036/job/wcetr16qhl00xitj#L2093
Still two tests are failing. Maybe expected data need to be updated?

@k-takata
Copy link
Member

Tmain/maxdepth.d/run.sh converts \\ to / by using convsep.
Maybe we can remove this now?

@masatake
Copy link
Member Author

@k-takata, thank you. Updated.

@masatake
Copy link
Member Author

Before documenting this feature, I have to review 79c6a94.
This change was merged once by my mistake. It was reverted.

@k-takata
Copy link
Member

LGTM (except for the documents).

It was reverted.

Is it #1745? Was it reverted just for lacking of documentation, or any other reasons?

@masatake
Copy link
Member Author

Is it #1745? Was it reverted just for lacking of documentation, or any other reasons?
Merge state

79c6a94 explains why format-3 is needed. The readtags library doesn't provide the way to pass newly introduced pseudo tags (e.g. TAG_OUTPUT_MODE) to its application.

New, I would like to introduce new pseudo tag, TAG_OUTPUT_FILESEP. I can say it is a new variant of the format-3.
It means eliminating TAG_OUTPUT_MODE by introducing format-3 as I proposed in 79c6a94 is not enough. Further more eliminating TAG_OUTPUT_MODE is just bad solution.

I think what we have to do is extending readtags to allow to pass any pseudo tags to application.

I am streamlining my TODO list for merging this. Please, wait for a while.

@masatake
Copy link
Member Author

I have good way to exnted readtags API for supporting extra ptags.
So I will throw away the format-3 branch.

Documentation is the next task.

  • write about backslash escaping in docs/format.rst,
  • write about --use-slash-as-filename-separator, and
  • write about TAGS_OUTPUT_FILESEP.

At the first time, I thought the name of option is bad. However, I find it is easy to remember the name.
I can introduce a short option for the long option later, and shell completion rules may help user.

I think none cares this option. The idea, using slash as default input for u-ctags output, suggested by @k-takata is so perfect.

@k-takata
Copy link
Member

The commit log of e6ee6e6 has a typo.
s/TAGS_OUTPUT_FILESEP/TAG_OUTPUT_FILESEP/

@k-takata
Copy link
Member

  • write about TAGS_OUTPUT_FILESEP.

I'm not sure whether introducing TAG_OUTPUT_FILESEP is really useful, but it won't cause any problems, so I'm not against this.
(Do you have any use cases?)

I can introduce a short option for the long option later,

I think it is low priority, because the default setting is reasonable now and most people may not want to change the default.

@masatake masatake force-pushed the use-slash-as-filename-separator branch from 4cfe48f to 6417a1d Compare September 26, 2019 15:16
@codecov-io
Copy link

Codecov Report

Merging #2199 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2199      +/-   ##
==========================================
+ Coverage   86.04%   86.05%   +<.01%     
==========================================
  Files         176      176              
  Lines       35503    35506       +3     
==========================================
+ Hits        30550    30553       +3     
  Misses       4953     4953
Impacted Files Coverage Δ
main/options.c 82.62% <ø> (ø) ⬆️
main/strlist.c 79.2% <ø> (ø) ⬆️
main/writer-ctags.c 98.43% <ø> (ø) ⬆️
main/writer.c 97.43% <100%> (+0.21%) ⬆️
main/entry.c 83.98% <100%> (ø) ⬆️
main/writer-json.c 93.67% <100%> (ø) ⬆️
main/ptag.c 92.64% <100%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9757e4f...6417a1d. Read the comment docs.

@masatake
Copy link
Member Author

I'm not sure whether introducing TAG_OUTPUT_FILESEP is really useful, but it won't cause any problems, so I'm not against this.
(Do you have any use cases?)

I don't have any serious use case. However, I image there is a case that a client tool wants to know how given tags file is generated. The pseudo tag gives a hint to the client tool.

@k-takata
Copy link
Member

k-takata commented Oct 1, 2019

I found that the option --output-format is not described in man/ctags.1.rst.in. This should be also documented, I think.

@k-takata
Copy link
Member

k-takata commented Oct 1, 2019

A proposal for --output-format:

--- a/man/ctags.1.rst.in
+++ b/man/ctags.1.rst.in
@@ -1389,6 +1389,12 @@ are not listed here. They are experimental or for debugging purpose.
 	Add an optlib *directory* to or reset **optlib** path list.
 	By default, the optlib path list is empty.
 
+``--output-format=u-ctags|e-ctags|etags|xref|json``
+	Specify the output format. The default is "u-ctags".
+	See tags(5) for "u-ctags" and "e-ctags".
+	See ``-e`` for "etags".
+	This option must appear before the first file name.
+
 ``--print-language``
 	Just prints the language parsers for specified source files, and then exits.
 

The descriptions for "xref" and "json" are still missing, though.

masatake added a commit to masatake/ctags that referenced this pull request Oct 1, 2019
@masatake
Copy link
Member Author

masatake commented Oct 1, 2019

@ k-takata, thank you.

I must write about the backslash conversion in tags(5) and --use-slash-as-file-separator.
Then, I will merge this.

@masatake masatake changed the title [WIP] main,win32: introduce an option replacing backslashes in input file names with slashes main,win32: introduce an option replacing backslashes in input file names with slashes Oct 2, 2019
@masatake
Copy link
Member Author

masatake commented Oct 2, 2019

I give up updating tags(5) in THIS pull request.
I will open the documentation specific pull requests.

main/writer-ctags.c Outdated Show resolved Hide resolved
Copy link
Member

@k-takata k-takata left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Will you squash the "fixup!" commits before merging?

@masatake
Copy link
Member Author

masatake commented Oct 2, 2019

Yes, I will. ...However, it seems that buiding on centos8 on circleci has a trouble.

There is no '#define' for UNIX_PATH_SEPARATOR.
It was defined in the makefile for OS/2.
However, we dropped the support for OS/2 ago.

Signed-off-by: Masatake YAMATO <[email protected]>
…lash

Now, u-ctags running on win32 platform converts backslashes in input fields
to slashes. The conversion in the test case side is not needed anymore.

Suggested by @k-takata.
… it is common in parsers

Signed-off-by: Masatake YAMATO <[email protected]>
@masatake masatake force-pushed the use-slash-as-filename-separator branch from 996a79a to 6616651 Compare October 2, 2019 02:33
@masatake masatake merged commit f49b34b into universal-ctags:master Oct 2, 2019
@masatake
Copy link
Member Author

masatake commented Oct 2, 2019

@k-takata, thank you very much.

@k-takata
Copy link
Member

k-takata commented Oct 2, 2019

👍

k-takata added a commit to k-takata/ctags that referenced this pull request Oct 23, 2019
`parser-c.r/line_directives.c.d/filter` is not needed anymore because we
have merged universal-ctags#2199.
@k-takata k-takata mentioned this pull request Nov 7, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants