No more clang-format
#3376
Replies: 2 comments 2 replies
-
This has been considered before, e.g. see https://www.stroustrup.com/gdr-bs-macis09.pdf.
-
One of the Carbon language project's key goals is to support developer tooling. On one level, that means providing things like a toolchain, and eventually supporting CMake. Beyond that, it includes IDEs and other tooling that developers are familiar with and use daily. While something like
For example, there's a breadth of IDEs developers use for C++ (VSCode, CLion, Eclipse, vim, emacs, etc.). There's been a lot of unification around LSP for cross-language support, but IPR/XPR would probably require new LSP features for saving and loading content. If an IDE relies too heavily on displaying the on-disk representation, it could also be infeasible to get it to support editing the human-readable format. In that case, developers would need to either switch to an IDE that supports Carbon or manually convert between IPR/XPR in order to edit files.
But IDEs are just one tool that developers use. There's viewing code, such as with
For each such tool, IPR/XPR creates a choice: the tool gets Carbon-specific support (either from tool owners or Carbon contributors), Carbon provides an alternative tool (e.g.,
The paper tkoeppe linked says IPR/XPR provides a significant benefit for C++ in gcc. There aren't numbers in the paper, but I believe it: clang's C++ frontend is around two thirds of compile time (with the LLVM backend being the other third). Carbon is more efficient: the Carbon frontend should be less than a third of compile time, and the LLVM backend more than two thirds. Within the frontend, checking costs more than lexing or parsing, and IPR just changes those steps to lexing and parsing the IPR. I'd expect no significant compile time performance benefit for Carbon.
The tooling to convert Carbon to a parsed format will exist: similar can already be done with
-
I am an HPC and GPGPU developer, not a compiler developer, so bear with me if this idea is nonsense, but it has been in the back of my mind for a while now.
Code formatting has become such a hassle in the C++ world that it's often infuriating. If you don't enforce any type of formatting, it descends into bikeshedding and meaningless changes. But if you do, you impose a great deal of burden on your contributors: either they keep a specific Clang version on their system for stable formatting, or the version is a moving target, which makes code render differently in parts of the file if you use "Format changes only"... Projects cook up commit hooks for checking formatting so one doesn't burn CI time on trivial formatting mistakes, and people cook up CI jobs tailored to check formatting... There is SOO MUCH effort going into this without any real value.
Take SVG, however, which just about nobody edits manually: people don't care or even know that it's XML, and yet editors present it in a consumable, workable fashion. Why can't source code be the same?
Suppose the language were stored in some normalized serialization format (like MSVC's IPR has XPR), and the IDE/editor converted it to some human-readable format based on user preference. By storing the sources as XPR (let's go with this for a moment), version control and projects need not be concerned with how contributors like their source code to render. Storing XPR would also turbocharge semantic diff tools for various purposes, version control being the prime consumer.
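To make the "version control need not care about formatting" point concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the toy lexer, the `normalize` helper, and the JSON token list merely stand in for a real normalized format such as XPR, and the sketch ignores comments and string literals entirely.

```python
import json
import re

# Hypothetical toy lexer: identifiers, integers, "->", and single-character
# punctuation. This is NOT Carbon's lexer; it only illustrates the idea.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|->|[^\s\w]")


def normalize(source: str) -> str:
    """Serialize source as a whitespace-free token list (a stand-in for XPR)."""
    return json.dumps(TOKEN_RE.findall(source))


# The same function, formatted according to two different preferences.
compact = "fn Add(a: i32, b: i32) -> i32 { return a + b; }"
expanded = """
fn Add(a: i32,
       b: i32) -> i32
{
    return a + b;
}
"""

# Both renderings normalize to the same stored form, so a diff of the
# stored form only ever shows token-level changes, never reformatting.
print(normalize(compact) == normalize(expanded))
```

A real format would of course store more than a flat token list (comments, for one, would need a home), but the property being illustrated is the same: layout never reaches the repository.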
This also has some interesting side effects, such as not having to store source location information: because the location depends on the configured rendering of the source code, error messages would refer to token ids, which would be translated to rendered line/column numbers. Lexing/parsing becomes much simpler (much like turning XPR into IPR). Also, saving ill-formed code would not be possible, as the serializer wouldn't know how to complete the transformation. (This is a solvable problem: it could fall back to the textual format to save unfinished work.)
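The token-id idea can be sketched roughly as follows, again in Python. The `render` function, its `indent_width` and `brace_on_new_line` options, and the sample token list are all hypothetical; the layout rules are deliberately crude (one space between every token) and only serve to show that a diagnostic attached to a token id can be resolved to a different line/column for each user's rendering.

```python
def render(tokens, indent_width=4, brace_on_new_line=False):
    """Return (text, positions); positions maps token id -> 1-based (line, column)."""
    lines = []  # each entry: (depth, [(token_id, text), ...])
    current, depth = [], 0

    def flush(line_depth):
        nonlocal current
        if current:
            lines.append((line_depth, current))
            current = []

    for token_id, text in enumerate(tokens):
        if text == "{":
            if brace_on_new_line:
                flush(depth)
            current.append((token_id, text))
            flush(depth)
            depth += 1
        elif text == "}":
            flush(depth)
            depth -= 1
            current.append((token_id, text))
            flush(depth)
        else:
            current.append((token_id, text))
            if text == ";":
                flush(depth)
    flush(depth)

    out, positions = [], {}
    for line_no, (line_depth, items) in enumerate(lines, start=1):
        column = line_depth * indent_width + 1
        for token_id, text in items:
            positions[token_id] = (line_no, column)
            column += len(text) + 1  # one space between tokens
        out.append(" " * (line_depth * indent_width)
                   + " ".join(text for _, text in items))
    return "\n".join(out), positions


# Stored form: just tokens. Suppose the checker reports an error at token id 6.
tokens = ["fn", "F", "(", ")", "{", "return", "x", ";", "}"]
error_token = 6

text_a, pos_a = render(tokens)                                   # one user's style
text_b, pos_b = render(tokens, indent_width=2, brace_on_new_line=True)  # another's

for text, pos in ((text_a, pos_a), (text_b, pos_b)):
    line, column = pos[error_token]
    print(text)
    print(f"error at {line}:{column}: use of undeclared name '{tokens[error_token]}'\n")
```

The same diagnostic ends up at a different location in each rendering, while the stored form and the compiler's view of the error stay identical.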
(Taking it to an arguable extreme, one could go as far as writing renderers/deserializers for the language that render it in a Python-like, indentation-sensitive fashion or as a regular curly-brace-style language. I'm not saying one should go this far, but it's a possibility.)
I've long been interested in serialized data structures, and it's on my shortlist to write a SAX-enabled, coroutine-driven, EXI-capable XML parser in C++ just to learn coroutines properly. I would probably give EXI a try, as it was specifically meant for this: exchanging XML-like tree structures with as little storage as possible. (EXI isn't tied to XML; there exists EXI for JSON and, really, any tree-like data serialization format.) EXI even has a schema-assisted optimized storage mode which a versioned programming language with a fixed grammar could tap into. In the context of Carbon, I really wouldn't care what normalized serialization format it used; XML/EXI is just one option, and it may as well just dump TokenizedBuffer on the disk.
In short: I think it would be interesting to explore this space, of a programming language sticking with a non-textual representation as the default and having tooling assist with rendering it.