Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Miscellaneous cleanup to formatting #69209

Merged
merged 4 commits into from
Feb 26, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 0 additions & 2 deletions src/libcore/fmt/float.rs
Original file line number Diff line number Diff line change
Expand Up @@ -29,7 +29,6 @@ where
*num,
sign,
precision,
false,
buf.get_mut(),
parts.get_mut(),
);
Expand Down Expand Up @@ -59,7 +58,6 @@ where
*num,
sign,
precision,
false,
buf.get_mut(),
parts.get_mut(),
);
Expand Down
45 changes: 23 additions & 22 deletions src/libcore/fmt/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -238,16 +238,8 @@ pub struct Formatter<'a> {
// NB. Argument is essentially an optimized partially applied formatting function,
// equivalent to `exists T.(&T, fn(&T, &mut Formatter<'_>) -> Result`.

struct Void {
_priv: (),
/// Erases all oibits, because `Void` erases the type of the object that
/// will be used to produce formatted output. Since we do not know what
/// oibits the real types have (and they can have any or none), we need to
/// take the most conservative approach and forbid all oibits.
///
/// It was added after #45197 showed that one could share a `!Sync`
/// object across threads by passing it into `format_args!`.
_oibit_remover: PhantomData<*mut dyn Fn()>,
extern "C" {
type Opaque;
Comment on lines +241 to +242
Copy link
Member

@eddyb eddyb Mar 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's funny is that we merged the PR that adds doc comments with it still using Void (@anyska went to rename Void -> Opaque and found it just in comments, PR coming soon).

Also "C" is redundant, I just realized that now. EDIT: ah no, rustfmt adds it.

}

/// This struct represents the generic "argument" which is taken by the Xprintf
Expand All @@ -259,16 +251,23 @@ struct Void {
#[unstable(feature = "fmt_internals", reason = "internal to format_args!", issue = "none")]
#[doc(hidden)]
pub struct ArgumentV1<'a> {
value: &'a Void,
formatter: fn(&Void, &mut Formatter<'_>) -> Result,
value: &'a Opaque,
formatter: fn(&Opaque, &mut Formatter<'_>) -> Result,
}

impl<'a> ArgumentV1<'a> {
#[inline(never)]
fn show_usize(x: &usize, f: &mut Formatter<'_>) -> Result {
Display::fmt(x, f)
}
// This gurantees a single stable value for the function pointer associated with
// indices/counts in the formatting infrastructure.
//
// Note that a function defined as such would not be correct as functions are
// always tagged unnamed_addr with the current lowering to LLVM IR, so their
// address is not considered important to LLVM and as such the as_usize cast
// could have been miscompiled. In practice, we never call as_usize on non-usize
// containing data (as a matter of static generation of the formatting
// arguments), so this is merely an additional check.
#[unstable(feature = "fmt_internals", reason = "internal to format_args!", issue = "none")]
static USIZE_MARKER: fn(&usize, &mut Formatter<'_>) -> Result = |_, _| loop {};
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @eddyb -- am I correct that this reasoning is correct / this should work?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see how it's much more different from the previous approach.
LLVM will replace loads from USIZE_MARKER with the fn pointer, which is still unnamed_addr.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my reading of your comment in a different issue, my understanding is that the non-determinism is strictly in the casting from a function (or closure) to the function pointer.

But even if the closure here is unnamed_addr, the function pointer produced is guaranteed to be stable. I guess what we don't guarantee is that this closure did not get merged with a different function. But that's fine, we don't technically need to -- anywhere that this could have gotten merged, would have to be associated with a usize in the "value" part of Argument, right? So the safety property would still hold.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see what the argument is. Indeed, we shouldn't introduce non-determinism and USIZE_MARKER == USIZE_MARKER should always hold.

The funny thing though is that because you used |_, _| loop {}, anyone doing the same thing (loop {} in the body) could end up being collapsed to this (I don't think the signature matters, because usize is not passed by value).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The signature not mattering is an interesting point. There's probably no way to prevent that collapsing, though, right? We could plausibly do something weird (e.g., print the address of the passed references shifted to the right or something) and just say "no one else will do this", I guess.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should at least load an usize and blackbox it.
Otherwise there is no guarantee there is an usize (or a same-size integer) behind the opaque pointer.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will try to do this in a follow-up PR to avoid getting this bogged down (and r? you on that one).


impl<'a> ArgumentV1<'a> {
#[doc(hidden)]
#[unstable(feature = "fmt_internals", reason = "internal to format_args!", issue = "none")]
pub fn new<'b, T>(x: &'b T, f: fn(&T, &mut Formatter<'_>) -> Result) -> ArgumentV1<'b> {
Expand All @@ -278,11 +277,13 @@ impl<'a> ArgumentV1<'a> {
#[doc(hidden)]
#[unstable(feature = "fmt_internals", reason = "internal to format_args!", issue = "none")]
pub fn from_usize(x: &usize) -> ArgumentV1<'_> {
ArgumentV1::new(x, ArgumentV1::show_usize)
ArgumentV1::new(x, USIZE_MARKER)
}

fn as_usize(&self) -> Option<usize> {
if self.formatter as usize == ArgumentV1::show_usize as usize {
if self.formatter as usize == USIZE_MARKER as usize {
// SAFETY: The `formatter` field is only set to USIZE_MARKER if
// the value is a usize, so this is safe
Some(unsafe { *(self.value as *const _ as *const usize) })
} else {
None
Expand Down Expand Up @@ -1356,11 +1357,11 @@ impl<'a> Formatter<'a> {
let mut align = old_align;
if self.sign_aware_zero_pad() {
// a sign always goes first
let sign = unsafe { str::from_utf8_unchecked(formatted.sign) };
let sign = formatted.sign;
self.buf.write_str(sign)?;

// remove the sign from the formatted parts
formatted.sign = b"";
formatted.sign = "";
width = width.saturating_sub(sign.len());
align = rt::v1::Alignment::Right;
self.fill = '0';
Expand Down Expand Up @@ -1392,7 +1393,7 @@ impl<'a> Formatter<'a> {
}

if !formatted.sign.is_empty() {
write_bytes(self.buf, formatted.sign)?;
self.buf.write_str(formatted.sign)?;
}
for part in formatted.parts {
match *part {
Expand Down
6 changes: 3 additions & 3 deletions src/libcore/fmt/num.rs
Original file line number Diff line number Diff line change
Expand Up @@ -369,11 +369,11 @@ macro_rules! impl_Exp {
flt2dec::Part::Copy(exp_slice)
];
let sign = if !is_nonnegative {
&b"-"[..]
"-"
} else if f.sign_plus() {
&b"+"[..]
"+"
} else {
&b""[..]
""
};
let formatted = flt2dec::Formatted{sign, parts};
f.pad_formatted_parts(&formatted)
Expand Down
32 changes: 15 additions & 17 deletions src/libcore/num/flt2dec/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -237,7 +237,7 @@ impl<'a> Part<'a> {
#[derive(Clone)]
pub struct Formatted<'a> {
/// A byte slice representing a sign, either `""`, `"-"` or `"+"`.
pub sign: &'static [u8],
pub sign: &'static str,
/// Formatted parts to be rendered after a sign and optional zero padding.
pub parts: &'a [Part<'a>],
}
Expand All @@ -259,7 +259,7 @@ impl<'a> Formatted<'a> {
if out.len() < self.sign.len() {
return None;
}
out[..self.sign.len()].copy_from_slice(self.sign);
out[..self.sign.len()].copy_from_slice(self.sign.as_bytes());

let mut written = self.sign.len();
for part in self.parts {
Expand Down Expand Up @@ -402,38 +402,38 @@ pub enum Sign {
}

/// Returns the static byte string corresponding to the sign to be formatted.
/// It can be either `b""`, `b"+"` or `b"-"`.
fn determine_sign(sign: Sign, decoded: &FullDecoded, negative: bool) -> &'static [u8] {
/// It can be either `""`, `"+"` or `"-"`.
fn determine_sign(sign: Sign, decoded: &FullDecoded, negative: bool) -> &'static str {
match (*decoded, sign) {
(FullDecoded::Nan, _) => b"",
(FullDecoded::Zero, Sign::Minus) => b"",
(FullDecoded::Nan, _) => "",
(FullDecoded::Zero, Sign::Minus) => "",
(FullDecoded::Zero, Sign::MinusRaw) => {
if negative {
b"-"
"-"
} else {
b""
""
}
}
(FullDecoded::Zero, Sign::MinusPlus) => b"+",
(FullDecoded::Zero, Sign::MinusPlus) => "+",
(FullDecoded::Zero, Sign::MinusPlusRaw) => {
if negative {
b"-"
"-"
} else {
b"+"
"+"
}
}
(_, Sign::Minus) | (_, Sign::MinusRaw) => {
if negative {
b"-"
"-"
} else {
b""
""
}
}
(_, Sign::MinusPlus) | (_, Sign::MinusPlusRaw) => {
if negative {
b"-"
"-"
} else {
b"+"
"+"
}
}
}
Expand Down Expand Up @@ -462,7 +462,6 @@ pub fn to_shortest_str<'a, T, F>(
v: T,
sign: Sign,
frac_digits: usize,
_upper: bool,
buf: &'a mut [u8],
parts: &'a mut [Part<'a>],
) -> Formatted<'a>
Expand Down Expand Up @@ -679,7 +678,6 @@ pub fn to_exact_fixed_str<'a, T, F>(
v: T,
sign: Sign,
frac_digits: usize,
_upper: bool,
buf: &'a mut [u8],
parts: &'a mut [Part<'a>],
) -> Formatted<'a>
Expand Down
Loading