A prvalue is not a temporary

blog.knatten.org

29 points by ingve 7 hours ago

When people say "Rust is just as complex as C++," I think value categories are a great example of why it's actually simpler, even if it also seems complex. C++ has three primary categories: prvalue, xvalue, and lvalue. There's also glvalue and rvalue. Rust has two: a place (which is a glvalue) and a value (which is a prvalue).

C++ needs those extra categories; they exist for good reasons. But it is more complex.

Sharlin 11 hours ago

To be fair, though, Rust really needs something morally like prvalues, to solve the heap initialization problem aka `Box::new([42; 10_000_000])`
- tialaramex 10 hours ago
  I think what you'd most likely do here is something like:
  const HUGE: usize = 10_000_000;
  let mut values = Box::<[i32]>::new_uninit_slice(HUGE);
  for k in 0..HUGE { values[k].write(42); }
  let values = values.assume_init();
  Edited to fix early enter oops, typo in loop length that'd be caught if I'd tried this.
  - phire 5 hours ago
    
    Only problem with that approach is that assume_init() is unsafe (because it depends on you getting the initialisation loop correct), and many people are (correctly) hesitant to use unsafe.
    IMO, it would be really nice if the naive syntax was guaranteed to work, rather than requiring programmers to remember a new workaround syntax (new_uninit_slice() was only stabilised a year ago). This edge case is a little annoying because the naive approach will usually work for a release build, but then fail when you switch back to a debug build.
    
    tialaramex an hour ago
    
    Yes, sorry, too late to edit now but it should use an unsafe block for the assume and provide the safety rationale explaining why we're sure this is correct.
    I am sympathetic to the desire to make it "just work" in proportion to how much I actually see this in the wild. Large heap allocations, initialized in place in a single sweeping action. It does happen, but I don't reach for it often.
    Often people say "Oh, well I will finish initializing it later but..." and now we're definitely talking about MaybeUninit 'cos if you've only half finished the Goose, that's not a Goose yet, that's a MaybeUninit<Goose> and it's crucial that we remember this or we're writing nonsense and may blow our feet off with a seemingly harmless mistake elsewhere.
  - csmantle 5 hours ago
    
    This will soon get cumbersome if we're trying to construct some large struct literals (rather than arrays) directly on the heap. Rust should be able to elide the unnecessary stack allocation here.
- steveklabnik 11 hours ago
  
  Yes, it is possible that Rust will add more complexity here specifically, but also just in general. Just how it goes :)
  - okanat 10 hours ago
    
    I am not so sure. A purely type-system-based API (in place initialization + MaybeUninit ?) or a new "static" Vec type can already solve most of the problems with Box::new([val; N]).
rich_sasha 3 hours ago

I'd say the complexities of Rust just lie elsewhere. Building a linked list is trivial in C++ (well, third time round, after two segfaults), whereas in Rust it requires a degree in borrow checker ology.
But yes, I used C++ extensively about 20 years ago and no longer understand any new developments in this language.
koito17 11 hours ago

Additionally, the "r" and "l" may lead one to incorrectly guess that rvalues and lvalues are related to their position in an expression. But alas, they aren't; there are lvalues that cannot appear in the left-hand side of an expression.
- mzajc 10 hours ago
  
  Are you referring to consts? Besides those, I can't really think of a counterexample.
  - tonyarkles 6 hours ago
    
    https://en.cppreference.com/w/cpp/language/value_category.ht...
    There are some weird examples here involving functions, function pointers, template parameters, and a few other things.
- nutjob2 10 hours ago
  
  > cannot appear in the left-hand side of an expression
  Do you mean assignment?
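
To make the counterexamples in this subthread concrete, here is a minimal C++17 sketch. The names `kLimit`, `greet`, `grid`, and `is_lvalue_expr` are made up for illustration. It relies on the rule that `decltype((expr))` is an lvalue reference type exactly when `expr` is an lvalue, and shows expressions that are lvalues yet cannot be the target of an assignment.

```cpp
#include <type_traits>

const int kLimit = 10;    // hypothetical const object
void greet() {}           // hypothetical function
int grid[3] = {1, 2, 3};  // hypothetical array

// decltype((e)) is an lvalue reference type exactly when e is an lvalue.
template <class T>
constexpr bool is_lvalue_expr = std::is_lvalue_reference_v<T>;

// All four expressions are lvalues...
static_assert(is_lvalue_expr<decltype((kLimit))>);  // const int&
static_assert(is_lvalue_expr<decltype((greet))>);   // void (&)()
static_assert(is_lvalue_expr<decltype((grid))>);    // int (&)[3]
static_assert(is_lvalue_expr<decltype(("text"))>);  // const char (&)[5]

// ...yet these cannot appear on the left-hand side of an assignment.
// (Writing `greet = ...;` would not compile either.)
static_assert(!std::is_assignable_v<decltype((kLimit)), int>);
static_assert(!std::is_assignable_v<decltype((grid)), int (&)[3]>);
static_assert(!std::is_assignable_v<decltype(("text")), const char*>);

// By contrast, a prvalue such as the literal 42 is not an lvalue.
static_assert(!is_lvalue_expr<decltype((42))>);
```

So "lvalue" is better read as "has identity" than as "may stand on the left of `=`".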

spacechild1 10 hours ago

This is actually a very nice explanation!

I also enjoyed the linked article about the 5 value types: https://blog.knatten.org/2018/03/09/lvalues-rvalues-glvalues.... For some reason I never bothered to look up these terms as they sounded so obscure. Turns out the taxonomy is pretty clear and it's just a refinement of the existing two value types (lvalues and rvalues).

EDIT: I do think the naming is rather confusing and inconsistent, though.

dilawar 4 hours ago

Looks like a cultural problem in C++ land. I liked the article and it is very nicely written. But I am sure I am not gonna remember it after a week.
RAII is even worse!
Can't they have better descriptive names?
- cordenr 2 hours ago
  
  RAII is so before 2020s! Now it's SBRM*
  *Scope-bound resource management... which I had to look up because I never remember it!

nixpulvis 11 hours ago

Being able to accidentally use a value after moving it is tragic.

malkia 10 hours ago

Yup - I wish that was not the case. I learned relatively recently that the state a variable is left in after being moved from is "valid, but unspecified": you can still destroy it (how would RAII work otherwise?), you can still assign to it, and you can even check some basic properties (!), but you can't know anything about the actual value (contents) it carries (weird).
It's like a tombstone for a deleted file, object - something that tells you - "Here lived Obj B Peterson, was nice folk, but moved away to a higher place"
- vitus 10 hours ago
  
  > "valid, but unspecified"
  Annoyingly, it depends on the type, sometimes with unintuitive consequences.
  Move a unique_ptr? Guaranteed that the moved-from object is now null (fine). Move a std::optional? It remains engaged, but the wrapped object is moved-from (weird).
  Move a vector? Unspecified.
- senderista 10 hours ago
  
  Those semantics are entirely on you, the implementer, to enforce. Move semantics is basically a contract between move ctor/assignop impls and dtor impls: the former must somehow signal moved-out status (typically by nulling out a ptr or setting a flag), and the latter must respect it. And of course the client shouldn't use a moved-from object in invalid ways. All of this is completely ad-hoc and unenforceable.
  - malkia 10 hours ago
    
    Well, it's not on me, because I'm often the user of someone's library, framework, etc., and then it's up to whether it was a conscious or unconscious decision that the type behaved that way. Then you have to mix things to work together and you end up with this really "unspecified" way, hence you put some rules in place - "Don't do this" - don't use an object after std::move, even if it might be allowed.
    I'm still baffled.
    
    senderista 7 hours ago
    
    Well they couldn't do much better without supporting destructive moves. Rust shows how simple move semantics can be when they're designed into the language.
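
The type-dependent moved-from states mentioned earlier in this subthread can be checked with a small sketch, assuming C++17. The helper name `moved_from_states_hold` is made up for illustration. It asserts only what the standard guarantees: a moved-from `unique_ptr` is null, a moved-from `optional` stays engaged, and a moved-from `vector` may be reset and reused but its contents are not inspected.

```cpp
#include <memory>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: returns true if every check on the moved-from
// objects below holds.
bool moved_from_states_hold() {
    // unique_ptr: the source is guaranteed to be null after the move.
    auto p = std::make_unique<int>(7);
    auto q = std::move(p);
    bool ptr_is_null = (p == nullptr) && (*q == 7);

    // optional: the source stays engaged; only the contained string
    // is in a moved-from state.
    std::optional<std::string> a{"some text"};
    std::optional<std::string> b = std::move(a);
    bool optional_still_engaged = a.has_value() && b.has_value();

    // vector: the contents of v are unspecified after the move, so we do
    // not look at them. Operations without preconditions are still fine.
    std::vector<int> v{1, 2, 3};
    std::vector<int> w = std::move(v);
    v.clear();        // put v back into a known state
    v.push_back(42);
    bool vector_reusable = (v.size() == 1) && (w.size() == 3);

    return ptr_is_null && optional_still_engaged && vector_reusable;
}
```

Calling `moved_from_states_hold()` should return `true`. Tools such as clang-tidy will typically flag the reads of `p` and `a` as use-after-move, which matches the "don't do this" rule of thumb above.
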
jandrewrogers 9 hours ago

It may seem inelegant but it pretty cleanly addresses real edge cases in systems-y code where the object necessarily has a shorter lifetime than its associated memory for strict safety.
In these cases, something resembling "use after free" is unavoidable if the move is destructive even though the object is semantically dead. Putting the object in a sentinel state where it is alive but not usable captures the semantics of deferred destruction pretty well.
- nixpulvis 9 hours ago
  
  In practice it just means a lot of sentinel values and unuseful potential accesses.
  - jandrewrogers 9 hours ago
    
    Not really. In the vast majority of cases this is all elided. There is no cost, either in lines of code or runtime.
    When you are decoupling memory lifetimes from object lifetimes it is pretty explicit. It isn't like this sneaks into your code on its own. You have to manage the implications of it yourself.
    
    nixpulvis 9 hours ago
    
    It's not about cost of correct code, it's about accidental wrong code.
    
    jandrewrogers 7 hours ago
    
    That’s really not a thing. By default it is correct, you have to go out of your way to make it incorrect.

halayli an hour ago

lvalue/rvalue are not defined by their movability. Value categories are about identity vs. non-identity. A prvalue is a pure result that has no identity. You cannot reference it by name, and everything else is a consequence of that.

stonemetal12 11 hours ago

This is like saying 1 + 2 isn't addition because the compiler will optimize it away. It isn't an addition instruction in the emitted code but logically speaking it is addition.

Similarly, the fact that a compiler may optimize a prvalue away doesn't change the fact that a prvalue is, by definition of the language, a temporary.

Sesse__ 11 hours ago

The article specifically points out that this isn't about optimization. A temporary will not be created even with -O0 (you can observe this by putting logging into the copy and move constructors).
- quuxplusone 11 hours ago
  
  Or even =delete'ing them or (carefully) putting static_asserts inside them. They're not called, not instantiated, not nothing.
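
Both observations in this subthread can be tried with a minimal sketch, assuming C++17 or later. All type and function names here are made up for illustration. `Logged` counts copy and move constructor calls, and `Pinned` deletes both constructors entirely.

```cpp
// Hypothetical type that counts copy and move constructions.
struct Logged {
    static inline int copies = 0;
    static inline int moves = 0;
    Logged() = default;
    Logged(const Logged&) { ++copies; }
    Logged(Logged&&) noexcept { ++moves; }
};

// Hypothetical type that can be neither copied nor moved.
struct Pinned {
    int value;
    explicit Pinned(int v) : value(v) {}
    Pinned(const Pinned&) = delete;
    Pinned(Pinned&&) = delete;
};

Logged make_logged() { return Logged{}; }    // returns a prvalue
Pinned make_pinned() { return Pinned{42}; }  // well-formed since C++17

// Returns how many copy or move constructor calls were observed.
int count_constructor_calls() {
    Logged::copies = 0;
    Logged::moves = 0;
    Logged a = make_logged();             // initialized directly from the prvalue
    Logged b = Logged(Logged(Logged()));  // still a single default construction
    (void)a;
    (void)b;
    return Logged::copies + Logged::moves;
}

// Initializing from a prvalue works even though Pinned cannot be moved.
int pinned_value() {
    Pinned p = make_pinned();
    return p.value;
}
```

Under C++17 rules `count_constructor_calls()` returns 0 and `pinned_value()` returns 42 at any optimization level, because no copy or move constructor is selected in the first place.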

p0w3n3d 12 hours ago

C++ is so complicated that I almost failed my exam; a few years later I had to relearn C and get some experience in a real business project before I could start learning C++.

I find that understanding how memory is laid out in an executable and how C works in terms of the stack, symbols, etc. is the introductory knowledge I had to obtain to even think about C++. Not sure what's there now, because I recently saw some great compiler warnings, but I'm pretty sure that I converted a prvalue to a pointer reference (&) at least once in my life and later ran into memory problems, with no compiler errors.

saghm 12 hours ago

Getting failures later after coercing something to a reference is even easier than that; just dereference a null pointer when passing to an argument that takes a reference; no warnings or errors! https://godbolt.org/z/xf5d9jKeh
- aidenn0 4 hours ago
  
  The funny thing is that if you enable ubsan and -O, it optimizes it to unconditionally call __ubsan_handle_type_mismatch_v1; I wonder if it would be tractable to warn when emitting unconditional calls to ubsan traps...
- actionfromafar 10 hours ago
  
  If you are working with C++ in this day and age, regardless of which compiler you use to output your actual binaries, you really owe it to yourself to compile as many source files as possible with other diagnostic tools, foremost clang-tidy.
  It will catch that and a lot of iffy stuff.
  If you want to go deeper, you can also add your own diagnostics, which can have knowledge specific to your program.
  https://clang.llvm.org/extra/clang-tidy/QueryBasedCustomChec...
- TuxSH 10 hours ago
  
  Interestingly, GCC (but not Clang) detects the UB but doesn't warn about it, and emits "ud2" with -O2: https://godbolt.org/z/61aYox7EP

dzdt 7 days ago

[Edited] For anyone like me stuck on his language: the phrase "move from" should be understood as a technical term loosely related to the English language meaning of the words. I think the post would be better if he explained this terminology; as it is you have to know an awful lot about the topic he is writing about to even parse what he is saying.

There is a pretty good Stack Overflow post that quuxplusone linked below. How they explain it:

  Moving from lvalues

  Sometimes, we want to move from lvalues. That is, sometimes we want the compiler to treat an lvalue as if it were an rvalue, so it can invoke the move constructor, even though it could be potentially unsafe. For this purpose, C++11 offers a standard library function template called std::move inside the header <utility>. This name is a bit unfortunate, because std::move simply casts an lvalue to an rvalue; it does not move anything by itself. It merely enables moving. Maybe it should have been named std::cast_to_rvalue or std::enable_move, but we are stuck with the name by now.
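
The quoted point that `std::move` is only a cast can be illustrated with a short sketch, assuming C++17. The function name `move_is_only_a_cast` is made up for illustration.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// std::move only changes the value category of the expression: applied to
// a std::string lvalue, its result type is std::string&&.
static_assert(std::is_same_v<decltype(std::move(std::declval<std::string&>())),
                             std::string&&>);

// Hypothetical demo: returns true if the checks described below hold.
bool move_is_only_a_cast() {
    std::string src = "hello";

    // A bare cast with no consumer: nothing is moved, src is untouched,
    // and the reference still names the very same object.
    std::string&& ref = std::move(src);
    bool same_object = (&ref == &src);
    bool untouched = (src == "hello");

    // The actual move happens only when a move constructor (or move
    // assignment operator) receives the rvalue.
    std::string dst = std::move(src);
    bool transferred = (dst == "hello");  // src is now valid but unspecified

    return same_object && untouched && transferred;
}
```

`move_is_only_a_cast()` should return `true`: the cast alone leaves `src` intact, and only the construction of `dst` performs the transfer.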

quuxplusone 5 days ago

"Move" in the sense of https://stackoverflow.com/questions/3106110/what-is-move-sem...
Now, if you don't know what "move semantics" is, then "lvalues can't be moved from" isn't terribly helpful, and if you do then it's tautological, so I'm not saying you're wrong to criticize. :) But in a C++ context, "move" does have a single specific meaning — the one he's using properly if opaquely-to-non-C++ers.
cjensen 13 hours ago

He has a good article on that at [1]
But here's the gist: sometimes you have an object you want to copy, but then abandon the original. Maybe it's to return an object from a function. Maybe it's to insert the object into a larger structure. In these cases, copying can be expensive and it would be nice if you could just "raid" the original object to steal bits of it and construct the "copy" out of the raided bits. C++11 enabled this with rvalue references, std::move, and rvalue reference constructors.
This added a lot of "what the hell is this" to C++ code and a lot of new mental-model stuff to track for programmers. I understand why it was all added, but I have deep misgivings about the added complexity.
[1] https://blog.knatten.org/2018/03/09/lvalues-rvalues-glvalues...
- neonz80 11 hours ago
  
  I find that this can reduce overall complexity. It makes it possible to use objects that can not be copied (such as a file descriptor wrapper) and moving can in most cases not fail. Without move semantics you'd have to use smart pointers to get similar results but with extra overhead.
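
As a rough illustration of a move-only resource wrapper, here is a sketch assuming C++17. `Handle`, `open_handle`, and `handles_open_after_demo` are hypothetical names, and a static counter stands in for real OS calls. The move operations mark the source as empty with `-1`, and the destructor honours that marker.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for a file-descriptor wrapper. Instead of calling
// the OS, it tracks how many handles are currently open.
class Handle {
public:
    static inline int open_count = 0;

    explicit Handle(int id) : id_(id) { ++open_count; }

    Handle(const Handle&) = delete;  // copying would release the resource twice
    Handle& operator=(const Handle&) = delete;

    // Move: take over the id and mark the source as empty (-1).
    Handle(Handle&& other) noexcept : id_(std::exchange(other.id_, -1)) {}
    Handle& operator=(Handle&& other) noexcept {
        if (this != &other) {
            release();
            id_ = std::exchange(other.id_, -1);
        }
        return *this;
    }

    // The destructor respects the empty marker set by the move operations.
    ~Handle() { release(); }

    bool valid() const { return id_ != -1; }
    int id() const { return id_; }

private:
    void release() {
        if (id_ != -1) {
            --open_count;
            id_ = -1;
        }
    }
    int id_;
};

// Hypothetical factory: ownership leaves the function without any copy.
Handle open_handle(int id) { return Handle{id}; }

// Returns the number of handles still open after the scope below ends.
int handles_open_after_demo() {
    {
        Handle a = open_handle(3);
        Handle b = std::move(a);  // a is now empty, b owns id 3
        assert(!a.valid() && b.valid() && b.id() == 3);
        assert(Handle::open_count == 1);
    }  // both destructors run; only b releases the resource
    return Handle::open_count;
}
```

`handles_open_after_demo()` should return 0. No smart pointer or heap allocation is involved, but the compiler enforces none of this: the empty marker is a convention between the move operations and the destructor.
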
aidenn0 4 hours ago

Presumably, if you already (mistakenly) believe that a prvalue is a temporary, then you probably have at least a vague idea of C++ move semantics. If you don't already believe that then you are probably not the audience for the article.
Conscat 13 hours ago

"move" means to pass into an r-value reference function parameter, for instance a move constructor, move assignment operator, or forwarding reference.
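
Which overload an argument selects can be shown with a small sketch, assuming C++14 or later. `Picked`, `take`, and the `from_*` helpers are made-up names. The overload pair is analogous to a copy constructor and a move constructor, using `int` for brevity.

```cpp
#include <utility>

enum class Picked { ConstLvalueRef, RvalueRef };

// Hypothetical overload pair, analogous to a copy and a move constructor.
constexpr Picked take(const int&) { return Picked::ConstLvalueRef; }
constexpr Picked take(int&&) { return Picked::RvalueRef; }

// An lvalue argument binds to the const lvalue reference overload.
constexpr Picked from_lvalue() {
    int n = 1;
    return take(n);
}

// std::move(n) is an xvalue, so the rvalue reference overload wins.
constexpr Picked from_xvalue() {
    int n = 1;
    return take(std::move(n));
}

// A prvalue also binds to the rvalue reference overload.
constexpr Picked from_prvalue() { return take(1 + 1); }

// A named rvalue reference is itself an lvalue expression.
constexpr Picked from_named_rref() {
    int n = 1;
    int&& r = std::move(n);
    return take(r);
}

static_assert(from_lvalue() == Picked::ConstLvalueRef, "lvalue");
static_assert(from_xvalue() == Picked::RvalueRef, "xvalue");
static_assert(from_prvalue() == Picked::RvalueRef, "prvalue");
static_assert(from_named_rref() == Picked::ConstLvalueRef, "named rvalue ref");
```

The last case is the one that tends to surprise people: once an rvalue reference has a name, passing it on as an rvalue needs another `std::move` (or `std::forward` in a template).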

malkia 10 hours ago

To this day, I'm still having trouble remembering these names: lvalue, rvalue, prvalue, xvalue, glvalue.

While there was only lvalue, and rvalue - it was easy - LEFT and RIGHT - gives you the right intuition.

But now - if it has identity - "glvalue" - why not "ivalue" or "idvalue"?

And then "can be moved" is "rvalue"? Why "rvalue", why not "mvalue" or "move-value"? Why does the language have to be so constricted when it comes to such an important bit? I mean, we spell "value" fully, but miss the important bit...

Anyway just a rant... I'm having the same issue understanding "math" sometimes because of all the cryptic notations, and archaic symbols used.

kccqzy 10 hours ago

I agree the names are confusing. But it's just a simple matter of having a diagram nearby to remember which is which. (I think I first saw that diagram on Stack Overflow.)
There have been a lot of times in science when I have trouble remembering names but have no trouble at all understanding the concepts behind these names. I just keep an index card nearby. I noticed this tendency of mine as early as high school. For example, in chemistry I sometimes couldn't remember which is dextro and which is levo, but I understand chirality and know how they are different. In physics I sometimes forget which is the magnetic B field and which is the magnetic H field, though I understand the difference between them. I don't use these concepts often so I haven't internalized the names. I think it's totally alright to have a name–concept dissociation for these.
- malkia 10 hours ago
  
  I completely get that the language science has developed works great for its own practitioners, but not really well for outsiders.
  Good luck explaining these "*values" to someone that hasn't touched C++ in a while. Then again other languages have the same peculiarities though. meh.
spacechild1 10 hours ago
I think they wanted to preserve the meaning of the existing two value types. The naming is still bad, though, because it's inconsistent. For example, why should 'xvalue' belong to 'generalized lvalue' when it also belongs to 'rvalue'? Aren't 'l' and 'r' supposed to be opposites on the same dimension (movability)?
Here's a suggestion in retrospect:
```
         | ivalue  | pvalue
         ---------------------
  lvalue | ilvalue | (plvalue)
  rvalue | irvalue | prvalue
```
The current naming is confusing exactly because 'lvalue' should be a supertype in one dimension, just like 'rvalue', and not a subtype (here: 'ilvalue'). I think they took this shortcut because 'plvalue' doesn't really exist, but it's still inconsistent. Let's not even talk about 'xvalue' (here: 'irvalue').
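
For readers who want to check which cell of the table an expression falls into, here is a minimal sketch assuming C++17. `Category`, `category_of`, `has_identity`, `is_rvalue`, `name`, and `make` are made-up names. It uses the standard behaviour of `decltype((expr))`: `T&` for an lvalue, `T&&` for an xvalue, and plain `T` for a prvalue.

```cpp
#include <string>
#include <type_traits>
#include <utility>

enum class Category { Lvalue, Xvalue, Prvalue };

// Map the type produced by decltype((expr)) to the primary category.
template <class T>
constexpr Category category_of =
    std::is_lvalue_reference_v<T>   ? Category::Lvalue
    : std::is_rvalue_reference_v<T> ? Category::Xvalue
                                    : Category::Prvalue;

// The two mixed categories, expressed in terms of the primary ones.
constexpr bool has_identity(Category c) { return c != Category::Prvalue; }  // glvalue
constexpr bool is_rvalue(Category c) { return c != Category::Lvalue; }      // rvalue

std::string name = "x";             // hypothetical variable
std::string make() { return "y"; }  // hypothetical function returning by value

static_assert(category_of<decltype((name))> == Category::Lvalue);
static_assert(category_of<decltype((std::move(name)))> == Category::Xvalue);
static_assert(category_of<decltype((make()))> == Category::Prvalue);
static_assert(category_of<decltype((name + "!"))> == Category::Prvalue);
static_assert(category_of<decltype((name[0]))> == Category::Lvalue);  // char&

// An xvalue sits in both mixed categories, which is the overlap the
// table calls 'irvalue'.
static_assert(has_identity(Category::Xvalue) && is_rvalue(Category::Xvalue));
```

In the table's terms, `Lvalue` corresponds to 'ilvalue', `Xvalue` to 'irvalue', and `Prvalue` to 'prvalue'; no expression lands in the '(plvalue)' cell.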

Night_Thastus 13 hours ago

Interesting stuff! I knew about lvalues and rvalues but I never knew about concepts like "glvalue" or "prvalue" or "xvalue" that the linked page talks about.

It makes sense that C++ avoids unnecessary copying or object creation whenever possible, that's pretty much C++'s M.O.

im3w1l 13 hours ago

> People sometimes call this “a temporary”, but, as is the main point of this article, that’s not necessarily true.

Old habits die hard? It used to always create a temporary right?

gpderetta 12 hours ago

> It used to always create a temporary right?
Temporaries could still be elided in some cases before, but the semantics were still understood in terms of temporary objects.
Now some forms of elision are mandatory and elision+RVO semantics are understood as objects being directly created into the final named location.
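
The "directly created into the final named location" reading can be observed with a small sketch, assuming C++17 or later. `Tracker`, `make_tracker`, and `constructed_in_place` are made-up names. The type records the address it was constructed at, and the deleted copy and move constructors rule out any intermediate object.

```cpp
// Hypothetical type that remembers where it was constructed.
struct Tracker {
    const Tracker* born_at;
    Tracker() : born_at(this) {}
    Tracker(const Tracker&) = delete;  // no copy and no move, so the object
    Tracker(Tracker&&) = delete;       // cannot be relocated after construction
};

// Returns a prvalue; well-formed since C++17 despite the deleted constructors.
Tracker make_tracker() { return Tracker{}; }

// True if the constructor ran on the storage of the named variable itself.
bool constructed_in_place() {
    Tracker t = make_tracker();
    return t.born_at == &t;
}
```

Under C++17 rules `constructed_in_place()` returns `true`: the constructor called inside `make_tracker` runs directly on the storage of `t`.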

gnabgib 7 hours ago

Discussion (40 points, 6 days ago, 38 comments) https://news.ycombinator.com/item?id=45770577

tomhow 7 hours ago

Not exactly. This was originally submitted 6 days ago, and was put in the SCP but took till now to get on the front page. Just as the discussion was developing, the original submission "expired" and disappeared from the front page.
Given that it's an active discussion with all comments posted in the past few hours, I've created a new copy to give it its full opportunity for visibility and discussion.
- gnabgib 7 hours ago
  
  Ah, well that's a strange maneuver (it wasn't a vote-less or discussion-less post before the SCP and you moving the comments)
  - tomhow 7 hours ago
    
    Only two of the upvotes and one of the comments were from before today, and it had no front page time until today. It's a lot stranger for a post to disappear completely from the rankings, just as the discussion is developing.