The Wayback Machine - https://web.archive.org/web/20200928032054/https://github.com/cuberite/cuberite/pull/4689
Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove blocking GetHeight #4689

Open
wants to merge 8 commits into
base: master
from

Conversation

@peterbell10
Copy link
Member

peterbell10 commented Apr 22, 2020

Fixes #4688

As the issue shows, blocking GetHeight is very dangerous because we can't reliably unlock the chunk map mutex. Instead, we should return optional<int> and figure out how to respond when the chunk isn't loaded at the call site.

Since we don't have c++17, I'm just imitating the std::optional API with cpp17::optional.

WIP to figure out what to do with block handlers, entities and lua api which all use GetHeight either directly or indirectly.

  • Biggest use case is IsWeatherWetAtXYZ which doesn't really matter that much. We can probably get away with using a default value there.
  • Lua API is tricky because we can't make the descision for the plugin. Either we need to figure out something clever, or just change the API and raise an error.
  • Similarly, the spawn height is something that we can't just guess. One thought would be to use cChunkMap::GenerateChunk with a callback that sets the spawn. That might leave a gap where the spawn isn't set though which wouldn't be great.
@peterbell10 peterbell10 force-pushed the peterbell10:nonblocking-get-height branch from 8c15239 to 454c6aa Apr 23, 2020
src/Blocks/WorldInterface.h Outdated Show resolved Hide resolved
@tigerw
Copy link
Member

tigerw commented Apr 23, 2020

I'm not keen on an ad-hoc introduction of optional. It would be better if we prevented things that are ticked under the chunkmap lock from calling back into world. For example, the farmland handler by rights should be able to determine if its wet solely from the cChunk it gets.

Ideally GetHeight would simply return 0 when the chunk isn't loaded, as IsWeatherWetAt should return the global weather state. This is both consistent with what GetBlock does, as well as the physical interpretation that if the chunk is empty it's gonna be exposed to the sky and have no height.

@peterbell10
Copy link
Member Author

peterbell10 commented Apr 23, 2020

It would be better if we prevented things that are ticked under the chunkmap lock from calling back into world.

I don't think that's relevant to GetHeight. If they went through cChunk::GetNeighbourChunk for example, they still need to deal with missing chunks. Avoiding reentering into cWorld is also made impossible by plugin hooks where cWorld may be the only way to write it.

the farmland handler by rights should be able to determine if its wet solely from the cChunk it gets.

Clearly not, or cWorld::GetHeight would always succeed. The problem is that it checks for rain on neighbouring blocks.

Ideally GetHeight would simply return 0 when the chunk isn't loaded, as IsWeatherWetAt should return the global weather state. This is both consistent with what GetBlock does, as well as the physical interpretation that if the chunk is empty it's gonna be exposed to the sky and have no height.

This comes back to @madmaxoft's point in #4680 (comment). Failing silently is not good because we can get incorrect behaviours as a result. e.g. a block might fail a CanBeAt check because of the default value and drop when it shouldn't.

As for GetBlock, I think it should return optional as well. A missing chunk should be treated specially, not just as air.

@tigerw
Copy link
Member

tigerw commented Apr 23, 2020

In response to not failing silently, GetBlock returns optional etc:

Why? It's practically ingrained into the fabric of Minecraft that chunks might be unloaded. If it's absolutely required that you need the data, use a chunkloader. And what do unloaded chunks contain, if not air?

About incorrect CanBeAt, I think it would suffice to stop ticking chunks with unloaded neighbours, or better yet when getting neighbours directly from a cChunk, simply "failing safe" and doing nothing.

Certainly the farmland handler is strange though, (unless I'm totally blind xD) the call to IsWeatherWetAt doesn't seem to modify a_RelPos at all, and yet it couldn't find its own chunk? wat?

Agreed with the point about going through cWorld in the general case; given that it's not feasible to remove the lock right now GetHeight has to change.

@peterbell10
Copy link
Member Author

peterbell10 commented Apr 23, 2020

It's practically ingrained into the fabric of Minecraft that chunks might be unloaded.

Returning nullopt instead of air doesn't mean that unloaded chunks are no longer "ingrained into the fabric of Minecraft"; it just means we can differentiate between "unloaded" vs "loaded and has air at this block".

If it's absolutely required that you need the data, use a chunkloader. And what do unloaded chunks contain, if not air?

That's not the same at all: a chunk stay gives you the accurate data 100% of the time without failing. GetBlock fails silently giving you inaccurate data. optional is just encoding that failure into the return value. If treating it as air is fine, then just call GetBlock().value_or(E_BLOCK_AIR) and it's the same as before.

better yet when getting neighbours directly from a cChunk, simply "failing safe" and doing nothing.

That's exactly what this change allows. You can now fail safe when going through cWorld.

Certainly the farmland handler is strange though, (unless I'm totally blind xD) the call to IsWeatherWetAt doesn't seem to modify a_RelPos at all, and yet it couldn't find its own chunk? wat?

You're right. The original code was using a_RelPos as an absolute coordinate for IsWeatherWetAtXYZ so it was a bug that it went into different chunks. In general, it's still true that blocks depend on their neighbours though.

@tigerw
Copy link
Member

tigerw commented Apr 23, 2020

I disagree that getting the height of, or a block at an unloaded chunk and receiving 0/air is a failure. The concept is meaningless when for any typical server, GetBlock would 'fail' for most of its inputs.

I argue that, if one needs something more descriptive:

  1. Ask if the chunk is valid, and bail if not - it is after all the purpose of chunking
  2. Use a call that returns whether a chunk was there
  3. Use a chunkstay

To be clear I'm absolutely not against something like TryGetHeight or GetBlockTypeMeta which return bools. My main concern is the introduction of an optional type and the inconsistency this brings with regards to everything that already exists.

@peterbell10
Copy link
Member Author

peterbell10 commented Apr 23, 2020

I disagree that getting the height of, or a block at an unloaded chunk and receiving 0/air is a failure. The concept is meaningless when for any typical server, GetBlock would 'fail' for most of its inputs.

I disagree with the premise. Failure doesn't have to be rare. A compiler for example has an infinite number of input files that would fail to parse. That doesn't make it's any less of a failure.

Whether you want to call it failure or not, distinguishing a loaded from an unloaded chunk is useful. Having to call a separate IsValid function first is just making the API more awkward.

My main concern is the introduction of an optional type and the inconsistency this brings with regards to everything that already exists.

The bool TryGet functions are completely equivalent to returning optional. It's entirely historical and now that we have better tools, we would move away from it.

@tigerw
Copy link
Member

tigerw commented Apr 23, 2020

About the last point, surely not. Migrating piecemeal to optional in getheight, when things like getblock will effectively never change doesn't seem right to me. (also I believe xoft strongly dislikes underscores, so value_or is questionable :p). We should strongly prefer to maintain consistency. If they are equivalent, I would really urge that GetHeight be made not to block its caller and load chunks, TryGetHeight left as-is, and we should figure out why the farmland call tried to load any chunks in the first place.

@tigerw
Copy link
Member

tigerw commented Apr 23, 2020

I am hugely concerned about consistency since this is still essentially a small project. A few years ago the cWorldInterface/ChunkInterfaces were introduced, but there was a lack of developer manpower to drive adoption or maintain them. This means that now multiple places have cWorldInterface and a cWorld, or both world and chunk interfaces etc. I feel keeping things simple is our best bet.

@peterbell10
Copy link
Member Author

peterbell10 commented Apr 24, 2020

We should strongly prefer to maintain consistency.

I'm reminded of "a foolish consistency"

figure out the best way to express something and always express it the same way.

Until you learn a better way. Then embrace the improvement. This inconsistency is called growth.

I agree that consistency is good, but only if the pattern you're following is consistently good.

We already have several different ways of encoding "this operation might fail". Many follow the pattern of TryGetHeight: start with Try, return a bool and return the actual value with an out-parameter. However, some actually start with Unbounded and others like GetBlockTypeMeta don't give any hint they can fail.

Many callers forget to check the return value. When the function does fail, you end up using an unassigned value. If we step outside of cWorld, cuberite also has a common pattern of returning std::pair<bool, T> (or std::pair<T, bool>). Clearly looking for something like optional.

So, there is currently very little consistency anyway. And we should not just aim for consistency, but to consistently use the best option available to us. Not just the one that's most common now.

I propose we use optional moving forward because it clearly expresses the intent and embeds it in semantics of the type. It forces the programmer to acknowledge the failure case and makes it visible in code review. You might forget that GetBlockTypeMeta can fail, but you can't forget to dereference an optional. Otherwise, you would get a compile error.

@tigerw
Copy link
Member

tigerw commented Apr 24, 2020

The consistency of naming, or whether some function returns success in the first place can be discussed on another day, and besides, getblocktypemeta on a chunk cannot fail, I thought?

It's also not like our caller can't call value directly on the optional and end up with a bogus read in Release. Plus what is and isn't caught in code review is a complete unknown quantity, there's no evidence either way to suggest which is more visible.

I think the discussion has boiled down to a philosophical debate about the merits of bool vs optional, since from a technical perspective they both work. The main problem to be fixed was making getheight non-blocking, which you've done.

@tigerw
Copy link
Member

tigerw commented Apr 24, 2020

But wait....
there's a new challenger on the scene.

What about: returning -1 for a failed getheight?

This neatly sidesteps you concern about 0 being overloaded!

@peterbell10
Copy link
Member Author

peterbell10 commented Apr 24, 2020

It's also not like our caller can't call value directly on the optional and end up with a bogus read in Release. Plus what is and isn't caught in code review is a complete unknown quantity, there's no evidence either way to suggest which is more visible.

I find this argument completely absurd. If you see Height.value() in code review, it's obvious that it's an optional. If you see m_World->GetBlockTypeMeta, there is absolutely no hint that it can fail. The proof that this will get past code review, is the fact that there are hundreds of places in cuberite where similar functions are called without checking for failure. Clearly the status quo isn't working.

What about: returning -1 for a failed getheight?

That has the exact same problem. It isn't obvious it can fail, so people will forget to check for failure.

@tigerw
Copy link
Member

tigerw commented Apr 24, 2020

Please bear with me, I like a technical debate :)

I was suggesting that if it gets past review, optional does nothing for the "failure" case where an empty chunk is interpreted as containing air. About visibility during review, I argue that just as you know .value implies optional, another hypothetical reviewer knows GetBlockTypeMeta implies a boolean return value, by virtue of knowing the signature of one of the most used class of functions.

And about the status quo not working, I haven't seen evidence of this. The deadlock was caused by GetHeight improperly blocking.

And your final point gets back to the definition of failure, which we couldn't agree on.

hm, what about something like

Unloaded chunk is failure Like in C#
unsigned GetHeight bool TryGetHeight out H
BLOCKTYPE GetBlock bool TryGetBlock out B
NIBBLETYPE GetMeta bool TryGetMeta out N
pair GetBlTyMe bool TryGetBlTyMe out B, out N

This way, if you know that your call always will succeed - no overhead, no checks, no visually jarring value_or, pay for what you use. But if you're wrong, assertion in debug, or fails your plugin. On the other hand, TryXX, as in C#. With bools, or, if you really insist, optional.

@tigerw
Copy link
Member

tigerw commented Apr 24, 2020

Then, ensure in the main server code that entities/blocks on chunk boundaries aren't ticked, so they can never make a failing call.

@peterbell10
Copy link
Member Author

peterbell10 commented Apr 24, 2020

@madmaxoft as benevolent dictator, do you have any preference on bool TryGet vs optional<T> Get?

@madmaxoft
Copy link
Member

madmaxoft commented Apr 24, 2020

I'm in favor of Optional<T>; however, I don't like its underscore-y API (just as @tigerw predicted). I know it's mimicking the API of the C++17's std::optional, but still...

So, basically, two questions:

  • Why aren't we using C++17 already? If there's no blockers, I'd like to go that way, use the std::optional (and auto-ties :) Perhaps build servers? @bearbin ?
  • If that's impossible, can we make the symbol names consistent with the rest of the code (cClass::FunctionName)? Wrapping a std::optional into this API should be easy once we do move to C++17.
@madmaxoft
Copy link
Member

madmaxoft commented Apr 24, 2020

The only problem I could think of with GetHeight getting this overhaul is that it used to be advertised in the Lua API as the "ideal way to make sure a chunk is loaded / generated" (from the times before ChunkStay).

@bearbin
Copy link
Member

bearbin commented Apr 24, 2020

I think build servers are the only reason why we aren't using C++17. It should not be too difficult to change though, all that's required is to manually create a new platform definition and cross-compiler pack for the raspberry pi builds.

In terms of the actual substance, I strongly dislike introducing Optional<T> which is non-standard and used in only one place; but the general principle of optionals and using the standard C++17 form is something I very much support. So my feeling is that if we are in the mood for a larger refactor, we should make everything use the optionals; if not just change to return a boolean and output reference signature.

@tigerw
Copy link
Member

tigerw commented Apr 24, 2020

One other concern (trustme is guaranteed to always be passed not 0):

#include <optional>

std::optional<int> get(int x)
{
    if (x == 0)
    {
        return {};
    }

    return 1;
}

int trustme(int x) noexcept
{
   return *get(x); 
}

compiles to

trustme(int):
        mov     eax, 1
        test    edi, edi
        jne     .L5
        xor     eax, eax
        mov     BYTE PTR [rsp-1], 0
        mov     BYTE PTR [rsp-4], 0
        mov     WORD PTR [rsp-3], ax
        mov     eax, DWORD PTR [rsp-8]
.L5:
        ret

and calling .value instead

#include <optional>

std::optional<int> get(int x)
{
    if (x == 0)
    {
        return {};
    }

    return 1;
}

int trustme(int x) noexcept
{
   return get(x).value(); 
}

is even worse

trustme(int):
        sub     rsp, 8
        test    edi, edi
        je      .L11
        mov     eax, 1
        add     rsp, 8
        ret
trustme(int) [clone .cold]:
.L11:
        mov     edi, 8
        call    __cxa_allocate_exception
        mov     edx, OFFSET FLAT:std::bad_optional_access::~bad_optional_access() [complete object destructor]
        mov     esi, OFFSET FLAT:typeinfo for std::bad_optional_access
        mov     QWORD PTR [rax], OFFSET FLAT:vtable for std::bad_optional_access+16
        mov     rdi, rax
        call    __cxa_throw

Compare with

#include <cassert>

int get(int x)
{
    if (x == 0)
    {
       assert(!"you goofed");
    }

    return 1;
}

int trustme(int x) noexcept
{
   return get(x); 
}

on Release

trustme(int):
        mov     eax, 1
        ret

Are you sure you want to pay this penalty for every Get, even those you either know will absolutely succeed, in Debug and Release? Not to mention littering every call with either an asterisk or .value()

Paying only for what you use is a core C++ principle after all.

@peterbell10
Copy link
Member Author

peterbell10 commented Apr 25, 2020

Why aren't we using C++17 already? If there's no blockers, I'd like to go that way, use the std::optional (and auto-ties :) Perhaps build servers? @bearbin ?

I would love to move to C++17. I believe, as @bearbin said, the main reason was that the raspberry pi toolchain we use is too old. If that's something we can fix, then absolutely go for that.

If that's impossible, can we make the symbol names consistent with the rest of the code

I was going by the precedent of cpp14::make_unique that std compatibility features should follow the standard naming. Hopefully we will upgrade our toolchains at some point. When that happens, consistent naming should make it easier to convert. I don't have a strong opinion on this though.

@tigerw that's not a fair comparison. You compile an assert in release mode and it compiles to nothing. That's obviously true by the definition of assert. A fair comparison would return bool from get and use an out param. Or, at the very least you would use VERIFY instead of assert.

@tigerw
Copy link
Member

tigerw commented Apr 25, 2020

xoft: yeah Lua might need to be broken as peterbell points out, unless there's a way to detect the condition and unlock the mutex twice?

peterbell: I was thinking of the hot-path plain int returning GetHeight/GetBlock; of course a similar TryGetX would incur the same penalty as the return value needs to be checked. Wouldn't you agree that it would be good for callers who know the call will always succeed (loaded chunk, chunkstay, player is close, previous call succeeded) to have the choice of using the direct version, instead of checking optional?

@peterbell10
Copy link
Member Author

peterbell10 commented Apr 25, 2020

Here is a fairer comparison:
https://gcc.godbolt.org/z/TnRfrJ

The additional cost is 3 extra instructions. Accessing the chunk storage arrays in the first place is quite likely to miss the CPU cache, in which case memory accesses can be 100x more expensive than a normal instruction. So, I genuinely don't think this will be important performance-wise.

If you do really want direct access with minimal overhead (say for example to implement cBlockArea) then you would be better off accessing the cChunk object directly with cChunk::GetBlock etc.

In general, it's also very tricky to know for sure that a chunk will be valid. For example, a player being close means nothing if the server is under high load and the chunk generator is going slowly. Chunk stays also don't quite guarantee the chunk will be present, because of #3572 (comment).

Copy link
Member

madmaxoft left a comment

The Lua API docs for cWorld:GetHeight needs to be updated, too.

src/Bindings/ManualBindings_World.cpp Outdated Show resolved Hide resolved
@madmaxoft
Copy link
Member

madmaxoft commented Apr 26, 2020

None of the optionals are in such a hot path that any of this would matter greatly. I'd say merge this PR in, then work on getting C++17 work, then remove the optional and make_unique scaffolding.

@peterbell10 peterbell10 force-pushed the peterbell10:nonblocking-get-height branch from d13b72b to 56ad5c0 Apr 27, 2020
src/World.cpp Outdated Show resolved Hide resolved
@peterbell10 peterbell10 force-pushed the peterbell10:nonblocking-get-height branch from 93008fa to 8a968b5 Apr 27, 2020
@peterbell10 peterbell10 marked this pull request as ready for review Apr 27, 2020
@peterbell10 peterbell10 reopened this Apr 27, 2020
@peterbell10 peterbell10 force-pushed the peterbell10:nonblocking-get-height branch from a94905e to bf44e26 Apr 27, 2020
@peterbell10 peterbell10 changed the title WIP: Remove blocking GetHeight Remove blocking GetHeight Apr 27, 2020
@peterbell10 peterbell10 closed this May 7, 2020
@peterbell10 peterbell10 reopened this May 7, 2020
@peterbell10 peterbell10 requested a review from madmaxoft May 8, 2020
@peterbell10 peterbell10 force-pushed the peterbell10:nonblocking-get-height branch from 51b79dd to 0590ce4 May 9, 2020
@peterbell10 peterbell10 force-pushed the peterbell10:nonblocking-get-height branch from 0590ce4 to 98b778c May 13, 2020

// Call the implementation:
auto IsWet = self->IsWeatherWetAtXYZ(BlockPos);
L.Push(IsWet.value_or(true));

This comment has been minimized.

@bearbin

bearbin Aug 5, 2020 Member

not raining strikes me as a more useful default, in any case this should be documented in the API docs.

This comment has been minimized.

@peterbell10

peterbell10 Aug 9, 2020 Author Member

It only fails if it's raining in the chunk but the terrain is not loaded so it can't tell where the rain is falling. On the client that would render as rain falling all the way to void i.e. wet everywhere.

This comment has been minimized.

@peterbell10

peterbell10 Aug 9, 2020 Author Member

I could also add a Try variant if that's desirable.

This comment has been minimized.

@bearbin

bearbin Aug 10, 2020 Member

I guess it doesn't really matter that much, the issue shouldn't come up that often. It's just that everywhere else that IsWeatherWetAtXYZ is used, the default on nullopt is false, rather than true.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

4 participants
You can’t perform that action at this time.