Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revert "video: Prefer Wayland over X11 (take 2!)" #9345

Closed
wants to merge 1 commit into from

Conversation

Joshua-Ashton
Copy link
Contributor

@Joshua-Ashton Joshua-Ashton commented Mar 24, 2024

Description

This reverts commit f9f7db4

Wayland has a myriad of unresolved problems regarding surface suspension blocking presentation and the FIFO (vsync) implementation being fundamentally broken leading to reduced GPU-bound performance.

That is not to say "we should fix FIFO in Mesa/other drivers," but rather that it is completely unfixable without an additional protocol, in this case fifo-v11.

Without this protocol, vkQueuePresent or glSwapBuffers must stall for the 'frame' callback after presenting an image. The only reason we can get away with this on SteamOS is because Gamescope implements what is essentially fifo-v1 and we use that there.

The other side is surface suspension -- a very similar issue to the above wrt the frame callback being used in that way and blocking. If the SDL window is obscured, vkQueuePresent will block in FIFO, which games typically do NOT like. This is solved by the combination of fifo-v1 and commit-timing-v1.

There is no advantage to games and average applications preferring Wayland over X11 -- only severe performance and unusability regressions right now.
Thus, we must revert this change until fifo-v1 and commit-timing-v1 are released and at least in a stable release for major compositors.

@Conan-Kudo
Copy link
Contributor

If we do this, we are basically accepting these issues are unfixable for the next ten years (SDL4). Having this as the default in SDL3 (which isn't being used yet!) is important for signaling to other stakeholders that we actually do need to get solutions in place for detected issues.

@Joshua-Ashton
Copy link
Contributor Author

Joshua-Ashton commented Mar 24, 2024

If we do this, we are basically accepting these issues are unfixable for the next ten years (SDL4).

We are absolutely accepting they are unfixable right now yes, because they are -- but probably not for the next ten years.

We can flip the default later in SDL3's lifecycle, or even based on whether fifo-v1 and commit-timing are exposed in the wl_registry.

is important for signaling to other stakeholders that we actually do need to get solutions in place for detected issues.

SDL is not your tool for "signaling to stakeholders" about what is important. It's an actual library used by real developers and users!

You are suggesting we do those users and developers a disservice by forcing an fundamentally broken default onto them purely for your own ideological reasons.

It's really important as a developer is to be pragmatic and separate your own desires from what actually benefits your customers/users.

@pac85
Copy link
Contributor

pac85 commented Mar 24, 2024

If we do this, we are basically accepting these issues are unfixable for the next ten years (SDL4). Having this as the default in SDL3 (which isn't being used yet!) is important for signaling to other stakeholders that we actually do need to get solutions in place for detected issues.

  1. The issues pointed at by Joshua are fundamental, having broken FIFO by default on Wayland means shipping a native game with it is simply not viable and would make linux as a whole look less appealing to a game developer. I don't think what you propose is worth this risk and neither is creating a big practical problem for users of existing games that would get ported to SDL3.
  2. Whereas the issues talked by Joshua are very technical in nature and it is absolutely nontrivial for people experiencing it to determine it is due to Wayland, "SDL won't switch to Wayland until this is in place" IMHO sends a much clearer signal.

@slime73
Copy link
Contributor

slime73 commented Mar 24, 2024

Should this be an issue report (talking about the specific problems, and options for what to do about them) rather than a pull request (already picking a specific option seemingly without prior discussion from people involved in this)...?

@pac85
Copy link
Contributor

pac85 commented Mar 24, 2024

Should this be an issue report (talking about the specific problems, and options for what to do about them) rather than a pull request (already picking a specific option seemingly without prior discussion from people involved in this)...?

IMHO it wouldn't make much sense as the issue exists entirely outside of SDL, it has to do with how Vulkan and GL calls behave when running under native Wayland and since, at least in the case of Vulkan, those are calls are made by the user directly I really doubt there can be a different solution. In other words this is a Wayland WSI issue and not an SDL issue.

@slouken
Copy link
Collaborator

slouken commented Mar 24, 2024

Given that we just released the SDL 3.0 preview, I'm leaning towards making sure gamers and developers have a good experience. We can change this default again in the future.

@Joshua-Ashton, can you add a code comment briefly describing the issues and the criteria for making Wayland the default again?

@icculus, @Kontrabant, do you have any concerns before we merge this?

@slime73
Copy link
Contributor

slime73 commented Mar 24, 2024

IMHO it wouldn't make much sense as the issue exists entirely outside of SDL

It makes as much sense (or more) than a SDL pull request, I think. :)

I really doubt there can be a different solution.

I haven't heard anyone weigh in on how long it may take for those issues to be resolved in Wayland and in drivers, and whether their timelines would miss SDL3's full release or not.

There is no advantage to games and average applications preferring Wayland over X11

Personally I'm looking forward to improved DPI scale support.

@Conan-Kudo
Copy link
Contributor

Aside from the FIFO wayland protocol (which can be implemented whenever anyone wants), I don't have any references of upstream discussions related to these problems which point to the FIFO protocol or similar things. Personally, I didn't have any idea the protocol was so important until today.

@ximion
Copy link

ximion commented Mar 24, 2024

@Joshua-Ashton I understand your opinion and conclusion, but given that SDL3 has not yet been released, do you think any decision on this could be deferred until very shortly before the release?

I am coming from a different angle, of scientific applications and vendors of proprietary apps, and many do not want or can not support Wayland due to its limitations. And that will never change unless someone makes some noise about it, so I have a few protocols proposed to fix the raised issues. One annoying thing is that the affected app vendors do not want to engage with Wayland protocol discussions, so there's no weight or urgency behind the protocol additions.
Vendors just paper over the Wayland issues with crude workarounds or by simply continuing to use X11.
And I thing that's the thing where @Conan-Kudo is right.

So, maybe this decision can be deferred for now and instead the SDL project could throw its weight behind the respective Wayland protocols so they get implemented faster? If nothing happens in the required timeframe and if at SDL3's scheduled release time the issues still persist, you can for sure switch back to X11. But then the project has tried its best to collaborate and fix issues at the source, rather than papering over them too early.

This reverts commit f9f7db4

Wayland has a myriad of unresolved problems regarding surface suspension
blocking presentation and the FIFO (vsync) implementation being
fundamentally broken leading to reduced GPU-bound performance.

That is not to say "we should fix FIFO in Mesa/other drivers," but rather
that it is completely unfixable without an additional protocol, in this
case fifo-v1[1].

Without this protocol, vkQueuePresent or glSwapBuffers *must* stall for
the 'frame' callback after presenting an image.
The only reason we can get away with this on SteamOS is because
Gamescope implements what is essentially fifo-v1 and we use that there.

The other side is surface suspension -- a very similar issue to the
above wrt the frame callback being used in that way and blocking.
If the SDL window is obscured, vkQueuePresent *will* block in FIFO,
which games typically do NOT like. This is solved by the combination of
fifo-v1 and commit-timing-v1.

There is no advantage to games and average applications preferring
Wayland over X11 -- only severe performance and unusability regressions
right now.
Thus, we must revert this change until fifo-v1 and commit-timing-v1 are
released and at least in a stable release for major compositors.

[1]: https://gitlab.freedesktop.org/wayland/wayland-protocols/-/merge_requests/256
@Kontrabant
Copy link
Contributor

Rolling back just means you're running through XWayland instead, which has the same underlying issues with things like throttling occluded windows, and loses things like scaled framebuffer support vs running natively.

The only real way to completely avoid this is to run a native X session until the issues are addressed.

@Joshua-Ashton
Copy link
Contributor Author

@Joshua-Ashton, can you add a code comment briefly describing why X11 is preferred when available?

Done.

Aside from the FIFO wayland protocol (which can be implemented whenever anyone wants), I don't have any references of upstream discussions related to these problems which point to the FIFO protocol or similar things. Personally, I didn't have any idea the protocol was so important until today.

Most of these discussions re. fifo have just been Mesa developers speaking together over IRC or elsewhere out-of-band.

I've been making a big deal out of at least surface suspension for multiple years now, but Wayland progress is slow. :/

At least we are getting somewhere now... I hope. :)

@Joshua-Ashton I understand your opinion and conclusion, but given that SDL3 has not yet been released, do you think any decision on this could be deferred until very shortly before the release?

The reality of this will not change with time or before release.

The right solution is for us to prefer Wayland only if fifo-v1 + commit-timing is available -- otherwise default to X11.

We can easily test the wl_registry for those, and have two Wayland_bootstrap entries, one for _preferred, one for _fallback (ie. if no X11)

@slime73
Copy link
Contributor

slime73 commented Mar 24, 2024

The right solution is for us to prefer Wayland only if fifo-v1 + commit-timing is available -- otherwise default to X11.

If this PR isn't the right solution, can you sketch one up that is (or will be)?

@Joshua-Ashton
Copy link
Contributor Author

Joshua-Ashton commented Mar 24, 2024

Rolling back just means you're running through XWayland instead, which has the same underlying issues with things like throttling occluded windows, and loses things like scaled framebuffer support vs running natively.

XWayland doesn't have the issue wrt. fifo as the Vulkan X11 WSI implementation there can use a presentation thread and implement a commit queue on the client side. This is not possible in the Wayland WSI as it is ABI that QueuePresent = wl_surface_commit.

Wrt. throttling, that does not happen on XWayland or when using an X11 graphical compositor as every window there becomes CompositeRedirectManual and thus, always technically "offscreen" -- but they are still unthrottled because it checks for that.

If you are using X11 without a compositor, you will be throttled to 1Hz when your window is occluded, however if you are using a native Wayland window and are occluded, you will never be able to make forward progress at all, and your application will stall forever. Wayland WSI is actually violating the Vulkan forward progress guarantees for presentation right now because of this.

@Joshua-Ashton
Copy link
Contributor Author

If this PR isn't the right solution, can you sketch one up that is (or will be)?

It's the right solution until those protocols land in upstream wayland-protocols.

@Conan-Kudo
Copy link
Contributor

@Joshua-Ashton, can you add a code comment briefly describing why X11 is preferred when available?

Done.

Aside from the FIFO wayland protocol (which can be implemented whenever anyone wants), I don't have any references of upstream discussions related to these problems which point to the FIFO protocol or similar things. Personally, I didn't have any idea the protocol was so important until today.

Most of these discussions re. fifo have just been Mesa developers speaking together over IRC or elsewhere out-of-band.

I've been making a big deal out of at least surface suspension for multiple years now, but Wayland progress is slow. :/

At least we are getting somewhere now... I hope. :)

This is probably why things drag on for so long. If all the discussions are out-of-band in a way that other stakeholders (particularly integrators, desktop developers, etc.) have no visibility, then there will be no movement.

(Yes, I am aware of some issues with some stakeholders, but generally most of us didn't know anything until just now.)

@Joshua-Ashton I understand your opinion and conclusion, but given that SDL3 has not yet been released, do you think any decision on this could be deferred until very shortly before the release?

The reality of this will not change with time or before release.

The right solution is for us to prefer Wayland only if fifo-v1 + commit-timing is available -- otherwise default to X11.

We can easily test the wl_registry for those, and have two Wayland_bootstrap entries, one for _preferred, one for _fallback (ie. if no X11)

What is the timeline for the SDL 3 release?

@slouken
Copy link
Collaborator

slouken commented Mar 24, 2024

What is the timeline for the SDL 3 release?

That depends largely on the feedback on the Preview Release. Realistically not shorter than 3 months. We still have over 300 issues in the release milestone.

@Conan-Kudo
Copy link
Contributor

Could we hold this for a little while then while we try to get things sorted out in the various upstreams?

@pac85
Copy link
Contributor

pac85 commented Mar 24, 2024

@Joshua-Ashton, can you add a code comment briefly describing why X11 is preferred when available?

Done.

Aside from the FIFO wayland protocol (which can be implemented whenever anyone wants), I don't have any references of upstream discussions related to these problems which point to the FIFO protocol or similar things. Personally, I didn't have any idea the protocol was so important until today.

Most of these discussions re. fifo have just been Mesa developers speaking together over IRC or elsewhere out-of-band.
I've been making a big deal out of at least surface suspension for multiple years now, but Wayland progress is slow. :/
At least we are getting somewhere now... I hope. :)

This is probably why things drag on for so long. If all the discussions are out-of-band in a way that other stakeholders (particularly integrators, desktop developers, etc.) have no visibility, then there will be no movement.

(Yes, I am aware of some issues with some stakeholders, but generally most of us didn't know anything until just now.)

@Joshua-Ashton I understand your opinion and conclusion, but given that SDL3 has not yet been released, do you think any decision on this could be deferred until very shortly before the release?

The reality of this will not change with time or before release.
The right solution is for us to prefer Wayland only if fifo-v1 + commit-timing is available -- otherwise default to X11.
We can easily test the wl_registry for those, and have two Wayland_bootstrap entries, one for _preferred, one for _fallback (ie. if no X11)

What is the timeline for the SDL 3 release?

The surface suspension discussion happened partly publicly here https://gitlab.freedesktop.org/wayland/wayland-protocols/-/merge_requests/99

@Zamundaaa
Copy link

Zamundaaa commented Mar 24, 2024

There is no advantage to games and average applications preferring Wayland over X11

I must mention here that this statement is really not true. Going through Xwayland means wrong behavior with fractional scaling - with most compositors that means reduced resolution in the game, but even on KWin it causes problems with multiple monitors that have different scales (on displays with a lower scale you get an increased resolution in the game).

Another downside of Xwayland is that outside of fullscreen, it does a copy of every frame before passing it on to the compositor. It can thus reduce performance and increase latency, especially if the compositor can offload the surface to an overlay plane.

Xwayland can do only 8 bits per color as well, which is a minor benefit for going with native Wayland.

It's also worth noting that Xwayland is not completely immune to the surface suspension issue either, it throttles windows that don't get frame callbacks to 1Hz.

There are significant problems with running natively on Wayland, but please let's not pretend Xwayland is great either. There are benefits and downsides to both choices, and they need to be weighed against each other, it's not just an obvious choice for one side.

@pac85
Copy link
Contributor

pac85 commented Mar 24, 2024

Could we hold this for a little while then while we try to get things sorted out in the various upstreams?

Even assuming the protocol was somehow merged and also implemented by enough compositors in just 3 months (which is pretty much impossible) I don't think this could reasonably reach stable releases in 3 months? This issue is very well known AFAIK among those who work around the area.

@slime73
Copy link
Contributor

slime73 commented Mar 24, 2024

It's the right solution until those protocols land in upstream wayland-protocols.

I'm a little confused right now from your messages, I think this sums things up if I got it right?

  • These performance issues affect only a subset of Vulkan setups when Wayland is used (FIFO present mode, not Mailbox or others) and don't affect OpenGL since I haven't seen it mentioned.
  • There's at least one active merge request for a protocol that will fix some of the issues you raised, which you linked earlier.
  • You also said "The reality of this will not change with time or before release." which I don't quite yet given the above merge request and this PR also raising visibility to relevant people.
  • You have a proposal to make Wayland preferred only if the protocols which fix things are available.
  • The protocols are not finalized so SDL probably shouldn't put code in for them quite yet.
  • But you could still make a draft PR which does that – i.e. the 'right thing'.

So I think having this as an issue discussion, and making that draft PR seem like good ideas. Unless I misread something?

@samzanemesis
Copy link

That's just hiding the problem I think, can you detect easily if the compositor has fifov1

@pac85
Copy link
Contributor

pac85 commented Mar 24, 2024

It's the right solution until those protocols land in upstream wayland-protocols.

I'm a little confused right now from your messages, I think this sums things up if I got it right?

* These performance issues affect only a subset of Vulkan setups when Wayland is used (FIFO present mode, not Mailbox or others) and don't affect OpenGL since I haven't seen it mentioned.

* There's at least one active merge request for a protocol that will fix some of the issues you raised, which you linked earlier.

* You also said "The reality of this will not change with time or before release." which I don't quite yet given the above merge request and this PR also raising visibility to relevant people.

* You have a proposal to make Wayland preferred only if the protocols which fix things are available.

* The protocols are not finalized so SDL _probably_ shouldn't put code in for them quite yet.

* But you could still make a draft PR which does that – i.e. the 'right thing'.

So I think having this as an issue discussion, and making that draft PR seem like good ideas. Unless I misread something?

  • The issues affect OpenGL as well ( more specifically EGL since that is the api used for presentation )
  • Any game generally allows the user to pick between FIFO and IMMEDIATE (and MAILBOX sometimes)/gl's equivalent swap interval so this applies to any game shipped with SDL
  • The MR is just a protocol, a tiny piece of the puzzle
  • To check for the protocol it must first be finalized and added to the registry or else there is no protocol to check for

@Joshua-Ashton
Copy link
Contributor Author

@Zamundaaa

It's also worth noting that Xwayland is not completely immune to the surface suspension issue either, it throttles windows that don't get frame callbacks to 1Hz.

Doesn't seem true for me on Plasma 6 Wayland, labwc and Gnome -- I also don't see anywhere in xserver that'd do that. Everything is CompositeRedirectManual on XWayland, so it shouldn't.

@Kontrabant
Copy link
Contributor

@Zamundaaa

It's also worth noting that Xwayland is not completely immune to the surface suspension issue either, it throttles windows that don't get frame callbacks to 1Hz.

Doesn't seem true for me on Plasma 6 Wayland, labwc and Gnome -- I also don't see anywhere in xserver that'd do that. Everything is CompositeRedirectManual on XWayland, so it shouldn't.

Running testsprite (with --vsync) on Gnome + XWayland, it throttles to 1hz for me when the window is occluded with both the GL and Vulkan renderers.

@Joshua-Ashton
Copy link
Contributor Author

@Zamundaaa

Another downside of Xwayland is that outside of fullscreen, it does a copy of every frame before passing it on to the compositor. It can thus reduce performance and increase latency, especially if the compositor can offload the surface to an overlay plane.

This is also not true -- this only applies to child windows that do not fill the whole window, and applications using XCopyRect, or other XComposite extensions.

Games and most apps will not get an extra copy, as they fill the whole window, or have a child window that fills the whole window.

Xwayland can do only 8 bits per color as well, which is a minor benefit for going with native Wayland.

Sure, but that's an added bonus really. That's actually one of the main reasons the Gamescope WSI layer exists. The other regressions are pretty fundamental and pretty bad.

10-bit color for your game isn't really useful if the GPU-bound performance is totally frogged frame-pacing wise :D

@polarathene
Copy link

  • To check for the protocol it must first be finalized and added to the registry or else there is no protocol to check for

So that's the main concern right? Not so much the eventual support by compositors?

If the protocol is sorted out that you can reliably check for it's support at runtime, you can prefer wayland, otherwise go with X?

@Joshua-Ashton
Copy link
Contributor Author

So that's the main concern right? Not so much the eventual support by compositors?

If the protocol is sorted out that you can reliably check for it's support at runtime, you can prefer wayland, otherwise go with X?

Yes, that is what I am proposing for us to add when the protocols are finalized.

@flibitijibibo
Copy link
Collaborator

flibitijibibo commented Mar 24, 2024

Call this a total shot in the dark, but: Did the Steam overlay ever get Wayland surface support added? I've not checked in ages but that would probably be good to have to make discussions on this more thorough; it's one thing if we have a complete end-to-end implementation of an all-Wayland system and give compositors as much information as possible in implementing the protocols, it's another if Steam just never implements for desktop or embedded and SDL chickens out without any real commitment from any involved party to fix this. We'd just be back at square one again, which I'm not really up for doing a third time in four years(!).

If not the overlay, and if gamescope does indeed do this right, maybe it would make sense to default to Wayland when gamescope is running, and if it can expose the protocols early, we can make it more robust to default when the protocols are exposed, as someone suggested earlier. As long as something is happening that's fine, but until I see something in writing that says this is going to get done then it's hard for me to review this change in good faith, which for me is just an automatic "no".

TL;DR: My customers get good results with this backend (even on Steam!); unilaterally sweeping others' effort under the rug as a first resort, without any consideration of other shareholders' experiences, does not sit well with me without some really good evidence that this isn't a lazy cop-out.

@SopaDeMacaco-UmaDelicia
Copy link

SopaDeMacaco-UmaDelicia commented Mar 24, 2024

Did the Steam overlay ever get Wayland surface support added?

Steam overlay doesn't work in wayland-native apps.

@Joshua-Ashton
Copy link
Contributor Author

Joshua-Ashton commented Mar 24, 2024

Call this a total shot in the dark, but: Did the Steam overlay ever get Wayland surface support added?

No, it hooks Vulkan, etc just fine, but there is no input on the overlay and Steam Input is not functional.

For GL apps, it does not work at all with Wayland.

Sorting this out with Gamescope would be cool, using the same stuff we use on Deck for the overlay -- especially as we have the Wayland subsurface backend (which is not affected by the same issues referenced in this MR), but we are not at that point yet.

@ernstp
Copy link

ernstp commented Mar 26, 2024

Another missing feature in the Wayland stack is support for opening a game on the primary display, afaik.

@Kontrabant
Copy link
Contributor

Another missing feature in the Wayland stack is support for opening a game on the primary display, afaik.

No, SDL3 does that if there is some means available to determine it.

@ximion
Copy link

ximion commented Mar 26, 2024

Another missing feature in the Wayland stack is support for opening a game on the primary display, afaik.

No, SDL3 does that if there is some means available to determine it.

Yes, via kde_output_order_v1 - there is no generic way yet...

@rayanHappy
Copy link

i am just curious but doesn't https://gitlab.freedesktop.org/wayland/wayland-protocols/-/merge_requests/201 solve the surface suspend issue ?

@h0x32
Copy link

h0x32 commented Mar 26, 2024

Please do NOT! go through with this PR!
SDL3 is not yet production ready, so hold your horses and if things on the Wayland side do not improve by the time SDL3 is production ready, then do as you wish. Until then, give Wayland a chance since it is currently the best thing that happened in modernizing Linux's user-space graphical software stack.

@Kontrabant
Copy link
Contributor

Kontrabant commented Mar 26, 2024

i am just curious but doesn't https://gitlab.freedesktop.org/wayland/wayland-protocols/-/merge_requests/201 solve the surface suspend issue ?

No, it basically lets applications know that they are occluded or otherwise in a state such that they can suspend rendering if they want to. The actual presentation throttling is done at the compositor level as frame callbacks stop being sent.

@Kontrabant
Copy link
Contributor

Another missing feature in the Wayland stack is support for opening a game on the primary display, afaik.

No, SDL3 does that if there is some means available to determine it.

Yes, via kde_output_order_v1 - there is no generic way yet...

And via d-bus display properties on Gnome. Not ideal I know, but until there is a better way…

@arkadijs
Copy link

Is this about SDL users / game developers testing according to Steam requirements? I suppose relying on the SDL3 default for proper testing is not a very good idea.

How about setting SDL_VIDEODRIVER and WAYLAND_DISPLAY instead?

@dylanmtaylor
Copy link

Could we hold this for a little while then while we try to get things sorted out in the various upstreams?

Respectfully, your average gamer/user does not care about the technical merits of Wayland vs X11. If running this natively with Wayland will, at the moment, deliver an objectively worse experience, i feel like SDL should run using XWayland by default until that is resolved and then switch back to Wayland by default when it is fully ready. IMO, it makes sense to merge this now.

@feature-engineer
Copy link

I'm not very familiar with Wayland, but if I understand correctly, the issue is that frames are being drawn only when the application is being polled by the frame callback?

How does that enable VRR? With VRR it is the rendering application who should dictate the refresh rate by announcing that it has a frame ready and get it rendered ASAP, no? There's also a protocol enabling screen tearing which I would expect to send the frame to the monitor ASAP too...

Or is the frame callback mechanism limited to v-synced applications? In that case, maybe rendering with the VRR\screen tear mechanisms instead of the v-sync polling mechanism fix these issues?

@stalkerg
Copy link

@sulix
Copy link
Contributor

sulix commented Mar 27, 2024

Despite being a proponent of reverting Wayland-by-default last time, I'd really prefer to keep Wayland the default for 3.0, as I think there are some real advantages to making the transition at the same time as moving from SDL2→SDL3.

One of the reasons this was backed out for 2.0.22 was that there were a number of SDL2 applications which only worked with X11 (due to directly calling Xlib/glX functions, requiring mouse warping, etc), and by switching the default at the same time that we break API compatibility, we won't have to worry about breaking existing applications. (Except via sdl2-compat, which already has a quirks mechanism in place for dealing with this sort of thing.)

As I understand it, there are three (and-a-half) regressions compared to XWayland (please correct me if I am missing anything or otherwise wrong):

  • Lack of FIFO support, which typically appears as poor performance due to blocking, not user-visible correctness issues(?)
  • Surface suspension issues, which cause some games to crash / hit network timeouts / etc, if the window is not visible or occluded.
  • An ugly default icon if no .desktop file is present. (For which there is slow progress on a protocol to fix).
  • (and the Steam overlay doesn't support it properly yet)

Equally, there are benefits to Wayland, particularly around scaling (including the framebuffer scaling features), HDR support, etc. So for some cases, Wayland can be the better option. Of course, these aren't regressions, and it does make sense to treat regressions as more important to new features (up to a point).

Given that SDL3 hasn't been released yet, and so no-one should be using it yet, there isn't any particular rush to revert the default back to XWayland now. If we need to delay the SDL3 release a bit for some of these protocols to be implemented, we can have that discussion — I don't think there's the same pressure to upgrade from SDL2→SDL3 as there was from 1→2 with the fullscreen mode change issues. So defaulting to Wayland shouldn't have a negative impact on users — developers can keep using SDL2 until Wayland's ready. (And if they really want to use it early — and having to deal with games which shipped API-incompatible SDL 1.3 builds makes me hope that this is rare — they can patch the defaults with the library they ship.)

But the reality is that distros, the steam runtime, etc do upgrade minor SDL versions behind the backs of games, so we really need some way of getting people to actually test with and support the Wayland backend from day 1 of SDL3 if we want to be able to change the default before SDL4. And setting the default now seems to me the nicest way to handle that. At the very least, let's keep Wayland the default for now, at least until we've finished locking down the API and are getting ready for release.

@swick
Copy link

swick commented Mar 27, 2024

That is not to say "we should fix FIFO in Mesa/other drivers," but rather that it is completely unfixable without an additional protocol, in this case fifo-v11.

@Joshua-Ashton Moving the blocking to ANI is unfixable without additional wayland protocols. Getting rid of indefinite blocks in present however is something we can fix and I did provide patches for it (and we can increase the minimum frequency to 30hz).

I'd much rather we work around the missing protocol in mesa than reverting to X11 for SDL.

@Joshua-Ashton
Copy link
Contributor Author

i am just curious but doesn't https://gitlab.freedesktop.org/wayland/wayland-protocols/-/merge_requests/201 solve the surface suspend issue ?

No, the driver has no idea what an xdg_* anything is.

@Joshua-Ashton
Copy link
Contributor Author

@sulix There is no HDR support. The only HDR support exists with using Gamescope + the Gamescope layer which works with X11 windows. Gamescope has a frog_ protocol it talks to Plasma with in the nested case and does thr DRM HDR side itself in the embedded case.

As mentioned before, we can just flip default to Wayland if the fifo-v1 + commit-timing protocols are available.

@darkbasic
Copy link

I'm wondering if checking for fifo-v1 and commit-timing-v1 as triggers to revert to xwayland would even make sense considering there is no guarantee that these specific extensions will ever be accepted.

@swick
Copy link

swick commented Mar 27, 2024

This mesa MR works around the problem for the most part: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28419

Obviously having the protocols would be better but with the MR I don't think we need to prefer X11 over Wayland.

@Conan-Kudo
Copy link
Contributor

@darkbasic

I'm wondering if checking for fifo-v1 and commit-timing-v1 as triggers to revert to xwayland would even make sense considering there is no guarantee that these specific extensions will ever be accepted.

Well, now all of us are aware that we need it, so that puts some urgency that we didn't really have before. That's often the problem with wayland-protocols: since merging them requires implementations to be ready first, writing code needs to be prioritized.

@sulix
Copy link
Contributor

sulix commented Mar 27, 2024

@sulix There is no HDR support. The only HDR support exists with using Gamescope + the Gamescope layer which works with X11 windows. Gamescope has a frog_ protocol it talks to Plasma with in the nested case and does thr DRM HDR side itself in the embedded case.

Sorry — I missed that the KWin implementation was just the cut down frog_ protocol. But that does work with SDL's wayland backend with the hacky vulkan layer as documented on the KWin MR, right? Either way, I do agree that it's not ready now, so shouldn't be much of a consideration at the moment, and you'd know way more about it than me. :-)

As mentioned before, we can just flip default to Wayland if the fifo-v1 + commit-timing protocols are available.

My worry with that is that developers will test with an SDL3 build, it'll default to XWayland, and then when these protocols become available on users' machines, the exact same binaries will behave differently (e.g., mouse warping will just stop working). Maybe we can document that developers should absolutely also test with SDL_VIDEODRIVER=wayland, as that will be what some users see, but I can see that going wrong for a number of reasons.

The safest option, IMO, is still to delay a final SDL3 release until Wayland is ready enough to be the default, at least on some setups. (In which case, maybe it makes sense to gate it on fifo-v1+commit-timing for people running older distros.)

Whether a delay is a worthwhile tradeoff will, of course, depend on how long it takes for these to start landing (and how long all of the other things for SDL3 end up taking as well). In the meantime, I'd prefer we keep Wayland the default for these 'preview' releases, though, as it has more restrictions from a developer POV (mouse warping, windows don't show until a frame is renderered, etc), and in my experince an SDL game which works on Wayland is more likely to work fine on XWayland than vice-versa.

@darkbasic
Copy link

Well, now all of us are aware that we need it, so that puts some urgency that we didn't really have before. That's often the problem with wayland-protocols: since merging them requires implementations to be ready first, writing code needs to be prioritized.

So I guess these extensions are not controversial and there is some consensus towards them. If that's the case I'm relieved: some other extensions like the one to remember windows placement are in a much worse position.

@Conan-Kudo
Copy link
Contributor

Yes, these aren't "messy" like those... 😅

There's still some protocol refinement being done, but that and implementations can be figured out in parallel. Then we can get acks from stakeholders once things are pinned down.

@slouken
Copy link
Collaborator

slouken commented Mar 27, 2024

Thanks for the feedback everyone!

It sounds like the Wayland folks are aware of the issues and are making progress towards addressing them. I'll leave Wayland default for now while they are doing that, so it's easier for people to test and provide them feedback. We will re-evaluate this as we get closer to full release.

@Joshua-Ashton, can you create individual issues that impact this and create a task list in the original merge request report, for tracking?

I'm going to lock this conversation for now so we can focus on development.

@libsdl-org libsdl-org locked and limited conversation to collaborators Mar 27, 2024
@slouken
Copy link
Collaborator

slouken commented Mar 27, 2024

Since the performance issues motivating this PR are resolved by the combination of fifo-v1 and commit-timing-v1, let's check for those protocols before using Wayland by default.

@Joshua-Ashton, I'll go ahead and close this PR. Can you open a new one implementing that?

For those following along, this change guarantees that high performance Vulkan applications like games have the best experience by default. Hopefully these protocols will land soon, but either way the user has final control over the backend being used, via the SDL_VIDEO_DRIVER environment variable.

@slouken slouken closed this Mar 27, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet