Xdotool and Xmodmap are the two main reasons why, after a few months running Wayland+keyd+dotool, I went back to X11. I found it really hard to have the following things working at once:
- Italian layout for my keyboard with heavily-customized AltGr keys for mathematical notation (in X11 it's just a matter of having a Xmodmap file)
- Using Espanso for many common shortcuts like :date: (current YYYY-MM-DD date) and :pidigits:
- A reasonable way to run Windows in a VM while using an Italian layout for my keyboard
- The possibility to use automation scripts using something as close as possible to xdotool
- Sometimes I use my home keyboard, sometimes I use my work keyboard, and sometimes I use my laptop keyboard. I expect the system to work in the same way regardless of my input device
It's not that Wayland prevents one from doing all this stuff, but the available solutions were fragile and complicated, and it took me a long time to figure out solutions that only partially worked... For instance, to make keyd work as expected, I was forced to set up my Italian keyboard as an English keyboard and then remap all the keys manually... And every time I plugged in a new keyboard, I had to tell keyd to enable my customizations on it, because telling it to use the layout with any keyboard conflicted with VirtualBox.
I understand that X11 is too complicated to be maintained, but from a user's perspective, so far I am far more efficient in X11.
> A reasonable way to run Windows in a VM
What's wrong with this case? Does the virtual machine report invalid key codes to the guest? You need to have the proper layout in Windows, as (virtual) hardware only reports key codes.
You'll never find me saying that Wayland development is good in its present state. I think it's a mess and it has a lot of issues.
But let's be honest about Xorg. The overwhelming majority of people who worked on Xorg are now developing Wayland. Why? Because developing Xorg is a massive pain in the butt. It is a 400K LOC behemoth of a project and it has a ridiculous amount of technical debt. I would have to imagine that if the Xorg developers thought they could fix Xorg, they would do that instead of making a new thing.
There wasn't a need to have 10s of different wayland compositors. There is not a need to endlessly bikeshed over extensions instead of delivering user value. These are failures of leadership in driving the replacement of X.
Just compare this to Windows and how they rearchitected their compositor to make it more modern without splitting into 10s of compositors and breaking a ton of apps.
How can you compare the Cathedral with a bazaar? This is not a technical difference at all.
Apple/Microsoft can do whatever they want, just break compatibility at any point and everyone else wanting to have their programs supported on their platform will adapt.
Meanwhile, for Linux the network effect has a much bigger role to play: you can't tell anyone else what to do, and protocols can only emerge from working together.
Also, I wouldn't bring up Microsoft's display stack as a positive example at all.
Here you're just comparing proprietary closed source development to open source development. In the proprietary version the goal is to improve a product. The OSS goals are much harder to pin down and can be different person to person, but it wouldn't be unreasonable to have a goal of "make it so that other devs can make their own compositors easily" and therefore you're describing an obvious success.
Short term this might be a far slower and worse approach. It's not clear that's the case long term, though: making it easier to try out different ideas and then finding a winning compositor project could be better than being stuck with one.
It isn't particularly easier to make your own compositor either, as you now also have to bring your own window manager. What made the X architecture much more interesting is that it avoided coupling the window manager to the compositor. Hell: there even are multiple popular compositors for X, as they also managed to avoid coupling the compositor to the display server (which would be the one part of the system that you don't find too many of -- though there were multiple implementations over the years! -- but that's not really much different than Wayland where everyone is using the same library to implement the behaviors as part of their coupled-together balls of mud.)
But why would I ever want to have a separate compositor and window manager? Like the display stack benefits from "vertical integration", being modular is a tradeoff, often of performance and significant complexity.
Why not just make a display server (which handles everything rendering related, compositing included), and then add a window manager as a plugin/extension on top? Window managers are not that complicated.
Perhaps proprietary closed source development is better for making operating systems. Is it a coincidence that Google was able to scale Linux to billions of devices while open source projects weren't? Open source development should take some lessons if it wants to be successful and not aggravate developers writing apps for the platform, like what happened in the article, where they were forced to do extra work.
If development for X is ceasing now, there isn't time to experiment on finding the true successor.
I think the hard part about the Linux desktop ecosystem and its development pattern is the cobbled-up-parts nature of the system, where different teams and individuals work on different subsystems with no higher leadership directing how all of these parts should be assembled to create a cohesive whole. We have a situation where GUI applications depended on X.org, yet the X.org developers didn't want to work on X.org any more. If the desktop Linux ecosystem were more like FreeBSD in the sense that FreeBSD has control over both the kernel and its bundled userland, there'd be a clearer transition away from X.org since X.org would have been owned by the overall Linux project. However, that's not how development in the Linux ecosystem works, and what we ended up with is a very messy, dragged-out transition from X to Wayland, complete with competing compositors.
Bazaar-style development seems to work for command-line tools, but I don't think it works well for a coherent desktop experience. We've had so much fragmentation, from KDE/Qt vs GNOME/GTK, to now X11 vs Wayland. Even X11 itself didn't come from the bazaar, but rather from MIT, DEC, and IBM (https://en.wikipedia.org/wiki/X_Window_System).
Actually, that'd probably be a better outcome. But as it is, Red Hat & Ubuntu et al pay people to work on Wayland and those people follow corporate priorities rather than centralized priorities.
I think Red Hat wants a working desktop but I don't think they have strong official opinions on how to get there. I think individual people are responsible for the GNOME/Wayland/Freedesktop messes.
Windows gets to completely rearchitect their compositor because they only provide one stable ABI to get pixels on the screen: link to USER32.DLL, create the necessary objects to represent a window of your application's class, then create and pump a message queue for it. It's ancient, but it works, and more specifically will never change. Even the higher level toolkits Windows ships ultimately are creating USER windows, and USER has been the only UI ABI since version 1.
macOS is the same way, except Carbon (a light modification to the procedural Toolbox API) and Cocoa (the Mac's first OOP toolkit) were "toll-free bridged" to each other rather than, say, writing Cocoa in terms of Carbon.
In contrast, X11 is a protocol anyone can implement and speak. There is no blessed library that you must use. No, Xlib doesn't count. Servers have to take their clients as they come. And Wayland, while very much deliberately stripped down from X, still retains this property of "the demarc point is a protocol" while every proprietary OS (and Android) went with "the demarc point is a library".
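To make "the demarc point is a protocol" concrete, here is a minimal sketch of what speaking Wayland without any blessed library looks like at the byte level. It only frames a single `wl_display.get_registry` request as described by the Wayland wire format (object ID, then size and opcode packed into one word, then arguments); `encode_request` is a hypothetical helper name, and actually sending the bytes over the compositor's Unix socket is left out.

```python
import struct

def encode_request(object_id: int, opcode: int, payload: bytes = b"") -> bytes:
    """Frame one Wayland request: two 32-bit header words, then the arguments."""
    if len(payload) % 4:
        raise ValueError("arguments must be padded to 32-bit boundaries")
    size = 8 + len(payload)  # total message size includes the 8-byte header
    # Word 1: target object ID.
    # Word 2: message size in the upper 16 bits, opcode in the lower 16 bits.
    # "=" means host byte order with standard sizes, as the wire format uses.
    return struct.pack("=II", object_id, (size << 16) | opcode) + payload

# wl_display is always object 1; get_registry is its request with opcode 1.
# The new_id value 2 is just the first free client-side ID in this sketch.
WL_DISPLAY_ID = 1
WL_DISPLAY_GET_REGISTRY = 1
message = encode_request(WL_DISPLAY_ID, WL_DISPLAY_GET_REGISTRY, struct.pack("=I", 2))
```

Any client in any language that produces these bytes is a valid Wayland client, which is exactly why the server has to take its clients as they come.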
You're right, Xorg and X11 should be abandoned and for good reason. That should have happened decades ago. But Wayland doesn't actually fix anything that really needed fixing, other than wiping the slate clean. It's a good thing that Arcan exists, or the future of Unixland would be quite bleak.
That raises a question: if they had that much experience, why did they choose to structure wayland in a way that's such a PITA to write for? This just looks like some massive second system effect.
They just decided X11 did everything wrong and did it differently, rather than picking up the pieces that work (in spirit, if not in code) and fixing the parts that don't.
I wrote an app using Wayland and XCB/X11 and honestly, I found the Wayland part to be much easier to write than the XCB part, even though it required me to write more code.
This is partly due to the fact that everything you can do with Wayland is defined in protocols that are straightforward to use, whereas in X11 you have atoms and messages with arcane names and structures for everything, lackluster documentation and terrible error handling.
Q: if they had that much experience, why did they choose to structure wayland in a way that's such a PITA to write for?
A: Because they were reacting to Xorg, so they wrote the exact opposite of that.
And for bonus points, because one of the problems they wanted to solve was "Xorg is hard to maintain", they made sure that the replacement was much much easier to maintain and develop... for them. Not for application devs, not for users, but for the folks making wayland, I have no doubt it's very well streamlined and easy to work on.
You don't see me working on this stuff, but people keep complaining about this because instead of one thing that works but is a pain, we have two things that work but are a pain. It's pretty obvious that while Xorg works for a lot of people, it's not the way forward; but I think it's apparent that Wayland might not be either... although I think it's likely some will end up running a wayland server with Xwayland as the single wayland client to get continuing driver support.
This is a lot different than say OSS vs ALSA. OSS really could have worked (and still does on FreeBSD afaik), but ALSA fully replaced OSS. I think pipewire seems likely to replace PulseAudio, even if it may not have PulseAudio's key functionality of ruining audio when things used to work just fine.
Right, I think we can all mostly agree that the old state of things wasn't great/sustainable. The problem, IMHO, is that they went hard on the second-system syndrome and went way too far the other way. This allowed them to replace a massive messy codebase with a nice clean codebase that doesn't do the things people actually need from it.
Xorg put everything - way too many features - into one single display server (Xorg). Wayland put everything in the hands of the compositor, and then spawned an endless array of them (most of them implementing only a fraction of needed features).
X11 de jure and de facto required all those features to be present. In theory you could have an X server missing new features, but there was no way to get rid of really old features, and in practice you really needed all the new ones or apps would break. Wayland made essentially everything optional, to the point of fracturing the ecosystem.
Xorg was a monolithic reference implementation. Wayland ships a reference implementation in the form of weston, and it's so feature poor as to be useless.
X11 has, in practice, really poor security. (There were/are attempts to improve this, but it's not been terribly successful.) Wayland is really big on security. So much so that they refused to implement little things like screen shots and a11y features because they could be abused.
IMHO, with hindsight, they should have done this in 2 stages:
First, do the backend refactoring to get the nice driver-facing parts (GBM, AIUI). Essentially, make rootful XWayland the only Xorg, but in a way that is completely invisible to users. (Or, put differently, ship https://gitlab.freedesktop.org/wayback/wayback in 2010 instead of 2025.)
Second, after you've done that and vastly simplified a huge chunk of code and made upkeep and refactoring easier, start working on X12. For the sake of argument, this can still be basically the same protocol as the wayland we actually got. However, don't actually ship that at first. Instead, go build/port an actual complete desktop environment to it, including all the features people actually want - clipboard, screen sharing, a11y and automation tools, remote desktop, etc. - and actually implement all the protocols needed for those. By all means make them optional add-ons to the core protocol, but make them up front. Also, I really recommend making one of those a window management protocol, so that 90% of window managers don't have to be a compositor, though some will.
Then, after the thing is actually functional, start trying to get people to switch over. Don't start pushing people to adopt something half-baked and mess about for years on basic protocols that should have shipped day one (last I checked, in 2025 there are still 3 different incompatible wayland screenshot protocols). Make it an improvement, not a regression that only benefits you, the Xorg developers.
> IMHO, with hindsight, they should have done this...
FWIW, it was also obvious to many people--certainly anyone who had ever been part of one of these big refactors before, whether as the platform or the user--that this is how it should have been done when they started... they just didn't care, and then they spent a decade both directly and indirectly (by condoning the behavior) bullying people who were concerned about the process and insisting that people who even still today have perfectly working systems were/are committing some kind of cardinal sin by not embracing the one true path of Wayland, despite regressions. It is extremely difficult to find any sympathy for the people involved :/.
> The overwhelming majority of people who worked on Xorg are now developing Wayland.
I've never seen this documented.
> It is a 400K LOC behemoth of a project and it has a ridiculous amount of technical debt.
So we have people who want to create features but do not want to pay for technical debt. So.. they create more technical debt? Is there some indication that the wisdom of the crowd is particularly valuable here?
> I would have to imagine that if the Xorg developers thought they could fix Xorg, they would do that instead of making a new thing.
It seems like all the paid developers are working on Wayland while many of the volunteers are working hard to continue Xorg despite all the sponsored efforts to artificially shutter the project.
The article author's main complaint seems to be that distributions forced users to choose between one or the other when, at this point in history, there are zero good reasons to have done that.
Open source used to be about choice. Now it's about paid interests bullying you out of that choice. And Hacker News readily defends this in the name of modernity for its own sake. It's truly a bizarre outcome to me.
There's so much hot debate about how bad Wayland is, how incorrect it is. But there's something I respect enormously about Wayland, which is that it is so so so much less than Xorg.
It uses the kernel's graphics buffers. It uses the kernel's mode setting. These alone are humongous differentiators.
There's so many other amazing glorious ways that Wayland is less. The protocol-centricity is vastly underrated, a massive win for the bazaar that can keep seeking truth versus the (imo utterly pathetic, clinging) absolutist monolith style.
It's revolting to see such persistent bitter angry low user disdain, anger. Without any acknowledgement at all. That protocols allowing multiple implementations allows constant honing in, allows for dynamic change and evolution.
Reflecting on the Hindu Trimurti, a cycle of creation/newness, stasis/pattern, and decay & rot, it's amazing how the protest no-change/stasis-only voice has such a loud undying protest going. X is never getting better, has no room to improve, cursed by its own egocentric insanity which it has recursed into far far too far: which the core devs all agree.
It's not pleasant for everyone that Wayland allows a freedom of implementation. But generally most of the protest here has fallen away: support for major features is just here, on most implementations. That competitors can compete, don't have to keep using the same base is hugely advantageous to humanity. But the protest no-change anger-only voice is so loud. Doesn't know doesn't care.
Humanity should respect systems where competition and improvement are possible. X was a single consigned fate, with no growth or improvement. The competition of Wayland is an incredible breath of fresh air, and the growth of protocol competition here is telling. It may not be the "everything just works and is great" desire path of the low tech-ignorant beggar class, but it has enabled so much Bazaar democratic figuring shit out, which still shares the ideas while allowing innovation within, in a way that few projects have ever enabled before. We are in a magic age of so so much, such cooperative competitive improvement, and it's just so unspoken, so missed, amid the squeaky wheels offering no actual technical critiques, unable to reflect upon the different (much better) age of possibility the bazaar model has opened us into.
This is exactly backwards. Whenever some team that is maintaining a monolith looks at the possibility of splitting it up and going with this protocol idea, they will look at Wayland as a cautionary tale of just how badly that works out in practice.
What an utterly vacuous statement saying nothing. Making no refutable claims. Typical hot air, full of nothing. Boring as fuck, nothing here.
Further, your point is spoken from the perspective of a company, a single entity. Companies are utterly unable to bank on Bazaar practices, to embrace the multitudes way of finding answers. They lack the hackerly blood to try many approaches. They are not creative enough to do anything but build their one Cathedral.
As a company, no, you should not try to build interoperable protocols to foster internal competition on. Duh, no shit. But strong command and control, while it may be good for a company, is not going to be how a much broader ecosystem finds the best paths to take.
It's incredibly impressive how much Wayland compositors compete/cooperate for better. Sway/wlroots for one example has a new Vulkan backend. They could just go try and do new things. There's protocols to implement and they made new implementations, and now there's a half dozen Wayland compositors that have new cutting edge tech they are trying out. Innovation at the edge, but working together, is the shit. Yeah it's not a model that helps the corpo's but that's because open source is searching a much wider field of options, looking much better for wins, and the cathedral model isn't going to get you any of that.
I'm still impressed what vacuous say nothing piece of shit useless Fear Uncertainty and Doubt folks can spread. This era has such a virulent pox of hatred, built around such empty words. None of these bitter words actually say anything, this whole discussion is filled with rabid useless disdain. Piss on ye, say something contestable you villainous cowards. What does you are, saying nothing, but trying to dynamite it all. A pox.
The calculus of what some team does is totally different than what open source does.
I don't understand a lot of the complaints. It asks for a remote connection? That's because of xwayland (which is x11 inside), not wayland, AFAIK. Also, all the comments about how that is weird on a single system: mmh, the whole X server/client architecture always sounded like one was running on a remote system.
I actually like the approach where compositors are much more different from each other than WMs used to be; that allows people to experiment much more. Also let's not forget that X was a plethora of different plugins and incompatibilities. The reason many didn't encounter that was that almost everyone was running xorg with all plugins. That said, I still remember the hoops one had to jump through to get transparency etc. You needed a compositor, and not all compositors were compatible with all WMs (and all had different capabilities).
That said I do also wish that the protocol would evolve faster. It is my impression that if it wasn't for the wlroots people not much would have happened, especially because the gnome guys seem to rather just implement something for themselves and don't try to use or push the standard.
And yet unix account separation really did turn out to be overcomplicated and useless. Hosting providers were never able to separate untrusted users by user account, they either use VMs or containers or give up on offering shell access at all, and on home machines the whole effort falls prey to https://xkcd.com/1200/ .
The general movement of UI paradigm has been from one tech to the next with a focus on backwards compat. Almost amusingly so at times, but this is how all the earlier users and use cases can most easily progress. E.g.
* hollerith cards and sundry + printer
* printing teletype
* dumb (video) terminal
* smart (cursor addressable) terminal
* images of smart terminals
* images of smart terminals with color (businesses resisted color for years)
* ... ?
And in the meantime we have an evolution of support for modelling things visually and working with more descriptive protocols - or even function-defining protocols to raise the abstraction when chatting with the display server in realtime. In this, "abstracted" means something that can be sent over the network instead of using a local buffer. These are in a less strict order than the foregoing...
* text, color plotters, VDST, and all that other old slow stuff
* [skipping a bit up through bitmapped greyscale graphics]
* bitmapped color graphics
* abstracted 2D graphics (-> W and X)
* abstracted 3D graphics (OpenGL + GLX)
* dynamically client-extendable remote graphics servers (NeWS, mostly 2D)
* ... ?
So here I am, waiting for the next stage in these. Hypothesizing that finally we'll get something with 3D abstracted, network graphics (display lists in GLX but accelerated with something like XCB?), where the primary display coördinate space is (x, y, z) instead of (x, y), where the client can push some code to the remote server and raise the abstraction on the fly, finally. Where maybe we'd be able to permission the objects in that space and share it among users live. Where the 2D apps would be inside the 3D space instead of the other way around. Something for the 2000s instead of familiar abilities provided in 1990.
But instead, Wayland. Wayland, which is not backwards compatible with X. Wayland, which is 2D at its heart. Wayland, another 1990 era graphics system with a super thin offering of features for actual end users (not devs) which come at substantial cost in lost X features. Wayland, which resists the one user doing things we've long thought of as normal - in the name of "security".
Yes, the international keyboard support is pretty bad in both X and Wayland. For example, try using Left Shift to switch to layout 1 (while retaining its shift functionality) without patching Gnome. It's impossible.
Or, try making a virtual on-screen keyboard that sends characters that are not in the layout (for example, a Greek character with a US keyboard layout). Again, you cannot do that, and it's difficult to understand why a virtual keyboard has to be restricted to the characters printed on the physical keyboard.
And if you want to use remote desktop from a computer with Greek layout to a computer with US layout... again, it's going to be difficult. X server-based remote apps would simply temporarily patch the layout and add non-existent keys there to be able to report the key press on a remote machine with different layout. xdotool, I think, used the same hack to input characters that are not in the layout.
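The "temporarily patch the layout" hack described above can be sketched as a toy model. This is not xdotool's actual code and talks to no X server: the keymap is a plain dict, the keycode numbers are made up, and `temporary_binding` and `type_text` are hypothetical names. It only shows the shape of the trick: borrow an unused keycode, bind the missing character to it, send it, then restore the layout.

```python
from contextlib import contextmanager

# Toy keymap: keycode -> character. The numbers are illustrative only.
keymap = {38: "a", 39: "s", 40: "d"}
SPARE_KEYCODES = [248, 249]  # assumed to be keycodes no physical key produces

@contextmanager
def temporary_binding(keymap, char):
    """Yield a keycode that produces char, patching the keymap only if needed."""
    for code, bound in keymap.items():
        if bound == char:
            yield code  # already in the layout, nothing to patch
            return
    spare = next((c for c in SPARE_KEYCODES if c not in keymap), None)
    if spare is None:
        raise RuntimeError("no spare keycode left to borrow")
    keymap[spare] = char  # add the "non-existent key"
    try:
        yield spare
    finally:
        del keymap[spare]  # put the layout back as it was

def type_text(keymap, text):
    """Return the keycodes that would be sent to type text."""
    sent = []
    for char in text:
        with temporary_binding(keymap, char) as code:
            sent.append(code)
    return sent

codes = type_text(keymap, "aλd")
```

In this model the Greek letter is sent through the borrowed keycode 248 and the layout is unchanged afterwards. The fragile part in real life is that the layout is shared state, so anything else reading it during the patch sees the temporary key.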
The post shows a common issue with Wayland. The protocol is there, but each compositor handles things a bit differently, so tools like xdotool end up running into gaps or inconsistent behavior.
Wayland is improving, but there is still a difference between what the spec supports and what developers can rely on across the ecosystem.
A good look at why automation on Wayland still feels rough for some users.
tl;dr Wayland doesn't have a good set of universally adopted input emulation and UI automation protocols yet, which makes a portable UI automation utility with the full scope of `xdotool` impossible to write. Work remains to be done to close this gap.
The X protocols in this area were not very good, but due to there being a single viable implementation you could rely on them being present (similar to using MSIE-only features in that browser's dominant era).
At this point the Wayland project is effectively keeping desktop Linux from succeeding. It might as well have been a plant project or a strategic intelligence war from Microsoft to keep Linux on the server only.
It's a ten+ year disaster project that held desktop linux back at the precise moment of complete insanity on the part of the Windows designers with Windows 8 and the dual desktop/tiles disaster and yet-another-window-kit.
Microsoft is still pissing off its customers actively, but now we have real traction with Steam for getting gamers off of MS and onto Linux.
Second system effect is the curse of FOSS projects. It's been that way for decades. I don't see a reliable solution for the structural problem that doesn't somehow end up like a Benevolent Dictatorship. At the end of the day, designing complex systems by committee is hard to do. Maybe there is a maximum size of a group beyond which the communication matrix between the members starts to fracture?
I would claim the dictators -- even the "benevolent" ones -- tend to do this more often than committees, as they have more inherent power to do so: the committees tend to get stuck in backwards compatible land forever (for better or for worse). I mean, look at Larry Wall or Guido van Rossum with their respective debacles. Bjarne Stroustrup couldn't mess up in that way even if he seems to want to. As another example, HTTP only started having this problem with Google being able to railroad everyone. The only major second system effect caused by what I believe is a committee that I can easily come up with is IPv6?
IPv6 is a victim of the nature of the problem and a lot of underinformed observers. I see too many comments asking why it isn't just backwards compatible or suggesting fewer bits would make it easier.
There are real problems but really the issue is that it was a hardware and software problem wrapped into one as well as being a collective action problem.
"that doesn't somehow end up like a Benevolent Dictatorship"
Is that a problem though? If you want to get shit done, you need someone to take responsibility for the decisions. Otherwise you get design-by-committee and endless bikeshedding and software nimbyism.
- Device emulation: uinput covers this; requiring root is reasonable for what it does.
- Input injection: Like XTEST, but ideally with permissions and more event types (e.g. tablet and touch events). libei is close but I think it should be a Wayland protocol.
- UI automation: Right now I think the closest you can get is with AT-SPI2, for apps that support it. This should also be a Wayland protocol.
None of these are actually easy if you want to make a good API. (XTEST is a convenient API, but not a particularly good one. Win32 has better input emulation and UI automation features IMO.)
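For a sense of what the uinput route involves at the lowest level, here is a hedged sketch that only builds the bytes of a key tap as `struct input_event` records. It assumes the classic layout on 64-bit Linux (a `struct timeval` of two C longs, then type, code and value); `pack_event` and `key_tap` are hypothetical helper names. It deliberately does not open `/dev/uinput`: a real virtual device also needs the setup ioctls (`UI_SET_EVBIT`, `UI_SET_KEYBIT`, `UI_DEV_SETUP`, `UI_DEV_CREATE`) and, as said above, root or equivalent permissions.

```python
import struct
import time

# Constants from linux/input-event-codes.h
EV_SYN, EV_KEY = 0x00, 0x01
SYN_REPORT = 0
KEY_A = 30

# struct input_event: struct timeval (two C longs), __u16 type, __u16 code, __s32 value.
# Native alignment is used on purpose so the size matches the C struct on this platform.
EVENT_FORMAT = "llHHi"

def pack_event(ev_type, code, value, now=None):
    """Pack one input_event with the given timestamp (defaults to the current time)."""
    now = time.time() if now is None else now
    sec = int(now)
    usec = int((now - sec) * 1_000_000)
    return struct.pack(EVENT_FORMAT, sec, usec, ev_type, code, value)

def key_tap(code, now=None):
    """Press and release one key; each state change is followed by a SYN_REPORT."""
    return b"".join([
        pack_event(EV_KEY, code, 1, now),       # key down
        pack_event(EV_SYN, SYN_REPORT, 0, now),
        pack_event(EV_KEY, code, 0, now),       # key up
        pack_event(EV_SYN, SYN_REPORT, 0, now),
    ])

frames = key_tap(KEY_A, now=0.0)
```

Note that this works in key codes, not characters: which letter `KEY_A` produces still depends on the layout the compositor applies, which is part of why device emulation alone doesn't give you an xdotool replacement.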
Also the tangent about how crazy the compatibility layers are is weird. Yes, funny things are being done for the sake of compatibility. XWaylandVideoBridge is another example, but screen sharing is an area where Wayland is arguably better (despite what NVIDIA has to say) because you can get zero copy window and screen contents through PipeWire thanks to dmabufs.
Some of the lack of progress comes down to disagreements. libei mainly exists, by my best estimate, because the GNOME folks don't like putting things in Mutter, and don't want to figure out how to deal with moving things out of process while keeping them in protocol. (Never mind the fact that this still has to go through Mutter eventually, since it is the one sending the events anyways...) However, as far as I know, lack of progress on UI automation and accessibility entirely comes down to funding. It's easy to say "why not just add SetCursorPos(x, y)" and laugh it off, but attacking these problems is really quite complex. There was Newton for the UI automation part, but unfortunately we haven't heard anything since 2024 AFAIK, and nobody else has stepped up.
If Wayland lasts as long as X11 did, it's preposterous to not spend the time to try to get the "new" version of these things right even if it is painful in the meantime.
After all, it isn't like UI automation on Linux was ever particularly good. Anyone who has ever used AutoHotkey could've told you that.
Wayland’s fragmentation is less about one problem and more about how the ecosystem grew. Each compositor implements only what it needs, so tools like xdotool run into gaps and inconsistent behavior.
The post highlights a real coordination issue. The protocols exist, but adoption is uneven and expectations differ across compositors. Users see small breaks and developers face a moving target.
Wayland is improving, especially with work from GNOME and KDE, but stronger shared conventions for automation and accessibility are still needed.
Good write-up that shows why experiences on Wayland vary so much depending on the compositor.
Xdotool and Xmodmap are the two main reasons why, after a few months running Wayland+keyd+dotool I went back to X11. I found really hard to have the following things working at once:
- Italian layout for my keyboard with heavily-customized AltGr keys for mathematical notation (in X11 it's just a matter of having a Xmodmap file)
- Using Espanso for many common shortcuts like :date: (current YYYY-MM-DD date) and :pidigits:
- A reasonable way to run Windows in a VM while using an Italian layout for my keyboard
- The possibility to use automation scripts using something as close as possible to xdotool
- Sometimes I use my home keyboard, sometimes I use my work keyboard, and sometimes I use my laptop keyboard. I expect the system to work in the same way regardless of my input device
It's not that Wayland prevents one from doing all this stuff, but the available solutions were fragile and complicated and took me so long before figuring solutions that only worked partially... For instance, to make keyd work as expected, I was forced to set up my Italian keyboard as an English keyboard and then remap all the keys manually... And every time I plugged a new keyboard, I had to tell keyd to enable my customizations on it, because telling it to use the layout with any keyboard conflicted with VirtualBox.
I understand that X11 is too complicated to be maintained, but from an user's perspective, so far I am far more efficient in X11.
> A reasonable way to run Windows in a VM
What's wrong with this case? Virtual machine reports invalid key codes to the guest? You need to have the proper layout in Windows, as (virtual) hardware only reports key codes.
You'll never find me saying that Wayland development is good in its present state. I think it's a mess and it has a lot of issues.
But let's be honest about Xorg. The overwhelming majority of people who worked on Xorg are now developing Wayland. Why? Because developing Xorg is a massive pain in the butt. It is a 400K LOC behemoth of a project and it has a ridiculous amount of technical debt. I would have to imagine that if the Xorg developers thought they could fix Xorg, they would do that instead of making a new thing.
There wasn't a need to have tens of different Wayland compositors. There is no need to endlessly bikeshed over extensions instead of delivering user value. These are failures of leadership in driving the replacement of X.
Just compare this to Windows and how they rearchitected their compositor to make it more modern without splitting into tens of compositors and breaking a ton of apps.
How can you compare the Cathedral with a bazaar? This is not a technical difference at all.
Apple/Microsoft can do whatever they want, just break compatibility at any point and everyone else wanting to have their programs supported on their platform will adapt.
Meanwhile, for Linux the network effect plays a much bigger role: you can't tell anyone else what to do, and protocols can only emerge from working together.
Also, I wouldn't bring up Microsoft's display stack as a positive example at all.
Here you're just comparing proprietary closed source development to open source development. In the proprietary version the goal is to improve a product. The OSS goals are much harder to pin down and can be different person to person, but it wouldn't be unreasonable to have a goal of "make it so that other devs can make their own compositors easily" and therefore you're describing an obvious success.
Short term this might be a far slower and worse approach. It's not clear that's the case long term though, making things easier to try out different ideas and then finding a winning compositor project could be better than being stuck with one.
It isn't particularly easier to make your own compositor either, as you now also have to bring your own window manager. What made the X architecture much more interesting is that it avoided coupling the window manager to the compositor. Hell: there even are multiple popular compositors for X, as they also managed to avoid coupling the compositor to the display server (which would be the one part of the system that you don't find too many of -- though there were multiple implementations over the years! -- but that's not really much different than Wayland where everyone is using the same library to implement the behaviors as part of their coupled-together balls of mud.)
But why would I ever want to have a separate compositor and window manager? Like the display stack benefits from "vertical integration", being modular is a tradeoff, often of performance and significant complexity.
Why not just make a display server (which handles everything rendering related, compositing included), and then add a window manager as a plugin/extension on top? Window managers are not that complicated.
> What made the X architecture much more interesting is that it avoided coupling the window manager to the compositor
This is the industry standard, putting the compositor and window manager in separate processes.
Android separates SurfaceFlinger and WindowManagerService.
iOS separates quartz compositor and springboard.
Windows separates dwm and explorer.
MacOS separates WindowServer and Dock.
> Short term this might be a far slower and worse approach.
We are way past the short term with Wayland!
Wayland is 17 years old.
Perhaps proprietary closed source development is better for making operating systems. Is it a coincidence that Google was able to scale Linux to billions of devices while open source development efforts weren't? Open source projects should take some lessons from that if they want to be successful and not aggravate developers writing apps for their platform, like what happened in the article, by forcing them to do extra work.
If development for X is ceasing now, there isn't time to experiment on finding the true successor.
I think the hard part about the Linux desktop ecosystem and its development pattern is the cobbled-up-parts nature of the system, where different teams and individuals work on different subsystems with no higher leadership directing how all of these parts should be assembled to create a cohesive whole. We have a situation where GUI applications depended on X.org, yet the X.org developers didn't want to work on X.org any more. If the desktop Linux ecosystem were more like FreeBSD in the sense that FreeBSD has control over both the kernel and its bundled userland, there'd be a clearer transition away from X.org since X.org would have been owned by the overall Linux project. However, that's not how development in the Linux ecosystem works, and what we ended up with is a very messy, dragged-out transition from X to Wayland, complete with competing compositors.
Bazaar-style development seems to work for command-line tools, but I don't think it works well for a coherent desktop experience. We've had so much fragmentation, from KDE/Qt vs GNOME/GTK, to now X11 vs Wayland. Even X11 itself didn't come from the bazaar, but rather from MIT, DEC, and IBM (https://en.wikipedia.org/wiki/X_Window_System).
Access to virtually infinite cash had more to do with Android's success than the source being proprietary.
> it wouldn't be unreasonable to have a goal of "make it so that other devs can make their own compositors easily"
I can't say I've ever wanted a second compositor to choose from. Ideally it would just be part of the window server.
"Failures of leadership" implies that leadership actually exists. Does it?
Right, this is basically people's hobby projects. Nobody is incentivized to "lead" the Wayland project.
Actually, that'd probably be a better outcome. But as it is, Red Hat & Ubuntu et al pay people to work on Wayland and those people follow corporate priorities rather than centralized priorities.
I think Red Hat wants a working desktop but I don't think they have strong official opinions on how to get there. I think individual people are responsible for the GNOME/Wayland/Freedesktop messes.
Windows gets to completely rearchitect their compositor because they only provide one stable ABI to get pixels on the screen: link to USER32.DLL, create the necessary objects to represent a window of your application's class, then create and pump a message queue for it. It's ancient, but it works, and more specifically will never change. Even the higher level toolkits Windows ships ultimately are creating USER windows, and USER has been the only UI ABI since version 1.
macOS is the same way, except Carbon (a light modification to the procedural Toolbox API) and Cocoa (the Mac's first OOP toolkit) were "toll-free bridged" to each other rather than, say, writing Cocoa in terms of Carbon.
In contrast, X11 is a protocol anyone can implement and speak. There is no blessed library that you must use. No, Xlib doesn't count. Servers have to take their clients as they come. And Wayland, while very much deliberately stripped down from X, still retains this property of "the demarc point is a protocol" while every proprietary OS (and Android) went with "the demarc point is a library".
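To make "the demarc point is a protocol" a bit more concrete, here is a minimal sketch of what a Wayland client puts on the wire with no library involved. It assumes the documented wire format (a 32-bit object ID, then a 32-bit word carrying the message size in the upper 16 bits and the opcode in the lower 16, in host byte order) and that `wl_display` is object 1 with `get_registry` as request opcode 1. The helper names are made up for illustration, and nothing is actually sent to a compositor.

```python
import struct

WL_DISPLAY_ID = 1            # wl_display is object 1 on a fresh connection
WL_DISPLAY_GET_REGISTRY = 1  # request opcodes: sync is 0, get_registry is 1


def encode_message(object_id, opcode, payload=b""):
    """Build one wire message: 8-byte header followed by the arguments."""
    size = 8 + len(payload)
    if size % 4 or size > 0xFFFF:
        raise ValueError("message must be 32-bit aligned and fit in 16 bits")
    # "=" selects host byte order with standard sizes and no padding.
    return struct.pack("=II", object_id, (size << 16) | opcode) + payload


def get_registry(new_id):
    """wl_display.get_registry takes a single 32-bit new_id argument."""
    return encode_message(WL_DISPLAY_ID, WL_DISPLAY_GET_REGISTRY,
                          struct.pack("=I", new_id))


def decode_header(data):
    """Return (object_id, opcode, size) from the first 8 bytes of a message."""
    object_id, word = struct.unpack_from("=II", data)
    return object_id, word & 0xFFFF, word >> 16


msg = get_registry(2)
print(decode_header(msg))  # (1, 1, 12)
```

Any program that can write those twelve bytes to the compositor's Unix socket is a client as far as the server is concerned, which is exactly why servers "have to take their clients as they come".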
You're right, Xorg and X11 should be abandoned and for good reason. That should have happened decades ago. But Wayland doesn't actually fix anything that really needed fixing, other than wiping the slate clean. It's a good thing that Arcan exists, or the future of Unixland would be quite bleak.
How big is the Arcan development team? Is there any prospect of Gtk or Qt adding at least basic native support?
That raises a question: if they had that much experience, why did they choose to structure Wayland in a way that's such a PITA to write for? This just looks like a massive second-system effect.
They just decided X11 did everything wrong and did it differently, rather than picking up the pieces that work (in spirit, if not in code) and fixing the parts that don't.
I wrote an app using Wayland and XCB/X11 and honestly, I found the Wayland part to be much easier to write than the XCB part, even though it required me to write more code.
This is partly due to the fact that everything you can do with Wayland is defined in protocols that are straightforward to use, whereas in X11 you have atoms and messages with arcane names and structures for everything, lackluster documentation and terrible error handling.
Well, yes. As you say:
Q: if they had that much experience, why did they choose to structure Wayland in a way that's such a PITA to write for?
A: Because they were reacting to Xorg, so they wrote the exact opposite of that.
And for bonus points, because one of the problems they wanted to solve was "Xorg is hard to maintain", they made sure that the replacement was much much easier to maintain and develop... for them. Not for application devs, not for users, but for the folks making wayland, I have no doubt it's very well streamlined and easy to work on.
You don't see me working on this stuff, but people keep complaining about this because instead of one thing that works but is a pain, we have two things that work but are a pain. It's pretty obvious that while Xorg works for a lot of people, it's not the way forward; but I think it's apparent that Wayland might not be either... although I think it's likely some will end up running a wayland server with Xwayland as the single wayland client to get continuing driver support.
This is a lot different than say OSS vs ALSA. OSS really could have worked (and still does on FreeBSD afaik), but ALSA fully replaced OSS. I think pipewire seems likely to replace PulseAudio, even if it may not have PulseAudio's key functionality of ruining audio when things used to work just fine.
Right, I think we can all mostly agree that the old state of things wasn't great/sustainable. The problem, IMHO, is that they went hard on the second-system syndrome and went way too far the other way. This allowed them to replace a massive messy codebase with a nice clean codebase that doesn't do the things people actually need from it.
Xorg put everything - way too many features - into one single display server (Xorg). Wayland put everything in the hands of the compositor, and then spawned an endless array of them (most of them implementing only a fraction of needed features).
X11 de jure and de facto required all those features to be present. In theory you could have an X server missing new features, but there was no way to get rid of really old features, and in practice you really needed all the new ones or apps would break. Wayland made essentially everything optional, to the point of fracturing the ecosystem.
Xorg was a monolithic reference implementation. Wayland ships a reference implementation in the form of weston, and it's so feature poor as to be useless.
X11 has, in practice, really poor security. (There were/are attempts to improve this, but it's not been terribly successful.) Wayland is really big on security. So much so that they refused to implement little things like screen shots and a11y features because they could be abused.
IMHO, with hindsight, they should have done this in 2 stages: First, do the backend refactoring to get the nice driver-facing parts (GBM, AIUI). Essentially, make rootful XWayland the only Xorg, but in a way that is completely invisible to users. (Or, put differently, ship https://gitlab.freedesktop.org/wayback/wayback in 2010 instead of 2025.) Second, after you've done that and vastly simplified a huge chunk of code and made upkeep and refactoring easier, start working on X12. For the sake of argument, this can still be basically the same protocol as the wayland we actually got. However, don't actually ship that at first. Instead, go build/port an actual complete desktop environment to it, including all the features people actually want - clipboard, screen sharing, a11y and automation tools, remote desktop, etc. - and actually implement all the protocols needed for those. By all means make them optional add-ons to the core protocol, but make them up front. Also, I really recommend making one of those a window management protocol, so that 90% of window managers don't have to be a compositor, though some will. Then, after the thing is actually functional, start trying to get people to switch over. Don't start pushing people to adopt something half-baked and mess about for years on basic protocols that should have shipped day one (last I checked, in 2025 there are still 3 different incompatible wayland screenshot protocols). Make it an improvement, not a regression that only benefits you the Xorg developers.
> IMHO, with hindsight, they should have done this...
FWIW, it was also obvious to many people--certainly anyone who had ever been part of one of these big refactors before, whether as the platform or the user--that this is how it should have been done when they started... they just didn't care, and then they spent a decade both directly and indirectly (by condoning the behavior) bullying people who were concerned about the process and insisting that people who even still today have perfectly working systems were/are committing some kind of cardinal sin by not embracing the one true path of Wayland, despite regressions. It is extremely difficult to find any sympathy for the people involved :/.
> The overwhelming majority of people who worked on Xorg are now developing Wayland.
I've never seen this documented.
> It is a 400K LOC behemoth of a project and it has a ridiculous amount of technical debt.
So we have people who want to create features but do not want to pay for technical debt. So.. they create more technical debt? Is there some indication that the wisdom of the crowd is particularly valuable here?
> I would have to imagine that if the Xorg developers thought they could fix Xorg, they would do that instead of making a new thing.
It seems like all the paid developers are working on Wayland while many of the volunteers are working hard to continue Xorg despite all the sponsored efforts to artificially shutter the project.
The article author's main complaint seems to be that distributions forced users to choose between one or the other when, at this point in history, there are zero good reasons to have done that.
Open source used to be about choice. Now it's about paid interests bullying you out of that choice. And Hacker News readily defends this in the name of modernity for its own sake. It's truly a bizarre outcome to me.
The overwhelming majority of people who worked on X11 are retired and a growing minority are dead.
This is the fourth incarnation of X11 and the people working on it now have nothing to do with the people who developed it.
Xorg is the custodian group, which started life as a fork of a fork of a fork of a spinout from MIT.
Them trying to kill X11 is laughable to anyone who knows anything about its history.
Wayland on the other hand is now 18 years old and we've been told it will be good any day now for 18 years.
I mean, Wayland works fine for me. I'm using niri and an nvidia card.
There's so much hot debate about how bad Wayland is, how incorrect it is. But there's something I respect enormously about Wayland, which is that it is so so so much less than Xorg.
It uses the kernel's graphics buffers. It uses the kernel's mode setting. These alone are humongous differentiators.
There's so many other amazing glorious ways that Wayland is less. The protocol-centricity is vastly underrated, a massive win for the bazaar that can keep seeking truth versus the (imo utterly pathetic, clinging) absolutist monolith style.
It's revolting to see such persistent, bitter, angry disdain from users, without any acknowledgement at all that protocols allowing multiple implementations allow constant honing in, allow for dynamic change and evolution.
Reflecting on the Hindu Trimurti, a cycle of creation/newness, stasis/pattern, and decay & rot, it's amazing how the protest no-change/stasis-only voice has such a loud undying protest going. X is never getting better, has no room to improve, cursed by its own egocentric insanity which it has recursed into far far too far: which the core devs all agree.
It's not pleasant for everyone that Wayland allows a freedom of implementation. But generally most of the protest here has fallen away: support for major features is just here, on most implementations. That competitors can compete, don't have to keep using the same base is hugely advantageous to humanity. But the protest no-change anger-only voice is so loud. Doesn't know doesn't care.
Humanity should respect systems where competition and improvement are possible. X was a single consigned fate, with no growth or improvement. The competition of Wayland is an incredible breath of fresh air, and the growth of protocol competition here is telling, to not necessarily the "everything just works and is great" desire path of the low tech-ignorant beggar class, but which has enabled so much Bazaar democratic figuring shit out, that still shares the ideas while allowing innovation within, in a way that few projects have ever enabled before. We are in a magic age of so so much, such cooperative competitive improvement, and it's just so unspoken, so missed, amid the squeaky wheels offering no actual technical critiques, unable to reflect upon the different (much better) age of possibility the bazaar model has opened us into.
This is exactly backwards. Whenever some team that is maintaining a monolith looks at the possibility of splitting it up and going with this protocol idea, they will look at Wayland as a cautionary tale of just how badly that works out in practice.
What an utterly vacuous statement saying nothing. Making no refutable claims. Typical hot air, full of nothing. Boring as fuck, nothing here.
Further, your point is spoken from the perspective of a company, a single entity. Companies are utterly unable to bank on Bazaar practices, to embrace the multitudes way of finding answers. They lack the hackerly blood to try many approaches. They are not creative enough to do anything but build their one Cathedral.
As a company, no, you should not try to build interoperable protocols to foster internal competition on. Duh, no shit. But strong command and control, while it may be good for a company, is not going to be how a much broader ecosystem finds the best paths to take.
It's incredibly impressive how much Wayland compositors compete/cooperate for the better. Sway/wlroots, for one example, has a new Vulkan backend. They could just go try and do new things. There's protocols to implement and they made new implementations, and now there's a half dozen Wayland compositors that have new cutting edge tech they are trying out. Innovation at the edge, but working together, is the shit. Yeah it's not a model that helps the corpos, but that's because open source is searching a much wider field of options, looking much better for wins, and the cathedral model isn't going to get you any of that.
I'm still impressed by what vacuous, say-nothing, piece-of-shit, useless Fear, Uncertainty and Doubt folks can spread. This era has such a virulent pox of hatred, built around such empty words. None of these bitter words actually say anything; this whole discussion is filled with rabid useless disdain. Piss on ye, say something contestable, you villainous cowards. What you are doing is saying nothing, but trying to dynamite it all. A pox.
The calculus of what some team does is totally different than what open source does.
I don't understand a lot of the complaints. It asks for a remote connection? That's because of Xwayland (which is X11 inside), not Wayland, AFAIK. Also, about all the comments on how that is weird on a single system: mmh, the whole X server/client architecture always sounded like one was running on a remote system.
I actually like the approach that compositors are much more different from each other than WMs used to be; that allows people to experiment much more. Also let's not forget that X was a plethora of different plugins and incompatibilities. The reason many didn't encounter that was that almost everyone was running Xorg with all plugins. That said, I still remember the hoops one had to jump through to get transparency etc. You needed a compositor, and not all compositors were compatible with all WMs (and all had different capabilities).
That said I do also wish that the protocol would evolve faster. It is my impression that if it wasn't for the wlroots people not much would have happened, especially because the gnome guys seem to rather just implement something for themselves and don't try to use or push the standard.
This is analogous to calling unix account separation "fragmentation". Why can't I just run all my services as root? It has worked for years!?
The answer is that it is a fragile, unmaintainable security nightmare.
Wayland has separation of concerns to fix that problem, with the tradeoffs described in the blog post.
And yet unix account separation really did turn out to be overcomplicated and useless. Hosting providers were never able to separate untrusted users by user account, they either use VMs or containers or give up on offering shell access at all, and on home machines the whole effort falls prey to https://xkcd.com/1200/ .
The general movement of UI paradigm has been from one tech to the next with a focus on backwards compat. Almost amusingly so at times, but this is how all the earlier users and use cases can most easily progress. E.g.
* hollerith cards and sundry + printer
* printing teletype
* dumb (video) terminal
* smart (cursor addressable) terminal
* images of smart terminals
* images of smart terminals with color (businesses resisted color for years)
* ... ?
And in the meantime we have an evolution of support for modelling things visually and working with more descriptive protocols - or even function-defining protocols to raise the abstraction chatting with the display server in realtime. In this, "abstracted" means something that can be sent over the network instead of using a local buffer. These are in a less strict order than foregoing...
* text, color plotters, VDST, and all that other old slow stuff
* [skipping a bit up through bitmapped greyscale graphics]
* bitmapped color graphics
* abstracted 2D graphics (-> W and X)
* abstracted 3D graphics (OpenGL + GLX)
* dynamically client-extendable remote graphics servers (NeWS, mostly 2D)
* ... ?
So here I am, waiting for the next stage in these. Hypothesizing that finally we'll get something with 3D abstracted, network graphics (display lists in GLX but accelerated with something like XCB?), where the primary display coördinate space is (x, y, z) instead of (x, y), where the client can push some code to the remote server and raise the abstraction on the fly, finally. Where maybe we'd be able to permission the objects in that space and share it among users live. Where the 2D apps would be inside the 3D space instead of the other way around. Something for the 2000s instead of familiar abilities provided in 1990.
But instead, Wayland. Wayland, which is not backwards compatible with X. Wayland, which is 2D at its heart. Wayland, another 1990 era graphics system with a super thin offering of features for actual end users (not devs) which come at substantial cost in lost X features. Wayland, which resists the one user doing things we've long thought of as normal - in the name of "security".
Wayland is not what I've been waiting for.
Yes the international keyboard support is pretty bad in both X and Wayland. For example, try using Left Shift to switch to layout 1 (while retaining its shift functionality) without patching Gnome. It's impossible.
Or, try making a virtual on-screen keyboard that sends characters that are not in the layout (for example, a Greek character with a US keyboard layout). Again, you cannot do that, and it's difficult to understand why a virtual keyboard has to be restricted to the characters printed on the physical keyboard.
And if you want to use remote desktop from a computer with Greek layout to a computer with US layout... again, it's going to be difficult. X server-based remote apps would simply temporarily patch the layout and add non-existent keys there to be able to report the key press on a remote machine with different layout. xdotool, I think, used the same hack to input characters that are not in the layout.
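The layout-patching trick described above can be sketched roughly like this. It is a toy in-memory model, not real Xlib or xdotool code: `Keymap`, `temporary_binding`, `type_char` and the keycode numbers are all hypothetical, and the real tools go through the X server's keyboard-mapping and XTEST requests instead.

```python
from contextlib import contextmanager


class Keymap:
    """Toy server-side layout: keycode -> character (None means unused)."""

    def __init__(self, mapping):
        self.mapping = dict(mapping)
        self.sent = []  # characters delivered to the focused application

    def keycode_for(self, char):
        for code, bound in self.mapping.items():
            if bound == char:
                return code
        return None

    def press(self, keycode):
        # A key event carries only a keycode; the layout decides the character.
        self.sent.append(self.mapping[keycode])


@contextmanager
def temporary_binding(keymap, char):
    """Bind `char` to a spare keycode, then restore the layout afterwards."""
    spare = next((c for c, b in keymap.mapping.items() if b is None), None)
    if spare is None:
        raise LookupError("no unused keycode available")
    keymap.mapping[spare] = char
    try:
        yield spare
    finally:
        keymap.mapping[spare] = None


def type_char(keymap, char):
    code = keymap.keycode_for(char)
    if code is not None:
        keymap.press(code)  # character exists in the layout: plain key press
        return
    with temporary_binding(keymap, char) as spare:
        keymap.press(spare)  # otherwise patch the layout around the press


us_layout = Keymap({38: "a", 39: "s", 40: "d", 248: None})
for ch in "aλd":
    type_char(us_layout, ch)
print("".join(us_layout.sent))  # aλd
```

The point of the sketch is that the injected event never carries the character itself, only a keycode, so the sender has to bend the receiver's layout for the duration of the key press.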
The post shows a common issue with Wayland. The protocol is there, but each compositor handles things a bit differently, so tools like xdotool end up running into gaps or inconsistent behavior.
Wayland is improving, but there is still a difference between what the spec supports and what developers can rely on across the ecosystem.
A good look at why automation on Wayland still feels rough for some users.
tl;dr Wayland doesn't have a good set of universally adopted input emulation and UI automation protocols yet, which makes a portable UI automation utility with the full scope of `xdotool` impossible to write. Work remains to be done to close this gap.
The X protocols in this area were not very good, but due to there being a single viable implementation you could rely on them being present (similar to using MSIE-only features in that browser's dominant era).
At this point the Wayland project is effectively keeping desktop Linux from succeeding. It might as well have been a plant project or a strategic intelligence war from Microsoft to keep Linux on the server only.
It's a ten+ year disaster project that held desktop linux back at the precise moment of complete insanity on the part of the Windows designers with Windows 8 and the dual desktop/tiles disaster and yet-another-window-kit.
Microsoft is still pissing off its customers actively, but now we have real traction with Steam for getting gamers off of MS and onto Linux.
The opportunity is still there.
Second system effect is the curse of FOSS projects. It's been that way for decades. I don't see a reliable solution for the structural problem that doesn't somehow end up like a Benevolent Dictatorship. At the end of the day, designing complex systems by committee is hard to do. Maybe there is a maximum size of a group beyond which the communication matrix between the members starts to fracture?
I would claim the dictators -- even the "benevolent" ones -- tend to do this more often than committees, as they have more inherent power to do so: the committees tend to get stuck in backwards compatible land forever (for better or for worse). I mean, look at Larry Wall or Guido van Rossum with their respective debacles. Bjarne Stroustrup couldn't mess up in that way even if he seems to want to. As another example, HTTP only started having this problem with Google being able to railroad everyone. The only major second system effect caused by what I believe is a committee that I can easily come up with is IPv6?
IPv6 is a victim of the nature of the problem and a lot of underinformed observers. I see too many comments asking why it isn't just backwards compatible or suggesting fewer bits would make it easier.
There are real problems but really the issue is that it was a hardware and software problem wrapped into one as well as being a collective action problem.
"that doesn't somehow end up like a Benevolent Dictatorship"
Is that a problem though? If you want to get shit done, you need someone to take responsibility for the decisions. Otherwise you get design-by-committee and endless bikeshedding and software nimbyism.
I don't see how else it could work...
In my opinion, three basic things are needed:
- Device emulation: uinput covers this; requiring root is reasonable for what it does.
- Input injection. Like XTEST, but ideally with permissions and more event types (i.e. tablet and touch events.) libei is close but I think it should be a Wayland protocol.
- UI automation: Right now I think the closest you can get is with AT-SPI2, for apps that support it. This should also be a Wayland protocol.
None of these are actually easy if you want to make a good API. (XTEST is a convenient API, but not a particularly good one. Win32 has better input emulation and UI automation features IMO.)
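For the device-emulation bullet, here is a rough sketch of what actually gets written to the kernel: uinput consumes the same `struct input_event` records that evdev devices produce. The snippet only builds and decodes those records, assuming a 64-bit Linux layout (`struct timeval` as two native longs). It does not open `/dev/uinput` or do the ioctl setup, and `event`, `key_tap` and `decode` are made-up helper names. The constant values are taken from `linux/input-event-codes.h`.

```python
import struct

# Values from linux/input-event-codes.h
EV_SYN, EV_KEY = 0x00, 0x01
SYN_REPORT = 0
KEY_A = 30

# struct input_event { struct timeval time; __u16 type; __u16 code; __s32 value; }
# "llHHi" uses native sizes and alignment (assumption: 24 bytes on 64-bit Linux).
INPUT_EVENT = struct.Struct("llHHi")


def event(ev_type, code, value):
    """Pack one record; the timestamp fields are simply left at zero here."""
    return INPUT_EVENT.pack(0, 0, ev_type, code, value)


def key_tap(code):
    """Press and release one key; each state change ends with a SYN_REPORT."""
    return b"".join([
        event(EV_KEY, code, 1), event(EV_SYN, SYN_REPORT, 0),
        event(EV_KEY, code, 0), event(EV_SYN, SYN_REPORT, 0),
    ])


def decode(buf):
    """Yield (type, code, value) for each record in a byte buffer."""
    for _sec, _usec, ev_type, code, value in INPUT_EVENT.iter_unpack(buf):
        yield ev_type, code, value


print(list(decode(key_tap(KEY_A))))
# [(1, 30, 1), (0, 0, 0), (1, 30, 0), (0, 0, 0)]
```

Note what is missing: there is no notion of "which window" or "which character" anywhere in the record, which is why this level covers device emulation but not input injection with permissions or UI automation.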
Also the tangent about how crazy the compatibility layers are is weird. Yes, funny things are being done for the sake of compatibility. XWaylandVideoBridge is another example, but screen sharing is an area where Wayland is arguably better (despite what NVIDIA has to say) because you can get zero copy window and screen contents through PipeWire thanks to dmabufs.
Some of the lack of progress comes down to disagreements. libei mainly exists, by my best estimate, because the GNOME folks don't like putting things in Mutter, and don't want to figure out how to deal with moving things out of process while keeping them in protocol. (Never mind the fact that this still has to go through Mutter eventually, since it is the one sending the events anyways...) However, as far as I know, lack of progress on UI automation and accessibility entirely comes down to funding. It's easy to say "why not just add SetCursorPos(x, y)" and laugh it off, but attacking these problems is really quite complex. There was Newton for the UI automation part, but unfortunately we haven't heard anything since 2024 AFAIK, and nobody else has stepped up.
https://blogs.gnome.org/a11y/2023/10/27/a-new-accessibility-...
Color management is the perfect example of how a simple ask can be complicated. How hard could it really be? Well, see for yourself.
https://gitlab.freedesktop.org/wayland/wayland-protocols/-/m...
If Wayland lasts as long as X11 did, it's preposterous to not spend the time to try to get the "new" version of these things right even if it is painful in the meantime.
After all, it isn't like UI automation on Linux was ever particularly good. Anyone who has ever used AutoHotkey could've told you that.
This is a good and informative comment.
Wayland’s fragmentation is less about one problem and more about how the ecosystem grew. Each compositor implements only what it needs, so tools like xdotool run into gaps and inconsistent behavior.
The post highlights a real coordination issue. The protocols exist, but adoption is uneven and expectations differ across compositors. Users see small breaks and developers face a moving target.
Wayland is improving, especially with work from GNOME and KDE, but stronger shared conventions for automation and accessibility are still needed.
Good write-up that shows why experiences on Wayland vary so much depending on the compositor.