From 9c7c7b99143c1111c08e4758e4cb7a72c0bea325 Mon Sep 17 00:00:00 2001
From: "Elf M. Sternberg"
Date: Sun, 27 Feb 2022 11:02:21 -0800
Subject: [PATCH] Added mucho documentation.

---
 docs/XCB_Cheatsheet.md | 221 ++++++++++++++++++++++++++++++++++++++---
 1 file changed, 206 insertions(+), 15 deletions(-)

diff --git a/docs/XCB_Cheatsheet.md b/docs/XCB_Cheatsheet.md
index 31e07ce..abbc7e4 100644
--- a/docs/XCB_Cheatsheet.md
+++ b/docs/XCB_Cheatsheet.md
@@ -4,24 +4,90 @@ XCB is a library for communicating with the X-Windows system used on
 Linux, FreeBSD, and other Unix-like operating systems. XCB's interface
 is written in C.
 
-- Connecting, Verifying Connection, and Disconnecting the server.
-- 
+## Understanding X, at least the parts I'm dealing with.
+
+The X Window System (hereafter, **X**) is a server. It listens for
+events (keyboard events, mouse events, timer events from connected
+programs, etc.) and "stores" the results on a *display*. The
+`display` is the root of the **X** hierarchy, and the object to
+which you initially connect. The `display` is described by a string
+which encapsulates the address of the **X** server. The format of the
+string predates the URL standard.
+
+Part of the problem with untangling the hierarchy in **X** is that
+it's meant to be extremely malleable and re-usable. [The Wikipedia
+article](https://en.wikipedia.org/wiki/X_Window_System_protocols_and_architecture)
+says that the `display` has a top-level window; the [Xlib
+Tutorial](https://tronche.com/gui/x/xlib/introduction/overview.html)
+says a `screen` is a physical monitor, and a workstation can have more
+than one screen, but each screen has its own top-level window. But
+both of these statements are inaccurate!
+
+A top-level window can span multiple screens, and each screen may or
+may not have a monitor at all.
+Headless **X** systems used for
+testing may have an `output`, which is a managed region of memory into
+which **X** is "drawing", but may have no physical device at all!
+
+For the purposes of modern **X**, the server manages one or more (in
+commonplace practice, only one) logical `screen` objects, each of
+which has multiple `output` and `crtc` objects. An `output` is a video
+output on your device, such as a port on your GPU, but it may just be
+a virtualized chunk of memory for the headless scenario described above.
+The **X** server is responsible for figuring out at start time what
+`output` objects you have and what `crtc` objects are connected to
+them, assigning `crtc` devices to each `output`, and choosing a
+default set-up that will support running your
+instance of GTK or KDE or whatever.
+
+For example, in a two-monitor setup **X** will have a single `screen`
+with a single root `window` spanning both monitors, but there would be
+two `output` devices, each with its own `crtc`, mapping a client
+program's output to the physical pixels on its monitor. This is how
+it's possible to drag an application window from one monitor to
+another, and have it be visible as it crosses the bezels between them.
+The `output` and `crtc` together act as the framebuffer manager for
+displays.
+
+My goal is to enable autorotation on tablets running **X**, using the
+XCB interface. For the purposes of that fairly straightforward goal,
+I want to find the base `screen`, assert that it has a single `output`
+and a single `crtc`, and then send a command to the `crtc` object to
+rotate its contents to the orientation I desire. The change to the
+`crtc` will cause the `output` object to remap all of the pixels it is
+currently tracking to the new orientation. Most modern window
+managers are pretty good about re-arranging the screen to manage this
+change!
+
+TODO: I haven't yet figured out the bit about remapping the tablet's
+touchscreen inputs, so that when you place your finger or stylus on
+the screen **X** maps the pointer location to the right place.
 
 ## Connecting.
 
-X-Windows is a server. It listens for events (keyboard events, mouse
-events, timer events from connected programs, etc.) and "stores" the
-results on a *display*, which is intended to be seen with the human
-eye. A display is made up of one or more *screens*. Screens can be
-literal (one of the physical devices in a multi-monitor setup) or
-virtual (a virtualized desktop where the window manager supports
-different "pages" on the same monitor), or even just parts of the same
-physical screen space broken up by some logic.
-
-To connect to X via XCB, you use the `xcb_connect` function. It takes two
-arguments, a string with the name of the display, and a
-pointer-to-int to the preferred screen. It returns an opaque data
-structure, 'xcb_connection_t'.
+To connect to **X** via XCB, you use the `xcb_connect` function. It
+takes two arguments: a string with the name of the display, and a
+pointer-to-int through which XCB returns the default screen's ID.
+It returns an opaque data structure, `xcb_connection_t`.
 
 ```
 xcb_connection_t* xcbConnection = xcb_connect(const char* display, int* screen);
@@ -50,4 +116,129 @@ free the memory XCB used to report the connection failure:
 void xcb_disconnect(xcb_connection_t* xcbConnection);
 ```
 
+## Getting the default screen structure
+Once you've connected, you need the default `screen` structure.
+Oddly, it sits somewhere in a list of `screen` structures, and you
+have to find it by counting down from the default screen ID retrieved
+during the connection, using XCB's supplied traversal functions.
+
+Doing this is so commonplace that there's a function for doing it
+provided by the `xcb-aux` utility library, which on Ubuntu is
+installed through `libxcb-util-dev`:
+
+```
+xcb_screen_t* screen = xcb_aux_get_screen(connection, default_screen_id);
+```
+
+The screen structure is not opaque; it contains the root window, as
+well as the width and height of the total display (covering all
+monitors!) that it is expected to manage, along with other details
+that, well, aren't relevant (at least, not yet, and maybe I hope not).
+
+## Getting the screen's resources
+
+Now we're getting into RandR's portion of the business. RandR is an
+extension to **X** that allows userspace programs to manipulate the
+core functionality of outputs and displays. Those are the resources
+the **X** screen has to draw on. This is also the first time we're
+going to encounter the "standard" XCB interface.
+
+XCB has an idiom of sending a request to the **X** server and storing
+a token (a small struct wrapping the request's sequence number) that
+it calls a *cookie*. When your program wants the reply to that
+request, it asks for the reply using the cookie.
+
+It's possible to send many requests, both commands-to-set and
+requests-for-information, to the **X** server, and then retrieve the
+replies all at once. If the XCB interface has already received a
+reply, it can hand it over at once; otherwise, it blocks until the
+reply arrives. In this way, XCB and **X** can work asynchronously,
+batching transactions and reducing latency.
+
+For our purposes, we're not going to do that. We're just going to ask
+for one object. The `get` command uses the root window, which as I
+mentioned earlier is on the `screen` we retrieved as `screen->root`.
+
+```
+xcb_generic_error_t* error = nullptr;
+
+xcb_randr_get_screen_resources_cookie_t screen_resources_cookie =
+    xcb_randr_get_screen_resources(connection, screen->root);
+
+xcb_randr_get_screen_resources_reply_t* screen_resources =
+    xcb_randr_get_screen_resources_reply(connection, screen_resources_cookie, &error);
+```
+
+**IMPORTANT** `reply` objects are allocated by XCB. *You* are
+responsible for `free()`ing them afterward. If there is an error, the
+reply object will be `NULL`, but the error object will contain the
+error response as an allocated object and you are responsible for
+`free()`ing it. Only `reply` and `error` objects are allocated; all
+the rest are part of the `connection` object and will be freed on
+disconnect.
+
+## Getting the outputs and crtcs
+
+Now that we have the screen's resources, we want to get all the
+`output` devices, find the `crtc` associated with each, and see the
+rotation! As I've been using C++, I'm going to store a collection of
+cookies in a `std::vector`.
+
+There are two idioms in this example; the first collects all the query
+cookies and then iterates through the replies; the second gets the
+query cookie and immediately requests the reply. While the second
+idiom is "slower" by an order of magnitude, on my laptop with a
+shared-memory connection the difference is 3000 nanoseconds vs 300
+nanoseconds, not enough for most people to notice.
+
+Notice that I do *not* free the return from the
+`xcb_randr_get_screen_resources_outputs` call; that is simply
+interpreting the contents of the `screen_resources` object. As
+before, the `screen_resources` object itself will have to be freed
+eventually.
+
+```
+std::vector<xcb_randr_get_output_info_cookie_t> output_get_cookies;
+std::vector<xcb_randr_crtc_t> output_crtc_ids;
+
+xcb_generic_error_t* error = nullptr;
+
+// Assumption: reuse the configuration timestamp from the resources reply.
+xcb_timestamp_t timestamp = screen_resources->config_timestamp;
+
+int len = xcb_randr_get_screen_resources_outputs_length(screen_resources);
+xcb_randr_output_t* randr_outputs =
+    xcb_randr_get_screen_resources_outputs(screen_resources);
+
+// First idiom: send every request, then collect the replies.
+for (int i = 0; i < len; ++i) {
+    output_get_cookies.push_back(
+        xcb_randr_get_output_info(connection, randr_outputs[i], timestamp));
+}
+
+for (const auto& cookie : output_get_cookies) {
+    xcb_randr_get_output_info_reply_t* reply =
+        xcb_randr_get_output_info_reply(connection, cookie, &error);
+    if (!reply) {
+        free(error);  // free(NULL) is harmless if no error was reported
+        continue;
+    }
+
+    // Second idiom: send the request and wait for its reply immediately.
+    xcb_randr_get_crtc_info_reply_t* crtc = xcb_randr_get_crtc_info_reply(
+        connection,
+        xcb_randr_get_crtc_info(connection, reply->crtc, timestamp), NULL);
+
+    // It's possible that there is no CRTC associated with the
+    // output. This isn't an error.
+
+    if (!crtc) {
+        free(reply);
+        continue;
+    }
+
+    std::cout << "(x: " << crtc->x << ", y: " << crtc->y << ") (width: " << crtc->width
+              << ", height: " << crtc->height << ") status:" << unsigned(crtc->status)
+              << " rotation: " << rotation_map(crtc->rotation) << std::endl;
+
+    free(crtc);
+    free(reply);
+}
+```
+
+... this is as far as I've gotten. And it's all starting to make
+sense, but wow, what a journey just to get this far.
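The listing above calls a `rotation_map` helper that this patch never defines. Here is a minimal sketch of what such a helper might look like, assuming it takes the `rotation` bitmask from the CRTC info reply and returns a printable string. The name, signature, and output strings are guesses. The constants are local stand-ins that mirror the RandR protocol's rotation bits (the values behind `XCB_RANDR_ROTATION_*` in `xcb/randr.h`), so the sketch compiles without the XCB headers.

```cpp
#include <cstdint>
#include <string>

// Local stand-ins mirroring the RandR rotation and reflection bits.
constexpr std::uint16_t kRotate0   = 1;
constexpr std::uint16_t kRotate90  = 2;
constexpr std::uint16_t kRotate180 = 4;
constexpr std::uint16_t kRotate270 = 8;
constexpr std::uint16_t kReflectX  = 16;
constexpr std::uint16_t kReflectY  = 32;

// Hypothetical helper: describe a rotation bitmask as readable text.
std::string rotation_map(std::uint16_t rotation) {
    std::string result;
    if (rotation & kRotate0)        result = "0 degrees";
    else if (rotation & kRotate90)  result = "90 degrees";
    else if (rotation & kRotate180) result = "180 degrees";
    else if (rotation & kRotate270) result = "270 degrees";
    else                            result = "unknown";

    // Reflection flags can be combined with any rotation.
    if (rotation & kReflectX) result += ", reflect-x";
    if (rotation & kReflectY) result += ", reflect-y";
    return result;
}
```

In real code it would probably be cleaner to include `xcb/randr.h` and use the `XCB_RANDR_ROTATION_*` enumerators directly instead of the stand-in constants.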