What happens when you type in google.com and press enter?
That used to be one of the interview questions I asked, and it’s still one of my favorites; the idea is to talk a candidate through parts of their chosen technical stack. We usually spend most of our time on the front and back ends of a typical web stack, but once in a while we venture into the OS layer or web protocols[1]. It’s easy to forget how much needs to happen for the Google homepage to appear on your screen.
On a distant tangent, ever since I started building mechanical keyboards and messing around with various hardware components, and now the firmware, I’ve been thinking about an analogous “keypress stack” as it relates to keyboards. Specifically, what happens when you press a key on a keyboard, and how can you customize that to your liking along the way?
Here’s my version of that answer—to the best of my understanding, so far. Note that since I do most of my work and personal computing on a Mac, the software stack will be primarily focused on that platform. For most of the tools I mention, there are either Windows/Linux versions or equivalents.
At the hardware level, we have the physical switch and the electric signal that, when triggered, is sent to the printed circuit board (PCB). For mechanical keyboards, there’s a huge variety of switches to choose from, many with their own [characteristics](https://www.theremingoat.com) in typing feel and physical activation. They’re socketed or soldered onto a PCB, which doesn’t allow for too much creativity[2], though some boards allow for fancier controls like knobs and sliders.
Keyboard firmware is highly customizable. I’ve been messing around with ZMK (which powers the Advantage360 Pro) and QMK (which powers the Lily58 split keyboard I built over Thanksgiving), and both offer plenty of ways to tweak keyboard behavior well beyond mapping a key to a keycode, provided that you’re comfortable writing and compiling C code. LEDs are fully controllable; key tap-and-hold combinations can be enabled with C macros; and you can hook directly into pre- and post-keypress events to modify behavior. The firmware code itself is admittedly quite messy (it reminds me of PHP spaghetti mixed with JS callbacks), but it’s easy enough to make tweaks and flash the keyboard to test changes. My keyboard has 2 little OLEDs on it, and with copious amounts of copying and pasting others’ work, I have Bongo Cat smashing a table in time with my typing speed.
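To make the tap-and-hold idea concrete, here is a minimal sketch in Python rather than firmware C. It is a toy model, not QMK or ZMK code: `TapHoldKey`, `resolve_tap_hold`, and the 200 ms threshold are all names and values I made up for illustration, and real firmware also considers things like other keys pressed in the meantime.

```python
from dataclasses import dataclass

# Assumption: 200 ms is an illustrative threshold, not a value taken from any config.
TAPPING_TERM_MS = 200


@dataclass(frozen=True)
class TapHoldKey:
    """A key that emits one code when tapped and another when held."""
    tap_code: str
    hold_code: str


def resolve_tap_hold(key: TapHoldKey, pressed_at_ms: int, released_at_ms: int,
                     tapping_term_ms: int = TAPPING_TERM_MS) -> str:
    """Pick the tap or hold code based only on how long the key was down."""
    if released_at_ms < pressed_at_ms:
        raise ValueError("release cannot precede press")
    held_for = released_at_ms - pressed_at_ms
    # Shorter than the tapping term counts as a tap; anything else is a hold.
    return key.tap_code if held_for < tapping_term_ms else key.hold_code


key = TapHoldKey(tap_code="ESC", hold_code="LCTRL")
print(resolve_tap_hold(key, 0, 120))  # quick press and release
print(resolve_tap_hold(key, 0, 350))  # key kept down past the threshold
```

The only real design decision here is the threshold: too short and deliberate taps register as holds, too long and modifiers feel laggy.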
The BIOS, kernel, and operating system can tweak a key input. The keyboard input travels, usually via Bluetooth or USB, to the computing device, where it’s interpreted by the BIOS, then by the operating system at the kernel and user space levels. There usually isn’t much for end users to modify at this layer; I’ve generally only seen hardcoded settings to toggle modifier keys (e.g., ⌃, ⌘, ⇧ and the like) and to disable Caps Lock, since it’s no longer that useful in modern word processing but occupies a convenient position on most keyboards.
Within an OS, background processes can intercept an incoming key and modify its behavior. On macOS, I’ve been using Karabiner-Elements to set up more complex modifier key changes; on Windows, AutoHotkey can perform a similar function and is popular for setting up macros independent of OS or hardware. There are also several utilities to manipulate text via expansion or replacement: TextExpander is one of the most popular cross-platform solutions, but I’ve been using a much simpler app called aText. My clipboard manager is Pastebot, which isn’t the newest app on the block but remains solid on the basics.
Other helpful utilities run agents in the background to catch universal shortcut key combos that activate their apps. I use Moom, for instance, to add a layer of window management in macOS, as the OS default is still lacking after all these years. Another recent favorite is Raycast: since it can define and capture keyboard shortcuts across apps and its own plugins, its ecosystem has replaced most of my other legacy utilities.
Finally, the active app itself gets to interpret the keypress. There is a set of conventional, universal macOS shortcuts that appear in the menu bar and can be customized within the System Preferences/Settings app. If the keypress hasn’t been intercepted as a shortcut key combo or reinterpreted by any of the layers of the stack above, then the active app receives the input and does something with it, perhaps rendering the corresponding character in a text field.
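One way to picture the software side of the stack is as a chain of layers, each of which can rewrite a keypress, swallow it, or pass it along. The sketch below is purely a mental model in Python: the layer functions, the chord representation, and the ⌘ + Space hotkey are hypothetical and don’t correspond to any real macOS API or to how the utilities above are implemented.

```python
from typing import Callable, Optional

# A keypress is modeled as a frozenset of key names, e.g. {"CTRL", "TAB"}.
Chord = frozenset
# A layer returns a (possibly modified) chord, or None if it consumed the keypress.
Layer = Callable[[Chord], Optional[Chord]]


def remap_caps_to_ctrl(chord):
    """OS-settings-style remap: treat Caps Lock as Control."""
    return frozenset("CTRL" if k == "CAPS" else k for k in chord)


def launcher_hotkey(chord):
    """Background utility: swallow its global shortcut, pass everything else on."""
    if chord == frozenset({"CMD", "SPACE"}):
        return None  # consumed; the active app never sees it
    return chord


def dispatch(chord, layers):
    """Run a chord through each layer in order and report where it ended up."""
    for layer in layers:
        result = layer(chord)
        if result is None:
            return f"consumed by {layer.__name__}"
        chord = result
    return "active app received " + "+".join(sorted(chord))


stack = [remap_caps_to_ctrl, launcher_hotkey]
print(dispatch(frozenset({"CAPS", "A"}), stack))
print(dispatch(frozenset({"CMD", "SPACE"}), stack))
```

Ordering matters in this model: a remap placed before a hotkey listener changes what that listener sees, which is exactly why the same shortcut can behave differently depending on which layer you configure it in.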
So how does this all play out in practice? Here’s one combination I do use—one of the more convoluted ones:
- I hit the lower left key on my Lily58 keyboard.
- The key is programmed as a mod-tap key[3], and I have the key sending a RIGHT⌃ keypress when tapped.
- In addition, I’ve hooked into the tapping event itself and set up a one-shot mod[4], so it’s waiting for my next keypress to send the full modifier + key combination.
- I hit the ⇥ key, which in sum sends ⌃ + ⇥ to the OS.
- Karabiner-Elements translates ⌃ into ⌃ + ⇧ + ⌥ for me[5]; it’s the same as pressing ⌃ + ⇧ + ⌥ + ⇥ with 4 fingers. This key combo is captured by aText and triggers an inline macro search in the current text field.
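The steps above can be walked through with a small simulation. This is a sketch under stated assumptions: `OneShotKeyboard`, `expand_ctrl`, and the `HOTKEYS` table are hypothetical stand-ins for the firmware, the Karabiner-Elements rule, and aText respectively, and none of them reflect how those tools actually work internally.

```python
class OneShotKeyboard:
    """Toy firmware model: tapping a one-shot modifier arms it for the next key only."""

    def __init__(self):
        self.pending_mods = set()

    def tap(self, key, one_shot=False):
        """Return the chord sent to the OS, or None if the tap only armed a modifier."""
        if one_shot:
            self.pending_mods.add(key)
            return None
        chord = frozenset(self.pending_mods | {key})
        self.pending_mods.clear()  # the armed modifier applies once, then resets
        return chord


def expand_ctrl(chord):
    """OS-side remap: wherever CTRL appears, add SHIFT and OPT alongside it."""
    return chord | {"SHIFT", "OPT"} if "CTRL" in chord else chord


# Hypothetical hotkey table standing in for the text-expansion utility.
HOTKEYS = {frozenset({"CTRL", "SHIFT", "OPT", "TAB"}): "inline macro search"}

kb = OneShotKeyboard()
kb.tap("CTRL", one_shot=True)   # arms the modifier; nothing is sent yet
sent = kb.tap("TAB")            # firmware now sends CTRL + TAB
action = HOTKEYS.get(expand_ctrl(sent), "pass through to app")
print(sorted(sent), "->", action)
```

Two taps on the keyboard end up as a four-key chord by the time the utility sees it, with each layer contributing one small transformation.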
It took me several years of using macOS to configure the right utilities and system settings; the introduction of programmable mechanical keyboards added another layer of complexity and customizability. There’s also some overlap in functionality between different layers of the stack, which gives you options: where you implement a macro[6] or a remap depends on whether you want that config to apply per keyboard, per operating system, per computer, or even per user login.
And much like other hobbies, there’s a point of diminishing returns, where getting the keypress stack tuned exactly right isn’t going to increase productivity so much as it is just fun to explore what’s possible. This is a good local maximum to stick with, at least until there’s another interesting keyboard or utility to try out.
[1] That said, I’ve never had anyone talk through the hardware stack or the browser rendering engine.
[2] Though I’ve seen some pics of folks creatively soldering a component to make electric contact and save a potentially ruined board.
[3] This is where the key behaves one way when held, usually acting as a modifier, and another way when tapped.
[4] One-shot modifiers let you apply the modifier with one keypress, and the key with a second one, e.g., instead of holding ⇧+k, you can tap ⇧ then k and get the same output.
[6] Some people seriously hardcode URLs as shortcut keys in keyboard firmware.