Buttons as Finite Automata (stanford.edu)
229 points by picture on March 24, 2022 | 79 comments



I am an embedded software engineer, and I love everything about this FSM example. It shows that seemingly simple and binary things often have more complexity than meets the eye. I love using old-fashioned finite state machines for implementing all kinds of behavior; it forces you to think about every single state and transition, and it naturally isolates the different cases - basically, it makes it easier to think and talk about the implementation.

Buttons get even more fun when you add a "long push" to it. I cannot recall how often I have had discussions with customers about the intended behavior of physical buttons on devices; it takes some effort and patience to explain how requirements like "Perform action A when the button is pushed, or perform action B when the button is pushed for 3 seconds" simply cannot be implemented. "Oh, so you mean, perform action A when the button is released within 3 seconds?". "Well, no, I want it to do A when I push the button".

And then I draw a little picture on the white board showing the circles and arrows.


This isn't quite as impossible as you make it sound. You can certainly make it feel like you have this behaviour by triggering A on release if the button is held for < ~300ms, or whatever feels right as a short press.

Of course now you have to accept the button doing nothing if it's held for longer than 300ms but shorter than 3s, but making it slowly fill up with another colour or something is probably enough to communicate the long press behaviour.
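
A minimal sketch of that compromise, assuming the ~300ms and 3s thresholds mentioned above and a periodic `tick` event standing in for a hardware timer. The names `createPressDetector`, `SHORT_MS` and `LONG_MS` are hypothetical, not from any real firmware:

```javascript
const SHORT_MS = 300;  // assumed "short press" limit
const LONG_MS = 3000;  // assumed "long press" threshold

function createPressDetector() {
  let downAt = null;     // timestamp of the current press, or null while the button is up
  let longFired = false; // true once B has been reported for the current press

  return {
    // Feed one event ("down", "tick" or "up") with a timestamp in ms.
    // Returns "A", "B", or null when nothing should happen.
    handle(type, timeMs) {
      if (type === "down") {
        downAt = timeMs;
        longFired = false;
        return null;
      }
      if (downAt === null) return null; // tick/up without a press: ignore

      const held = timeMs - downAt;
      if (type === "tick") {
        if (!longFired && held >= LONG_MS) {
          longFired = true;
          return "B"; // fires while the button is still held
        }
        return null;
      }
      if (type === "up") {
        const wasLong = longFired;
        downAt = null;
        longFired = false;
        if (wasLong) return null;           // B already happened
        return held < SHORT_MS ? "A" : null; // the dead zone does nothing
      }
      return null;
    },
  };
}
```

Feeding it `down` at 0 and `up` at 200 yields `"A"`; holding through a `tick` at 3000 or later yields `"B"` once, and the release after that yields nothing. No step needs to know the future, because A is only decided on release.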


You: "So you want action A to happen immediately when the button is pushed AND 3 seconds after that when the button is still held action B should happen?"

Customer: "No, I only want action B when it is held for 3 seconds"

You: "Today's technology can't do that"

Customer: "What. Wait. Why?"

You: "Because at the point when action A would be triggered we would have to know the future to tell whether the button will have been held for 3 seconds or not."


PM: "Let's not rush into any hasty answers!"

(Under no circumstances should you show them this: https://www.youtube.com/watch?v=BKorP55Aqvg)


This is entertaining, but also cringe; clients and product people aren't generally idiots. They just have requirements that they can't adequately express.

Requirements capture is part of every eng job I've had.


They aren't generally idiots but sometimes their mental model of the problem space is incomplete and they don't understand what they're asking for. This is fine if they don't want to specify details. However when they do it can result in impossible requests.


Agreed, but as a former freelancer: Sometimes your customer doesn't know what they want, and instead of figuring it out together they have sketched out an internet-research-fueled plan that they want you to follow to the letter, even if the plan is inefficient, doesn't solve their problem, etc.

I usually managed to convince those people to come up with a new plan together while making sure they still felt like their original work was somewhat in there. After all, I was the expert they came to with their issue; it would be a bit idiotic to not pay for my expertise.

Sometimes this did not work; then I usually just told them I wouldn't take that project. Projects like these don't make any sense: the customer will complain about their own planning mistakes as if they were your fault, you get angry, they get angry, everybody loses.

The best customers are those who know the problem they want to solve very well, as well as having some idea how a potential solution could look, but who thank you if you have an even better solution.


I hear you and I understand what you're saying, those are some good points. Can you express what you require out of a non-cringe comedy skit that this lacks?


He should be fired for not suggesting a 7-dimensional plot.


Lie about immediate effect, wait 1.5sec on press A. If it was <1sec press, proceed. Otherwise, wait for another 1.5sec and then work on B modality.

It depends on the lie, but I think it probably fits human-centric timescale.

The alternative is to work on the partially ordered set of events {A, B, undo-A} and convert to {A, undo-A}, {B} and then transform to {no-op, B}

Hysteresis/Queue is your friend.


> "Oh, so you mean, perform action A when the button us released within 3 seconds?". "Well, no I want it do A when I push the button".

It seems like what's gone wrong here is not that the customer wants something impossible, but that they don't understand what you mean when you say "perform action A when the button is released".

If you show one of those people a device with the behavior you describe -- e.g. "toggle a light if the button is released within 0.5 seconds, but toggle the other light, instead, if it's pressed and held for 3 seconds" -- would they agree that it does what they say they want?

Almost everyone understands "pushing a button" to include the action of releasing the button.


Been reading a few books about automata (after going back into parsing) and it's indeed nice. There's something universal in them.


Is there a (mathematical if possible) way, using FSMs, to demonstrate whether a specific case or behavior can or can't be implemented?


Yes there is. You may want to look into Deterministic and Non-Deterministic Finite Automata. They are foundational to computability and Turing machines.

https://en.wikipedia.org/wiki/Deterministic_finite_automaton...


You are designing a "prefix code". It is a collection of strings of button presses such as {A, BA, BBAA, BBABA, BBB, ...}. Each string maps to an action.

Ideally no string is a prefix of another string. Even more ideally any infinite sequence of As and Bs can be decoded into a sequence of actions.
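
The "no string is a prefix of another" condition is easy to check mechanically. A small sketch, with `findPrefixConflict` as a made-up helper name and the strings taken from the example set above (leaving out whatever the "..." stands for):

```javascript
// Returns a pair [shorter, longer] where `shorter` is a prefix of `longer`,
// or null if the set of codes is prefix-free.
function findPrefixConflict(codes) {
  // After sorting, a string that is a prefix of any other string is also
  // a prefix of its immediate successor, so checking neighbours is enough.
  const sorted = [...codes].sort();
  for (let i = 0; i + 1 < sorted.length; i++) {
    if (sorted[i + 1].startsWith(sorted[i])) {
      return [sorted[i], sorted[i + 1]];
    }
  }
  return null;
}
```

`findPrefixConflict(["A", "BA", "BBAA", "BBABA", "BBB"])` returns `null`, while adding `"BB"` to that list reports the pair `["BB", "BBAA"]`: after two Bs the device could not tell whether to act or keep waiting.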


This is something I created many, many years back for Stanford’s CS103 course as a lecture demo. Apologies for the lack of mobile support - I’ve always presented this from my laptop. :-)


Woah. This is what I find seriously cool about HN. Was searching around for "push button+state machine" the other day to learn how to program some menus with a microcontroller, and came across this. It's a really neat visualization, thanks for your work years ago!


There appears to be a mistake in the automaton diagram (not operation). The start state is "Idle, Up", and assumes the mouse is presently outside the button. But after doing "Click to Reset", you're already inside the button and so cannot enter it. Thus you should actually, and unfortunately, be in "Idle, Up". The demonstration gets around this by magically jumping to "hover", but this violates the whole educational point of what a start state is.

I know this is nitpicky, and yes, I see that "fire!" is an accepting state, but maybe you could have a special transition from "fire!" to "hover" labelled "reset".


Well...

You've made a couple mistakes here.

1. You've made the unfounded leap that the diagram intends to represent state after the button has fired. That's manifestly incorrect.

2. You've made the incorrect assumption that a full reset of the diagram is accomplished by clicking the button. But that's not what it says. Obviously, you also need to move the mouse out of the button to return to the start state. So you could criticize this for incomplete instruction; however, since the context is a demo for a college course, omitting over-instruction of obvious and irrelevant details is most likely the best decision to reach the overall instructional goal.

If you want to be pedantic on the internet, you need to go a bit deeper than that. ;-)


Just think of it as throwing the old button away and buying a new one. Buttons come idle from the factory. Once the old button is hauled away and the new one is installed under where your cursor happens to be, it transitions to hover.


Beautiful.

Small suggestion for improvement: There are really two boolean variables (inside the button area, or not; mouse button down, or not). They give rise to more than 4 states because the order of actions is important.

But the diagram might be made more intuitive by placing the states within a 2x2 chess board corresponding to in/out, up/down. Not much change is necessary (for example: shift "held outside" left, "pressed" down, and "fire" up).
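
One way to see why two booleans are not enough: a sketch that keeps the two visible variables and adds a single bit of history, `armed`, recording whether the current press started inside the button. The field and function names are my own, and the event names (`enter`, `leave`, `press`, `release`) are assumed rather than taken from the demo's source:

```javascript
const START_FLAGS = { inside: false, down: false, armed: false };

// Apply one event; returns the next flags plus whether the button fired.
function stepFlags(flags, event) {
  switch (event) {
    case "enter":
      return { flags: { ...flags, inside: true }, fired: false };
    case "leave":
      return { flags: { ...flags, inside: false }, fired: false };
    case "press":
      // The press only "arms" the button if it starts inside the button area.
      return { flags: { ...flags, down: true, armed: flags.inside }, fired: false };
    case "release":
      // Fire only when an armed press is released inside the button.
      return {
        flags: { ...flags, down: false, armed: false },
        fired: flags.inside && flags.armed,
      };
    default:
      throw new Error(`unknown event: ${event}`);
  }
}

// True if the event sequence, starting from "idle, up", ever fires the button.
function firesWithFlags(events) {
  let flags = START_FLAGS;
  for (const event of events) {
    const result = stepFlags(flags, event);
    if (result.fired) return true;
    flags = result.flags;
  }
  return false;
}
```

`firesWithFlags(["enter", "press", "release"])` is `true`, but `firesWithFlags(["press", "enter", "release"])` is `false` even though both end with the cursor inside and the mouse button up. As far as I can tell, the six reachable combinations of `inside`/`down`/`armed` line up with the six non-accepting states in the diagram.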


Just wanted to say thank you so much for your incredible lectures! - A random student who took CS103 with you in Fall 2019


Right click locks you in the "press" state, just so you know.


No repro here (Chrome 99 on Windows): right-clicking the button works the same as left-clicking it (yes, there's my browser's context-menu, but it doesn't seem to interfere with the webpage's logic at my end).


On Chrome on Mac I get the same, and right clicking outside the button makes it stick in the (idle, down) state, hovering makes it move between (hover, pressed) and (idle, down).

Still a crazy cool demo though. Having worked on ui for my hobby games, I always knew it was more complex than just onClick, but this makes me understand it perfectly. I'm probably going to implement a model based on this in my current project.


I can make it do this on Firefox on Ubuntu; the key is to hit Escape to dismiss the context menu, instead of clicking away. Closing it by clicking away makes it act correctly.


Safari on a mac. The context menu pulls up, but even when dismissed, it's still stuck in the "idle, down" state.


Repros on Chrome/Mac as well--I think it's a macOS issue.


Not sure why it seems macOS related, but same thing here on Mac (Safari and Firefox).

I added a `#button:active` style to see if it was actually leaving it activated after dismissing the context menu, but apparently no. Not sure what the cause is.


nice! how about something like that for drag/drop? anyone? :)



Neat.

...how do I get the button into the "idle, down" state? (i.e. how do you press a button without first hovering over it? I tried with [Tab] and spacebar to activate it, but that didn't do anything).

------

I'm thinking this page could likely be reimplemented without any JavaScript, using only CSS' interaction state pseudo-classes (`:active, :hover, :focus`, etc) as proxies for the button's state (with the sibling `~` selector to control the state-diagram image). Hmm, though the "held outside" state might be difficult (perhaps `body:hover button:not(:hover):active ~ #stateImage` would work for that?).


> how do I get the button into the "idle, down" state?

Click and hold outside of the button, and drag over it.


I figured this out too, but I wonder whether the button is strictly the FSM if the mouse state is also being tracked. In the example, yes; if there were more than one button, probably not. But the idea is brilliant and probably worth playing with.


just hold down the mouse outside the button.


aaaahh! now I feel silly.

However, the page shouldn't be doing that because the state of the <button> is not actually affected by the mouse-pointer's state, so that entire "idle, down" state node shouldn't be there at all (nor "hover, pressed", as browsers use the same `:hover:not(:active)` state for the <button> as "hover" in that situation).


No, it needs to do that. When you click the mouse down outside the button, then enter the button area and release, you do not want to fire. So you need those extra states.


> When you click the mouse down outside the button, then enter the button area and release, you do not want to fire

If you're referring to not wanting the `mouseup` event to fire, you are correct - however web-pages should not be using `mouseup` / `mousedown` to detect <button> clicks in the first place: they should be listening to only the `click` event, which is not raised when the user releases a held-down mouse button anyway. And the `click` event is also raised when the <button> is activated by other means, such as the spacebar which makes it more accessible too.

When you have the page open, open your console and run this:

`document.querySelector('input[type=button]').addEventListener('click', e => console.log("button 'click' event") )`

and then do the mousedown-and-hover-and-release described and you won't see anything logged until you really do click on it.


What's also interesting to realize is that the standard HTML button is a sort of unholy chimera that doesn't model any sort of physical button in common usage.

Physically speaking, we generally have either:

1. a "push" button, which activates when you press it down fully, or

2. a "hold" button, which activates continuously when you hold it down.

In either case, the "action" occurs at the instant you press the button down, and perhaps continues until you let go.

But the standard HTML button works differently, and from a physical perspective, quite weirdly: when you press it, it only gets "cocked", and when you release it, it activates. That's why its state representation feels complex and unintuitive!

Rather than the behavior of a "push" or "hold" button, the standard HTML button behavior is more like extracting an SD card: you press the card in and then release it to get it out.

(I'm sure there are physical buttons out there somewhere that activate on release. And you can also make a great model of a "push" or "hold" button with custom JS. But I think it's fair to say that the default HTML button doesn't work like real-world experience would lead anyone to expect.)


To be fair, it's not just HTML: "Touch Up Inside" is the usual button press event to subscribe to on iOS/macOS, and to my personal knowledge this has been true on iOS from day 1 (unsure about macOS or at least NeXTSTEP), so there is an argument that it is at least somewhat normalized, though your point about it not matching up with most IRL buttons makes sense.

However it's also worth noting that IRL buttons tend to require more than a feather's touch to detect a press, whereas there is no such margin on monitors/pointing devices.


Also shows one of the fun aspects of many finite automata I encounter...

If you right click it, it gets stuck thinking it's "pressed" until you click somewhere.

As much as I like FSM in principle, I feel like substantially more than half that I've encountered in software have either failed to accurately model how a system actually behaves, or have been missing core states that are a natural fit for real-world use. It has grown to become a fairly accurate predictor for me that a system is going to have problems due to over-simplification and lack of flexibility.

Which is not a FSM problem obviously, but the correlation is astoundingly strong. I'm not sure why. They obviously work out fine in many places. Maybe software uses I've run across are just way more complicated than can be accurately understood from a visual diagram? This one certainly seems reasonable and complete, yet...


Inadequate use of FSMs is what scares me a lot, especially in embedded development. The current state in software should be the result of sensory information whenever possible. Or at least state consolidations based on sensory information have to be thought through rigorously. Is there any literature on the topic?


What I love about this is back in the archaic days when car radios had (mechanical) radio buttons, that was the introductory example of a tiny state machine. They were also used in other devices but the car radio was the application everybody had seen.

Nowadays I imagine only a small proportion of people who deploy "radio buttons" have even seen such a car radio, so the metaphor is now empty, like the floppy disk icon for "save".


Random question, were physical radio buttons ever circular, or are they circular on computers only to differentiate from a check box? I've only ever seen rectangular radio buttons IRL, and a quick image search seems to agree.

https://www.knowahead.in/wp-content/uploads/2012/05/car-radi...

https://41.media.tumblr.com/tumblr_mbyb9qOw8H1rg5mmto1_1280....

https://i.pinimg.com/originals/10/d6/86/10d686376ae39772ac4e...


I only remember rectangular ones on actual car radios, but on other apparatus they were mostly circular.

Basically a "radio button" interface is a mechanical XOR. So for example to route a signal to destination A, B, or C you'd want to push the A button and be sure B and C weren't selected. Often it mechanically latched, so you could also see at a glance which option was selected, rather than needing to have an indicator (typically a small incandescent bulb).

Radios themselves implemented that mechanism slightly differently. Each button had a stop (like a tab stop, not an organ stop). When you pushed the button it disabled all the other stops and then either a spring pulled the arm right to the stop or your own muscle power pulled it left. A pulley rotated the tuner dial as the arm moved right or left. I believe the buttons were rectangular in order to have enough surface area for your finger to supply a firm enough press to pull the tuner arm all the way from right to left (the extreme case). If they'd been round they would have taken up too much vertical space.

The radios typically had five or six favorites, no more. The entire mechanism was mechanical.
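
The mutual-exclusion part is simple to model in software. A rough sketch (the name `createRadioGroup` and its methods are invented for illustration): keeping a single `latched` value, rather than one boolean per button, means two buttons can never be down at once.

```javascript
function createRadioGroup(labels) {
  let latched = null; // at most one button is down at any time

  return {
    // Pushing a button latches it and implicitly releases the previous one.
    press(label) {
      if (!labels.includes(label)) {
        throw new Error(`unknown button: ${label}`);
      }
      latched = label;
    },
    isDown(label) {
      return latched === label;
    },
    // Snapshot of every button, keyed by label, true for the latched one.
    snapshot() {
      return Object.fromEntries(labels.map((l) => [l, l === latched]));
    },
  };
}
```

After `press("A")` followed by `press("B")`, `isDown("A")` is `false` and `isDown("B")` is `true`, which is the "see at a glance which option is selected" property of the latching hardware.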


Huh. I never understood the terminology "radio button", and I have no memory of noticing such a thing in a car. But it could easily have been there; why would I have cared?

I did have a tape player with buttons (play / rewind / stop / etc) that, when pushed, unpushed the other buttons. So I have a mental model for buttons of that kind. But I've never associated them with a radio.

Also, I've never associated the radio selector HTML element with a button (HTML or otherwise). Buttons are about triggering effects. But radio selectors are about selecting something; they have more in common with checkboxes.


I vaguely remember when I was a kid our hifi system had buttons like that, they were all round cylindrical buttons


For my tape player, the buttons were large rectangles with no border between them. If I recall correctly, they didn't sink in, but rotated - pushing one would depress one edge of the button without depressing the opposite edge.


Nice!

It's funny that buttons are the "Hello World" of frontend components, because they're actually feverishly nuanced.

I wrote a proof of concept for one of these, using XState, a few months back.

My use case was a cross platform react native button -- which means there's technically a difference between "pressed" and "hover" -- and there's also a loading state, which required special handling for the a11y state.

https://codesandbox.io/s/fervent-shirley-dgesq?file=/src/Pre...

NOTE: this will open a popup (might have to enable popups) with the visualizer, and it defaults to the a11y machine, but there's a drop down to switch to the main button machine.


One place you'll see this in real life is modded flashlights, running Anduril. My friend has an absurdly powerful light with 1 button and 20 modes: https://budgetlightforum.com/node/38364


This is fun. Now I’m wondering what it looks like as a regular expression.

Here’s my first attempt. (Probably wrong, since I did it by hand.)

  (enter,leave,|press,(enter,leave,)*release,|press,enter,release,leave,|enter,press,leave,(enter,leave,)*release,)*(enter|press,enter,(leave,enter,)*release),press,(leave,enter,)*release

I’d never really thought about how many different sequences of events can lead to a button activating.


XState is useful for modelling these interactions.

https://xstate.js.org


The visualizer on mobile has issues, I’ll try it on desktop later. Thanks for sharing


It's all fun and games until somebody manages to click on your button without hovering first.


I see two ways to do that:

1. keyboard navigation (Enter and space are ignored)

2. hover, then F5 to reload the page (this resets the state, and you get your cursor on the button without a state transition)

presses are completely ignored either way


3. $('#ohhai').click()


So, any browser testing suites/runners :p


I don’t see how to enter the two states at the bottom left (idle, down and hover, pressed). I can’t press without first hovering. Those two states seem to be inaccessible.


Click/hold the mouse while outside of the button. Middle-bottom state is hovering over the button while your mouse is down, without having clicked the button. I can see that being useful. However, I can't see why the bottom-left state is ever useful, except perhaps as the only way to get to the "hover, pressed" state.


It is useful because you don't want to fire when you click and hold outside the button, then enter the button area, then release. So, when you release inside the button, you must know whether you clicked inside or outside the button. Thus you need those states. I think.
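
To make that concrete, here is the diagram written out as a transition table. This is reconstructed from the state names and behavior described in this thread, not copied from the demo's source, so treat the exact table (and the names `TRANSITIONS` and `runButton`) as an assumption:

```javascript
// Each state maps an event name to the next state. "fire!" is the accepting
// state and has no exits in this one-shot model.
const TRANSITIONS = {
  "idle, up":       { enter: "hover",          press: "idle, down" },
  "hover":          { leave: "idle, up",       press: "pressed" },
  "pressed":        { leave: "held outside",   release: "fire!" },
  "held outside":   { enter: "pressed",        release: "idle, up" },
  "idle, down":     { enter: "hover, pressed", release: "idle, up" },
  "hover, pressed": { leave: "idle, down",     release: "hover" },
  "fire!":          {},
};

// Run a sequence of events from the start state and return the final state.
function runButton(events, start = "idle, up") {
  let state = start;
  for (const event of events) {
    const next = TRANSITIONS[state][event];
    if (next === undefined) {
      throw new Error(`no "${event}" transition from "${state}"`);
    }
    state = next;
  }
  return state;
}
```

With this table, `runButton(["press", "enter", "release"])` ends in `"hover"` while `runButton(["enter", "press", "release"])` ends in `"fire!"`. Without "idle, down" and "hover, pressed" the two sequences would be indistinguishable at the moment of release.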


If you imagine DOM elements as each having a separate FSM and if you consider the possibility of entry/exit actions on each state, "idle, down" in the bottom left could have an entry action that primes the system for drag & drop. Click down outside, release inside is exactly what D&D would have.


I am able to enter "idle, down" on chromium on linux by clicking and holding the very edge of the button.


Oh, how silly of me. I can enter that state by clicking anywhere outside the button (Vivaldi (basically Chrome) on Debian).


you click anywhere else on the webpage


uhhh 'spinner' state pls?

lack of loading state in buttons is a huge unforced error that makes all ajax stuff way more verbose than it has to be

the dom has no idea of how client-server applications work, which is defensible because the dom is so old but doesn't portend well for either 1) the quality of the google-weighted standards process or 2) the future of the web as the platform for saas and non-mobile consumer


Seems broken when hovering and you start scrolling - it stays in the hover state during scrolling when you're actually outside the button.


Minor nit: right click is interpreted as 'press' with odd behavior if the context menu selection is 'Inspect Element'.


in 2016 Casey Muratori wrote a few words about his old video "IMMEDIATE-MODE GRAPHICAL USER INTERFACES (2005)" -- apparently he'd been using the technique for years by then and had coined the, now commonly used, imgui acronym -- see: https://caseymuratori.com/blog_0001

and if you dive into that video at 16:50 you'll see a very elegant description of minimal state immediate mode button logic; direct time link: https://youtu.be/Z1qyvQsjK5Y&t=16m50s
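
For anyone who doesn't want to watch the video: the usual immediate-mode button pattern tracks a "hot" widget (under the cursor) and an "active" widget (being pressed). The sketch below is my own JavaScript rendering of that general pattern, not a transcription of the video, and the names `createUi`, `button` and the `input` fields are assumptions:

```javascript
// Shared UI context: which widget is under the cursor, which one is held.
function createUi() {
  return { hot: null, active: null };
}

// Call once per frame for each button. `input` describes this frame:
//   inside        - cursor is over this button
//   mouseWentDown - mouse button was pressed this frame
//   mouseWentUp   - mouse button was released this frame
// Returns true on the frame the button is clicked.
function button(ui, id, input) {
  let clicked = false;

  if (ui.active === id) {
    if (input.mouseWentUp) {
      clicked = input.inside; // releasing outside cancels the click
      ui.active = null;
    }
  } else if (ui.hot === id && input.mouseWentDown) {
    ui.active = id; // press started while hovering this button
  }

  // Update hover tracking, but never steal "hot" while another widget is held.
  if (input.inside) {
    if (ui.active === null || ui.active === id) ui.hot = id;
  } else if (ui.hot === id) {
    ui.hot = null;
  }

  return clicked;
}
```

The whole state is two ids shared by every widget. A press that starts outside the button never makes it `active`, so dragging in and releasing returns `false`, the same behavior the extra states in the diagram exist to capture.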


it was a long time ago, but there used to be a model of the state machine for selecting a file in the mac os9 Finder that was often shared (yes i'm talking about around the time that OS9 was current). I don't suppose anyone remembers that? I'd love to see it again


The Elm architecture feels like you're making little state machines for your UI. It's a really nice way of doing things.

Note: I've not actually tried Elm only Elmish in F#.


Now add double-click.


Bug on safari 15.4: release from 'hover, pressed' didn't go to hover, and instead went to 'idle,up'. Also, love FSMs.


Note that each state has exactly two exits: (1) Cross the button boundary, and (2) Toggle mousedown status


I don’t believe a two-state button is standard HTML.


Not sure what you mean.

I've played with some buttons that struck me as fairly standard and found that they have exactly the semantics implemented here (to fire, you need to press down inside the button area, hold down (leaving the area or not), and release inside the area).


Buttons can't remember their state. The "fire" circle shouldn't exist, and after you click and release, it should go back to idle up or hover.

This diagram is showing more checkbox behavior rather than button.


Ah, for illustration purposes this is showing a "one-shot" button. Notice the "fire" state is a double circle -- this means that it's a terminal state for the state machine, meaning the state machine is "done" once it enters that. In practice, the fire state would be wired up to the start state of another state machine, which would be wired up to the start state of the button machine to get things going again. The "Click to Reset", notice, isn't part of the state machine itself, and instead it's modeling the stuff that happens between firing and transitioning back through "start."

A button that has no "on click," so to speak, could be represented by having the fire state transition immediately back through "start."


Now add in touch controls!


I love computer science.



