I'd go so far as to say that the command line is with us now more than it was 10 years ago. Google is essentially a command line for the web. Spotlight (as well as third party Mac utilities like Quicksilver and LaunchBar) are command-lines for launching programs, which satisfies a primary function of the original command line. As long as we have usable keyboards we'll have command lines one way or the other. (Phones, even iPhones, are an example of unusable keyboards by this standard. An iPhone command line would be less usable than the standard interface for launching apps, for instance, while on my Mac, Spotlight is more usable than the Finder.)
I don't like the attitude that keyboards are some outdated relic. Keyboards are the most efficient human-thought-to-language medium ever invented. On even the most simple Notepad/Pico/TextEdit style text editor, I can go from thoughts in my brain to written language faster than a pen, and almost as fast as my mouth can go from thoughts to spoken language. And with far more accuracy than either alternative. If the keyboard is ever replaced, it'll be replaced by something even more arcane and unnatural.
Doing away with hierarchical file systems is easier than he lets on. In fact, Gmail has already done this in our mail client. The solution is, instead of having a single static directory structure, you dynamically generate views of your files based upon search queries and user-added tags or categories. He mentioned Spotlight as another example of this, and he's mostly right (now that you can save "smart folders").
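As a minimal sketch of what "dynamically generated views" could mean (the file records, tag names, and field names here are all hypothetical), a "folder" becomes nothing more than a saved query evaluated on demand, Gmail-label style:

```python
# Hypothetical sketch: a "directory" is just a query over metadata,
# evaluated on demand rather than stored as a static tree.
files = [
    {"name": "budget.xls", "tags": {"work", "finance"}, "text": "q3 budget"},
    {"name": "trip.jpg",   "tags": {"photos", "travel"}, "text": ""},
    {"name": "notes.txt",  "tags": {"work"},             "text": "meeting notes"},
]

def view(tag=None, query=None):
    """Return a dynamically generated 'folder': files matching a tag
    and/or a full-text search, like a Gmail label or a smart folder."""
    results = files
    if tag is not None:
        results = [f for f in results if tag in f["tags"]]
    if query is not None:
        results = [f for f in results if query in f["text"]]
    return [f["name"] for f in results]

print(view(tag="work"))                  # all work-tagged files
print(view(tag="work", query="budget"))  # the same view, narrowed by search
```

The same file shows up in as many "folders" as there are queries that match it, which is the whole point.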
But sometimes human categorization is still going to be needed to keep related files together. In the past we did this by making directories. These days, "tags" or "labels" are more popular. But we might want to make hierarchies of tags, at which point you have a hierarchical file system again. There's one advantage—tags allegedly allow you to have the same file under multiple tags—but that's nothing more than hardlinks, and "there should be a better UI for hardlinks" would make a very boring essay. Then again, if we reverse engineer this thinking and refer to "tags" as nothing more than "a better UI for hardlinks", we just might have a good idea for a new filesystem UI.
(It also occurs to me that hardlinks are one-way links: each hardlink knows where its inode is, but the inode doesn't know where all the hardlinks are. A tagging UI is going to have to cross this chasm somehow.)
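One way a tagging UI could cross that chasm is to maintain the reverse mapping itself: unlike hardlinks, which only point directory-entry-to-inode, a tag store can index both directions. A toy sketch (class and field names are my own invention):

```python
from collections import defaultdict

class TagStore:
    """Toy two-way tag index. Hardlinks are one-way (entry -> inode);
    this also records the reverse map (tag -> files), so the 'inode'
    side can enumerate everything that points at it."""
    def __init__(self):
        self.tags_of = defaultdict(set)   # file -> its tags
        self.files_of = defaultdict(set)  # tag -> its files (reverse map)

    def tag(self, file, tag):
        self.tags_of[file].add(tag)
        self.files_of[tag].add(file)

    def untag(self, file, tag):
        self.tags_of[file].discard(tag)
        self.files_of[tag].discard(file)

store = TagStore()
store.tag("essay.txt", "writing")
store.tag("essay.txt", "drafts")
print(sorted(store.tags_of["essay.txt"]))  # the file knows all its tags
print(sorted(store.files_of["drafts"]))    # and each tag knows its files
```

Keeping both maps consistent on every tag/untag is exactly the bookkeeping a real filesystem would have to do that hardlinks currently don't.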
I used to think that a metadata-based file system was a really bad idea, until two things happened: I read Jef Raskin's "The Humane Interface" and I watched people use web apps and 'database' apps like iTunes.
In THI, Raskin covers the idea that the best filename for any given file is the file itself. If I save a copy of this comment on my computer, I might call it "HN - Desktop Metaphor/Hierarchical Filesystems". That's a lousy name. Three months down the road when I'm trying to remember where I quoted 'the best filename for any given file is the file itself', it's much more convenient to type in that phrase instead of trying to remember where I put the file.
There are two arguments against this approach, both with good counter-arguments.
First, if you love your hierarchy, nobody is taking it away from you. When you save a file, instead of being asked for a filename, you might be prompted for search tags. Type a hierarchical file name there (e.g. "/comments/hackernews/hierarchical filesystems").
Second, what about files with no good textual metadata? A movie file, an image, etc. I can't just type in 'movie' and look through them until I find the one I want. In this case, the system could detect that there's no indexable metadata attached to a file when you go to close it and request that you add a tag. This is no larger burden than adding a filename now.
Of course, ideally you'd never have to 'Save' a file at all; you'd just delete the files you didn't want to keep. So instead of 'Save' in the menu bar, you might have a 'Tag' item instead.
The other reason I think metadata-based systems are feasible comes from watching people use software like iTunes or web apps. As long as the applications that manipulate your data present a reasonable interface for navigating said data, it doesn't matter where the file is actually stored. Watch an end user play some music with iTunes and see if they ever even care where their files are. (Preferably someone without a CompSci degree; we tend to have wacky ideas about needing to know everything about the machine.)
I liked Nancy Kress' sci-fi suggestion in her "Beggars in Spain" sleepless series. Gene-enhanced people who no longer need to sleep (along with more future gene enhancements) find speaking too slow. They begin to craft thought objects that they transmit to each other, each expressing whole spheres of information at once.
Wouldn't the interface developed around book-based text prediction, where the user just navigates forward toward the most likely letters and words, completing words and sentences as they go, be a step forward along this approach?
(Sorry, I can't find the link, but I know I read about it and watched the video from a link on HN).
It looks like Microsoft may be falling in step with this thought: Server 2008 can be installed as Server Core, text only, no GUI. It seems to be starting to blur the differentiation from other text-based, command-line interfaces.
I think sufficiently advanced voice-recognition could replace keyboards.
<oop>Imagine a world with truly ubiquitous computing, augmented reality and a predominantly voice- and gesture-driven interface. Everyone would appear to be wizards, especially hackers, because surely, new programming languages would appear where combinations of arcane keywords and gestures take special meaning. Sorry, imagination just ran wild here.</oop>
When I'm typing, I can revise my thoughts more easily. For instance, I just deleted that whole sentence, wrote "When I'm typing, I can go back", deleted the last two words, wrote "revise my thoughts more easily. For instance...". This would be awkward via voice.
Some people type thousands of words per day, or the vague equivalent. I hope your ubiquitous-computing future has room for drinking lots of water, because my mouth would get dry from all that yapping.
There's also the nice feature that I can sit down and have a nice quiet writing session, or a nice writing session with music in the background. My sound space is either empty or filled with something else while I'm writing. And if I'm in a lab or an office or some other environment around other computer users, I don't want to listen to their jabbering.
BumpTop strikes me as more of a slick tech demo than a real product - sort of like way back when someone modded DOOM to display system processes (and let you kill them with the shotgun). Amusing, gets oohs and aahs from the audience, but ultimately doesn't really increase anyone's productivity.
I'm not convinced we need full-3D graphics and physics simulations just to work a computer.
What we do need, IMHO, is more pervasive hardware acceleration at the OS level, and rich support for animations. Windows fails horribly at this, but they're making promising strides - motion is one of those important UI bits that has basically been ignored up until now.
For starters, it only works with the Desktop folder, not any other folders. Personally, I like to keep my Windows desktop very tidy, so the power of BumpTop is useless to me.
It would be great if BumpTop could navigate the full drive and directory tree using its cool features, without resorting to Explorer and without being tied to the desktop.
We have an experimental "Bump This Folder" shell extension, which provides a little button in Windows Explorer that turns the current view into a BumpTop. It needs to be enabled in the BumpTop settings.
I have 1.6 million files on my computer at the moment. Not sure how they'd map to bumptop. Even just photos; all the demos I've seen limit files to only a dozen or so. How would you manage a few thousand photos?
1.6 million files break most viewing paradigms. In BumpTop you can turn on "infinite workspace" and have thousands of files. Then use pile-by-type or pile-by-date to automatically make that more manageable.
The article sort of hints at it, but the (near-to-mid) future is incremental improvement, not radical overhaul.
The desktop metaphor is already dead. And I don't mean "dead" as in "Microsoft is dead [to certain people in certain ways]", I mean that very few people are really using a desktop interface any more. Compare Windows 3.1 to a modern system, like Vista or Gnome. The remaining Desktop metaphor items are there out of inertia more than anything else.
Directories may still have "folder" icons, but they are not even remotely bound by the folder metaphor anymore. They may open as web pages or photo albums or a music browser, or through the component architectures, anything else. Several sorting options are available, and nobody asks "Does this make sense from a 'desktop' 'folder' point of view?" before they implement something like a recursive file size view. In fact, in general, nobody ever asks that question, because the metaphor doesn't matter to anybody anymore.
What we did was bootstrap from the desktop into a generalized windowing environment. It is not the desktop that we have stagnated on (we moved past that anyhow); it is the windowing environment. Windows are now the base cognitive metaphor, and that is already relatively freeing compared to thinking we're stuck on the "desktop".
What is the real purpose of windows? The real purpose is also the weakness people are grasping at but I don't think generally get right: they keep the complexity of adding verbs at (roughly) O(n) in the verbs by isolating them from each other. If I pop open Firefox, it is shielded from my word processor and vice versa; they only know of each other across the clipboard, if that. If I want to write an entirely new app, I don't have to worry about what else is running on the desktop; I just pop open a new window and I'm in my own world.
We intuitively grasp that if we could break this isolation, we could have "better" interfaces, but it only works for the beginning part of the O(n^2) curve. That's why we keep seeing these promising demos of new interface ideas that go nowhere; it all works to manipulate photos (the go-to task), but it comes crashing down when we try to implement the full suite of nouns and verbs we all actually want in the new O(n^2) complex world. Our intuition is actually wrong here, because our intuition is glibly gliding past the part of the problem where you actually have to start implementing the whole world in your brand new paradigm.
I see something like http://www.youtube.com/watch?v=KiLzbNiEyj8 and think: It's great to show how easy it is to manipulate photos, but... how do I also manipulate music? Word documents, which really don't thumbnail worth a crap? Upload to arbitrary locations? Handle 100,000 photos? Integrate photo manipulation into the interface seamlessly (since opening a new app like Photoshop misses the point of the criticism)? What happens if I get an IM during this process? How do I attach this to an email, and how would I type the email? How would I have multiple competing implementations of the IM, email, photo manipulation, etc. that I could switch between if they didn't meet my needs? You can answer these questions and the hundreds more that come up in real life, but it starts looking like "a reskinning of traditional interfaces with a couple of nice optimizations for the photo case" instead of "a radical break from the windows paradigm that will change computing forever!" really quickly.
We are in this local optimum for good reasons. If you want out, you first need to understand why we are here. (And don't mistake this post for fatalism; show me a new paradigm of value and I will jump all over it! But it's going to be hard work.)
(PS: Note how moving to web apps only furthers the isolation of processes. At least local OSes have a "file" abstraction, web apps don't even naturally have that, and while shipping a desktop app that didn't speak all the relevant file formats would be nearly unthinkable (MS Office being the exception, not the rule), web apps have a lot of incentive towards lockin. Personally, I'm still down on them, long-term. I'd rather have local apps with net-backed storage for almost anything I can think of doing, because anything else will actually manifest as a regression against the desktop metaphor, which we all already instinctively think of as somewhat limiting....)
I've never really questioned what was going to take the desktop's place. 3D UI's have been created and honed in the game and Visual Effects industry for several decades. It's only a matter of time before OS's start to make use of some of the metaphors that are standard in game UI's.
I'd argue that the reason they aren't popular now is that we haven't had enough processing power on cheap, generic computers to make adding 3D elements to a UI worthwhile. When we all have 16, 32 or 64 cores sitting under our desks waiting around for something interesting to do, it is going to make a lot more sense to include more animations, videos, composites and 3D elements in our UI's.
3D and graphics-heavy UI's also answer the question of, "What can we possibly do with 64 cores on a CPU?" This question was answered long ago: render more video and more graphics. Gamers haven't tired of having more parallel processing units on their GPU's over the past 20 years. When they get more processing power, they just crank up the triangle count in their models, turn particle rendering to "high" and set rail guns to "awesome".
I have no doubt we'll be doing the same with our OS eye candy soon. I can't wait to burn CPU cycles on particle dynamics in my OS. :)
As far as losing the desktop metaphor == backwards compatibility problems... that's becoming a non-issue with virtualization. With virtualization, you can have backwards and cross-platform compatibility with pretty much any OS out there except for Apple OS's. <kvetch>Apple is the exception because they don't want you to play with their OS software unless you buy it with their hardware.</kvetch>
It's good to think about these things, though. Thanks for the article.
You are smoking the crack, sir. Games do not have 3D UIs, they have 2D HUDs on 3D worlds, if they have 3D at all. 3D modeling software doesn't even have 3D UIs. Outside of a tiny number of specialized applications, 3D UIs are another Microsoft Bob. 3D has been tried, and it didn't fail for lack of computing power.
There is absolutely no reason to navigate a 3D landscape to create your document, even a 3D document, or get to your email. The 3D interface that actually improves productivity has yet to be invented.
What? Eye candy? Seriously? There's too much eye candy already. There are eleventy billion better things to do with multiple cores. If there aren't better things to do on your desktop, then you should be using a cheaper computer or running Folding@Home, or something. Anything. 3D is for art, not for UIs.
The problem with 3D interfaces is not the 2D display.
There are two major problems with 3D interfaces. The first is that your best input devices are fundamentally 2D. Fully 3D interface devices exist but are unusual and generally tuned for a specific purpose. (Even if you want to jump up and say "The Wiimote!", from what I can see it is typically used in a 2D manner. Those two dimensions are not the conventional 90-degree axes(*), mostly in a way that makes them easier to use with wrists and arms, but it's still a lot of 2D motion with rare forays into true 3D that get old fast.)
The other major problem is that full* 3D is virtually incomprehensible; we live in a 2.5D world. Try to train a "normal person" to play Descent. (If you think full-3D interfaces would be awesome, go play Descent to be sure!) We can't use the full 3D. What we can use is a 2.5D plane stretching out into the distance. But, as long as we're using 2.5D, why not place it perpendicular to our eyes and see the maximum area, instead of the 3D-engine view that moves all but a very small part of the 2.5D plane out of our field of view and range of action?
Fully 3D interfaces are fundamentally and deeply flawed. We've had the hardware to display them for at least a decade now and there's a reason there isn't even a halfway decent prototype... and it's not for lack of trying. It's just a really, really bad idea, one where you can't even overcome the fundamental flaws with brute force and ideology.
(*): Recall that dimensionality is not constrained to the traditional three 90-degree rotated axes; it is a measurement of how many numbers need to be specified in a given situation. Hold out your arms and fingers straight, and move only your wrist around. As you move your wrist around, look at the surface defined by where the tip of your middle finger is. It is a piece of a vaguely spherical shell which curves in 3D space, but is itself a 2D surface; given the situation I laid out, I need only two numbers to identify a point on that surface. The Wiimote is capable of being used in full 3D and it certainly is in some cases, but the human body itself imposes constraints on how you move that thing around, and full 3D gets tiring, fast. Even the ones that use full 3D still strike me as using 2.5D: a full 2D for the wrist motions, and .5D for just thrusting backwards or forwards with no meaningful wrist interactions (just serving as a tactile button). Another example: using the Wiimote as a steering wheel, as nifty a use case as it is, actually cuts it down to 1D, as only the angle matters.
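That dimensionality count can be made concrete. Treating the forearm as a fixed-length rod pivoting at the wrist (a rough model of my own devising, with a made-up arm length), every fingertip position on the shell is pinned down by just two angles, even though each position is a point in 3D space:

```python
import math

def fingertip(theta, phi, r=0.75):
    """Fingertip position for a fixed-length (r meters, hypothetical)
    rod pivoting at the origin. Two angles fully determine the 3D point,
    so the reachable surface is 2D despite being embedded in 3D."""
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

# Every reachable point lies on the same spherical shell of radius r:
p = fingertip(0.3, 1.2)
print(math.sqrt(sum(c * c for c in p)))  # always 0.75, regardless of angles
```

Whatever the two angles are, the point lands on the same shell of radius r, which is the "only two numbers needed" claim in the footnote above.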
Excellent points, especially the part about the constrained degrees of freedom in our actual interactions with the world. To give another example: the human hand has somewhere upwards of 20 degrees of freedom. Yet except in a few situations that require lots of training (like touch typing or playing musical instruments) the movements of the hand are limited to a small number of stereotyped poses-- a power grip for holding a coffee mug or a suitcase, a pose bringing the thumb to forefinger, a handshake pose, a thumbs-up pose, a baseball grip, etc. And that's not even getting into the physical limitations of the hand.
While I agree that animation is one very effective way to enhance user interaction [1], I don't believe that simply upping the pixel count or adding 3d visuals will make interfaces that much better.
Compared to the late 90s, our current OSes focus an insane amount of their resources on eye candy. The newest UI fads (3D window switching, etc.) would make a Windows 95 or Mac OS 9 user's head spin. Have they made the interaction metaphor better? Somewhat, but at nowhere near the pace of almost any other computer-related field.
I think that the main thrust of the article is that there is a limit to the advantages to be gained from simple graphics acceleration, and we are quickly approaching it. We need a new paradigm, otherwise we'll just continue languishing in a world where menu chains (edit->tools->filters...) and "ribbons" (i.e. MS Office) are the norm.
I think the filesystem metaphor can change without it being 'an island'. Just because previous implementations have been poorly thought out doesn't mean future ones have to be.
Use a filesystem backed by a document-oriented database. Have a non-hierarchical filesystem, then use a single field in the database to imitate the hierarchical one (i.e. directory plus filename, for example). We already have container formats which carry metadata as a workaround for the failures of current filesystems (think MP3 tags or EXIF); just dynamically add and strip the attached metadata on the fly when transporting files across a network or to other platforms.
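As a sketch of that idea (the document layout and field names are hypothetical, not any particular database's schema), each file is a document whose path field merely imitates a hierarchy, so directory listings and metadata lookups are both just queries over the same store:

```python
# Hypothetical document store: the "hierarchy" is just one indexed field.
docs = [
    {"path": "/music/jazz/track1.mp3", "artist": "Miles Davis", "year": 1959},
    {"path": "/music/rock/track2.mp3", "artist": "Rush", "year": 1981},
    {"path": "/photos/img1.jpg", "camera": "D40"},
]

def ls(prefix):
    """Imitate a directory listing with a prefix query on the path field."""
    return [d["path"] for d in docs if d["path"].startswith(prefix)]

def find(**fields):
    """Non-hierarchical lookup on any metadata field."""
    return [d["path"] for d in docs
            if all(d.get(k) == v for k, v in fields.items())]

print(ls("/music/"))        # looks like a directory tree...
print(find(artist="Rush"))  # ...but is really just a metadata search
```

Legacy apps see the familiar tree through `ls`-style prefix queries, while new apps query metadata directly; the MP3-tag/EXIF point above is then just a matter of serializing the document's extra fields into the file when it leaves the system.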
Doesn't XO / Sugar have some of these features - I never quite got my head around it -- playing a little bit in VMWare didn't quite cut it. Anyone have any experience with the OLPC kit?
He mentions Spotlight, but not Google. It is almost better to have the data you care about on the web (ignoring privacy concerns), where it can be indexed by Google and found with a quick search. I think the Google search box + hyper links is what has really replaced the files and folders metaphor. Figuring out a meaningful hierarchy and folder names on the desktop seems like such a pain now compared to how easily we navigate on the web.
I've always been inspired by David Gelernter's Mirror Worlds and LifeStreams projects. You can see parts of his vision/philosophy in Apple's Time Machine backup UI and even Twitter:
"Life is a series of events in time -- a timeline with a past, present and future. The events of your life and the memories in which they're recorded aren't parceled out into directories, or typed shoeboxes. An information beam incorporates documents of all types into one (focussable) beam. The question "where did I put that piece of information?" always has exactly one answer: it's in my beam.
Life isn't static. New information arrives constantly; time flows. So my beam has to flow. Or, in other words: the elements that make it up flow. They move at the rate of time. In this respect the structure is more of a "stream" than a "beam."
The stream has a past, present and future. The future flows into the present into the past. If I've posted an appointment or reminder in the "future" part of the stream, eventually it flows automatically into the present where I'll notice it and be reminded, and then into the past where it's part of the permanent, searchable, browsable archive.
When I acquire a new piece of "real-life" (versus electronic) information -- a new memory of (let's say) talking to Melissa on a sunny afternoon outside the Red Parrot -- I don't have to give this memory a name, or stuff it in a directory. I can use anything in the memory as a retrieval key. (I might recall this event when I think about Melissa, or sunny afternoons outside the Red Parrot.) I shouldn't have to name electronic documents either, or put them in directories. And I ought to be able to use anything in them as a retrieval key.
I can "tune in" my memories anywhere; I ought to be able to tune in my information beam anywhere too, using any net-connected computer or quasi-computer.
Those are the goals of our lifestream (or "information beam") project. In our view of the future, users will no longer care about operating systems or computers; they'll care about their own streams, and other people's. I can tune in my stream wherever I am. I can shuffle other streams into mine -- to the extent I have permission to use other people's streams. My own personal stream, my electronic life story, can have other streams shuffled into it -- streams belonging to groups or organizations I'm part of. And eventually I'll have, for example, newspaper and magazine streams shuffled into my stream also. I follow my own life, and the lives of the organizations I'm part of, and the news, etc., by watching the stream flow."
yes! lifestreams offer a much more useful abstraction than the desktop metaphor. i'd really love to see a lifestream OS/filesystem... unfortunately, it doesn't seem likely to come about in any consumer-friendly realization, at least in the foreseeable future.
the default tab in chrome that shows most visited webpages is a start but crude.
combine that with a much more robust bookmark service and RSS feed and you start to have something.