Here comes the Muybridge camera moment but for text (interconnected.org)
249 points by RA2lover 11 months ago | 65 comments



Yes, yes, more explorations in this direction.

For a couple of years now, I've had this half-articulated sense that the uncanny ability of sufficiently-advanced language models to back into convincing simulations of conscious thought entirely via predicting language tokens means something profound about the nature of language itself.

I'm sure there are much smarter people than I thinking about this (and probably quite a bit of background reading that would help; Chomsky, perhaps McLuhan?) but it feels like, in parallel to everything going on in the development of LLMs, there's also something big about us waiting there under the surface.


> convincing simulations of conscious thought entirely via predicting language tokens means something profound about the nature of language itself.

> there's also something big about us waiting there under the surface.

I don't believe so. In "The Origins of Knowledge and Imagination", Jacob Bronowski argues that human language has four unique characteristics:

- We can separate information (data about what is being described) from emotional content (how we're supposed to react). There's no longer a bijection between communication and action.

- We can extend the time reference of the communication content. We talk about the past, we plan for the future.

- We can refer to ourselves. So we examine what we've done and iterate over it until we fix the errors. We can see ourselves doing the action without actually doing it.

- We can rearrange units of language to get different meanings. The same words can have different meanings based on their order. So meaning depends not only on the words, but on their sequence. And that goes from words to phrases to sequences of dialogue.

The fourth point is the most important. LLMs, by predicting language tokens, can give us the most common ordering for a particular context. And because we don't have that many words, their orderings can be extracted from books and other written content. But they fail at the higher levels, mostly because that's where everything becomes unique.

As for the third point: by observing ourselves, we constantly base our communication on reality, which grounds it in truth. And because we can extend the time reference it's based on, we come to observe changes and model laws. The first point allows us to separate what things are from what we should do or feel based on their existence or absence.

Instead of the LLMs fooling us, it's more us fooling ourselves: because we recognize meaning in sentences, we try to extract meaning from longer sequences of text where there isn't any. Why? Because there is no "I" that has done the job of extracting information and using language to transmit it (while still cognizant of the imperfection of natural languages). LLMs are lossy compressions of ideas. Only the smallest ideas survive, and then the model generates many more false ones.


Are you certain that you're not playing with words to arrive at a predetermined conclusion? What is this "I" to which you're referring and how can you demonstrate that "I" does not or cannot exist within systems such as these? Further, if you are to find something which qualifies as an "I" elsewhere, what makes that elsewhere fundamentally different and therefore capable of supporting and being an "I" and is that elsewhere such simply by definition or in and of itself? Further, if the language usage is indistinguishable from the language usage of an "I", is the difference of source meaningful? If so, why?


The "I" stems from the theory of mind. We can only access our own mind and thus have no direct way to know the thoughts of others. So we observe them and infer based on our own patterns. In a sense, we assume that others have the same mechanism that we possess, and thus we engage in interactions with them. So far, there is no demonstration of reasoning within systems such as these; it's all simulation of the communication channel itself.

> Further, if the language usage is indistinguishable from the language usage of an "I", is the difference of source meaningful?

Is it indistinguishable? The first thing we look for in communication is consistency, so that we can examine for intent. And this comes after we've identified the other party, because we know the intent is not ours. But what I've seen of prompt engineering is that the communicative intent always comes from the person, not the model. The model then goes on to find the most likely continuation of that intent (based on its training), and it quickly becomes an echo chamber. It's a search in lexical space, and you can see the limits when it turns into an oscillating loop between the same set of replies, because there's no "I don't know" damping.


Why does there need to be an "I" that uses language to transmit information? Language itself encodes information. I can read a piece of text and gain something from it. Where the text came from is irrelevant.


> Language itself encodes information.

Which it does in a lossy manner. Information is independent of language. The more complex the information, the more language fails, which is why there are so many mediums for communication. Language has three main components: the symbols, the grammar, and the dictionary. The first refers to the tokens of our vocabulary, the second to the rules for arranging these tokens, and the third describes the relation of the tokens to the things they represent.

The relation between the three is interdependent. We name new things we encounter, creating entries in the dictionary; we figure out the rules that govern these things, and their relation to other things encountered previously. And thus we can issue statements. We can also name these statements, and it continues recursively. But each of us possesses their own copy of all this, with their own variations. What you gain from what I said may be different from what I intended to transmit. And what I intended to transmit may be a poor description of the thing itself. So flawed interpretation, flawed description, and flawed transmission result in flawed understanding. To correct it, you need to be in the presence of the thing itself. Failing that, you strive to establish the tokens, the grammar, and the dictionary of the person who wrote the text.

In LLMs, the dictionary is missing. The token "snow" has no relation to the thing we call snow. But because it's often placed near other tokens like "ice", "freeze", etc., a rule emerges (the embedding?) that these things must be related to each other. In what way, it does not know. But if we apply the statistics collected, we can arrange these tokens and the result will probably be correct. Still, there's a non-zero chance that the generated statement is meaningless, as there's no foundational rule driving it. So there are only tokens, and rules derived from analyzing texts (which lack the foundational rules that come from being in the real world).
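To make that statistical "nearness" concrete, here is a minimal sketch, assuming the sentence-transformers library and the all-MiniLM-L6-v2 model (any embedding model would do): "snow" lands closer to "ice" and "freeze" than to an unrelated token, purely from co-occurrence statistics in text.

    # Minimal sketch, assuming sentence-transformers (pip install sentence-transformers).
    # Cosine similarity in the learned space reflects co-occurrence statistics,
    # not any grounding in real snow.
    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer("all-MiniLM-L6-v2")
    words = ["snow", "ice", "freeze", "banana"]
    emb = model.encode(words, convert_to_tensor=True)

    for word, score in zip(words[1:], util.cos_sim(emb[0], emb[1:])[0]):
        print(f"snow vs {word}: {score.item():.3f}")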

All of this to say: the act of learning is either observing the real world and figuring out how it works, or reading from someone who has done the observing and written down their interpretation, then going outside to confirm it. Barring that, we reconstruct the life of that person so we can correct for the imperfections of language. With LLMs, there's no way to make that correction, as the statements themselves are not truthful; they can only accidentally be right.


I think the core insight OP may be looking for is that your dictionary is just an illusion - that concepts being related to other concepts to various degrees is all there is. The meaning of a concept is defined entirely by other concepts that are close to it in something like the latent space of a language model.

Of course humans also get to connect concepts with inputs from other senses, such as sight, touch, smell or sound. This provides some grounding. It is important for learning to communicate (and for having something to communicate about), and was important for humans when first developing languages - but such grounding isn't strictly necessary for learning the meanings. All this empirical grounding is already implicitly encoded in human communication, so it should be possible for an LLM to actually understand what e.g. "green" means, despite having never seen color. Case in point: blind people are able to do this, so the information is there.


Blind people are no more able to understand* (as qualia) "green" than a sighted human is able to understand* gamma rays. The confusion is between working with abstract concepts vs an actual experience. A picture of bread provides no physical nourishment beyond the fiber in the paper it is printed on.

In an abstract space (e.g. word vectors, poetry) green could have (many potential) meanings. But none of them are even in the same universe as the actual experience (qualia) of seeing something green. This would be a category mistake between qualia-space and concept-space.

* understand in the experiential, qualia sense.

https://en.wikipedia.org/wiki/Qualia

https://en.wikipedia.org/wiki/Category_mistake


I don't need the qualia of gamma rays to understand gamma rays, nor to be understood in turn when I say that "I understand gamma rays".

Conversely, I can (and do) have qualia that I do not understand.

The concept of qualia is, I think, pre-paradigmatic — we know of our own, but can't turn that experience into a testable phenomenon in the world outside our heads. We don't have any way to know if any given AI does or doesn't have it, nor how that might change as the models go from text to multimodal, or if we give them (real or simulated) embodiment.


>that concepts being related to other concepts to various degree is all that there is

This is the view that Fodor termed "inferential role semantics". https://ruccs.rutgers.edu/images/personal-ernest-lepore/WhyM...


Chomsky, of all people? Chomsky rose to fame by attacking BF Skinner’s book “Verbal Behavior”. Which is the book that made exactly the case you’re making now, only some 60 years ago.

Skinner would marvel at today’s LLMs. They are the most elegant proof that intelligence is not just shaped by external contingencies, but that it is identical with those contingencies.


To this list I would absolutely add Julian Jaynes' "The Origin of Consciousness in the Breakdown of the Bicameral Mind."

> simulations of conscious thought entirely via predicting language tokens

Jaynes goes so far as to assert that language generates consciousness, which is characterized by (amongst other features) its narrative structure, as well as its production of a metaphor of our selves that can inhabit a spatiotemporal mental space that serves as an analog for the physical world; the mental space where we imagine potential actions, play with ideas, predict future outcomes, and analyze concepts prior to taking action in the "real, actual" world.

The generation of metaphors is inextricably linked to the psychotechnology (to pull a word from vocabulary discussed by John Vervaeke in his "Awakening from the Meaning Crisis" series) of language, which is the means by which one object can be described and elaborated by its similarity to another. As an etymological example: the Sanskrit word "bhu" which means "to grow" forms the basis of the modern English verb "to be," but predates lofty abstract notions such as that of "being," "ontology," or "existence." It's from the known and the familiar (plant or animal growth) that we can reach out into the unknown and the unfamiliar (the concept of being), using (psycho-)technologies such as language to extend our cognition in the same way a hammer or a bicycle extends our body.

There is something here about language being the substrate of thought, and perhaps even consciousness in general as Jaynes would seem to assert in Book I of his 1976 work, where he spends a considerable amount of time discussing metaphor and language in connection to his definition of "consciousness."

There are also questions of "intentionality" and whether or not computers and their internal representations can actually be "about" something in the way that our language and our ideas can be "about" something in the physical (or even ideal) world that we want to discuss. Searle and the "Chinese room" argument come to mind.

Turing famously dodged this question in his paper "Computing Machinery and Intelligence" by substituting what is now called the "Turing test" in lieu of answering the question of whether or not "machines" can "think" (whatever those two words actually mean).


>Jaynes goes so far as to assert that language generates consciousness

The recent discussion of Helen Keller[1] and her description of learning the meaning of "I" strongly backs this assertion, in my opinion.

I read her words as implying that you can't have consciousness without self identity.

[1] https://news.ycombinator.com/item?id=40466814


100%, maybe intelligence is not as mysterious and extraordinary as we thought


One thing I always find interesting, but not discussed all that much in what I've read, is: what happens in the spaces between the data? Obviously this is an incredibly high-dimensional space which is only sparsely populated by the entirety of the English language, all tokens, etc. If the space is truly structured well enough, then there is a huge amount of interesting, implicit, almost platonic meaning occurring in the spaces between the data - synthetic? Dialectic? Idk. Anyways, I think those areas are a space in which algorithmic intelligence will be able to develop its own notions of semantics and creativity in expression. Things that might typically be ineffable may find easy expression somewhere in embedding space. Heidegger's thisness might be easily located somewhere in a latent representation… this is probably some linguistics 101 stuff but it's still fascinating imo.


My intuition is that the voids in an embedding space are concepts which have essentially no meaning, so you will never find text that embeds into those spaces, and therefore they are not reachable.

For example, take a syntactically plausible yet meaningless concept such as "the temperature of sorrowful liquid car parkings"[1]. That has nothing near it in embedding space, I'd be prepared to guess. When you embed any corpus of text, this phrase is going to drop into a big hole in the semantic space because, while it has components which have some sort of meaning in each of your semantic dimensions, there isn't anything similar to the actual concept: there isn't any actual meaning there for something else to be similar to.

You need the spaces because there are so many possible different facets we are trying to capture when we talk about meaning but only a subset of those facets are applicable to the meaning of any one concept. So the dimensions in the embedding space are not independent or really orthogonal, and semantic concepts end up clustered in bunches with big gaps between them.

That's my intuition about it. When I get some time it's definitely something I want to study more.

[1] Off the top of my head but you can come up with an infinite number of similar examples
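One rough way to poke at that intuition (a sketch, not a demonstration, assuming the sentence-transformers library and the all-MiniLM-L6-v2 model): embed the nonsense phrase alongside a few ordinary sentences that share its ingredients and compare similarities.

    # Sketch, assuming sentence-transformers and the all-MiniLM-L6-v2 model.
    # If the phrase really sits in a "hole", its best match should score
    # noticeably lower than matches between ordinary, related sentences.
    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer("all-MiniLM-L6-v2")
    corpus = [
        "The car park was full by nine in the morning.",
        "Sorrow washed over her as the liquid cooled.",
        "The temperature dropped sharply overnight.",
    ]
    query = "the temperature of sorrowful liquid car parkings"

    corpus_emb = model.encode(corpus, convert_to_tensor=True)
    query_emb = model.encode(query, convert_to_tensor=True)

    for hit in util.semantic_search(query_emb, corpus_emb, top_k=3)[0]:
        print(round(hit["score"], 3), corpus[hit["corpus_id"]])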


> the temperature of sorrowful liquid car parkings

This is quite a beautiful, strange (estranging?) clause - at least in the sense that we (or I) constantly struggle to find meaning and patterns in what might simply be plain noise (apophenic beauty?). It's a similar form of intrigue that I, and I think others, often experience when reading the outputs of LLMs operating in the high-temperature regime, though of course we are just talking about embeddings/embedding inversion here.

On a human level though, it makes me wonder why you picked that phrase. Did you roll dice in front of a dictionary? Play madlibs? Were they the first words that came to your mind? Or perhaps you went through several iterations to come up with the perfectly meaningless combination? Or perhaps you simply spilled your hot chocolate on your favorite pair of pants or dress while getting out of the car this morning (or perhaps as a child) and the memory has stuck with you… who knows! Only you!

In any case, my original point was simply that these interstitial points in embedding spaces can become ways of referring to or communicating ideas that we simply do not have the words for but which are nonetheless potentially useful in a communication between two entities that both have the ability to come to some roughly shared understanding of what is being referred to or expressed by that point in the embedding space. Natural languages of course invent new words all the time, and yet the points those new words map to in the embedding space always existed (eh, not a great example because the shape of the embedding space might change as new words/tokens are introduced to the lexicon, but I think the idea holds). Perhaps new words or phrases will come about to bring some point back into textual space; or perhaps that point will remain solely in the shared lexicon of the algorithmic systems using the latent space to communicate ideas. Again, who knows!

For instance, consider the midpoint of a segment connecting two ideas, or the centroid of any simplex in the embedding space… if we assume that there is some sort of well-defined semantic structure in the space, is it necessarily the case that the centroid must refer to something which equally represents all of the nodes, a kind of lowest-common semantic denominator? Obviously if the semantic structure only holds over local regions but breaks down globally this is not the case, but if all the points are within a region of relatively sound semantic structure, that seems plausible. We know what happens when you do a latent space traversal for a VAE which generates images, and it can be quite beautiful and strange (or boring and familiar by 2024, depending on your perspective), but some similarly weird process might be possible with embedding space traversals, if only we could somehow phenomenologically if not linguistically decode those interpolating points.

> concepts which have essentially no meaning

This is a pretty strange idea to try to wrap your head around.


> it makes me wonder why you picked that phrase

It took me a few goes to refine the idea. I started with the word sorrowful and thought "ok what could not possibly be sorrowful?" -> a car parking space.

Ok then what attributes could a car parking not have -> being liquid

Then once I had got the idea I wanted some other physical attribute this nonexistent thing might have, and that got me to temperature.

I agree with your idea that it's quite interesting to think about properties of concepts we are currently unable to communicate at all in our language. For example, if my intuition is correct, even if you have two concepts which are completely meaningless you would be able to discern similarity/difference between them conceptually, and this leads to your centroid idea. If we look at those centroids, some might land in semantically meaningful places ("Who knew? The average of tennis and squash is badminton!") whereas some might end up in this void space, and that might be quite fascinating.
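A minimal sketch of that centroid query with classic word vectors, assuming gensim and one readily downloadable pretrained GloVe model:

    # Sketch, assuming gensim and pretrained GloVe vectors (~130 MB download
    # on first use). similar_by_vector asks which existing words sit nearest
    # an arbitrary point in the space, including points no word maps to.
    import gensim.downloader as api

    wv = api.load("glove-wiki-gigaword-100")

    midpoint = (wv["tennis"] + wv["squash"]) / 2
    # The inputs themselves will rank highest; the interesting part is what
    # else shows up near the midpoint.
    print(wv.similar_by_vector(midpoint, topn=10))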

I've always thought[1] that creativity is essentially about making connections between concepts that had previously been thought to be unconnected and therefore it seems to me that some (not all) of these void spaces have potential to be joined in to the mainstream semantic space over time as people find ways to link these concepts to things we already have some meaning for. That's very interesting to me.

[1] After reading "The Act of Creation" by Koestler


> It took me a few goes to refine the idea. I started with the word sorrowful and thought "ok what could not possibly be sorrowful?" -> a car parking space. Ok then what attributes could a car parking not have -> being liquid. Then once I had got the idea I wanted some other physical attribute this nonexistent thing might have, and that got me to temperature.

Darn. I was really pulling for the hot cocoa theory.

Also, you clearly don’t live in New York City if you can’t fathom the idea of a parking space being associated with sorrow!


I strongly believe there's nothing there other than gibberish. Piping /dev/random to a word selector will probably enumerate everything inside that set. There's a reason we can translate between every language on earth: it's the same earth and the same reality. So there's a common set of concepts that gives us the foundational rules of languages. Which is the data that you're speaking about.


I think a concrete application of what you're wondering is: what is the most useful word that doesn't exist?


This sums up what I wrote above (as well as in a longer reply to a reply) much more elegantly and clearly than I ever could. Thank you!

Edit: but I might exchange the word useful for something else… maybe not…


Now this is a fun idea. If you think of embeddings as a sort of quantization of latent space, what would happen if you “turned off” that quantization? It would obviously make no sense to us, as we can only understand the output of vectors that map to tokens in languages we speak, but you could imagine a language model writing something in a sort of platonic, infinitely precise language that another model with the same latent space could then interpret.


Ya, I'm having my return-to-Plato moment. It really feels like we are the dēmiurgós right now with AI systems. The nature of interpolation vs extrapolation and the exploration of latent spaces will answer a lot of philosophical questions that we didn't expect to be answered so quickly, and by computers of all things.


That reminds me of the crazy output you get when raising the temperature and letting the model deviate from regular language. E.g. https://news.ycombinator.com/item?id=38779818
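For what it's worth, the mechanism behind that is just a rescaling of the next-token distribution before sampling. A toy sketch in plain numpy, with made-up logits:

    # Toy sketch of temperature scaling: higher T flattens the next-token
    # distribution, which is where the "crazy" high-temperature output comes from.
    import numpy as np

    def probs(logits, temperature):
        z = np.array(logits, dtype=float) / temperature
        z -= z.max()              # for numerical stability
        p = np.exp(z)
        return p / p.sum()

    logits = [4.0, 2.0, 0.5]      # made-up scores for three candidate tokens
    for t in (0.2, 1.0, 5.0):
        print(t, np.round(probs(logits, t), 3))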


The space is an uncountable set, at the limit. Mostly it’s noise. See: curse of dimensionality.


If I'm not mistaken, the coordinates in any given latent space (in this context) are countable, as there is a finite number of dimensions. You can even consider only the space enveloped by the already explored coordinates (e.g. English words), to get a finite space which can be fully enumerated.


> Could you dynamically change the register or tone of text depending on audience, or the reading age, or dial up the formality or subjective examples or mentions of wildlife, depending on the psychological fingerprint of the reader or listener?

This seems plausible, and amazing or terrible depending on the application.

An amazing application would be textbooks that adapt to use examples, analogies, pacing, etc. that enhance the reader’s engagement and understanding.

An unfortunate application would be mapping which features are persuasive to individual users for hyper-targeted advertising and propaganda.

A terrible application would be tracking latent political dissent to punish people for thought-crime.


I'm sure it comes up frequently, but the adapting textbook thought reminds me of the "Young Lady's Illustrated Primer" from Diamond Age.


The repercussions of what the author summarizes as "could you colour-grade a book?" still feel wildly unknown to me, even after a couple years of thinking about it (see Photoshop for text [1][2]).

Partially it's because we're still wrapping our heads around what kind of experience this might enable. The tools still feel ahead of the medium. I think we're closer to Niépce than Muybridge.

In photography terms, we've just figured out how to capture photons on paper — and artists haven't figured out how to use that to make something interesting.

[1] https://news.ycombinator.com/item?id=33253606

[2] https://stephango.com/photoshop-for-text


> The tools still feel ahead of the medium.

Or maybe it's that we instinctively feel that writing should still be linear writing, if reading is still going to be linear reading.

Personally I think the "photoshop for text" analogy shows just how misguided it is to expect people to tolerate words that were calculated, not crafted.

Literacy is too important to mess with like this.


Genuine question — do you think synthetic images pose less of a problem than synthetic text? If yes, why?


Images — photos, paintings, designs — are not primary human expression.

Words are fundamental, dense, often objectively chosen, and the most primary way of communicating thoughts.

Asking someone to read your thoughts that you didn’t actually even think, because you’d rather save the time writing them, is profoundly disrespectful to the reader, who has to invest the same amount of time reading generated words as real ones.

Which is not to say that I think passing off generative images as one’s own work is not disrespectful. Or that extensive, unreal body sculpting or skin retouching is not — as a photographer I believe that to also often be not just unethical but immoral.

But a judgement on a retouched image is less of a burden of time.

I would likely judge someone who uses ChatGPT to communicate personally with me as harshly as I would judge them editing a photo to deliberately lie to me.

(Which is not to say that I don’t think GPTs have inherent grammatical advantages for cleaning up poorly-written text; I do think generating entirely new text is disrespectful to the reader, though)


When I think about Photoshop it is so tied in my mind to its history as an offshoot of ILM and the VFX industry https://en.wikipedia.org/wiki/Adobe_Photoshop#Early_history

ILM's famous t-rex scene from Jurassic Park contains very little text/dialog, but emotional, expressive, synthetic imagery: https://www.youtube.com/watch?v=Rc_i5TKdmhs

In this case the scene is not made up of "generative" images in the current definition of the term, but synthetic images generated from polygons, virtual lighting, etc. It seems that there could be artistic utility to manipulating text in a similar way.


I don’t think I mind it in explicitly artistic contexts so much, putting aside the fact that all the GPTs I have seen write in a banal, unimaginative, equivocating way that is exactly the opposite of what you want from creative dialogue.

I can see narrow uses for it in that sort of way.

But it’s being marketed as a tool for businesses to use to talk lazy crap at people who would prefer to hear from humans: it’s fundamentally a disrespectful thing in that context.


Artistically constructed images may not be primary human to human expression, but posture/silhouette is one of the most powerful human to other mammal expressions.

You can't communicate much beyond imperatives, but you can communicate those fairly strongly, even in the absence of time working on the shared vocabulary needed for the precision of words.


I have proof from my commit history on the readme to CTGS[1] that my usage of the term "Photoshop for Creative Writing" (What I tried to market it as) predates all of this by... years now.

https://github.com/Hellisotherpeople/Constrained-Text-Genera...

I'm obsessed with this idea of a proper LLM desktop class prosumer front-end. Something feeling like it was made by Adobe in a world where they didn't go to shit in the early 2010s. Blender, but for LLMs. Oobabooga, but actually good and not janky. It would ideally implement all forms of "representation engineering" and hacking or playing with the embedding/latent spaces, along with every other LLM feature folks would love to have but often don't know exist (i.e. constrained generation)

If you're a VC type reading this and believe in this idea, I really want to talk to you right about now.

Also, if you are an expert in DearPyGUI or DearImGUI, I want to talk to you right now.


Terence McKenna phrased this wonderfully, by saying “It seems to me that language is some kind of enterprise of human beings that is not finished.”

The full quote is more psychedelic, in the context of his experience with so-called ‘jeweled self-dribbling basketballs’ he would encounter on DMT trips, who he said were made of a kind of language, or ‘syntax binding light’:

“You wonder what to make of it. I’ve thought about this for years and years and years, and I don’t know why there should be an invisible syntactical intelligence giving language lessons in hyperspace. That certainly, consistently seems to be what is happening.

I’ve thought a lot about language as a result of that. First of all, it is the most remarkable thing we do.

Chomsky showed the deep structure of language is under genetic control, but that’s like the assembly language level. Local expressions of language are epigenetic.

It seems to me that language is some kind of enterprise of human beings that is not finished.

We have now left the grunts and the digs of the elbow somewhat in the dust. But the most articulate, brilliantly pronounced and projected English or French or German or Chinese is still a poor carrier of our intent. A very limited bandwidth for the intense compression of data that we are trying to put across to each other. Intense compression.

It occurs to me, the ratios of the senses, the ratio between the eye and the ear, and so forth, this also is not genetically fixed. There are ear cultures and there are eye cultures. Print cultures and electronic cultures. So, it may be that our perfection and our completion lies in the perfection and completion of the word.

Again, this curious theme of the word and its effort to concretize itself. A language that you can see is far less ambiguous than a language that you hear. If I read the paragraph of Proust, then we could spend the rest of the afternoon discussing, what did he mean? But if we look at a piece of sculpture by Henry Moore, we can discuss, what did he mean, but at a certain level, there is a kind of shared bedrock that isn’t in the Proust passage. We each stop at a different level with the textual passage. With the three-dimensional object, we all sort of start from the same place and then work out our interpretations. Is it a nude, is it an animal? Is it bronze, is it wood? Is it poignant, is it comical? So forth and so on.”

This post feels like the beginning of that concretization.


> “It seems to me that language is some kind of enterprise of human beings that is not finished.”

I would extend this all the way up to higher intelligence itself; language is but the force carrier for intelligence. We've been developing muscles and balance for hundreds of millions of years, but the intelligence that communicates in advanced language is pretty much brand new.


Fascinating comment, that articulates the point of TFA better than TFA did.

I've always been highly articulate, and also frustrated by the limitations of spoken language. This is a common (maybe even the dominant?) theme in 20th century theatrical writing. People like Ibsen, Chekhov, Pinter, Genet, and Churchill all struggle with it in their own ways. People like Beckett and LePage and Sarah Kane ultimately kind of abandon language altogether.

Or, though poetry's not as much my field as theatre, you could go back to TS Eliot:

... Words strain, Crack, and sometimes break, under the burden, Under the tension, slip, slide, perish, Decay with imprecision, will not stay in place, Will not stay still.

My own speculation, along your lines, is that it's because sound is transient, hearing imperfect, and memory fallible. Even apart from ambiguity, two people will never quite agree on what was said. (Most of my arguments with my wife begin this way!) Even court transcripts, intended to eliminate this limitation, don't capture non-verbal cues.

As someone who's been marinated in the written and spoken word for all my life, research like this is fascinating, and slightly creepy: will all of the ghosts in the machine be exorcised? If those are blown away, and the bare mechanism of language exposed, what comes next?


> What would it mean to listen to a politician speak on TV, and in real-time see a rhetorical manoeuvre that masks a persuasive bait and switch?

Why do I suspect the offence will always be ahead of the defence in these areas?

I'd earlier suggested that everyone, in elementary school, ought to watch Ancient Aliens and attempt to note the moment where each episode jumps the shark. I take it we could attempt this with LLMs, now?


> Why do I suspect the offence will always be ahead of the defence in these areas?

because destroying is easier than creating/entropy increases over time?

The only solution I can see is working on turning bad actors into good actors, or another way: positive reinforcement cycles.

No idea what that would look like with regard to LLMs though.


At the end of the day there is no permanent solution.

In nature we typically don't see something 'win' and that's the end of the story. I mean, yes, things do go extinct, but the winner always has something new to deal with. Could be a more advanced predator eating all its food sources. Could be a bacterium that it's not resistant to. Simply put, when there's entropy on the table, something is going to evolve to take it with the least amount of work possible.


So embedding space itself is interesting. It's more than a step to an LLM. That's been known for a while, back to that early result where "King" - "Man" + "Woman" -> "Queen". This article, though, suggests more uses for embedding spaces. This could be interesting. It's a step beyond viewing them as a black box.
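That early result is easy to reproduce; a minimal sketch, assuming gensim and pretrained GloVe vectors:

    # Sketch, assuming gensim and pretrained GloVe vectors (downloads on first use).
    # vector("king") - vector("man") + vector("woman") lands nearest "queen".
    import gensim.downloader as api

    wv = api.load("glove-wiki-gigaword-100")
    print(wv.most_similar(positive=["king", "woman"], negative=["man"], topn=3))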


Is ♔ - m + f = ♕ specific to embeddings, or does it also work in https://en.wikipedia.org/wiki/Formal_concept_analysis#Exampl... ? (either as ♔ ⊕ f ⊕ m = ♕ or as ♔ ⋀ not(m) ⋁ f = ♕?)

[alas, HN scrubs venus and mars symbols, and I shall spare you all the ancient egyptian hieroglyphs and O'Keeffean mathematical symbols, so `f` and `m` they are]


> What if the difference between statements that are simply speculative and statement that mislead are as obvious as, I don’t know, the difference between a photo and a hand-drawn sketch?

Given how long these have been pored over by existing hyperconnected nanomachine networks (i.e. brains) it may be that we'll mostly unearth qualities humans can already detect, even if only subconsciously.

When it comes to separating truth and lies, perhaps the real trick the computer will bring is removing context, e.g. scoring text without confirmation bias towards its conclusion.


LLMs seem to do more of what brains do unconsciously, rather than consciously. Which means brains may be better at rating, e.g., the trustworthiness of some text, but they don't surface specific ratings to the conscious level. Meanwhile, language models seem to be able to expose those features as knobs, allowing you to boost or attenuate them. So you get to drag the, e.g., "excited" slider down to the minimum, and get a text that may be easier to process at a conscious level. Having a slider to remove rhetoric from text would be a really useful development.


For those perplexed by the headline: the Muybridge camera moment refers to Eadweard Muybridge, who managed, via photos taken in rapid succession, to prove that when a horse runs it at times has all four legs off the ground.

https://en.wikipedia.org/wiki/Eadweard_Muybridge

(the article doesn’t bother to mention any of this until near the end in the tl;dr section, which since it’s tl and you dr, you never got to).


(On an irrelevant note, the Stanford Barn, where those pictures were taken, has gradually been closed off to the world. It was open to the public until COVID. It's still there, and there's a Stanford equestrian team, but road access has been cut and all mentions of the barn removed from directional signs.)


There are so many of these places I've encountered that used to be publicly available pre-COVID and no longer are. The reasons/excuses vary.

Example: sometimes it's a symptom of a small business that already wanted a reason to pivot to a new venture, and they keep the old thing going to profit from some old whales while in transition.


There was a lot of that post 9/11 too. It used to be that you could walk into nearly any office building in the world with little more than a smile and a confident wave. A lot of previously public areas got locked down on September 12th.


Office building security changed significantly much earlier than 2001. The mass shooting in 1993 at 101 California Street in San Francisco was the beginning of many such changes.

The attack [...] also precipitated sweeping changes in downtown San Francisco. Before Ferri walked into the building that July day, almost no high-rises in the city had security measures. While many had a front desk, only a handful checked badges. The building at 101 California had two side entrances that were completely unguarded. The Examiner reported that at the time, the Chevron building and Charles Schwab’s SF headquarters had the toughest security in town; electronic badges were required at Chevron, an anomaly in 1993.

Today, security checks are standard at offices large and small, a fundamental shift that happened because of 101 California.

https://www.police1.com/active-shooter/articles/101-californ...


I mean, honestly if this didn't happen it likely would have happened by now anyway.

Enough people would have walked in and picked up computer systems filled with company information that security would have been implemented at some point.


It's often public services that reallocated resources while the place was closed, and found after Covid that they couldn't spare (or justify sparing) those resources again when they contemplated reopening it.

I.e., while that historic greenhouse in the city park was nice and appreciated by some people, now that the two gardeners who were working in it part-time have to take care of the newly planted trees along the streets, it's not possible to put them back on the less essential greenhouse and there's no budget for hiring two new gardeners. So the greenhouse stays closed.


Not only that, but the tldr basically only talks about that, so it's not much of a summary at all. I read the tldr and I have no idea what the article is about.


> "Even in 1821, horses were wrongly depicted running like dogs."

Great essay, but this small comment toward the end of the essay confused me. Is he saying that dogs never gallop?

I'm still not sure about the answer breed-by-breed, but searching for it led me to this interesting page illustrating different dog gaits: https://vanat.ahc.umn.edu/gaits/index.html

In particular, it seems to say that at least some dogs do the same "transverse gallop" that horses use: https://vanat.ahc.umn.edu/gaits/transGallop.html

And that greyhounds at least also do a "rotary gallop": https://vanat.ahc.umn.edu/gaits/rotGallop.html

I have a Vizsla (one of several breeds in the running for second fastest breed after greyhounds) and my guess is that she at times does both gallops. I can't find a reference to confirm this, though.


In the linked article (https://www.amusingplanet.com/2019/06/the-galloping-horse-pr...) there are some examples of "wrong" galloping horses. The first two examples look like the "rotary gallop", which is how a dog or a cat, not a horse, would run. The third example is plainly wrong, because the horses are mid-air but seemingly ready to land on one leg.


For a game based on semantic vectors: https://semantle.com/


https://archive.is/EcQfE

Site is struggling


I thoroughly enjoyed reading this style of loose connected thoughts.


> Looking at this plot by @oca.computer, I feel like I’m peering into the world’s first microscope and spying bacteria, or through a blurry, early telescope, and spotting invisible dots that turn out to be the previously unknown moons of Jupiter… There is something there! New information to be interpreted!


Any tools to replicate @oca.computer's work?

Once we have the 1000-dim vector embeddings I can make the rest work. Not sure how to go from a 20-word span to a 1000-dim vector embedding.


Generating embeddings is relatively simple with a model and Python code. There's plenty of them on HuggingFace, along with code examples.

all-MiniLM-L6-v2 is a really (if not the most) popular one (albeit not SotA), with 384 dimensions: https://huggingface.co/sentence-transformers/all-MiniLM-L6-v...

Edit: A more modern and robust suite of models comes from Nomic, and can generate embeddings with 64 to 768 dimensions (https://huggingface.co/nomic-ai/nomic-embed-text-v1.5).

When the author talks about thousands of dimensions, they're probably talking about the OpenAI embedding models.
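A minimal sketch of going from a span of text to a vector with one of those models (assuming sentence-transformers is installed; the thousand-plus-dimension vectors would come from a different model, e.g. OpenAI's):

    # Sketch: embedding short spans of text with sentence-transformers.
    # all-MiniLM-L6-v2 yields 384-dimensional vectors; swap in another model
    # (e.g. a Nomic or OpenAI embedding model) for more dimensions.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
    spans = [
        "Looking at this plot I feel like I am peering into the world's first microscope.",
        "There is something there, new information to be interpreted.",
    ]
    embeddings = model.encode(spans)   # numpy array of shape (2, 384)
    print(embeddings.shape)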


Zardoz predicted this ~50 years ago


Quite literally what my company does - https://ipcopilot.ai/

We discover innovative ideas in companies and help them protect their IP.



