Hacker News

What's the point?

So you can get predicted text that looks "coherent". Then what?

There is literally no place to add logic. Neural net-based language models are impressive, sure, but it's not hard to see how useless they are.

The only time their output is logically coherent is when they are lucky, and that seems to happen often because most of their input was logically coherent to begin with.



Whether or not the current technology is useless is an empirical question. How many people are using ChatGPT, Stable Diffusion, etc. for economically or personally valuable activities? We actually don't know.

Even if we assume the technology is useless in its current state, it is still incremental progress. Could we have predicted 10 years ago what neural networks would be capable of today? Now, tell me what neural networks will be doing in 10 years. If you think you know the answer with any degree of certainty, you're probably deluded.


My point is that ML-based NLP (like ChatGPT) has a clear ceiling, and we seem to have reached it.

We can get coherent (understandable) output all day long, but we can never introduce logic.

ML-based NLP is a semantic word-guessing machine. It's based entirely on how often words show up near each other in the training datasets. There is no room to add logic.
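A toy sketch of that co-occurrence picture (my own illustration, not how GPT actually works — real models use learned embeddings and attention rather than raw counts, but the "guess the most likely next word" loop has the same shape):

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count how often each word immediately follows each other word."""
    words = text.split()
    counts = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def generate(model, start, length=5):
    """Repeatedly 'guess what comes next' by picking the most frequent follower."""
    out = [start]
    for _ in range(length):
        followers = model.get(out[-1])
        if not followers:
            break
        out.append(followers.most_common(1)[0][0])
    return " ".join(out)

model = train_bigram("the cat sat on the mat and the cat ran")
print(generate(model, "the"))  # prints "the cat sat on the cat"
```

The output looks locally fluent because each pair of adjacent words was seen in training, but nothing in the loop checks whether the whole sentence makes sense.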

The entire exercise is like a magic trick: impressive sure, but at the end of the day, a fool's errand.


You make very strong claims about things we know very little about. It's far from clear that we have reached a ceiling. Who can predict how systems with 10x the parameters and as-yet-undiscovered deep learning models will behave?

We don't understand how humans do logic. It's entirely possible that whatever structure in the human brain is responsible for handling logic can emerge in a neural network.

If we're talking about what it takes to get to true AGI in the near future, then I agree that a pure neural network approach might not cross the finish line first. I think Stuart Russell made this point in an interview, basically saying that a neural network is a very inefficient computational model and that we could do the same thing much more efficiently if we had the right "good old fashioned AI" algorithm. But fundamentally a neural network is just computing a function so there's nothing in principle preventing a neural network from doing whatever a symbolic system does. It's mostly a matter of efficiency and hardware availability.


But we do know plenty about it. It's right there in front of us. Pretending there is some understanding just out of reach is called mysticism.

What you are telling me is that I should place my expectations for the future, not on the reality in front of me, but on the hopes and dreams you have for the future. That's circular reasoning.

The very reason that I don't place credibility in your assertions is the lack of reason itself: in your assertions, and in what a neural network is.

Neural networks are like dreams. Wonderful only when your intention is to get lost in a swirl of memories. Useless if you want to actually accomplish something.

Knowing the difference is crucial, because that difference can never be taught to a neural network without completely redefining what a neural network is in the first place.

Knowing the difference is literally the thing neural networks are incapable of doing. They don't know anything. They just guess. That's literally the function. In the code. Guess what comes next.

There is no sense pretending sense itself will magically appear out of a guessing machine. Neural networks are nonsense generators, and that is what they are forever doomed to be.


You made two claims: (1) current language models are useless and (2) current language models have reached a ceiling. I said:

How many people are using ChatGPT, Stable Diffusion, etc. for economically or personally valuable activities?

If (1) is true, then the answer to that question is "zero" or at least "close to zero". Do you really believe that?

If (2) is true, then it is also true to say that transformer models will never exceed today's capabilities by a significant amount at any time in the future. Do you really believe that?


Yes. The ceiling is the floor.

The limitation is inherent in the core design. There is no overcoming it. This is not a hurdle or a wall. It's a design flaw.

Is it totally useless to everyone? No. Not completely. It's like a coherent search engine: a way to find data that is close to other data. But "close to" in this case is only "semantically", and never "logically", so that's that.

Is it going to get any less useless than it is? Only slightly. "It" will never get better. The only better version of "it" is a completely new ground-up redesign that doesn't resemble "it" at all.


Modern neural network architectures are Turing complete [1]. So I don't see any argument for a limit in principle unless you are arguing that a Turing machine can't achieve language understanding. If that's what you're saying, then I wonder who is espousing mysticism here.

[1] https://arxiv.org/abs/1901.03429


Are you forgetting the distinction between a program and a computer?

Language understanding doesn't magically spawn itself as a process on your computer! Someone has to write that program first.

And that's my point. ChatGPT transforms language, but it does not understand it. For that, we will need a different kind of program.


Language understanding doesn't magically spawn itself as a process on your computer! Someone has to write that program first.

Do you think it's impossible for such a program to emerge as weights in a Turing complete neural network architecture?


Play with https://chat.openai.com/ to experience how powerful predicting text is.


I have.

And as I said, it's very impressive.

And it has some usefulness: essentially it's an alternative to reading through many pages/posts of StackOverflow and Wikipedia.

But it doesn't know anything. It has no clue whatsoever whether it is correct or incorrect. It only makes guesses. The only reason there is useful output is because that output is a transformation of useful input.

There is no logic. There is no way to introduce logic. There is no way to filter it through logic.

If some coherent mixture of the ML's training datasets already contains the answer to your question - like literary or code examples, definitions, etc. - then the output will be useful. Otherwise, it's just wrong, and sometimes unexpectedly so.

The output of ChatGPT (or any other ML-based NLP) can only be as correct or knowledgeable as the data it is trained on; and it will practically never even match that level, because it is only mixing words by semantic popularity, never by logical relationship.


Chat bot interfaces are only a small part of what can be done with large LMs.

You can use and fine-tune them to solve almost all existing natural language processing tasks: machine translation, recommendation/search, text classification and summarization, code generation, etc.


False.

You can use them to transform already existing text and code (the training datasets); but you can never do more than that.

There is no room in the ML algorithm to introduce logic. It's doomed to forever be a guessing game; and the resulting guesses will always be limited by the information it is fed to begin with.

The only reason ChatGPT is so impressive is that it is transforming human conversation that is itself impressive (we just take that for granted, because we were already aware of it). The code generation, literature, definitions, etc. it outputs are all just rephrasings of the written code, literature, and definitions that it was given as training data.

It's effectively no more than sleight of hand. Flashy and impressive, but never anything more.


You should read this: https://ai.googleblog.com/2022/11/characterizing-emergent-ph... .. and probably also the paper.

I find these emergent phenomena pretty interesting.


They are missing the forest for the trees.

The "emergent phenomena" can be trivially explained by the input they are giving it.

They are not using a dataset that contains an equal amount of "correct" and "incorrect" responses. They are using datasets of human communication, which are obviously filtered toward "correct" data. We get things wrong occasionally, but that is quite rare relative to what we get right. We can't even structure a sentence without getting something correct!

If you feed a dog good food, is it really a surprise that the dog is healthy? You never fed it poison!

The language model is only returning semantic relationships. The "emergent phenomenon" is that most semantic relationships in human communication just happen to also be logical relationships.

But the language model doesn't know that. In no way does it interact with logic. It only interacts with semantics.

If anyone actually bothered to train an instance of GPT or whatever on poisoned data (i.e., nonsensical stories), then you would see those emergent phenomena disappear. But no one is writing the nonsensical stories in the first place, so such a dataset does not exist.
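That poisoning thought experiment is easy to run at toy scale. Using a simple bigram counter (my own sketch, nothing to do with GPT's internals): feed it two corpora that differ only in truth value, and it confidently completes whichever statement was frequent in training. Frequency in, frequency out — logic never enters the picture.

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count how often each word immediately follows each other word."""
    counts = defaultdict(Counter)
    words = text.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def next_word(model, word):
    """The model's 'answer': the most frequent follower seen in training."""
    return model[word].most_common(1)[0][0]

# Same grammar, different "facts"
clean    = "two plus two equals four . two plus two equals four ."
poisoned = "two plus two equals seven . two plus two equals seven ."

print(next_word(train_bigram(clean), "equals"))     # prints "four"
print(next_word(train_bigram(poisoned), "equals"))  # prints "seven"
```

The counting procedure is identical in both runs; only the corpus changed. The model trained on falsehoods "asserts" them with exactly the same confidence.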


You're at least a few weeks behind the state of the art.



