I can well see what QGIS does and does not aim to be. Nothing comes close to QGIS in terms of features; it deploys 1 GB of open tech. The only trouble is that the design of the app hasn't changed much in 20 years, while data sizes grew at least 10x on every level. It struggles to breathe under heavy load…
My current understanding is that "subjective experience" is an after-effect of memory forming in the process. "I experience X" ≈ "I remember that I just recently received [external stimulus / interpreted my current state as] X".
And I, too, am baffled by why people make such a big deal out of "subjective experience" and "consciousness".
I used to joke that maybe I lack these properties, but now I am really starting to wonder whether that might be the case. What if these phenomena are present in humans to varying extents? Consider aphantasia. Only in the 19th century did we discover that the ability to visualize mental images is not universal: it is available to different people to varying degrees, and some people lack it completely. My ability to visualize is weak. What if "consciousness" and "subjective experience" are similar?
And I am slightly worried as I write this that it might turn out to be true, and in ~20 years I will be treated as an "inferior human" without a complete set of human rights.
I am getting tired of hearing "next token predictor" from carbon-based facial expression predictors. You say this as if it were an argument that somehow lets us estimate an upper bound on the possible influence of these entities. And I do not see how it helps to make predictions. It sounds to me like saying "but air is just molecules bobbing around". OK, that's true, but does it help calculate a wing's aerodynamic profile?
Yup, it produces sequences of symbols. We have already seen that producing specific sequences of symbols is mind-bogglingly powerful: mere DNA producers somehow flew to the Moon!
And yes, I am quite aware of the Chinese room analogy. It applies perfectly well to humans too: single neurons in my head do not understand language, yet I as a whole, I would say, do understand. Just as applying the Chinese room to humans does not help to estimate what humans can do, I do not see how it helps to estimate what an LLM can do.
It poses a simple problem. Take humanity back not that long ago into the past and language didn't even exist - our expressed token base was practically 0. We went from that to discovering the secrets of the atom, putting a man on the Moon, and more. If you put an LLM at that starting point, it is going to do nothing but endlessly cycle over basically nothing. If you give it an infinite amount of time and processing, that wouldn't change.
This same issue simultaneously demonstrates how humans are not anything at all like token predictors. No matter how much time you spend remixing the tokens of primitive man, you don't get 'and here is how you land on the Moon' from it.
A token is not a clump of letters. It's a multidimensional initial input vector that gets tweaked and transformed. GPT doesn't think in tokens. It just accepts them as input (although it happily accepts any other vectors in between the vectors that represent tokens, and finding the best prompt for a given task not as tokens but as input vectors is a legitimate prompt optimization strategy).
It also outputs vectors that are coerced into tokens for human consumption.
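To make the "vectors in, vectors out" point concrete, here is a toy sketch. The three-word vocabulary and the 2-D embedding values are made up for illustration, and the nearest-neighbour lookup is a simplification: real models turn their output vector into scores over the whole vocabulary and then pick or sample a token.

```python
# Toy illustration only: a hypothetical 3-token vocabulary with made-up 2-D embeddings.
VOCAB = ["cat", "dog", "moon"]
EMBEDDINGS = {
    "cat": [1.0, 0.0],
    "dog": [0.8, 0.2],
    "moon": [0.0, 1.0],
}


def embed(token):
    """Map a token to its input vector (a copy, so callers can modify it)."""
    return list(EMBEDDINGS[token])


def blend(a, b, t):
    """Return a point between two vectors; it need not correspond to any token."""
    return [(1 - t) * x + t * y for x, y in zip(a, b)]


def nearest_token(vector):
    """Coerce an arbitrary vector back to the closest vocabulary token."""
    def squared_distance(token):
        return sum((v - e) ** 2 for v, e in zip(vector, EMBEDDINGS[token]))
    return min(VOCAB, key=squared_distance)


# An input that sits between "cat" and "moon" and is not itself any token.
between = blend(embed("cat"), embed("moon"), 0.25)
print(between, "->", nearest_token(between))
```

The blended vector is a perfectly valid input even though no token maps to it, and it only becomes a token again when it is forced back into the vocabulary at the end.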
Yes, it goes through tokens, but the possible internal meanings assigned to these tokens (when surrounded by other tokens) are infinite.
That's how humans got from caves to where we are now. By associating new meanings with the same old sound clumps.
Only by having the LLM random walk the hypothesis space with a validator rejecting invalid ones.
The reason why LLM hypotheses are any good is that it has already consumed a civilization's worth of knowledge. You couldn't have bootstrapped such a system with nothing but a few priors/axioms and let it discover the universe.
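The "propose, then let a validator reject" loop described above can be sketched roughly like this. Everything here is hypothetical: `propose` is a uniform random guess standing in for an LLM's suggestion, and `is_valid` is a trivial external check. A proposer with good priors would simply need far fewer steps.

```python
import random


def propose(rng):
    """Hypothetical proposer: a blind random guess in place of an LLM's hypothesis."""
    return rng.randint(1, 100)


def is_valid(candidate):
    """Hypothetical validator: an external check that the proposer cannot see."""
    return candidate * candidate == 2025


def search(validator, rng, max_steps=10_000):
    """Keep proposing until the validator accepts a candidate or the budget runs out."""
    for step in range(1, max_steps + 1):
        candidate = propose(rng)
        if validator(candidate):
            return candidate, step
    return None, max_steps


found, steps = search(is_valid, random.Random(0))
print(f"accepted {found} after {steps} proposals")
```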
Well yes, an LLM needs a rich and favorable substrate to grow and learn (or, we might say, bootstrap).
Just as DNA needs a specific substrate (a cell with ribosomes and other machinery). Just as humans do (one needs an oxygen atmosphere, food, parents).
But in the world we live in, the existence of a favorable substrate for humans or LLMs is a given. It is _already_ bootstrapped. Can we infer something about LLM limits or the possibility of it achieving AGI from its bootstrapping requirements?
> I am getting tired of hearing "next token predictor" from carbon-based facial expression predictors.
That's not even a clever swipe, and it's tiring seeing such a knee-jerk reaction to a completely accurate description. LLMs are next token predictors. People are not. Humans have an inner world and subjective experience. Humans learn through their experiences, not just backprop.
Token predictors are lesser, they are not alive and will never be alive.
Let's assume we have infinite memory with constant time lookups. With a sufficiently large lookup table, you could exactly replicate the behavior of any person. You could encode it as a next-token predictor: you have precomputed every possible prefix and assigned it a next token. This is a Chinese room, but it is completely indistinguishable from an intelligent, sentient person. There is no experiment you can design to slip a piece of paper (a prompt) under the door to determine whether it is Bob or the lookup table clone of Bob inside the room.
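A scaled-down sketch of that thought experiment, under obvious assumptions: `bob` is a made-up deterministic rule standing in for a person, the alphabet has two symbols, and prompts are capped at four symbols so the table fits in memory.

```python
from itertools import product

ALPHABET = "ab"  # hypothetical two-symbol "language"
MAX_LEN = 4      # longest prompt we are allowed to slip under the door


def bob(prefix):
    """Stand-in for 'Bob': any deterministic rule from prefix to next symbol."""
    return "a" if prefix.count("b") % 2 else "b"


def build_table(predict, alphabet, max_len):
    """Precompute the next symbol for every possible prefix up to max_len."""
    table = {}
    for length in range(max_len + 1):
        for chars in product(alphabet, repeat=length):
            prefix = "".join(chars)
            table[prefix] = predict(prefix)
    return table


TABLE = build_table(bob, ALPHABET, MAX_LEN)


def clone(prefix):
    """The lookup-table clone: no rule is evaluated, only a dictionary read."""
    return TABLE[prefix]


# No prompt within the length limit can tell the two apart.
indistinguishable = all(clone(p) == bob(p) for p in TABLE)
print(len(TABLE), indistinguishable)
```

The table has one entry per prefix, so it grows exponentially with prompt length, which is why the argument needs the infinite-memory assumption.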
Does that make the lookup table conscious or alive? Undefined. It's the wrong question. Or it's not a question science can address.
So we cannot dismiss on its face the idea that next token predictors might be alive (the claim being that they "are not and never will be alive") unless by "alive" you simply mean "biological," but that's not really what's debatable.
The argument is also very brittle because they are not in fact all next token predictors. I doubt people making this argument would be willing to concede that diffusion models are more likely to be conscious than causal models (which I do not believe but is an implication of the argument).
I'm not saying that they are conscious or sentient to be clear, but the reductionist argument that they are next token predictors and therefore don't have some property humans have is not an argument. That's going from A directly to Z. You need to flesh out the bit in the middle because that doesn't follow.
Right. Humans are a biological computer. They have a state and they compute an output. I had to look this up (and use AI) but an estimate for the state of a human mind is about 5 peta-bits (10^15) and the estimated processing power is about 1 exa-FLOP (10^18). Compare this to the largest models at ~5 tera-bits (10^12) of state space and ~2 x 10^14 FLOPS (for one session with some reasonable token rate).
Assuming the above is anywhere near true (I think there's a lot of debate about the capacity of the human mind, where data is actually stored, and where compute happens), we are talking about a 3 orders of magnitude win for humans in state and nearly 4 orders of magnitude in compute. And we're doing all that pretty energy-efficiently as well.
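For what it's worth, the gap can be checked directly from the figures quoted above. The numbers are taken at face value from the comment; they are rough estimates, not measurements.

```python
import math

# Figures as quoted in the comment above (rough estimates).
human_state_bits = 5e15   # ~5 peta-bits
human_flops = 1e18        # ~1 exa-FLOP
model_state_bits = 5e12   # ~5 tera-bits
model_flops = 2e14        # ~2 x 10^14 FLOPS for one session

# Orders of magnitude = base-10 logarithm of the ratio.
state_gap = math.log10(human_state_bits / model_state_bits)
compute_gap = math.log10(human_flops / model_flops)

print(f"state gap:   {state_gap:.2f} orders of magnitude")
print(f"compute gap: {compute_gap:.2f} orders of magnitude")
```

With these inputs the state gap is 3 orders of magnitude and the compute gap is about 3.7, a factor of 5,000.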
The other big difference in humans is that we learn and the model only "learns" in context. Our "learn" space is much larger than the 1M tokens that frontier models struggle with.
Anyways, point is that a computer can appear to be alive. If we simulate the human brain perfectly and train it like a human then we'll have something that has human capabilities. LLMs have interesting capabilities but at least at this point not fully human ones (and the delta-state/compute would be a hint that there is still a large gap to cover).
Human context/memory could just be an Agents.md file too, one that gets read instantly before your next token prediction runs. The AI can make multiple such memory files and read them on demand depending on the topic, kind of like how, as a human, when you try to remember a math problem you don't go to your childhood bicycling Agents.md file either.
The point is that this is no more relevant, informative, or even accurate than "carbon-based facial expression predictors". Any phenomenon in the Universe can be described by a simple and/or insulting short phrase. In other comments you've also shouted out "autocomplete!" and "Markov chain!", as if these phrases are a knock-down argument.
"Pachinko machine", "avalanche", and "game of mad libs" have also been used:
>Humans learn through their experiences, not just backprop.
Sure, sure. And humans move through the act of walking, not just terrestrial locomotion.
>Token predictors are lesser, they are not alive and will never be alive.
And on and on it goes...
Which means what in the real world? What are we supposed to see now or in the near future? I assume you've been saying all of this stuff since at least the launch of ChatGPT. Probably longer than that.
But this is complicated and takes us sideways. Let's say we can somehow determine whether an LLM has an inner world and/or subjective experience. Will this newly gathered piece of information affect your estimate of the upper bounds of LLM capabilities? It does not affect my estimate.
The Philosophical Zombie thought experiment is dumb, because zombies don't exist, so the entire premise depends on something that quite frankly might be impossible for the very reason it is arguing against.
I read the misanthropy as ironic. They're applying the same reductionist logic to humans, not because they are misanthropic, but to illustrate that it doesn't help us understand the case we can all agree on. "Humans aren't sentient either" is definitely not the takeaway.
The point is, we have no idea what "sentient" or "intelligent" even means. If we agreed on the definitions, the debate would have been settled long ago.
Imagine the dependency humanity will have on such a technology after a quarter of a generation has passed, and all the students have grown up with "I don't have to know anything".
I suppose we do not even know the exact reasons for the decline of wildlife populations. Quite definitely it is due to the massive intervention of human activity, but which aspects exactly? Light pollution might play a big role in the decline of insect populations.
This is a push for privacy, and it fundamentally pushes in the opposite direction from, let's say, "forming accurate knowledge about the world".
How can we combat being misled by false AI-generated images? I'd say keeping track of provenance is what we should adopt, at least as an option. I hope we will find solutions for propagating images over the net while reliably keeping a record of how, when and where they were taken.
This is far from the first time that I have seen indignation on HN over LaLiga's blocking. Sadly, all this rage does not seem to lead to any change.
I'd like to suggest some steps that might/should be followed, which I will not pursue personally, but in my defense - I do not live in Spain and am not affected.
1) (first! low-effort) Somebody should create a space on the internet where such anecdotes can be shared and where people with the common goal of fixing internet access in Spain can meet. E.g. a Telegram group, Discord channel, subreddit...
2) Probably create a wiki with related research: legal framework, possible actions, etc.
3) Raise public awareness. Create a resource/website with a schedule of past and future "semi-blackouts", a simple explanation of the possible effects a layman may notice, etc.
4) Explore legal actions that might be taken. How might politicians be forced to discuss this issue? For instance, I know that Portugal has an official mechanism for putting forward petitions, which will be discussed in parliament if they get enough votes [1]
The space of possible demands in such petitions is vast. For instance:
- Make LaLiga partly compensate the price of internet access
- Force LaLiga to include an educational notice at the beginning and the end of the broadcast, with a title like "Start of reduced internet connectivity" / "End of reduced internet connectivity"
Humankind is not doing well with implementing new policies. For each new policy (like in this case - blocking access to some parts of the internet during soccer games), we should really strive to:
- Consider running the policy in a small-scale scenario (e.g. testing blocking in small parts of Spain before a whole-country rollout)
- Implement channels to gather info from those who are faced with the results of policy implementation (in this case: the OP got a webpage describing why the page is blocked - a bit of sanity! It would be better if it were served with HTTP code 451)
- Policy instructions
- When deciding on a policy, set a date at which the policy should be reconsidered and revised using data collected while it was in effect
- ... and some more I have not thought about.
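As an aside on the HTTP 451 point in the list above, a minimal sketch of what such a block page response could look like. The function name, the reason text and the example.org URL are placeholders; the status code and the `blocked-by` link relation come from RFC 7725.

```python
from http import HTTPStatus


def blocked_response(reason, blocked_by_url):
    """Build (status, headers, body) for a page withheld for legal reasons."""
    status = HTTPStatus.UNAVAILABLE_FOR_LEGAL_REASONS  # HTTP 451
    body = (
        f"<html><body><h1>{status.value} {status.phrase}</h1>"
        f"<p>{reason}</p></body></html>"
    )
    headers = {
        "Content-Type": "text/html; charset=utf-8",
        # RFC 7725: identifies the entity implementing the block.
        "Link": f'<{blocked_by_url}>; rel="blocked-by"',
    }
    return status.value, headers, body


# Placeholder values for illustration only.
code, headers, body = blocked_response(
    "Access is restricted during the match by court order.",
    "https://example.org/blocking-authority",
)
print(code, headers["Link"])
```

A machine-readable status like this would let browsers, monitoring tools and researchers count blocked requests instead of guessing from timeouts.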
Let's strive to cultivate these principles in all areas of life where we can affect how new policies are implemented.