Consciousness and Artificial Intelligence
Consciousness: the Mystery of “Self”
Through the myriad arrays of electrical impulses, a variety of qualities of stimulus can be produced, in an almost infinitely graded range, even though these are no more than just that: chemical reactions and their products. A set of sensory stimuli, which are chemical, leads to a set of effector responses, also chemical. A visual (or auditory) story, for example, is broken down into the individual acts involved in it, and through this learned process a conclusion is drawn as to how it affects us, what act we must perform in response, and how we must feel (the hormonal response). So perhaps consciousness is related to the manner in which neural responses are visualized, and the manner in which a story-line is constructed from those thoughts.
So if a pattern of neuronal responses can be linked to a particular visual story line, then that story line is a thought; and if not visual, then auditory (at this moment I’m hearing what I’m typing in my head). This is what feels like “me”. A healthy mind is able to file away such patterns as “memories”. This is my take on “selfhood”. It might serve to explain why those who do not effectively construct coherent storylines are seen as “difficult”, “difficult to relate to”, “incorrigible”, or even placed on the autistic spectrum. Those who construct the wrong stories are deluded; those who hear incorrectly hear voices in their heads; those who construct parallel storylines might appear to have “split personalities”; and so on.
Can Consciousness ever be a “Downloadable” pocket commodity?
There is a misunderstanding in the scientific community when it comes to so-called downloadable consciousness. Scholars like Nick Bostrom, who is somewhat controversial, propose that the evolution of mankind will involve being able to download consciousness into electronic systems, thereby achieving a form of immortality. There are two ways of looking at this.
The first is to exchange each neural element in the brain for a silicon element. This would be possible if we could work out and then invent a silicon device that performs all the functions of a nerve cell. The simplest way to do this would obviously be to build another cell that was indestructible and yet alive post-op. Having done this, one would then replicate the DNA in that artificial nerve cell, including all the changes that determine the strengths of the synaptic connections of the neuron encoded in it. It is quite impossible even to describe the scale of complexity that would go into creating one such neuron, when we do not even have a full understanding of which DNA changes influence synaptic controls and how synaptic controls might be locally maintained; and that might only be the beginning of the understanding needed to build a living cell. To escalate that into the whole brain, one would need to be able to replicate the current state of every single neuron in that brain, its state of activity in terms of ion changes across the cell membrane, and then somehow create and replace all of these neurons simultaneously, so that presumably they can all be simultaneously switched on. On top of that, neurons are not the entire story of the brain: the glial system is extremely complex, extremely important, and also not that well understood. So we have not even begun talking about how such a feat could be achieved. Assume that all of this can be achieved by a silicon chip, though this is hard to conceptualise. Even then, the process of interchange involves possibly insurmountable complexity. And having done all of this, we will not have shifted the question at all, because we still do not know whether a complex system of silicon chips does achieve consciousness. And we seemingly have good reasons to believe it does not.
This is not really a download process; rather, it is a biological process of achieving immortality through preventing cell death. It is an assumption that silicon units, working together or individually, perform the same function as biological brain matter or neurons respectively, and the problem of the disruption to the biological matter in the replacement process is over and above that.
However, the more obvious process would be simply to build a silicon construct and download the brain data into it externally; this would avoid the replacement complexity and trauma. One might say that the “data” is emergent from the biochemical structure of the brain at any time, so this is essentially what needs to be reproduced. It would make more sense to do it in toto, so that all the complex interactions can also be recreated, not just the individual cellular structures. Replacing a neuron (say this is artificially grown) does not replicate the original conditions, because in the replacement the interactions of the unit with the whole are lost, and further, that loss might set up other reactions both in the unit and in the whole. Given these scenarios, a like-for-like replacement, which was the point, will not be achieved. But as we were saying, the problem here is that we have not even discerned the data in the brain; that is, it has not even been converted into a downloadable form, which was the whole point in the first place.
The second scenario, of creating an identical structure with identical data values, is the only option remaining. Again, simply creating an identical biochemical state does not confer the property of downloadability of data; all you have is a biological clone state. We have not really solved the problem of converting biochemical states into digital states, or even of knowing how the two are interchangeable, if at all they are. Also bear in mind that we are not just our brains but also the neural networks in the rest of our bodies.
But for this discussion, say we assume that some data were discovered that basically consisted of all a human’s thoughts, but not their biochemical structures. That data is stored in the manner in which the data is conceived: in a computer. How is that data achieving consciousness? There is no amount or combination of data on my laptop that will do that, and I think my laptop, at 500 GB, holds more data than I do in my brain; I only carry around very vague and selective memories.
So we make another assumption here: that the data does achieve consciousness, because it is run through a supercomputer and downloaded into the format of, say, a large language model; that is why it is different from your computer. Thanks to the amazing self-attending transformer architecture, it is able to incorporate the very meaning of language into its operations. And say that, through another assumption, this LLM developed the same subjective experience as what it is to be you. Does it now at last mean that we have downloaded your consciousness?
No, it means there are possibly now two of you; we have duplicated your consciousness. But the first you has not even noticed any change, let alone noticed being sucked into a digital vortex or anything like that. That you simply yawns at this point, closes the laptop, and goes off to bed; tomorrow is another day at work. In order for you to indeed experience being sucked into a digital vortex, never to return, as we see in so much AI-themed television, something would have to pass between you and your laptop (often it is much smaller than a laptop), something that were the “youness” of you, something for which we already have a spiritual reference, which is the soul. But physically there is no such thing that we have been able to define, and therefore nothing that can fulfil this condition.
A useful way of challenging the notion of “downloadable consciousness” one comes across in works of fiction, where the consciousness of a dead or dying subject is preserved and thereby conferred with apparent immortality, is to ask something as simple as: what happens when this consciousness is downloaded while the person is still alive and healthy? Does this not now give rise to two persons, veritable twins whose life trajectories will diverge from this point on like those of any twins (note that monozygotic twins are effectively biological clones)? Where then is the notion of this supposed transference of the person, or the subjective experience, to the computer? It is an obvious confabulation.
Are/Can LLMs be Conscious?
How do LLMs work?
We need to understand LLMs, which stormed into human history with the advent of ChatGPT in late 2022, in the aftermath of the COVID pandemic, revolutionising what we previously thought computers might be capable of. The reason for the psychological impact LLMs have had is simple: they seem to speak. Language models had developed very gradually, and largely unnoticed, since the 1980s, owing to the limitations of computing power.
The way they work is by codifying a data source, perhaps a few pages of text, into a neural network. I have not heard this being called “codification”; rather, it gets called “compression”. The reason I use “codification” is that words get stored numerically, as a system of numbers.
In a neural network, data is distilled into layers. Each data point is assigned certain weights, and the connections certain strengths. In practice these weights are simply numbers, adjusted during training (not literal voltages and capacitances, except in specialised analog hardware). This is what I understand as the process of compression. But the text is not stored as text; it is in this form of “weights and balances”. This means that, contrary to what people believe, LLMs do not see webpages, or even text in the form of sentences.
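As a minimal sketch of what “words stored as numbers” means (the vocabulary and vector values here are invented purely for illustration, not taken from any real model):

```python
import numpy as np

# A toy vocabulary: each word is assigned an integer id (a "token").
vocab = {"the": 0, "cat": 1, "sat": 2}

# An embedding table: one row of numbers per token. In a real LLM
# these rows have thousands of dimensions and are learned during
# training; here they are random and 4-dimensional.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(len(vocab), 4))

def encode(sentence):
    """Turn text into the numbers the network actually sees."""
    ids = [vocab[w] for w in sentence.split()]
    return embeddings[ids]  # shape: (number of words, 4)

x = encode("the cat sat")
print(x.shape)  # (3, 4): three words, each now a vector of 4 numbers
```

The point of the sketch is only this: from the network’s side there is no text at all, just arrays of numbers.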
But what this process represents is the only means humans have invented by which we might attempt to teach computers language: the distillation of a dataset consisting of volumes of human text (prose and poetry) into a codified and specific algorithmic form that enables an output, stimulated by prompts, in the manner that a human speaks when prompted.
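That “output stimulated by prompts” is, concretely, repeated next-word prediction. A toy sketch, with an invented probability table standing in for the trained network:

```python
import random

# Invented next-word probabilities standing in for a trained model,
# which would compute these from its weights rather than a table.
next_word = {
    "the": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 0.9, "ran": 0.1},
    "sat": {"down": 1.0},
}

def generate(prompt, steps=3, seed=0):
    """Repeatedly sample a likely next word, given the last word."""
    random.seed(seed)
    words = prompt.split()
    for _ in range(steps):
        dist = next_word.get(words[-1])
        if dist is None:  # nothing learned about this word: stop
            break
        choices, probs = zip(*dist.items())
        words.append(random.choices(choices, weights=probs)[0])
    return " ".join(words)

print(generate("the"))
```

A real LLM does the same loop, but conditions on the whole preceding sequence rather than just the last word.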
This is also how a human produces speech, by the way: we do not store webpages like a computer or hard drive; rather, we have an inscrutable means of distilling information into a system of categorisation, with a similar assigning of weights according to relative significance and importance. But we are not very clear on how our brains accomplish this either (this is well described by Kant, for example, and by other “categorisers” like Aristotle and Schopenhauer, people who never saw a computer in their lives). This is also the means by which a baby or child learns language, which is not through any rules of grammar. This is why there was no attempt to teach the machine the rules of grammar; rather, the aim was to somehow confer a knowledge of concepts, as one would teach a baby. But the project is essentially to make a machine talk, to confer language upon a non-living thing, by exalting computation to the level of speech.
One might visualise a neural network by using a larger type of net, like a fishing net (or a volleyball net). Obviously the usual use of nets is as a barrier or filter, but in this case we must not visualise what goes through the net, but rather the net itself. I’ll say one other thing: we don’t understand what goes on inside LLMs very well at all. But if we are to understand anything at all, then we must at least be able to put it into language; these are language models, after all. This is why I like to dispense with drawings, because, useful as they are, they can give a false impression of understanding. I also realise that this is a strange thing to say, which I myself do not fully know if I believe, because I am a very pictorial learner myself (or I thought I was). But I’m going to stick with it for now at least: I want to be able to explain a large language model in language, else I will not feel as though I had expressed it at all. Perhaps it’s the feeling that, were I not to do so, it would mean the computer were able to speak about us, but we were not able to speak back about it, rather only look at pictures of it.
Now, just to complete the net analogy: imagine you were holding one end and your girlfriend were several yards down the beach, holding the other. Now say you wanted to convey a subtle message to her, and the only way to do it were by waving the net, so that she had to interpret it by judging the manner in which the wave reached her, the pressure and rate of the individual strings of that net upon her fingers. The only way you could manipulate that would be to change the weights of the strings. Say that, through some miraculously complex scheme, you were able to manipulate the shape of the wave reaching her so that it contained information, or such that it conveyed language. There is one other thing that you could do, which is to add complexity to the shape of the network itself; this is what is done in the “transformer”, whose stacked layers add exactly this kind of structural complexity (the separate process of adjusting the weights during training is called “back propagation”). Finally, you could ask for a multidimensional net, so as to increase the complexity of the wave form and therefore enable the conveyance of more complex information. So that’s probably the only analogy I have for you, but I think it’s important to have a net analogy if we’re going to be talking about networks. Can’t not have that.
Unfortunately, the first attempts produced pretty lacklustre results. The innovation that turned out to be the game changer was “attention” (which, note, is not the same as “back propagation”: back propagation is how any neural network is trained, while attention is a mechanism inside the network itself), as first laid out in what is now recognised as a seminal paper in the field, “Attention Is All You Need”. This (and I apologise, I did say I was a pictorial learner) is the equivalent of the thing feeding itself into its own a***: each word in the sequence gets to look back at every other word and weigh its relevance before the next output is produced.
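A minimal sketch of the scaled dot-product attention described in “Attention Is All You Need” (the matrix sizes here are toy values, nothing like those of a real model):

```python
import numpy as np

def softmax(z):
    """Softmax over the last axis, numerically stabilised."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention: each row of Q 'looks at' every
    row of K, and the result is a weighted mix of the rows of V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # relevance of each word to each other word
    weights = softmax(scores)        # each row sums to 1
    return weights @ V               # weighted mix of the value vectors

# Three "words", each a 4-dimensional vector (invented numbers).
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))
out = attention(x, x, x)  # self-attention: Q = K = V = x
print(out.shape)  # (3, 4): same shape, but each word now mixes in the others
```

The essential property is visible even at this scale: the output for each word is no longer that word alone, but a relevance-weighted blend of the whole sequence.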
The key is to represent data as numbers. The video from 3Blue1Brown was great here. We are familiar with representing data on a graph. When you plot data points on a graph, it is possible to extract a pattern by joining the dots with a curved, or better still, a straight line. Such a line has only two parameters: its slope and its y (or x) intercept, where it crosses the axis, if it does. In an LLM there are not two but billions of parameters, represented not in 2-D but in roughly 12,000 dimensions (I am not even joking). So while points along the line are represented as (x, y), points in the LLM are represented by matrices, and these arrays of numbers are called “tensors” (another interesting parallel to general relativity). What’s the point? Well, through these complex multidimensional associations, the LLM is able to associate and interlink words according to meaning. You could never do this in two or three dimensions, because two or three parameters simply do not have sufficient bandwidth to carry meaning. This is how numbers can carry meaning.
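To make “numbers carrying meaning” concrete, here is a toy illustration (the vectors are invented; real models learn theirs from data, in thousands of dimensions): words with related meanings end up pointing in similar directions, which can be measured with cosine similarity.

```python
import numpy as np

# Invented 3-D "embeddings", chosen by hand for illustration so
# that "king" and "queen" point in similar directions.
vectors = {
    "king":  np.array([0.90, 0.80, 0.10]),
    "queen": np.array([0.85, 0.75, 0.20]),
    "apple": np.array([0.10, 0.20, 0.90]),
}

def cosine(a, b):
    """Close to 1.0 means 'same direction' (related meaning);
    close to 0 means unrelated."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine(vectors["king"], vectors["queen"]))  # close to 1
print(cosine(vectors["king"], vectors["apple"]))  # much smaller
```

In a real model no one hand-picks these directions; they fall out of training, and geometry (angle between vectors) ends up standing in for relatedness of meaning.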