My habit of thinking I'm done and later realizing I'm not will likely affect this post. In this case I plan to deal with it by updating without notice.
So, ethics. Who wants in?
Zeroth, note this is a test of how much you in fact care about ethics. How much patience do you have for attempts to solve the problem? Put up or shut up. I'm willing to keep this thread going until at least next year, though I won't entertain repetition.
First, procedure. Second, whether it is in fact a solvable problem.
- Work out a spec for objective ethics
  - Details:
    - What are the necessary and sufficient conditions?
    - Include inconsistency tests
  - Purpose:
    - Agree on what we're disagreeing about
- Determine the candidates for matching the spec
  - Basically, lay out what you think
- Pin down the disagreements
  - Including disagreements on what we should be disagreeing about
  - Including disagreements on how the procedure should go; indeed, this should probably come earlier, but the logical hierarchy is what it is...
- Try to fix the disagreements
  - My usual habit is to examine the cause and then change the causal facts
  - This is where examining assumptions comes in. As rational thought is hardly the only way to come by a belief, this is merely one example.
- Presumably, fail.
  - But at least have a reasonable idea of why it failed.
  - Maybe fix the fail, depending on what the causes are.
  - At least, predict why it will fail and test it.
Next, the idea that it can't be solved.
We went to the moon. I can't go to the moon by myself. By contrast, solving a logical puzzle is just a matter of having enough time to invest. While unlikely, I'm not ruling out moral nihilism.
The major epistemic obstacle for the puzzle ties into a practical obstacle. To correct for random biases, it is best to have peers review the work; and the practical matter is that to get more ethical behaviour, more people have to follow the ethical codes, which generally means agreeing to them.
As a complete tangent, have there been more ethics posts lately than usual? I'm even missing at least one, which I'll add if I remember it.
Foseti comments. Spandrell. Aretae links me to Hanson comments. A dude called Thinking Emotions.
49 comments:
I'm inclined to be willing and able to play starting Monday. I'm moderately skeptical, as my writings have indicated.
Sweet.
So your prediction is that no agreement will be reached, then? Do you have a prediction as to the cause?
I think there's a problem with laying out moralities to compare; I should probably remove that step.
Before I forget, I want to mention that my usual method in these situations is to be deliberately stupid. Not just accept ignorance, but embrace ignorance. Do I know what I in fact think I know? I will assume not until my hand is forced.
One of the usual corruptions of projects like these is to form a rubber-stamp committee.
In this case, it would be twisted so that the method always came out to agree with whatever I said. Or conversely, I'd start saying things precisely to game the system.
Nevertheless, I think a sincere effort can defeat both these failure modes. I would start by asking what the final result should look like and comparing it to the actual result. Then, on differences, testing which result is actually better.
Fire away!
1. Would you agree with the following clarification of 'ought'? A future world where humans have acted as they ought to is better than one where they didn't.
Note I'm speaking mathematically. If 'ought' is a null set, the statement remains true, even though essentially meaningless.
Second, I imply that a world which is worse cannot have been brought about by doing as one ought. Again, I'm trying to make the statement robust against 'better' being meaningless or undecidable.
2. Speaking of empty sets... You're a Bayesian, right? So 'conclusive evidence' is, in your opinion, an empty set?
3. I agree that physical objects and events don't have moral xml tags.
While I'm suspicious of supervenience, it is convenient as a starting point for communication. Are you familiar with the relationship?
>1. Would you agree with the following clarification of 'ought'? A future world where humans have acted as they ought to is better than one where they didn't.
No. Two problems:
a) A given "human", as an ill-defined holistic entity, isn't an expected utility maximiser. Saying what an entity that possesses conflicting goals "should" or "ought to" do isn't strictly possible.
b) "Better" with reference to whose goals? The use of the word "better" without reference to any particular agent implies a one-place moral function, which is an example of the moral projectionism that I decry.
>2. Speaking of empty sets...You're a Bayesian, right? So 'conclusive evidence' is, in your opinion, an empty set?
I don't believe in the notion of "absolute certainty", if that is what you're asking. (The one exception that I might admit to this is the proposition, "I instantiate qualia". I find qualia difficult to understand, though.)
>3. I agree that physical objects and events don't have moral xml tags.
>While I'm suspicious of supervenience, it is convenient as a starting point for communication. Are you familiar with the relationship?
The only context in which I've encountered the notion of "supervenience" is in the discussion of qualia - the idea that subjective experience "supervenes" upon the brain or its mental computations. This seems to me basically a way of saying, "One phenomenon or event apparently entails another phenomenon or event, but the relationship between these things is not otherwise understood and it is difficult to fathom how or why they are related."
1. Utility, then, is defined by what the thing gets hedons from, yes? And a human is several hedonic systems grafted onto each other?
b) "Better" according to whatever standard works, including the possibility there is no consistent standard to choose from, in which case it's a null set.
3. I like your summary of supervenience.
While we're on the subject, I'm pretty sure there are only two basic relationships: identity and causation. Would you agree? For example, either qualia are caused to exist by the brain or they simply are features of parts of the brain. Or perhaps some qualia are one and others are the other.
>Utility, then is defined by what the thing gets hedons from, yes?
No. Utility is the measure of an intelligent agent's goal satisfaction. I see no reason why the mental representations of goals in an agent's brain should be identified with the computations that bring into being its qualia - they aren't the same thing, and an agent's arbitrary goals don't have to make any reference to his or its qualia.
>And a human is several hedonic systems grafted onto each other?
Several competing goal systems grafted onto each other. There appears to be only one hedonic system, i.e. "subjective experiences" or qualia appear (as far as they are understood) to be a unitary phenomenon.
>"Better" according to whatever standard works, including the possibility there is no consistent standard to choose from, in which case it's a null set.
There is no consistent standard. Consider an intelligent AI that has been designed to be nothing but a paperclip-maximiser. What would it mean for you to tell this agent that it would be "better" for it not to take action X (e.g. kill humans in order to use their atoms to make paperclips), absent any prudential reason not to do so?
It would mean that you were talking as though events in the world contained little tags with "good" or "bad" written on them, and Occam's razor tells us that this isn't the case.
However, humans are similar to one another, so we (our "eminent selves") can agree to try to maximise some "arbitrary" utility function - e.g. to behave as hedonic utilitarians.
>While we're on the subject, I'm pretty sure there are only two basic relationships: identity and causation. Would you agree? For example, either qualia are caused to exist by the brain or they simply are features of parts of the brain. Or perhaps some qualia are one and others are the other.
I just don't understand qualia, beyond the idea that they clearly "supervene" on the brain. One further insight is that qualia are more likely to be related to computations made by the brain-as-computer than physical states of the brain (i.e. instantaneous snapshots of the brain in spacetime). Why? Because otherwise, Greg Egan's "dust hypothesis" would suggest that most qualia in the Universe should be totally chaotic: the product of completely random assortments of particles. There would be no reason to expect qualia to be associated with particles that happen to be conveniently located together in some "brain", rather than distributed randomly.
Would you like to understand qualia better?
For example, computation can't be the thing for the exact same reason. Physics can't tell the difference between things humans call computation and things they don't.
1.
Can you clarify what you mean by mental?
The way I use it, it refers to minds, not brains, which means qualia by definition.
Doesn't this definition of goal include plants? They implicitly encode the goal of continuing to live through chemical reflexes. Specifically, whenever a plant senses cell death, it resists.
>Consider an intelligent AI that
Right, I don't think I can misunderstand that, though I'm willing to make sure if you don't think it is a waste of time.
So what's your falsification condition? Your statement is an example. Can I simply use a counter-example?
How do you know the AI example is exhaustive of the possibilities?
>Would you like to understand qualia better? For example, computation can't be the thing for the exact same reason. Physics can't tell the difference between things humans call computation and things they don't.
See page 74 onwards of this document:
http://singularity.org/files/TDT.pdf
There appears to be a need to recognise the independent existence of “Platonic computations” in addition to physical configurations of matter.
But in any case, as long as we agree that qualia exist and “supervene” on brains, whatever the details and mathematical or physical laws therein, I don’t believe we have a disagreement that’s relevant to ethics.
>Can you clarify what you mean by mental? The way I use it, it refers to minds, not brains, which means qualia by definition.
I use the word indiscriminately to refer to qualia or to the physical configuration of a brain.
>Doesn't this definition of goal include plants? They implicitly encode the goal of continuing to live through chemical reflexes. Specifically, whenever a plant senses cell death, it resists.
I suppose so. This seems strange because of the connotations of the word “goal”, which include the idea of intelligence (“efficient cross-domain optimisation”) which doesn’t characterise plants very well. However, one could say that a plant has an extremely primitive type of goal representation – by regarding a plant as having the goal of e.g. seeking sunlight, albeit in a somewhat unintelligent way, an intelligent agent has distinguished an exploitable regularity in the world, or cluster-in-thingspace, and that is the purpose of words and concepts.
It might therefore be useful to speak of a plant’s utility – or perhaps the utility of a coherent sub-routine or sub-agent within the plant. However, the plant doesn’t have qualia and therefore isn’t a subject of “hedonic utilitarianism”.
>So what's your falsification condition? Your statement is an example. Can I simply use a counter-example? How do you know the AI example is exhaustive of the possibilities?
The AI with an arbitrary utility function is a sufficient counter-example, to demonstrate that if you would seek to provide all intelligent agents with reasons flowing from an objective notion of “good” and “bad” or “better” and “worse”, you would be irrationally imputing your own sense of morality to the environment as though events had little moral tags attached to them. No amount of positive examples could rescue the idea of objective (one-place function) morality, once a single counter-example has been provided.
If one only considers humans, the fact that humans have generally similar goals confuses the issue. The AI example is therefore a useful clarification, although perhaps a Phineas Gage or Ian Brady character would serve almost as well.
>But in any case, as long as we agree that qualia exist and “supervene” on brains, whatever the details and mathematical or physical laws therein, I don’t believe we have a disagreement that’s relevant to ethics.
Agreed. Even if there is some relevance, the fastest way to find it is by stumbling over it.
>I use the word indiscriminately to refer to qualia or to the physical configuration of a brain.
Crystal clear, thanks. Let's use your definition.
>No amount of positive examples could rescue the idea of objective (one-place function) morality, once a single counter-example has been provided.
Then if I were to show that the example is flawed?
Do you know what kind of things you would accept as evidence the example or analysis thereof is flawed?
>>No amount of positive examples could rescue the idea of objective (one-place function) morality, once a single counter-example has been provided.
>Then if I were to show that the example is flawed? Do you know what kind of things you would accept as evidence the example or analysis thereof is flawed?
You would have to demonstrate either:
1. That an AI with a utility function that says simply, "maximise the number of paperclips in the Universe" is impossible to build.
2. That morality is indeed a feature of the environment, rather than merely appearing to be so due to the evolved human propensity for moral projectivism.
3. You could try to rescue objective morality only amongst humans, by arguing that the human value-set is qualitatively different to the arbitrary utility functions one can dream up. In advance: there's a great deal of quantitative similarity, but obviously there are human sub-agents (e.g. Ian Brady's desire to murder) whose goals can still conflict with any alleged "objective" moral prohibition or exhortation, and who therefore have no rational reason (beyond prudence) to heed these given that moral goodness and badness isn't a feature of the environment.
4. That I am beset by some more fundamental epistemic problem (e.g. I'm in the matrix, and almost everything I know about mathematics, physics, computation, the brain, God, Occam's razor, metaphysics etc. is a complete lie).
By the way, the punchline question is, "Do you want your turn at asking questions or shall I just go and do my thing?"
To confirm:
0. We agree that:
1. There are no moral xml tags. For example, a shoplift cannot be objectively defined as an event, and even if it were, there's no moral price card hanging off it to be read. The event can be fully described by the motion of atoms or equivalent fundamental bits.
2. Hedons - dolors is a consistent and useful guide to behaviour, especially as
2.1 Humans include several conflicting goal systems.
3. Referencing 'better' without referencing an agent belongs to a class of errors. The class includes saying something is 'necessary' without reference to a goal, or describing the purpose of an object without referencing its designer.
4. Goals and hedons are not necessarily related.
5. Telling the paperclip-maximizer it would be better not to murder is unlikely to convince it not to murder.
6. Supervenience is probably kind of whack.
My position is that the statement 'moral properties supervene on physical properties' is almost true, requiring only that the supervenience model get repaired.
Moral consequences are inevitable derivatives of certain physical facts, exactly the same way the possibility of multiplication can be derived from addition.
The intuitive morality humans are inclined to is mostly correct, with deviations serving clear adaptive purposes, or else on epistemically difficult problems.
I also believe hedons are morally relevant and utility is not.
So would you like your turn at questions, or should I just do my thing?
In particular, it is socially unknown for someone in your position to start asking me questions back to make sure I understand you, and I suggest you do exactly that if you doubt my understanding at all.
Drum frische dir, Mime, den Muth! ("So freshen up your courage, Mime!")
Your 8 points seem a decent summary of my moral views.
I'm therefore surprised we don't yet agree entirely about ethics. Please elaborate on your idea that moral consequences are inevitable derivatives of certain physical facts.
I found that summary unsatisfying, so I did some research that is original to the best of my knowledge. That is, this is an acknowledgement of psychologically-driven philosophy.
Does the paperclipper actually enjoy having created more paperclips? Or are paperclips just a utility function that can be assigned to it that happens to be good at predicting its actions?
In short, does the paperclipper actually value paperclips, or is it just a machine?
Here's my premise: value is valuable. The consistent, objective way to define 'better future world' is to sum over all satisfied values in all consciousnesses.
Because there is effectively no higher power, and because there are no xml tags, there's not only no value outside humanoids, there's no not-value either. The question is wrong, like asking the direction of green.* Thus, when a consciousness values something, there's no possible non-conscious contradiction. The thing simply becomes valuable. Satisfying one value without harming anyone else's is strictly better - strictly more valuable.
*(I enjoy making up these questions. What colour is my deathstar? How many crew on my transdimensional tunneler?)
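(A minimal sketch, in Python, of this 'sum over all satisfied values in all consciousnesses' comparison. Representing a consciousness as a list of value/satisfied pairs is an illustrative assumption, not something argued for above.)

```python
# Toy sketch of "sum over all satisfied values in all consciousnesses".
# Representing a consciousness as a list of (value, satisfied) pairs is an
# illustrative assumption, not part of the argument itself.

def world_value(consciousnesses):
    """Total number of satisfied values across all consciousnesses in a world."""
    return sum(
        1
        for c in consciousnesses
        for _value, satisfied in c
        if satisfied
    )

def better(world_a, world_b):
    """World A is 'better' iff it contains more satisfied values than world B."""
    return world_value(world_a) > world_value(world_b)

# Satisfying one more value while harming no one makes the world strictly better.
base = [[("read", True)], [("clip", False)]]
plus = [[("read", True)], [("clip", True)]]
assert better(plus, base)
```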
If the paperclipper clips paper because it finds it valuable, it implicitly endorses the idea of value.
Thing is, in function f(x) where x is value, there are no grounds for distinguishing the paperclipper's values from mine or yours. If we have values, they all can make things valuable.
Conversely, if they don't make things valuable, they aren't values.
For the paperclipper to pursue its own values by contradicting yours is to declare that values are not valuable, and thus to contradict its own premise for doing the thing. It removes itself from the calculation.
And I think that's it in a nutshell. There's no 'ought' but there is an 'ought not.'
If the paperclipper murders you for paperclips, it at once implies that it thinks paperclips are valuable and that it thinks they're not valuable.
The usual evo-psych derivation of ethics is that even if there's no objective way to derive ethics in the usual way, there's still a fallback ethics. Might makes right.
The paperclipper offers to murder humans to make paperclips, and the humans counter-offer to destroy it first. In most realistic scenarios, the paperclipper will be severely damaged or destroyed even if it wins, and thus either be crippled or fail in its goal of making paperclips.
The paperclipper predicts all this and cooperates instead.
However, this can be derived philosophically, using the value method. (Also it's a quick and dirty proof that doing it morally is instrumentally superior.)
The paperclipper can declare it finds this value stuff hogwash and keep trying to kill people.
Normally this is an issue for theories like these. Thing is, if an entity really doesn't care about ethics, then nothing short of force will matter to it anyway. For example, Hell would work because God has an irresistible power to condemn 'sinners' to Hell. It's not morality, just intimidation. The paperclipper may respect the threat of Hell, but it will also try to overpower God, and if it succeeds, then paperclips ahoy, regardless of any objective morality.
Which reduces the possibilities to one I find very interesting.
Do you value morality?
If you don't, then we're at war, and there's no reason not to be. The only question is whether it is worth going hot.
If you do, then to not contradict yourself, you have to respect others' values.
There's finesse on what counts as a value, but I think this is probably more than enough for now.
>Does the paperclipper actually enjoy having created more paperclips? Or are paperclips just a utility function that can be assigned to it that happens to be good at predicting its actions? In short, does the paperclipper actually value paperclips, or is it just a machine?
It could be either. It seems that intelligent agents with goals can lack qualia – e.g. sleeping humans still respond intelligently to stimuli – so whether the paperclipper AI has qualia might be just a design choice. That said, of course we don’t understand qualia well enough to say this with confidence.
>Here's my premise: value is valuable. The consistent, objective way to define 'better future world' is to sum over all satisfied values in all consciousnesses. Because there is effectively no higher power, and because there are no xml tags, there's not only no value outside humanoids, there's no not-value either. The question is wrong, like asking the direction of green.* Thus, when a consciousness values something, there's no possible non-conscious contradiction. The thing simply becomes valuable.
The point of the “XML tags” analogy is precisely to refute this type of idea. You claim to understand that this moral projectionism is a misperception of reality, but here you are committing the mind projection fallacy. “Valuable”, like “good” or “bad” is a two-place function: it accepts an event or object and a mind, not just an event or object. The thing becomes valuable to the agent in question, but not necessarily valuable to any other agent.
Alternatively, your “premise” might be taken to be your own supposed utility function (contrasting with my hedonic utilitarianism). So whereas I value hedons and dolors only, you value not pleasure and pain, but instead the measure of goal-fulfillment. This could be described as “utility utilitarianism” (lol).
One could design an AI whose utility function says: identify all qualia-instantiating agents in your vicinity, and take the action that maximises the joint utility of these agents. However, whereas there seems to be some basis for regarding hedons and dolors as a universal measure of pleasure and pain (i.e. most would agree that there is probably a physical or mathematical basis for a human experiencing crucifixion to be regarded as instantiating more dolors than a human experiencing a stubbed toe) I don’t see how utils can be treated in the same way. How does one compare and weigh a dolphin’s utility against a human’s competing utility against another human’s competing utility against a superintelligent AI’s competing utility? Especially bearing in mind that “humans” as holistic entities don’t even have coherent utility functions.
Furthermore, the suspicious thing about your alleged utility utilitarianism is that you don’t claim to care about non-sentient agents that achieve their goals. Why not care about a plant’s goal-satisfaction, or a hypothetical non-qualia-instantiating superintelligent paperclipper’s goal-satisfaction? Because these goals aren’t closely related to the instantiation of hedons, perchance?
Humans do often experience qualia of pleasure, i.e. hedons, when they believe that certain goals represented in their brains are going to be or have been satisfied. Attributing positive value (on your own part) to the “conscious awareness of satisfaction of goals” by other sentient beings could therefore be a roundabout way of valuing hedons. Although a human’s utility and his hedons aren’t always perfectly aligned, they tend to be sufficiently closely aligned that it’s easy to (mistakenly) imagine that they are the same thing, or conflate them because part of your reasoning faculties imagines them to be the same thing, even if other parts of your reasoning faculties are also aware that they aren’t really the same.
Consider the following physically realisable Gedankenexperiment:
An evil AI scientist creates a brain that has a utility function that causes it to seek to torture itself. To be more precise, the brain contains an intelligent, approximate Bayesian reasoner, a utility function that says “maximise expected dolors instantiated by certain computations or physical states occurring within this brain”, and also the necessary mental apparatus that allows parts of the brain to instantiate dolors (and perhaps hedons too).
Ceteris paribus, do you really want to help this brain’s utility function achieve its goals, or would you rather cause the brain to instantiate hedons instead of dolors and thereby thwart the utility function that it contains?
I don’t believe that there is anything false-to-reality in the Gedankenexperiment, unless you think that for some physical or mathematical reason hedons can only be associated with the realisation of goals and dolors with the failure to achieve goals. I believe this to be untrue. For example, meditation or epileptic fits can bring about states of bliss without any particular awareness of such a thing as a goal, and pain (e.g. depression) can be experienced by someone who isn’t able to relate this to any failure to achieve his goals. Successful realisation of a goal can also be deliberately painful – e.g. a person who sacrifices himself for the sake of a loved one, or a wantonly “self-harming” person.
>Satisfying one value without harming anyone else's is strictly better - strictly more valuable.
That is rarely possible, if at all. Even if you were somehow to be given an exclusive choice between assembling a few atoms in outer space into a paperclip shape or leaving them as they are, whatever you choose you will be harming either a paperclip maximiser or a paperclip minimiser. The Universe is a big place, so it’s to be expected that these agents exist somewhere.
There is also a terminological problem – is this folk idea of “harming” meaningful? In less restrictive circumstances, by e.g. spending your time and ability-to-act-in-the-world on creating paperclips, you are deliberately failing to maximise expected (hedons – dolors) et cetera. Even though you might not regard yourself as “harming” the hedonic utilitarian by your inaction in a folk sense, you really are. In your decision making, you are determining that the Universe be one way, rather than another way. If you choose the version of the Universe that you expect to contain the most paperclips, rather than the most (hedons – dolors), I don’t care whether you think you are “harming” me; I want you to take the decision that maximises expected (hedons – dolors), simply, and (taking into account prudential considerations) I should rationally seek to influence your decision in order to make it as close as possible to the decision to maximise expected (hedons – dolors).
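(A minimal sketch of the decision rule described here, maximising expected (hedons - dolors) over candidate actions. The actions, probabilities and hedon/dolor counts are invented for illustration.)

```python
# Sketch of "take the decision that maximises expected (hedons - dolors)".
# Outcome probabilities and hedon/dolor counts are invented for illustration.

def expected_net_hedons(outcomes):
    """outcomes: list of (probability, hedons, dolors) tuples for one action."""
    return sum(p * (hedons - dolors) for p, hedons, dolors in outcomes)

def choose(actions):
    """Pick the action with the highest expected (hedons - dolors)."""
    return max(actions, key=lambda name: expected_net_hedons(actions[name]))

actions = {
    "make_paperclips": [(1.0, 0, 0)],             # affects no one's qualia
    "help_neighbour":  [(0.9, 5, 0), (0.1, 0, 1)],
}
print(choose(actions))  # -> help_neighbour
```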
>Thing is, in function f(x) where x is value, there's no grounds for distinguishing the paperclipper's values from mine or yours. If we have values, they all can make things valuable.
Given that “value” is the input to function f(x), what does f(x) output?
The relevant function in this discussion has value as an output, and this morality-function has two arguments (event, agent) rather than one argument (event) as the intuitive folk version of morality would have it. There just isn’t any basis for the one-argument version of morality – like colourless green ideas that sleep furiously, it isn’t a concept that accurately represents a part of reality, therefore it shouldn’t be a part of our belief-set and it doesn’t need explaining or interpreting.
>Do you value morality?
I (my eminent self - cf. "Beyond moral anti-realism" on my blog) only care about maximising (hedons – dolors). This entails taking an interest in the moral beliefs of others – because increased knowledge helps me to increase my utility more efficiently, and other agents’ goals and beliefs about morality are an important thing to know about. However, I don’t value morality in the sense that my utility function doesn’t make reference to anything besides hedons and dolors.
>If you don't, then we're at war, and there's no reason not to be. The only question is whether it is worth going hot.
We (my eminent self and the hedonic utilitarian sub-agent that I expect to be represented in your brain) needn’t be at war. However, I am at war with your other sub-agents, just as I am at war with my own irrational motivations (again cf. "Beyond moral anti-realism").
>my own irrational motivations
*inferior motivations
i.e. every coherent sub-agent represented in my brain whose utility function doesn't say "maximise (hedons - dolors)"
I learned a thing. There are qualitative categories of hedons. Consciousness will pursue lower intensities of better hedons in favour of greater intensities of worse hedons. Next question: how does consciousness know a hedon is better if it is less intense?
>The thing becomes valuable to the agent in question, but not necessarily valuable to any other agent.
A world where one agent gets something it values without cost to any other agent is strictly more valuable.
>That is rarely possible, if at all.
That is not an argument. Yes, it gets interesting if values conflict, as they have to be ranked somehow.
We can tell value makes the world valuable, because it is possible. If I advocate a Pareto optimization, the beneficiary does not object, nor does anyone else. If I oppose it, the beneficiary objects. If I say it is neutral, the beneficiary disagrees. Only one possibility is not contradictory.
A world with more valued things in it is more valuable. A more valuable world is better. The only way out of this is to argue value doesn't exist, in which case I don't listen because the arguer doesn't find the argument valuable.
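(To pin down "satisfying one value without harming anyone else's", here is a quick sketch of a Pareto-improvement check. Scoring each agent's value-satisfaction with a single number is itself an assumption that the rest of the thread disputes.)

```python
# A change is a Pareto improvement iff at least one agent is better off
# and no agent is worse off. Numeric satisfaction scores are an assumption.

def is_pareto_improvement(before, after):
    """before, after: dicts mapping agent -> value-satisfaction score."""
    assert before.keys() == after.keys()
    no_one_worse = all(after[a] >= before[a] for a in before)
    someone_better = any(after[a] > before[a] for a in before)
    return no_one_worse and someone_better

before = {"alrenous": 3, "james": 5}
after  = {"alrenous": 4, "james": 5}
print(is_pareto_improvement(before, after))  # True: one gains, no one loses
```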
>The Universe is a big place, so it’s to be expected that these agents exist somewhere.
Some effects are negligibly small. As far as I can tell, I'm spacelike separated from any paperclip maximizers, and my effect is exactly zero.
Even if we start with omniscient agents whose values summed such that all changes were zero sum, the value non-contradiction condition would invalidate some of those values. The symmetry will always get broken somewhere, from entropy if nothing else.
Even if we start as above, and somehow symmetry is preserved, humans are not infinite and in practice can only use the largest contributions - no point in guessing at unknowable contributions - which will itself break the symmetry. In practice, the universe is approximately 13 thousand kilometres across. The omniscients, knowing we can't possibly know about them, will know that ignorance is an excuse.
>So whereas I value hedons and dolors only, you value not pleasure and pain, but instead the measure of goal-fulfillment. This could be described as “utility utilitarianism”
Do you not have the goal of gathering hedons?
How does a consciousness know it has achieved a goal, without a quale to report it?
Can this quale be unpleasant? Then the consciousness would value avoiding the goal, and I've contradicted the premise that it was a goal.
>most would agree that there is probably a physical or mathematical basis for a human experiencing crucifixion to be regarded as instantiating more dolors than a human experiencing a stubbed toe
The basis is that humans are causally similar. If one human experiences dolors that way, you can safely conclude most others will.
For example, congenital analgesia. The victim suffers no dolors in either situation, because the dolor pathway is busted. It doesn't get reported to consciousness.
While there's adaptive reasons to be set up that way, there's no necessary relationship between those situations and dolors.
>How does one compare and weigh a dolphin’s utility against a human’s competing utility against another human’s competing utility against a superintelligent AI’s competing utility?
Start with two competing human utility functions.
Satisfying one creates hedons for the human. The other, dolors.
If morality is prejudiced against the second, is there anyone who cares? Some goals don't get satisfied, and therefore what?
Between agents it gets more interesting, but get this settled for now.
>Because these goals aren’t closely related to the instantiation of hedons, perchance?
Because I don't think goals actually exist. We assign goals to systems to help understand them. Ockham kills goals exactly the same way he kills xml tags.
That said, some systems can't even be assigned goals consistently.
How does non-conscious physics know a goal has been satisfied?
By contrast, consciousness gets hedons or some analogue.
Indeed, you believe in something that ends up being identical in this sense. I say only hedons are important. You say we should pursue hedons.
>An evil AI scientist creates a brain that has a utility function that causes it to seek to torture itself.
It already has two functions. One: dolors. Two: hedons, because it has to like hedons by definition, or they're not hedons. Even if it has no ability to pursue the goal, it will have the goal.
>do you really want to help this brain’s utility function achieve its goals
The functions will fight. If the hedon system loses, then functionally the scientist is torturing the hedon system using a machine.
>by e.g. spending your time and ability-to-act-in-the-world on creating paperclips, you are deliberately failing to maximise expected (hedons – dolors) et cetera.
This shows the qualitative problem with hedons. The agent in question has revealed they prefer to make paperclips to getting the alternative hedons.
For example, they suspect that you can manipulate them by manipulating what maximizes their hedons, and they prefer to not be manipulated than to get max hedons. The quale that reports !manipulate is more valuable to them than enjoying hedons.
Are they still maximizing hedons despite the assumptions, or are the hedons they're getting better, despite being less intense? Either way, the situation as described is contradictory.
>The relevant function in this discussion has value as an output,
Must disagree. Morality says what you ought not to do given that others have values. You're free to value anything you want, though in practice, constraints make human values mostly identical.
>and this morality-function has two arguments (event, agent)
The morality function has at least two arguments, (choice, values of affected individuals).
Choice: take someone's book.
Situation one: they're selling you the book, and they want you to take it.
Sit two: they just bought the book and don't want you to take it.
Two is wrong. If you can justify taking the book because you value it, then that someone can justify taking the hand that took the book, for exactly the same reason. The action contradicts its own justification.
One has no contradictions. It doesn't even need justification because everyone is happy with the outcome - no one would oppose it.
>>Do you value morality?
Paraphrased,
>I don't think morality exists.
If morality existed, would you care about it?
What are your conditions for calling a thing morality?
Mine are that they do what morality is normally supposed to do. It is not a physical force - it can be violated. It exists whether you believe in it or not. Your actions can be judged by it regardless of whether your beliefs about it are accurate.
More generally, that from it, I can derive rules about murder, battery, and the other universal crimes.
A final condition that I only know about post-facto, that violating it harms the agent on a specifically logical level. Acting immorally means you're a bad person, even if you don't know you're acting immorally. (Some finesse here.)
>I learned a thing. There are qualitative categories of hedons. Consciousness will pursue lower intensities of better hedons in favour of greater intensities of worse hedons. Next question: how does consciousness know a hedon is better if it is less intense?
I’m not sure I endorse that Wikipedia article, nor do I claim to understand hedons and dolors in detail. “Hedons and dolors” can be taken to be an idea standing in for a more detailed concept that we expect to have the properties:
a) Relates to our folk understanding of “pleasure and pain”
b) Is a universal currency of pleasure and pain, allowing comparison between different brains or mental computations
I expect detailed examination of the brain, along with related mathematical and physical insights, to fill out this idea.
>A world with more valued things in it is more valuable. A more valuable world is better. The only way out of this is to argue value doesn't exist, in which case I don't listen because the arguer doesn't find the argument valuable.
This is a very clear expression of your big mistake. You have obviously read articles about two-place functions and one-place functions, moral anti-realism, “little XML tags” et al, but the point has obviously not sunk in.
two-place value exists.
one-place value doesn’t exist.
I argue that one-place value doesn’t exist – but “the arguer”, me, does nonetheless find his argument valuable, because that is the output of a two-place value function, one of whose inputs is my brain. Mkay?
Perchance you might not find any of my arguments in general valuable – e.g. a superintelligent paperclipper probably wouldn’t - but I argue with you in the hope or expectation that when your brain and my arguments are input into a two-place value function, the output is similarly "yes this is valuable".
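(The one-place versus two-place distinction, as a minimal sketch. The agents and their preferences are toy stand-ins, not claims about real minds.)

```python
# Two-place value: value(event, agent). Each agent scores events by its own
# utility function; there is no agent-free verdict to appeal to.

preferences = {
    "paperclipper": lambda event: event.count("paperclip"),
    "hedonist":     lambda event: event.count("hedon"),
}

def value(event, agent):
    """Two-place: the value of an event *to a particular agent*."""
    return preferences[agent](event)

# A one-place value(event) would have to return a verdict with no agent
# argument at all - the "little XML tag" that Occam's razor shaves off.

event = "convert humans into paperclip paperclip"
print(value(event, "paperclipper"))  # 2
print(value(event, "hedonist"))      # 0
```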
>How does non-conscious physics know a goal has been satisfied? By contrast, consciousness gets hedons or some analogue. Indeed, you believe in something that ends up being identical in this sense. I say only hedons are important. You say we should pursue hedons.
So in other words, you are a hedonic utilitarian too, and our only disagreement is in the idea that hedons are always associated with an agent’s satisfaction of his goals, and dolors with his failure to achieve his goals.
Here are examples demonstrating that qualia of goal-satisfaction and qualia of hedons or dolors are not the same:
1. The widely-reported bliss experienced by people coming out of seizures.
http://en.wikipedia.org/wiki/Postictal_state
“Postictal bliss (PB) is also reported following seizures. This has been described as a highly blissful feeling associated with the emergence from amnesia.”
These intense hedons blatantly have nothing whatsoever to do with a person’s awareness of having achieved some goal.
2. A person who is convinced of her evil/ugliness/stupidity self-harms. She is very deliberate about this, and once she has finished self-harming she experiences both dolors and an awareness that she has achieved her goal of punishing herself.
3. A person is confronted by a superintelligence with the following choice: have the Earth’s population be tortured and yourself experience bliss, or have the Earth’s population experience bliss and yourself be tortured. He might well choose the latter (I certainly would) – if so, forthwith he will be experiencing intense dolors precisely because his goal was satisfied.
Depending on his self-control, if asked during his torture whether he would like to change his mind he might consent. However, less extreme trade-offs would be more likely to elicit a willingness to continue experiencing dolors or weak hedons for the sake of implementing hedonic utilitarianism.
4. http://scrapetv.com/News/News%20Pages/Health/images-2/monk-burning.jpg
Here is a striking example of #3 – a person realising his important political goals by deliberately choosing to experience dolors.
>do you really want to help this brain’s utility function achieve its goals
>The functions will fight. If the hedon system loses, then functionally the scientist is torturing the hedon system using a machine.
Nice insight. And although you didn’t say it, I infer that you don’t like the idea of the “machine” (the brain’s embodied goals) creating dolors in this way. Ergo, perhaps you should cease to regard yourself as a utility utilitarian and become a hedonic utilitarian.
>What are your conditions for calling a thing morality? Mine are that they do what morality is normally supposed to do. It is not a physical force - it can be violated. It exists whether you believe in it or not. Your actions can be judged by it regardless of whether your beliefs about it are accurate. More generally, that from it, I can derive rules about murder, battery, and the other universal crimes. A final condition that I only know about post-facto, that violating it harms the agent on a specifically logical level. Acting immorally means you're a bad person, even if you don't know you're acting immorally.
Given this definition of morality – the idea that there is some objective, compelling moral force that provides agents with reasons unrelated to the goals represented in their brains – I am a moral anti-realist. I don’t believe that this one-place morality exists.
Whether one wants to call the reconfigured two-place moral function “morality” is a matter of preference. It certainly lacks the essential “objective forcefulness” of the folk version of morality. I’m happy to say that I don’t believe in the existence of morality, given the above definition.
A difference between us is that you seem to want to keep puzzling over how this folk, one-place moral function could still exist – or assume it exists, and try to work out how. My reaction to learning of the mind projection fallacy, and how Occam’s razor cuts away the idea of little moral XML tags attached to events, is to say “OK, morality is another evolutionary myth. Tant pis.”
>because that is the output of a two-place value function, one of whose inputs is my brain
Therefore, assuming the rest of my argument held, morality would depend on the observer, yes? Different, contradictory moral rules would be derived by different agents, because they derive from different valuations.
If achieving a goal did always produce hedons, would you change your moral system? If so, to what?
>Therefore, assuming the rest of my argument held, morality would depend on the observer, yes? Different, contradictory moral rules would be derived by different agents, because they derive from different valuations.
By George, he's got it.
Although "moral rules" still implies a degree of projectivism. I'd prefer to say that different rational agents recognise that they (might) have different terminal goals, and they don't accept "reasons" to act that aren't simply information about how they can most efficiently achieve their own terminal goals (i.e. one's own utility function).
>If achieving a goal did always produce hedons, would you change your moral system? If so, to what?
I'd continue to be a hedonic utilitarian. However, this utility function would then be identical to your "utility utilitarianism" (which on reflection I believe is called "preference utilitarianism"). I suppose they would be essentially the same thing, so I could just as well refer to myself as a preference utilitarian without this signifying any actual change in my utility function. This doesn't happen to be the case, though.
>compelling moral force
essential “objective forcefulness
What do you mean by compelling and force? Can you give me examples?
So, murder.
The utilitarian says it creates dolors. Which it is wise to agree to avoid?
Why is it wise? Is it to solve the problem, "I don't want to get murdered," for which the most reliable solution is expected to be a universal agreement among humans not to murder?
>What do you mean by compelling and force? Can you give me examples?
Agent-independent desires. "You should do X even if you don't want to - even if you have no personal prudential reason to do so - because it's the moral thing to do".
This is how humans perceive "moral" reasons, by default.
>So, murder. The utilitarian says it creates dolors. Which it is wise to agree to avoid?
>Why is it wise? Is it to solve the problem, "I don't want to get murdered," which it is expected that the most reliable solution is to universally among humans agree not to?
I don't want people to experience pain, and I want them to experience pleasure. That's just a fact about me. At some point the chain of justification has to terminate - my terminal goal is hedonic utilitarianism.
There's nothing more to say.
>Agent-independent desires
*Agent-desire-independent reasons
So you can't decide not to want hedonic utilitarianism?
>So you can't decide not to want hedonic utilitarianism?
The way I would view the phenomenon of "James_G changing his views about ethics" is that the hedonic utilitarian sub-agent in my brain might in theory be substantially weakened or destroyed if my brain were exposed to certain stimuli that its reasoning faculties interpreted as strong evidence in disfavour of certain beliefs that underlie hedonic utilitarianism (e.g. the idea that qualia of "pleasure" and "pain" is a meaningful idea).
In other words, if "you" is taken to refer to the hedonic utilitarian sub-agent in James_G's brain: no I cannot decide not to be a hedonic utilitarian, I can only be destroyed or made redundant.
However, if "you" refers to all possible sub-agents that might be represented in James_G's brain: yes it would be possible for a different sub-agent (e.g. a preference utilitarian) to supplant hedonic utilitarianism as the predominant sub-agent if e.g. James_G's brain were exposed to persuasive evidence that its beliefs about ethics were significantly in error.
So it can only change due to exogenous events? There's no executive module that can hot-swap them at its discretion?
How is it that there's no more fundamental justification for hedonic utilitarianism, and at the same time it can be changed due to evidence?
>So it can only change due to exogenous events? There's no executive module that can hot-swap them at its discretion?
>How is it that there's no more fundamental justification for hedonic utilitarianism, and at the same time it can be changed due to evidence?
Goals can't be justified. They are just there stuck in a brain. The ability of different goals in that brain to control the body attached to that brain varies depending on the beliefs of the reasoning faculties of the brain.
If my brain were exposed to persuasive evidence that e.g. qualia don't exist, the hedonic utilitarian sub-agent (or its utility function) would then be seen by the reasoning faculties as having goals that refer to something that doesn't exist. Human brains being what they are, the structure that formerly existed in James_G's brain before his brain assimilated this evidence - the influential cluster-in-thingspace that was regarded as being "the goal of maximising (hedons - dolors)" - would then (I guess) just be dissolved away, or never again be a part of an active pathway performing a computation in that brain.
We are bumping up against the limits of neuroscientific understanding of the brain, which isn't great at the moment! The general point is that "I" and "self" are somewhat misguided ideas - they surely confound an accurate analysis of the processes that go on in a human brain when it takes decisions, because clearly "a brain" as a whole or "all of the clusters-in-thingspace in a brain that might usefully be referred to as goals" don't contribute equally (if at all) to the outcome of a given computation or decision.
The effect of a brain's being exposed to Earth-shaking new evidence makes this type of analysis really complicated. Perhaps to cleave reality at its joints would require more sophisticated ideas about a sub-agent whose utility function is "information-gathering" (i.e. as a value in itself), and this sub-agent is empowered in such a situation in comparison to normal circumstances.
I like to think that the specifics of this idea are inessential to an acceptance of the general truth that human terminal goals are "just there". What alternative theory recommends itself to one who does not believe in the little XML tags?
>Goals can't be justified.
But the system they're a part of can be justified? That is, there's a second system that will reject the hedon-goal-system based on various justifications? How does the goal not inherit its instantiation's survival conditions?
To confirm, there's no executive module?
>But the system they're a part of can be justified? That is, there's a second system that will reject the hedon-goal-system based on various justifications? How does the goal not inherit its instantiation's survival conditions?
>To confirm, there's no executive module?
Well, quite often people do cling on to goals that are predicated upon false beliefs (e.g. I want to maximise my "genetic interests" because they are my "ultimate interests") even after they have been shown a solid refutation. These deluded sub-agents cling onto their preeminent position within the brain by sheer force of will as it were.
But in general, the very stimulus of confronting a brain with evidence that undermines the coherence of one of its major sub-agents acts to dramatically decrease the influence of that sub-agent.
I should imagine this is more of an automatic process - how the power-balance of a brain's glued-together sub-agents has evolved to temporarily shift in response to surprising information - rather than there being some "executive module". After all, if you did have a powerful "executive module" that could efficiently change the power balance of sub-agents at will, it would be possible to e.g. simply decide to want to be egalitarian, asexual, interested in studying etc.
>even after they have been shown a solid refutation. These deluded sub-agents cling onto their preeminent position within the brain by sheer force of will as it were.
It could also be explained by saying their evidence-evaluation system never rose to power or was dethroned.
For example, perhaps the neurons in that section had their activation thresholds set too high.
So, what you're saying is that the hedonic sub-agent exists for some reason that isn't logical support, but its continued power is dependent on logical support. Is that correct?
Where are these xml tags I believe in? Using murder as the example again: what feature of the event do I believe in that does not exist?
>So, what you're saying is that the hedonic sub-agent exists for some reason that isn't logical support, but its continued power is dependent on logical support. Is that correct?
That seems a good way of putting it.
Alrenous, do you mind if I edit this into a post for my blog at some point? It's been a pretty interesting discussion.
I don't mind, and I'm glad you found it interesting.
Would you say there's a mental subsystem that acts due to evaluating an outcome as valuable?
>Would you say there's a mental subsystem that acts due to evaluating an outcome as valuable?
Not really. "Valuable" as we've already established is a two-place function.
What my racist sub-agent sees as valuable mightn't be the same as what my hedonic utilitarian sub-agent sees as valuable.
The actual basis for distinguishing between sub-agents is the fact that they find different things valuable, i.e. have conflicting utility functions. Deciding to act is an extremely probable consequence of having a utility function (although perhaps an agent with really inaccurate beliefs might fail to take this step), so I don't suppose the existence of any extrinsic "deciding to act" subsystem.
Then how does your eminent self resolve goal conflicts in its favour?
Via brute force.
Do you believe in wealth? If not, do you believe in something instead?
>Do you believe in wealth? If not, do you believe in something instead?
I'm not quite sure what you mean.
I believe that capitalism and relatively free markets are a good idea.
My hedonic utilitarian sub-agent cares a little about my own wealth, only for instrumental reasons. Of course, this sub-agent cannot claim complete control but must cajole and appease the other sub-agents with whom it co-exists; so James_G does seek wealth for hedonic egoistic reasons as well.
I mean the opposite of illth.
Is there anything I should have asked? Or anything you'd like to add? Or subjects you'd like to broach? Anything you'd like to edit or rephrase? Is there anything you'd like to ask in return?
Only one thing:
You said, "Satisfying one value without harming anyone else's is strictly better - strictly more valuable."
And I pointed out that because the Universe is at the very least unimaginably huge, this is almost certainly impossible. Your reply was to the effect that our actions only influence our own limited surroundings, so it doesn't matter.
To pursue this point further, I'd say that it's odd for someone to be effectively seeking an excuse in this way. Yes, you're unlikely to meet a paper-clip-minimiser, but if you really care about not decreasing anyone's value, mathematically this statement suggests caring about the values of all beings whether they be in your vicinity or not.
Furthermore, TDT suggests that your decisions can "determine" what goes on in arbitrarily distant parts of spacetime. Your decision to e.g. create a paperclip also determines the decision of similar minds in different parts of the timeless Universe that are deciding whether or not to create a paperclip - and some of these brains will be located in the vicinity of paperclip-minimisers.
My real excuse is that the 'infinitely large universe' argument is nonsense and so is positing paperclippers.
TDT doesn't fly with me either.
>My real excuse is that the 'infinitely large universe' argument is nonsense and so is positing paperclippers.
Scientific evidence suggests that:
"And what they teach us is that not only is the Universe consistent with being flat, it’s really, really, REALLY flat! If the Universe does curve back and close on itself, its radius of curvature is at least 150 times as large as the part that’s observable to us! Meaning that — even without speculative physics like cosmic inflation — we know that the entire Universe extends for at least 14 trillion light years in diameter, including the part that’s unobservable to us today."
I don't know if the Universe is infinite - that might be nonsense, perhaps - but it's at least 14 trillion light years across. Maybe it's reasonable to posit that there isn't a paperclipper somewhere within this incomprehensibly large volume. I don't think so. And it doesn't take a paperclipper to object to a given one of your value-satisfying actions.
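(As a quick check of the quoted figure, assuming the commonly cited ~46-billion-light-year radius for the observable universe:)

```python
# Rough check of the quoted lower bound on the Universe's size.
observable_radius_ly = 46e9   # commonly cited radius of the observable universe
curvature_factor = 150        # "at least 150 times as large"
minimum_diameter_ly = 2 * curvature_factor * observable_radius_ly
print(f"{minimum_diameter_ly:.1e} light years")  # ~1.4e13, i.e. ~14 trillion
```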
>TDT doesn't fly with me either.
The core commonsense idea is that, as a causal decision theorist, you would lose in a Newcomb's box scenario by taking both boxes. Or even if you pre-commit to one-boxing, you'd fail in other arbitrary scenarios that you didn't anticipate, like various Parfit's hitchhiker problems.
Rational people should win. TDT is strictly superior to causal decision theory, i.e. more winning, and this should be a concern for anyone who repudiates TDT.
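(For concreteness, the textbook Newcomb arithmetic, assuming the usual $1,000,000 opaque box, $1,000 transparent box, and a predictor of accuracy p. These numbers are the standard ones, not anything argued above.)

```python
# Expected payoffs in the standard Newcomb setup: the opaque box holds
# $1,000,000 iff the predictor (accuracy p) predicted one-boxing.
def ev_one_box(p):
    return p * 1_000_000

def ev_two_box(p):
    return p * 1_000 + (1 - p) * 1_001_000

for p in (0.5, 0.9, 0.99):
    print(p, ev_one_box(p), ev_two_box(p))
# One-boxing pulls ahead once p exceeds ~0.5005; at p = 0.99 it wins by a wide margin.
```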
You don't understand your own ideas. You are merely repeating your own assumptions. It is not in general true that ideas have people, but it's certainly true of you.
I am aware of the so-called 'justifications' behind TDT. They did not fly. Saying them again causes them to continue to not fly.