On the face of it, it's really fucking dumb.
Humans can't even get good birthday gifts for conspecifics. How are they supposed to know the desires of a superintelligence defined as being unable to communicate with them? Doing tests instead of trusting your first guess is kind of the point of science. Can't do tests on future entities.
On the other hand...
Yudkowsky and co. seem highly convinced that the basilisk meme has nonzero effectiveness and extracts donations to MIRI (and the like), up to the point of self-destructive giving. It is implausible that Yudkowsky didn't know about the Streisand effect; the term was coined five years before the basilisk post.
A truly devious Machiavellian who benefits from donations to AI research would then try to maximize the amplitude of the Streisand effect by maximizing the amplitude of the attempted suppression, on the assumption the two are positively correlated.
Yudkowsky reacted with the maximum plausible display of emotion and suppression, restrained only by diminishing returns.
So either he's a true defector, or he's really, really, really dumb. Also, plz into emotional continence.
--
More dumb:
For it to be possible to defect on me, I have to define 'me' as including sensations I do not perceive, namely the sensations of future simulations of me. Or, alternatively, I do feel those sensations, meaning it's not acausal, it's just interaction across spacelike separations, such as time travel. Because that wouldn't break the universe or anything.
Yudkowsky accepts that causal decision theory concludes you should defect in the prisoner's dilemma. In other words, Yudkowsky could have discovered that the conclusion is untrue, rather than trying to invent a whole new theory which incidentally creates the apparent possibility of basilisks.
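To make the target concrete, here is a minimal sketch of the dominance argument causal decision theory runs on in the one-shot prisoner's dilemma; the payoff numbers are invented for illustration, not taken from anyone's writings.

```python
# Illustrative one-shot prisoner's dilemma payoffs (years in prison for the
# row player; lower is better). The numbers are made up for this sketch.
PAYOFF = {
    ("cooperate", "cooperate"): 1,
    ("cooperate", "defect"):    3,
    ("defect",    "cooperate"): 0,
    ("defect",    "defect"):    2,
}

# Causal decision theory treats the other player's choice as fixed and outside
# your causal control, so it just asks which action does better against each
# possible opponent move.
for their_move in ("cooperate", "defect"):
    best = min(("cooperate", "defect"), key=lambda mine: PAYOFF[(mine, their_move)])
    print(f"If they {their_move}: best response is {best}")
# Defection wins in both rows, i.e. it strictly dominates. That is the
# conclusion being taken at face value.
```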
"Since there was no upside to being exposed to Roko's Basilisk, its probability of being true was irrelevant."
Zeno's paradox was a brilliant dig at the idea that Greek philosophy understood physics, motion in particular. Equally, the basilisk shivs Yudkowsky's decision theory. But there's no upside to knowing Yudkowsky's theory has holes in it, now is there?
Classical decision theory already resists blackmail; it simply requires the investigator not to stop when they find an emotionally valent conclusion, but to continue until the logic stabilizes.
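A toy version of that "continue until the logic stabilizes" step, with payoffs I made up for the sketch: once the victim's settled policy is to refuse, a blackmailer who predicts that policy gains nothing by threatening, so the threat never arrives.

```python
# Toy extortion game with invented payoffs. The blackmailer moves first
# (threaten or stay quiet), the victim responds (pay or refuse). Carrying
# out a threat costs the blackmailer something, the standard assumption.
BLACKMAILER = {("threaten", "pay"): 5, ("threaten", "refuse"): -1, ("no_threat", None): 0}
VICTIM      = {("threaten", "pay"): -5, ("threaten", "refuse"): -10, ("no_threat", None): 0}

def blackmailer_move(victim_policy: str) -> str:
    """The blackmailer predicts the victim's fixed policy and threatens only
    if doing so pays better than staying quiet."""
    value_if_threaten = BLACKMAILER[("threaten", victim_policy)]
    return "threaten" if value_if_threaten > BLACKMAILER[("no_threat", None)] else "no_threat"

for policy in ("pay", "refuse"):
    move = blackmailer_move(policy)
    outcome = VICTIM[(move, policy if move == "threaten" else None)]
    print(f"Victim policy '{policy}': blackmailer chooses {move}, victim gets {outcome}")
# A known 'refuse' policy makes threatening unprofitable, so no threat is made
# and the refuser ends at 0, better than the payer's -5. Stopping the analysis
# one step earlier ("refusing costs -10, so pay up") is the mistake.
```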
--
Yudkowsky's sequences are pretty okay. I want to know whether applying logic consistently really is that hard or if Yudkowsky isn't even genuinely trying. Also, plz into emotional continence.
Yudkowsky's a cult leader, what else can you expect? Sure, he's brighter than the average Revelation-interpreting megalomaniac, and his followers all think they're Spock, but LW is a cult.
Simple innocent forums for philosophy enthusiasts don't have communes or group sex.
No offense, but... wut?
They have houses where believers live a communal existence with fellow believers. They do a lot of "polyamory," which, strangely enough, tends to involve the wise and benevolent leader mating with the most desirable females. They have "cuddlepiles." They have communal worship rituals like their solstice festivals. Their leader is borderline-superhuman, a man of destiny with the insight and spiritual development to save the world (which stubbornly refuses to properly recognize his greatness).
Their worldview is, of course, extremely eschatological. Wits have been calling the singularity "the Rapture of the Nerds" for decades, but LW takes it a step further and asks you to join a group of fellow believers and dedicate your life to it.
LW members tend to be smart, articulate, and media-savvy. They do a good job presenting themselves as nothing more than a discussion forum, or at worst a subculture, for people interested in philosophy, rationality, AI, and all that. But when you get deep enough, you go live with them. That is a cult.
Hmm. Two things on this post:
https://www.amerika.org/politics/singularity/
Not sure if you've read it (maybe?) but the basilisk problem always reminded me of this.
Regarding the basilisk itself:
- The actions of non-donors do not count as defection. (I've donated zero dollars to the creation or upkeep of, say, General Motors. In fact I've never bought a single product from them, only second-hand goods. I'm not defecting against them, nor cooperating.)
- "Incentive to hurt" - this seems to me like they expect the superintelligence to act like a petty child. Petty-child-level analysis, then?
- I have never really read Yudkowsky's responses before but, yikes. Isn't a rational agent meant to update their perceptions with new information? He just twisted himself into knots rather than point out the obvious flaws in the argument.
Non-donation is indeed not defection; the basilisk is defecting on you, and promises to defect harder if you don't submit. Standard extortion, except for the time-travel / telepathy / perpetual motion.
I wonder if it was intended to be a backhanded dig at the idea of Hell. Or this: "Say Atheism is Christian without saying it's Christian," depending on how much irony is involved.