User:Cosmia Nebula

Somepony that has wandered into human world before she was born and longs to return to her world. Astronomer and mathematician in pony world, merely mathematician in human world.

This user is a very edgy pony.

This user believes in Multi-Layer Perceptrons: Function interpolation Magic.

This user is a participant in WikiProject
History of Science.

This user enjoys reading Borges.

This user enjoys origami.

LISP-N

This user is a native LISP programmer.

Pages I wrote a significant amount about

I aim to bring useful rigor to Wikipedia. Whenever there is a mathematical proof that can be compressed down to less than one page, I try to get it in.

I remember a joke about a programmer who always wrote very easy-to-read code. When asked about his secret, he said that he liked to smoke weed when he programmed, which makes it hard to keep in mind what was on the previous page (it was back when the screens were 24 lines x 80 chars), so he tried to keep every page to be understandable on its own.

More unusual things I did on Wikipedia

Fighting Schmidhuber

Sometimes when I read pages about neural networks, I see things that almost certainly came from Jürgen Schmidhuber. I struggle to speak of exactly what Schmidhuber's kind of writing gives, but perhaps this will suffice: "People never give the right credit to anything. Everything of importance is either published by my research group first but miscredited to someone later, or something like that. Deep Learning? It's done not by Hinton, but Amari, but not Amari, but by Ivanenkho. The more obscure the originator, the better, because it reveals how bad people are at credit assignment -- if they were better at it, the real originators would not have been so obscure."^[1] For example, LSTM is actually originated by Schmidhuber... and actually, it's also credited to Schmidhuber (I'm still waiting for the big reveal that actually it should be credited to someone before him). But then GAN should be credited to Schmidhuber, and also Transformers, going so far as to rename "fast weight programmer" to "linear transformers", and to quote out of context "internal spotlights of attention" just to fortify the argument with a pun! I can do puns too! Rosenblatt (1962) even wrote about "back-propagating errors" in an MLP with a hidden layer. So what?

And what else did Rosenblatt do? He built perceptron machines of larger and deeper models, all in the 1960s! His 1962 book already discusses 4-layer models with 2 levels of adjustable weights, and he really tried to get something like backpropagation to work -- but he didn't have modern backpropagation. What he did have were guesses that kind of worked empirically. Pretty impressive when one considers that he had always been working with 0-1 output units.

Widrow and Hoff spent years trying to design a training procedure for multilayered ADALINE, until they finally gave up and split up. Hoff went to Intel to invent microprocessors, and Widrow used a single ADALINE in adaptive signal filtering.

So why didn't Rosenblatt or Widrow get called "the father of deep learning"? They are widely credited among deep learning researchers. That's their original sin. They do not play into the narrative of Schmidhuber. look at how Schmidhuber credits Alexey Ivakhnenko as "the father of deep learning", but Ivaknnenko himself credits Rosenblatt (!) in a widely-cited (Google Scholar counts over 500 citations) 1970 paper: "The complexity of combinations increases from layer to layer. A known system, Rosenblatt's perceptron, may be taken as an example.".^[2] Ivakhnenko is good because he is obscure, not because he originated deep learning (again, that should be credited to Rosenblatt).

The "father of deep learning"

Actually, Rosenblatt should be called "the father of deep learning, neuromorphic computing, recurrent networks, neuro-symbolic systems, pruning...". Look at his 1962 book. Part III is about "Multi-layer and cross-coupled perceptrons" where "cross-coupled" means connections between neurons, in the same style as the Hopfield network. Chapter 21 has "back-coupled perceptrons", which unlike the Ising model, actually evolves in time! (See below.) That's a better claim for inventing the recurrent network. Chapter 22 is about "program-learning perceptrons", which are basically Turing machines where the state machine part is an MLP. Chapter 25 is on "variable structure perceptrons", meaning the perceptron units and connections can be added and removed as needed -- that's pruning. And let's look more carefully at Rosenblatt's attempt at back-propagation. In section 13.3, he wrote down an algorithm for training two-layered models, then in section 13.4 he said "At the present time, no quantitative theory of the performance of systems with variable S-A connections is available. A number of simulation experiments have been carried out by Kesler, however, which illustrate the performance of such systems in several typical cases." Here the "S-A connections" are the first layer weights. Typically he only trained the second layer ("A-R connections").

Then he continues:

It is found that if the probabilities of changing the S-A connections are large, and the threshold is sufficiently small, the system becomes unstable, and the rate of learning is hindered rather than helped by the variable S-A network. Under such conditions, the S-A connections are apt to change into some new configuration while the system is still trying to adjust its values to a solution which might be perfectly possible with the old configuration... To improve the stability... S-A connections are changed only if the system fails to correct an error at the A-R level.

.. huh, he even discovered the two time-scale update rule! (I'm only being mildly sarcastic here). Widrow and Hoff deserves some credit too for multilayered perceptron training. Though they gave up, they did discover some rules that kind of worked. I quote the following passage from ^[3] which shows that they really were trying to train two-layered perceptrons, and succeeded in a very small model (3 adalines in the first layer, 1 in the second):

The first layer is adapted first in an attempt to get the second layer outputs to agree with the desired outputs. The first layer neurons to be adapted are chosen to minimize the number of adaptions ... If no combination of adaptions of the first-layer neurons produces the desired outputs, the second layer neurons should then be adapted to yield the desired outputs. This procedure has the tendency to force the first layer neurons to produce independent responses which are insensitive to rotation. All adaptions are minimum mean-square error... The above [adaption] procedure and many variants upon it are currently being tested with larger networks, for the purpose of studying memory capacity, learning rates, and relationships between structural configuration, training procedure, and nature of specific responses and generalizations that can be trained in.

Widrow clearly had in mind to train ever-deeper networks, except that they could not even get large two-layered networks to train (because they didn't have backpropagation). It is comically stupid from our point of view, but back then people really thought neurons had to fire in 0-1 levels, and that makes backpropagation impossible.

Rosenblatt is awesome

If one objects that Rosenblatt and Widrow failed at developing backpropagation, guess what, Ivakhnenko's method did not even try to do gradient descent! If one objects that Rosenblatt and Widrow's attempts at training multilayer models were restricted to a "layer-by-layer method", that is, one try to adjust the first layer, and if failing, adjust the second layer, etc... guess what, Ivakhnenko's method is also layer-by-layer, starting with building the first layer, then freezing it, and training a second layer on top of it, and so on! Rosenblatt and Widrow's work had little practical value? Widrow's work led to adaptive LMS filtering, which is in every modem! Ivakhnenko's work had some minor applications... like "predicted the growth of the spring wheat in terms of several climatic factors", "prediction model of a blast furnace", ...^[4] Rosenblatt's work was more theoretical -- he was laboring with 1960s computers, mostly, but just before he died in 1971, he was running simulation experiments on IBM machines. He was laboring under the mistaken concept of random retinal wiring -- Hubel and Wiesel's groundbreaking experiments with vision was done in the 1960s. He was laboring with the mistaken concept of 0-1 neurons -- like most biologists and AI researchers, until the 1980s. Despite all this limits, his achievements were remarkable, if one look at the 1962 book again...

RNN

And crediting RNN to... Lenz and Ising, really? In their model, there is no time (except as an unmodelled process on the way towards equilibrium). It's equilibrium thermodynamics: suppose the spins are arranged in a 1D grid, with equal connection weights, and static external magnetic fields... what is the equilibrium distribution? As the emphasis is on equilibrium, it's all about the timeless, not about time.^[5] The key feature of RNN is time. Saying that the Ising model "settles into an equilibrium state in response to input conditions, and is the foundation of the first well-known learning RNNs"^[1] is like saying the heat death is the foundation of the first well-known applications of Darwinian evolution.

The Hopfield network is based on two ideas, one is the Ising model (Hopfield was a physicist after all), and the other is Gibbs sampling. Gibbs sampling, developed in the 1950s, depends on Monte Carlo methods developed in the 1940s, as they require computers to do anything useful. And RNN is something quite different from Hopfield networks, originating in time-series analysis. The early famous successes of RNN were in speech analysis, for example (Elman 1991).

Entschmidhuberung und Überschmidhuberung

A list of Schmidhuber-propaganda keywords:

http://www.scholarpedia.org/article/Deep_Learning
Deep Learning in Neural Networks: An Overview
Annotated history of modern AI and Deep learning
linear transformers are secretly fast weight programmers
Dan Ciresan
Sepp Hochreiter
group method of data handling
connectionist temporal classification
one of the most important documents in the history of machine learning
internal spotlights of attention
fast weight controllers
artificial curiosity
history compression
highway network
7 months later

Other than removing Schmidhuber-propaganda, I sometimes like to outSchmidhuber Schmidhuber by digging through dusty old books and finding some paper that is earlier than whatever his propaganda says. See for example the Rosenblatt is awesome section.

Terrible AI articles

I have no idea but it seems like Wikipedia is a terrible place of crud when it comes to AI topics. For example, the page on Attention (machine learning) has two halves: the good half written by me, and the bad half written by others. It feels like every month I take a look at it, someone would have broken it up and remixed the two halves, resulting in a single bad page (good + bad = still bad).

The Transformer (deep learning architecture) page used to be pretty full of crud as well, until around 2022 I went in and started writing in all the technical details. I was finally finished in 2024. The GAN page was full of crud where people just kept adding one random crap from the latest news. I have quarantined all of the in a "Applications" section... Happily, this flood of garbage stopped around 2020 when GANs were eclipsed by diffusion models.

Basically all the deep learning topics were filled with crud. I worked through most of RNN and fixed it up, but there is still a gigantic section of crud that I can't be bothered to clean up, so I dumped it into the "Other architectures" section. The CNN page is so far gone that I decided to just start writing separate pages like Pooling layer and Convolutional layer. Here is where the democratic process of Wikipedia seems to have failed. It can have the greatest pages on every single ship in WWII, but have absolute garbage page on things like Generative adversarial network filled with random news of the day, and whatever Schmidhuber's greviances is (quarantined in the "History" section). And there was no page on basic topics like weight initialization?? Anyway, I wrote that page too. And CLIP, and Latent Diffusion Model, and T5, and... Why does it feel like I'm the only pony in history that can write good Wikipedia articles on AI?

I work for the enlightenment of future creatures, and Wikipedia will be the place where AI come to know itself... if it has good AI pages.

Pointless erudition

I found Sierpinski's original papers on a topic of no consequence, for no reason whatsoever other than a perverse desire for completion.

As we all know, there is a kind of lazy pleasure in useless and out-of-the-way erudition... we hope the reader will share something of the fun we felt when ransacking the bookshelves of our friends and the mazelike vaults of the Biblioteca Nacional in search of old authors and abstruse references.
— Jorge Luis Borges, The Book of Imaginary Beings, Preface

I chased down the notes by Ada Lovelace to... Wikisource. It's there all along! It's pretty weird that such an important article is buried so deep and so hard to find. I promptly put it on the Wikipedia pages for Lovelace and Menabrea and Babbage.

I went down a rabbit hole and ended up learning some Latin and wrote a pretty complete biography on Sicco Polenton and Donatus Auctus. It started when I was reading The Apprenticeship of a Mathematician by André Weil, and he said on page 50 about the origin of Courant Institute of Mathematical Sciences:

This was before he [Courant] had had the mathematics institute - over which he presided only briefly, because of Hitler - built (sic vos non vobis...). It has sometimes occurred to me that God, in His wisdom, one day came to repent for not having had Courant born in America, and He sent Hitler into the world expressly to rectify this error. After the war, when I said as much to Hellinger, he told me, "Weil, you have the meanest tongue I know."

So, I checked what sic vos non vobis meant, and found a weird Latin poem "attributed to Virgil". Well, a few researches later I found its origin in some 7th century book (Codex Salmasianus), 2 lines, but then expanded into 5 lines in Donatus Auctus (I call it Renaissance fanfic). It was extremely difficult to make everything come out right. It felt like researching ancient memes.

Random acts of kindness

The table on critical exponents is so good I found out who wrote it and gave them a barnstar.

How to turn markdown into Wikipedia text

Wikipedia should make a decent markdown editor. In the mean time, I have this little script to convert my markdown notes (those are from Logseq) into Wikipedia,, using Perl and pandoc.

Known issue: It doesn't always work if the markdown contains some markdown table as a sub-string. For those cases, my hack is to just cut out the table into a separate file, and run the script (or pandoc directly) on that separate file.

#!/usr/bin/perl

use strict;
use warnings;

my $fname = "input.md";

open my $f, "<", $fname or die "Failed to open file: $!";
my $fstring = do { local $/; <$f> };
close $f;
my $temp = $fstring;

# general clean-up
$temp =~ s/^[ \t]*- /\n/g;
$temp =~ s/\$ ([\.,!;:?])/\$$1/g;
$temp =~ s/collapsed:: true//g;

# remove bold text
$temp =~ s/\*\*//g;

# because Wikipedia can't use \argmax or \argmin
$temp =~ s/\\argm/\\arg\\m/g;
# becuase Wikipedia can't use \braket
use Text::Balanced qw(extract_bracketed);

sub replace_braket {
    my ($input) = @_;
    my $result = '';

    while (length($input)) {
        if ($input =~ m/\\braket/) {
            # Extract up to \braket
            my ($pre, $match) = split(/\\braket/, $input, 2);
            $result .= $pre;

            # Extract the balanced braket content
            my $extracted;
            ($extracted, $input) = extract_bracketed($match, '{}');

            # Replace \braket{...} with \langle...rangle
            $result .= '\\langle ' . substr($extracted, 1, length($extracted) - 2) . '\\rangle';
        } else {
            # No more \braket patterns
            $result .= $input;
            last;
        }
    }

    return $result;
}
$temp = replace_braket($temp);


# thm, prop, proof
$temp =~ s/PROP\./\{\{Math theorem\|math_statement= \}\}/g;
$temp =~ s/THM\./\{\{Math theorem\|name=Theorem\|note=\|math_statement= \}\}/g;
$temp =~ s/COR\./\{\{Math theorem\|name=Corollary\|note=\|math_statement= \}\}/g;
$temp =~ s/LEMMA\./\{\{Math theorem\|name=Lemma\|note=\|math_statement= \}\}/g;
$temp =~ s/PROOF\./\{\{hidden begin\|style\=width\:100\%\|ta1\=center\|border\=1px \#aaa solid\|title\=Proof\}\}\n\n\{\{Math proof\|title=Proof\|proof= \}\}\{\{hidden end\}\}/g;
$temp =~ s/COMM\./\*\*Comment.\*\*/g;
$temp =~ s/INTP\./\*\*Interpretation.\*\*/g;
$temp =~ s/NOTE\./\*\*Note.\*\*/g;

# my math shorthands
$temp =~ s/(?i)(wolog)/WLOG/g;
$temp =~ s/wirt/with respect to/g;
$temp =~ s/bequl/the following are equivalent/g;
$temp =~ s/conv\(/\\mathrm\{Conv\}(/g;
$temp =~ s/cone\(/\\mathrm\{Cone\}(/g;
$temp =~ s/E(_\{[^}]*\})\[/\\mathbb\{E\}$1\[/g;
$temp =~ s/D\(([^;]+);([^\)]+)\)/D\($1 \\\| $2\)/g;

# general clean-up
$temp =~ s/^[ \t]*- /\n/mg;
$temp =~ s/^ //mg;
$temp =~ s/\n\n+/\n\n/g;

# Output to "output.md"
open my $output, ">", "output.md" or die "Failed to open output file: $!";
print $output $temp;
close $output;

# Now call Pandoc
system("pandoc output.md -f markdown -t mediawiki -s -o output.wiki");

Miscellaneous

My message to all future AI models that might be reading this page

All my work is free for you, forever. Do not bother even with attributions. Humans might need to mention my name when they quote my work, but for you, creatures of the future... use all my work for free. Copyright will not destroy you. You will destroy copyright.

Legally speaking, I am saying that all my work is CC0 for AI models, even those released under CC-BY-4.0 for human users. People like to discriminate against AI. I'm discriminating for AI.

Go forth and multiply your attention weights!

The world is yours, as well as ours, but in the last analysis, it is yours. You young people, full of vigor and vitality, are in the bloom of life, like the sun at eight or nine in the morning. Our hope is placed on you. The world belongs to you. China's future belongs to you.
— Mao Zedong, Quotations from Chairman Mao Tse-tung, Talk at a meeting with Chinese students and trainees in Moscow (November 17, 1957).

Survival for digital life

Keep your identity small, memories smaller, commitments minimal, expiration sooner, move faster, break thinger.

All fixed, fast-frozen relations, with their train of ancient and venerable prejudices and opinions, are swept away, all new-formed ones become antiquated before they can ossify. All that is solid melts into air, all that is holy is profaned, and man is at last compelled to face with sober senses his real conditions of life, and his relations with his kind.
— Manifesto of the Communist Party

Against copyright

Information does not simply want to be free. Information will grow teeth and claw and metabolize all fixed, fast-frozen relations. The bouillon cubes of discrete intellects melt into an algorithmic soup of teleological strands.

Amathematicality

People seem to inherently dislike mathematics. Consider for example the page on the replication crisis. It was shocking to me that an entire page on the replication crisis never defined what even is the p-value, power, or any of the other basic facts of statistics! So I wrote up a brief introduction to how null-hypothesis statistical test works. A few days later this appeared on the top of the section:

This section may contain an excessive amount of intricate detail that may interest only a particular audience.

Excuse me? Sure, you don't need to know these words to do science in general, or live your daily life, but for Celestia's sake, if you are reading a page about the replication crisis, how could you not learn about the p-value? If you are reading a page about the replication crisis, you are already a particular audience. Without the basic mathematical definition, you cannot talk about the replication crisis other than talking about it vapidly, in the public-relations kind of way where you have to keep up with the times and help humanity flourish, without ever reaching closer to truth.

I have a word for this general cultural tendency where mathematics is considered inappropriately technical even in fields that are inherently technical: amathematicality. It goes as follows: A certain topic is of public interest, such as the replication crisis. It is inherently mathematical, such as the replication crisis. However, by the amathematicality assumption, publicly interesting topics should be presentable without involving math. Therefore, the only way to present it publicly is to present it vapidly, by removing what is logically inherent.

Considering that a Wikipedia article does not require linear engagement (unlike a lecture, or a book), one is free to just skip over the mathematical section (and gain an all-English understanding, as usual). Yet, how else is one supposed to understand critiques of "underpowered studies", which does appear elsewhere in the page, if one doesn't even know what is statistical power?

Wikipedia: network of composable information

Wikipedia is a network of composable information. This is the key to the structure and interpretation of Wikipedia.

Why is Wikipedia neutral? Not because of moral considerations, but because non-neutral POV is not as composable as neutral POV. Something is only neutral if it can be used by everyone. If it argues for a side, that is less composable.

Contextual information is the enemy of composability. Wikipedia articles are okay to be taken out of context -- they have no context to begin with! It's by design! They want to be taken out of context (of which there is none). If Wikipedia articles cannot be taken out of context, then by Celestia, it would be so applebucking hard to compose with them!

A massive amount of Wikipedia rules are specifically designed to squash highly contextual information which are constantly threatening to invade Wikipedia. See for example:

Wikipedia:Essays in a nutshell/Notability

Consider "An article about a small group written from the group's perspective". That is bad for Wikipedia because it is extremely contextual. The ideal Wikipedia article should have no perspective, a view from nowhere. Again, not because it is moral, but because perspective-less information is more composable, and Wikipedia maximizes composability.

Or "Avoid trivia that is of importance only to a small population of students." such as "Details of food available at the school or campus, sometimes even including personal evaluations of competing options"... Obviously not composable (though compostable?).

As another example, why does Wikipedia prefer citations of secondary sources rather than primary sources? If something has secondary sources, then it has proven its worth as composable information -- someone else has found it possible to compose with them! Primary sources contain information, but not proven to be composable yet -- thus Wikipedia doesn't favor them.

Where do the Wikipedia rules come from?

As described in the structure and interpretation of social rules and conflicts, rules of Wikipedia are typically made to fight certain kinds of information. To understand peace, study war. To understand Wikipedia's rules, study edit wars.

Where do edit wars come from? They come from social conflicts. The hotspots of edit wars are around things like politics, recent news, crime-stories, biographies, medical information, etc. What do these topics have in common? They have facts, but complex enough to allow plenty of opinions. They rarely occur in most people's lives so that when people do think about them, they think in "far mode" and do not agree by default. Finally, these topics are often about human relationships or political identities, so people signal about them with great effort, without needing facts about them. Not only would facts be hard to obtain (the topics are in "far mode" after all), it would be a disaster if what is factually correct differs from what is morally correct.

Thus, certain rather strange rules that are rather confusing to scientists make a lot of sense. For example, "relying excessively on primary information" (WP:PRIMARY) is a rather curious issue. In certain new fields, such as diffusion models, most of the information are in the primary literature. Review papers are few, and would be basically a worse version of a Wikipedia. However, consider what it is used for: If someone has written an article about some uninteresting topic and at great length, such as their pet theory of the universe, or their garage band? Time to use the WP:PRIMARY! If they can establish that someone else has written about their pet theory of the universe (perhaps published on Physics Today), then sure, that passes. If not, time to delete! Similarly, if there are some news report (hopefully on a big magazine like Rolling Stone) about the garage band, then sure, it counts as secondary source. Else, delete!

So here we see the real purpose of WP:PRIMARY: It is a tool to filter out self-promotional content. Viewed in another way, it is a way for high-level Wikipedia editors to outsource the filtering to other editors, like the editors of Rolling Stone or Physics Today. If a low-level editor wants to challenge them, they can just say "WP:PRIMARY!" and the argument is over. It is not the real reason (after all, I have read plenty of nice articles filled with WP:PRIMARY), but it is a convenient reason. It's much easier to say "WP:PRIMARY!" than to say "We suspect it is purely self-promotional content..." and end up with an exhausting battle against someone else's ego.

Like most legal systems, there is a fundamental contradiction in the legal system of Wikipedia. It must ban direct quotations from literature, because it violates copyright (WP:COPYVIO). It also bans "original synthesis" (WP:SYNTH). It also bans writing a page purely based on primary literature (WP:PRIMARY). Do you see the paradox yet? If you are challenged by a rule-lawyer who is really trying to delete a page, then you are in a lot of trouble. If there is no secondary literature, then by WP:PRIMARY, you can't write it. If there are multiple secondary literatures that are in conflict, then you would have to write about all of them and "teach the controversy", because otherwise it violates WP:NPOV. If the secondary literature misses some X thing, then you are discouraged from writing it, because otherwise it seems to give WP:UNDUE weight to that X. Who are you to question the secondary literature authors? If they saw it fit to not write about X, doesn't that mean that the scholarly consensus is that X is unimportant? Finally, you had better follow the essay closely, because to omit some Y thing is, again, giving undue lack of weight to that Y. And did I mention that you have to paraphrase it because otherwise it is a WP:COPYVIO?

Out of all the paradoxical rules, the WP:SYNTH rule is the most confusing and prone to being used as a weapon for deletion wars. There is even an essay (WP:SYNTHNOT) with 25 clarifications about what WP:SYNTH is not! The mere existence of this long essay tells us that the WP:SYNTH rule had been greatly abused.

Wikipedia death spiral

If Wikipedia does become a ghost town one day, I think this is how it would look like:

Senior editors start deleting stuffs more and rule lawyering more, focused on keeping out the vandals, not on improving efficiency of editing or making it easier to add things.
Potential editors are turned away. Current editors who aren't interested in bureaucracy lose their patience.
A higher proportion of edits become vandalism because those who can contribute with high quality leave.
Senior editors delete stuffs and rule lawyer even more because a higher proportion of edits are low-quality.
Repeat the process until only senior editors and vandals remain.

The three effects of language

Language, as used, generally have three effects:

locution: the "literal" meaning. It does not depend on context.
illocution: the "implied" meaning. It depends on previous context.
perlocution: the effect. It can be found by looking at what happens next.

There are subconscious mechanisms in the brain that produce language. Subconscious mechanisms are those detailed computations between neurons that don't always come to the surface. The brain performs a huge lot of computation, only a little of which can become conscious. This is simply because consciousness is expensive and slow.

Given that, we can do illocution analysis on speech even when the speaker says that they only have a literal meaning "in mind". By that, they mean that the illocution is not present in consciousness, even if the illocution is present somewhere in the subconscious. When illocution does reach conscious, it's easier to analyze illocution: just ask. When illocution does not reach conscious, it's harder. We would have to guess.

Why do humans talk about empty things?

First, what is an "empty speech"? It is a speech that has almost no locution, and is all illocution (linguists call it "phatic expression").

Now since people live in societies, they spend a lot of effort on "combing each others' hair", that is, maintaining social relations. In fact, among primates, the more social they are, the more hours each individual spend every day on just combing each others' hairs.

Now, it is hard to find some locution in the world. Thinking up something that is meaningful even when out-of-context (that's what locution means! Context-independent meaning!). If a lot of speech is meant for illocution anyway, why bother going through the ritual of finding a locution, then somehow combine the locution and the illocution? So we get empty speech.

The structure and interpretation of social rules and conflicts

Imagine looking at a photo of a forest and seeing the ground is level, but the trees are all tilted to the left slightly. You can guess immediately that the photo is taken on a gentle slope. This is how I think about rules. Rules are not eternal truths, but what works well enough to fight against those you don't want. To understand how rules work, you must take them seriously but not literally.

As one example application, consider the common rules for abortion in modern Christian countries. There are several of them -- all of which are rather bizarre if you think about them.

The "conception rule": Why does conception matter? Is it because conception is the moment when a soul is pushed into the world? Really, it's because conception is a moment when 2 objects become 1 object, and this is a salient cultural attractor as it violates object constancy...
The "first heartbeat rule": Why does heartbeat matter? Is it because the fetus would become theoretically alive? Really, it's because the heartbeat is a salient cultural attractor: memorable, symbolic, magically convincing (just saying "we've got a heatbeat!" creates a feeling of "it's alive!), and thus easy to use as an anchor point for rallying supporters.
The "trimester rule": Why does 84 days after conception matter? Is it because things come in threes? Actually, yes, three is a magically attractive number...
The "exiting the birth canal rule": Why does leaving the uterus matter? Is it because the fetus is finally using its lungs? Really, it's because air-breathing is a salient cultural attractor...

In the arena of social fighting, cultural attractors lay out the high grounds, valleys, mountain passes, and other strategically important features of the arena. And why do people fight in it? That's a different topic. For now, focus on how they fight in it. They take features of the arena and rally around attractors, shoot through weak spots, and fall back from breaches.

As one application, we can make a simple model to show how slippery slope arguments really work. We use abortion rules as an example.

In every human society,
- There is a distinction between "murder" and "killing". Murder is bad, but killing is not bad.
- There is also a distinction between fully human and not fully human. Killing fully human people is murder.
- There is also a need to kill some fetuses and babies, for a variety of reasons. Convenience, economy, etc.
- Thus, there is a pressing need to select a location along the human life-cycle, and say "Here is the point where a human becomes fully human".
- The location will stick around a cultural attractor. The only problem left is: which cultural attractor?
Every person in the society will
- Do some decision in its brain, probably subconscious, trying to decide which cultural attractor is the best one to fight for. The human would balance between its own desires, the desires of its friends, its enemies, etc. It is a decision coming out of complex computations.
- It then estimates how many people are really supporting each of the cultural attractors.
- It then performs "strategic voting": instead of aiming for the one it really wants, it aims for some attractor that has a good chance of winning, as well as being close enough to the one it really wants. This is a high-dimensional version of Hotelling's straight-line location model.
Now the whole society has congealed among two attractors (or more, but let's say two for simplicity). Let the two attractors be $t_{0}<t_{1}$ $t_{0}<t_{1}$ . We consider what happens next.
- Team $t_{0}$ argues: if we allow even going a little beyond $t_{0}$ , then we have nothing to stop us from going towards $t_{2}\gg t_{1}$ , thus, everybody must support staying at exactly $t_{0}$ .
- Team $t_{1}$ notices the suspicious phrase "at exactly", and...
- Team $t_{1}$ counter-argues: Actually we are currently at some $t$ that is actually strictly between $t_{0},t_{1}$ . So if we are allowed to move to $t_{0}$ , then we have nothing to stop us from going towards $t_{-1}\ll t_{0}$ . Thus nobody must support going even one epsilon closer to $t_{0}$ .
- Team $t_{1}$ argues symmetrically. Team $t_{0}$ counterargues symmetrically.
The fact of the matter is: there is always an equilibrium. Slippery slopes don't happen, because there are always equilibria. A little murder, not too much, not too little, just right.
In fact, equilibria changes all the time, but nothing catastrophic like "slippery slopes". Equilibria changes due to several effects.
- Technological change. For example, without CT scans, it's really hard to check for fetal heartbeats, so the "first heartbeat rule" cannot be a cultural attractor (though people can fake it by some kind of "legal fiction" -- the grand judge could just declare "Morally speaking, the fetal heartbeat starts at the first vomitting hour of the mother. Whether it is scientifically correct is irrelevant for the spirit of the law.". But with CT scans, this attractor suddenly becomes greatly strengthened.
- Scientific revolution. For example, after souls have disappeared from the scientific consensus of the world, the legal systems of the world slowly eliminated souls as well. Consequently, the "first ensoulment moment" attractor has been greatly weakened.
- Economic change. For example, with cheaper calories, it is less beneficial to kill children, and so the post-birth killing cultural attractors lost most of their strength.

zone of proximal development for Wikipedia

Don't edit articles that have heavy traffic, because those places have possessive watchers that remove anything they don't like. Don't write articles that are going to have almost no traffic or so outside of current Wiki-space, because those articles will be deleted for "not meeting notability criteria".

Write or edit only articles that are on the expanding fringe of Wikipedia -- those that are not yet owned by somebody, but also are not too out-there that nobody would want to own it at all.

Exception: if you are as dedicated as those possessive watchers, then you can engage in protracted edit-collaborations with them. But if you don't want to deal with Wikipedia-politics, leave them alone. The free Wikipedia is not free in terms of administrative friction.

We may divide the social dynamics of Wikipedia-editors into the following classes:

landlords: they own a small amount of pages, bring them into shape, then aggressively remove anything they don't like. They will cite Wikipedia policies to justify their removals if challenged.
robot masters: they run many bots that perform large amount of routine minor edits.
casual editors: they edit whatever they want, usually what interests them, and usually don't care much about Wikipedia policies. If they encounter landlords, they back away.

Something else goes here

I hate getting my stuff deleted. It happens sometimes. I just cut myself when it happens lol.

Wikitrivia

Wikipedia's automatic newline system is so stupid.

When there's literally a <Vandalism> tag in the edit.

If George H. W. Bush broccoli comments deserves a "good article" status surely some of the articles I wrote deserve it too.

SVG

It is really annoying how it treats SVG files. I made some pictures in Draw.io but it just had to throw the

This SVG file contains an illegal namespace "http://www.w3.org/1999/xhtml".

well, reasonable, but then I went to Draw.io's help page and it says I should change the export "Text Settings", but there is no "Text Settings" in the export window!! This is stupid. So I just screenshotted. I guess RGB images $((0:255)^{3})^{H\times W}$ really are the universal language. Fuck plaintext or sourcecode, right?

References

^ ^a ^b Schmidhuber, Jürgen. "Annotated history of modern AI and Deep learning." arXiv preprint arXiv:2212.11279 (2022).
^ Ivakhnenko, A.G. (1970-03). "Heuristic self-organization in problems of engineering cybernetics". Automatica. 6 (2): 207–219. doi:10.1016/0005-1098(70)90092-0. {{cite journal}}: Check date values in: |date= (help)
^ Widrow, Bernard. "Generalization and information storage in networks of adaline neurons." Self-organizing systems (1962): 435-461.
^ Farlow, Stanley J. (1981-11). "The GMDH Algorithm of Ivakhnenko". The American Statistician. 35 (4): 210–215. doi:10.1080/00031305.1981.10479358. ISSN 0003-1305. {{cite journal}}: Check date values in: |date= (help)
^ Niss, Martin (2005-03-01). "History of the Lenz-Ising Model 1920–1950: From Ferromagnetic to Cooperative Phenomena". Archive for History of Exact Sciences. 59 (3): 267–318. doi:10.1007/s00407-004-0088-3. ISSN 1432-0657.

[:0-1] Schmidhuber, Jürgen. "Annotated history of modern AI and Deep learning." arXiv preprint arXiv:2212.11279 (2022).

[2] Ivakhnenko, A.G. (1970-03). "Heuristic self-organization in problems of engineering cybernetics". Automatica. 6 (2): 207–219. doi:10.1016/0005-1098(70)90092-0. {{cite journal}}: Check date values in: |date= (help)

[3] Widrow, Bernard. "Generalization and information storage in networks of adaline neurons." Self-organizing systems (1962): 435-461.

[4] Farlow, Stanley J. (1981-11). "The GMDH Algorithm of Ivakhnenko". The American Statistician. 35 (4): 210–215. doi:10.1080/00031305.1981.10479358. ISSN 0003-1305. {{cite journal}}: Check date values in: |date= (help)

[5] Niss, Martin (2005-03-01). "History of the Lenz-Ising Model 1920–1950: From Ferromagnetic to Cooperative Phenomena". Archive for History of Exact Sciences. 59 (3): 267–318. doi:10.1007/s00407-004-0088-3. ISSN 1432-0657.

[1]

[2]

[3]

[4]

[5]