Chomsky and the Two Cultures of Statistical Learning (2011)
Key topics
A 2011 article by Peter Norvig critiquing Noam Chomsky's views on statistical learning has resurfaced, sparking a lively debate about the renowned linguist's legacy. While some commenters are distracted by Chomsky's association with Jeffrey Epstein, others argue that this doesn't diminish the value of Norvig's original piece. As one commenter astutely points out, discussing a person's work shouldn't be tainted by their personal life or eventual mortality, just as we continue to examine Einstein's contributions long after his passing. The discussion highlights the tension between separating a person's work from their personal actions and the ongoing relevance of their ideas.
Snapshot generated from the HN discussion
Discussion Activity
Very active discussion
- First comment: 5d after posting
- Peak period: 57 comments (Day 6)
- Avg / period: 28.8 comments
- Based on 115 loaded comments
Key moments
- 01 Story posted: Dec 16, 2025 at 4:33 AM EST (18 days ago)
- 02 First comment: Dec 20, 2025 at 9:50 PM EST (5d after posting)
- 03 Peak activity: 57 comments in Day 6, the hottest window of the conversation
- 04 Latest activity: Dec 24, 2025 at 6:05 PM EST (9 days ago)
The article by Peter Norvig is still interesting.
Honestly, I'm surprised Noam is even still alive (aged 97); he is not long for this world.
But his politics center on the moral failings of the West, so I think yes: if he was involved in the sexual exploitation of trafficked children, then this would devalue his criticism of the morality of the Western political system.
Essentially it can be summed up as: any Western action must be rationalized as evil, and any anti-Western action is therefore good. This is also in line with Christian dualism, so the cultural building blocks are already in place.
Then you get apologism for, or downright support of, the Khmer Rouge, Putin, Hezbollah, and Iran.
It's difficult to summarise so many years of writing in a few sentences, but from my own reading, he pointed out:
a) many things done by the US lead to death or destruction;
b) many of these things are justified in the name of a good that doesn't stand up to scrutiny;
c) the US government is often hypocritical;
d) US citizens are heavily propagandized, both on foreign policy and on domestic policy;
e) as a US citizen, it is his duty to try and oppose these actions, and since he's not a citizen of Iran, he isn't in a position to do anything about Iran;
f) a) through d) explain why he is often seen as an apologist, to use your word, for Iran; he tries to explain, from his point of view, why Iran etc. do the things they do;
g) a strong support of freedom of speech and opposition to censorship, including what he regards as private censorship as opposed to merely government censorship.
He of course has very complex rationalizations, but essentially he assumes the opposite of mainstream Western opinion and then builds ideological structures upon that.
That creates a very simplified version of reality in a nice intellectual wrapper.
For example, during the 2003 US invasion of Iraq, Germany and France were opposed to the invasion, leading to "Freedom Fries" as an insult to French opposition to the war. The British public was also opposed to the war, although the Blair government went along with it anyway. Australia had a similar position: public opposition, but the government went along with it anyway. Canada officially refused to enter the Iraq war. Chomsky was also opposed to the Iraq war. Does this mean that France, Germany, Canada and the British and Australian general public are "anti-Western"? Since Chomsky agreed with these countries, does that make him anti-Western or pro-Western? Does it make the US anti-Western, since they proceeded with a war despite formal or popular opposition in many Western countries?
I fear you have a certain definition of "Western" that simply excludes Western opinions that don't fit your understanding.
As to who Chomsky met: well, as part of this Epstein story, Chomsky met with former Israeli prime minister Ehud Barak. In your opinion, does this make him anti-Western? Indeed, prior to his stroke, Chomsky explained that this kind of meeting is why he associated with Epstein: for the contacts.
I suspect Chomsky is just generally interested in understanding an issue and not bothered by how it's seen, seemingly to his detriment in this Epstein story.
Why would it devalue his criticism assuming he was right?
Morality arguments are social and contextual. That 2+2 is 4 won’t change and captures some sort of eternal truth while what is deemed moral is constantly changing over time and differs across different societies and social groupings.
So morality arguments require and appeal to a particular shared sense of right and wrong. If Chomsky was guilty of sexually abusing children, then I do not share his moral foundation and so his appeals to morality arguments do not convince me.
He's also, for better and mostly worse, one of the most prominent political thinkers on the American hard left for the last half century.
There's been a joke going around for a while now that you either know Chomsky for his politics, or for his work in linguistics and discrete mathematics, and are shocked to discover his moonlighting work. I guess we can extend that to a third category of fame, or infamy.
There's also still a lot to his arguments that we are much more sample-efficient and likely have some built-in capacity from genetic endowment rather than everything being strictly learned.
(I don't like Chomsky for other reasons, but having an obituary ain't no reason to disregard someone's thoughts.)
It's innuendo and guilt by association, mainly by his political opponents on both the left and the right, who are taking advantage of his inability to defend himself due to his stroke. I think many people are being _justly maligned_ by their association with Epstein, but in a way that distracts from the wider issue of what exactly it means when so many powerful and prominent people are found in compromising or potentially compromising situations, and what ends that served. It's US kompromat, and the discussion is largely restricted to maligning people without discussing the significance of it.
In terms of Chomsky himself, given his career spanned both linguistics and politics, an honest critique would deal with its disagreements with Chomsky the way Norvig did in this essay, or the way Hitchens did over the Afghan and Iraq wars, rather than saying "he had dinner with Epstein" or "he had dinner with Bannon".
In terms of the Epstein issue, the best criticism I can see is that his association with Epstein, Bannon, etc. makes him a hypocrite, although I personally don't find this convincing. Part of the problem for me here is that his present infirmities make it difficult for him to defend or explain himself, and I find it poor form to kick the man when he's down, mainly by people who just didn't like that Chomsky disagreed with them. Especially when he made a real contribution to the debates, even if one doesn't agree with him.
https://www.stat.berkeley.edu/~aldous/157/Papers/shmueli.pdf
To Explain Or To Predict?
Nice quote
> We note that the practice in applied research of concluding that a model with a higher predictive validity is “truer,” is not a valid inference. This paper shows that a parsimonious but less true model can have a higher predictive validity than a truer but less parsimonious model.
Hagerty+Srinivasan (1991)
They certainly didn't think that a better fit => "truer".
They used the term "truer" to describe a model that more accurately captures the underlying causal structure or "true" relationship between variables in a population.
As for the paper I linked, I still haven't read it closely enough to confirm that the comment below this is a good dismissal.
Unfortunately, studying the behavior of a system doesn't necessarily provide insight into why it behaves that way.
Whether that model provides "insight" (or a "cause"; I still don't know if that's supposed to mean something different) is a deeper question, and e.g. the topic of countless papers trying to make sense of LLM activations. I don't think the answer is obvious, but I found Norvig's discussion to be thoughtful. I'm surprised to see it viewed so negatively here, dismissed with no engagement with his specific arguments and examples.
It isn't much worth engaging with because it is unfortunately quite out of touch with (or just ignores) the core issues, as well as the major advances in causal modeling and causal-modeling theory, i.e. Judea Pearl and do-calculus, structural equation modeling, counterfactuals, etc. [1].
It also, IMO, makes a (highly idiosyncratic) distinction between "statistical" (meaning, trained / fitted to data) and "probabilistic" models, that doesn't really hold up too well.
I.e. probabilistic models in quantum physics are "fit" too in that the values of fundamental constants are determined by experimental data, but these "statistical" models are clearly causal models regardless. Even most quantum physical models can be argued to be causal, just the causality is probabilistic rather than absolute (i.e. A ==> B is fuzzy implication rather than absolute implication).
IMO I don't want to engage much with the arguments because it starts on the wrong foot and begins by making, in my opinion, an incoherent / unsound distinction, while also ignoring (innocently, or deliberately) the actual scientific and philosophical progress already made here.
[1] https://plato.stanford.edu/entries/causal-models/
So in the meantime, Norvig et al. have built statistical models that can do stuff like predicting whether a given sequence of words is a valid English sentence. I can invent hundreds of novel sentences and run their model, checking each time whether their prediction agrees with my human judgement. If it doesn't, then their prediction has been falsified; but these models turned out to be quite accurate. That seems to me like clear evidence of some kind of progress.
You seem unimpressed with that work. So what do you think is better, and what falsifiable predictions has it made? If it doesn't make falsifiable predictions, then what makes you think it has value?
I feel like there's a significant contingent of quasi-scientists that have somehow managed to excuse their work from any objective metric by which to evaluate it. I believe that both Chomsky and Judea Pearl are among them. I don't think every human endeavor needs to make falsifiable predictions; but without that feedback, it's much easier to become untethered from any useful concept of reality.
> You seem unimpressed with that work
I didn't say anything about Norvig's work, I was saying the linked essay is bad. It is correct that Chomsky is wrong, but is a bad essay because it tries to argue against Chomsky with a poorly-developed distinction while ignoring much stronger arguments and concepts that more clearly get at the issues. IMO the essay is also weirdly focused on language and language models, when this is a general issue about causal modeling and scientific and technological progress, and so the narrow focus here also just weakens the whole argument.
Also, Judea Pearl is a philosopher, and do-calculus is just one way to think about and work with causality. Talking about falsifiability here is odd, and sounds almost to me like saying "logic is unfalsifiable" or "modeling the world mathematically is unfalsifiable". If you meant something like "the very concept of causality is incoherent", that would be the more appropriate criticism here, and more arguable.
I feel like Norvig is coming from that standpoint of solving problems well-known to be difficult. This has the benefit that it's relatively easy to reach consensus on what's difficult--you can't claim something's easy if you can't do it, and you can't claim it's hard if someone else can. This makes it harder to waste your life on an internally consistent but useless sidetrack, as you might even agree (?) Chomsky has.
You, Chomsky, and Pearl seem to reject that worldview, instead believing the path to an important truth lies entirely within your and your collaborators' own minds. I believe that's consistent with the ancient philosophers. Such beliefs seem to me halfway to religious faith, accepting external feedback on logical consistency, but rejecting external evidence on the utility of the path. That doesn't make them necessarily bad--lots of people have done things I consider good in service of religions I don't believe in--but it makes them pretty hard to argue with.
I'm not saying LLMs are a particularly good model, just that everything else is worse. This includes Chomsky's formal grammars, which fail to capture the ways humans actually use language per Norvig's many examples. Do you disagree? If so, what model is better and why?
If you believe that some of human cognition is linguistic (even if e.g. inner monologue and spoken language are just the surface of deeper more unconscious processes), then, yes, we might say LLMs can predictively model some aspects of human cognition, but, again, they are certainly not causal models, and they are not predictive models of human cognition generally (as cognition is clearly far, far more than linguistic).
* I avoid calling LLMs "statistical" because they really aren't even that. They are not calibrated, and including a softmax and log-loss in things doesn't magically make your model statistical (especially since other loss functions and simplex mappings, e.g. sparsemax, often work better). LLMs really are more accurately just doing curve/manifold-fitting.
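For the curious, here is a minimal NumPy sketch of the simplex mapping alluded to above: sparsemax (Martins & Astudillo, 2016) next to softmax. The function names are illustrative, not from any particular library. The key difference is that sparsemax can assign exactly zero probability to low-scoring entries, while softmax never can.

```python
import numpy as np

def softmax(z):
    """Standard softmax: every entry gets strictly positive probability."""
    e = np.exp(z - z.max())
    return e / e.sum()

def sparsemax(z):
    """Sparsemax: Euclidean projection of z onto the probability simplex.
    Low-scoring entries get probability exactly 0 (Martins & Astudillo, 2016)."""
    z_sorted = np.sort(z)[::-1]              # scores in descending order
    cumsum = np.cumsum(z_sorted)
    k = np.arange(1, len(z) + 1)
    support = k * z_sorted > cumsum - 1      # prefix of entries kept in the support
    k_z = k[support][-1]                     # size of the support
    tau = (cumsum[support][-1] - 1) / k_z    # threshold subtracted from all scores
    return np.maximum(z - tau, 0.0)

z = np.array([3.0, 1.0, 0.2, -1.0])
print(softmax(z))    # all four probabilities > 0
print(sparsemax(z))  # [1. 0. 0. 0.] -- trailing entries are exactly 0
```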
As I read Norvig's essay, it's about that tradeoff, of whether a simple and comprehensible but inaccurate model shows more promise than a model that's incomprehensible except in statistical terms with the aid of a computer, but far more accurate. I understand there's a large group of people who think Norvig is wrong or incoherent; but when those people have no accomplishments except within the framework they themselves have constructed, what am I supposed to think?
Beyond that, if I have a model that tells me whether a sentence is valid, then I can always try different words until I find one that makes it valid. Any sufficiently good model is thus capable of generation. Chomsky never proposed anything capable of that; but that just means his models were bad, not that he was working on a different task.
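A toy sketch of that "any validity model yields a generator" argument: local search over single-word substitutions against a scoring function. Both `toy_score` and its bigram whitelist are made-up stand-ins for a real validity model, just to make the loop runnable.

```python
import random

def generate(validity_score, vocab, length=6, steps=200, seed=0):
    """Turn a sentence-validity scorer into a crude generator by local
    search: start from random words, try single-word substitutions,
    and keep any change the scorer prefers."""
    rng = random.Random(seed)
    sentence = [rng.choice(vocab) for _ in range(length)]
    best = validity_score(sentence)
    for _ in range(steps):
        i = rng.randrange(length)            # pick a position to mutate
        candidate = sentence.copy()
        candidate[i] = rng.choice(vocab)     # try a different word there
        score = validity_score(candidate)
        if score > best:                     # keep improvements only
            sentence, best = candidate, score
    return " ".join(sentence)

# Stand-in scorer: count adjacent pairs that look like English bigrams.
ALLOWED = {("the", "cat"), ("cat", "sat"), ("sat", "on"),
           ("on", "the"), ("the", "mat")}

def toy_score(words):
    return sum((a, b) in ALLOWED for a, b in zip(words, words[1:]))

vocab = ["the", "cat", "sat", "on", "mat", "dog"]
print(generate(toy_score, vocab))  # drifts toward "the cat sat on the mat"
```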
As to the relationship between signals from biological neurons and ANN activations, I mean something like the paper linked below, whose authors write:
> Thus, even though the goal of contemporary AI is to improve model performance and not necessarily to build models of brain processing, this endeavor appears to be rapidly converging on architectures that might capture key aspects of language processing in the human mind and brain.
https://www.biorxiv.org/content/10.1101/2020.06.26.174482v3....
I emphasize again that I believe these results have been oversold in the popular press, but the idea that an ANN trained on brain output (including written language) might provide insight into the physical, causal structure of the brain is pretty mainstream now.
https://news.ycombinator.com/item?id=46288415
Pearl defines a ladder of causation:
1. Seeing (association)
2. Doing (intervention)
3. Imagining (counterfactuals)
In his view, most ML algorithms are at level 1: they look at data and draw associations. "Agents" have taken some first steps into level 2, doing.
The smartest humans operate mostly at level 3, where they see things, gain experience, and later build up a "strong causal model" of the world, becoming capable of answering "what if" questions.
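A minimal simulation of the gap between rung 1 and rung 2, assuming a toy structural causal model with one confounder; the variable names and probabilities are invented for illustration. Conditioning on X (seeing) and intervening on X (doing) give different answers because intervening severs the confounder's influence on X.

```python
import random

def run(do_x=None, n=200_000, seed=0):
    """Toy structural causal model with a confounder Z:
        Z ~ Bernoulli(0.5);  X := Z unless intervened;
        Y ~ Bernoulli(0.2 + 0.3*X + 0.4*Z)
    Passing do_x severs the Z -> X edge (Pearl's do-operator).
    Returns the empirical P(Y=1) among units with X = 1."""
    rng = random.Random(seed)
    hits, total = 0, 0
    for _ in range(n):
        z = rng.random() < 0.5
        x = z if do_x is None else bool(do_x)
        y = rng.random() < 0.2 + 0.3 * x + 0.4 * z
        if x:
            hits += y
            total += 1
    return hits / total

print("rung 1, seeing:  P(Y=1 | X=1)     ~", round(run(), 3))        # ~0.9 (confounded)
print("rung 2, doing:   P(Y=1 | do(X=1)) ~", round(run(do_x=1), 3))  # ~0.7 (causal)
```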
> But it must be recognized that the notion of "probability of a sentence" is an entirely useless one, under any known interpretation of this term.
He was impressively early to the concept, but I think even those skeptical of the ultimate value of LLMs must agree that his position has aged terribly. That seems to have been a fundamental theoretical failing rather than the computational limits of the time, if he couldn't imagine any framework in which a novel sentence had probability other than zero.
I guess that position hasn't aged worse than his judgment of the Khmer Rouge (or Hugo Chavez, or Epstein, or ...) though. There's a cult of personality around Chomsky that's in no way justified by any scientific, political, or other achievements that I can see.
The question then becomes one of actual novelty versus the learned joint probabilities of internalised sentences/phrases/etc.
Generation or regurgitation? Is there a difference to begin with?
If we define perplexity in the usual way in NLP, then the probability of a sentence approaches zero as the length of the sequence increases, but it does so smoothly and never reaches exactly zero. This makes it useful for sequences of arbitrary length. This latter metric is so obviously better that it seems ridiculous to me to reject all statistical approaches based on the former. That's with the benefit of hindsight for me; but enough of Chomsky's less famous contemporaries did judge correctly that I get that benefit, that LLMs exist, etc.
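To make that concrete, here is a small sketch with an add-one-smoothed bigram model; the corpus and test sentence are invented. A sentence containing a never-seen word pair still gets a finite (nonzero) log-probability, and perplexity gives a length-normalized measure that stays stable as sentences grow.

```python
import math
from collections import Counter

corpus = "the cat sat on the mat . the dog sat on the rug .".split()
vocab = set(corpus) | {"<unk>"}
V = len(vocab)

unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))

def logprob(sentence):
    """Add-one-smoothed bigram log-probability: never exactly -inf,
    even for word pairs absent from the training corpus."""
    words = [w if w in vocab else "<unk>" for w in sentence.split()]
    lp = 0.0
    for a, b in zip(words, words[1:]):
        lp += math.log((bigrams[(a, b)] + 1) / (unigrams[a] + V))
    return lp

def perplexity(sentence):
    n_transitions = len(sentence.split()) - 1
    return math.exp(-logprob(sentence) / n_transitions)

novel = "the dog sat on the cat ."       # contains the unseen bigram "cat ."
print(logprob(novel))     # finite, i.e. probability > 0
print(perplexity(novel))  # per-token measure, comparable across lengths
```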
And again the question being, whether there is a difference at all between the two? Novelty in the human sense is also often a process of chaining and combining existing tools and thought.
There's no point minimizing his intelligence and achievements, though.
His linguistics work (eg: grammars) is still relevant in computer science, and his cynical view of the West has merit in moderation.
The problem is that they're weak models for the languages that humans prefer to use with each other (i.e., natural languages). He seems to have convinced enough academic linguists otherwise to doom most of that field to uselessness for his entire working life, while the useful approach moved to the CS department as NLP.
As to politics, I don't think it's hard to find critics of the West's atrocities with less history of denying or excusing the West's enemies' atrocities. He's certainly not always wrong, but he's a net unfortunate choice of figurehead.
Chomsky already was very active and well-known by 1960.
He pioneered areas in Computer Science, before Computer Science was a formal field, that we still use today.
His political views haven't changed much, but they were beneficial back when America was more naive. They are harmful now only because we suffer from an absurd excess of cynicism. If Nixon had been president in the current environment, he would have served his full term (just imagine "the tapes are a forgery!" or "why would I believe establishment shills like Woodward and Bernstein?")
How would you feel about Chomsky and his influence if we ignored everything past 1990 (two years after Manufacturing Consent)?
I think Chomsky's political views were pretty terrible, especially before 1990. He spoke favorably of the Khmer Rouge. He dismissed "Murder of a Gentle Land", one of the first Western reports of their mass killing, as a "third rate propaganda tract". As the killing became impossible to completely deny, he downplayed its scale. Concern for human rights in distant lands tends to be a left-leaning concept in the West, but Chomsky's influence neutralized that here. This contributed significantly to the West's indifference, and the killing continued. (The Vietnamese communists ultimately stopped it.)
Anyone who thinks Chomsky had good political ideas should read the opinions of Westerners in Cambodia during that time. I'm not saying he didn't have other good ideas; but how many good ideas does it take to offset 1.5-2M deaths?
Today it would not matter in the least if Nixon were understood to have covered up a conspiracy to break into the DNC headquarters. Most of his party would approve of it and the rest would support him anyway so as not to damage "their side".
It's of rather limited use for natural languages.
That's really subtle, because deciding regex universality (i.e. whether a regex accepts every input) is PSPACE-complete. And since NFAs make it efficient to decide whether a regex matches NO inputs, any attempt to combine NFAs with regex complement would trip on a massive landmine.
The complement of a regular language is a regular language, and for any given regular language we can check whether a string is a member of that language in O(length of the string) time.
Yes, depending on how you represent your regular language, the complement operator might not play nicely with that representation. But e.g. it's fairly trivial for finite state machines or when matching via Brzozowski derivatives. See https://en.wikipedia.org/wiki/Brzozowski_derivative
See also https://github.com/google/redgrep
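For flavor, a compact sketch of matching via Brzozowski derivatives, including a complement operator; the class and function names are illustrative, not redgrep's API. Each input character takes one derivative (a production implementation would also simplify terms as it goes to keep them small).

```python
# Regex matching via Brzozowski derivatives, with complement (Not).
class Rx: pass
class Empty(Rx): pass               # matches nothing
class Eps(Rx): pass                 # matches only the empty string
class Chr(Rx):
    def __init__(self, c): self.c = c
class Seq(Rx):
    def __init__(self, a, b): self.a, self.b = a, b
class Alt(Rx):
    def __init__(self, a, b): self.a, self.b = a, b
class Star(Rx):
    def __init__(self, a): self.a = a
class Not(Rx):                      # complement: matches what `a` does not
    def __init__(self, a): self.a = a

def nullable(r):
    """Does r accept the empty string?"""
    if isinstance(r, Eps):  return True
    if isinstance(r, (Empty, Chr)): return False
    if isinstance(r, Seq):  return nullable(r.a) and nullable(r.b)
    if isinstance(r, Alt):  return nullable(r.a) or nullable(r.b)
    if isinstance(r, Star): return True
    if isinstance(r, Not):  return not nullable(r.a)

def deriv(r, c):
    """Brzozowski derivative: the language of suffixes of r after reading c."""
    if isinstance(r, (Empty, Eps)): return Empty()
    if isinstance(r, Chr):  return Eps() if r.c == c else Empty()
    if isinstance(r, Alt):  return Alt(deriv(r.a, c), deriv(r.b, c))
    if isinstance(r, Star): return Seq(deriv(r.a, c), r)
    if isinstance(r, Not):  return Not(deriv(r.a, c))  # complement commutes
    if isinstance(r, Seq):
        d = Seq(deriv(r.a, c), r.b)
        return Alt(d, deriv(r.b, c)) if nullable(r.a) else d

def matches(r, s):
    for ch in s:                    # one derivative per input character
        r = deriv(r, ch)
    return nullable(r)

ab_star = Star(Alt(Chr('a'), Chr('b')))                              # (a|b)*
no_aa = Not(Seq(ab_star, Seq(Chr('a'), Seq(Chr('a'), ab_star))))     # avoids "aa"
print(matches(no_aa, "abab"))  # True
print(matches(no_aa, "baab"))  # False
```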
https://en.wikipedia.org/wiki/Backus%E2%80%93Naur_form
I'm not sure it required Chomsky's work.
In any case, do you see evidence that Chomsky changed his view? The quote from 2011 ("some successes, but a lot of failures") is softer but still quite negative.
In more detail: Chomsky is/was not concerned with the models themselves, but rather with the distinction between statistical modelling in general, and "clean slate" models in particular, on the one hand, and structural models discovered through human insight on the other.
With "clean slate" I mean models that start with as little linguistically informed structure as possible. E.g., Norvig mentions hybrid models: these can start out as classical rule based models, whose probabilities are then learnt. A random neural network would be as clean as possible.
The title should say (2011), otherwise the whole piece is confusing.
1: https://news.ycombinator.com/item?id=2591154
https://hn.algolia.com/?query=Chomsky%20and%20the%20Two%20Cu...
The oldest submission is from 15 years ago, i.e. 2010.
I resubmitted it thinking that, with the success of LLMs, it was worth a revisit from the "how real-world scientific progress works" point of view.
Chomsky is wrong by the standards of his time and is making things worse rather than better.
It was very much the opposite of Chomsky's ideology as well. So it additionally means he's fake, BOTH on his morals and on his politics/activism, from both sides (i.e. both helping a paedophile and helping/entertaining a billionaire).
So it's (yet another) case of an important figure that supposedly stands for something, not just demonstrating he stands for nothing at all, but being a disgusting human being as well.
On the contrary. Chomsky was open about his civil-libertarian principles: If you are convicted, and you complete your court-ordered obligations, you have a clean slate.
Some of his books are deeply insightful even if you decide to draw the opposite conclusion. I wouldn’t say anything would create disgust unless you had a conclusion you wanted supported before reading the book.
Regarding the Epstein thing, bizarre to bring that up when discussing his works, seems like you hate him on a personal level.
Not sure the approach holds.
Could it be this?
> https://www.youtube.com/watch?v=eIzRV4TxHo8
It's crazy how wrong Chomsky was about machine learning. Maybe the real truth is that humans are stochastic parrots who have an underlying probability distribution - and because gradient descent is so good at reproducing probability distributions - LLMs are incredibly good at reproducing language.
The answers to "why" that Chomsky pushes so hard for are very valuable to adult language learners. There are basic syntactic rules to generating broadly correct language. Having these rules discovered and explained in the simplest possible form is irreplaceable by statistical models. Neural networks, much like native speakers can say "well this just sounds right," but adult learners need a mathematical theory of how and why they can generate sentences. Yes, this changes with time and circumstances, but the simple rules and theories are there if we put the effort in to look for them.
There are many languages with a very small corpus of training data. The LLMs fail miserably at communicating in them or explaining things about their grammar, but if we look hard for the underlying theories Chomsky was looking for, we can make huge leaps and bounds in understanding how to use them.
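As a toy illustration of what explicit rules "in the simplest possible form" look like, a handful of context-free productions can already generate broadly correct English word orders. The grammar below is invented for illustration, not taken from any linguistic reference.

```python
import random

# A deliberately tiny context-free grammar: the kind of explicit,
# human-readable rule an adult learner can study, unlike the implicit
# "sounds right" judgment of a native speaker or a neural network.
GRAMMAR = {
    "S":   [["NP", "VP"]],
    "NP":  [["Det", "N"], ["Det", "Adj", "N"]],
    "VP":  [["V", "NP"], ["V"]],
    "Det": [["the"], ["a"]],
    "Adj": [["small"], ["hungry"]],
    "N":   [["cat"], ["student"], ["language"]],
    "V":   [["learns"], ["sees"]],
}

def expand(symbol, rng):
    """Recursively expand a symbol by picking one of its productions;
    anything not in GRAMMAR is a terminal word."""
    if symbol not in GRAMMAR:
        return [symbol]
    rule = rng.choice(GRAMMAR[symbol])
    return [word for part in rule for word in expand(part, rng)]

rng = random.Random(1)
for _ in range(3):
    # prints sentences that are grammatical by construction
    print(" ".join(expand("S", rng)))
```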
I have yet to witness a man so smart who ended up being so profoundly wrong on everything he did in his life.
Both on the linguistics side of things and on his politics.
And to see him at such an advanced age still rejecting what is an absolutely clear and painful proof that all he's done in linguistics was wrong ... how sad.
What a terrible waste of an intellect.