It’s not always easy to distinguish between existentialism and a bad mood.

  • 1 Post
  • 39 Comments
Joined 3 years ago
Cake day: July 2nd, 2023

  • I like how even by ACX standards scoot’s posts on AI are pure brain damage

    One level lower down, your brain was shaped by next-sense-datum prediction - partly you learned how to do addition because only the mechanism of addition correctly predicted the next word out of your teacher’s mouth when she said “three plus three is . . . ” (it’s more complicated than this, sorry, but this oversimplification is basically true). But you don’t feel like you’re predicting anything when you’re doing a math problem. You’re just doing good, normal mathematical steps, like reciting “P.E.M.D.A.S.” to yourself and carrying the one.

    The most compelling analogy: this is like expecting humans to be “just survival-and-reproduction machines” because survival and reproduction were the optimization criteria in our evolutionary history. […] This simple analogy is slightly off, because it’s confusing two optimization levels: the outer optimization level (in humans, evolution optimizing for reproduction; in AIs, companies optimizing for profit) with the inner optimization level (in humans, next-sense-datum prediction; in AIs, next-token prediction). But the stochastic parrot people probably haven’t gotten to the point where they learn that humans are next sense-datum predictors, so the evolution/reproduction one above might make a better didactic tool.

    He also threatens an Anti-Stochastic-Parrot FAQ.

    Here’s hoping that if this happens, Bender et al enthusiastically point out that it’s coming from a guy whose long-term master plan is to fight evil AI with eugenics. Or, if they’re feeling less charitable, that he uses the threat of evil AI to make eugenics great again.


  • That was a good read.

    Corey doc wrote:

    It’s not “unethical” to scrape the web in order to create and analyze data-sets. That’s just “a search engine”

    Conflating what LLMs do, and what goes into LLM web scraping, with “a search engine” is messed up. The article he links about scraping is mostly about how badly copyright works, and how analysing trade-secret-walled data can benefit both consumers and science while occasionally being bad for citizen privacy. Which, you’ll recognize, is mostly irrelevant to the concerns people actually have about LLM training data providers ddosing the fuck out of everything, and all the rest of the stuff tante does a good job of explaining.

    Corey also provides this anecdote:

    As a group of human-rights defending forensic statisticians, HRDAG has always relied on cutting edge mathematics in its analysis. With its Colombia project, HRDAG used a large language model to assign probabilities for responsibility for each killing documented in the databases it analyzed.

    That is, HRDAG was able to rigorously and legibly say, “This killing has an X% probability of having been carried out by a right-wing militia, a Y% probability of having been carried out by the FARC, and a Z% probability of being unrelated to the civil war.”

    The use of large language models — produced from vast corpuses of scraped data — to produce accurate, thorough and comprehensible accounts of the hidden crimes that accompany war and conflict is still in its infancy. But already, these techniques are changing the way we hold criminals to account and bring justice to their victims.

    Scraping to make large language models is good, actually.

    what the actual shit

    edit: I mean, he tried transformer-powered voice-to-text and liked it, and now he’s all in on the “LLMs are a rigorous and accurate tool, actually” bandwagon?

    Also, the web scraping article is from 2023, but CD linked it in the recent Pluralistic post, so I assume his views haven’t changed.


  • The common clay of the new west:

    transcription

    Twitter post from @BenjaminDEKR

    “OpenClaw is interesting, but will also drain your wallet if you aren’t careful. Last night around midnight I loaded my Anthropic API account with $20, then went to bed. When I woke up, my Anthropic balance was $0. Opus was checking “is it daytime yet?” every 30 minutes, paying $0.75 each time to conclude “no, it’s still night.” Doing literally nothing, OpenClaw spent the entire balance. How? The “Heartbeat” cron job, even though literally the only thing I had going was one silly reminder (“remind me tomorrow to get milk”)”

    Continuation of twitter post

    “1. Sent ~120,000 tokens of context to Opus 4.5
    2. Opus read HEARTBEAT.md, thought about reminders
    3. Replied “HEARTBEAT_OK”
    4. Cost: ~$0.75 per heartbeat (cache writes)

    The damage:

    • Overnight = ~25+ heartbeats
    • 25 × $0.75 = ~$18.75 just from heartbeats alone
    • Plus regular conversation = ~$20 total

    The absurdity: Opus was essentially checking “is it daytime yet?” every 30 minutes, paying $0.75 each time to conclude “no, it’s still night.”

    The problem is:

    1. Heartbeat uses Opus (most expensive model) for a trivial check
    2. Sends the entire conversation context (~120k tokens) each time
    3. Runs every 30 minutes regardless of whether anything needs checking

    That’s $750 a month if this runs, to occasionally remind me stuff? Yeah, no. Not great.”
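    For anyone who wants to check the tweet’s arithmetic, here’s a minimal sketch using only the figures quoted above (the $0.75 per heartbeat and the 30-minute cadence are taken as given from the tweet):

```python
# Back-of-envelope sketch of the heartbeat costs quoted in the tweet.
# All inputs come from the tweet itself: ~$0.75 in cache writes per
# heartbeat, fired once every 30 minutes.

cost_per_heartbeat = 0.75       # dollars, per the tweet
heartbeats_per_hour = 60 / 30   # one every 30 minutes

# Overnight: the tweet counts ~25 heartbeats.
overnight_heartbeats = 25
overnight_cost = overnight_heartbeats * cost_per_heartbeat
print(f"overnight: ${overnight_cost:.2f}")  # $18.75, matching the tweet

# A full 30-day month at the same cadence.
monthly_heartbeats = 30 * 24 * heartbeats_per_hour
monthly_cost = monthly_heartbeats * cost_per_heartbeat
print(f"30 days: ${monthly_cost:.2f}")      # $1080.00
```

    Note that a strict every-30-minutes month actually comes out around $1,080, a bit above the tweet’s ~$750 estimate, so the tweet is if anything underselling the damage.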