How could an artificial intelligence (as in large-language-model-based generative AI) be better for information access and retrieval than an encyclopedia with a clean classification model and a search engine?

If we add a step of processing – where a genAI “digests” perfectly structured data and tries, however poorly, to regurgitate things it doesn’t understand – aren’t we just adding noise?

I’m talking about specific use cases like “draw me a picture explaining how a pressure regulator works”, or “can you explain to me how to code a recursive pattern-matching algorithm, please”.
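
For concreteness, here is a minimal sketch of the kind of answer the second prompt is asking for, assuming a toy pattern language where ‘?’ matches any single character and ‘*’ matches any run of characters; this is just an illustration of the use case, not a reference implementation:

    # Toy recursive wildcard matcher: '?' matches exactly one character,
    # '*' matches any (possibly empty) run of characters.
    def match(pattern: str, text: str) -> bool:
        if not pattern:
            return not text  # an empty pattern only matches an empty text
        if pattern[0] == '*':
            # '*' matches nothing, or swallows one character and retries
            return match(pattern[1:], text) or (bool(text) and match(pattern, text[1:]))
        if not text:
            return False
        if pattern[0] == '?' or pattern[0] == text[0]:
            return match(pattern[1:], text[1:])
        return False

    print(match("re*ur?ion", "recursion"))   # True
    print(match("re*ur?ion", "regression"))  # False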

I also understand how it can help people who do not want to, or cannot, make the effort to learn an encyclopedia’s classification plan, or how a search engine’s syntax works.

But on a fundamental level, aren’t we just adding an uncontrollable step of noise injection into a decent, time-tested information flow?

  • Feyd@programming.dev · 6 days ago

    But on a fundamental level, aren’t we just adding an uncontrollable step of noise injection into a decent, time-tested information flow?

    Yes.

  • e0qdk@reddthat.com · 6 days ago

    If it actually worked reliably enough, it would be like having a dedicated, knowledgeable, and infinitely patient tutor that you can ask questions and interactively explore a subject with, one who can adapt their explanations specifically to your way of thinking, i.e. it would understand not just the subject matter but also you. That would help facilitate knowledge transfer and could reduce the tedium of trying to make sense of something that, as written, isn’t explained well enough for your current background knowledge, but which you are capable of understanding.

    • Doomsider@lemmy.world · 5 days ago

      Looks like we just found our next head of the Department of Education!

      Now we just gotta tweak Grok a little and our children will be ready for the first lesson of their new AI education (checks notes): Was the Holocaust real or just a woke story?

  • SolOrion@sh.itjust.works · 6 days ago

    Well, the primary thing is that you can ask extremely specific questions and get tailored responses.

    That’s the best use case for LLMs, imo. It’s less of a replacement for a traditional encyclopedia (though people use it like that also) and more of a replacement for googling your question and getting a Reddit thread where someone explains it.

    The issue comes when people take everything it spits out as gospel and do zero fact-checking on it. Basically, the way they hallucinate is the problem I have with it.

    If there’s a chance it’s going to just flatly make things up, invent statistics, or just be entirely wrong… I’d rather just use a normal forum and ask a real person who probably has a clue about whatever question I have. Or try to find where someone has already asked that question and got an answer.

    • NigelFrobisher@aussie.zone · 5 days ago

      If you have to go and fact-check the results anyway, is there even a point? At work now I’m getting entirely AI-generated pull requests with AI-generated descriptions, and when I challenge the dev on why they went with particular choices they can’t explain or back them up.

      • SolOrion@sh.itjust.works · 5 days ago

        That’s why I don’t really use them myself. I’m not willing to spread misinformation just because ChatGPT told me it was true, but I also have no interest in going back over every response and double checking that it’s not just making shit up.

    • Hotzilla@sopuli.xyz · 6 days ago

      Google is so shit nowadays; its main purpose is to sell you things, not to actually retrieve the things you ask for.

      You mainly see this with coding-related questions; the results were much better 5 years ago. Now the only way to get results is to ask an LLM and hope it doesn’t hallucinate some library that doesn’t exist.

      Part of the issue is that SEO got better and Google stopped changing things to counter SEO manipulation.

  • Perspectivist@feddit.uk · 6 days ago

    Looking at my ChatGPT “random questions” tab and the things I’ve asked it, much of it is the kind of thing you probably couldn’t look up in an encyclopedia.

    For example:

    “Is a slight drop in engine rpm when shifting from neutral to 1st gear while holding down the clutch pedal a sign of a worn-out clutch?”

    Or:

    “What’s the difference between Mirka’s red and yellow sandpaper?”

    • XeroxCool@lemmy.world · 6 days ago

      Hopefully, it told you that’s not a sign of a worn clutch. Assuming no computer interference and purely mechanical effects, that’s a sign the clutch is dragging. A worn clutch would provide more of an air gap with the pedal depressed than a fresh clutch would. If you want to see a partial list of potential causes, see my reply to the other comment that replied to you.

      Your questions are still not proof that LLMs are filling some void. If you think of a traditional encyclopedia, of course it’s not going to know what the colors of one manufacturer’s sandpapers mean. I’m sure that’s answered somehow on their website or wherever you came across the two colors in the same grit and format. Chances are, if one is more expensive and doesn’t have a defined difference in abrasive material, the pricier one is going to last longer by way of having stronger backing paper, better abrasive adhesive, and better resistance to clogging. Whether or not the price is necessary for your project is a different story. ChatGPT is reading the same info available to you. But if you don’t understand the facts presented on the package, then how can you trust the LLM to tokenize it correctly to you?

      Similarly, a traditional encyclopedia isn’t going to have a direct answer to your clutch question, but, if it has thorough mechanical entries (with automotive specifics), you might be able to piece it together. You’d learn the “engine” spins in unison up to the flywheel, the flywheel is the mating surface for the clutch, the clutch pedal disengages the clutch from the flywheel, and that holding the pedal down for 5+ seconds should make the transmission input components spin down to a stop (even in neutral). You’re trusting the LLM here to have a proper understanding of those linked mechanical devices. It doesn’t. It’s aggregating internet sources, buzzfeed style, and presenting anything it finds in a corrupted stream of tokens. Again, if you’re not brought up to speed on how those components interact, then how do you know what it’s saying is correct?

      Obviously, the rebuttal is: how can you trust anyone’s answer if you’re not already knowledgeable? Peer review is great for forums/social sites/wikipedias, in the way of people correcting other comments. But beyond that, for formal informational sites, it comes down to vetting places as a source - a skill being actively eroded by Google or ChatGPT “giving” answers. Neither is actually answering your questions. They’re regurgitating things they found elsewhere. Remember, Google was happy to take reddit answers as fact and tell you elmers glue will hold cheese to pizza and cockroaches live in cocks. If you saw those answers with their high upvote count, you’d understand the nuance that reddit loves shitty sarcastic answers for entertainment value. LLMs don’t, because they, literally, don’t understand anything. It’s up to you to figure out if you should trust an algorithm-promoted Facebook page called “car hacks and facts” filled with bullshit videos. It’s up to you to figure out if everythingcar.com is untrustworthy because it has vague, expansive wording and more ad space than information.

      • XeroxCool@lemmy.world · 6 days ago

        It’s not. A worn clutch is losing its ability to connect the engine to the transmission. With the pedal depressed, the clutch should not be touching the engine [flywheel] at all, so a worn clutch would provide slightly more of an air gap between the engine and the transmission. So to answer OP’s question: assuming there’s no computer programming involved with the drop and it’s a purely mechanical effect, the clutch is dragging. There are many possibilities, including misadjusted clutch mechanisms (cable/plunger nut, pedal free-play screw), worn clutch mechanisms (bent clutch fork, leaking fluid/worn cable sheath/stretched cable, broken pedal mount, bent levers), or a jam (extra carpet under the pedal, debris in the transmission lever), to name several.

        I had both a worn clutch and a dragging clutch in my Geo at different points. The only result of the worn clutch was the engine revving up faster than the trucklet was accelerating, as if it were a loosey-goosey automatic. No shifting issues. When the cable was out of adjustment, the clutch wasn’t disengaging properly. It happened while driving, and once I came to a stop it became very difficult to drive. I had to ride the poor synchro to get it up to speed and, essentially, clutchlessly shift into 1st. Three blocks later, I forced it in just in time to climb my driveway.

        But, for a much less dramatic experience: often enough, the aftermarket floor mat would slip under the pedal and just slightly limit the clutch pedal travel, with an effect more like the parent comment’s experience. It would go into gear with a little crunch, a little shudder, and a little engine drop.

        Side note: it’s normal to let the clutch out in neutral and have the engine rpm drop a little. If the clutch pedal is up, the engine will be driving multiple input components - they just won’t be further connected to the output components. It takes a little energy to spin those back up to 700 rpm. They should spin down after a few seconds. If 5-10 seconds pass with the pedal depressed and the gears still resist before finally engaging with the shifter, they aren’t slowing down. That would be another symptom/diagnostic point for OP to test for a dragging clutch. A caveat is that if there’s zero input and output speed on the transmission, the dogs may not be lined up and will still prevent engagement. It takes a few tries to confirm “sometimes won’t engage” vs “really will not engage”.

  • Valmond@lemmy.world · 6 days ago

    AI today is quite juvenile in LLMs and quite bonkers in image generation. It will probably get better, like all information technology does (anyone remember the mobile phone? It went from popular, bad and expensive to a 20€ perfectly working door-stop).

    So to answer your question, imagine an AI functioning like a personal teacher, so that when you see that pressure regulator valve you can ask why it works, what happens if the gas is not isotropic, how it reacts to pressure changes, where it is used. Show me a simulation of it in a real-world situation. Calculate which one to get as this specific replacement. Can you 3D print one? Why was it used on steam engines, or was it? Thousands of pieces of information that won’t fit on one page, and that can be explained to you at your level too, if the teacher is smart enough. (A toy sketch of what such a simulation could look like is below.)

    I mean, that could be quite neat IMO.
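
    For concreteness, here is a toy sketch of the kind of simulation such a “teacher” might produce: a lumped-parameter guess with made-up numbers, not a validated model, showing a spring-loaded regulator trying to hold a downstream set-point while supply pressure and downstream demand change.

        # Toy lumped-parameter model of a pressure regulator (illustrative only).
        SETPOINT = 2.0   # desired downstream pressure [bar]
        GAIN = 5.0       # how strongly the valve reacts to pressure error
        CV = 0.8         # valve flow coefficient (arbitrary units)
        VOLUME = 1.0     # downstream volume, converts net flow into pressure change
        DT = 0.01        # time step [s]

        p_down = 1.0     # initial downstream pressure [bar]
        for step in range(1001):
            t = step * DT
            p_up = 8.0 if t < 5.0 else 6.0    # supply pressure sags at t = 5 s
            demand = 0.5 if t < 2.0 else 1.5  # downstream draw increases at t = 2 s

            # Valve opening: proportional to error, clamped between shut and fully open.
            opening = min(max(GAIN * (SETPOINT - p_down), 0.0), 1.0)
            flow_in = opening * CV * max(p_up - p_down, 0.0)

            p_down += (flow_in - demand) * DT / VOLUME
            if step % 200 == 0:
                print(f"t={t:4.1f}s  p_down={p_down:.2f} bar  opening={opening:.2f}")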

  • Flax@feddit.uk · 6 days ago

    LLMs are nice for basic research or for explaining stuff in your terms. Kind of like an interactive encyclopedia. This does sacrifice accuracy, though.

  • hera@feddit.uk · 6 days ago

    One of the ways I’ve found it to be useful so far is that it can contextualise knowledge for you.

  • Hackworth@sh.itjust.works · 6 days ago

    In much the same way people think of digital storage as external memory, I think of generative A.I. as external imagination. Of course, human memory doesn’t work like a hard drive, and LLMs don’t work like our imaginations. But as a guiding metaphor, it seems to work well for identifying good/bad use cases.

  • Alsjemenou@lemy.nl · 6 days ago

    The problem will always be that you have to use an LLM to ask questions in natural language, which means it gets training data from outside whatever database you’re trying to get information from. There isn’t enough training data in an encyclopedia to make an LLM.

    So it can’t be better, because if it doesn’t find anything it will still respond to your questions in a way that makes it seem as though it did what you asked. It just isn’t as reliable as you yourself checking and going through the data. It can make you faster and find connections you wouldn’t easily make yourself. But you can just never trust it the way you can trust an encyclopedia.

  • 211@sopuli.xyz · 6 days ago

    To me the value has come mostly from “ok, so it sounds to me like you are saying that…” and the ability to confirm that I haven’t misunderstood something (of course, with current LLMs both the original answer and the verification have to be taken with a heaping of salt). And the ability to adapt it on the go to a concrete example. So, kind of like having a teacher or an expert friend, and not just a search engine.

    The last time I relied heavily on an LLM to help/teach me with something, it was to explain the PC boot process and BIOS/UEFI to me, and how that applied, step by step, to successfully dealing with USB and bootloader issues on an “eccentric” HP laptop when installing Linux. The combination of explaining, doing, and answering questions was way better than an encyclopedia. No doubt it could have been done with blog posts and textbooks, and I did have to make “educated guesses” on occasion, but all in all it was a great experience.

  • fxdave@lemmy.ml · 6 days ago

    What’s understanding? Isn’t understanding just a consequence of neurons communicating with each other? In that case, LLMs with deep learning can understand things.

      • fxdave@lemmy.ml · 6 days ago

        Any explanation? If they can write text, I assume they understand grammar. They are definitely skilled in a way. If you do snowboarding, do you understand snowboarding? The word “understand” can be misleading. That’s why I’m asking: what is understanding?

        • Solumbran@lemmy.world · 5 days ago

          https://en.wikipedia.org/wiki/Disjunctive_sequence

          With your logic, these numbers understand grammar too, because they can form sentences.

          Even better: anything any human could ever say is contained in them, and as such humanity would have a more limited understanding of grammar than a sequence does.
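
          To make the point concrete, here is a toy illustration (my own, using short target strings for practicality): the Champernowne constant 0.123456789101112… is disjunctive in base 10, so any finite digit string eventually appears in it, yet nobody would say the number understands anything.

              # Concatenate the decimal expansions of 1..n: the start of the
              # Champernowne constant, a disjunctive sequence in base 10.
              def champernowne_digits(n: int) -> str:
                  return "".join(str(i) for i in range(1, n + 1))

              digits = champernowne_digits(100_000)
              for target in ["42", "2024", "31415"]:
                  print(f"{target!r} first appears at digit index {digits.find(target)}")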

          You cannot define understanding by results, and even if you did, AIs give horrible results that prove they do nothing other than automatically put words next to each other based on the likelihood of it making sense to humans.

          They do not understand grammar just like they do not understand anything, they simply are an algorithm made to spit out “realistic” answers without having to actually understand them.

          Another example of that is AIs that generate images: they’re full of nonsense because the AI doesn’t understand what it’s making, and that’s why you end up with weird artifacts that seem completely absurd to any human with basic understanding of reality.

          • fxdave@lemmy.ml · 5 days ago

            But LLMs are not simply probabilistic machines; they are neural nets. For sure, they haven’t seen the world. They didn’t learn the way we learn. What they mean by a caterpillar is just a vector. For humans, it’s a 3D, colorful, soft object with some traits.

            You can’t expect a being that sees chars and produces chars to know what we mean by a caterpillar. Their job is to figure out the next char. But you could expect them to understand some grammar rules, although we can’t expect them to explain the grammar.

            For another example, I wrote a simple neural net, and with 6 neurons it could learn XOR. I think we can say that it understands XOR, can’t we? Or would you say an XOR gate understands XOR better? I would not use the word “understand” for something that cannot learn. But why wouldn’t we use it for a neural net?
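
            For reference, here is a minimal sketch of the kind of net I mean (the architecture and hyperparameters are illustrative, not my original code): a tiny feed-forward net trained on XOR by plain backpropagation.

                # Tiny feed-forward net learning XOR with plain numpy.
                import numpy as np

                rng = np.random.default_rng(0)
                X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
                y = np.array([[0], [1], [1], [0]], dtype=float)

                W1 = rng.normal(size=(2, 4)); b1 = np.zeros(4)   # 2 inputs -> 4 hidden
                W2 = rng.normal(size=(4, 1)); b2 = np.zeros(1)   # 4 hidden -> 1 output

                def sigmoid(z):
                    return 1.0 / (1.0 + np.exp(-z))

                lr = 1.0
                for _ in range(5000):
                    h = sigmoid(X @ W1 + b1)              # forward pass
                    out = sigmoid(h @ W2 + b2)
                    d_out = (out - y) * out * (1 - out)   # backprop, squared-error loss
                    d_h = (d_out @ W2.T) * h * (1 - h)
                    W2 -= lr * h.T @ d_out; b2 -= lr * d_out.sum(axis=0)
                    W1 -= lr * X.T @ d_h;   b1 -= lr * d_h.sum(axis=0)

                print(np.round(out, 2))  # should approach [[0], [1], [1], [0]]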

            • Solumbran@lemmy.world · 5 days ago

              Your whole logic is based on the idea that being able to do something means understanding that thing. This is simply wrong.

              Humans feel emotions, yet they don’t understand them. A calculator makes calculations, but no one would say that it understands math. People blink and breathe and hear, without any understanding of it.

              The concept of understanding implies some form of meta-knowledge about the subject. Understanding math is more than using math, it’s about understanding what you’re doing and doing it out of intention. All of those things are absent in an AI, neural net or not. They cannot “see the world” because they need to be programmed specifically for a task to be able to do it; they are unable to actually grow out of their programming, which is what understanding would ultimately cause. They simply absorb data and spit it back out after doing some processing, and the fact that an AI can be made to produce completely incompatible results shows that there is nothing behind it.

              • fxdave@lemmy.ml · 2 days ago

                The concept of understanding implies some form of meta-knowledge about the subject.

                That can be solved if you teach it the meta-knowledge with intermediary steps, for example:

                prompt: 34*3=
                
                step1: 4*3 + 30*3 = 
                step2: 12 + 10*3*3 = 
                step3: 12 + 10*9=
                step4: 12 + 90 =
                step5: 100 + 2 =
                step6: 102
                
                result: 102
                

                It’s hard to find such training data, though, but e.g. Claude already uses intermediary steps. It preprocesses your input multiple times. It writes code and runs code to process your input, and that’s still not the final response. Unfortunately, it’s already smarter than some junior developers, and the consequences of that are worrying.

    • Alsjemenou@lemy.nl · 6 days ago

      That’s just circular reasoning, since understanding is needed for communication.