I’m a tech-interested guy. I’ve touched SQL once or twice, but wasn’t able to really make sense of it. That, combined with not having a practical use for it, leaves SQL as largely a black box in my mind (though I am somewhat familiar with the technical concepts behind databases).

With that, I keep seeing [pic related] as proof that Elon Musk doesn’t understand SQL.

Can someone give me a technical explanation for how one would come to that conclusion? I’d love if you could pass technical documentation for that.

  • KillingTimeItself
    10
    2 months ago

    TL;DR: de-duplication in that form refers to a technique where two different references in the file system point to one single piece of data on the drive, the intention being to reduce the storage space used.

    You can imagine this would be very useful when taking backups, for instance. It’s closely related to the “Copy on Write” approach: a copy starts out as a second reference to the existing data, and when you edit it, only the changed parts get written separately, so you keep both the original and the edited copy without storing the shared data twice (it’s more complicated than this obviously, but you get the idea).

    now just to be clear, if you did implement this into a DB, which you could do fairly trivially, this would change nothing about how the DB operates, it wouldn’t remove “duplicates” it would only coalesce duplicate data into one single tree to optimize disk usage. I have no clue what elon thinks it does.
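
A minimal Python sketch of storage-level de-duplication, assuming a toy content-addressed store (the `DedupStore` class and its method names are made up for illustration and are not taken from any real file system or database). The point it tries to show: both file names keep existing, only the identical bytes underneath are stored once.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store: identical content is kept only once."""

    def __init__(self):
        self.blobs = {}   # content hash -> bytes (stored once)
        self.names = {}   # file name -> content hash (one entry per file)

    def write(self, name, data):
        digest = hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(digest, data)  # store the bytes only if new
        self.names[name] = digest

    def read(self, name):
        return self.blobs[self.names[name]]


store = DedupStore()
store.write("report.txt", b"same bytes")
store.write("backup/report.txt", b"same bytes")

# Both logical files still exist; only the physical storage is shared.
print(len(store.names), "files,", len(store.blobs), "stored blob(s)")
```

Nothing gets deleted from the user's point of view, which is why this kind of de-duplication says nothing about "duplicate" records.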

    The problem here, as a non programmer, is that i don’t understand why you would ever de-duplicate a database. Maybe there’s a reason to do it, but i genuinely cannot think of a single instance where you would want to delete one entry, and replace it with a reference to another, or what elon is implying here (remove “duplicate” entries, however that’s supposed to work)

    Elon doesn’t know what “de-duplication” is, and i don’t know why you would ever want that in a DB; it seems like a really good way to explode everything.

    • @valtia@lemmy.world
      2
      2 months ago

      i genuinely cannot think of a single instance where you would want to delete one entry, and replace it with a reference to another

      Well, there’s not always a benefit to keeping historical data. Sometimes you only want the most up-to-date information in a particular table or database, so you’d just update the row (replace). It depends on the use case of a given table.

      what elon is implying here (remove “duplicate” entries, however that’s supposed to work)

      Elon believes that each row in a table should be unique based on the SSN only, so a given SSN should appear only once with the person’s name and details on it. Yes, it’s an extremely dumb idea, but he’s a famously stupid person.

      • DacoTaco
        1
        2 months ago

        SSN being unique isn’t a dumb idea, it’s a very smart idea, but due to the US SSN format it’s impossible to do. Hence, to implement the idea you would first need to change the SSN format so that it is unique.

        Also, Elon’s remark is stupid as is. I’m sure the row has a unique id, even if it’s just a rowid column.

        • KillingTimeItself
          1
          2 months ago

          Also, Elon’s remark is stupid as is. I’m sure the row has a unique id, even if it’s just a rowid column.

          even then, i wonder if there’s some sort of “row hash function” that takes a hash of all the data in a single entry, and generates a universally unique hash of that entry, as a form of “global id”
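
For what it's worth, a row hash like that is easy to sketch. Below is a minimal Python version, assuming a made-up record layout (`row_hash` and the column names are hypothetical). One caveat: a hash is deterministic, so two rows with identical contents get the same hash. It identifies the content, not the row.

```python
import hashlib
import json


def row_hash(row):
    """Return a SHA-256 hex digest of a row's contents.

    Sorting the keys makes the hash independent of column order.
    """
    canonical = json.dumps(row, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


a = {"ssn": "000-00-0001", "name": "Alice Example", "born": "1950-01-01"}
b = {"name": "Alice Example", "born": "1950-01-01", "ssn": "000-00-0001"}
c = {"ssn": "000-00-0001", "name": "Alice Example", "born": "1950-01-02"}

print(row_hash(a) == row_hash(b))  # same contents, different column order
print(row_hash(a) == row_hash(c))  # one field differs
```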

      • KillingTimeItself
        1
        2 months ago

        Well, there’s not always a benefit to keeping historical data. Sometimes you only want the most up-to-date information in a particular table or database, so you’d just update the row (replace). It depends on the use case of a given table.

        in this case you would just overwrite the existing row, you wouldn’t use de-duplication because it would do the opposite of what you wanted in that case. Maybe even use historical backups or CoW to retain that kind of data.

        Elon believes that each row in a table should be unique based on the SSN only, so a given SSN should appear only once with the person’s name and details on it. Yes, it’s an extremely dumb idea, but he’s a famously stupid person.

        and naturally, he doesn’t know what the term “de-duplication” means. Definitionally, the actual identity of the person MUST be unique, otherwise you’re going to somehow return two rows, when you call one, which is functionally impossible given how a DB is designed.

        • @valtia@lemmy.world
          1
          2 months ago

          in this case you would just overwrite the existing row, you wouldn’t use de-duplication because it would do the opposite of what you wanted in that case.

          … That’s what I said, you’d just update the row, i.e. replace the existing data, i.e. overwrite what’s already there

          Definitionally, the actual identity of the person MUST be unique, otherwise you’re going to somehow return two rows, when you call one, which is functionally impossible given how a DB is designed.

          … I don’t think you understand how modern databases are designed

          • KillingTimeItself
            0
            2 months ago

            … That’s what I said, you’d just update the row, i.e. replace the existing data, i.e. overwrite what’s already there

            you were talking about not keeping historical data, which is one of the proposed reasons you would have “duplicate” entries, i was just clarifying that.

            … I don’t think you understand how modern databases are designed

            it’s my understanding that, when it comes to storing data, it shouldn’t be possible to have two independent stores of the exact same thing in two separate places. You could have duplicate data entries, but that’s irrelevant to the discussion of de-duplication, aside from data consolidation, which i don’t imagine is an intended use case for a DB, considering that you literally already have one identical entry. Of course you could simply make it non-identical, that goes without saying.

            Also, we’re talking about the DB used for the social security database, not fucking tigerbeetle.

  • @owenfromcanada@lemmy.world
    6
    2 months ago

    TIL Elon doesn’t know SQL or have any basic human decency.

    J/K, I already knew he doesn’t have basic human decency.

    If he knew anything about SQL, he could have run a quick search to see whether any SSNs are actually duplicated. (spoiler alert: they’re not, he’s just stupid).
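
The kind of quick search meant here could look like the sketch below, using Python's built-in `sqlite3` with a made-up `people` table (the table, column names, and rows are all invented for illustration and say nothing about the real SSA schema).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (ssn TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO people (ssn, name) VALUES (?, ?)",
    [
        ("000-00-0001", "Alice Example"),
        ("000-00-0002", "Bob Example"),
        ("000-00-0002", "Robert Example"),  # same SSN appears twice
    ],
)

# List every SSN that occurs in more than one row, with its row count.
duplicates = conn.execute(
    """
    SELECT ssn, COUNT(*) AS n
    FROM people
    GROUP BY ssn
    HAVING COUNT(*) > 1
    """
).fetchall()

print(duplicates)
```

A hit from a query like this only says the value repeats. Whether that is a problem depends entirely on what the table is for.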

    • Phoenixz
      8
      2 months ago

      Elon Musk is a (criminal) scammer, he always has been.

      He was fired for incompetence from his own company.

      Pretty much everything he’s promised for every company he has headed has been a lie. Tesla full self driving? Lie. Hyperloop? All lies to successfully kill high speed rail and start a movement that wasted billions of dollars, including taxpayer money. Even SpaceX, the least shit of all, is shit. Once you really look at it, it’s all promises with no results and lots of cheering when millions of taxpayer dollars -yet again- blow up in the sky.

      The guy has one quality: convincing people that he’s smart even though he literally doesn’t know shit

    • @Snothvalpen@lemmy.blahaj.zone
      3
      2 months ago

      Wait, SSNs weren’t designed to be GUIDs? I mean, I fully follow that they aren’t and we’ve had to reuse them when the circle of life does its thing, but I thought they were just designed poorly and we found out the hard way they don’t work as GUIDs. What purpose were they designed for if not to act as GUIDs?

      • @jonne@infosec.pub
        6
        2 months ago

        They were designed to be only used for the administration of social security. Since they were sending monthly checks, they needed a way to know that the person going to the office and saying their address changed was who they said they were. This was at a time before driver’s licences were common and they didn’t have any other type of ID, and there were just a lot fewer people.

        Later on the SSN started to be used by banks and other entities even though it was never meant for that, and the risks associated with the relatively insecure design just compounded, because instead of just fraudulently claiming someone else’s social security checks (which, unless the target died, would probably be figured out within a month), it opened up all sorts of extra avenues for fraud.

      • @Maggoty@lemmy.world
        18
        2 months ago

        But I was assured he was a materials engineer, rocket scientist, computer programmer, and businessman extraordinaire!

    • @Dkarma@lemmy.world
      18
      2 months ago

      Lol talk about burying the lede… The issue here is that the government absolutely uses SQL to traverse a DB and anyone who thinks otherwise is an idiot.

      • @DahGangalang@infosec.pubOP
        4
        2 months ago

        Naw, I definitely meant to be asking about duplication of data in databases (vs if the government actually uses SQL).

        Sorry to have communicated that so poorly. Everyone seems to be taking the angle you’re arguing though. Guess I’ll need to work on that.

    • @orcrist@lemm.ee
      15
      2 months ago

      Elon Musk is also an idiot. He thinks he’s smart enough to quickly understand complex situations and complex problems about which he knows next to nothing, within just a few minutes.

      Most people would only try to claim that level of understanding in areas with which they have professional experience or about which they’re extremely geeky. He does it with everything, and nobody can be an expert in everything, and everybody knows that except for narcissists.

      I suppose for non-tech people it might be convenient to assume that because someone knows something about some kind of tech, they therefore know a lot about all kinds of tech, but the reality is that’s just not true. There are so many fields that are totally different. And even if it were true, he would actually look even more idiotic, because Twitter is a train wreck, so clearly he’s incompetent in the tech field, right?

      • @thatKamGuy@sh.itjust.works
        13
        2 months ago

        The SSN is 9 digits long; so technically they would have to start re-using them after the billionth one. Given the current population size, and how many people have been born/died since its implementation - it’s fair to say they haven’t had to re-use any figures yet.

    • @bitchkat@lemmy.world
      30
      2 months ago

      The sheer size of the federal government and its age would mean there are thousands of databases out there. Some may be so old that they predate RDBMS/SQL.

      That alone makes his comment come from a place of ignorance. Of course it’s confident ignorance. The worst kind.

      • @lmmarsano@lemmynsfw.com
        2
        2 months ago

        Some may be so old that they predate RDBMS/SQL.

        I don’t follow. Wouldn’t that lend credence to his assertion that it’s incorrect to assume that everything in government is SQL?

        People here are being irrationally obtuse about the possibility that an agency that’s existed since the 1930s may keep business-critical records on legacy systems predating relational databases. Systems serving a national agency may not migrate databases frequently.

          • @lmmarsano@lemmynsfw.com
            1
            2 months ago

            Were those his exact words? When words are ambiguous, are we selecting interpretations that serve best in the contention? Does the context suggest something obvious was left unstated? Yours seems like a forced interpretation.

            1. He complains about 1 specific database.
            2. Some rando assumes it’s SQL & retorts he doesn’t know it.
            3. He literally writes “This retard thinks the government uses SQL.”

            Always, sometimes, here? In typical Twitter fashion, it’s brief and leaves room for interpretation.

            In context, always or here makes the most sense as in “This dumbass thinks the government always uses SQL.” or “This dumbass thinks the government uses SQL here.” Does it matter some other database is SQL if this one isn’t? No. With your interpretation, he pointlessly claims that it does matter for no better reason than to discredit himself. With narrower interpretations, he doesn’t. In a contention, people don’t typically make pointless claims to discredit themselves. Therefore, narrower interpretations make more sense. Use context.

            All I did here was apply textbook guidelines for analyzing arguments & strawman fallacies as explained in The Power of Logic. I welcome everyone to do the same.

            A problem with objecting to a proposition that misrepresents the original proposition is that the objector fails to engage with the actual argument. Instead, they argue with themselves & their illusions, which looks foolish & isn’t a valid argument. That’s why strawman is a fallacy.

            The fact is there’s very little information here. We don’t know which database he’s referring to exactly. We don’t know its technology. Some of us have worked enough with local government & legacy enterprise systems to know that following any sort of common industry standards is an unsafe assumption. No one here has introduced concrete information on any of that to draw clear conclusions, though there’s an awful lot of conjecture & overreading.

            He seemed to use the word de-duplicated incorrectly. However, he also explained exactly what he meant by that, so the word hardly matters. Is there a good chance he’s wrong that multiple records with the same SSN indicate fraud? Without a clear explanation of the data architecture, I think so.

            I despise idiocy. Therefore, I despise what Musk is doing to the government. Therefore, I despise it when everyone else does it.

            Seeing this post keep popping up in the lemmy feed is annoying when it’s clear from context that there’s nothing there but people reading more into it.

            Wow! It's fucking nothing!

            We don’t have to become idiots to denounce idiocy.

            • @bitchkat@lemmy.world
              3
              2 months ago

              He literally writes “This retard thinks the government uses SQL.”

              That is all you need. He’s not saying “This retard thinks the SSA uses SQL”. He is saying “the government” which means all of it. Saying someone is a retard because they think the government uses SQL means Elon doesn’t think they do because we all know he doesn’t consider himself a retard.

              You are looking for ambiguity where there is none.

              • @lmmarsano@lemmynsfw.com
                0
                2 months ago

                Nah, that’s ignoring context irrationally. Context matters. I’ll show.

                He’s not saying “This retard thinks the SSA uses SQL”.

                Can SSA not be called “the government”?

                He is saying “the government” which means all of it.

                So, let’s try your suggested interpretation.

                This retard thinks all the government uses SQL.

                That seems to agree with mine.

                However, you denied ambiguity of language, and that context matters, so let’s explore that: which government? The Brazilian government? Your state government? Your local government? No? How do you know? That’s right: context.

                Why stop there? There’s more context: a Social Security database was specifically mentioned.

                Does “the government” always mean all of it? When a federal agent knocks on someone’s door & someone gripes “The goddamn government is after me!” do they literally mean the entire government? I know from context I or anyone else can informally refer to any part of the government at any level as “the government”. I think you know this.

                Likewise, when people refer to the ocean or the sky or the people, they don’t necessarily mean all of it or all of them.

                Another way to check meaning is to test whether a proposition still makes sense when something obvious but unstated is explicitly written out.

                This retard thinks the government uses SQL. Why assume they use SQL here?

                Still make sense? Yes. Could that be understood from context without explicitly writing it out? Yes.

                A refrain:

                Use context.

    • @Aeao@lemmy.world
      4
      2 months ago

      I’m not arguing that Elon musk is anything but an absolute tool.

      SS numbers have 999 million options. Are we already repeating them?

      • @vonbaronhans@midwest.social
        7
        2 months ago

        We have over 300 million people in the US right now. Social security started in the US in 1935 with just over 127 million people then.

        Yeah, we probably have gone through 999 million options by now.

        • @starman2112@sh.itjust.works
          6
          2 months ago

          I don’t think we’ve gone through 999 million options yet. Only about 350 million people have been born since 1933, so even if we add all 127 million US citizens alive in 1935, that’s just over half of the possible social security numbers.

          The reason we’ve likely reused numbers is because they weren’t randomly assigned until like 2011. Knowing that I was born in 1995 in Wichita, KS, you could make an educated guess at the first three digits of my SSN

          • @vonbaronhans@midwest.social
            3
            2 months ago

            We have 335 million people in this country literally right now. I don’t think “350 million born since 1933” makes sense. There gotta be a lot of churn just from early deaths alone.

            Edit: number fixin

            • @starman2112@sh.itjust.works
              2
              2 months ago

              I mean you can check my math, I just added up all the births per year in this article

              https://www.usatoday.com/story/money/2020/06/12/how-many-people-were-born-the-year-you-were-born/111928356/

              Rounding to one decimal place, it’s 311.9 million people born in the US between 1933 and 2018. Adding an average of 4 million births per year since then, it’s 335.9 million. I rounded up to 350 to bring it to a nice round number.

              A bit of research tells me that around 44.8 million of us are first generation immigrants, so 291.1 million were born here. Is it reasonable to assume that 291.1 out of the 335.9 million people born since 1933 have survived so far? I have absolutely no idea, I’m not a professional census taker

            • @tempest@lemmy.ca
              4
              2 months ago

              Not every person in the United States was born in the United States and even temporary workers can get a SSN

          • @vonbaronhans@midwest.social
            3
            2 months ago

            Just read that, and it says they’ve only issued 453 million numbers so far. Huh. I really thought it would’ve been a lot more than that.
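
The back-of-the-envelope numbers here can be checked in a few lines of Python. The 453 million figure is the one quoted just above. The count of possible numbers is the naive one for a 9-digit number and ignores any ranges that are never assigned, so the usable space is somewhat smaller.

```python
# Naive size of a 9-digit number space.
total_numbers = 10 ** 9

# Figure quoted above for numbers issued so far.
issued = 453_000_000

share_used = issued / total_numbers
print(f"{total_numbers:,} possible numbers")
print(f"{share_used:.1%} issued so far")
```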

  • @nednobbins@lemm.ee
    15
    2 months ago

    It’s so basic that documentation is completely unnecessary.

    “De-duping” could mean multiple things, depending on what you mean by “duplicate”.

    It could mean that the entire row of some table is the same. But that has nothing to do with the kind of fraud he’s talking about. Two people with the same SSN but different names wouldn’t be duplicates by that definition, so “de-duping” wouldn’t remove it.

    It can also mean that a certain value shows up more than once (eg just the SSN). But that’s something you often want in database systems. A transaction log of SSN contributions would likely have that SSN repeated hundreds of times. It has nothing to do with fraud, it’s just how you record that the same account has multiple contributions.

    A database system as large as the SSA has needs to deal with all kinds of variations in data (misspellings, abbreviations, moves, siblings, common names, etc). Something as simplistic as “no dupes anywhere” would break immediately.
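
The two meanings of "duplicate" described above can be told apart with a small sketch. It uses Python's built-in `sqlite3` and a hypothetical `contributions` log (table, columns, and values are invented for illustration).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contributions (ssn TEXT, paid_on TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO contributions VALUES (?, ?, ?)",
    [
        ("000-00-0001", "2024-01-31", 120),
        ("000-00-0001", "2024-02-29", 120),
        ("000-00-0001", "2024-02-29", 120),  # exact copy of the row above
        ("000-00-0002", "2024-01-31", 95),
    ],
)

# Definition 1: whole rows that are identical.
total_rows = conn.execute("SELECT COUNT(*) FROM contributions").fetchone()[0]
distinct_rows = conn.execute(
    "SELECT COUNT(*) FROM (SELECT DISTINCT ssn, paid_on, amount FROM contributions)"
).fetchone()[0]

# Definition 2: a value (the SSN) that shows up in more than one row.
repeated_ssns = conn.execute(
    "SELECT ssn, COUNT(*) FROM contributions GROUP BY ssn HAVING COUNT(*) > 1"
).fetchall()

print(total_rows, distinct_rows)
print(repeated_ssns)
```

Only one row here is a true full-row duplicate. The repeated SSN is just what a log of several contributions for one account looks like.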

      • @nednobbins@lemm.ee
        2
        2 months ago

        Yeah. And the fix for that has nothing to do with “de-duping” as a database operation either.

        The main components would probably be:

        1. Decide on a new scheme (with more digits)
        2. Create a mapping from the old scheme to the new scheme. (that’s where existing duplicates would get removed)
        3. Let people use both during some transition period, after which the old one isn’t valid any more.
        4. Decide when you’re going to stop issuing old SSNs and only issue new ones to people born after some date.

        There’s a lot of complication in each of those steps but none of them are particularly dependant on “de-duped” databases.
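
The mapping in step 2 could be sketched roughly as follows. Everything here is hypothetical: the record layout, the 12-digit format, and the sequential numbering are assumptions chosen only to show that each person gets their own new ID even when two of them share an old number.

```python
from itertools import count

# Hypothetical existing records; two different people share one old number.
records = [
    {"old_ssn": "000-00-0001", "name": "Alice Example"},
    {"old_ssn": "000-00-0002", "name": "Bob Example"},
    {"old_ssn": "000-00-0002", "name": "Carol Example"},
]

next_id = count(1)
mapping = []  # one (new_id, old_ssn, name) entry per person

for record in records:
    new_id = f"{next(next_id):012d}"  # assumed 12-digit format
    mapping.append((new_id, record["old_ssn"], record["name"]))

new_ids = [new_id for new_id, _, _ in mapping]
print(len(new_ids) == len(set(new_ids)))  # every person has a unique new ID
```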

      • DacoTaco
        2
        2 months ago

        Just read the format of the US SSN in that Wikipedia article. That wasn’t a smart format to use lol. It only supports 99 × 9,999 (roughly 990k) people per area code. No wonder numbers are reused.
        In some countries it’s birthday+sequence number encoded with gender+checksum, and that has been working since the 80’s.
        Before that there was a different number, but, like the US SSN, it wasn’t future-proof, so we migrated away in the 80’s :')

        • @Wispy2891@lemmy.world
          3
          2 months ago

          In my country the only way that someone has the same number is if someone was born on the same day (±1 century), in the same city and has the same name and family name. It is extremely difficult to have duplicates in that way (exception: immigrants, because the “city code” is the same for the whole foreign country, so it’s not impossible that there are two Ananya Gupta born on the same day in the whole of India)

          • DacoTaco
            2
            2 months ago

            Oh ye, our system wouldn’t fit India as it’s limited to 500 births a day (the sequence is 3 digits, and whether it’s even or odd describes your gender). Your system seems fine to me and beats the US system hands down haha
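
A rough Python sketch of an ID built the way described in this sub-thread: birth date, then a 3-digit sequence whose parity encodes gender, then a checksum. The mod-97 check digits and the exact layout are assumptions made for illustration, since the comments don't spell out the real rules.

```python
from datetime import date


def make_id(born, sequence):
    """Build YYMMDD + 3-digit sequence + 2-digit checksum (assumed layout)."""
    if not 1 <= sequence <= 999:
        raise ValueError("sequence must fit in three digits (1-999)")
    base = f"{born:%y%m%d}{sequence:03d}"
    checksum = 97 - int(base) % 97  # assumed mod-97 check, always 1..97
    return f"{base}{checksum:02d}"


def is_valid(id_number):
    """Recompute the checksum and compare it with the last two digits."""
    base, checksum = id_number[:-2], int(id_number[-2:])
    return 97 - int(base) % 97 == checksum


new_id = make_id(date(1985, 7, 14), 123)
print(new_id, is_valid(new_id))

# Changing a single digit of the base breaks the checksum.
tampered = new_id[:8] + ("4" if new_id[8] != "4" else "5") + new_id[9:]
print(is_valid(tampered))
```

With a two-digit year the same number can recur a century later, which matches the "±1 century" caveat mentioned above.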

  • knightly the Sneptaur
    31
    2 months ago

    To oversimplify, there are two basic kinds of databases: SQL (Structured Query Language, usually pronounced like “sequel” or spelled aloud) and noSQL (“Not Only SQL”).

    SQL databases work as you’d imagine, with tables of rows and columns like a spreadsheet that are structured according to a fixed schema.

    NoSQL includes all other forms of databases, document-based, graph-based, key-value pairs, etc.

    The former are highly consistent and efficient at processing complicated queries or recording transactions, while the latter are more flexible and can be very fast at reads/writes but are harder to keep in sync as a result.

    All large orgs will have both types in use for different purposes; SQL is better for banking needs where provable consistency is paramount, NoSQL better for real-time web apps and big data processing that need minimal response times and scalable capacity.

    That Musk would claim the government doesn’t use SQL immediately betrays him as someone who is entirely unfamiliar with database administration, because SQL is everywhere.

    • @schteph@lemmy.world
      5
      2 months ago

      I didn’t read it like that. What I take from it is that he’s implying that the government uses something much stupider than sql, like Lotus1-2-3 or plain txt files or excel. I really wouldn’t be surprised that there’s some government department that had their IT done during the first Bush administration and didn’t really upgrade from it since.

      There are also probably some departments that don’t get much funding, so they organise part of their work into some shared excel files.

      Nothing really wrong with that. Unless he’s implying that the entire federal government works like that, which is preposterously stupid.

      • knightly the Sneptaur
        1
        2 months ago

        Seems to me that the most generous interpretation would be the preponderance of Oracle’s DBs in the government, and Musk being pedantic since they aren’t literally called “SQL” like MySQL, MSSQL, or PostgreSQL (even though most Oracle DBs still fall into that category).

    • @DahGangalang@infosec.pubOP
      8
      2 months ago

      Just so I’m clear, you’re implying that a given SSN could appear associated to multiple “keys” because the key-value pair in a NoSQL database could have complex data.

      An example I can imagine is a widow collecting her dead husband’s Social Security. Her SSN could appear in her own entry and also in her dead husband’s as a payee of that benefit, thus appearing as a “duplicate” SSN.

      Is that in line with what you’re saying?
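
For concreteness, that scenario could look like the sketch below, with plain Python dicts standing in for document-style records. The field names and the nesting are invented for illustration and are not based on any real SSA data model.

```python
# Two hypothetical document-style records (plain dicts standing in for JSON).
records = [
    {
        "ssn": "000-00-0001",
        "name": "Pat Example",
        "status": "deceased",
        "survivor_benefit_payee": {"ssn": "000-00-0002", "name": "Sam Example"},
    },
    {
        "ssn": "000-00-0002",
        "name": "Sam Example",
        "status": "living",
    },
]


def find_ssns(value):
    """Yield every value stored under an "ssn" key, at any nesting depth."""
    if isinstance(value, dict):
        for key, inner in value.items():
            if key == "ssn":
                yield inner
            else:
                yield from find_ssns(inner)
    elif isinstance(value, list):
        for item in value:
            yield from find_ssns(item)


all_ssns = list(find_ssns(records))
print(all_ssns)
print(len(all_ssns), "occurrences,", len(set(all_ssns)), "distinct people")
```

A naive scan counts the second SSN twice even though there is only one such person, once as a record holder and once as a payee.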

      • knightly the Sneptaur
        7
        2 months ago

        Indeed, that’s a possibility, but I’m not privy to the structure of the social security administration’s databases so I couldn’t say if it was indeed the case.

        The deeper point being, if the government has any databases at all, then some form of Structured Query Language is being used to read and write it.

        • @DahGangalang@infosec.pubOP
          6
          2 months ago

          That’s how I feel too.

          Lol, I’d love to see the data he’s trying to speak about (not that that’d be any kind of concerning for privacy /s). I don’t think he’s outright lying, but it definitely feels like a misrepresentation / wrong conclusion from the data.

          But thanks for your part in helping me understand all this!

  • @jacksilver@lemmy.world
    18
    2 months ago

    If SSNs are used as a primary key (a unique identifier for a row of data) then they’d have to be repeated in other tables (as foreign keys) to be able to join data together.

    However, even if they aren’t using SSN as an identifier because it’s sensitive information, it’s not uncommon to repeat data, whether for speed/performance, for simplicity in table design, because it sits in a lookup table, or because you have disconnected tables.

    Having a value repeated doesn’t tell you anything about fraud risk, efficiency, or really anything. Using it as the primary piece of evidence for a claim isn’t a strong argument.
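
A small sketch of the primary-key case, using Python's built-in `sqlite3`. The `people` and `payments` tables are hypothetical and only meant to show how a value that is unique in one table is repeated in another so the two can be joined.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE people (ssn TEXT PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE payments ("
    " id INTEGER PRIMARY KEY,"
    " ssn TEXT NOT NULL REFERENCES people(ssn),"
    " amount INTEGER)"
)
conn.execute("INSERT INTO people VALUES ('000-00-0001', 'Alice Example')")
conn.executemany(
    "INSERT INTO payments (ssn, amount) VALUES (?, ?)",
    [("000-00-0001", 100), ("000-00-0001", 100), ("000-00-0001", 105)],
)

# The SSN is unique in people but repeats in payments; the join relies on that.
rows = conn.execute(
    "SELECT p.name, COUNT(*), SUM(pay.amount)"
    " FROM people AS p JOIN payments AS pay ON pay.ssn = p.ssn"
    " GROUP BY p.ssn"
).fetchall()

print(rows)
```

Removing the "duplicate" SSNs from `payments` would throw away two of the three payments, which is the opposite of cleaning up the data.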

    • @DahGangalang@infosec.pubOP
      3
      2 months ago

      This sounds like a reasonable argument.

      Can you pass any resources with examples on when having duplicate values would be useful/best practices?

    • @credo@lemmy.world
      2
      2 months ago

      This is the answer… it seems few on lemmy have ever normalized a database. But they do know how to give answers!

      • @jacksilver@lemmy.world
        2
        2 months ago

        Thanks, OP seemed more curious about the technical aspects than just the absurdity of the comment (since pretty much every business uses SQL) so hoped a more technical explanation might be appreciated.

  • ThePowerOfGeek
    47
    2 months ago

    Elon Musk is the walking talking embodiment of the Dunning-Kruger effect.

    • @utopiah@lemmy.world
      2
      2 months ago

      100%

      What’s fascinating is that you can take pretty much ANY topic you have some knowledge about (besides scamming at scale, because there he truly is a master) and see very fast that he has no fucking clue. From engineering to video games, the guy has no idea. Sure, his entourage, paid or not, might actually include world experts on said topic, but not him. So obvious.

  • @SolidShake@lemmy.world
    26
    2 months ago

    How come republicans keep saying that doggy is going to expose all the fraud in the government but yet the biggest fraud with 37 felonies is president? What the actual fuck do these people think?

  • @rational_lib@lemmy.world
    12
    2 months ago

    To me I’m not really sure what his reply even means. I think it’s some attempt at a joke (because of course the government uses SQL), but I figure the joke can be broken down into two potential jokes that fail for different, embarrassing reasons:

    Interpretation 1: The government is so advanced it doesn’t use SQL - This interpretation is unlikely given that Elon is trying to portray the government as in need of reform. But it would make more sense if coming from a NoSQL type who thinks SQL needs to be removed from everywhere. NoSQL Guy is someone many software devs are familiar with who takes the sometimes-good idea of avoiding SQL and takes it way too far. Elon being NoSQL Guy would be dumb, but not as dumb as the more likely interpretation #2.

    Interpretation 2: The government is so backward it doesn’t use SQL - I think this is the more likely interpretation as it would be consistent with Elon’s ideology, but it really falls flat because SQL is far from being cutting-edge. There has kind of been a trend of moving away from SQL (with considerable controversy) over the last 10 years or so and it’s really surprising that Elon seems completely unaware of that.

    • KillingTimeItself
      1
      2 months ago

      it’s probably using some sort of proprietary home grown database, because it’s probably old enough that no database could support what they needed, could be wrong on that one, but it was my best guess.

    • @dnick@sh.itjust.works
      2
      2 months ago

      My guess is that he thinks SQL is an app or implementation like MS-SQL. It would be pretty surprising if the government didn’t use SQL as in relational databases, but if it doesn’t it’s even more unlikely that he understands even the first part over whether having duplicate SS numbers is in any way unexpected or unreasonable. Most likely one of the junior devs somewhere along the lines misunderstood a query and said something uninformed and mocking, and he took that as a good dig to toss into a tweet.

    • @DahGangalang@infosec.pubOP
      2
      2 months ago

      Thanks for the genuine response. Lol, most who interpret my question the way you did don’t seem interested in a good faith discussion. But ol’ boy is def tripping if he thinks SQL isn’t used in the government.

      Big thing I’m intending to pry at is whether there would be a legitimate purpose to have duplicated SSNs in the database (thus showing the First Bro doesn’t understand how SQL works).

  • John Doe
    link
    fedilink
    53
    edit-2
    2 months ago

    Musk’s statement about the government not using SQL is false. I worked for FEMA for fourteen years, a decade of which was as a Reports Analyst. I wrote Oracle SQL*Plus code to pull data from a database and put it into spreadsheets. I know, I know. You’re shocked that Elon Musk is wrong. Please remain calm.

    • @DahGangalang@infosec.pubOP
      link
      fedilink
      1
      2 months ago

      Yeah, obviously ol’ boy is tripping if he thinks SQL isn’t used in the government.

      Big thing I’m prying at is whether there would be a legitimate purpose to have duplicated SSNs in the database (thus showing the First Bro doesn’t understand how SQL works).

    • @jve@lemmy.world
      link
      fedilink
      7
      2 months ago

      As a former DOD contractor I can also confirm we built whole platforms that use Oracle (shudder) SQL

    • @whoisearth@lemmy.ca
      link
      fedilink
      11
      2 months ago

      I work for a crown corp in Canada we have, off the top of my head, about 800 MSSQL, Oracle, MySQL/MariaDB, Postgres databases across the org (I manage our CMDB). Musk is a retard. The world runs on SQL.

      He wouldn’t know this though because he’s a techbro that builds apps with MongoDB because he doesn’t understand what normalizing data is and why SQL is the best option for 99.9999999% of applications.

      Fucking idiots.

  • Nate Cox
    link
    fedilink
    English
    65
    2 months ago

    Because a simple query would have shown that SSN was a compound key with another column (birth date, I think), and not the identifier he thinks it is.

    • BombOmOm
      link
      fedilink
      English
      4
      edit-2
      2 months ago

      Why would one person, one SSN ever have two different birth dates? That sounds like an issue all unto itself.

      • @geoff@lemm.ee
        link
        fedilink
        15
        2 months ago

        I think what he means is that the unique identifier for a database record is a composite of two fields: SSN + birth date. That doesn’t mean that SSN to birth date is a one-to-many relation.

        • @DahGangalang@infosec.pubOP
          link
          fedilink
          2
          2 months ago

          But they are implying SSN to SSN+Birthdate is a one-to-many relationship. Since SSN to SSN should be one-to-one, you can conclude the SSN to Birthdate is one-to-many, right?

          • Nate Cox
            link
            fedilink
            English
            11
            2 months ago

            No, who said there was a relationship?

            A compound key is a composite key where one or both sides can themselves be foreign keys to other tables; it’s a safe assumption this is probably true in a large data set like social security. A composite key is a candidate key (a uniquely identifying key) made up of more than one column.

            This basically means that there is a finite number of available SSNs because they’re only 9 digits long, and someone intends to recycle SSNs after the current holder of one dies. Linking it to birthday is “unique enough” that the combination should never recur.
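
            To make the composite key idea concrete, here is a minimal sketch using Python’s built-in sqlite3 module. The `person` table, its columns, and the sample values are all made up for illustration (I have no idea what the real schema looks like). It only shows that when the key is the (SSN, birth date) pair, the same SSN can legitimately show up on more than one row.

```python
import sqlite3

# Hypothetical schema for illustration only; not the real SSA layout.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE person (
        ssn        TEXT NOT NULL,
        birth_date TEXT NOT NULL,
        name       TEXT,
        PRIMARY KEY (ssn, birth_date)  -- composite key: the pair must be unique
    )
""")

# Same (deliberately fake) SSN, different birth dates: allowed,
# because uniqueness is enforced on the pair, not on ssn alone.
conn.execute("INSERT INTO person VALUES ('000-12-3456', '1940-03-01', 'A')")
conn.execute("INSERT INTO person VALUES ('000-12-3456', '1985-07-12', 'B')")

# Repeating the exact same pair is what the database rejects.
try:
    conn.execute("INSERT INTO person VALUES ('000-12-3456', '1940-03-01', 'C')")
    pair_rejected = False
except sqlite3.IntegrityError:
    pair_rejected = True

# A naive per-SSN count reports a "duplicate" even though no key is violated.
repeated_ssns = conn.execute(
    "SELECT ssn, COUNT(*) FROM person GROUP BY ssn HAVING COUNT(*) > 1"
).fetchall()

print(pair_rejected, repeated_ssns)
```

            So a query that just counts rows per SSN would flag this table, even though every row is unique as far as the database is concerned.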

            • @DahGangalang@infosec.pubOP
              link
              fedilink
              1
              2 months ago

              I think I was getting some wires crossed and/or misunderstood what geoff (parent commentor to my last comment) was saying, so my comment may be misdirected some.

              But according to the Social Security FAQ page, SSNs are not recycled, so that data (especially when compounded and hashed with other data) should be able to establish a one-to-one relationship between each primary key and an SSN. Having an SSN appear associated with multiple primary keys is therefore a concern.

              Other comments have pointed to other explanations for why SSNs could appear to occur multiple times, but those amount to “it appeared in a different field associated with the same primary key”. I think that’s the most likely explanation of things.

              • @jj4211@lemmy.world
                link
                fedilink
                7
                edit-2
                2 months ago

                Note that it being only part of a key is a technology choice that does not require reality to map to it. It may seem like overkill, but someone may not trust the political process to preserve that promise and so they add the birthdate, just in case something goes sideways in the future. Lots of technical choices are made anticipating likely changes and problems and designing things to be extra robust in the face of those.

                • Nate Cox
                  link
                  fedilink
                  English
                  2
                  2 months ago

                  Yeah this strikes me as safeguarding against a possible bad decision.

      • @DahGangalang@infosec.pubOP
        link
        fedilink
        5
        edit-2
        2 months ago

        A weak example would be my grandma. She was born before social security and was told as a kid she was born in 1938. Because I guess in the olden days, you just didn’t need to pass your birth certificate around for anything, it wasn’t until she went to get married at ~age 25 that she needed her birth certificate and when she got it, it actually said she was born in 1940 (I forget the actual years, but I remember it was a two year and two day gap between dates).

        It’s a weak example that should apply to only a microscopic portion of the population, but I could see her having some weird records in the databases as a result.

        Edit: brain dropped out and I forgot part of a sentence.

  • @vorb0te@lemmynsfw.com
    link
    fedilink
    2
    2 months ago

    He could also be referring to the mere possibility of having duplicates, which does not mean there are duplicates. And even then it could be by accident. Of course db design could prevent this. But I guess he is inflating the importance of this issue.

  • @Garlicsquash@lemmings.world
    link
    fedilink
    English
    15
    2 months ago

    Having never seen the database schema myself, my read is that the SSN is used as a primary key in one table, and many other tables likely use that as a foreign key. He probably doesn’t understand that foreign keys are used as links and should not be de-duplicated, as that breaks the key relationship in a relational database. As others have mentioned, even in the main table there are probably reused or updated SSNs that would then be multiple rows that have timestamps and/or Boolean flags for current/expired.
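
    A rough sketch of what I mean, using Python’s sqlite3 module. The `person` and `payment` tables and all values are hypothetical, purely to show the shape of the thing: one person row, many rows elsewhere that point back at it through the SSN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default

# Hypothetical tables: one row per person, many payment rows per person.
conn.executescript("""
    CREATE TABLE person (ssn TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE payment (
        payment_id INTEGER PRIMARY KEY,
        ssn        TEXT NOT NULL REFERENCES person(ssn),  -- foreign key
        amount     REAL
    );
    INSERT INTO person VALUES ('000-12-3456', 'A');
    INSERT INTO payment (ssn, amount) VALUES ('000-12-3456', 100.0);
    INSERT INTO payment (ssn, amount) VALUES ('000-12-3456', 100.0);
    INSERT INTO payment (ssn, amount) VALUES ('000-12-3456', 125.5);
""")

# The SSN is unique where it is the primary key...
people = conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]

# ...and repeats, by design, wherever it is used as a link.
rows_per_ssn = conn.execute(
    "SELECT ssn, COUNT(*) FROM payment GROUP BY ssn"
).fetchall()

# The link is what gets enforced: a payment for an unknown SSN is rejected.
try:
    conn.execute("INSERT INTO payment (ssn, amount) VALUES ('000-99-9999', 1.0)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(people, rows_per_ssn, orphan_rejected)
```

    “De-duplicating” the SSN column of `payment` here would throw away real payment rows, not clean anything up.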

    • @werefreeatlast@lemmy.world
      link
      fedilink
      5
      2 months ago

      If this is true, then by this time we are all fucked. Like Monday someone checks their banking or retirement and it’s all gone. That’s gonna be a crazy day.

      I hope they’re not using the actual SSN as the primary key. I hope it’s a big ass number that is otherwise unrelated.

  • @Geometrinen_Gepardi@sopuli.xyz
    link
    fedilink
    12
    2 months ago

    Rows in a SQL table have a primary key which works as the unique identifier for that row. The primary key can be as simple as an incrementing number.
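
    As a small sketch (Python’s sqlite3, with a made-up `account` table and made-up values), this is what an incrementing primary key looks like. The key identifies the row; the SSN is just another column and is free to repeat.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: the primary key is a plain incrementing number,
# unrelated to the SSN stored alongside it.
conn.execute("""
    CREATE TABLE account (
        account_id INTEGER PRIMARY KEY AUTOINCREMENT,
        ssn        TEXT NOT NULL,
        status     TEXT NOT NULL
    )
""")

# Two rows for the same (fake) SSN, e.g. an expired record and a current one.
conn.executemany(
    "INSERT INTO account (ssn, status) VALUES (?, ?)",
    [("000-12-3456", "expired"), ("000-12-3456", "current")],
)

rows = conn.execute(
    "SELECT account_id, ssn, status FROM account ORDER BY account_id"
).fetchall()
print(rows)
```

    Each row gets its own `account_id`, so the table has no duplicate keys even though the SSN column contains the same value twice.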

      • knightly the Sneptaur
        link
        fedilink
        8
        edit-2
        2 months ago

        Not unless the data associated with that SSN is itself inconsistent.

        For example, when multiple people are fraudulently using the same SSN, the fraud monitoring DB would necessarily need to record several entries with the same SSN.
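
        A toy version of that in Python’s sqlite3, assuming a hypothetical `fraud_report` table (names and values invented for illustration). Here the repeated SSN is the whole point of the table, and the “duplicates” are exactly what an investigator would query for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical fraud-monitoring table: one row per reported use of an SSN.
conn.execute("""
    CREATE TABLE fraud_report (
        report_id    INTEGER PRIMARY KEY,
        ssn          TEXT NOT NULL,
        claimed_name TEXT NOT NULL
    )
""")
conn.executemany(
    "INSERT INTO fraud_report (ssn, claimed_name) VALUES (?, ?)",
    [
        ("000-12-3456", "Alice"),
        ("000-12-3456", "Bob"),    # same SSN, different claimed identity
        ("000-65-4321", "Carol"),
    ],
)

# SSNs claimed by more than one identity: the repeats are the signal.
flagged = conn.execute("""
    SELECT ssn, COUNT(DISTINCT claimed_name) AS identities
    FROM fraud_report
    GROUP BY ssn
    HAVING COUNT(DISTINCT claimed_name) > 1
""").fetchall()
print(flagged)
```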

        • @DahGangalang@infosec.pubOP
          link
          fedilink
          2
          2 months ago

          Ah the old “malware detectors have the selectors for malware and so they show up as malware to other malware detection systems” problem.

          Yeah, that seems like a reasonable case to have duplicate SSNs.