@CodeInvasion

CodeInvasion@sh.itjust.works · 4 months ago

You do realize that everything posted on the Fediverse is open and publicly available? It’s not locked behind some API or controlled by any one company or entity.

The Fediverse is the Wikipedia of social media, and any researcher or engineer, including myself, can and will use Lemmy data to create AI datasets with absolutely no restrictions.

CodeInvasion@sh.itjust.works · 5 months ago

To add to this insight, there are many recent publications showing the dramatic improvements of adding another modality like vision to language models.

While this is my conjecture that is loosely supported by existing research, I personally believe that multimodality is the secret to understanding human intelligence.

CodeInvasion@sh.itjust.works · edit-2 5 months ago

I am an LLM researcher at MIT, and hopefully this will help.

As others have answered, LLMs have only learned the ability to autocomplete given some input, known as the prompt. Functionally, the model is strictly predicting the probability of the next word⁺ (technically a token), with some randomness injected so the output isn’t exactly the same for any given prompt.
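To make the "probability plus randomness" part concrete, here is a minimal sketch of sampling a next token. The scores in `toy_logits` and the `temperature` value are made up for illustration; a real model would produce scores over its whole vocabulary.

```python
import math
import random

def sample_next_token(logits, temperature=1.0, rng=None):
    """Pick one token from a dict of token -> raw score (logit)."""
    rng = rng or random.Random()
    tokens = list(logits)
    # Softmax with temperature: a lower temperature sharpens the distribution.
    scaled = [logits[t] / temperature for t in tokens]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]
    # The random draw is the "injected randomness": same prompt, varying output.
    choice = rng.choices(tokens, weights=probs, k=1)[0]
    return choice, dict(zip(tokens, probs))

# Hypothetical scores a model might assign after the prompt "The sky is"
toy_logits = {"blue": 3.0, "clear": 2.0, "falling": 0.5}
token, probs = sample_next_token(toy_logits, temperature=0.8, rng=random.Random(0))
```

The likeliest word usually wins, but not always, which is why two runs of the same prompt can differ.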

The probability of the next word comes from what was in the model’s training data, in combination with a very complex mathematical method to compute the impact of all previous words with every other previous word and with the new predicted word, called self-attention, but you can think of this like a computed relatedness factor.

This relatedness factor is very computationally expensive and grows quadratically with the number of words, so models are limited by how many previous words can be used to compute relatedness. This limitation is called the Context Window. The recent breakthroughs in LLMs come from the use of very large context windows to learn the relationships of as many words as possible.
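To put rough numbers on that cost: in standard self-attention every word is scored against every word in the window, so the number of scores is the window length squared. The sketch below computes those scores for three toy vectors. It is simplified on purpose (no learned query/key/value projections, no causal mask), and `toy_vectors` is invented for illustration.

```python
import math

def attention_weights(vectors):
    """Scaled dot-product attention weights for a list of equal-length vectors."""
    dim = len(vectors[0])
    weights = []
    for q in vectors:
        # One score per (query word, other word) pair: n * n scores in total.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in vectors]
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights

# Hypothetical 2-d word vectors: the first two point in similar directions.
toy_vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
weights = attention_weights(toy_vectors)

# Number of pairwise scores for a few context lengths.
score_counts = {n: n * n for n in (1_000, 2_000, 4_000)}
```

Doubling the context length quadruples the number of scores, which is why long context windows are expensive.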

This process of predicting the next word is repeated iteratively until a special stop token is generated, which tells the model to stop generating more words. So literally, the model builds entire responses one word at a time from left to right.
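The loop itself is simple. Here is a sketch with a lookup table standing in for the model; `TOY_MODEL`, `STOP`, and `toy_next_token` are hypothetical names, and unlike a real LLM this toy only looks at the last token rather than the whole context.

```python
STOP = "<stop>"

# Hypothetical stand-in for a model: maps the last token to the next one.
TOY_MODEL = {"the": "cat", "cat": "sat", "sat": "down", "down": STOP}

def toy_next_token(tokens):
    """Return the next token given everything generated so far."""
    return TOY_MODEL.get(tokens[-1], STOP)

def generate(prompt_tokens, next_token_fn, max_new_tokens=20):
    """Append one token at a time until the stop token (or a length cap)."""
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        nxt = next_token_fn(tokens)
        if nxt == STOP:
            break
        tokens.append(nxt)  # the new token becomes part of the next input
    return tokens

output = generate(["the"], toy_next_token)
```

Each generated token is fed back in as input, so the model can only build on what has already been written.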

Because all future words are predicated on the previously stated words in either the prompt or subsequent generated words, it becomes impossible to apply even the most basic logical concepts, unless all the components required are present in the prompt or have somehow serendipitously been stated by the model in its generated response.

This is also why LLMs tend to work better when you ask them to work out all the steps of a problem instead of jumping to a conclusion, and why the best models tend to rely on extremely verbose answers to give you the simple piece of information you were looking for.

From this fundamental understanding, hopefully you can now reason about the LLM limitations in factual understanding as well. For instance, if a given fact was never mentioned in the training data, or an answer simply doesn’t exist, the model will make it up, inferring the next most likely word to create a plausible sounding statement. Essentially, the model has been faking language understanding so much that even when the model has no factual basis for an answer, it can easily trick an unwitting human into believing the answer to be correct.

—-

⁺more specifically these words are tokens which usually contain some smaller part of a word. For instance, understand and able would be represented as two tokens that when put together would become the word understandable.
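As a rough sketch of that splitting, here is a greedy longest-match tokenizer over a tiny made-up vocabulary. Real tokenizers (byte-pair encoding, for example) learn their vocabulary from data, so `TOY_VOCAB` and this matching rule are only stand-ins.

```python
def tokenize(word, vocab):
    """Greedy longest-match split of a word into vocabulary pieces."""
    pieces, start = [], 0
    while start < len(word):
        # Try the longest remaining substring first, then shrink.
        for end in range(len(word), start, -1):
            if word[start:end] in vocab:
                pieces.append(word[start:end])
                start = end
                break
        else:
            # No vocabulary piece matches: fall back to a single character.
            pieces.append(word[start])
            start += 1
    return pieces

# Hypothetical vocabulary for illustration only.
TOY_VOCAB = {"understand", "able", "under", "stand"}
pieces = tokenize("understandable", TOY_VOCAB)
```

Here `pieces` comes out as two tokens that join back into the original word.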

CodeInvasion@sh.itjust.works · 5 months ago

Agreed.

Nevertheless, the Federal regulators will have an uphill battle as mentioned in the article.

Neither “puffery” nor “corporate optimism” counts as fraud, according to US courts, and the DOJ would need to prove that Tesla knew its claims were untrue.

The big thing they could get Tesla on is the safety record for autosteer. But again there would need to be proof it was known.

CodeInvasion@sh.itjust.works · 5 months ago

I am a pilot and this is NOT how autopilot works.

There are some autoland capabilities in the larger commercial airliners, but autopilot can be as simple as a wing-leveler.

The waypoints must be programmed by the pilot in the GPS. Altitude is entirely controlled by the pilot, not the plane, except when on a programmed instrument approach, and only when it captures the glideslope (so you need to be in the correct general area in 3D space for it to work).

An autopilot is actually a major hazard to the untrained pilot and has killed many, many untrained pilots as a result.

Whereas when I get in my Tesla, I use voice commands to say where I want to go and nowadays, I don’t have to make interventions. Even when it was first released 6 years ago, it still did more than most aircraft autopilots.

CodeInvasion@sh.itjust.works · 7 months ago

“If it wasn’t hard, it wouldn’t be worth doing”

CodeInvasion@sh.itjust.works · 8 months ago

AFAIK, there’s nothing stopping any company from scraping Lemmy either. The whole point of reddit limiting API usage was so they could make money like this.

Outside of morals, there is nothing to stop anybody from training on data from Lemmy just like there’s nothing stopping me from using Wikipedia. Most conferences nowadays require a paragraph on ethics in the submission, but I and many of my colleagues would have no qualms saying we scraped our data from open source internet forums and blogs.

CodeInvasion@sh.itjust.works · 8 months ago

I’m convinced that we should use the same requirements to fly an airplane as driving a car.

As a pilot, there are several items I need to log at regular intervals to remain proficient so that I can continue to fly with passengers or fly under certain conditions. The biggest one is the need for a Flight Review every two years.

If we did the bare minimum and implemented a Driving Review every two years, our roads would be a lot safer, and far fewer people would die. If people cared as much about driving deaths as they did flying deaths, the world would be a much better place.

CodeInvasion@sh.itjust.works · 9 months ago

I hate that I am defending Israel when I say this because what is occurring in Gaza is tragic, but a lot of people are confusing “Genocide” for perceived “War Crimes” as defined by international law and also confusing “Hamas” for “Palestine” or the “Palestinian Authority”.

Hamas is a terrorist government (similar in nature to the Taliban) that receives a lot of external funding from countries that actively wish to see the death of Israel and all Jews, making Hamas the chief perpetrators of Genocide in this conflict despite how ineffective they have been in their goals.

Israel was attacked by this terrorist government, and is now defending itself with the expressed war goal of destroying Hamas. While Israel has had a tenuous relationship with the Palestinian people (namely the government’s active efforts to limit the Palestinian Authority and drag their feet on granting the PA more autonomy and their own state, which is deplorable and inexcusable), they do not and have not wished to kill an entire culture of people.

Complicating matters, Hamas commonly employs warfare techniques that go against the Geneva Convention like placing government and military headquarters in basements of protected buildings like Hospitals and places of worship. The moment they do that, and abuse those internationally recognized sanctuaries, they become legitimate military targets leading to the tragic deaths of unwitting civilians.

People can object to the war on the grounds that war is tragic and results in many civilian casualties, but to make meritless claims is detrimental to both international institutions and to the definition of a Genocide. South Africa calls what Israel is doing a genocide, but also explicitly looks the other way with Ukraine and continues to forge close ties with Putin? (For the record, Russia’s actions in Ukraine are also not considered genocide under its strict international definition, but they have been found guilty of war crimes).

Israel has an internationally recognized right to defend itself, and it is doing that by dismantling Hamas through force. The Palestinian people are unfortunately caught in the crossfire. With that said, Israel’s methods to this end are not above criticism, and they have faced pressure from the US and Biden to limit civilian casualties wherever possible, and use ground forces to directly attack Hamas rather than relying on airstrikes that have resulted in many innocent deaths.

For those reading who think all war is bad, I’ll leave you with this quote from John Stuart Mill:

War is an ugly thing, but not the ugliest of things: the decayed and degraded state of moral and patriotic feeling which thinks that nothing is worth a war, is much worse. When a people are used as mere human instruments for firing cannon or thrusting bayonets, in the service and for the selfish purposes of a master, such war degrades a people. A war to protect other human beings against tyrannical injustice; a war to give victory to their own ideas of right and good, and which is their own war, carried on for an honest purpose by their free choice, — is often the means of their regeneration. A man who has nothing which he is willing to fight for, nothing which he cares more about than he does about his personal safety, is a miserable creature who has no chance of being free, unless made and kept so by the exertions of better men than himself. As long as justice and injustice have not terminated their ever-renewing fight for ascendancy in the affairs of mankind, human beings must be willing, when need is, to do battle for the one against the other.

CodeInvasion@sh.itjust.works · 11 months ago

I’m an AI researcher at one of the world’s top universities on the topic. While you are correct that no AI has demonstrated self-agency, it doesn’t mean that it won’t imitate such actions.

These days, when people think AI, they are mostly referring to Language Models as these are what most people will interact with. A language model is trained on a corpus of documents. In the case of Large Language Models like ChatGPT, they are trained on just about any written document in existence. This includes Hollywood scripts and short stories concerning sentient AI.

If put in the right starting conditions by a user, any language model will start to behave as if it were sentient, imitating the training data from its corpus. This could have serious consequences if not protected against.

CodeInvasion@sh.itjust.works · 1 year ago

I am a satellite software engineer turned program manager. This is not unexpected in this current environment; however, the conditions that created the environment are abnormal.

This solar cycle is much stronger than past cycles. I’m on mobile, so I can’t get a good screenshot, but you can go here to see this cycle and the last cycle, as well as an overlay of a normal cycle https://www.swpc.noaa.gov/products/solar-cycle-progression

As solar flux increases, the atmosphere expands considerably, causing more drag than predicted. During periods of solar minimum, satellites can remain in a very low orbit with minimal station keeping. However, at normal levels of solar maximum, 5 year orbits can easily degrade to 1 year orbits. Forecasters say we are still a year away from solar maximum, and flux is already higher than last cycle’s all time high (which was also an anomalously strong cycle). So it will get worse before it gets better.

TLDR: Satellites are falling out of the sky because the sun is angy

CodeInvasion@sh.itjust.works · 1 year ago

“Beep… Beep… Beep…” -Sputnik

CodeInvasion@sh.itjust.works · edit-2 1 year ago

The honeymoon phase for Lemmy is over. No one cares anymore about someone expressing themselves (I still do–this is based on my observations).

CodeInvasion@sh.itjust.works · 1 year ago

Tenet is criminally underrated at 69%. Easily one of my favorite movies of all time.

CodeInvasion@sh.itjust.works · 1 year ago

I don’t go around using that word because of how many people find it disrespectful. But, and I ask this out of honest curiosity, why is it offensive in the first place?

I see it as synonymous with ‘idiot’ or ‘stupid’ when used colloquially. The argument that it’s a medical term doesn’t really hold as ‘idiot’ and ‘moron’ are also medical terms that refer to a lack of intellectual acuity. In many ways ‘retarded’ has the same meaning both colloquially and medically. To be mentally retarded is to be mentally slowed or lacking that similar mental acuity that ‘idiot’ or ‘moron’ convey.

Retarded just means slow and it’s a perfectly apt description. Where I think people get confused is when retardation is linked with a specific attribute like physical retardation or emotional retardation, those convey very different meanings.

I’m not saying that we should start using it again, but that I find it odd how society has latched onto a very specific word and labelled it as bad in the matter of a decade. At the end of the day, any word that can be used to insult or demean is rude. It’s not the word being used, it’s what is meant by it. The term ‘Cis-gender’ is also being used in a highly exclusionary way and oftentimes is conveyed as an insult. However, its real meaning is not insulting in the least.

CodeInvasion@sh.itjust.works · 1 year ago

The only upside I can think of is they’d actually start caring about the planet instead of thinking they’ll be dead in 100 years anyway.

CodeInvasion@sh.itjust.works · 1 year ago

Do pilots count?