Do you really think the guy branding thousands of people working at Meta as basically pedophiles can really be said to care about "solidarity"? I certainly wouldn't consider someone a peer if they randomly go and call me a pedophile because of where I work. I'm sure 95% of people working there have 0 relation to the algorithm decisions and definitely have no particular fixation on giving teenage girls depression.
Material outcomes are the only thing that matters, not personal wants or wishes.
If you worked as an accountant for Epstein after 2006, then yeah you may not be exactly a pedophile but you have shown you're okay with not only working with pedophiles but having a pedophile actively enrich you.
Very few people on this planet are willing to actually do this. Those at Meta are willing to do exactly this.
Also solidarity is a two-way street, and acting as if the literal 1% of earners in the nation will suddenly develop empathy is foolish. I have solidarity for actual workers trying to better society; not for those that exacerbate the climate crisis, help minorities get cancer through data center expansions, personally profit off of scamming seniors, are okay with allowing a genocide on their watch, do nothing to help prevent the erosion of democracy, and have directly caused many suicides/self harm/deaths of despair.
They have to commit to a lifetime of repenting, and self-flagellation isn't going to cut it. It's not the Middle Ages.
Many people do have unlimited budgets because their work pays for it. Just because the latest SOTA model isn't something most consumers can afford for personal use doesn't mean it's not worth releasing or discussing. I would guess that the vast majority of software that benefits from being on the bleeding edge is developed by people working at companies on API pricing.
No, that's not what it means at all even if just doing it purely in math terms. Really it is just a reasonable amount to cap at to stop the long tail of super spenders (tokenmaxxers). You could also call it "the amount of AI spend after which Uber has decided there is diminishing returns for the average engineer".
The target market for stacked PRs is ICs who don't have much decision-making power and, let's be real, do not care too much about the look and feel of a "launch site" for the feature. It's also something few people, if any, are making a purchasing decision over.
Target market for copilot includes people with actual purchasing power and also many new users where this is an actual make or break feature. So this is worth the investment into design while stacked PRs is questionable. I actually question why they bothered with anything more than a blog post at all for stacked PRs (looking at the post it doesn't seem like too too much more than a blog post though).
Claude is so in demand at the moment that there aren't really volume discounts. Anthropic sets the terms and you either accept them or get lost; they have that much of a lead (mindshare/desirability-wise).
Bitflips specifically may not be; things like network issues, noisy neighbors, row/rack/host maintenance (leading to a downed and migrated host) absolutely are things that happen at high frequency at scale and cause your background level of errors to be more than 0.
This was always a little weird to me because Microsoft internally is actively hostile to cross-org collaboration. If you worked in most of Azure you basically have 0 lanes of communication with someone from the Windows team and vice versa. Triply so for stuff like Kusto or Teams which you'd be dogfooding daily. I guess if there's a horrible stop the world bug it'd get surfaced through telemetry but normal user feedback is not a thing.
Compared to working at other big techs, where I was able to direct msg the engineers on the team for internal protobuf or datalake services, in addition to user groups that were generally responsive, it was just strange. Also Microsoft doesn't have a monorepo, so you can't just commit patches to their service because you don't have access to their repos, which I pretty regularly do elsewhere.
> Microsoft internally is actively hostile to cross-org collaboration
The Copilot CLI has ushered in the beginning of a change in this dogma -- I've helped dozens of Microsoft engineers get access to GitHub source code so they can contribute to Copilot CLI! It's fun to subvert expectations when a Microsoft IC pitches an improvement and I can respond with "submit a PR!"
Calling #2 more sustainable has no basis in reality, it's just a feeling. It's like saying that clothing before the loom or farming before the tractor were "more sustainable". No, it isn't, it just appeals to yeoman farmer instincts that somehow technology=bad when it's what powers (and sustains) our modern world of 8 billion people.
#1 may well put #2 out of a living but that isn't the same as stealing and doesn't (at least in and of itself) make it unsustainable. The fact that models were trained on scraped content isn't a matter of technical necessity but rather the path of least resistance (lowest cost in this case). Synthetic data is increasingly used for reasons of quantity, quality, and various technical considerations.
All of the major players in AI currently, literally stole to build their models. There isn’t one out there that hasn’t. So yes, it is the same as stealing because they were LITERALLY, in the literal sense, stealing.
Well, pirated. Piracy and stealing aren't the same thing.
Regardless, I acknowledged the general issue. However I pointed out that doing so was not a technical necessity. If you base your worldview or actions around X implying Y but then it turns out that actually Y was merely a matter of convenience you're probably going to arrive at a wrong conclusion.
There's also the issue where you're emphatically calling it stealing without providing clear criteria. The legal system as a whole has yet to conclusively resolve the various piracy accusations. The legality of consuming publicly available content remains quite controversial.
It absolutely is a technical necessity. You couldn't build a model from scratch today without doing the same thing. And every model attempting to train on AI generated output degrades into nonsense almost immediately.
There’s a reason Reddit is making millions of dollars letting these companies mine their human generated content. You think OpenAI or anyone else would pay for that if they could just cyclically train on AI generated content???
I said nothing about that. Good synthetic data does not (typically) involve ML algorithms. Although that might be changing.
I'll politely suggest that you go read the literature before engaging further.
Reddit, Twitter, and similar are valuable because the data covers current events. Their content makes up a reasonably comprehensive timeline of the world at large. You don't need that to train a barebones functional model but it's certainly useful in order to train a knowledgeable one. Regardless, if they're charging for access it clearly isn't piracy so it doesn't seem like your original objection would hold any water in that case.
> I'll politely suggest that you go read the literature before engaging further.
Which commercial AI vendor has not stolen any content when creating their models? I’ll wait.
Which commercial AI vendor has created their models exclusively training on datasets created by other AI?
> Regardless, if they're charging for access it clearly isn't piracy so it doesn't seem like your original objection would hold any water in that case.
Given that they were previously violating the site’s terms of service when scraping the content: yes, they were absolutely stealing.
It's sustainable in the literal sense, i.e. a tailor can simply tailor forever without needing to constantly worry about keeping up with new tools or technologies, or needing to upgrade or change their methodology constantly.
The tech world is obsessed with moving fast and breaking things, and you can't just do the same thing forever and expect it to always work.
Think about how much food we throw away in the developed and developing worlds. How often we buy new clothing when we could mend old clothing. How often we ask for more when we could do with less. How often we want to eat at a restaurant when we could make leftovers. How often we want something sweet when we could just eat something bland. How often we heat and cool our homes when we could wear more or less clothing.
It turns out that while these are all truisms, nobody wants to fix them. Developed countries are okay passing Pigovian taxes, to a limited extent, to help fix these problems. Developing countries are even less interested in fixing these problems. It turns out that austerity is incredibly unpopular. Everyone wants to tell other people not to do the things they don't like but nobody wants to listen to what other people tell them not to do.
Just a reminder that Europe colonized Asia, Africa, and the Americas in the search for spices. Later on the interest changed to tea. Literally the only thing that Europe wanted was better tasting food and drink (initially at least). By the time the potato had become widespread, they could have had enough calories to feed the continent, and yet the desire for flavor is what led to untold misery for hundreds of years for millions of people.
We need to be realistic about what works and what doesn't. Austerity never wins.
“More sustainable” than burning hydrocarbons to produce chatbot tokens. Humanity could sustain itself on those resources much longer if we were more careful with them. The very definition of sustainability.
The bigger issue is that the AI we currently have available has been produced by setting a giant pile of money on fire with no real plan for ever earning it back. And that is not sustainable because at some point the checks dry up. We have zero idea what the economics of unsubsidized inference are because we still don't actually know how much money the frontier labs are currently losing.
One of my favorite things to come out of Alice in Wonderland is the Red Queen's Race, which has been used as a metaphor in many many fields for the concept of needing to work as hard as you can just to keep up with others, not even to get ahead.
> "Well, in our country," said Alice, still panting a little, "you'd generally get to somewhere else—if you ran very fast for a long time, as we've been doing."
> "A slow sort of country!" said the Queen. "Now, here, you see, it takes all the running you can do, to keep in the same place. If you want to get somewhere else, you must run at least twice as fast as that!"