Past Yonder

The Path is Past Yonder


  • Reaching the end of the Internet

    In a 2002 television commercial for DirecTV’s blazing-fast DSL Internet service, a man is seen aimlessly browsing the Internet, clicking link after link. Suddenly, his computer informs him that he has “reached the end of the Internet” and has “seen everything there is to see.” He stares at the screen in amazement.

    YouTube video of a 2002 DirecTV DSL commercial.

    Since the late 1990s, jokesters have set up numerous web pages that purport to be the last page of the Internet. Of course, the notion that any human could visit every on-line page is preposterous: according to WorldWideWebSize.com, which attempts to track the daily estimated size of the web, the Internet contained at least 5 billion web pages as of April 6, 2024.

    But even though AI might have difficulty re-creating the human hand, it can click through web pages really fast.

    In a recent story for The Wall Street Journal, Deepa Seetharaman writes that AI companies are running out of training data after burning through the entire Internet.

    Companies racing to develop more powerful artificial intelligence are rapidly nearing a new problem: The internet might be too small for their plans.

    Ever more powerful systems developed by OpenAI, Google and others require larger oceans of information to learn from. That demand is straining the available pool of quality public data online at the same time that some data owners are blocking access to AI companies.

    Some executives and researchers say the industry’s need for high-quality text data could outstrip supply within two years, potentially slowing AI’s development.

    While companies like OpenAI don’t disclose the exact sources of their training data, it is believed that they vacuum up on-line content such as research papers, news articles, and Wikipedia pages. They also likely ingest content from social media platforms, although much of that content is “walled off” from automated bots.

    That doesn’t stop those social media platforms from utilizing the data themselves: the WSJ story notes that Meta CEO Mark Zuckerberg recently touted his company’s access to hundreds of billions of publicly shared images and videos across Facebook and Instagram, creating a rather robust training set for Meta’s own ambitious AI goals.

    As AI companies begin to run out of training data on the Internet, they’re turning to synthetically generated data, but that practice is controversial.

    In a story for Futurism, Noor Al-Sibai touches on some of the challenges of using synthetic data sets.

    Synthetic data, meanwhile, has been the subject of ample debate in recent months after researchers found last year that training an AI model on AI-generated data would be a digital form of “inbreeding” that would ultimately lead to “model collapse” or “Habsburg AI.”

    Habsburg AI refers to the House of Habsburg (or House of Austria), one of the most prominent dynasties in European history. Seeking to consolidate their power, the Habsburgs relied on frequent consanguine marriages between closely related individuals. This inbreeding narrowed the gene pool and is believed to have led to health ailments such as epilepsy, insanity, and early death.

    Some are now predicting that AI models may suffer a similar fate as they begin to train against data they themselves created, recycling biases and errors and making it increasingly difficult to check the results for reliability.

    In a 2023 research paper titled The Curse of Recursion: Training on Generated Data Makes Models Forget, researchers caution about an effect called Model Collapse that could occur when large language models are trained on data produced by other AIs instead of by humans.
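    The mechanism behind model collapse can be illustrated with a toy simulation. The sketch below (an illustration of the general idea, not the paper’s actual experiment) repeatedly fits a Gaussian to samples drawn from the previous generation’s fitted Gaussian, with each generation slightly under-representing the rare tail samples, as the paper observes models tend to do. The distribution’s spread shrinks generation after generation until diversity is lost.

```python
import random
import statistics

def fit_and_resample(mu, sigma, n=500, keep=0.9):
    """One 'generation': sample from the current model, drop the most
    extreme tail samples (mimicking how generative models tend to
    under-represent rare data), then refit a Gaussian to what remains."""
    samples = sorted(random.gauss(mu, sigma) for _ in range(n))
    cut = int(n * (1 - keep) / 2)       # trim 5% from each tail
    kept = samples[cut:n - cut]
    return statistics.fmean(kept), statistics.stdev(kept)

random.seed(0)
mu, sigma = 0.0, 1.0                    # generation 0: the "real" data
history = [sigma]
for gen in range(10):
    mu, sigma = fit_and_resample(mu, sigma)
    history.append(sigma)

# The standard deviation decays toward zero: later "models" can only
# reproduce an ever-narrower slice of the original distribution.
print([round(s, 3) for s in history])
```

    Each refit loses a little tail information, and because the next generation trains only on the previous generation’s output, the loss compounds rather than averaging out.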

  • AI image generators could use a hand

    In an episode of the TV sitcom Seinfeld, George Costanza almost becomes a hand model after an agent notices his “beautiful” hands. This exciting career possibility is abruptly cut short after his hands meet the heat of a clothes iron, causing his modeling dreams to slip from his (charred) fingertips.

    Ironically, the fictional character George Costanza is perhaps the most famous (almost) hand model. While fashion models like Cindy Crawford, Kate Moss, and Tyson Beckford rose to the level of household names, the more specialized field of hand modeling has yet to produce an equivalent star.

    And that may be part of the reason why AI image generators aren’t so good at drawing human hands.

    To be fair, image generators such as OpenAI’s DALL-E 3 have gotten pretty good at drawing humans, after training against countless snapshots of the Homo sapiens species.

    But hands are a small part of the human body, and a rather complicated one at that. They don’t tend to be the focus of photo shoots or selfies, so in training sets consisting of people, they play a small role. In fact, many AI models have focused on faces, as that tends to be the first thing people look at. (At least the non-introverted ones.) Modern digital cameras also prioritize focusing on a subject’s face and eyes, paying little attention to extremities such as hands or feet. As the old saying (often attributed to Shakespeare) goes, the eyes are the window to the soul.

    Even when hands are featured prominently in a photo, they can be much more dynamic than, say, a nose. Each human hand consists of 27 bones, 34 muscles, and over 100 ligaments and tendons, making it a rather complex part of our anatomy. This enables hands and fingers to move and bend in countless ways. And that trips up AI systems. Consider the vastly different appearances of a hand clenched in a fist, a thumb rising from the palm of a hitchhiker, or two people interlacing their fingers as they walk down a street. And hands may disappear entirely as they are slipped into pockets, or into the comfort of warm gloves.

    AI systems don’t have an inherent understanding of human anatomy. They don’t know what directions fingers can (or can’t) bend in, nor do they have insight into the different emotions represented by various hand gestures. They focus on identifying patterns in large amounts of data, and those patterns can be elusive when it comes to hands.

    Humans are also really good at noticing when something is off about a hand, making even minor errors easy to spot. Because they are such an important part of our bodies, we’re more likely to notice an unusually long or bent finger than we are to notice shoelaces that don’t quite interlace correctly on a pair of shoes.

    But with each update to AI image generators, their hand-drawing skills continue to improve. This is likely due to a focus on selecting training sets that include clear photos of human hands, while eliminating sets where the hands are partially obscured. More accurate hands are within AI’s grasp.

  • AI worker shortage leads to talent war, high salaries

    Unemployment remains below 4% in the United States, but workers in the tech industry are still experiencing layoffs at rates greater than in other professions.

    According to an NBC News story, heading into 2024, tech remains one of the few soft spots in an otherwise strong labor market.

    Brian Cheung quotes one tech worker:

    “There was a time when working in tech seemed like the most stable career you could have,” said Ayomi Samaraweera, who was laid off as chief of staff at the content creator platform Jellysmack in December 2022. After about 10 years in the industry, she said, “tech does not seem safe and secure.”

    There is one area in tech that is bucking this trend, however. Software and hardware engineers versed in AI are finding employers desperate to hire – or retain – them.

    Gareth Vipers and Kimberley Kao from the Wall Street Journal report that Elon Musk has recently raised the salaries of Tesla’s AI engineers.

    Tesla is raising compensation for its artificial intelligence engineers in a bid to ward off poaching from the likes of OpenAI, Chief Executive Elon Musk said.

    Musk said his electric-vehicle company is boosting pay at a time when OpenAI has been “aggressively recruiting Tesla engineers with massive compensation offers,” in a series of posts on social-media platform X late Wednesday.

    The competition for AI engineers “is the craziest talent war I’ve ever seen,” he said.

    Futurism’s Maggie Harrison Dupré reports that Meta has experienced a steady exodus of its top AI talent in recent months, leading CEO Mark Zuckerberg to send personally written recruiting e-mails to AI staffers at competitor Google. During March 2024, Meta AI experts Devi Parikh, Abhishek Das, and Erik Meijer all left the company to pursue other opportunities, creating a talent vacuum Zuckerberg is undoubtedly trying to fill.

    Meijer, who served as Meta’s Director of Engineering, wrote on X that he’s “more bullish than ever about Meta with the company’s increased focus on AI,” but seemed to steer budding AI engineers away from larger companies.

    “Given the incredible competitive pressure in the field, there is really no advantage to be inside a large corp if you want to build cool stuff on top of [Large Language Models],” Meijer wrote.

    Yet smaller companies may be at a disadvantage compared with the likes of Microsoft, Google, and Meta, as only a few companies can afford the massive cloud infrastructure necessary to work with the largest models.

    MIT Sloan Management Review described this scale disadvantage in a story last summer. Yannick Bammens and Paul Hünermund write:

    Deep pockets, access to talent, and massive investments in computing infrastructure only partly explain why most major breakthroughs in artificial intelligence have come from a select group of Big Tech companies that includes Amazon, Google, and Microsoft. What sets the tech giants apart from the many other businesses seeking to gain an edge from AI are the vast amounts of data they collect as platform operators.

    Whether AI engineers choose to work for large or small companies, they can expect to find generous job offers. The Daily Mail reports that software engineers with specialized training in AI are seeing average wages $100,000 higher than those of their non-AI engineering peers. And depending on experience, those salaries can approach seven figures.

    James Cirrone writes:

    Tech companies are willing to pay top dollar – up to $1 million or more – to poach talented software engineers with experience in generative artificial intelligence.

    Employees who know how to work with large language models and semiconductor chips – which is the technology undergirding popular apps like ChatGPT – are becoming a rarity in the job market, according to executives. 

    While engineers who have the skills to create systems like ChatGPT are in strongest demand, Business Insider reports that companies are also interested in hiring non-software developers who understand the technology and how to apply it across a range of industries.

    Aaron Mok writes:

    While many AI-related jobs posted on Indeed and LinkedIn are for software developers and machine learning engineers with advanced degrees, some don’t require a technical background. Organizations want to use AI tools in their workflows to boost productivity, save time, and make more money — but they also need workers who can link the very technical side and the business side.

    “Companies are desperate to get people figuring out AI for their organizations,” J.T. O’Donnell, a career coach at Work It Daily, told Business Insider.

    Business Insider identifies nine non-programming jobs in demand, including AI product managers, ethics specialists, sales engineers, business analysts, data annotators, prompt engineers, product designers, policy analysts, and sector specialists.

    There continues to be much debate about whether AI will be a net job creator or job destroyer, but for those who understand how to implement AI systems or effectively utilize them, the job market looks bright.

  • Want generative-powered search results? Soon, that might cost you

    Google is working to incorporate generative artificial intelligence into its traditional web search results, but Ars Technica reports that ads might not be enough to cover the costs of – and the need to profit from – these results.

    Kyle Orland writes:

    Under the proposed plan, Google’s standard search (without AI) would remain free, and subscribers to a paid AI search tier would still see ads alongside their Gemini-powered search results, according to the FT report. But search ads—which brought in a reported $175 billion for Google last year—might not be enough to fully cover the increased costs involved with AI-powered search.

    Since the days of AltaVista, users have been accustomed to performing their Internet searches for free, and search companies have competed for their queries. (Microsoft has even tried to pay users to use its Bing search engine via its Microsoft Rewards program.) But performing a generative AI computation is much more expensive (in terms of processor cycles and electricity) than a traditional web search.
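    The economics of that cost gap can be sketched with a back-of-envelope calculation. Every number below is an illustrative assumption for the sake of the arithmetic – not a figure from the Ars Technica or FT reporting – but it shows why even a modest per-query cost multiplier becomes enormous at search-engine scale.

```python
# Illustrative assumptions (not reported figures):
COST_PER_CLASSIC_QUERY = 0.0002   # assumed $ per traditional keyword search
LLM_COST_MULTIPLIER    = 10       # assume an LLM answer costs ~10x a search
QUERIES_PER_DAY        = 8.5e9    # assumed daily query volume

classic_daily = COST_PER_CLASSIC_QUERY * QUERIES_PER_DAY
generative_daily = classic_daily * LLM_COST_MULTIPLIER
extra_daily = generative_daily - classic_daily

print(f"classic search:    ${classic_daily:,.0f}/day")
print(f"generative search: ${generative_daily:,.0f}/day")
print(f"extra cost:        ${extra_daily:,.0f}/day "
      f"(~${extra_daily * 365 / 1e9:,.1f}B/year)")
```

    Under these assumed inputs, a 10x per-query cost turns a few hundred million dollars a year of serving cost into several billion – which is why a paid AI search tier starts to look attractive even to an ad giant.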