Slashdot Mirror


Researchers Are Training Image-Generating AI With Fewer Labels (venturebeat.com)

An anonymous reader shares a report: Generative AI models have a propensity for learning complex data distributions, which is why they're great at producing human-like speech and convincing images of burgers and faces. But training these models requires lots of labeled data, and depending on the task at hand, the necessary corpora are sometimes in short supply.

The solution might lie in an approach proposed by researchers at Google and ETH Zurich. In a paper [PDF] published on the preprint server arXiv.org ("High-Fidelity Image Generation With Fewer Labels"), they describe a "semantic extractor" that can pull out features from training data, along with methods of inferring labels for an entire training set from a small subset of labeled images. These self- and semi-supervised techniques together, they say, can outperform state-of-the-art methods on popular benchmarks like ImageNet.

"In a nutshell, instead of providing hand-annotated ground truth labels for real images to the discriminator, we ... provide inferred ones," the paper's authors explained. In one of several unsupervised methods the researchers posit, they first extract a feature representation -- a set of techniques for automatically discovering the representations needed for raw data classification -- on a target training dataset using the aforementioned feature extractor.

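To make the idea of inferring labels for a whole training set from a small labeled subset more concrete, here is a minimal Python sketch. It is not the paper's method: the nearest-centroid rule is a simple stand-in chosen for illustration, the function names `class_centroids` and `infer_labels` are hypothetical, and the 2-D vectors are toy values standing in for whatever a real feature extractor would output.

```python
from math import dist  # Euclidean distance, available since Python 3.8


def class_centroids(features, labels):
    """Average the feature vectors of each labeled class."""
    sums, counts = {}, {}
    for vec, label in zip(features, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, x in enumerate(vec):
            acc[i] += x
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def infer_labels(all_features, labeled_features, labeled_labels):
    """Give every feature vector the label of its nearest class centroid."""
    centroids = class_centroids(labeled_features, labeled_labels)
    return [
        min(centroids, key=lambda label: dist(vec, centroids[label]))
        for vec in all_features
    ]


# Toy stand-in for extractor output (assumed values, not real image features).
labeled_features = [[0.0, 0.1], [0.2, 0.0], [1.0, 0.9], [0.9, 1.1]]
labeled_labels = ["burger", "burger", "face", "face"]
unlabeled_features = [[0.1, 0.2], [0.8, 1.0], [0.3, 0.1]]

inferred = infer_labels(unlabeled_features, labeled_features, labeled_labels)
print(inferred)  # ['burger', 'face', 'burger']
```

In a setup like the one the summary describes, labels inferred this way would be handed to the discriminator in place of hand-annotated ones; how well that works depends entirely on how good the extracted features are.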

  1. Headline, meet story by tomhath · · Score: 1
    Headline:

    Researchers Are Training Image-Generating AI...

    Story:

    The solution might lie in an approach proposed by researchers at Google and ETH Zurich...

    The researchers aren't training anything. They just hypothesized that it might be possible to use AI to train AI. Then their heads exploded.

  2. So statistical classifiers fail on complex things? by gweihir · · Score: 1

    Who would have thought. Oh, right, I learned that about 30 years ago at university in my CS studies.

    --
    Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
  3. Algorithms all the way down by mrwireless · · Score: 2

    So we're training machine learning algorithms with data that was generated by machine learning algorithms?

    And we're using those algorithms in situations where we didn't have much data, which may often mean they are complex situations?

    This sounds like a bias-factory, breaking some kind of law of entropy.

  4. one step forward, two steps back by epine · · Score: 1

    In one of several unsupervised methods the researchers posit, they first extract a feature representation -- a set of techniques for automatically discovering the representations needed for raw data classification -- on a target training dataset using the aforementioned feature extractor.

    My comprehension regressed upon encountering this sentence.

  5. Firirre Turukcs by Tablizer · · Score: 1

    Those artificially generated fire trucks sure are funky looking. They immediately stand out as fire-trucks, but as you look more closely, they have weird details in weird spots, and duplicate things that shouldn't be duplicated in practice. iLSD or a transporter accident.

  6. Re:So statistical classifiers fail on complex things? by gweihir · · Score: 1

    This type of "AI" is really nothing more than a statistical classifier. Non-statistical approaches are different, but about as "intelligent".
