HK Businesswire
AI tool generates high-quality images faster than state-of-the-art approaches

by David Lee
21 March 2025
in Science

The ability to generate high-quality images quickly is crucial for producing realistic simulated environments that can be used to train self-driving cars to avoid unpredictable hazards, making them safer on real streets.

But the generative artificial intelligence techniques increasingly being used to produce such images have drawbacks. One popular type of model, called a diffusion model, can create stunningly realistic images but is too slow and computationally intensive for many applications. On the other hand, the autoregressive models that power LLMs like ChatGPT are much faster, but they produce poorer-quality images that are often riddled with errors.

Researchers from MIT and NVIDIA developed a new approach that brings together the best of both methods. Their hybrid image-generation tool uses an autoregressive model to quickly capture the big picture and then a small diffusion model to refine the details of the image.

Their tool, known as HART (short for hybrid autoregressive transformer), can generate images that match or exceed the quality of state-of-the-art diffusion models, but do so about nine times faster.

The generation process consumes fewer computational resources than typical diffusion models, enabling HART to run locally on a commercial laptop or smartphone. A user only needs to enter one natural language prompt into the HART interface to generate an image.

HART could have a wide range of applications, such as helping researchers train robots to complete complex real-world tasks and aiding designers in producing striking scenes for video games.

“If you are painting a landscape, and you just paint the entire canvas once, it might not look very good. But if you paint the big picture and then refine the image with smaller brush strokes, your painting could look a lot better. That is the basic idea with HART,” says Haotian Tang SM ’22, PhD ’25, co-lead author of a new paper on HART.

He is joined by co-lead author Yecheng Wu, an undergraduate student at Tsinghua University; senior author Song Han, an associate professor in the MIT Department of Electrical Engineering and Computer Science (EECS), a member of the MIT-IBM Watson AI Lab, and a distinguished scientist of NVIDIA; as well as others at MIT, Tsinghua University, and NVIDIA. The research will be presented at the International Conference on Learning Representations.

The best of both worlds

Popular diffusion models, such as Stable Diffusion and DALL-E, are known to produce highly detailed images. These models generate images through an iterative process where they predict some amount of random noise on each pixel, subtract the noise, then repeat the process of predicting and “de-noising” multiple times until they generate a new image that is completely free of noise.

Because the diffusion model de-noises all pixels in an image at each step, and there may be 30 or more steps, the process is slow and computationally expensive. But because the model has multiple chances to correct details it got wrong, the images are high-quality.

Autoregressive models, commonly used for predicting text, can generate images by predicting patches of an image sequentially, a few pixels at a time. They can’t go back and correct their mistakes, but the sequential prediction process is much faster than diffusion.

These models use representations known as tokens to make predictions. An autoregressive model utilizes an autoencoder to compress raw image pixels into discrete tokens as well as reconstruct the image from predicted tokens. While this boosts the model’s speed, the information loss that occurs during compression causes errors when the model generates a new image.

With HART, the researchers developed a hybrid approach that uses an autoregressive model to predict compressed, discrete image tokens, then a small diffusion model to predict residual tokens. Residual tokens compensate for the model’s information loss by capturing details left out by discrete tokens.

“We can achieve a huge boost in terms of reconstruction quality. Our residual tokens learn high-frequency details, like edges of an object, or a person’s hair, eyes, or mouth. These are places where discrete tokens can make mistakes,” says Tang.

Because the diffusion model only predicts the remaining details after the autoregressive model has done its job, it can accomplish the task in eight steps, instead of the usual 30 or more a standard diffusion model requires to generate an entire image. This minimal overhead of the additional diffusion model allows HART to retain the speed advantage of the autoregressive model while significantly enhancing its ability to generate intricate image details.

“The diffusion model has an easier job to do, which leads to more efficiency,” he adds.

Outperforming larger models

During the development of HART, the researchers encountered challenges in effectively integrating the diffusion model to enhance the autoregressive model. They found that incorporating the diffusion model in the early stages of the autoregressive process resulted in an accumulation of errors. Instead, their final design of applying the diffusion model to predict only residual tokens as the final step significantly improved generation quality.

Their method, which uses a combination of an autoregressive transformer model with 700 million parameters and a lightweight diffusion model with 37 million parameters, can generate images of the same quality as those created by a diffusion model with 2 billion parameters, but it does so about nine times faster. It uses about 31 percent less computation than state-of-the-art models.

Moreover, because HART uses an autoregressive model to do the bulk of the work, the same type of model that powers LLMs, it is more compatible for integration with the new class of unified vision-language generative models. In the future, one could interact with a unified vision-language generative model, perhaps by asking it to show the intermediate steps required to assemble a piece of furniture.

“LLMs are a good interface for all sorts of models, like multimodal models and models that can reason. This is a way to push the intelligence to a new frontier. An efficient image-generation model would unlock a lot of possibilities,” he says.

In the future, the researchers want to go down this path and build vision-language models on top of the HART architecture. Since HART is scalable and generalizable to multiple modalities, they also want to apply it for video generation and audio prediction tasks.

This research was funded, in part, by the MIT-IBM Watson AI Lab, the MIT and Amazon Science Hub, the MIT AI Hardware Program, and the U.S. National Science Foundation. The GPU infrastructure for training this model was donated by NVIDIA.
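For readers who want a concrete picture of the residual-token idea described in the article, the toy sketch below may help. It is not HART's code: the one-dimensional "latents", the five-entry codebook, and all function names are hypothetical, chosen only for illustration. It shows how snapping continuous values to a small set of discrete tokens loses detail, and how a residual term can restore it. In HART the residual is predicted by a small diffusion model; here it is simply computed from the true values to show what that model would be approximating.

```python
# Toy illustration only: NOT HART's tokenizer or model.
# All names and numbers below are hypothetical.

def quantize(values, codebook):
    """Map each continuous value to the index of its nearest codebook entry."""
    return [
        min(range(len(codebook)), key=lambda i: abs(codebook[i] - v))
        for v in values
    ]

def decode(tokens, codebook):
    """Reconstruct coarse values from discrete tokens."""
    return [codebook[t] for t in tokens]

def mean_abs_error(a, b):
    """Average absolute difference between two equal-length lists."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

latents = [0.12, 0.47, 0.81, 0.33]        # hypothetical continuous features
codebook = [0.0, 0.25, 0.5, 0.75, 1.0]    # hypothetical discrete vocabulary

# Step 1: the "big picture" as discrete tokens (the autoregressive part
# would predict these indices one after another).
tokens = quantize(latents, codebook)
coarse = decode(tokens, codebook)

# Step 2: the residual is whatever the discrete tokens left out.
# A small diffusion model would predict this; here we compute it exactly.
residual = [v - c for v, c in zip(latents, coarse)]
refined = [c + r for c, r in zip(coarse, residual)]

coarse_error = mean_abs_error(latents, coarse)
refined_error = mean_abs_error(latents, refined)

print("tokens:", tokens)
print("coarse error:", coarse_error)
print("refined error:", refined_error)
```

The coarse reconstruction carries a visible error because each value is rounded to the nearest codebook entry, while adding the residual brings the reconstruction back to the original values. Because the residual only has to cover this small gap, a model that predicts it has, in the article's words, "an easier job to do."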

© 2025 by HKBusinesswire.com
