AI Apple

Apple Acquires Startup That Uses AI To Compress Videos (techcrunch.com) 30

Apple has quietly acquired a Mountain View-based startup, WaveOne, that was developing AI algorithms for compressing video. From a report: Apple wouldn't confirm the sale when asked for comment. But WaveOne's website was shut down around January, and several former employees, including one of WaveOne's co-founders, now work within Apple's various machine learning groups. In a LinkedIn post published a month ago, WaveOne's former head of sales and business development, Bob Stankosh, announced the sale. "After almost two years at WaveOne, last week we finalized the sale of the company to Apple," Stankosh wrote. "We started our journey at WaveOne, realizing that machine learning and deep learning video technology could potentially change the world. Apple saw this potential and took the opportunity to add it to their technology portfolio." WaveOne was founded in 2016 by Lubomir Bourdev and Oren Rippel, who set out to take the decades-old paradigm of video codecs and make them AI-powered. Prior to joining the venture, Bourdev was a founding member of Meta's AI research division, and both he and Rippel worked on Meta's computer vision team responsible for content moderation, visual search and feed ranking on Facebook.
This discussion has been archived. No new comments can be posted.

  • From my simple understanding, this lets AI prioritize "important elements" (text and faces) over "non-essential" elements like scenery, without requiring new hardware or codecs. Am I right? So they are finding a way to show you lower-quality video on a poor connection without sacrificing the "important elements".

    That's neat, but can I turn it off?

    • by AmiMoJo ( 196126 ) on Monday March 27, 2023 @12:40PM (#63403868) Homepage Journal

      Feed in a low quality copy of the video. AI generates a high resolution version. Take the difference between the AI version and the real version, and compress it.

      The low res version and the compressed difference data will be smaller than compressing the full quality data. Could even be lossless if you want.

      • But then you’d have to transmit the low quality version and the difference. That’s like the whole thing.

        You could however run the same AI prediction on both ends and then transmit only the difference between the prediction and the next actual frame. That’s what I thought they’d do but it doesn’t seem to be the case.

      • You’d also need to transmit the trained AI model data every time, except if it’s static and can be considered part of the codec (think static vs dynamic dictionary). That’s why the Hutter Prize takes the decompressor size into account, it’s not compression if you “hide” the data somewhere else. Like “this AI can compress this 1 GB video down to just 200 KB.. using this 4 GB trained model data.”
    • You've spun it into anti-corporate outrage which is on point for /. but when it comes to compression, "cost savings" and "high quality" are two sides of the same coin.

      Codecs devote a lot of time to deciding where to spend bits. Have you ever looked at a scene and noticed compression artifacts floating around some important element? Maybe text on a document, or a figure cloaked in dark backdrop bokeh in a horror. That Game of Thrones episode in the final season that happened entirely at night and was almost

    • Comment removed based on user account deletion
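The low-quality-copy-plus-difference scheme discussed in this thread can be sketched in a few lines. This is only a toy under stated assumptions, not WaveOne's or Apple's method: `predict` is a hypothetical stand-in for the AI upscaler (it just repeats samples), a list of 8-bit values stands in for a video frame, and `zlib` stands in for the entropy coder. Both ends must run the same `predict`, which is why the reply above counts the model as part of the codec.

```python
import zlib


def downsample(samples, factor):
    """Low-quality copy: keep every `factor`-th sample."""
    return samples[::factor]


def predict(low, factor, length):
    """Stand-in for the AI upscaler: repeat each low-res sample `factor` times."""
    upscaled = [value for value in low for _ in range(factor)]
    return upscaled[:length]


def encode(samples, factor=4):
    """Return (compressed low-res copy, compressed residual)."""
    low = downsample(samples, factor)
    guess = predict(low, factor, len(samples))
    # Residual is stored modulo 256 so each value fits in one byte.
    residual = [(actual - g) % 256 for actual, g in zip(samples, guess)]
    return zlib.compress(bytes(low)), zlib.compress(bytes(residual))


def decode(low_blob, residual_blob, factor=4):
    """Rebuild the original exactly: same prediction plus the residual."""
    low = list(zlib.decompress(low_blob))
    residual = list(zlib.decompress(residual_blob))
    guess = predict(low, factor, len(residual))
    return [(g + r) % 256 for g, r in zip(guess, residual)]


# A smooth ramp stands in for one row of 8-bit pixels.
row = [i // 3 for i in range(768)]
low_blob, residual_blob = encode(row)
restored = decode(low_blob, residual_blob)
```

Because the residual is kept in full, `restored` equals `row`, so this variant is lossless. Whether the two blobs together beat compressing `row` directly depends entirely on how good the predictor is.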
  • AI that can focus more bits in places we pay more attention and use fewer bits in places that we don't pay attention is the future of video compression, especially for streaming and doubly so for videoconferencing. I don't care about the details on the leaves of the trees behind the person I'm talking to, I care about their facial expressions and their gestures.

    There are several AI video compression efforts out there, from DeepMind's MuZero [bbc.com] to NVIDIA's Maxine [nvidia.com] to smaller players like Deep Render [techcrunch.com] and Orange A [github.com]

    • AI that can focus more bits in places we pay more attention and use fewer bits in places that we don't pay attention

      Are you using a 28.8k dial-up modem or something? The few pixels you save are meaningless. Attention-based compression is Marketing Bullshit

    • The newest codecs ... AV1? have some areas that are open to decisions being decided by AI. Even h.264 has optimized presets for situations like slide shows or cartoons or black and white film or lower color spaces. Detectors not using AI could change optimization modes but maybe do a smarter job of it. Quality ranking is decided by amount of motion/change already and this can be perfect for AI -- I do not know if anybody has been using existing face detection algorithms (on cameras) in codecs to help ma

      • The newest codecs ... AV1? have some areas that are open to decisions being decided by AI.

        Even MPEG2 can do that. Codecs are specified in terms of the decoder: the bitstream must decode to the correct values, but you can create a decodable bitstream however you like.

        Even with JPEG there are things you can do. In the encoding of a block, you quantize the frequency components. Many of them go to zero. If all the end ones go to zero, there's a special token which indicates "the rest of the block is

      • >> What I would like to see is one which turns video into detailed text descriptions and from that back to video... with really interesting results! (like how they ignore some words) that is, after they get it to do motion and be consistent between frames... although that will be an interesting slow motion as each frame changes drastically. https://hackaday.com/?p=578041

        I think the best video compression will soon be AI driven, though probably not using this type of algorithm. It’s still cool

    • AI that can focus more bits in places we pay more attention and use fewer bits in places that we don't pay attention is the future of video compression

      So, where would it get training data to learn where people's focus would be in a video, I wonder?

      Interesting that pretty soon they are releasing a headset with integrated eye tracking where one promoted use is watching video...

      • by dhaen ( 892570 )

        Interesting that pretty soon they are releasing a headset with integrated eye tracking where one promoted use is watching video...

        By tracking eye movement they could save some data by not displaying the part of the image that would fall in each eye's blind spot.

        • by pz ( 113803 )

          The latencies involved and duration of fixations make this idea a losing proposition. I know it's all the hype these days, but gaze-contingent compression doesn't work unless (a) you can change the compression on the fly from one frame to the next and (b) you have a frame rate above 200 FPS or so. Even at 200 FPS, you can feel the latency. We do experiments that are almost exactly this idea in my lab.

    • It would be interesting if it used AI to fill in irrelevant details that humans are unlikely to care about while giving us detailed faces and other focal elements. I’m not totally sure they’re doing that though.
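The point made in this thread about JPEG (the decoder is fixed, but the encoder is free to choose how it quantizes) can be illustrated with a toy block encoder. This is a simplified sketch, not real JPEG: it assumes the coefficients are already in scan order, uses a single hypothetical `step` instead of a per-frequency quantization table, and emits a plain `"EOB"` marker where JPEG would emit an entropy-coded end-of-block symbol.

```python
def quantize(coeffs, step):
    """Divide each coefficient by the step and round to an integer."""
    return [round(c / step) for c in coeffs]


def encode_block(coeffs, step):
    """Quantize, then cut the block after its last nonzero value."""
    q = quantize(coeffs, step)
    last = max((i for i, v in enumerate(q) if v != 0), default=-1)
    # "EOB" stands for "everything after this point is zero".
    return q[:last + 1] + ["EOB"]


def decode_block(tokens, step, size=64):
    """Pad with zeros after the marker and undo the scaling."""
    values = tokens[:tokens.index("EOB")]
    values = values + [0] * (size - len(values))
    return [v * step for v in values]


# One 64-coefficient block: a few large low-frequency terms, then zeros.
block = [120, -33, 14, 6, -3, 2, 1] + [0] * 57

fine = encode_block(block, step=10)    # keeps more coefficients
coarse = encode_block(block, step=20)  # trailing values round to zero sooner
```

Both outputs decode with the same `decode_block`; the encoder alone decided how much detail to keep. That is the hook an AI-driven encoder could use, choosing a coarser step for blocks it judges less important.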

  • I was doing this about 10 years ago. And frankly what they're doing is almost an entire waste of a good idea.

    Consider this.

    Most video encoding systems support both temporal and spatial scalability. This feature was already present in MPEG-2 and often goes thoroughly underused.

    What this feature allows for example is the ability to encode video in a way that efficiently supports multiple resolutions, frame rates (and even color depths) into a single file. So you can encode low, medium, high and master quality into
    • This is very interesting since the whole 5G revolution seems to be centered on some silly idea that we'll need a lot of bandwidth to do holographic video

      A lot of the focus is on low latency, rather than high throughput per se. It's just that one of the sources of latency is compression, so reducing latency means either toning down or disabling compression. That increases the traffic required, but it's much easier to increase the throughput of a network than to decrease its latency (especially since you're ultimately up against the speed of light, which is slow enough to be a problem by itself without adding more sources of latency on top).
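To put a number on the speed-of-light remark above, here is a small back-of-the-envelope calculation. The vacuum speed of light is exact; the fiber factor of roughly two thirds is an assumed typical value, and the function name is made up for this sketch. It ignores routing, queuing and processing delays, so real latencies are higher.

```python
C_VACUUM_KM_PER_S = 299_792.458  # speed of light in vacuum
FIBER_FACTOR = 2 / 3             # assumption: light in glass fiber is ~2/3 as fast


def one_way_delay_ms(distance_km, factor=FIBER_FACTOR):
    """Propagation delay over `distance_km`, in milliseconds."""
    return distance_km / (C_VACUUM_KM_PER_S * factor) * 1000


# Round trip over 1000 km of fiber: propagation alone, before any codec delay.
round_trip_ms = 2 * one_way_delay_ms(1000)
```

Under these assumptions, 1000 km of fiber costs about 5 ms each way, which is a floor that no amount of extra throughput can lower.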

  • Just need the AI to transmit the parts of the video that include boobs. Save a lot of time.
