New Image Trends Frontend Developers Should Support
Media management firm Cloudinary is working on a plug-in that will enable developers to leverage its image capabilities from within ChatGPT.
It’s part of keeping up with new technologies that, like AI, are changing what users expect from a frontend experience, said Tal Lev-Ami, Cloudinary’s CTO and co-founder.
“If you look at e-commerce, many websites now have ways to know what you want to buy the 360 [degree] way and some of them also have integrated AR experiences where you can take whatever object it is and either see it in the room or see it on yourself,” Lev-Ami told The New Stack. “These are considerations that are becoming more critical for developers to support.”
Another thing developers should consider is how AI-enabled media manipulation will alter the expectations of end users. He compared it to the internet’s shift from text alone to text with images. Images didn’t replace text, but users suddenly expected images on web pages.
“The expectations of the end users on the quality and personalization of the media is ever increasing, because they see ads and they see more sophisticated visual experiences,” he said. “It’s not that everything before is meaningless; it’s still needed. But if you’re not there to meet the expectations of the end user in terms of experiences, then you’re getting left behind.”
There are challenges around supporting 3D, Lev-Ami said, such as how to optimize images and how to take a file developed for CAD and convert it to a 3D media format that’s supported on the web. One example is glTF, an open standard file format for three-dimensional scenes and models.
A case study of Minted, a crowdsourced art site with 59.8 million images, offers a look at what’s required to support 3D. Minted used Cloudinary to improve its image generation pipeline with support for a full set of 2D and 3D transforms and automation technology. A single product at Minted can have more than 100,000 variants, according to the case study.
The case study explained how the art site worked with the media company to create a 3D shopping experience. First, images of the scenes were created in a studio. An internal image specialist then sliced each image into layers and corrected for transparency, color and position. A script was then used to generate the coordinates needed to position these layers and write them, as named transforms, into a text file (CSV). Uploading that file to Cloudinary, along with the previously created screen layers, produced the final image.
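The case study does not publish Minted’s script or its CSV layout, so the following is only a minimal sketch of the idea: a list of layer positions written out as one named row per layer. The layer names, the coordinates and the column names are all hypothetical, and this is not a format Cloudinary prescribes.

```python
import csv
import io

# Hypothetical layer placements (pixel offsets and sizes within the scene).
# None of these names or numbers come from the Minted case study.
LAYERS = [
    {"name": "wall_shadow", "x": 0, "y": 0, "width": 1200, "height": 900},
    {"name": "frame", "x": 310, "y": 145, "width": 580, "height": 440},
    {"name": "art", "x": 335, "y": 170, "width": 530, "height": 390},
]

# Assumed column layout; the real file may look quite different.
FIELDS = ["name", "x", "y", "width", "height"]


def layers_to_csv(layers):
    """Serialize layer placements to CSV text, one named row per layer."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(layers)
    return buffer.getvalue()


csv_text = layers_to_csv(LAYERS)
print(csv_text)
```

Keeping the coordinates in a generated file rather than typing them by hand means a scene can be re-sliced and its positions regenerated without touching the pages that request the images.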
Separately, Minted’s proprietary pipeline ingests raw art files from artists and builds the base images for each winning design. When a customer navigates to an art category page or product details page on Minted, the page sends requests to Cloudinary for images that composite the correct combination of scenes, designs, frame and texture into the final thumbnails, the case study explained.
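As a rough illustration of what such a request can look like, here is a sketch that builds a Cloudinary delivery URL layering overlays onto a base scene. It relies on Cloudinary’s documented URL conventions (an `l_` overlay parameter followed by `fl_layer_apply` with gravity and offsets), but the cloud name, the public IDs and the offsets are invented for the example, and Minted’s actual URLs are not shown in the case study.

```python
def composite_url(cloud_name, base_id, overlays, fmt="jpg"):
    """Build a delivery URL that stacks overlays on top of a base image.

    Each overlay is a dict with a public_id and x/y pixel offsets measured
    from the top-left corner of the base image.
    """
    segments = [f"https://res.cloudinary.com/{cloud_name}/image/upload"]
    for layer in overlays:
        # Folder separators in an overlay's public ID are written as colons.
        overlay_id = layer["public_id"].replace("/", ":")
        segments.append(f"l_{overlay_id}")
        segments.append(
            f"fl_layer_apply,g_north_west,x_{layer['x']},y_{layer['y']}"
        )
    segments.append(f"{base_id}.{fmt}")
    return "/".join(segments)


# Hypothetical cloud name, public IDs and offsets.
url = composite_url(
    "demo-cloud",
    "scenes/living_room",
    [
        {"public_id": "designs/winner_42", "x": 335, "y": 170},
        {"public_id": "frames/black_wood", "x": 310, "y": 145},
    ],
)
print(url)
```

Because the combination of scene, design and frame is encoded entirely in the URL, the page can ask for any variant on demand instead of storing every permutation ahead of time.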
“For close-up product images, Minted makes use of Cloudinary’s 3D rendering capability as well as its e_distort API feature,” the case study noted. “A 3D model with texture UV mapping was created for the close-up image that shows off the texture and wrapping effect of a stretched canvas art print. With some careful tweaking of the 3D coordinates, the model is uploaded and Cloudinary does the rest, composing the art design as texture onto the model.”
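The e_distort feature mentioned in the quote takes the target positions of an image’s four corners, listed clockwise from the top left, as colon-separated values. The sketch below only assembles that parameter string; the corner coordinates are made up, and the 3D model rendering that Minted pairs it with is not covered here.

```python
def distort_param(top_left, top_right, bottom_right, bottom_left):
    """Format four (x, y) corner targets as an e_distort parameter.

    Corners are given clockwise, starting from the top-left corner.
    """
    corners = (top_left, top_right, bottom_right, bottom_left)
    coords = [str(value) for point in corners for value in point]
    return "e_distort:" + ":".join(coords)


# Hypothetical corner positions that skew a flat design into perspective.
param = distort_param((40, 25), (480, 60), (470, 400), (30, 430))
print(param)
```

The resulting string would sit in the transformation portion of a delivery URL, in the same position as any other effect.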
Bring Your Own Algorithms
WebAssembly is another relative newcomer to the frontend, where it can be used to deploy streaming media. I asked Lev-Ami whether Wasm is also changing how media works on the frontend, or how Cloudinary manages its own workload. While Cloudinary does deploy Wasm to support edge computing, the company also allows developers to upload Wasm and run their own algorithms.
“We actually have a capability where you can upload your own Wasm so that you can run your own algorithm as part of the media processing pipeline,” he said. “If you have some unique algorithm that you want to run as part of the media processing pipeline, you can do that. The safety and security around Wasm allows us to be more open as a platform and allows customers to handle use cases where they need to run their own algorithms part of the pipeline.”
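Lev-Ami did not walk through the mechanics, but Cloudinary’s documentation describes a custom function parameter, fn_wasm, that points at an uploaded Wasm file. Assuming that syntax, a request that runs a custom module in the pipeline might be built like this; the cloud name, the module name and the image ID are hypothetical.

```python
def wasm_function_url(cloud_name, wasm_public_id, image_id, fmt="jpg"):
    """Build a delivery URL that applies an uploaded Wasm module to an image.

    Assumes the fn_wasm:<public_id> custom function syntax; check the
    current Cloudinary documentation before relying on it.
    """
    return (
        f"https://res.cloudinary.com/{cloud_name}/image/upload/"
        f"fn_wasm:{wasm_public_id}/{image_id}.{fmt}"
    )


# Hypothetical names throughout.
wasm_url = wasm_function_url("demo-cloud", "my_filter.wasm", "products_sofa")
print(wasm_url)
```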
Wasm carries fewer security risks than other code because it executes within its own sandbox, according to Andrew Cornwall, a senior analyst with Forrester who specializes in the application development space. Code compiled to WebAssembly can’t grab passwords, for instance, Cornwall recently told The New Stack.