Google’s new Gemini 3 can simultaneously process text, images and audio
19.11.25
The model is designed as a native multimodal system capable of simultaneously processing text, images, and audio. Google says this enables tasks such as converting photos of cooking notes into fully structured recipes or creating flashcards from video lectures.
The company is also testing generative interfaces in the Gemini Labs environment. These interfaces create visual materials in the style of magazine layouts or offer dynamic layouts tailored to the user’s specific task.
In the updated search, Gemini 3 Pro generates answers enhanced with images, grids, tables, and simulations. This is achieved through an improved query processing method that breaks queries into multiple dimensions and determines user intent more accurately, increasing the amount of relevant content. Google notes that the model’s answers now have a more restrained tone, avoiding the excessive flattery often criticized in competing systems. The company also announced improvements to planning and to handling complex multi-level instructions.
With Gemini 3, the company is expanding the capabilities of its experimental Gemini Agent tool. It can perform actions on behalf of the user, such as organizing emails or arranging travel. Google AI Pro and Ultra subscribers in the US have gained access to the agent features.
Gemini 3 Pro has already been rolled out to all users in the Gemini app, and the system’s advanced capabilities are gradually being added to search.
