Google has unveiled Gemini, its flagship suite of generative AI models, apps, and services, in a bid to gain ground in generative AI. Developed by Google’s AI research labs DeepMind and Google Research, Gemini comes in three variants: Gemini Ultra, the flagship model; Gemini Pro, a “lite” version of Ultra; and Gemini Nano, a smaller model tailored for mobile devices like the Pixel 8 Pro.
What Sets Gemini Apart?
Gemini models are “natively multimodal”: they can process audio, images, and video in addition to text. This sets them apart from models like Google’s LaMDA, which works exclusively with text data.
Gemini Apps vs. Gemini Models
Google’s branding has caused some confusion. The Gemini apps on the web and mobile (formerly Bard) are only an interface for accessing Gemini models; they are not the models themselves. The models are also independent of Imagen 2, Google’s text-to-image model, which adds another layer of complexity.
Capabilities of Gemini Models
Because they are multimodal, Gemini models can in principle transcribe speech, caption images and videos, and generate artwork. Some of these capabilities are still in development, however, and Google’s ambitious promises have drawn skepticism given its past instances of under-delivery.
Gemini Ultra: The Flagship Model
Gemini Ultra is positioned as a powerful tool for a range of tasks, including helping with physics homework, identifying scientific papers, and generating artwork. The model supports image generation, but that feature has yet to be integrated into the productized version. Access to Gemini Ultra requires a subscription to the Google One AI Premium Plan, priced at $20 per month.
Gemini Pro: Improved Reasoning and Understanding
Gemini Pro is touted as an improvement over LaMDA in reasoning and understanding capabilities. An independent study by Carnegie Mellon and BerriAI researchers supports this claim, highlighting its proficiency in handling longer and more complex reasoning chains. Gemini Pro is available via API in Vertex AI and can be fine-tuned for specific contexts and use cases.
Gemini Nano: Compact and Efficient
Gemini Nano is a smaller version designed to run directly on some phones. It powers features like Summarize in Recorder and Smart Reply in Gboard on the Pixel 8 Pro. Because it runs on the device itself, it can generate summaries efficiently without compromising privacy.
Gemini vs. OpenAI’s GPT-4
Google asserts that Gemini outperforms rivals on benchmarks, but early user impressions point to problems with basic facts, translations, and coding suggestions. How Gemini compares with OpenAI’s GPT-4 remains a topic of debate, since the differences in benchmark scores are marginal.
Cost of Using Gemini
Gemini Pro is free in the Gemini apps during the preview period. Once it exits preview in Vertex AI, input will cost $0.0025 per character and output will cost $0.00005 per character. Pricing for Ultra has yet to be announced.
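To make the per-character pricing concrete, here is a minimal sketch of how a request’s cost could be estimated. The two rates are the figures quoted above and should be checked against Google’s current price list; the function name `estimate_cost` and its parameters are hypothetical and not part of any Google SDK.

```python
from decimal import Decimal

# Per-character rates in USD as quoted in this article (an assumption;
# verify against Google's published Vertex AI pricing before relying on them).
INPUT_RATE_USD = Decimal("0.0025")
OUTPUT_RATE_USD = Decimal("0.00005")


def estimate_cost(input_chars: int, output_chars: int,
                  input_rate: Decimal = INPUT_RATE_USD,
                  output_rate: Decimal = OUTPUT_RATE_USD) -> Decimal:
    """Estimate the USD cost of one request from its character counts."""
    if input_chars < 0 or output_chars < 0:
        raise ValueError("character counts must be non-negative")
    # Input and output are billed separately, each at its own rate.
    return input_chars * input_rate + output_chars * output_rate


if __name__ == "__main__":
    prompt = "Summarize the key differences between Gemini Ultra, Pro, and Nano."
    reply_length = 400  # hypothetical length of the model's answer, in characters
    print(f"Estimated cost: ${estimate_cost(len(prompt), reply_length)}")
```

The sketch uses `Decimal` rather than floats so that very small per-character rates add up without rounding drift. The rates are passed as parameters so they can be swapped once final pricing is confirmed.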
Where to Try Gemini
Gemini Pro and Ultra can be experienced in the Gemini apps, AI Studio, and Vertex AI. Developers can fine-tune models and create structured chat prompts using Gemini Pro in AI Studio. Gemini Nano is currently on the Pixel 8 Pro and will be available on other devices in the future.
Google’s Gemini has generated excitement with its multimodal capabilities, but its true potential and effectiveness will become clearer as users explore and provide feedback on these innovative AI models. Stay tuned for updates on Gemini’s evolving features and applications.