Google

Access 61 Google models through the OpenRouter unified API including Nano Banana 2 (Gemini 3.1 Flash Image), Nano Banana Pro (Gemini 3 Pro Image), and Gemini Embedding 2. Compare pricing, context windows, benchmarks, and capabilities between different Google models.

Google tokens processed on OpenRouter

Google: Nano Banana 2 (Gemini 3.1 Flash Image)Nano Banana 2 (Gemini 3.1 Flash Image)
521M tokens
Gemini 3.1 Flash Image, a.k.a. "Nano Banana 2," is Google’s latest state of the art image generation and editing model, delivering Pro-level visual quality at Flash speed. It combines advanced contextual understanding with fast, cost-efficient inference, making complex image generation and iterative edits significantly more accessible. Aspect ratios can be controlled with the [image_config API Parameter](https://openrouter.ai/docs/features/multimodal/image-generation#image-aspect-ratio-configuration)
by googleJun 18, 2026131K context$0.50/M input tokens$3/M output tokens

Google

Access 61 Google models through the OpenRouter unified API including Nano Banana 2 (Gemini 3.1 Flash Image), Nano Banana Pro (Gemini 3 Pro Image), and Gemini Embedding 2. Compare pricing, context windows, benchmarks, and capabilities between different Google models.

Google tokens processed on OpenRouter

Google: Nano Banana 2 (Gemini 3.1 Flash Image)Nano Banana 2 (Gemini 3.1 Flash Image)
521M tokens
Gemini 3.1 Flash Image, a.k.a. "Nano Banana 2," is Google’s latest state of the art image generation and editing model, delivering Pro-level visual quality at Flash speed. It combines advanced contextual understanding with fast, cost-efficient inference, making complex image generation and iterative edits significantly more accessible. Aspect ratios can be controlled with the [image_config API Parameter](https://openrouter.ai/docs/features/multimodal/image-generation#image-aspect-ratio-configuration)
by googleJun 18, 2026131K context$0.50/M input tokens$3/M output tokens

Google: Nano Banana Pro (Gemini 3 Pro Image)Nano Banana Pro (Gemini 3 Pro Image)

387M tokens

Nano Banana Pro is Google’s most advanced image-generation and editing model, built on Gemini 3 Pro. It extends the original Nano Banana with significantly improved multimodal reasoning, real-world grounding, and high-fidelity visual synthesis. The model generates context-rich graphics, from infographics and diagrams to cinematic composites, and can incorporate real-time information via Search grounding. It offers industry-leading text rendering in images (including long passages and multilingual layouts), consistent multi-image blending, and accurate identity preservation across up to five subjects. Nano Banana Pro adds fine-grained creative controls such as localized edits, lighting and focus adjustments, camera transformations, and support for 2K/4K outputs and flexible aspect ratios. It is designed for professional-grade design, product visualization, storyboarding, and complex multi-element compositions while remaining efficient for general image creation workflows.

by googleJun 18, 202666K context$2/M input tokens$12/M output tokens

Google: Gemini Embedding 2Gemini Embedding 2

7.05B tokens

Gemini Embedding 2 is Google's first multimodal embedding model. We currently support mapping text and images into a unified vector space for semantic search and retrieval-augmented generation (RAG). It supports input context up to 8,192 tokens and flexible output dimensions from 128 to 3,072 (recommended: 768, 1536, or 3,072). Designed for cross-modal similarity — you can embed a text query and retrieve the most relevant images, or vice versa — making it well-suited for multimodal search, recommendation, and document understanding pipelines.

by googleMay 20, 20268K context$0.20/M tokens

Google: Gemini 3.5 FlashGemini 3.5 Flash

419B tokens

Gemini 3.5 Flash is Google's high-efficiency multimodal model, bringing near-Pro level coding and reasoning at Flash-tier cost and speed. It is highly optimized for coding proficiency and parallel agentic execution loops, supporting text, image, video, audio, and PDF inputs. Defaults to medium thinking effort for faster and more cost-efficient responses, with full support for thinking levels (minimal, low, medium, high) for fine-grained cost/performance trade-offs.

by googleMay 19, 20261.05M context$1.50/M input tokens$9/M output tokens

Google: Gemini 3.1 Flash LiteGemini 3.1 Flash Lite

432B tokens

Gemini 3.1 Flash Lite is Google’s GA high-efficiency multimodal model optimized for low-latency, high-volume workloads. It supports text, image, video, audio, and PDF inputs, and is designed for lightweight agentic workflows, simple data extraction, and applications where responsiveness and API cost are the primary constraints. Supports full thinking levels (minimal, low, medium, high) for fine-grained cost/performance trade-offs. Priced at half the cost of Gemini 3 Flash.

by googleMay 7, 20261.05M context$0.25/M input tokens$1.50/M output tokens

Google: Chirp 3Chirp 3

Chirp 3 is Google's latest multilingual speech-to-text model. It offers enhanced transcription accuracy across 24 GA languages and 77+ preview languages, with support for automatic language detection, automatic punctuation, and a built-in denoiser for cleaner audio processing.

by googleMay 5, 2026$0.016/minute

Google Gemini Pro LatestGoogle Gemini Pro Latest

This model always redirects to the latest model in the Google Gemini Pro family.

by googleApr 27, 20261.05M context

Google Gemini Flash LatestGoogle Gemini Flash Latest

This model always redirects to the latest model in the Google Gemini Flash family.

by googleApr 27, 20261.05M context

Google: Gemini 3.1 Flash TTS PreviewGemini 3.1 Flash TTS Preview

114M tokens

Gemini 3.1 Flash TTS Preview is a text-to-speech model from Google, and a substantial generational step up from Gemini 2.5 Flash TTS. It takes text input and produces audio output across 70+ languages — nearly 3× the language coverage of its predecessor. The headline addition is a system of 200+ inline audio tags (e.g. `[whispers]`, `[laughs]`, `[excited]`) that let developers steer delivery, emotion, and pacing mid-sentence, alongside a "director's chair" workflow in Google AI Studio for defining per-character Audio Profiles and scene-level context. It supports up to two speakers with independent voice and style configuration per speaker, outputs PCM audio at 24 kHz / 16-bit mono, and automatically watermarks all output with SynthID. Context window is 32k tokens.

by googleApr 24, 20268K context$1/M input tokens$20/M output tokens

Google: Veo 3.1 FastVeo 3.1 Fast

Google's mid-tier video generation model balancing speed and quality. Veo 3.1 Fast generates high-quality video from text or image prompts with native synchronized audio, offering faster turnaround than Veo 3.1 at lower cost. Supports first-frame and last-frame conditioning, multiple resolutions and aspect ratios, and SynthID watermarking.

by googleApr 24, 2026from $0.10/second

Google: Veo 3.1 LiteVeo 3.1 Lite

Google's most cost-effective video generation model, designed for high-volume applications and rapid iteration. Veo 3.1 Lite generates 720p and 1080p video from text or image prompts with native synchronized audio at less than 50% of the cost of Veo 3.1 Fast. Supports 4–8 second clips in landscape (16:9) and portrait (9:16) formats, with SynthID watermarking. Ideal for content platforms, short-form video creation, and automated media generation.

by googleApr 23, 2026from $0.05/second

Google: Gemini Embedding 2 PreviewGemini Embedding 2 Preview

3.77B tokens

Gemini Embedding 2 Preview is Google's first multimodal embedding model. We currently support mapping text and images into a unified vector space for semantic search and retrieval-augmented generation (RAG). It supports input context up to 8,192 tokens and flexible output dimensions from 128 to 3,072 (recommended: 768, 1536, or 3,072). Designed for cross-modal similarity — you can embed a text query and retrieve the most relevant images, or vice versa — making it well-suited for multimodal search, recommendation, and document understanding pipelines.

by googleApr 17, 20268K context$0.20/M tokens

Google: Gemma 4 26B A4B Gemma 4 26B A4B

298B tokens

Gemma 4 26B A4B IT is an instruction-tuned Mixture-of-Experts (MoE) model from Google DeepMind. Despite 25.2B total parameters, only 3.8B activate per token during inference — delivering near-31B quality at a fraction of the compute cost. Supports multimodal input including text, images, and video (up to 60s at 1fps). Features a 256K token context window, native function calling, configurable thinking/reasoning mode, and structured output support. Released under Apache 2.0.

by googleApr 3, 2026262K context$0.06/M input tokens$0.33/M output tokens

Google: Gemma 4 26B A4B (free)Gemma 4 26B A4B (free)

3.99B tokens

Gemma 4 26B A4B IT is an instruction-tuned Mixture-of-Experts (MoE) model from Google DeepMind. Despite 25.2B total parameters, only 3.8B activate per token during inference — delivering near-31B quality at a fraction of the compute cost. Supports multimodal input including text, images, and video (up to 60s at 1fps). Features a 256K token context window, native function calling, configurable thinking/reasoning mode, and structured output support. Released under Apache 2.0.

by googleApr 3, 2026262K context$0/M input tokens$0/M output tokens

Google: Gemma 4 31BGemma 4 31B

191B tokens

Gemma 4 31B Instruct is Google DeepMind's 30.7B dense multimodal model supporting text and image input with text output. Features a 256K token context window, configurable thinking/reasoning mode, native function calling, and multilingual support across 140+ languages. Strong on coding, reasoning, and document understanding tasks. Apache 2.0 license.

by googleApr 2, 2026262K context$0.12/M input tokens$0.35/M output tokens

Google: Gemma 4 31B (free)Gemma 4 31B (free)

42.6B tokens

Gemma 4 31B Instruct is Google DeepMind's 30.7B dense multimodal model supporting text and image input with text output. Features a 256K token context window, configurable thinking/reasoning mode, native function calling, and multilingual support across 140+ languages. Strong on coding, reasoning, and document understanding tasks. Apache 2.0 license.

by googleApr 2, 2026262K context$0/M input tokens$0/M output tokens

Google: Lyria 3 Pro PreviewLyria 3 Pro Preview

6.11M tokens

Full-length songs are priced at $0.08 per song. Lyria 3 is Google's family of music generation models, available through the Gemini API. With Lyria 3, you can generate high-quality, 48kHz stereo audio from text prompts or from images. These models deliver structural coherence, including vocals, timed lyrics, and full instrumental arrangements. Lyria 3 Pro can generate full-length songs with verses, choruses, bridges.

by googleMar 30, 20261.05M context$0.08/song

Google: Lyria 3 Clip PreviewLyria 3 Clip Preview

6.52M tokens

30 second duration clips are priced at $0.04 per clip. Lyria 3 is Google's family of music generation models, available through the Gemini API. With Lyria 3, you can generate high-quality, 48kHz stereo audio from text prompts or from images. These models deliver structural coherence, including vocals, timed lyrics, and full instrumental arrangements. Lyria 3 Clip can generate short clips, loops, previews.

by googleMar 30, 20261.05M context$0.04/song

Google: Veo 3.1Veo 3.1

Google's state-of-the-art video generation model, built for maximum visual fidelity in final production cuts. Veo 3.1 generates high-quality 1080p video from text or image prompts with native synchronized audio — including dialogue, ambient effects, and background sound. Supports scene extension (up to 20 chained clips for 140+ second narratives), frames-to-video transitions between two images, vertical video for Shorts, and 4K upscaling.

by googleMar 23, 2026from $0.40/second

Google: Gemini 3.1 Flash Lite PreviewGemini 3.1 Flash Lite Preview

197B tokens

Gemini 3.1 Flash Lite Preview is Google's high-efficiency model optimized for high-volume use cases. It outperforms Gemini 2.5 Flash Lite on overall quality and approaches Gemini 2.5 Flash performance across key capabilities. Improvements span audio input/ASR, RAG snippet ranking, translation, data extraction, and code completion. Supports full thinking levels (minimal, low, medium, high) for fine-grained cost/performance trade-offs. Priced at half the cost of Gemini 3 Flash.

by googleMar 3, 20261.05M context$0.25/M input tokens$1.50/M output tokens

Google: Nano Banana 2 (Gemini 3.1 Flash Image Preview)Nano Banana 2 (Gemini 3.1 Flash Image Preview)

5.62B tokens

Gemini 3.1 Flash Image Preview, a.k.a. "Nano Banana 2," is Google’s latest state of the art image generation and editing model, delivering Pro-level visual quality at Flash speed. It combines advanced contextual understanding with fast, cost-efficient inference, making complex image generation and iterative edits significantly more accessible. Aspect ratios can be controlled with the [image_config API Parameter](https://openrouter.ai/docs/features/multimodal/image-generation#image-aspect-ratio-configuration)

by googleFeb 26, 2026131K context$0.50/M input tokens$3/M output tokens

Google: Gemini 3.1 Pro Preview Custom ToolsGemini 3.1 Pro Preview Custom Tools

5.68B tokens

Gemini 3.1 Pro Preview Custom Tools is a variant of Gemini 3.1 Pro that improves tool selection behavior by preventing overuse of a general bash tool when more efficient third-party or user-defined functions are available. This specialized preview endpoint significantly increases function calling reliability and ensures the model selects the most appropriate tool in coding agents and complex, multi-tool workflows. It retains the core strengths of Gemini 3.1 Pro, including multimodal reasoning across text, image, video, audio, and code, a 1M-token context window, and strong software engineering performance.

by googleFeb 25, 20261.05M context$2/M input tokens$12/M output tokens

Google: Gemini 3.1 Pro PreviewGemini 3.1 Pro Preview

254B tokens

Gemini 3.1 Pro Preview is Google’s frontier reasoning model, delivering enhanced software engineering performance, improved agentic reliability, and more efficient token usage across complex workflows. Building on the multimodal foundation of the Gemini 3 series, it combines high-precision reasoning across text, image, video, audio, and code with a 1M-token context window. Reasoning Details must be preserved when using multi-turn tool calling, see our docs here: https://openrouter.ai/docs/use-cases/reasoning-tokens#preserving-reasoning. The 3.1 update introduces measurable gains in SWE benchmarks and real-world coding environments, along with stronger autonomous task execution in structured domains such as finance and spreadsheet-based workflows. Designed for advanced development and agentic systems, Gemini 3.1 Pro Preview improves long-horizon stability and tool orchestration while increasing token efficiency. It introduces a new medium thinking level to better balance cost, speed, and performance. The model excels in agentic coding, structured planning, multimodal analysis, and workflow automation, making it well-suited for autonomous agents, financial modeling, spreadsheet automation, and high-context enterprise tasks.

by googleFeb 19, 20261.05M context$2/M input tokens$12/M output tokens

Google: Gemini 3 Flash PreviewGemini 3 Flash Preview

976B tokens

Gemini 3 Flash Preview is a high speed, high value thinking model designed for agentic workflows, multi turn chat, and coding assistance. It delivers near Pro level reasoning and tool use performance with substantially lower latency than larger Gemini variants, making it well suited for interactive development, long running agent loops, and collaborative coding tasks. Compared to Gemini 2.5 Flash, it provides broad quality improvements across reasoning, multimodal understanding, and reliability. The model supports a 1M token context window and multimodal inputs including text, images, audio, video, and PDFs, with text output. It includes configurable reasoning via thinking levels (minimal, low, medium, high), structured output, tool use, and automatic context caching. Gemini 3 Flash Preview is optimized for users who want strong reasoning and agentic behavior without the cost or latency of full scale frontier models.

by googleDec 17, 20251.05M context$0.50/M input tokens$3/M output tokens

Google: Nano Banana Pro (Gemini 3 Pro Image Preview)Nano Banana Pro (Gemini 3 Pro Image Preview)

2.36B tokens

Nano Banana Pro is Google’s most advanced image-generation and editing model, built on Gemini 3 Pro. It extends the original Nano Banana with significantly improved multimodal reasoning, real-world grounding, and high-fidelity visual synthesis. The model generates context-rich graphics, from infographics and diagrams to cinematic composites, and can incorporate real-time information via Search grounding. It offers industry-leading text rendering in images (including long passages and multilingual layouts), consistent multi-image blending, and accurate identity preservation across up to five subjects. Nano Banana Pro adds fine-grained creative controls such as localized edits, lighting and focus adjustments, camera transformations, and support for 2K/4K outputs and flexible aspect ratios. It is designed for professional-grade design, product visualization, storyboarding, and complex multi-element compositions while remaining efficient for general image creation workflows.

by googleNov 20, 202566K context$2/M input tokens$12/M output tokens

Google: Gemini 3 Pro PreviewGemini 3 Pro Preview

Gemini 3 Pro is Google’s flagship frontier model for high-precision multimodal reasoning, combining strong performance across text, image, video, audio, and code with a 1M-token context window. Reasoning Details must be preserved when using multi-turn tool calling, see our docs here: https://openrouter.ai/docs/use-cases/reasoning-tokens#preserving-reasoning-blocks. It delivers state-of-the-art benchmark results in general reasoning, STEM problem solving, factual QA, and multimodal understanding, including leading scores on LMArena, GPQA Diamond, MathArena Apex, MMMU-Pro, and Video-MMMU. Interactions emphasize depth and interpretability: the model is designed to infer intent with minimal prompting and produce direct, insight-focused responses. Built for advanced development and agentic workflows, Gemini 3 Pro provides robust tool-calling, long-horizon planning stability, and strong zero-shot generation for complex UI, visualization, and coding tasks. It excels at agentic coding (SWE-Bench Verified, Terminal-Bench 2.0), multimodal analysis, and structured long-form tasks such as research synthesis, planning, and interactive learning experiences. Suitable applications include autonomous agents, coding assistants, multimodal analytics, scientific reasoning, and high-context information processing.

by googleNov 18, 20251.05M context

Google: Gemini Embedding 001Gemini Embedding 001

30.2B tokens

gemini-embedding-001 provides a unified cutting edge experience across domains, including science, legal, finance, and coding. This embedding model has consistently held a top spot on the Massive Text Embedding Benchmark (MTEB) Multilingual leaderboard since the experimental launch in March.

by googleOct 31, 202520K context$0.15/M tokens

Google: Nano Banana (Gemini 2.5 Flash Image)Nano Banana (Gemini 2.5 Flash Image)

7.86B tokens

Gemini 2.5 Flash Image, a.k.a. "Nano Banana," is now generally available. It is a state of the art image generation model with contextual understanding. It is capable of image generation, edits, and multi-turn conversations. Aspect ratios can be controlled with the [image_config API Parameter](https://openrouter.ai/docs/features/multimodal/image-generation#image-aspect-ratio-configuration)

by googleOct 7, 202533K context$0.30/M input tokens$2.50/M output tokens

Google: Gemini 2.5 Flash Preview 09-2025Gemini 2.5 Flash Preview 09-2025

Gemini 2.5 Flash Preview September 2025 Checkpoint is Google's state-of-the-art workhorse model, specifically designed for advanced reasoning, coding, mathematics, and scientific tasks. It includes built-in "thinking" capabilities, enabling it to provide responses with greater accuracy and nuanced context handling. Additionally, Gemini 2.5 Flash is configurable through the "max tokens for reasoning" parameter, as described in the documentation (https://openrouter.ai/docs/use-cases/reasoning-tokens#max-tokens-for-reasoning).

by googleSep 25, 20251.05M context

Google: Gemini 2.5 Flash Lite Preview 09-2025Gemini 2.5 Flash Lite Preview 09-2025

29.7B tokens

Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency. It offers improved throughput, faster token generation, and better performance across common benchmarks compared to earlier Flash models. By default, "thinking" (i.e. multi-pass reasoning) is disabled to prioritize speed, but developers can enable it via the [Reasoning API parameter](https://openrouter.ai/docs/use-cases/reasoning-tokens) to selectively trade off cost for intelligence.

by googleSep 25, 20251.05M context$0.10/M input tokens$0.40/M output tokens

Google: Gemini 2.5 Flash Image Preview (Nano Banana)Gemini 2.5 Flash Image Preview (Nano Banana)

Gemini 2.5 Flash Image Preview, a.k.a. "Nano Banana," is a state of the art image generation model with contextual understanding. It is capable of image generation, edits, and multi-turn conversations.

by googleAug 26, 202533K context

Google: Gemini 2.5 Flash LiteGemini 2.5 Flash Lite

709B tokens

Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency. It offers improved throughput, faster token generation, and better performance across common benchmarks compared to earlier Flash models. By default, "thinking" (i.e. multi-pass reasoning) is disabled to prioritize speed, but developers can enable it via the [Reasoning API parameter](https://openrouter.ai/docs/use-cases/reasoning-tokens) to selectively trade off cost for intelligence.

by googleJul 22, 20251.05M context$0.10/M input tokens$0.40/M output tokens

Google: Gemma 3n 2BGemma 3n 2B

Gemma 3n E2B IT is a multimodal, instruction-tuned model developed by Google DeepMind, designed to operate efficiently at an effective parameter size of 2B while leveraging a 6B architecture. Based on the MatFormer architecture, it supports nested submodels and modular composition via the Mix-and-Match framework. Gemma 3n models are optimized for low-resource deployment, offering 32K context length and strong multilingual and reasoning performance across common benchmarks. This variant is trained on a diverse corpus including code, math, web, and multimodal data.

by googleJul 9, 20258K context

Google: Gemini 2.5 FlashGemini 2.5 Flash

648B tokens

Gemini 2.5 Flash is Google's state-of-the-art workhorse model, specifically designed for advanced reasoning, coding, mathematics, and scientific tasks. It includes built-in "thinking" capabilities, enabling it to provide responses with greater accuracy and nuanced context handling. Additionally, Gemini 2.5 Flash is configurable through the "max tokens for reasoning" parameter, as described in the documentation (https://openrouter.ai/docs/use-cases/reasoning-tokens#max-tokens-for-reasoning).

by googleJun 17, 20251.05M context$0.30/M input tokens$2.50/M output tokens

Google: Gemini 2.5 ProGemini 2.5 Pro

66.9B tokens

Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, coding, mathematics, and scientific tasks. It employs “thinking” capabilities, enabling it to reason through responses with enhanced accuracy and nuanced context handling. Gemini 2.5 Pro achieves top-tier performance on multiple benchmarks, including first-place positioning on the LMArena leaderboard, reflecting superior human-preference alignment and complex problem-solving abilities.

by googleJun 17, 20251.05M context$1.25/M input tokens$10/M output tokens

Google: Gemini 2.5 Pro Preview 06-05Gemini 2.5 Pro Preview 06-05

Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, coding, mathematics, and scientific tasks. It employs “thinking” capabilities, enabling it to reason through responses with enhanced accuracy and nuanced context handling. Gemini 2.5 Pro achieves top-tier performance on multiple benchmarks, including first-place positioning on the LMArena leaderboard, reflecting superior human-preference alignment and complex problem-solving abilities.

by googleJun 5, 20251.05M context$1.25/M input tokens$10/M output tokens

Google: Gemma 1 2BGemma 1 2B

Gemma 1 2B by Google is an open model built from the same research and technology used to create the [Gemini models](/models?q=gemini). Gemma models are well-suited for a variety of text generation tasks, including question answering, summarization, and reasoning. Usage of Gemma is subject to Google's [Gemma Terms of Use](https://ai.google.dev/gemma/terms).

by googleMay 28, 20258K context

Google: Gemma 3n 4BGemma 3n 4B

1.04B tokens

Gemma 3n E4B-it is optimized for efficient execution on mobile and low-resource devices, such as phones, laptops, and tablets. It supports multimodal inputs—including text, visual data, and audio—enabling diverse tasks such as text generation, speech recognition, translation, and image analysis. Leveraging innovations like Per-Layer Embedding (PLE) caching and the MatFormer architecture, Gemma 3n dynamically manages memory usage and computational load by selectively activating model parameters, significantly reducing runtime resource requirements. This model supports a wide linguistic range (trained in over 140 languages) and features a flexible 32K token context window. Gemma 3n can selectively load parameters, optimizing memory and computational efficiency based on the task or device capabilities, making it well-suited for privacy-focused, offline-capable applications and on-device AI solutions. [Read more in the blog post](https://developers.googleblog.com/en/introducing-gemma-3n/)

by googleMay 20, 202533K context$0.06/M input tokens$0.12/M output tokens

Google: Gemini 2.5 Pro Preview 05-06Gemini 2.5 Pro Preview 05-06

Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, coding, mathematics, and scientific tasks. It employs “thinking” capabilities, enabling it to reason through responses with enhanced accuracy and nuanced context handling. Gemini 2.5 Pro achieves top-tier performance on multiple benchmarks, including first-place positioning on the LMArena leaderboard, reflecting superior human-preference alignment and complex problem-solving abilities.

by googleMay 7, 20251.05M context$1.25/M input tokens$10/M output tokens

Google: Gemini 2.5 Pro ExperimentalGemini 2.5 Pro Experimental

This model has been deprecated by Google in favor of the (paid Preview model)[google/gemini-2.5-pro-preview] Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, coding, mathematics, and scientific tasks. It employs “thinking” capabilities, enabling it to reason through responses with enhanced accuracy and nuanced context handling. Gemini 2.5 Pro achieves top-tier performance on multiple benchmarks, including first-place positioning on the LMArena leaderboard, reflecting superior human-preference alignment and complex problem-solving abilities.

by googleMar 25, 20251.05M context

Google: Gemma 3 1BGemma 3 1B

Gemma 3 1B is the smallest of the new Gemma 3 family. It handles context windows up to 32k tokens, understands over 140 languages, and offers improved math, reasoning, and chat capabilities, including structured outputs and function calling. Note: Gemma 3 1B is not multimodal. For the smallest multimodal Gemma 3 model, please see [Gemma 3 4B](google/gemma-3-4b-it)

by googleMar 14, 202532K context

Google: Gemma 3 4BGemma 3 4B

3.49B tokens

Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles context windows up to 128k tokens, understands over 140 languages, and offers improved math, reasoning, and chat capabilities, including structured outputs and function calling.

by googleMar 13, 2025131K context$0.05/M input tokens$0.10/M output tokens

Google: Gemma 3 12BGemma 3 12B

13B tokens

Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles context windows up to 128k tokens, understands over 140 languages, and offers improved math, reasoning, and chat capabilities, including structured outputs and function calling. Gemma 3 12B is the second largest in the family of Gemma 3 models after [Gemma 3 27B](google/gemma-3-27b-it)

by googleMar 13, 2025131K context$0.05/M input tokens$0.15/M output tokens

Google: Gemma 3 27BGemma 3 27B

20.8B tokens

Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles context windows up to 128k tokens, understands over 140 languages, and offers improved math, reasoning, and chat capabilities, including structured outputs and function calling. Gemma 3 27B is Google's latest open source model, successor to [Gemma 2](google/gemma-2-27b-it)

by googleMar 12, 2025131K context$0.08/M input tokens$0.16/M output tokens

Google: Gemini 2.0 Flash LiteGemini 2.0 Flash Lite

Gemini 2.0 Flash Lite offers a significantly faster time to first token (TTFT) compared to [Gemini Flash 1.5](/google/gemini-flash-1.5), while maintaining quality on par with larger models like [Gemini Pro 1.5](/google/gemini-pro-1.5), all at extremely economical token prices.

by googleFeb 25, 20251.05M context

Google: Gemini 2.0 FlashGemini 2.0 Flash

Gemini Flash 2.0 offers a significantly faster time to first token (TTFT) compared to [Gemini Flash 1.5](/google/gemini-flash-1.5), while maintaining quality on par with larger models like [Gemini Pro 1.5](/google/gemini-pro-1.5). It introduces notable enhancements in multimodal understanding, coding capabilities, complex instruction following, and function calling. These advancements come together to deliver more seamless and robust agentic experiences.

by googleFeb 5, 20251M context

Google: Gemini 2.0 Flash ExperimentalGemini 2.0 Flash Experimental

Gemini Flash 2.0 offers a significantly faster time to first token (TTFT) compared to [Gemini Flash 1.5](/google/gemini-flash-1.5), while maintaining quality on par with larger models like [Gemini Pro 1.5](/google/gemini-pro-1.5). It introduces notable enhancements in multimodal understanding, coding capabilities, complex instruction following, and function calling. These advancements come together to deliver more seamless and robust agentic experiences.

by googleDec 11, 20241.05M context

Google: Gemini Experimental 1121Gemini Experimental 1121

Experimental release (November 21st, 2024) of Gemini.

by googleNov 21, 202441K context

Google: Gemini Experimental 1114Gemini Experimental 1114

Gemini 11-14 (2024) experimental model features "quality" improvements.

by googleNov 15, 202441K context

Google: Gemini 1.5 Flash 8BGemini 1.5 Flash 8B

Gemini Flash 1.5 8B is optimized for speed and efficiency, offering enhanced performance in small prompt tasks like chat, transcription, and translation. With reduced latency, it is highly effective for real-time and large-scale operations. This model focuses on cost-effective solutions while maintaining high-quality results. [Click here to learn more about this model](https://developers.googleblog.com/en/gemini-15-flash-8b-is-now-generally-available-for-use/). Usage of Gemini is subject to Google's [Gemini Terms of Use](https://ai.google.dev/terms).

by googleOct 3, 20241M context

Google: Gemini 1.5 Flash ExperimentalGemini 1.5 Flash Experimental

Gemini 1.5 Flash Experimental is an experimental version of the [Gemini 1.5 Flash](/models/google/gemini-flash-1.5) model. Usage of Gemini is subject to Google's [Gemini Terms of Use](https://ai.google.dev/terms). #multimodal Note: This model is experimental and not suited for production use-cases. It may be removed or redirected to another model in the future.

by googleAug 28, 20241M context

Google: Gemini 1.5 Pro ExperimentalGemini 1.5 Pro Experimental

Gemini 1.5 Pro Experimental is a bleeding-edge version of the [Gemini 1.5 Pro](/models/google/gemini-pro-1.5) model. Because it's currently experimental, it will be **heavily rate-limited** by Google. Usage of Gemini is subject to Google's [Gemini Terms of Use](https://ai.google.dev/terms). #multimodal

by googleAug 1, 20241M context

Google: Gemma 2 27BGemma 2 27B

50.1M tokens

Gemma 2 27B by Google is an open model built from the same research and technology used to create the [Gemini models](/models?q=gemini). Gemma models are well-suited for a variety of text generation tasks, including question answering, summarization, and reasoning. See the [launch announcement](https://blog.google/technology/developers/google-gemma-2/) for more details. Usage of Gemma is subject to Google's [Gemma Terms of Use](https://ai.google.dev/gemma/terms).

by googleJul 13, 20248K context$0.65/M input tokens$0.65/M output tokens

Google: Gemma 2 9BGemma 2 9B

Gemma 2 9B by Google is an advanced, open-source language model that sets a new standard for efficiency and performance in its size class. Designed for a wide variety of tasks, it empowers developers and researchers to build innovative applications, while maintaining accessibility, safety, and cost-effectiveness. See the [launch announcement](https://blog.google/technology/developers/google-gemma-2/) for more details. Usage of Gemma is subject to Google's [Gemma Terms of Use](https://ai.google.dev/gemma/terms).

by googleJun 28, 20248K context

Google: Gemini 1.5 Flash Gemini 1.5 Flash

Gemini 1.5 Flash is a foundation model that performs well at a variety of multimodal tasks such as visual understanding, classification, summarization, and creating content from image, audio and video. It's adept at processing visual and text inputs such as photographs, documents, infographics, and screenshots. Gemini 1.5 Flash is designed for high-volume, high-frequency tasks where cost and latency matter. On most common tasks, Flash achieves comparable quality to other Gemini Pro models at a significantly reduced cost. Flash is well-suited for applications like chat assistants and on-demand content generation where speed and scale matter. Usage of Gemini is subject to Google's [Gemini Terms of Use](https://ai.google.dev/terms). #multimodal

by googleMay 14, 20241M context

Google: Gemini 1.5 ProGemini 1.5 Pro

Google's latest multimodal model, supports image and video[0] in text or chat prompts. Optimized for language tasks including: - Code generation - Text generation - Text editing - Problem solving - Recommendations - Information extraction - Data extraction or generation - AI agents Usage of Gemini is subject to Google's [Gemini Terms of Use](https://ai.google.dev/terms). * [0]: Video input is not available through OpenRouter at this time.

by googleApr 9, 20242M context

Google: Gemma 7BGemma 7B

Gemma by Google is an advanced, open-source language model family, leveraging the latest in decoder-only, text-to-text technology. It offers English language capabilities across text generation tasks like question answering, summarization, and reasoning. The Gemma 7B variant is comparable in performance to leading open source models. Usage of Gemma is subject to Google's [Gemma Terms of Use](https://ai.google.dev/gemma/terms).

by googleFeb 22, 20248K context

Google: PaLM 2 Code Chat 32kPaLM 2 Code Chat 32k

PaLM 2 fine-tuned for chatbot conversations that help with code-related questions.

by googleNov 3, 202333K context

Google: PaLM 2 Chat 32kPaLM 2 Chat 32k

PaLM 2 is a language model by Google with improved multilingual, reasoning and coding capabilities.

by googleNov 3, 202333K context

Google: PaLM 2 Code ChatPaLM 2 Code Chat

PaLM 2 fine-tuned for chatbot conversations that help with code-related questions.

by googleJul 20, 20237K context

Google: PaLM 2 ChatPaLM 2 Chat

PaLM 2 is a language model by Google with improved multilingual, reasoning and coding capabilities.

by googleJul 20, 20239K context