Generate text from multimodal prompts using the Gemini API

When you call the Gemini API from your app using a Vertex AI in Firebase SDK, you can prompt the Gemini model to generate text based on multimodal input. Multimodal prompts can include several modalities (or types of input), such as text along with images, PDFs, plain-text files, video, and audio.

In every multimodal request, you must always provide the following:

The file's mimeType. Learn about the supported MIME types for input files.
The file. You can provide the file either as inline data (as shown on this page) or by using its URL or URI.
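
As a rough illustration of these two requirements, the sketch below builds both kinds of file part as plain JavaScript objects. The `inlineData` shape matches the Web samples later on this page. The `fileData` shape with a `fileUri` field and the helper names `inlinePart` and `urlPart` are assumptions made for this example, so check the SDK reference before relying on them.

```javascript
// Hypothetical helpers (not part of the SDK) that pair a file with its MIME type.

// Inline data: the file's bytes, base64-encoded, travel inside the request.
function inlinePart(base64Data, mimeType) {
  if (!mimeType) throw new Error("A MIME type is required for every file");
  return { inlineData: { data: base64Data, mimeType } };
}

// URL or URI: the request only references the file.
// Assumption: the part is shaped as { fileData: { mimeType, fileUri } }.
function urlPart(fileUri, mimeType) {
  if (!mimeType) throw new Error("A MIME type is required for every file");
  return { fileData: { mimeType, fileUri } };
}

const parts = [
  inlinePart("aGVsbG8=", "text/plain"), // "hello" encoded as base64
  urlPart(
    "https://storage.googleapis.com/cloud-samples-data/generative-ai/image/scones.jpg",
    "image/jpeg"
  ),
];
console.log(JSON.stringify(parts, null, 2));
```

Both helpers refuse to build a part without a MIME type, which mirrors the rule that every file in a multimodal request must carry one.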

For testing and iterating on multimodal prompts, we recommend using Vertex AI Studio.

Other options for working with the Gemini API

Optionally, experiment with an alternative "Google AI" version of the Gemini API
Get free-of-charge access (within limits and where available) using Google AI Studio and the Google AI client SDKs. These SDKs should be used only for prototyping in mobile and web apps.

After you're familiar with how the Gemini API works, migrate to our Vertex AI in Firebase SDKs (this documentation), which have many additional features important for mobile and web apps, such as protecting the API from abuse using Firebase App Check and support for large media files in requests.

Optionally, call the Vertex AI Gemini API server-side (for example, with Python, Node.js, or Go)
Use the server-side Vertex AI SDKs, Genkit, or Firebase Extensions for the Gemini API.

Before you begin

If you haven't already, complete the getting started guide, which describes how to set up your Firebase project, connect your app to Firebase, add the SDK, initialize the Vertex AI service, and create a GenerativeModel instance.


Important: The examples on this page show how to include small files as inline data in requests. However, if you want to include files that would make your total request size exceed 20 MB, you must provide the file using a URL, for example a Cloud Storage for Firebase URL.
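
One way to decide between inline data and a URL is to estimate the encoded size before building the request. The sketch below is only an approximation: it assumes the 20 MB figure means 20 × 1024 × 1024 bytes of base64-encoded payload and ignores the text prompt and JSON overhead. The names `base64Length` and `fitsInline` are made up for this example.

```javascript
// The 20 MB figure comes from the note above; treating it as 20 * 1024 * 1024
// bytes of base64-encoded payload is an assumption made for this sketch.
const INLINE_LIMIT_BYTES = 20 * 1024 * 1024;

// Base64 turns every 3 input bytes into 4 output characters (with padding).
function base64Length(byteCount) {
  return 4 * Math.ceil(byteCount / 3);
}

// Returns true when the files look small enough to send as inline data.
function fitsInline(fileSizesInBytes, limit = INLINE_LIMIT_BYTES) {
  const encodedTotal = fileSizesInBytes
    .map(base64Length)
    .reduce((sum, n) => sum + n, 0);
  return encodedTotal <= limit;
}

console.log(fitsInline([2 * 1024 * 1024, 3 * 1024 * 1024])); // two small images: true
console.log(fitsInline([18 * 1024 * 1024])); // one large video: false
```

The second call returns false even though the raw file is under 20 MB, because base64 encoding grows the payload by roughly a third.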

Sample media files

If you don't already have media files, you can use the following publicly available files. Because these files are stored in buckets that aren't in your Firebase project, you need to use the https://storage.googleapis.com/BUCKET_NAME/PATH/TO/FILE format for the URL.

Image: https://storage.googleapis.com/cloud-samples-data/generative-ai/image/scones.jpg with a MIME type of image/jpeg. View or download this image.
PDF: https://storage.googleapis.com/cloud-samples-data/generative-ai/pdf/2403.05530.pdf with a MIME type of application/pdf. View or download this PDF.
Video: https://storage.googleapis.com/cloud-samples-data/video/animals.mp4 with a MIME type of video/mp4. Watch or download this video.
Audio: https://storage.googleapis.com/cloud-samples-data/generative-ai/audio/pixel.mp3 with a MIME type of audio/mp3. Listen to or download this audio.
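
The URL format above can be assembled from a bucket name and an object path. The following sketch does that for the four sample files. The helper name `publicStorageUrl` is hypothetical, and the per-segment URL encoding is a precaution added for this example rather than something this page prescribes.

```javascript
// Hypothetical helper: builds a public Cloud Storage URL in the
// https://storage.googleapis.com/BUCKET_NAME/PATH/TO/FILE format.
function publicStorageUrl(bucketName, pathToFile) {
  // Encode each path segment separately so the "/" separators are preserved.
  const encodedPath = pathToFile.split("/").map(encodeURIComponent).join("/");
  return `https://storage.googleapis.com/${bucketName}/${encodedPath}`;
}

// The sample files listed above, paired with their MIME types.
const sampleFiles = [
  { path: "generative-ai/image/scones.jpg", mimeType: "image/jpeg" },
  { path: "generative-ai/pdf/2403.05530.pdf", mimeType: "application/pdf" },
  { path: "video/animals.mp4", mimeType: "video/mp4" },
  { path: "generative-ai/audio/pixel.mp3", mimeType: "audio/mp3" },
].map((file) => ({
  ...file,
  url: publicStorageUrl("cloud-samples-data", file.path),
}));

for (const { url, mimeType } of sampleFiles) {
  console.log(`${mimeType}: ${url}`);
}
```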

Generate text from text and a single image

Before trying this sample, make sure that you've completed the Before you begin section of this guide.

You can call the Gemini API with multimodal prompts that include both text and a single file, such as an image (as shown in this example).

Make sure to review the requirements and recommendations for input files.

Swift

You can call generateContent() to generate text from a multimodal prompt request that includes text and a single image:

import FirebaseVertexAI

// Initialize the Vertex AI service
let vertex = VertexAI.vertexAI()

// Create a `GenerativeModel` instance with a model that supports your use case
let model = vertex.generativeModel(modelName: "gemini-2.0-flash")

guard let image = UIImage(systemName: "bicycle") else { fatalError() }

// Provide a text prompt to include with the image
let prompt = "What's in this picture?"

// To generate text output, call generateContent and pass in the prompt
let response = try await model.generateContent(image, prompt)
print(response.text ?? "No text in response.")

Note: The example above uses a convenience method for handling platform-native image types (UIImage, NSImage, CIImage, and CGImage) in multimodal prompts. These image types are converted client-side to JPEG at 80% quality before being sent to the server, regardless of their original format. This means that you don't need to specify a MIME type when you provide an image inline as in the example above.

For more control over the image format and conversion, you can provide the image as an InlineDataPart and specify the image's MIME type. For example: InlineDataPart(data: Data(/* PNG Data */), mimeType: "image/png").

Kotlin

For Kotlin, the methods in this SDK are suspend functions and need to be called from a coroutine scope.

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
val generativeModel = Firebase.vertexAI.generativeModel("gemini-2.0-flash")

// Loads an image from the app/res/drawable/ directory
val bitmap: Bitmap = BitmapFactory.decodeResource(resources, R.drawable.sparky)

// Provide a prompt that includes the image specified above and text
val prompt = content {
  image(bitmap)
  text("What developer tool is this mascot from?")
}

// To generate text output, call generateContent with the prompt
val response = generativeModel.generateContent(prompt)
print(response.text)

Note: The example above uses a convenience method for handling the platform-native image type (Bitmap) in multimodal prompts. This image type is converted client-side to JPEG at 80% quality before being sent to the server, regardless of its original format. This means that you don't need to specify a MIME type when you provide an image inline as in the example above.

For more control over the image format and conversion, you can provide the image as an InlineDataPart and specify the image's MIME type. For example: content { inlineData(/* PNG as byte array */, "image/png") }.

Java

For Java, the methods in this SDK return a ListenableFuture.

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
GenerativeModel gm = FirebaseVertexAI.getInstance()
        .generativeModel("gemini-2.0-flash");
GenerativeModelFutures model = GenerativeModelFutures.from(gm);

Bitmap bitmap = BitmapFactory.decodeResource(getResources(), R.drawable.sparky);

// Provide a prompt that includes the image specified above and text
Content content = new Content.Builder()
        .addImage(bitmap)
        .addText("What developer tool is this mascot from?")
        .build();

// To generate text output, call generateContent with the prompt
ListenableFuture<GenerateContentResponse> response = model.generateContent(content);
Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
    @Override
    public void onSuccess(GenerateContentResponse result) {
        String resultText = result.getText();
        System.out.println(resultText);
    }

    @Override
    public void onFailure(Throwable t) {
        t.printStackTrace();
    }
}, executor);

Web

import { initializeApp } from "firebase/app";
import { getVertexAI, getGenerativeModel } from "firebase/vertexai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Vertex AI service
const vertexAI = getVertexAI(firebaseApp);

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(vertexAI, { model: "gemini-2.0-flash" });

// Converts a File object to a Part object.
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

async function run() {
  // Provide a text prompt to include with the image
  const prompt = "What's in this picture?";

  const fileInputEl = document.querySelector("input[type=file]");
  const imagePart = await fileToGenerativePart(fileInputEl.files[0]);

  // To generate text output, call generateContent with the text and image
  const result = await model.generateContent([prompt, imagePart]);

  const response = result.response;
  const text = response.text();
  console.log(text);
}

run();
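
The fileToGenerativePart helper above depends on FileReader.readAsDataURL() producing a data URL of the form `data:<MIME type>;base64,<data>`, and it keeps only the part after the comma. The sketch below isolates that step as a pure function so it can be tried outside a browser. The function name `splitDataUrl` and the sample string are made up for illustration.

```javascript
// Hypothetical helper: splits a base64 data URL into the two values an
// inlineData part needs.
function splitDataUrl(dataUrl) {
  const commaIndex = dataUrl.indexOf(",");
  if (!dataUrl.startsWith("data:") || commaIndex === -1) {
    throw new Error("Not a data URL");
  }
  // The header looks like "image/png;base64" once the "data:" prefix is removed.
  const header = dataUrl.slice("data:".length, commaIndex);
  const [mimeType, ...params] = header.split(";");
  if (!params.includes("base64")) {
    throw new Error("Data URL is not base64-encoded");
  }
  return { mimeType, data: dataUrl.slice(commaIndex + 1) };
}

const part = { inlineData: splitDataUrl("data:text/plain;base64,aGVsbG8=") };
console.log(part);
```

Unlike the one-line split in the sample, this version also recovers the MIME type from the data URL and rejects input that is not base64-encoded.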

Dart

import 'dart:io'; // Needed for `File`

import 'package:firebase_vertexai/firebase_vertexai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
final model =
      FirebaseVertexAI.instance.generativeModel(model: 'gemini-2.0-flash');

// Provide a text prompt to include with the image
final prompt = TextPart("What's in the picture?");
// Prepare images for input
final image = await File('image0.jpg').readAsBytes();
final imagePart = InlineDataPart('image/jpeg', image);

// To generate text output, call generateContent with the text and image
final response = await model.generateContent([
  Content.multi([prompt,imagePart])
]);
print(response.text);

Learn how to choose a model and, optionally, a location appropriate for your use case and app.

Generate text from text and multiple images

You can call the Gemini API with multimodal prompts that include both text and multiple files, such as images (as shown in this example).

Make sure to review the requirements and recommendations for input files.

Swift

You can call generateContent() to generate text from a multimodal prompt request that includes text and multiple images:

import FirebaseVertexAI

// Initialize the Vertex AI service
let vertex = VertexAI.vertexAI()

// Create a `GenerativeModel` instance with a model that supports your use case
let model = vertex.generativeModel(modelName: "gemini-2.0-flash")

guard let image1 = UIImage(systemName: "car") else { fatalError() }
guard let image2 = UIImage(systemName: "car.2") else { fatalError() }

// Provide a text prompt to include with the images
let prompt = "What's different between these pictures?"

// To generate text output, call generateContent and pass in the prompt
let response = try await model.generateContent(image1, image2, prompt)
print(response.text ?? "No text in response.")

Kotlin

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
val generativeModel = Firebase.vertexAI.generativeModel("gemini-2.0-flash")

// Loads an image from the app/res/drawable/ directory
val bitmap1: Bitmap = BitmapFactory.decodeResource(resources, R.drawable.sparky)
val bitmap2: Bitmap = BitmapFactory.decodeResource(resources, R.drawable.sparky_eats_pizza)

// Provide a prompt that includes the images specified above and text
val prompt = content {
  image(bitmap1)
  image(bitmap2)
  text("What is different between these pictures?")
}

// To generate text output, call generateContent with the prompt
val response = generativeModel.generateContent(prompt)
print(response.text)

Java

For Java, the methods in this SDK return a ListenableFuture.

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
GenerativeModel gm = FirebaseVertexAI.getInstance()
        .generativeModel("gemini-2.0-flash");
GenerativeModelFutures model = GenerativeModelFutures.from(gm);

Bitmap bitmap1 = BitmapFactory.decodeResource(getResources(), R.drawable.sparky);
Bitmap bitmap2 = BitmapFactory.decodeResource(getResources(), R.drawable.sparky_eats_pizza);

// Provide a prompt that includes the images specified above and text
Content prompt = new Content.Builder()
    .addImage(bitmap1)
    .addImage(bitmap2)
    .addText("What's different between these pictures?")
    .build();

// To generate text output, call generateContent with the prompt
ListenableFuture<GenerateContentResponse> response = model.generateContent(prompt);
Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
    @Override
    public void onSuccess(GenerateContentResponse result) {
        String resultText = result.getText();
        System.out.println(resultText);
    }

    @Override
    public void onFailure(Throwable t) {
        t.printStackTrace();
    }
}, executor);

Web

import { initializeApp } from "firebase/app";
import { getVertexAI, getGenerativeModel } from "firebase/vertexai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Vertex AI service
const vertexAI = getVertexAI(firebaseApp);

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(vertexAI, { model: "gemini-2.0-flash" });

// Converts a File object to a Part object.
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

async function run() {
  // Provide a text prompt to include with the images
  const prompt = "What's different between these pictures?";

  // Prepare images for input
  const fileInputEl = document.querySelector("input[type=file]");
  const imageParts = await Promise.all(
    [...fileInputEl.files].map(fileToGenerativePart)
  );

  // To generate text output, call generateContent with the text and images
  const result = await model.generateContent([prompt, ...imageParts]);

  const response = result.response;
  const text = response.text();
  console.log(text);
}

run();

Dart

You can call generateContent() to generate text from a multimodal prompt request that includes text and multiple images:

import 'dart:io'; // Needed for `File`

import 'package:firebase_vertexai/firebase_vertexai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
final model =
      FirebaseVertexAI.instance.generativeModel(model: 'gemini-2.0-flash');

final (firstImage, secondImage) = await (
  File('image0.jpg').readAsBytes(),
  File('image1.jpg').readAsBytes()
).wait;
// Provide a text prompt to include with the images
final prompt = TextPart("What's different between these pictures?");
// Prepare images for input
final imageParts = [
  InlineDataPart('image/jpeg', firstImage),
  InlineDataPart('image/jpeg', secondImage),
];

// To generate text output, call generateContent with the text and images
final response = await model.generateContent([
  Content.multi([prompt, ...imageParts])
]);
print(response.text);

Generate text from text and a video

You can call the Gemini API with multimodal prompts that include both text and a video file (as shown in this example).

Make sure to review the requirements and recommendations for input files.

Swift

You can call generateContent() to generate text from a multimodal prompt request that includes text and a single video:

import FirebaseVertexAI

// Initialize the Vertex AI service
let vertex = VertexAI.vertexAI()

// Create a `GenerativeModel` instance with a model that supports your use case
let model = vertex.generativeModel(modelName: "gemini-2.0-flash")

// Provide the video as `Data` with the appropriate MIME type.
let video = InlineDataPart(data: try Data(contentsOf: videoURL), mimeType: "video/mp4")

// Provide a text prompt to include with the video
let prompt = "What is in the video?"

// To generate text output, call generateContent with the text and video
let response = try await model.generateContent(video, prompt)
print(response.text ?? "No text in response.")

Kotlin

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
val generativeModel = Firebase.vertexAI.generativeModel("gemini-2.0-flash")

val contentResolver = applicationContext.contentResolver
contentResolver.openInputStream(videoUri).use { stream ->
  stream?.let {
    val bytes = stream.readBytes()

    // Provide a prompt that includes the video specified above and text
    val prompt = content {
        inlineData(bytes, "video/mp4")
        text("What is in the video?")
    }

    // To generate text output, call generateContent with the prompt
    val response = generativeModel.generateContent(prompt)
    Log.d(TAG, response.text ?: "")
  }
}

Java

For Java, the methods in this SDK return a ListenableFuture.

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
GenerativeModel gm = FirebaseVertexAI.getInstance()
        .generativeModel("gemini-2.0-flash");
GenerativeModelFutures model = GenerativeModelFutures.from(gm);

ContentResolver resolver = getApplicationContext().getContentResolver();
try (InputStream stream = resolver.openInputStream(videoUri)) {
    if (stream != null) {
        // Read the whole stream into memory. A single read() call is not
        // guaranteed to fill the buffer, and a content:// URI cannot be
        // converted to a File to look up its length.
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int bytesRead;
        while ((bytesRead = stream.read(chunk)) != -1) {
            buffer.write(chunk, 0, bytesRead);
        }
        byte[] videoBytes = buffer.toByteArray();

        // Provide a prompt that includes the video specified above and text
        Content prompt = new Content.Builder()
                .addInlineData(videoBytes, "video/mp4")
                .addText("What is in the video?")
                .build();

        // To generate text output, call generateContent with the prompt
        ListenableFuture<GenerateContentResponse> response = model.generateContent(prompt);
        Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
            @Override
            public void onSuccess(GenerateContentResponse result) {
                String resultText = result.getText();
                System.out.println(resultText);
            }

            @Override
            public void onFailure(Throwable t) {
                t.printStackTrace();
            }
        }, executor);
    }
} catch (IOException e) {
    e.printStackTrace();
}

Web

import { initializeApp } from "firebase/app";
import { getVertexAI, getGenerativeModel } from "firebase/vertexai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Vertex AI service
const vertexAI = getVertexAI(firebaseApp);

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(vertexAI, { model: "gemini-2.0-flash" });

// Converts a File object to a Part object.
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

async function run() {
  // Provide a text prompt to include with the video
  const prompt = "What do you see?";

  const fileInputEl = document.querySelector("input[type=file]");
  const videoPart = await fileToGenerativePart(fileInputEl.files[0]);

  // To generate text output, call generateContent with the text and video
  const result = await model.generateContent([prompt, videoPart]);

  const response = result.response;
  const text = response.text();
  console.log(text);
}

run();

Dart

import 'dart:io'; // Needed for `File`

import 'package:firebase_vertexai/firebase_vertexai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
final model =
      FirebaseVertexAI.instance.generativeModel(model: 'gemini-2.0-flash');

// Provide a text prompt to include with the video
final prompt = TextPart("What's in the video?");

// Prepare video for input
final video = await File('video0.mp4').readAsBytes();

// Provide the video as an `InlineDataPart` with the appropriate MIME type
final videoPart = InlineDataPart('video/mp4', video);

// To generate text output, call generateContent with the text and video
final response = await model.generateContent([
  Content.multi([prompt, videoPart])
]);
print(response.text);

Stream the response

Before trying these samples, make sure that you've completed the Before you begin section of this guide.

You can achieve faster interactions by not waiting for the entire result of the model generation and instead using streaming to handle partial results. To stream the response, call generateContentStream.
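
To make the idea of handling partial results concrete, the sketch below consumes a stream chunk by chunk while also accumulating the full text. It does not call the SDK: `fakeStream` is a stand-in async generator, and both it and `collectStream` are hypothetical names used only for this illustration.

```javascript
// Stand-in for a streamed response: an async generator that yields text
// chunks with a short delay. This is NOT the SDK; it only mimics the shape
// of consuming a stream with `for await`.
async function* fakeStream(chunks, delayMs = 10) {
  for (const chunk of chunks) {
    await new Promise((resolve) => setTimeout(resolve, delayMs));
    yield chunk;
  }
}

// Handles each partial result as soon as it arrives and also keeps the
// accumulated text for use once the stream ends.
async function collectStream(stream, onChunk) {
  let fullText = "";
  for await (const chunk of stream) {
    fullText += chunk;
    onChunk(chunk, fullText);
  }
  return fullText;
}

const done = collectStream(
  fakeStream(["The image ", "shows ", "a bicycle."]),
  (chunk) => console.log("partial:", chunk)
);
done.then((fullText) => console.log("complete:", fullText));
```

With a real streamed response, the callback is where an app would typically update its UI so that text appears as it is generated.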

View example: Stream generated text from text and a single image

Swift

You can call generateContentStream() to stream generated text from a multimodal prompt request that includes text and a single image:

import FirebaseVertexAI

// Initialize the Vertex AI service
let vertex = VertexAI.vertexAI()

// Create a `GenerativeModel` instance with a model that supports your use case
let model = vertex.generativeModel(modelName: "gemini-2.0-flash")

guard let image = UIImage(systemName: "bicycle") else { fatalError() }

// Provide a text prompt to include with the image
let prompt = "What's in this picture?"

// To stream generated text output, call generateContentStream and pass in the prompt
let contentStream = try model.generateContentStream(image, prompt)
for try await chunk in contentStream {
  if let text = chunk.text {
    print(text)
  }
}

Kotlin

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
val generativeModel = Firebase.vertexAI.generativeModel("gemini-2.0-flash")

// Loads an image from the app/res/drawable/ directory
val bitmap: Bitmap = BitmapFactory.decodeResource(resources, R.drawable.sparky)

// Provide a prompt that includes the image specified above and text
val prompt = content {
  image(bitmap)
  text("What developer tool is this mascot from?")
}

// To stream generated text output, call generateContentStream with the prompt
var fullResponse = ""
generativeModel.generateContentStream(prompt).collect { chunk ->
  print(chunk.text)
  fullResponse += chunk.text
}

Java

For Java, the streaming methods in this SDK return a Publisher type from the Reactive Streams library.

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
GenerativeModel gm = FirebaseVertexAI.getInstance()
        .generativeModel("gemini-2.0-flash");
GenerativeModelFutures model = GenerativeModelFutures.from(gm);

Bitmap bitmap = BitmapFactory.decodeResource(getResources(), R.drawable.sparky);

// Provide a prompt that includes the image specified above and text
Content prompt = new Content.Builder()
        .addImage(bitmap)
        .addText("What developer tool is this mascot from?")
        .build();

// To stream generated text output, call generateContentStream with the prompt
Publisher<GenerateContentResponse> streamingResponse = model.generateContentStream(prompt);

final String[] fullResponse = {""};

streamingResponse.subscribe(new Subscriber<GenerateContentResponse>() {
    @Override
    public void onNext(GenerateContentResponse generateContentResponse) {
        String chunk = generateContentResponse.getText();
        fullResponse[0] += chunk;
    }

    @Override
    public void onComplete() {
        System.out.println(fullResponse[0]);
    }

    @Override
    public void onError(Throwable t) {
        t.printStackTrace();
    }

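    @Override
    public void onSubscribe(Subscription s) {
        // Signal demand; a Reactive Streams publisher emits nothing until requested
        s.request(Long.MAX_VALUE);
    }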
});

Web

import { initializeApp } from "firebase/app";
import { getVertexAI, getGenerativeModel } from "firebase/vertexai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Vertex AI service
const vertexAI = getVertexAI(firebaseApp);

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(vertexAI, { model: "gemini-2.0-flash" });

// Converts a File object to a Part object.
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

async function run() {
  // Provide a text prompt to include with the image
  const prompt = "What do you see?";

  // Prepare image for input
  const fileInputEl = document.querySelector("input[type=file]");
  const imagePart = await fileToGenerativePart(fileInputEl.files[0]);

  // To stream generated text output, call generateContentStream with the text and image
  const result = await model.generateContentStream([prompt, imagePart]);

  for await (const chunk of result.stream) {
    const chunkText = chunk.text();
    console.log(chunkText);
  }
}

run();

Dart

import 'dart:io'; // Needed for `File`

import 'package:firebase_vertexai/firebase_vertexai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
final model =
      FirebaseVertexAI.instance.generativeModel(model: 'gemini-2.0-flash');

// Provide a text prompt to include with the image
final prompt = TextPart("What's in the picture?");
// Prepare images for input
final image = await File('image0.jpg').readAsBytes();
final imagePart = InlineDataPart('image/jpeg', image);

// To stream generated text output, call generateContentStream with the text and image
final response = await model.generateContentStream([
  Content.multi([prompt,imagePart])
]);
await for (final chunk in response) {
  print(chunk.text);
}

View example: Stream generated text from text and multiple images

Swift

You can call generateContentStream() to stream generated text from a multimodal prompt request that includes text and multiple images:

import FirebaseVertexAI

// Initialize the Vertex AI service
let vertex = VertexAI.vertexAI()

// Create a `GenerativeModel` instance with a model that supports your use case
let model = vertex.generativeModel(modelName: "gemini-2.0-flash")

guard let image1 = UIImage(systemName: "car") else { fatalError() }
guard let image2 = UIImage(systemName: "car.2") else { fatalError() }

// Provide a text prompt to include with the images
let prompt = "What's different between these pictures?"

// To stream generated text output, call generateContentStream and pass in the prompt
let contentStream = try model.generateContentStream(image1, image2, prompt)
for try await chunk in contentStream {
  if let text = chunk.text {
    print(text)
  }
}

Kotlin

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
val generativeModel = Firebase.vertexAI.generativeModel("gemini-2.0-flash")

// Loads an image from the app/res/drawable/ directory
val bitmap1: Bitmap = BitmapFactory.decodeResource(resources, R.drawable.sparky)
val bitmap2: Bitmap = BitmapFactory.decodeResource(resources, R.drawable.sparky_eats_pizza)

// Provide a prompt that includes the images specified above and text
val prompt = content {
    image(bitmap1)
    image(bitmap2)
    text("What's different between these pictures?")
}

// To stream generated text output, call generateContentStream with the prompt
var fullResponse = ""
generativeModel.generateContentStream(prompt).collect { chunk ->
  print(chunk.text)
  fullResponse += chunk.text
}

Java

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
GenerativeModel gm = FirebaseVertexAI.getInstance()
        .generativeModel("gemini-2.0-flash");
GenerativeModelFutures model = GenerativeModelFutures.from(gm);

Bitmap bitmap1 = BitmapFactory.decodeResource(getResources(), R.drawable.sparky);
Bitmap bitmap2 = BitmapFactory.decodeResource(getResources(), R.drawable.sparky_eats_pizza);

// Provide a prompt that includes the images specified above and text
Content prompt = new Content.Builder()
    .addImage(bitmap1)
    .addImage(bitmap2)
    .addText("What's different between these pictures?")
    .build();

// To stream generated text output, call generateContentStream with the prompt
Publisher<GenerateContentResponse> streamingResponse = model.generateContentStream(prompt);

final String[] fullResponse = {""};

streamingResponse.subscribe(new Subscriber<GenerateContentResponse>() {
    @Override
    public void onNext(GenerateContentResponse generateContentResponse) {
        String chunk = generateContentResponse.getText();
        fullResponse[0] += chunk;
    }

    @Override
    public void onComplete() {
        System.out.println(fullResponse[0]);
    }

    @Override
    public void onError(Throwable t) {
        t.printStackTrace();
    }

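    @Override
    public void onSubscribe(Subscription s) {
        // Signal demand; a Reactive Streams publisher emits nothing until requested
        s.request(Long.MAX_VALUE);
    }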
});

Web

import { initializeApp } from "firebase/app";
import { getVertexAI, getGenerativeModel } from "firebase/vertexai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Vertex AI service
const vertexAI = getVertexAI(firebaseApp);

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(vertexAI, { model: "gemini-2.0-flash" });

// Converts a File object to a Part object.
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

async function run() {
  // Provide a text prompt to include with the images
  const prompt = "What's different between these pictures?";

  const fileInputEl = document.querySelector("input[type=file]");
  const imageParts = await Promise.all(
    [...fileInputEl.files].map(fileToGenerativePart)
  );

  // To stream generated text output, call generateContentStream with the text and images
  const result = await model.generateContentStream([prompt, ...imageParts]);

  for await (const chunk of result.stream) {
    const chunkText = chunk.text();
    console.log(chunkText);
  }
}

run();
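
The Kotlin and Java samples above accumulate the streamed chunks into a full response, while the Web sample only logs each chunk. A minimal sketch of the same accumulation for Web follows. `collectStreamText` and `onChunk` are hypothetical names introduced for illustration (they are not part of the Firebase SDK), and the helper only assumes that each chunk exposes a `text()` method, as in the sample above.

```javascript
// Hypothetical helper (not part of the Firebase SDK): accumulates streamed
// chunks into one string and optionally reports each chunk as it arrives.
async function collectStreamText(stream, onChunk = () => {}) {
  let fullResponse = "";
  for await (const chunk of stream) {
    // Assumes each chunk exposes text(), as in the sample above.
    const chunkText = chunk.text();
    onChunk(chunkText);
    fullResponse += chunkText;
  }
  return fullResponse;
}
```

With the sample above, this could be called as `const fullText = await collectStreamText(result.stream, console.log);` inside `run()`.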

Dart

This example shows how to use generateContentStream to stream generated text from a multimodal prompt request that includes text and multiple images:

import 'dart:io'; // Needed for File

import 'package:firebase_vertexai/firebase_vertexai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
final model =
      FirebaseVertexAI.instance.generativeModel(model: 'gemini-2.0-flash');

final (firstImage, secondImage) = await (
  File('image0.jpg').readAsBytes(),
  File('image1.jpg').readAsBytes()
).wait;
// Provide a text prompt to include with the images
final prompt = TextPart("What's different between these pictures?");
// Prepare images for input
final imageParts = [
  InlineDataPart('image/jpeg', firstImage),
  InlineDataPart('image/jpeg', secondImage),
];

// To stream generated text output, call generateContentStream with the text and images
final response = await model.generateContentStream([
  Content.multi([prompt, ...imageParts])
]);
await for (final chunk in response) {
  print(chunk.text);
}

Example: Stream generated text from text and a video

Swift

You can call generateContentStream() to stream generated text from a multimodal prompt request that includes text and a single video:

import FirebaseVertexAI

// Initialize the Vertex AI service
let vertex = VertexAI.vertexAI()

// Create a `GenerativeModel` instance with a model that supports your use case
let model = vertex.generativeModel(modelName: "gemini-2.0-flash")

// Provide the video as `Data` with the appropriate MIME type
let video = InlineDataPart(data: try Data(contentsOf: videoURL), mimeType: "video/mp4")

// Provide a text prompt to include with the video
let prompt = "What is in the video?"

// To stream generated text output, call generateContentStream with the text and video
let contentStream = try model.generateContentStream(video, prompt)
for try await chunk in contentStream {
  if let text = chunk.text {
    print(text)
  }
}

Kotlin

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
val generativeModel = Firebase.vertexAI.generativeModel("gemini-2.0-flash")

val contentResolver = applicationContext.contentResolver
contentResolver.openInputStream(videoUri).use { stream ->
  stream?.let {
    val bytes = stream.readBytes()

    // Provide a prompt that includes the video specified above and text
    val prompt = content {
        inlineData(bytes, "video/mp4")
        text("What is in the video?")
    }

    // To stream generated text output, call generateContentStream with the prompt
    var fullResponse = ""
    generativeModel.generateContentStream(prompt).collect { chunk ->
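        // chunk.text is nullable; fall back to an empty string so "null" is never appended
        val chunkText = chunk.text ?: ""
        Log.d(TAG, chunkText)
        fullResponse += chunkText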
    }
  }
}

Java

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
GenerativeModel gm = FirebaseVertexAI.getInstance()
        .generativeModel("gemini-2.0-flash");
GenerativeModelFutures model = GenerativeModelFutures.from(gm);

ContentResolver resolver = getApplicationContext().getContentResolver();
try (InputStream stream = resolver.openInputStream(videoUri)) {
    if (stream != null) {
        // Read the whole stream: a single read() call may return fewer bytes
        // than requested, and a content:// Uri cannot be opened as a File
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] readBuffer = new byte[8192];
        int bytesRead;
        while ((bytesRead = stream.read(readBuffer)) != -1) {
            buffer.write(readBuffer, 0, bytesRead);
        }
        byte[] videoBytes = buffer.toByteArray();

        // Provide a prompt that includes the video specified above and text
        Content prompt = new Content.Builder()
                .addInlineData(videoBytes, "video/mp4")
                .addText("What is in the video?")
                .build();

        // To stream generated text output, call generateContentStream with the prompt
        Publisher<GenerateContentResponse> streamingResponse =
                model.generateContentStream(prompt);

        final String[] fullResponse = {""};

        streamingResponse.subscribe(new Subscriber<GenerateContentResponse>() {
            @Override
            public void onNext(GenerateContentResponse generateContentResponse) {
                // getText() can return null for a chunk that carries no text
                String chunk = generateContentResponse.getText();
                if (chunk != null) {
                    fullResponse[0] += chunk;
                }
            }

            @Override
            public void onComplete() {
                System.out.println(fullResponse[0]);
            }

            @Override
            public void onError(Throwable t) {
                t.printStackTrace();
            }

            @Override
            public void onSubscribe(Subscription s) {
                // A Reactive Streams publisher emits nothing until demand is signaled
                s.request(Long.MAX_VALUE);
            }
        });
    }
} catch (IOException e) {
    e.printStackTrace();
}

Web

import { initializeApp } from "firebase/app";
import { getVertexAI, getGenerativeModel } from "firebase/vertexai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Vertex AI service
const vertexAI = getVertexAI(firebaseApp);

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(vertexAI, { model: "gemini-2.0-flash" });

// Converts a File object to a Part object.
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

async function run() {
  // Provide a text prompt to include with the video
  const prompt = "What do you see?";

  const fileInputEl = document.querySelector("input[type=file]");
  const videoPart = await fileToGenerativePart(fileInputEl.files[0]);

  // To stream generated text output, call generateContentStream with the text and video
  const result = await model.generateContentStream([prompt, videoPart]);

  for await (const chunk of result.stream) {
    const chunkText = chunk.text();
    console.log(chunkText);
  }
}

run();
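
`fileToGenerativePart` relies on `FileReader.readAsDataURL`, which produces a string of the form `data:<mimeType>;base64,<data>`; the part after the comma is the base64 payload passed as `inlineData.data`. The sketch below isolates that step as a pure function so it can be checked without a browser. `dataUrlToInlineDataPart` is a hypothetical name introduced for illustration, and the function assumes a base64-encoded data URL.

```javascript
// Hypothetical helper (not part of the Firebase SDK): converts a base64
// data URL into an object shaped like the inlineData part used above.
function dataUrlToInlineDataPart(dataUrl) {
  const commaIndex = dataUrl.indexOf(",");
  if (!dataUrl.startsWith("data:") || commaIndex === -1) {
    throw new Error("Not a data URL");
  }
  // The header looks like "video/mp4;base64" once "data:" is removed.
  const header = dataUrl.slice("data:".length, commaIndex);
  const [mimeType, ...params] = header.split(";");
  if (!params.includes("base64")) {
    throw new Error("Expected a base64-encoded data URL");
  }
  // Everything after the first comma is the base64 payload.
  return { inlineData: { data: dataUrl.slice(commaIndex + 1), mimeType } };
}
```

Unlike `reader.result.split(',')[1]`, this variant also reads the MIME type from the data URL itself and rejects input that is not base64-encoded.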

Dart

import 'dart:io'; // Needed for File

import 'package:firebase_vertexai/firebase_vertexai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
final model =
      FirebaseVertexAI.instance.generativeModel(model: 'gemini-2.0-flash');

// Provide a text prompt to include with the video
final prompt = TextPart("What's in the video?");

// Prepare video for input
final video = await File('video0.mp4').readAsBytes();

// Provide the video as bytes with the appropriate MIME type
final videoPart = InlineDataPart('video/mp4', video);

// To stream generated text output, call generateContentStream with the text and video
final response = await model.generateContentStream([
  Content.multi([prompt, videoPart])
]);
await for (final chunk in response) {
  print(chunk.text);
}

Requirements and recommendations for input files

See Supported input files and requirements for the Gemini API in Vertex AI to learn about the following:

Different options for providing a file in a request
Supported file types
Supported MIME types and how to specify them
Requirements and best practices for files and multimodal requests

Important: For Vertex AI in Firebase SDKs, the maximum request size is 20 MB. If a request is too large, you get an HTTP 413 error.

If a file would push the total request size over 20 MB, you must provide it by URL instead, for example a Cloud Storage for Firebase URL. Smaller files can usually be passed directly as inline data (as shown in the examples above). Note, however, that a file provided as inline data is base64-encoded in transit, which increases the size of the request.
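
Because base64 encodes every 3 bytes as 4 characters, an inline file adds roughly a third more to the request than its size on disk. A rough pre-flight check can be sketched as follows. The function names are hypothetical, the 20 MB limit is taken here as 20 × 1024 × 1024 bytes (an assumption about how the limit is counted), and the estimate ignores the prompt text and any other request overhead.

```javascript
// Assumption: the 20 MB limit is interpreted as 20 * 1024 * 1024 bytes.
const MAX_REQUEST_BYTES = 20 * 1024 * 1024;

// Length of the padded base64 encoding of `byteLength` raw bytes.
function base64EncodedSize(byteLength) {
  return 4 * Math.ceil(byteLength / 3);
}

// Returns true when the files, once base64-encoded, stay under the limit.
// `fileSizes` is an array of raw file sizes in bytes.
function fitsInline(fileSizes, maxBytes = MAX_REQUEST_BYTES) {
  const total = fileSizes.reduce(
    (sum, size) => sum + base64EncodedSize(size),
    0
  );
  return total < maxBytes;
}
```

In the browser, `fitsInline([...fileInputEl.files].map((f) => f.size))` would apply this check to the selected files. Under these assumptions, a single 15 MB file already reaches the limit once encoded, so it is a candidate for a Cloud Storage for Firebase URL rather than inline data.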

What else can you do?

Learn how to count tokens before sending long prompts to the model.
Set up Cloud Storage for Firebase so that you can include large files in your multimodal requests and have a more managed solution for providing files in prompts. Files can include images, PDFs, video, and audio.
Start preparing for production, including setting up Firebase App Check to protect the Gemini API from abuse by unauthorized clients. Also, make sure to review the production checklist.

Try out other capabilities

Build multi-turn conversations (chat).
Generate text from text-only prompts.
Generate structured output (like JSON) from both text and multimodal prompts.
Generate images from text prompts.
Use function calling to connect generative models to external systems and information.

Learn how to control content generation

Understand prompt design, including best practices, strategies, and example prompts.
Configure model parameters like temperature and maximum output tokens (for Gemini) or aspect ratio and person generation (for Imagen).
Use safety settings to reduce the likelihood of getting responses that could be harmful.

You can also experiment with prompts and model configurations using Vertex AI Studio.

Learn more about the supported models

Learn about the models available for various use cases and their quotas and pricing.

Give feedback about your experience with Vertex AI in Firebase