Firebase is back at Google I/O on May 20-21! Register now.

এই পৃষ্ঠাটি Cloud Translation API অনুবাদ করেছে।

Gemini API ব্যবহার করে মাল্টিমোডাল প্রম্পট থেকে পাঠ্য তৈরি করুন, Gemini API ব্যবহার করে মাল্টিমোডাল প্রম্পট থেকে পাঠ্য তৈরি করুন
সেভ করা পৃষ্ঠা গুছিয়ে রাখতে 'সংগ্রহ' ব্যবহার করুন আপনার পছন্দ অনুযায়ী কন্টেন্ট সেভ করুন ও সঠিক বিভাগে রাখুন।

Firebase SDK-তে Vertex AI ব্যবহার করে আপনার অ্যাপ থেকে Gemini API কল করার সময়, আপনি একটি মাল্টিমডাল ইনপুটের উপর ভিত্তি করে টেক্সট তৈরি করতে জেমিনি মডেলকে অনুরোধ করতে পারেন। মাল্টিমোডাল প্রম্পটে একাধিক পদ্ধতি (বা ইনপুটের প্রকার) অন্তর্ভুক্ত থাকতে পারে, যেমন চিত্র সহ পাঠ্য, পিডিএফ, প্লেইন-টেক্সট ফাইল, ভিডিও এবং অডিও।

প্রতিটি মাল্টিমোডাল অনুরোধে, আপনাকে সর্বদা নিম্নলিখিতগুলি প্রদান করতে হবে:

ফাইলের mimeType . প্রতিটি ইনপুট ফাইলের সমর্থিত MIME প্রকারগুলি সম্পর্কে জানুন৷
ফাইল। আপনি ফাইলটিকে ইনলাইন ডেটা হিসাবে প্রদান করতে পারেন (যেমন এই পৃষ্ঠায় দেখানো হয়েছে) অথবা এর URL বা URI ব্যবহার করে৷

মাল্টিমোডাল প্রম্পটে পরীক্ষা এবং পুনরাবৃত্তি করার জন্য, আমরা Vertex AI Studio ব্যবহার করার পরামর্শ দিই।

Gemini API এর সাথে কাজ করার জন্য অন্যান্য বিকল্প

ঐচ্ছিকভাবে Gemini API- এর একটি বিকল্প " Google AI " সংস্করণ নিয়ে পরীক্ষা করুন৷
Google AI স্টুডিও এবং Google AI ক্লায়েন্ট SDK ব্যবহার করে বিনামূল্যে অ্যাক্সেস পান (সীমার মধ্যে এবং যেখানে উপলব্ধ)। এই SDKগুলি শুধুমাত্র মোবাইল এবং ওয়েব অ্যাপে প্রোটোটাইপ করার জন্য ব্যবহার করা উচিত৷
একটি Gemini API কীভাবে কাজ করে সে সম্পর্কে আপনি পরিচিত হওয়ার পরে, Firebase SDK-এ আমাদের Vertex AI- তে স্থানান্তর করুন (এই ডকুমেন্টেশন), যেটিতে মোবাইল এবং ওয়েব অ্যাপের জন্য গুরুত্বপূর্ণ অনেক অতিরিক্ত বৈশিষ্ট্য রয়েছে, যেমন Firebase App Check ব্যবহার করে API-কে অপব্যবহার থেকে রক্ষা করা এবং অনুরোধে বড় মিডিয়া ফাইলগুলির জন্য সমর্থন।
ঐচ্ছিকভাবে জেমিনি API-কে Vertex AI সার্ভার-সাইডে কল করুন (যেমন Python, Node.js, বা Go সহ)
Gemini API-এর Firebase Extensions সার্ভার-সাইড Vertex AI SDKs , Genkit বা Firebase এক্সটেনশনগুলি ব্যবহার করুন।

আপনি শুরু করার আগে

যদি আপনি ইতিমধ্যে না করে থাকেন, শুরু করার নির্দেশিকাটি সম্পূর্ণ করুন, যা বর্ণনা করে যে কীভাবে আপনার ফায়ারবেস প্রকল্প সেট আপ করবেন, আপনার অ্যাপকে ফায়ারবেসের সাথে সংযুক্ত করবেন, SDK যোগ করবেন, Vertex AI পরিষেবা শুরু করবেন এবং একটি GenerativeModel উদাহরণ তৈরি করবেন।

পাঠ্য এবং একটি একক চিত্র থেকে পাঠ্য তৈরি করুন পাঠ্য এবং একাধিক চিত্র থেকে পাঠ্য তৈরি করুন পাঠ্য এবং একটি ভিডিও থেকে পাঠ্য তৈরি করুন

গুরুত্বপূর্ণ: এই পৃষ্ঠার উদাহরণগুলি দেখায় কিভাবে অনুরোধে ইনলাইন ডেটা হিসাবে ছোট ফাইলগুলি অন্তর্ভুক্ত করতে হয়৷ যাইহোক, যদি আপনি এমন ফাইলগুলি অন্তর্ভুক্ত করতে চান যা আপনার মোট অনুরোধের আকার 20 MB ছাড়িয়ে যাবে, তাহলে আপনাকে একটি URL ব্যবহার করে ফাইলটি প্রদান করতে হবে (উদাহরণস্বরূপ, Cloud Storage for Firebase ব্যবহার করে)।

নমুনা মিডিয়া ফাইল

আপনার যদি ইতিমধ্যেই মিডিয়া ফাইল না থাকে, তাহলে আপনি নিম্নলিখিত সর্বজনীনভাবে উপলব্ধ ফাইলগুলি ব্যবহার করতে পারেন৷ যেহেতু এই ফাইলগুলি আপনার ফায়ারবেস প্রকল্পে নেই এমন বালতিতে সংরক্ষণ করা হয়েছে, তাই আপনাকে URL-এর জন্য https://storage.googleapis.com/ BUCKET_NAME/PATH/TO/FILE ফর্ম্যাট ব্যবহার করতে হবে৷

ছবি : https://storage.googleapis.com/cloud-samples-data/generative-ai/image/scones.jpg একটি MIME ধরনের image/jpeg সহ। এই ছবিটি দেখুন বা ডাউনলোড করুন।
PDF : https://storage.googleapis.com/cloud-samples-data/generative-ai/pdf/2403.05530.pdf একটি MIME ধরনের application/pdf সহ। এই PDF দেখুন বা ডাউনলোড করুন.
ভিডিও : https://storage.googleapis.com/cloud-samples-data/video/animals.mp4 একটি MIME ধরনের video/mp4 সহ। এই ভিডিওটি দেখুন বা ডাউনলোড করুন।
অডিও : https://storage.googleapis.com/cloud-samples-data/generative-ai/audio/pixel.mp3 একটি MIME ধরনের audio/mp3 সহ। এই অডিওটি শুনুন বা ডাউনলোড করুন।

পাঠ্য এবং একটি একক চিত্র থেকে পাঠ্য তৈরি করুন

এই নমুনা চেষ্টা করার আগে নিশ্চিত করুন যে আপনি এই গাইডের শুরু করার আগে বিভাগটি সম্পূর্ণ করেছেন।

আপনি মাল্টিমোডাল প্রম্পট সহ জেমিনি API কল করতে পারেন যাতে পাঠ্য এবং একটি একক ফাইল উভয়ই অন্তর্ভুক্ত থাকে (যেমন একটি চিত্র, যেমন এই উদাহরণে দেখানো হয়েছে)।

ইনপুট ফাইলের জন্য প্রয়োজনীয়তা এবং সুপারিশ পর্যালোচনা করতে ভুলবেন না।

সুইফট

আপনি একটি মাল্টিমোডাল প্রম্পট অনুরোধ থেকে পাঠ্য তৈরি করতে generateContent() কল করতে পারেন যাতে পাঠ্য এবং একটি একক চিত্র অন্তর্ভুক্ত থাকে:

import FirebaseVertexAI

// Initialize the Vertex AI service
let vertex = VertexAI.vertexAI()

// Create a `GenerativeModel` instance with a model that supports your use case
let model = vertex.generativeModel(modelName: "gemini-2.0-flash")

guard let image = UIImage(systemName: "bicycle") else { fatalError() }

// Provide a text prompt to include with the image
let prompt = "What's in this picture?"

// To generate text output, call generateContent and pass in the prompt
let response = try await model.generateContent(image, prompt)
print(response.text ?? "No text in response.")

দ্রষ্টব্য : উপরের উদাহরণটি মাল্টি-মোডাল প্রম্পটে প্ল্যাটফর্ম-নেটিভ ইমেজ প্রকারগুলি ( UIImage , NSImage , CIImage , এবং CGImage ) পরিচালনা করার একটি সরলীকৃত উপায়ের সুবিধা নেয়। এই ধরনের চিত্রগুলি (তাদের আসল বিন্যাস নির্বিশেষে) সার্ভারে পাঠানোর আগে 80% গুণমানে ক্লায়েন্ট-সাইডে JPEG-তে রূপান্তরিত হয়। এর মানে হল যে আপনি উপরের উদাহরণের মত ইনলাইনে ছবি প্রদান করলে, আপনাকে MIME প্রকার নির্দিষ্ট করতে হবে না।

ইমেজ ফরম্যাট এবং রূপান্তরগুলির উপর আরও নিয়ন্ত্রণের জন্য, আপনি একটি InlineDataPart হিসাবে ছবিগুলি প্রদান করতে পারেন এবং নির্দিষ্ট MIME প্রকার সরবরাহ করতে পারেন৷ যেমন: InlineDataPart(data: Data(/* PNG Data */), mimeType: "image/png") ।

Kotlin

^{কোটলিনের জন্য, এই SDK-এর পদ্ধতিগুলি হল সাসপেন্ড ফাংশন এবং একটি Coroutine স্কোপ থেকে কল করা প্রয়োজন৷}

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
val generativeModel = Firebase.vertexAI.generativeModel("gemini-2.0-flash")

// Loads an image from the app/res/drawable/ directory
val bitmap: Bitmap = BitmapFactory.decodeResource(resources, R.drawable.sparky)

// Provide a prompt that includes the image specified above and text
val prompt = content {
  image(bitmap)
  text("What developer tool is this mascot from?")
}

// To generate text output, call generateContent with the prompt
val response = generativeModel.generateContent(prompt)
print(response.text)

দ্রষ্টব্য : উপরের উদাহরণটি মাল্টি-মোডাল প্রম্পটে প্ল্যাটফর্ম-নেটিভ ইমেজ টাইপ ( Bitmap ) পরিচালনা করার একটি সরলীকৃত উপায়ের সুবিধা নেয়। এই ধরনের চিত্রগুলি (তাদের আসল বিন্যাস নির্বিশেষে) সার্ভারে পাঠানোর আগে 80% গুণমানে ক্লায়েন্ট-সাইডে JPEG-তে রূপান্তরিত হয়। এর মানে হল যে আপনি উপরের উদাহরণের মত ইনলাইনে ছবি প্রদান করলে, আপনাকে MIME প্রকার নির্দিষ্ট করতে হবে না।

ইমেজ ফরম্যাট এবং রূপান্তরগুলির উপর আরও নিয়ন্ত্রণের জন্য, আপনি একটি InlineDataPart হিসাবে ছবিগুলি প্রদান করতে পারেন এবং নির্দিষ্ট MIME প্রকার সরবরাহ করতে পারেন৷ যেমন: content { inlineData(/* PNG as byte array */, "image/png") } ।

Java

^{জাভার জন্য, এই SDK-এর পদ্ধতিগুলি একটি ListenableFuture প্রদান করে।}

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
GenerativeModel gm = FirebaseVertexAI.getInstance()
        .generativeModel("gemini-2.0-flash");
GenerativeModelFutures model = GenerativeModelFutures.from(gm);

Bitmap bitmap = BitmapFactory.decodeResource(getResources(), R.drawable.sparky);

// Provide a prompt that includes the image specified above and text
Content content = new Content.Builder()
        .addImage(bitmap)
        .addText("What developer tool is this mascot from?")
        .build();

// To generate text output, call generateContent with the prompt
ListenableFuture<GenerateContentResponse> response = model.generateContent(content);
Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
    @Override
    public void onSuccess(GenerateContentResponse result) {
        String resultText = result.getText();
        System.out.println(resultText);
    }

    @Override
    public void onFailure(Throwable t) {
        t.printStackTrace();
    }
}, executor);

Web

import { initializeApp } from "firebase/app";
import { getVertexAI, getGenerativeModel } from "firebase/vertexai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Vertex AI service
const vertexAI = getVertexAI(firebaseApp);

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(vertexAI, { model: "gemini-2.0-flash" });

// Converts a File object to a Part object.
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

async function run() {
  // Provide a text prompt to include with the image
  const prompt = "What's different between these pictures?";

  const fileInputEl = document.querySelector("input[type=file]");
  const imagePart = await fileToGenerativePart(fileInputEl.files[0]);

  // To generate text output, call generateContent with the text and image
  const result = await model.generateContent([prompt, imagePart]);

  const response = result.response;
  const text = response.text();
  console.log(text);
}

run();

Dart

import 'package:firebase_vertexai/firebase_vertexai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
final model =
      FirebaseVertexAI.instance.generativeModel(model: 'gemini-2.0-flash');

// Provide a text prompt to include with the image
final prompt = TextPart("What's in the picture?");
// Prepare images for input
final image = await File('image0.jpg').readAsBytes();
final imagePart = InlineDataPart('image/jpeg', image);

// To generate text output, call generateContent with the text and image
final response = await model.generateContent([
  Content.multi([prompt,imagePart])
]);
print(response.text);

আপনার ব্যবহারের ক্ষেত্রে এবং অ্যাপের জন্য উপযুক্ত একটি মডেল এবং ঐচ্ছিকভাবে একটি অবস্থান কীভাবে চয়ন করবেন তা শিখুন।

পাঠ্য এবং একাধিক চিত্র থেকে পাঠ্য তৈরি করুন

আপনি মাল্টিমোডাল প্রম্পট সহ জেমিনি API কল করতে পারেন যাতে পাঠ্য এবং একাধিক ফাইল উভয়ই অন্তর্ভুক্ত থাকে (যেমন চিত্রগুলি, যেমন এই উদাহরণে দেখানো হয়েছে)।

ইনপুট ফাইলের জন্য প্রয়োজনীয়তা এবং সুপারিশ পর্যালোচনা করতে ভুলবেন না।

সুইফট

আপনি একটি মাল্টিমডাল প্রম্পট অনুরোধ থেকে পাঠ্য তৈরি করতে generateContent() কল করতে পারেন যাতে পাঠ্য এবং একাধিক চিত্র রয়েছে:

import FirebaseVertexAI

// Initialize the Vertex AI service
let vertex = VertexAI.vertexAI()

// Create a `GenerativeModel` instance with a model that supports your use case
let model = vertex.generativeModel(modelName: "gemini-2.0-flash")

guard let image1 = UIImage(systemName: "car") else { fatalError() }
guard let image2 = UIImage(systemName: "car.2") else { fatalError() }

// Provide a text prompt to include with the images
let prompt = "What's different between these pictures?"

// To generate text output, call generateContent and pass in the prompt
let response = try await model.generateContent(image1, image2, prompt)
print(response.text ?? "No text in response.")

Kotlin

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
val generativeModel = Firebase.vertexAI.generativeModel("gemini-2.0-flash")

// Loads an image from the app/res/drawable/ directory
val bitmap1: Bitmap = BitmapFactory.decodeResource(resources, R.drawable.sparky)
val bitmap2: Bitmap = BitmapFactory.decodeResource(resources, R.drawable.sparky_eats_pizza)

// Provide a prompt that includes the images specified above and text
val prompt = content {
  image(bitmap1)
  image(bitmap2)
  text("What is different between these pictures?")
}

// To generate text output, call generateContent with the prompt
val response = generativeModel.generateContent(prompt)
print(response.text)

Java

^{জাভার জন্য, এই SDK-এর পদ্ধতিগুলি একটি ListenableFuture প্রদান করে।}

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
GenerativeModel gm = FirebaseVertexAI.getInstance()
        .generativeModel("gemini-2.0-flash");
GenerativeModelFutures model = GenerativeModelFutures.from(gm);

Bitmap bitmap1 = BitmapFactory.decodeResource(getResources(), R.drawable.sparky);
Bitmap bitmap2 = BitmapFactory.decodeResource(getResources(), R.drawable.sparky_eats_pizza);

// Provide a prompt that includes the images specified above and text
Content prompt = new Content.Builder()
    .addImage(bitmap1)
    .addImage(bitmap2)
    .addText("What's different between these pictures?")
    .build();

// To generate text output, call generateContent with the prompt
ListenableFuture<GenerateContentResponse> response = model.generateContent(prompt);
Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
    @Override
    public void onSuccess(GenerateContentResponse result) {
        String resultText = result.getText();
        System.out.println(resultText);
    }

    @Override
    public void onFailure(Throwable t) {
        t.printStackTrace();
    }
}, executor);

Web

import { initializeApp } from "firebase/app";
import { getVertexAI, getGenerativeModel } from "firebase/vertexai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Vertex AI service
const vertexAI = getVertexAI(firebaseApp);

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(vertexAI, { model: "gemini-2.0-flash" });

// Converts a File object to a Part object.
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

async function run() {
  // Provide a text prompt to include with the images
  const prompt = "What's different between these pictures?";

  // Prepare images for input
  const fileInputEl = document.querySelector("input[type=file]");
  const imageParts = await Promise.all(
    [...fileInputEl.files].map(fileToGenerativePart)
  );

  // To generate text output, call generateContent with the text and images
  const result = await model.generateContent([prompt, ...imageParts]);

  const response = result.response;
  const text = response.text();
  console.log(text);
}

run();

Dart

import 'package:firebase_vertexai/firebase_vertexai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
final model =
      FirebaseVertexAI.instance.generativeModel(model: 'gemini-2.0-flash');

final (firstImage, secondImage) = await (
  File('image0.jpg').readAsBytes(),
  File('image1.jpg').readAsBytes()
).wait;
// Provide a text prompt to include with the images
final prompt = TextPart("What's different between these pictures?");
// Prepare images for input
final imageParts = [
  InlineDataPart('image/jpeg', firstImage),
  InlineDataPart('image/jpeg', secondImage),
];

// To generate text output, call generateContent with the text and images
final response = await model.generateContent([
  Content.multi([prompt, ...imageParts])
]);
print(response.text);

পাঠ্য এবং একটি ভিডিও থেকে পাঠ্য তৈরি করুন

আপনি মাল্টিমোডাল প্রম্পট সহ জেমিনি API কল করতে পারেন যাতে পাঠ্য এবং ভিডিও ফাইল(গুলি) উভয়ই অন্তর্ভুক্ত থাকে (যেমন এই উদাহরণে দেখানো হয়েছে)।

ইনপুট ফাইলের জন্য প্রয়োজনীয়তা এবং সুপারিশ পর্যালোচনা করতে ভুলবেন না।

সুইফট

আপনি একটি মাল্টিমোডাল প্রম্পট অনুরোধ থেকে পাঠ্য তৈরি করতে generateContent() কল করতে পারেন যাতে পাঠ্য এবং একটি একক ভিডিও রয়েছে:

import FirebaseVertexAI

// Initialize the Vertex AI service
let vertex = VertexAI.vertexAI()

// Create a `GenerativeModel` instance with a model that supports your use case
let model = vertex.generativeModel(modelName: "gemini-2.0-flash")

// Provide the video as `Data` with the appropriate MIME type.
let video = InlineDataPart(data: try Data(contentsOf: videoURL), mimeType: "video/mp4")

// Provide a text prompt to include with the video
let prompt = "What is in the video?"

// To generate text output, call generateContent with the text and video
let response = try await model.generateContent(video, prompt)
print(response.text ?? "No text in response.")

Kotlin

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
val generativeModel = Firebase.vertexAI.generativeModel("gemini-2.0-flash")

val contentResolver = applicationContext.contentResolver
contentResolver.openInputStream(videoUri).use { stream ->
  stream?.let {
    val bytes = stream.readBytes()

    // Provide a prompt that includes the video specified above and text
    val prompt = content {
        inlineData(bytes, "video/mp4")
        text("What is in the video?")
    }

    // To generate text output, call generateContent with the prompt
    val response = generativeModel.generateContent(prompt)
    Log.d(TAG, response.text ?: "")
  }
}

Java

^{জাভার জন্য, এই SDK-এর পদ্ধতিগুলি একটি ListenableFuture প্রদান করে।}

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
GenerativeModel gm = FirebaseVertexAI.getInstance()
        .generativeModel("gemini-2.0-flash");
GenerativeModelFutures model = GenerativeModelFutures.from(gm);

ContentResolver resolver = getApplicationContext().getContentResolver();
try (InputStream stream = resolver.openInputStream(videoUri)) {
    File videoFile = new File(new URI(videoUri.toString()));
    int videoSize = (int) videoFile.length();
    byte[] videoBytes = new byte[videoSize];
    if (stream != null) {
        stream.read(videoBytes, 0, videoBytes.length);
        stream.close();

        // Provide a prompt that includes the video specified above and text
        Content prompt = new Content.Builder()
                .addInlineData(videoBytes, "video/mp4")
                .addText("What is in the video?")
                .build();

        // To generate text output, call generateContent with the prompt
        ListenableFuture<GenerateContentResponse> response = model.generateContent(prompt);
        Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
            @Override
            public void onSuccess(GenerateContentResponse result) {
                String resultText = result.getText();
                System.out.println(resultText);
            }

            @Override
            public void onFailure(Throwable t) {
                t.printStackTrace();
            }
        }, executor);
    }
} catch (IOException e) {
    e.printStackTrace();
} catch (URISyntaxException e) {
    e.printStackTrace();
}

Web

import { initializeApp } from "firebase/app";
import { getVertexAI, getGenerativeModel } from "firebase/vertexai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Vertex AI service
const vertexAI = getVertexAI(firebaseApp);

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(vertexAI, { model: "gemini-2.0-flash" });

// Converts a File object to a Part object.
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

async function run() {
  // Provide a text prompt to include with the video
  const prompt = "What do you see?";

  const fileInputEl = document.querySelector("input[type=file]");
  const videoPart = await fileToGenerativePart(fileInputEl.files[0]);

  // To generate text output, call generateContent with the text and video
  const result = await model.generateContent([prompt, videoPart]);

  const response = result.response;
  const text = response.text();
  console.log(text);
}

run();

Dart

import 'package:firebase_vertexai/firebase_vertexai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
final model =
      FirebaseVertexAI.instance.generativeModel(model: 'gemini-2.0-flash');

// Provide a text prompt to include with the video
final prompt = TextPart("What's in the video?");

// Prepare video for input
final video = await File('video0.mp4').readAsBytes();

// Provide the video as `Data` with the appropriate mimetype
final videoPart = InlineDataPart('video/mp4', video);

// To generate text output, call generateContent with the text and images
final response = await model.generateContent([
  Content.multi([prompt, ...videoPart])
]);
print(response.text);

প্রতিক্রিয়া স্ট্রীম

এই নমুনাগুলি চেষ্টা করার আগে আপনি এই গাইডের শুরু করার আগে বিভাগটি সম্পূর্ণ করেছেন তা নিশ্চিত করুন।

আপনি মডেল জেনারেশন থেকে সম্পূর্ণ ফলাফলের জন্য অপেক্ষা না করে দ্রুত মিথস্ক্রিয়া অর্জন করতে পারেন এবং পরিবর্তে আংশিক ফলাফল পরিচালনা করতে স্ট্রিমিং ব্যবহার করতে পারেন। প্রতিক্রিয়া স্ট্রিম করতে, generateContentStream কল করুন।

উদাহরণ দেখুন: পাঠ্য এবং একটি একক চিত্র থেকে তৈরি পাঠ্য স্ট্রিম করুন

সুইফট

আপনি generateContentStream() কল করতে পারেন একটি মাল্টিমডাল প্রম্পট অনুরোধ থেকে জেনারেট করা পাঠ্য স্ট্রিম করতে যাতে পাঠ্য এবং একটি একক চিত্র অন্তর্ভুক্ত থাকে:

import FirebaseVertexAI

// Initialize the Vertex AI service
let vertex = VertexAI.vertexAI()

// Create a `GenerativeModel` instance with a model that supports your use case
let model = vertex.generativeModel(modelName: "gemini-2.0-flash")

guard let image = UIImage(systemName: "bicycle") else { fatalError() }

// Provide a text prompt to include with the image
let prompt = "What's in this picture?"

// To stream generated text output, call generateContentStream and pass in the prompt
let contentStream = try model.generateContentStream(image, prompt)
for try await chunk in contentStream {
  if let text = chunk.text {
    print(text)
  }
}

Kotlin

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
val generativeModel = Firebase.vertexAI.generativeModel("gemini-2.0-flash")

// Loads an image from the app/res/drawable/ directory
val bitmap: Bitmap = BitmapFactory.decodeResource(resources, R.drawable.sparky)

// Provide a prompt that includes the image specified above and text
val prompt = content {
  image(bitmap)
  text("What developer tool is this mascot from?")
}

// To stream generated text output, call generateContentStream with the prompt
var fullResponse = ""
generativeModel.generateContentStream(prompt).collect { chunk ->
  print(chunk.text)
  fullResponse += chunk.text
}

Java

^{জাভার জন্য, এই SDK-এর স্ট্রিমিং পদ্ধতিগুলি প্রতিক্রিয়াশীল স্ট্রীমস লাইব্রেরি থেকে একটি Publisher টাইপ ফেরত দেয়।}

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
GenerativeModel gm = FirebaseVertexAI.getInstance()
        .generativeModel("gemini-2.0-flash");
GenerativeModelFutures model = GenerativeModelFutures.from(gm);

Bitmap bitmap = BitmapFactory.decodeResource(getResources(), R.drawable.sparky);

// Provide a prompt that includes the image specified above and text
Content prompt = new Content.Builder()
        .addImage(bitmap)
        .addText("What developer tool is this mascot from?")
        .build();

// To stream generated text output, call generateContentStream with the prompt
Publisher<GenerateContentResponse> streamingResponse = model.generateContentStream(prompt);

final String[] fullResponse = {""};

streamingResponse.subscribe(new Subscriber<GenerateContentResponse>() {
    @Override
    public void onNext(GenerateContentResponse generateContentResponse) {
        String chunk = generateContentResponse.getText();
        fullResponse[0] += chunk;
    }

    @Override
    public void onComplete() {
        System.out.println(fullResponse[0]);
    }

    @Override
    public void onError(Throwable t) {
        t.printStackTrace();
    }

    @Override
    public void onSubscribe(Subscription s) {
    }
});

Web

import { initializeApp } from "firebase/app";
import { getVertexAI, getGenerativeModel } from "firebase/vertexai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Vertex AI service
const vertexAI = getVertexAI(firebaseApp);

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(vertexAI, { model: "gemini-2.0-flash" });

// Converts a File object to a Part object.
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

async function run() {
  // Provide a text prompt to include with the image
  const prompt = "What do you see?";

  // Prepare image for input
  const fileInputEl = document.querySelector("input[type=file]");
  const imagePart = await fileToGenerativePart(fileInputEl.files[0]);

  // To stream generated text output, call generateContentStream with the text and image
  const result = await model.generateContentStream([prompt, imagePart]);

  for await (const chunk of result.stream) {
    const chunkText = chunk.text();
    console.log(chunkText);
  }
}

run();

Dart

import 'package:firebase_vertexai/firebase_vertexai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
final model =
      FirebaseVertexAI.instance.generativeModel(model: 'gemini-2.0-flash');

// Provide a text prompt to include with the image
final prompt = TextPart("What's in the picture?");
// Prepare images for input
final image = await File('image0.jpg').readAsBytes();
final imagePart = InlineDataPart('image/jpeg', image);

// To stream generated text output, call generateContentStream with the text and image
final response = await model.generateContentStream([
  Content.multi([prompt,imagePart])
]);
await for (final chunk in response) {
  print(chunk.text);
}

উদাহরণ দেখুন: টেক্সট এবং একাধিক ছবি থেকে জেনারেট করা টেক্সট স্ট্রিম করুন

সুইফট

আপনি generateContentStream() কল করতে পারেন একটি মাল্টিমোডাল প্রম্পট অনুরোধ থেকে জেনারেট করা পাঠ্য স্ট্রিম করতে যাতে পাঠ্য এবং একাধিক চিত্র অন্তর্ভুক্ত থাকে:

import FirebaseVertexAI

// Initialize the Vertex AI service
let vertex = VertexAI.vertexAI()

// Create a `GenerativeModel` instance with a model that supports your use case
let model = vertex.generativeModel(modelName: "gemini-2.0-flash")

guard let image1 = UIImage(systemName: "car") else { fatalError() }
guard let image2 = UIImage(systemName: "car.2") else { fatalError() }

// Provide a text prompt to include with the images
let prompt = "What's different between these pictures?"

// To stream generated text output, call generateContentStream and pass in the prompt
let contentStream = try model.generateContentStream(image1, image2, prompt)
for try await chunk in contentStream {
  if let text = chunk.text {
    print(text)
  }
}

Kotlin

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
val generativeModel = Firebase.vertexAI.generativeModel("gemini-2.0-flash")

// Loads an image from the app/res/drawable/ directory
val bitmap1: Bitmap = BitmapFactory.decodeResource(resources, R.drawable.sparky)
val bitmap2: Bitmap = BitmapFactory.decodeResource(resources, R.drawable.sparky_eats_pizza)

// Provide a prompt that includes the images specified above and text
val prompt = content {
    image(bitmap1)
    image(bitmap2)
    text("What's different between these pictures?")
}

// To stream generated text output, call generateContentStream with the prompt
var fullResponse = ""
generativeModel.generateContentStream(prompt).collect { chunk ->
  print(chunk.text)
  fullResponse += chunk.text
}

Java

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
GenerativeModel gm = FirebaseVertexAI.getInstance()
        .generativeModel("gemini-2.0-flash");
GenerativeModelFutures model = GenerativeModelFutures.from(gm);

Bitmap bitmap1 = BitmapFactory.decodeResource(getResources(), R.drawable.sparky);
Bitmap bitmap2 = BitmapFactory.decodeResource(getResources(), R.drawable.sparky_eats_pizza);

// Provide a prompt that includes the images specified above and text
Content prompt = new Content.Builder()
    .addImage(bitmap1)
    .addImage(bitmap2)
    .addText("What's different between these pictures?")
    .build();

// To stream generated text output, call generateContentStream with the prompt
Publisher<GenerateContentResponse> streamingResponse = model.generateContentStream(prompt);

final String[] fullResponse = {""};

streamingResponse.subscribe(new Subscriber<GenerateContentResponse>() {
    @Override
    public void onNext(GenerateContentResponse generateContentResponse) {
        String chunk = generateContentResponse.getText();
        fullResponse[0] += chunk;
    }

    @Override
    public void onComplete() {
        System.out.println(fullResponse[0]);
    }

    @Override
    public void onError(Throwable t) {
        t.printStackTrace();
    }

    @Override
    public void onSubscribe(Subscription s) {
    }
});

Web

import { initializeApp } from "firebase/app";
import { getVertexAI, getGenerativeModel } from "firebase/vertexai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Vertex AI service
const vertexAI = getVertexAI(firebaseApp);

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(vertexAI, { model: "gemini-2.0-flash" });

// Converts a File object to a Part object.
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

async function run() {
  // Provide a text prompt to include with the images
  const prompt = "What's different between these pictures?";

  const fileInputEl = document.querySelector("input[type=file]");
  const imageParts = await Promise.all(
    [...fileInputEl.files].map(fileToGenerativePart)
  );

  // To stream generated text output, call generateContentStream with the text and images
  const result = await model.generateContentStream([prompt, ...imageParts]);

  for await (const chunk of result.stream) {
    const chunkText = chunk.text();
    console.log(chunkText);
  }
}

run();

Dart

এই উদাহরণটি দেখায় কিভাবে generateContentStream ব্যবহার করে একটি মাল্টিমোডাল প্রম্পট অনুরোধ থেকে জেনারেট করা টেক্সট স্ট্রিম করতে হয় যাতে টেক্সট এবং একাধিক ছবি রয়েছে:

import 'package:firebase_vertexai/firebase_vertexai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
final model =
      FirebaseVertexAI.instance.generativeModel(model: 'gemini-2.0-flash');

final (firstImage, secondImage) = await (
  File('image0.jpg').readAsBytes(),
  File('image1.jpg').readAsBytes()
).wait;
// Provide a text prompt to include with the images
final prompt = TextPart("What's different between these pictures?");
// Prepare images for input
final imageParts = [
  InlineDataPart('image/jpeg', firstImage),
  InlineDataPart('image/jpeg', secondImage),
];

// To stream generated text output, call generateContentStream with the text and images
final response = await model.generateContentStream([
  Content.multi([prompt, ...imageParts])
]);
await for (final chunk in response) {
  print(chunk.text);
}

উদাহরণ দেখুন: পাঠ্য এবং একটি ভিডিও থেকে তৈরি পাঠ্য স্ট্রিম করুন

সুইফট

আপনি generateContentStream() কল করতে পারেন একটি মাল্টিমোডাল প্রম্পট অনুরোধ থেকে জেনারেট করা পাঠ্য স্ট্রীম করতে যাতে পাঠ্য এবং একটি একক ভিডিও রয়েছে:

import FirebaseVertexAI

// Initialize the Vertex AI service
let vertex = VertexAI.vertexAI()

// Create a `GenerativeModel` instance with a model that supports your use case
let model = vertex.generativeModel(modelName: "gemini-2.0-flash")

// Provide the video as `Data` with the appropriate MIME type
let video = InlineDataPart(data: try Data(contentsOf: videoURL), mimeType: "video/mp4")

// Provide a text prompt to include with the video
let prompt = "What is in the video?"

// To stream generated text output, call generateContentStream with the text and video
let contentStream = try model.generateContentStream(video, prompt)
for try await chunk in contentStream {
  if let text = chunk.text {
    print(text)
  }
}

Kotlin

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
val generativeModel = Firebase.vertexAI.generativeModel("gemini-2.0-flash")

val contentResolver = applicationContext.contentResolver
contentResolver.openInputStream(videoUri).use { stream ->
  stream?.let {
    val bytes = stream.readBytes()

    // Provide a prompt that includes the video specified above and text
    val prompt = content {
        inlineData(bytes, "video/mp4")
        text("What is in the video?")
    }

    // To stream generated text output, call generateContentStream with the prompt
    var fullResponse = ""
    generativeModel.generateContentStream(prompt).collect { chunk ->
        Log.d(TAG, chunk.text ?: "")
        fullResponse += chunk.text
    }
  }
}

Java

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
GenerativeModel gm = FirebaseVertexAI.getInstance()
        .generativeModel("gemini-2.0-flash");
GenerativeModelFutures model = GenerativeModelFutures.from(gm);

ContentResolver resolver = getApplicationContext().getContentResolver();
try (InputStream stream = resolver.openInputStream(videoUri)) {
    File videoFile = new File(new URI(videoUri.toString()));
    int videoSize = (int) videoFile.length();
    byte[] videoBytes = new byte[videoSize];
    if (stream != null) {
        stream.read(videoBytes, 0, videoBytes.length);
        stream.close();

        // Provide a prompt that includes the video specified above and text
        Content prompt = new Content.Builder()
                .addInlineData(videoBytes, "video/mp4")
                .addText("What is in the video?")
                .build();

        // To stream generated text output, call generateContentStream with the prompt
        Publisher<GenerateContentResponse> streamingResponse =
                model.generateContentStream(prompt);

        final String[] fullResponse = {""};

        streamingResponse.subscribe(new Subscriber<GenerateContentResponse>() {
            @Override
            public void onNext(GenerateContentResponse generateContentResponse) {
                String chunk = generateContentResponse.getText();
                fullResponse[0] += chunk;
            }

            @Override
            public void onComplete() {
                System.out.println(fullResponse[0]);
            }

            @Override
            public void onError(Throwable t) {
                t.printStackTrace();
            }

            @Override
            public void onSubscribe(Subscription s) {
            }
         });
    }
} catch (IOException e) {
    e.printStackTrace();
} catch (URISyntaxException e) {
    e.printStackTrace();
}

Web

import { initializeApp } from "firebase/app";
import { getVertexAI, getGenerativeModel } from "firebase/vertexai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Vertex AI service
const vertexAI = getVertexAI(firebaseApp);

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(vertexAI, { model: "gemini-2.0-flash" });

// Converts a File object to a Part object.
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

async function run() {
  // Provide a text prompt to include with the video
  const prompt = "What do you see?";

  const fileInputEl = document.querySelector("input[type=file]");
  const videoPart = await fileToGenerativePart(fileInputEl.files[0]);

  // To stream generated text output, call generateContentStream with the text and video
  const result = await model.generateContentStream([prompt, videoPart]);

  for await (const chunk of result.stream) {
    const chunkText = chunk.text();
    console.log(chunkText);
  }
}

run();

Dart

import 'package:firebase_vertexai/firebase_vertexai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
final model =
      FirebaseVertexAI.instance.generativeModel(model: 'gemini-2.0-flash');

// Provide a text prompt to include with the video
final prompt = TextPart("What's in the video?");

// Prepare video for input
final video = await File('video0.mp4').readAsBytes();

// Provide the video as `Data` with the appropriate mimetype
final videoPart = InlineDataPart('video/mp4', video);

// To stream generated text output, call generateContentStream with the text and image
final response = await model.generateContentStream([
  Content.multi([prompt,videoPart])
]);
await for (final chunk in response) {
  print(chunk.text);
}

ইনপুট ফাইলের জন্য প্রয়োজনীয়তা এবং সুপারিশ

নিম্নলিখিত বিষয়ে জানতে Vertex AI- তে Gemini API-এর জন্য সমর্থিত ইনপুট ফাইল এবং প্রয়োজনীয়তাগুলি দেখুন:

একটি অনুরোধে একটি ফাইল প্রদানের জন্য বিভিন্ন বিকল্প
সমর্থিত ফাইল প্রকার
সমর্থিত MIME প্রকার এবং কিভাবে সেগুলি নির্দিষ্ট করতে হয়৷
ফাইল এবং মাল্টিমোডাল অনুরোধের জন্য প্রয়োজনীয়তা এবং সর্বোত্তম অনুশীলন

গুরুত্বপূর্ণ: Firebase SDK-তে Vertex AI-এর জন্য, সর্বোচ্চ অনুরোধের আকার হল 20 MB । একটি অনুরোধ খুব বড় হলে আপনি একটি HTTP 413 ত্রুটি পাবেন।

যদি একটি ফাইলের আকার মোট অনুরোধের আকার 20 MB ছাড়িয়ে যায়, তাহলে আপনাকে একটি URL ব্যবহার করে ফাইলটি প্রদান করতে হবে (উদাহরণস্বরূপ, Cloud Storage for Firebase ব্যবহার করে)। যাইহোক, যদি একটি ফাইল ছোট হয়, আপনি প্রায়ই এটি সরাসরি ইনলাইন ডেটা হিসাবে পাস করতে পারেন (উপরের উদাহরণগুলিতে দেখানো হয়েছে)। যদিও মনে রাখবেন, ইনলাইন ডেটা হিসাবে প্রদত্ত একটি ফাইল ট্রানজিটে বেস64-এ এনকোড করা হয়, যা অনুরোধের আকার বাড়ায়।

আপনি আর কি করতে পারেন?

মডেলে দীর্ঘ প্রম্পট পাঠানোর আগে কীভাবে টোকেন গণনা করবেন তা শিখুন।
Cloud Storage for Firebase সেট আপ করুন যাতে আপনি আপনার মাল্টিমোডাল অনুরোধগুলিতে বড় ফাইলগুলি অন্তর্ভুক্ত করতে পারেন এবং প্রম্পটে ফাইলগুলি সরবরাহ করার জন্য আরও পরিচালিত সমাধান পেতে পারেন৷ ফাইলগুলিতে ছবি, পিডিএফ, ভিডিও এবং অডিও অন্তর্ভুক্ত থাকতে পারে।
জেমিনি API কে অননুমোদিত ক্লায়েন্টদের অপব্যবহার থেকে রক্ষা করতে Firebase App Check সেট আপ সহ উত্পাদনের জন্য প্রস্তুতির বিষয়ে চিন্তা করা শুরু করুন৷ এছাড়াও, উত্পাদন চেকলিস্ট পর্যালোচনা করতে ভুলবেন না।

অন্যান্য ক্ষমতা ব্যবহার করে দেখুন

মাল্টি-টার্ন কথোপকথন তৈরি করুন (চ্যাট) ।
শুধুমাত্র পাঠ্য প্রম্পট থেকে পাঠ্য তৈরি করুন।
টেক্সট এবং মাল্টিমোডাল প্রম্পট উভয় থেকে কাঠামোগত আউটপুট (যেমন JSON) তৈরি করুন।
টেক্সট প্রম্পট থেকে ছবি তৈরি করুন।
বাহ্যিক সিস্টেম এবং তথ্যের সাথে জেনারেটিভ মডেল সংযোগ করতে ফাংশন কলিং ব্যবহার করুন।

বিষয়বস্তু তৈরি নিয়ন্ত্রণ কিভাবে শিখুন

সর্বোত্তম অনুশীলন, কৌশল এবং উদাহরণ প্রম্পট সহ প্রম্পট ডিজাইন বুঝুন ।
তাপমাত্রা এবং সর্বোচ্চ আউটপুট টোকেন ( মিথুনের জন্য) বা আকৃতির অনুপাত এবং ব্যক্তি তৈরির ( ইমেজেনের জন্য) মত মডেল প্যারামিটারগুলি কনফিগার করুন ।
ক্ষতিকারক বলে বিবেচিত প্রতিক্রিয়া পাওয়ার সম্ভাবনা সামঞ্জস্য করতে নিরাপত্তা সেটিংস ব্যবহার করুন ।

আপনি Vertex AI Studio ব্যবহার করে প্রম্পট এবং মডেল কনফিগারেশন নিয়ে পরীক্ষা করতে পারেন।

সমর্থিত মডেল সম্পর্কে আরও জানুন

বিভিন্ন ব্যবহারের ক্ষেত্রে উপলব্ধ মডেল এবং তাদের কোটা এবং মূল্য সম্পর্কে জানুন।

Firebase-এ Vertex AI-এর সাথে আপনার অভিজ্ঞতা সম্পর্কে মতামত দিন

ফাইলের mimeType . প্রতিটি ইনপুট ফাইলের সমর্থিত MIME প্রকারগুলি সম্পর্কে জানুন৷
ফাইল। আপনি ফাইলটিকে ইনলাইন ডেটা হিসাবে প্রদান করতে পারেন (যেমন এই পৃষ্ঠায় দেখানো হয়েছে) অথবা এর URL বা URI ব্যবহার করে৷

Gemini API এর সাথে কাজ করার জন্য অন্যান্য বিকল্প

ঐচ্ছিকভাবে Gemini API- এর একটি বিকল্প " Google AI " সংস্করণ নিয়ে পরীক্ষা করুন৷
Google AI স্টুডিও এবং Google AI ক্লায়েন্ট SDK ব্যবহার করে বিনামূল্যে অ্যাক্সেস পান (সীমার মধ্যে এবং যেখানে উপলব্ধ)। এই SDKগুলি শুধুমাত্র মোবাইল এবং ওয়েব অ্যাপে প্রোটোটাইপ করার জন্য ব্যবহার করা উচিত৷
একটি Gemini API কীভাবে কাজ করে সে সম্পর্কে আপনি পরিচিত হওয়ার পরে, Firebase SDK-এ আমাদের Vertex AI- তে স্থানান্তর করুন (এই ডকুমেন্টেশন), যেটিতে মোবাইল এবং ওয়েব অ্যাপের জন্য গুরুত্বপূর্ণ অনেক অতিরিক্ত বৈশিষ্ট্য রয়েছে, যেমন Firebase App Check ব্যবহার করে API-কে অপব্যবহার থেকে রক্ষা করা এবং অনুরোধে বড় মিডিয়া ফাইলগুলির জন্য সমর্থন।
ঐচ্ছিকভাবে জেমিনি API-কে Vertex AI সার্ভার-সাইডে কল করুন (যেমন Python, Node.js, বা Go সহ)
Gemini API-এর Firebase Extensions সার্ভার-সাইড Vertex AI SDKs , Genkit বা Firebase এক্সটেনশনগুলি ব্যবহার করুন।

আপনি শুরু করার আগে

নমুনা মিডিয়া ফাইল

ছবি : https://storage.googleapis.com/cloud-samples-data/generative-ai/image/scones.jpg একটি MIME ধরনের image/jpeg সহ। এই ছবিটি দেখুন বা ডাউনলোড করুন।
PDF : https://storage.googleapis.com/cloud-samples-data/generative-ai/pdf/2403.05530.pdf একটি MIME ধরনের application/pdf সহ। এই PDF দেখুন বা ডাউনলোড করুন.
ভিডিও : https://storage.googleapis.com/cloud-samples-data/video/animals.mp4 একটি MIME ধরনের video/mp4 সহ। এই ভিডিওটি দেখুন বা ডাউনলোড করুন।
অডিও : https://storage.googleapis.com/cloud-samples-data/generative-ai/audio/pixel.mp3 একটি MIME ধরনের audio/mp3 সহ। এই অডিওটি শুনুন বা ডাউনলোড করুন।

পাঠ্য এবং একটি একক চিত্র থেকে পাঠ্য তৈরি করুন

ইনপুট ফাইলের জন্য প্রয়োজনীয়তা এবং সুপারিশ পর্যালোচনা করতে ভুলবেন না।

সুইফট

import FirebaseVertexAI

// Initialize the Vertex AI service
let vertex = VertexAI.vertexAI()

// Create a `GenerativeModel` instance with a model that supports your use case
let model = vertex.generativeModel(modelName: "gemini-2.0-flash")

guard let image = UIImage(systemName: "bicycle") else { fatalError() }

// Provide a text prompt to include with the image
let prompt = "What's in this picture?"

// To generate text output, call generateContent and pass in the prompt
let response = try await model.generateContent(image, prompt)
print(response.text ?? "No text in response.")

Kotlin

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
val generativeModel = Firebase.vertexAI.generativeModel("gemini-2.0-flash")

// Loads an image from the app/res/drawable/ directory
val bitmap: Bitmap = BitmapFactory.decodeResource(resources, R.drawable.sparky)

// Provide a prompt that includes the image specified above and text
val prompt = content {
  image(bitmap)
  text("What developer tool is this mascot from?")
}

// To generate text output, call generateContent with the prompt
val response = generativeModel.generateContent(prompt)
print(response.text)

Java

^{জাভার জন্য, এই SDK-এর পদ্ধতিগুলি একটি ListenableFuture প্রদান করে।}

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
GenerativeModel gm = FirebaseVertexAI.getInstance()
        .generativeModel("gemini-2.0-flash");
GenerativeModelFutures model = GenerativeModelFutures.from(gm);

Bitmap bitmap = BitmapFactory.decodeResource(getResources(), R.drawable.sparky);

// Provide a prompt that includes the image specified above and text
Content content = new Content.Builder()
        .addImage(bitmap)
        .addText("What developer tool is this mascot from?")
        .build();

// To generate text output, call generateContent with the prompt
ListenableFuture<GenerateContentResponse> response = model.generateContent(content);
Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
    @Override
    public void onSuccess(GenerateContentResponse result) {
        String resultText = result.getText();
        System.out.println(resultText);
    }

    @Override
    public void onFailure(Throwable t) {
        t.printStackTrace();
    }
}, executor);

Web

import { initializeApp } from "firebase/app";
import { getVertexAI, getGenerativeModel } from "firebase/vertexai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Vertex AI service
const vertexAI = getVertexAI(firebaseApp);

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(vertexAI, { model: "gemini-2.0-flash" });

// Converts a File object to a Part object.
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

async function run() {
  // Provide a text prompt to include with the image
  const prompt = "What's different between these pictures?";

  const fileInputEl = document.querySelector("input[type=file]");
  const imagePart = await fileToGenerativePart(fileInputEl.files[0]);

  // To generate text output, call generateContent with the text and image
  const result = await model.generateContent([prompt, imagePart]);

  const response = result.response;
  const text = response.text();
  console.log(text);
}

run();

Dart

import 'package:firebase_vertexai/firebase_vertexai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
final model =
      FirebaseVertexAI.instance.generativeModel(model: 'gemini-2.0-flash');

// Provide a text prompt to include with the image
final prompt = TextPart("What's in the picture?");
// Prepare images for input
final image = await File('image0.jpg').readAsBytes();
final imagePart = InlineDataPart('image/jpeg', image);

// To generate text output, call generateContent with the text and image
final response = await model.generateContent([
  Content.multi([prompt,imagePart])
]);
print(response.text);

পাঠ্য এবং একাধিক চিত্র থেকে পাঠ্য তৈরি করুন

ইনপুট ফাইলের জন্য প্রয়োজনীয়তা এবং সুপারিশ পর্যালোচনা করতে ভুলবেন না।

সুইফট

import FirebaseVertexAI

// Initialize the Vertex AI service
let vertex = VertexAI.vertexAI()

// Create a `GenerativeModel` instance with a model that supports your use case
let model = vertex.generativeModel(modelName: "gemini-2.0-flash")

guard let image1 = UIImage(systemName: "car") else { fatalError() }
guard let image2 = UIImage(systemName: "car.2") else { fatalError() }

// Provide a text prompt to include with the images
let prompt = "What's different between these pictures?"

// To generate text output, call generateContent and pass in the prompt
let response = try await model.generateContent(image1, image2, prompt)
print(response.text ?? "No text in response.")

Kotlin

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
val generativeModel = Firebase.vertexAI.generativeModel("gemini-2.0-flash")

// Loads an image from the app/res/drawable/ directory
val bitmap1: Bitmap = BitmapFactory.decodeResource(resources, R.drawable.sparky)
val bitmap2: Bitmap = BitmapFactory.decodeResource(resources, R.drawable.sparky_eats_pizza)

// Provide a prompt that includes the images specified above and text
val prompt = content {
  image(bitmap1)
  image(bitmap2)
  text("What is different between these pictures?")
}

// To generate text output, call generateContent with the prompt
val response = generativeModel.generateContent(prompt)
print(response.text)

Java

^{জাভার জন্য, এই SDK-এর পদ্ধতিগুলি একটি ListenableFuture প্রদান করে।}

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
GenerativeModel gm = FirebaseVertexAI.getInstance()
        .generativeModel("gemini-2.0-flash");
GenerativeModelFutures model = GenerativeModelFutures.from(gm);

Bitmap bitmap1 = BitmapFactory.decodeResource(getResources(), R.drawable.sparky);
Bitmap bitmap2 = BitmapFactory.decodeResource(getResources(), R.drawable.sparky_eats_pizza);

// Provide a prompt that includes the images specified above and text
Content prompt = new Content.Builder()
    .addImage(bitmap1)
    .addImage(bitmap2)
    .addText("What's different between these pictures?")
    .build();

// To generate text output, call generateContent with the prompt
ListenableFuture<GenerateContentResponse> response = model.generateContent(prompt);
Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
    @Override
    public void onSuccess(GenerateContentResponse result) {
        String resultText = result.getText();
        System.out.println(resultText);
    }

    @Override
    public void onFailure(Throwable t) {
        t.printStackTrace();
    }
}, executor);

Web

import { initializeApp } from "firebase/app";
import { getVertexAI, getGenerativeModel } from "firebase/vertexai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Vertex AI service
const vertexAI = getVertexAI(firebaseApp);

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(vertexAI, { model: "gemini-2.0-flash" });

// Converts a File object to a Part object.
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

async function run() {
  // Provide a text prompt to include with the images
  const prompt = "What's different between these pictures?";

  // Prepare images for input
  const fileInputEl = document.querySelector("input[type=file]");
  const imageParts = await Promise.all(
    [...fileInputEl.files].map(fileToGenerativePart)
  );

  // To generate text output, call generateContent with the text and images
  const result = await model.generateContent([prompt, ...imageParts]);

  const response = result.response;
  const text = response.text();
  console.log(text);
}

run();

Dart

import 'package:firebase_vertexai/firebase_vertexai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
final model =
      FirebaseVertexAI.instance.generativeModel(model: 'gemini-2.0-flash');

final (firstImage, secondImage) = await (
  File('image0.jpg').readAsBytes(),
  File('image1.jpg').readAsBytes()
).wait;
// Provide a text prompt to include with the images
final prompt = TextPart("What's different between these pictures?");
// Prepare images for input
final imageParts = [
  InlineDataPart('image/jpeg', firstImage),
  InlineDataPart('image/jpeg', secondImage),
];

// To generate text output, call generateContent with the text and images
final response = await model.generateContent([
  Content.multi([prompt, ...imageParts])
]);
print(response.text);

পাঠ্য এবং একটি ভিডিও থেকে পাঠ্য তৈরি করুন

ইনপুট ফাইলের জন্য প্রয়োজনীয়তা এবং সুপারিশ পর্যালোচনা করতে ভুলবেন না।

সুইফট

import FirebaseVertexAI

// Initialize the Vertex AI service
let vertex = VertexAI.vertexAI()

// Create a `GenerativeModel` instance with a model that supports your use case
let model = vertex.generativeModel(modelName: "gemini-2.0-flash")

// Provide the video as `Data` with the appropriate MIME type.
let video = InlineDataPart(data: try Data(contentsOf: videoURL), mimeType: "video/mp4")

// Provide a text prompt to include with the video
let prompt = "What is in the video?"

// To generate text output, call generateContent with the text and video
let response = try await model.generateContent(video, prompt)
print(response.text ?? "No text in response.")

Kotlin

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
val generativeModel = Firebase.vertexAI.generativeModel("gemini-2.0-flash")

val contentResolver = applicationContext.contentResolver
contentResolver.openInputStream(videoUri).use { stream ->
  stream?.let {
    val bytes = stream.readBytes()

    // Provide a prompt that includes the video specified above and text
    val prompt = content {
        inlineData(bytes, "video/mp4")
        text("What is in the video?")
    }

    // To generate text output, call generateContent with the prompt
    val response = generativeModel.generateContent(prompt)
    Log.d(TAG, response.text ?: "")
  }
}

Java

^{জাভার জন্য, এই SDK-এর পদ্ধতিগুলি একটি ListenableFuture প্রদান করে।}

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
GenerativeModel gm = FirebaseVertexAI.getInstance()
        .generativeModel("gemini-2.0-flash");
GenerativeModelFutures model = GenerativeModelFutures.from(gm);

ContentResolver resolver = getApplicationContext().getContentResolver();
try (InputStream stream = resolver.openInputStream(videoUri)) {
    File videoFile = new File(new URI(videoUri.toString()));
    int videoSize = (int) videoFile.length();
    byte[] videoBytes = new byte[videoSize];
    if (stream != null) {
        stream.read(videoBytes, 0, videoBytes.length);
        stream.close();

        // Provide a prompt that includes the video specified above and text
        Content prompt = new Content.Builder()
                .addInlineData(videoBytes, "video/mp4")
                .addText("What is in the video?")
                .build();

        // To generate text output, call generateContent with the prompt
        ListenableFuture<GenerateContentResponse> response = model.generateContent(prompt);
        Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
            @Override
            public void onSuccess(GenerateContentResponse result) {
                String resultText = result.getText();
                System.out.println(resultText);
            }

            @Override
            public void onFailure(Throwable t) {
                t.printStackTrace();
            }
        }, executor);
    }
} catch (IOException e) {
    e.printStackTrace();
} catch (URISyntaxException e) {
    e.printStackTrace();
}

Web

import { initializeApp } from "firebase/app";
import { getVertexAI, getGenerativeModel } from "firebase/vertexai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Vertex AI service
const vertexAI = getVertexAI(firebaseApp);

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(vertexAI, { model: "gemini-2.0-flash" });

// Converts a File object to a Part object.
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

async function run() {
  // Provide a text prompt to include with the video
  const prompt = "What do you see?";

  const fileInputEl = document.querySelector("input[type=file]");
  const videoPart = await fileToGenerativePart(fileInputEl.files[0]);

  // To generate text output, call generateContent with the text and video
  const result = await model.generateContent([prompt, videoPart]);

  const response = result.response;
  const text = response.text();
  console.log(text);
}

run();

Dart

import 'package:firebase_vertexai/firebase_vertexai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
final model =
      FirebaseVertexAI.instance.generativeModel(model: 'gemini-2.0-flash');

// Provide a text prompt to include with the video
final prompt = TextPart("What's in the video?");

// Prepare video for input
final video = await File('video0.mp4').readAsBytes();

// Provide the video as `Data` with the appropriate mimetype
final videoPart = InlineDataPart('video/mp4', video);

// To generate text output, call generateContent with the text and images
final response = await model.generateContent([
  Content.multi([prompt, ...videoPart])
]);
print(response.text);

প্রতিক্রিয়া স্ট্রীম

উদাহরণ দেখুন: পাঠ্য এবং একটি একক চিত্র থেকে তৈরি পাঠ্য স্ট্রিম করুন

সুইফট

import FirebaseVertexAI

// Initialize the Vertex AI service
let vertex = VertexAI.vertexAI()

// Create a `GenerativeModel` instance with a model that supports your use case
let model = vertex.generativeModel(modelName: "gemini-2.0-flash")

guard let image = UIImage(systemName: "bicycle") else { fatalError() }

// Provide a text prompt to include with the image
let prompt = "What's in this picture?"

// To stream generated text output, call generateContentStream and pass in the prompt
let contentStream = try model.generateContentStream(image, prompt)
for try await chunk in contentStream {
  if let text = chunk.text {
    print(text)
  }
}

Kotlin

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
val generativeModel = Firebase.vertexAI.generativeModel("gemini-2.0-flash")

// Loads an image from the app/res/drawable/ directory
val bitmap: Bitmap = BitmapFactory.decodeResource(resources, R.drawable.sparky)

// Provide a prompt that includes the image specified above and text
val prompt = content {
  image(bitmap)
  text("What developer tool is this mascot from?")
}

// To stream generated text output, call generateContentStream with the prompt
var fullResponse = ""
generativeModel.generateContentStream(prompt).collect { chunk ->
  print(chunk.text)
  fullResponse += chunk.text
}

দ্রষ্টব্য : উপরের উদাহরণটি মাল্টি-মোডাল প্রম্পটে প্ল্যাটফর্ম-নেটিভ ইমেজ টাইপ ( Bitmap ) পরিচালনা করার একটি সরলীকৃত উপায়ের সুবিধা নেয়। এই চিত্রের ধরণগুলি (তাদের মূল ফর্ম্যাট নির্বিশেষে) সার্ভারে প্রেরণের আগে ক্লায়েন্ট-সাইডকে JPEG এ 80% মানের রূপান্তরিত করা হয়। এর অর্থ হ'ল আপনি যখন উপরের উদাহরণের মতো চিত্রগুলি ইনলাইন সরবরাহ করেন, তখন আপনাকে মাইম টাইপ নির্দিষ্ট করার দরকার নেই।

চিত্রের ফর্ম্যাট এবং রূপান্তরগুলির উপর আরও নিয়ন্ত্রণের জন্য, আপনি চিত্রগুলি InlineDataPart হিসাবে সরবরাহ করতে পারেন এবং নির্দিষ্ট মাইম প্রকার সরবরাহ করতে পারেন। যেমন: content { inlineData(/* PNG as byte array */, "image/png") }

Java

আপনি একটি মাল্টিমোডাল প্রম্পট অনুরোধ থেকে উত্পন্ন পাঠ্য স্ট্রিম করতে generateContentStream() কল করতে পারেন যাতে পাঠ্য এবং একটি একক চিত্র অন্তর্ভুক্ত রয়েছে:

^{জাভার জন্য, এই এসডিকে স্ট্রিমিং পদ্ধতিগুলি প্রতিক্রিয়াশীল স্ট্রিমস লাইব্রেরি থেকে একটি Publisher টাইপ ফেরত দেয়।}

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
GenerativeModel gm = FirebaseVertexAI.getInstance()
        .generativeModel("gemini-2.0-flash");
GenerativeModelFutures model = GenerativeModelFutures.from(gm);

Bitmap bitmap = BitmapFactory.decodeResource(getResources(), R.drawable.sparky);

// Provide a prompt that includes the image specified above and text
Content prompt = new Content.Builder()
        .addImage(bitmap)
        .addText("What developer tool is this mascot from?")
        .build();

// To stream generated text output, call generateContentStream with the prompt
Publisher<GenerateContentResponse> streamingResponse = model.generateContentStream(prompt);

final String[] fullResponse = {""};

streamingResponse.subscribe(new Subscriber<GenerateContentResponse>() {
    @Override
    public void onNext(GenerateContentResponse generateContentResponse) {
        String chunk = generateContentResponse.getText();
        fullResponse[0] += chunk;
    }

    @Override
    public void onComplete() {
        System.out.println(fullResponse[0]);
    }

    @Override
    public void onError(Throwable t) {
        t.printStackTrace();
    }

    @Override
    public void onSubscribe(Subscription s) {
    }
});

দ্রষ্টব্য : উপরের উদাহরণটি মাল্টি-মডেল প্রম্পটগুলিতে প্ল্যাটফর্ম-স্থানীয় চিত্রের প্রকারগুলি ( Bitmap ) পরিচালনা করার জন্য একটি সরলিকৃত উপায়ের সুবিধা গ্রহণ করে। এই চিত্রের ধরণগুলি (তাদের মূল ফর্ম্যাট নির্বিশেষে) সার্ভারে প্রেরণের আগে ক্লায়েন্ট-সাইডকে JPEG এ 80% মানের রূপান্তরিত করা হয়। এর অর্থ হ'ল আপনি যখন উপরের উদাহরণের মতো চিত্রগুলি ইনলাইন সরবরাহ করেন, তখন আপনাকে মাইম টাইপ নির্দিষ্ট করার দরকার নেই।

Web

import { initializeApp } from "firebase/app";
import { getVertexAI, getGenerativeModel } from "firebase/vertexai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Vertex AI service
const vertexAI = getVertexAI(firebaseApp);

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(vertexAI, { model: "gemini-2.0-flash" });

// Converts a File object to a Part object.
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

async function run() {
  // Provide a text prompt to include with the image
  const prompt = "What do you see?";

  // Prepare image for input
  const fileInputEl = document.querySelector("input[type=file]");
  const imagePart = await fileToGenerativePart(fileInputEl.files[0]);

  // To stream generated text output, call generateContentStream with the text and image
  const result = await model.generateContentStream([prompt, imagePart]);

  for await (const chunk of result.stream) {
    const chunkText = chunk.text();
    console.log(chunkText);
  }
}

run();

Dart

import 'package:firebase_vertexai/firebase_vertexai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
final model =
      FirebaseVertexAI.instance.generativeModel(model: 'gemini-2.0-flash');

// Provide a text prompt to include with the image
final prompt = TextPart("What's in the picture?");
// Prepare images for input
final image = await File('image0.jpg').readAsBytes();
final imagePart = InlineDataPart('image/jpeg', image);

// To stream generated text output, call generateContentStream with the text and image
final response = await model.generateContentStream([
  Content.multi([prompt,imagePart])
]);
await for (final chunk in response) {
  print(chunk.text);
}

উদাহরণ দেখুন: পাঠ্য এবং একাধিক চিত্র থেকে উত্পন্ন পাঠ্য স্ট্রিম

সুইফট

আপনি একটি মাল্টিমোডাল প্রম্পট অনুরোধ থেকে উত্পন্ন পাঠ্য স্ট্রিম করতে generateContentStream() কল করতে পারেন যাতে পাঠ্য এবং একাধিক চিত্র অন্তর্ভুক্ত রয়েছে:

import FirebaseVertexAI

// Initialize the Vertex AI service
let vertex = VertexAI.vertexAI()

// Create a `GenerativeModel` instance with a model that supports your use case
let model = vertex.generativeModel(modelName: "gemini-2.0-flash")

guard let image1 = UIImage(systemName: "car") else { fatalError() }
guard let image2 = UIImage(systemName: "car.2") else { fatalError() }

// Provide a text prompt to include with the images
let prompt = "What's different between these pictures?"

// To stream generated text output, call generateContentStream and pass in the prompt
let contentStream = try model.generateContentStream(image1, image2, prompt)
for try await chunk in contentStream {
  if let text = chunk.text {
    print(text)
  }
}

দ্রষ্টব্য : উপরের উদাহরণটি মাল্টি-মডাল প্রম্পটে প্ল্যাটফর্ম-স্থানীয় চিত্রের ধরণগুলি ( UIImage , NSImage , CIImage এবং CGImage ) পরিচালনা করার জন্য একটি সরলিকৃত উপায়ের সুবিধা গ্রহণ করে। এই চিত্রের ধরণগুলি (তাদের মূল ফর্ম্যাট নির্বিশেষে) সার্ভারে প্রেরণের আগে ক্লায়েন্ট-সাইডকে JPEG এ 80% মানের রূপান্তরিত করা হয়। এর অর্থ হ'ল আপনি যখন উপরের উদাহরণের মতো চিত্রগুলি ইনলাইন সরবরাহ করেন, তখন আপনাকে মাইম টাইপ নির্দিষ্ট করার দরকার নেই।

চিত্রের ফর্ম্যাট এবং রূপান্তরগুলির উপর আরও নিয়ন্ত্রণের জন্য, আপনি চিত্রগুলি InlineDataPart হিসাবে সরবরাহ করতে পারেন এবং নির্দিষ্ট মাইম প্রকার সরবরাহ করতে পারেন। যেমন: InlineDataPart(data: Data(/* PNG Data */), mimeType: "image/png") ।

Kotlin

^{কোটলিনের জন্য, এই এসডিকে -র পদ্ধতিগুলি স্থগিত ফাংশন এবং একটি করুটাইন স্কোপ থেকে কল করা দরকার।}

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
val generativeModel = Firebase.vertexAI.generativeModel("gemini-2.0-flash")

// Loads an image from the app/res/drawable/ directory
val bitmap1: Bitmap = BitmapFactory.decodeResource(resources, R.drawable.sparky)
val bitmap2: Bitmap = BitmapFactory.decodeResource(resources, R.drawable.sparky_eats_pizza)

// Provide a prompt that includes the images specified above and text
val prompt = content {
    image(bitmap1)
    image(bitmap2)
    text("What's different between these pictures?")
}

// To stream generated text output, call generateContentStream with the prompt
var fullResponse = ""
generativeModel.generateContentStream(prompt).collect { chunk ->
  print(chunk.text)
  fullResponse += chunk.text
}

Java

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
GenerativeModel gm = FirebaseVertexAI.getInstance()
        .generativeModel("gemini-2.0-flash");
GenerativeModelFutures model = GenerativeModelFutures.from(gm);

Bitmap bitmap1 = BitmapFactory.decodeResource(getResources(), R.drawable.sparky);
Bitmap bitmap2 = BitmapFactory.decodeResource(getResources(), R.drawable.sparky_eats_pizza);

// Provide a prompt that includes the images specified above and text
Content prompt = new Content.Builder()
    .addImage(bitmap1)
    .addImage(bitmap2)
    .addText("What's different between these pictures?")
    .build();

// To stream generated text output, call generateContentStream with the prompt
Publisher<GenerateContentResponse> streamingResponse = model.generateContentStream(prompt);

final String[] fullResponse = {""};

streamingResponse.subscribe(new Subscriber<GenerateContentResponse>() {
    @Override
    public void onNext(GenerateContentResponse generateContentResponse) {
        String chunk = generateContentResponse.getText();
        fullResponse[0] += chunk;
    }

    @Override
    public void onComplete() {
        System.out.println(fullResponse[0]);
    }

    @Override
    public void onError(Throwable t) {
        t.printStackTrace();
    }

    @Override
    public void onSubscribe(Subscription s) {
    }
});

Web

import { initializeApp } from "firebase/app";
import { getVertexAI, getGenerativeModel } from "firebase/vertexai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Vertex AI service
const vertexAI = getVertexAI(firebaseApp);

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(vertexAI, { model: "gemini-2.0-flash" });

// Converts a File object to a Part object.
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

async function run() {
  // Provide a text prompt to include with the images
  const prompt = "What's different between these pictures?";

  const fileInputEl = document.querySelector("input[type=file]");
  const imageParts = await Promise.all(
    [...fileInputEl.files].map(fileToGenerativePart)
  );

  // To stream generated text output, call generateContentStream with the text and images
  const result = await model.generateContentStream([prompt, ...imageParts]);

  for await (const chunk of result.stream) {
    const chunkText = chunk.text();
    console.log(chunkText);
  }
}

run();

Dart

এই উদাহরণটি দেখায় যে কীভাবে একটি মাল্টিমোডাল প্রম্পট অনুরোধ থেকে উত্পন্ন পাঠ্য প্রবাহিত করতে generateContentStream ব্যবহার করবেন যাতে পাঠ্য এবং একাধিক চিত্র অন্তর্ভুক্ত রয়েছে:

import 'package:firebase_vertexai/firebase_vertexai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
final model =
      FirebaseVertexAI.instance.generativeModel(model: 'gemini-2.0-flash');

final (firstImage, secondImage) = await (
  File('image0.jpg').readAsBytes(),
  File('image1.jpg').readAsBytes()
).wait;
// Provide a text prompt to include with the images
final prompt = TextPart("What's different between these pictures?");
// Prepare images for input
final imageParts = [
  InlineDataPart('image/jpeg', firstImage),
  InlineDataPart('image/jpeg', secondImage),
];

// To stream generated text output, call generateContentStream with the text and images
final response = await model.generateContentStream([
  Content.multi([prompt, ...imageParts])
]);
await for (final chunk in response) {
  print(chunk.text);
}

উদাহরণ দেখুন: পাঠ্য এবং একটি ভিডিও থেকে উত্পন্ন পাঠ্য স্ট্রিম

সুইফট

আপনি একটি মাল্টিমোডাল প্রম্পট অনুরোধ থেকে উত্পন্ন পাঠ্য স্ট্রিম করতে generateContentStream() কল করতে পারেন যাতে পাঠ্য এবং একটি একক ভিডিও অন্তর্ভুক্ত রয়েছে:

import FirebaseVertexAI

// Initialize the Vertex AI service
let vertex = VertexAI.vertexAI()

// Create a `GenerativeModel` instance with a model that supports your use case
let model = vertex.generativeModel(modelName: "gemini-2.0-flash")

// Provide the video as `Data` with the appropriate MIME type
let video = InlineDataPart(data: try Data(contentsOf: videoURL), mimeType: "video/mp4")

// Provide a text prompt to include with the video
let prompt = "What is in the video?"

// To stream generated text output, call generateContentStream with the text and video
let contentStream = try model.generateContentStream(video, prompt)
for try await chunk in contentStream {
  if let text = chunk.text {
    print(text)
  }
}

Kotlin

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
val generativeModel = Firebase.vertexAI.generativeModel("gemini-2.0-flash")

val contentResolver = applicationContext.contentResolver
contentResolver.openInputStream(videoUri).use { stream ->
  stream?.let {
    val bytes = stream.readBytes()

    // Provide a prompt that includes the video specified above and text
    val prompt = content {
        inlineData(bytes, "video/mp4")
        text("What is in the video?")
    }

    // To stream generated text output, call generateContentStream with the prompt
    var fullResponse = ""
    generativeModel.generateContentStream(prompt).collect { chunk ->
        Log.d(TAG, chunk.text ?: "")
        fullResponse += chunk.text
    }
  }
}

Java

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
GenerativeModel gm = FirebaseVertexAI.getInstance()
        .generativeModel("gemini-2.0-flash");
GenerativeModelFutures model = GenerativeModelFutures.from(gm);

ContentResolver resolver = getApplicationContext().getContentResolver();
try (InputStream stream = resolver.openInputStream(videoUri)) {
    File videoFile = new File(new URI(videoUri.toString()));
    int videoSize = (int) videoFile.length();
    byte[] videoBytes = new byte[videoSize];
    if (stream != null) {
        stream.read(videoBytes, 0, videoBytes.length);
        stream.close();

        // Provide a prompt that includes the video specified above and text
        Content prompt = new Content.Builder()
                .addInlineData(videoBytes, "video/mp4")
                .addText("What is in the video?")
                .build();

        // To stream generated text output, call generateContentStream with the prompt
        Publisher<GenerateContentResponse> streamingResponse =
                model.generateContentStream(prompt);

        final String[] fullResponse = {""};

        streamingResponse.subscribe(new Subscriber<GenerateContentResponse>() {
            @Override
            public void onNext(GenerateContentResponse generateContentResponse) {
                String chunk = generateContentResponse.getText();
                fullResponse[0] += chunk;
            }

            @Override
            public void onComplete() {
                System.out.println(fullResponse[0]);
            }

            @Override
            public void onError(Throwable t) {
                t.printStackTrace();
            }

            @Override
            public void onSubscribe(Subscription s) {
            }
         });
    }
} catch (IOException e) {
    e.printStackTrace();
} catch (URISyntaxException e) {
    e.printStackTrace();
}

Web

import { initializeApp } from "firebase/app";
import { getVertexAI, getGenerativeModel } from "firebase/vertexai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Vertex AI service
const vertexAI = getVertexAI(firebaseApp);

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(vertexAI, { model: "gemini-2.0-flash" });

// Converts a File object to a Part object.
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

async function run() {
  // Provide a text prompt to include with the video
  const prompt = "What do you see?";

  const fileInputEl = document.querySelector("input[type=file]");
  const videoPart = await fileToGenerativePart(fileInputEl.files[0]);

  // To stream generated text output, call generateContentStream with the text and video
  const result = await model.generateContentStream([prompt, videoPart]);

  for await (const chunk of result.stream) {
    const chunkText = chunk.text();
    console.log(chunkText);
  }
}

run();

Dart

import 'package:firebase_vertexai/firebase_vertexai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
final model =
      FirebaseVertexAI.instance.generativeModel(model: 'gemini-2.0-flash');

// Provide a text prompt to include with the video
final prompt = TextPart("What's in the video?");

// Prepare video for input
final video = await File('video0.mp4').readAsBytes();

// Provide the video as `Data` with the appropriate mimetype
final videoPart = InlineDataPart('video/mp4', video);

// To stream generated text output, call generateContentStream with the text and image
final response = await model.generateContentStream([
  Content.multi([prompt,videoPart])
]);
await for (final chunk in response) {
  print(chunk.text);
}

ইনপুট ফাইলগুলির জন্য প্রয়োজনীয়তা এবং সুপারিশ

নিম্নলিখিতগুলি সম্পর্কে জানতে ভার্টেক্স এআই -তে জেমিনি এপিআইয়ের জন্য সমর্থিত ইনপুট ফাইল এবং প্রয়োজনীয়তাগুলি দেখুন:

একটি অনুরোধে একটি ফাইল সরবরাহ করার জন্য বিভিন্ন বিকল্প
সমর্থিত ফাইল প্রকার
সমর্থিত মাইম প্রকারগুলি এবং কীভাবে সেগুলি নির্দিষ্ট করা যায়
ফাইল এবং মাল্টিমোডাল অনুরোধগুলির জন্য প্রয়োজনীয়তা এবং সেরা অনুশীলনগুলি

গুরুত্বপূর্ণ: ফায়ারবেস এসডিকেগুলিতে ভার্টেক্স এআইয়ের জন্য, সর্বাধিক অনুরোধের আকার 20 এমবি । কোনও অনুরোধ খুব বড় হলে আপনি একটি HTTP 413 ত্রুটি পান।

যদি কোনও ফাইলের আকার মোট অনুরোধের আকার 20 এমবি ছাড়িয়ে যায়, তবে আপনাকে একটি ইউআরএল ব্যবহার করে ফাইলটি সরবরাহ করতে হবে (উদাহরণস্বরূপ, Cloud Storage for Firebase ব্যবহার করে)। তবে, যদি কোনও ফাইল ছোট হয় তবে আপনি প্রায়শই এটি সরাসরি ইনলাইন ডেটা হিসাবে পাস করতে পারেন (উপরের উদাহরণগুলিতে দেখানো হয়েছে)। যদিও দ্রষ্টব্য, ইনলাইন ডেটা হিসাবে সরবরাহিত একটি ফাইল ট্রানজিটে বেস 64 এ এনকোড করা হয়েছে, যা অনুরোধের আকার বাড়িয়ে তোলে।

আপনি আর কি করতে পারেন?

মডেলটিতে দীর্ঘ প্রম্পট প্রেরণের আগে টোকেনগুলি কীভাবে গণনা করবেন তা শিখুন।
Cloud Storage for Firebase সেট আপ করুন যাতে আপনি আপনার মাল্টিমোডাল অনুরোধগুলিতে বড় ফাইলগুলি অন্তর্ভুক্ত করতে পারেন এবং প্রম্পটগুলিতে ফাইল সরবরাহ করার জন্য আরও পরিচালিত সমাধান পেতে পারেন। ফাইলগুলিতে চিত্র, পিডিএফএস, ভিডিও এবং অডিও অন্তর্ভুক্ত থাকতে পারে।
অননুমোদিত ক্লায়েন্টদের দ্বারা জেমিনি এপিআইকে অপব্যবহার থেকে রক্ষা করতে Firebase App Check সেট আপ করা সহ উত্পাদনের প্রস্তুতি সম্পর্কে চিন্তাভাবনা শুরু করুন। এছাড়াও, উত্পাদন চেকলিস্টটি পর্যালোচনা করতে ভুলবেন না।

অন্যান্য ক্ষমতা চেষ্টা করে দেখুন

মাল্টি-টার্ন কথোপকথন (চ্যাট) তৈরি করুন।
কেবল পাঠ্য-প্রম্পটগুলি থেকে পাঠ্য তৈরি করুন।
উভয় পাঠ্য এবং মাল্টিমোডাল প্রম্পট থেকে কাঠামোগত আউটপুট (জেএসনের মতো) উত্পন্ন করুন।
পাঠ্য অনুরোধগুলি থেকে চিত্র তৈরি করুন।
বাহ্যিক সিস্টেম এবং তথ্যের সাথে জেনারেটর মডেলগুলি সংযুক্ত করতে ফাংশন কলিং ব্যবহার করুন।

কীভাবে সামগ্রী প্রজন্ম নিয়ন্ত্রণ করবেন তা শিখুন

সর্বোত্তম অনুশীলন, কৌশল এবং উদাহরণ প্রম্পট সহ প্রম্পট ডিজাইনটি বুঝুন ।
তাপমাত্রা এবং সর্বোচ্চ আউটপুট টোকেন ( মিথুনের জন্য) বা দিক অনুপাত এবং ব্যক্তি প্রজন্মের ( ইমেজের জন্য) মতো মডেল পরামিতিগুলি কনফিগার করুন ।
ক্ষতিকারক হিসাবে বিবেচিত হতে পারে এমন প্রতিক্রিয়া পাওয়ার সম্ভাবনা সামঞ্জস্য করতে সুরক্ষা সেটিংস ব্যবহার করুন ।

আপনি ভার্টেক্স এআই স্টুডিও ব্যবহার করে প্রম্পট এবং মডেল কনফিগারেশনগুলি নিয়েও পরীক্ষা করতে পারেন।

সমর্থিত মডেলগুলি সম্পর্কে আরও জানুন

বিভিন্ন ব্যবহারের ক্ষেত্রে এবং তাদের কোটা এবং মূল্য নির্ধারণের জন্য উপলব্ধ মডেলগুলি সম্পর্কে জানুন।

ফায়ারবেসে ভার্টেক্স এআইয়ের সাথে আপনার অভিজ্ঞতা সম্পর্কে প্রতিক্রিয়া জানান

ফায়ারবেস এসডিকে একটি ভার্টেক্স এআই ব্যবহার করে আপনার অ্যাপ্লিকেশন থেকে জেমিনি এপিআইকে কল করার সময়, আপনি মিথুন মডেলটিকে একটি মাল্টিমোডাল ইনপুট ভিত্তিক পাঠ্য তৈরি করতে অনুরোধ করতে পারেন। মাল্টিমোডাল প্রম্পটগুলিতে একাধিক পদ্ধতি (বা ইনপুট প্রকার) অন্তর্ভুক্ত থাকতে পারে, যেমন চিত্র, পিডিএফএস, প্লেইন-টেক্সট ফাইল, ভিডিও এবং অডিও সহ পাঠ্য।

প্রতিটি মাল্টিমোডাল অনুরোধে, আপনাকে অবশ্যই সর্বদা নিম্নলিখিতগুলি সরবরাহ করতে হবে:

ফাইলটির mimeType । প্রতিটি ইনপুট ফাইলের সমর্থিত মাইম প্রকারগুলি সম্পর্কে জানুন।
ফাইল। আপনি হয় ফাইলটি ইনলাইন ডেটা হিসাবে সরবরাহ করতে পারেন (এই পৃষ্ঠায় দেখানো হয়েছে) বা এর ইউআরএল বা ইউআরআই ব্যবহার করে।

মাল্টিমোডাল প্রম্পটগুলিতে পরীক্ষা এবং পুনরাবৃত্তির জন্য, আমরা ভার্টেক্স এআই স্টুডিও ব্যবহার করার পরামর্শ দিই।

জেমিনি এপিআইয়ের সাথে কাজ করার জন্য অন্যান্য বিকল্পগুলি

জেমিনি এপিআইয়ের বিকল্প " গুগল এআই " সংস্করণ নিয়ে ally চ্ছিকভাবে পরীক্ষা করুন
গুগল এআই স্টুডিও এবং গুগল এআই ক্লায়েন্ট এসডিকেএস ব্যবহার করে বিনামূল্যে-চার্জ অ্যাক্সেস (সীমা এবং যেখানে পাওয়া যায়) পান। এই এসডিকে কেবল মোবাইল এবং ওয়েব অ্যাপ্লিকেশনগুলিতে প্রোটোটাইপিংয়ের জন্য ব্যবহার করা উচিত।
কোনও মিথুন এপিআই কীভাবে কাজ করে তার সাথে আপনি পরিচিত হওয়ার পরে, ফায়ারবেস এসডিকেএসে (এই ডকুমেন্টেশন) আমাদের ভার্টেক্স এআইতে স্থানান্তরিত করুন , যার মধ্যে মোবাইল এবং ওয়েব অ্যাপ্লিকেশনগুলির জন্য আরও অনেক অতিরিক্ত বৈশিষ্ট্য রয়েছে, যেমন অনুরোধে বড় মিডিয়া ফাইলগুলির জন্য Firebase App Check এবং সমর্থন ব্যবহার করে অপব্যবহার থেকে এপিআইকে রক্ষা করা।
Ally চ্ছিকভাবে জেমিনি এপিআইকে ভার্টেক্স এআই সার্ভার-সাইডে কল করুন (যেমন পাইথন, নোড.জেএস বা যান)
জেমিনি এপিআইয়ের Firebase Extensions সার্ভার-সাইড ভার্টেক্স এআই এসডিকে , Genkit বা ফায়ারবেস এক্সটেনশনগুলি ব্যবহার করুন।

আপনি শুরু করার আগে

যদি আপনি ইতিমধ্যে না থাকেন তবে শুরু করা গাইডটি সম্পূর্ণ করুন, যা আপনার ফায়ারবেস প্রকল্পটি কীভাবে সেট আপ করতে পারে, আপনার অ্যাপ্লিকেশনটিকে ফায়ারবেসে সংযুক্ত করতে পারে, এসডিকে যুক্ত করতে পারে, ভার্টেক্স এআই পরিষেবাটি শুরু করে এবং একটি GenerativeModel উদাহরণ তৈরি করে তা বর্ণনা করে।

পাঠ্য থেকে পাঠ্য উত্পন্ন করুন এবং একটি একক চিত্র পাঠ্য থেকে পাঠ্য উত্পন্ন করুন এবং একাধিক চিত্র পাঠ্য এবং একটি ভিডিও থেকে পাঠ্য উত্পন্ন করুন

গুরুত্বপূর্ণ: এই পৃষ্ঠার উদাহরণগুলি কীভাবে অনুরোধগুলিতে ইনলাইন ডেটা হিসাবে ছোট ফাইলগুলি অন্তর্ভুক্ত করবেন তা দেখায়। তবে, আপনি যদি এমন ফাইলগুলি অন্তর্ভুক্ত করতে চান যা আপনার মোট অনুরোধের আকারটি 20 এমবি ছাড়িয়ে যাবে , আপনাকে একটি ইউআরএল ব্যবহার করে ফাইলটি সরবরাহ করতে হবে (উদাহরণস্বরূপ, Cloud Storage for Firebase ব্যবহার করে)।

নমুনা মিডিয়া ফাইল

আপনার যদি ইতিমধ্যে মিডিয়া ফাইল না থাকে তবে আপনি নিম্নলিখিত সর্বজনীনভাবে উপলব্ধ ফাইলগুলি ব্যবহার করতে পারেন। যেহেতু এই ফাইলগুলি বালতিতে সংরক্ষণ করা হয় যা আপনার ফায়ারবেস প্রকল্পে নেই, তাই আপনাকে URL এর জন্য https://storage.googleapis.com/ BUCKET_NAME/PATH/TO/FILE ফর্ম্যাট ব্যবহার করতে হবে।

চিত্র : https://storage.googleapis.com/cloud-samples-data/generative-ai/image/scones.jpg data/generative-ai/image/scons.jpg একটি মাইম ধরণের image/jpeg সহ। এই চিত্রটি দেখুন বা ডাউনলোড করুন।
পিডিএফ : https://storage.googleapis.com/cloud-samples-data/generative-ai/pdf/2403.05530.pdf একটি মাইম টাইপ application/pdf সহ। এই পিডিএফ দেখুন বা ডাউনলোড করুন।
ভিডিও : https://storage.googleapis.com/cloud-samples-data/video/animals.mp4 নমুনা-data/video/animals.mp4 একটি মাইম টাইপ video/mp4 সহ। এই ভিডিওটি দেখুন বা ডাউনলোড করুন।
অডিও : https://storage.googleapis.com/cloud-samples-data/generative-ai/audio/pixel.mp3 data/generative-ai/audio/pixel.mp3 একটি মাইম টাইপ audio/mp3 সহ। শুনুন বা এই অডিওটি ডাউনলোড করুন।

পাঠ্য এবং একটি একক চিত্র থেকে পাঠ্য উত্পন্ন করুন

এই নমুনাটি চেষ্টা করার আগে আপনি এই গাইডের বিভাগটি শুরু করার আগে আপনি শেষ করেছেন তা নিশ্চিত করুন।

আপনি মাল্টিমোডাল প্রম্পটগুলির সাথে জেমিনি এপিআইকে কল করতে পারেন যা উভয় পাঠ্য এবং একটি একক ফাইল অন্তর্ভুক্ত করে (যেমন একটি চিত্রের মতো, যেমন এই উদাহরণে দেখানো হয়েছে)।

ইনপুট ফাইলগুলির জন্য প্রয়োজনীয়তা এবং সুপারিশগুলি পর্যালোচনা করার বিষয়টি নিশ্চিত করুন।

সুইফট

আপনি একটি মাল্টিমোডাল প্রম্পট অনুরোধ থেকে পাঠ্য তৈরি করতে generateContent() কল করতে পারেন যাতে পাঠ্য এবং একটি একক চিত্র অন্তর্ভুক্ত রয়েছে:

import FirebaseVertexAI

// Initialize the Vertex AI service
let vertex = VertexAI.vertexAI()

// Create a `GenerativeModel` instance with a model that supports your use case
let model = vertex.generativeModel(modelName: "gemini-2.0-flash")

guard let image = UIImage(systemName: "bicycle") else { fatalError() }

// Provide a text prompt to include with the image
let prompt = "What's in this picture?"

// To generate text output, call generateContent and pass in the prompt
let response = try await model.generateContent(image, prompt)
print(response.text ?? "No text in response.")

Kotlin

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
val generativeModel = Firebase.vertexAI.generativeModel("gemini-2.0-flash")

// Loads an image from the app/res/drawable/ directory
val bitmap: Bitmap = BitmapFactory.decodeResource(resources, R.drawable.sparky)

// Provide a prompt that includes the image specified above and text
val prompt = content {
  image(bitmap)
  text("What developer tool is this mascot from?")
}

// To generate text output, call generateContent with the prompt
val response = generativeModel.generateContent(prompt)
print(response.text)

Java

^{জাভার জন্য, এই এসডিকে পদ্ধতিগুলি ListenableFuture ফিউচারটি ফিরিয়ে দেয়।}

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
GenerativeModel gm = FirebaseVertexAI.getInstance()
        .generativeModel("gemini-2.0-flash");
GenerativeModelFutures model = GenerativeModelFutures.from(gm);

Bitmap bitmap = BitmapFactory.decodeResource(getResources(), R.drawable.sparky);

// Provide a prompt that includes the image specified above and text
Content content = new Content.Builder()
        .addImage(bitmap)
        .addText("What developer tool is this mascot from?")
        .build();

// To generate text output, call generateContent with the prompt
ListenableFuture<GenerateContentResponse> response = model.generateContent(content);
Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
    @Override
    public void onSuccess(GenerateContentResponse result) {
        String resultText = result.getText();
        System.out.println(resultText);
    }

    @Override
    public void onFailure(Throwable t) {
        t.printStackTrace();
    }
}, executor);

Web

import { initializeApp } from "firebase/app";
import { getVertexAI, getGenerativeModel } from "firebase/vertexai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Vertex AI service
const vertexAI = getVertexAI(firebaseApp);

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(vertexAI, { model: "gemini-2.0-flash" });

// Converts a File object to a Part object.
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

async function run() {
  // Provide a text prompt to include with the image
  const prompt = "What's different between these pictures?";

  const fileInputEl = document.querySelector("input[type=file]");
  const imagePart = await fileToGenerativePart(fileInputEl.files[0]);

  // To generate text output, call generateContent with the text and image
  const result = await model.generateContent([prompt, imagePart]);

  const response = result.response;
  const text = response.text();
  console.log(text);
}

run();

Dart

import 'package:firebase_vertexai/firebase_vertexai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
final model =
      FirebaseVertexAI.instance.generativeModel(model: 'gemini-2.0-flash');

// Provide a text prompt to include with the image
final prompt = TextPart("What's in the picture?");
// Prepare images for input
final image = await File('image0.jpg').readAsBytes();
final imagePart = InlineDataPart('image/jpeg', image);

// To generate text output, call generateContent with the text and image
final response = await model.generateContent([
  Content.multi([prompt,imagePart])
]);
print(response.text);

কীভাবে কোনও মডেল চয়ন করবেন এবং আপনার ব্যবহারের কেস এবং অ্যাপ্লিকেশনটির জন্য উপযুক্ত একটি অবস্থান কীভাবে চয়ন করবেন তা শিখুন।

পাঠ্য এবং একাধিক চিত্র থেকে পাঠ্য তৈরি করুন

আপনি মাল্টিমোডাল প্রম্পটগুলির সাথে জেমিনি এপিআইকে কল করতে পারেন যাতে পাঠ্য এবং একাধিক ফাইল উভয়ই অন্তর্ভুক্ত থাকে (যেমন চিত্রগুলি যেমন এই উদাহরণে দেখানো হয়েছে)।

সুইফট

আপনি একটি মাল্টিমোডাল প্রম্পট অনুরোধ থেকে পাঠ্য তৈরি করতে generateContent() কল করতে পারেন যাতে পাঠ্য এবং একাধিক চিত্র অন্তর্ভুক্ত রয়েছে:

import FirebaseVertexAI

// Initialize the Vertex AI service
let vertex = VertexAI.vertexAI()

// Create a `GenerativeModel` instance with a model that supports your use case
let model = vertex.generativeModel(modelName: "gemini-2.0-flash")

guard let image1 = UIImage(systemName: "car") else { fatalError() }
guard let image2 = UIImage(systemName: "car.2") else { fatalError() }

// Provide a text prompt to include with the images
let prompt = "What's different between these pictures?"

// To generate text output, call generateContent and pass in the prompt
let response = try await model.generateContent(image1, image2, prompt)
print(response.text ?? "No text in response.")

Kotlin

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
val generativeModel = Firebase.vertexAI.generativeModel("gemini-2.0-flash")

// Loads an image from the app/res/drawable/ directory
val bitmap1: Bitmap = BitmapFactory.decodeResource(resources, R.drawable.sparky)
val bitmap2: Bitmap = BitmapFactory.decodeResource(resources, R.drawable.sparky_eats_pizza)

// Provide a prompt that includes the images specified above and text
val prompt = content {
  image(bitmap1)
  image(bitmap2)
  text("What is different between these pictures?")
}

// To generate text output, call generateContent with the prompt
val response = generativeModel.generateContent(prompt)
print(response.text)

Java

^{জাভার জন্য, এই এসডিকে পদ্ধতিগুলি ListenableFuture ফিউচারটি ফিরিয়ে দেয়।}

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
GenerativeModel gm = FirebaseVertexAI.getInstance()
        .generativeModel("gemini-2.0-flash");
GenerativeModelFutures model = GenerativeModelFutures.from(gm);

Bitmap bitmap1 = BitmapFactory.decodeResource(getResources(), R.drawable.sparky);
Bitmap bitmap2 = BitmapFactory.decodeResource(getResources(), R.drawable.sparky_eats_pizza);

// Provide a prompt that includes the images specified above and text
Content prompt = new Content.Builder()
    .addImage(bitmap1)
    .addImage(bitmap2)
    .addText("What's different between these pictures?")
    .build();

// To generate text output, call generateContent with the prompt
ListenableFuture<GenerateContentResponse> response = model.generateContent(prompt);
Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
    @Override
    public void onSuccess(GenerateContentResponse result) {
        String resultText = result.getText();
        System.out.println(resultText);
    }

    @Override
    public void onFailure(Throwable t) {
        t.printStackTrace();
    }
}, executor);

Web

import { initializeApp } from "firebase/app";
import { getVertexAI, getGenerativeModel } from "firebase/vertexai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Vertex AI service
const vertexAI = getVertexAI(firebaseApp);

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(vertexAI, { model: "gemini-2.0-flash" });

// Converts a File object to a Part object.
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

async function run() {
  // Provide a text prompt to include with the images
  const prompt = "What's different between these pictures?";

  // Prepare images for input
  const fileInputEl = document.querySelector("input[type=file]");
  const imageParts = await Promise.all(
    [...fileInputEl.files].map(fileToGenerativePart)
  );

  // To generate text output, call generateContent with the text and images
  const result = await model.generateContent([prompt, ...imageParts]);

  const response = result.response;
  const text = response.text();
  console.log(text);
}

run();

Dart

import 'package:firebase_vertexai/firebase_vertexai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
final model =
      FirebaseVertexAI.instance.generativeModel(model: 'gemini-2.0-flash');

final (firstImage, secondImage) = await (
  File('image0.jpg').readAsBytes(),
  File('image1.jpg').readAsBytes()
).wait;
// Provide a text prompt to include with the images
final prompt = TextPart("What's different between these pictures?");
// Prepare images for input
final imageParts = [
  InlineDataPart('image/jpeg', firstImage),
  InlineDataPart('image/jpeg', secondImage),
];

// To generate text output, call generateContent with the text and images
final response = await model.generateContent([
  Content.multi([prompt, ...imageParts])
]);
print(response.text);

পাঠ্য এবং একটি ভিডিও থেকে পাঠ্য তৈরি করুন

আপনি মাল্টিমোডাল প্রম্পটগুলির সাথে জেমিনি এপিআইকে কল করতে পারেন যাতে পাঠ্য এবং ভিডিও ফাইল (গুলি) উভয়ই অন্তর্ভুক্ত থাকে (যেমন এই উদাহরণে দেখানো হয়েছে)।

সুইফট

আপনি একটি মাল্টিমোডাল প্রম্পট অনুরোধ থেকে পাঠ্য তৈরি করতে generateContent() কল করতে পারেন যাতে পাঠ্য এবং একটি একক ভিডিও অন্তর্ভুক্ত রয়েছে:

import FirebaseVertexAI

// Initialize the Vertex AI service
let vertex = VertexAI.vertexAI()

// Create a `GenerativeModel` instance with a model that supports your use case
let model = vertex.generativeModel(modelName: "gemini-2.0-flash")

// Provide the video as `Data` with the appropriate MIME type.
let video = InlineDataPart(data: try Data(contentsOf: videoURL), mimeType: "video/mp4")

// Provide a text prompt to include with the video
let prompt = "What is in the video?"

// To generate text output, call generateContent with the text and video
let response = try await model.generateContent(video, prompt)
print(response.text ?? "No text in response.")

Kotlin

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
val generativeModel = Firebase.vertexAI.generativeModel("gemini-2.0-flash")

val contentResolver = applicationContext.contentResolver
contentResolver.openInputStream(videoUri).use { stream ->
  stream?.let {
    val bytes = stream.readBytes()

    // Provide a prompt that includes the video specified above and text
    val prompt = content {
        inlineData(bytes, "video/mp4")
        text("What is in the video?")
    }

    // To generate text output, call generateContent with the prompt
    val response = generativeModel.generateContent(prompt)
    Log.d(TAG, response.text ?: "")
  }
}

Java

^{জাভার জন্য, এই এসডিকে পদ্ধতিগুলি ListenableFuture ফিউচারটি ফিরিয়ে দেয়।}

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
GenerativeModel gm = FirebaseVertexAI.getInstance()
        .generativeModel("gemini-2.0-flash");
GenerativeModelFutures model = GenerativeModelFutures.from(gm);

ContentResolver resolver = getApplicationContext().getContentResolver();
try (InputStream stream = resolver.openInputStream(videoUri)) {
    File videoFile = new File(new URI(videoUri.toString()));
    int videoSize = (int) videoFile.length();
    byte[] videoBytes = new byte[videoSize];
    if (stream != null) {
        stream.read(videoBytes, 0, videoBytes.length);
        stream.close();

        // Provide a prompt that includes the video specified above and text
        Content prompt = new Content.Builder()
                .addInlineData(videoBytes, "video/mp4")
                .addText("What is in the video?")
                .build();

        // To generate text output, call generateContent with the prompt
        ListenableFuture<GenerateContentResponse> response = model.generateContent(prompt);
        Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
            @Override
            public void onSuccess(GenerateContentResponse result) {
                String resultText = result.getText();
                System.out.println(resultText);
            }

            @Override
            public void onFailure(Throwable t) {
                t.printStackTrace();
            }
        }, executor);
    }
} catch (IOException e) {
    e.printStackTrace();
} catch (URISyntaxException e) {
    e.printStackTrace();
}

Web

import { initializeApp } from "firebase/app";
import { getVertexAI, getGenerativeModel } from "firebase/vertexai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Vertex AI service
const vertexAI = getVertexAI(firebaseApp);

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(vertexAI, { model: "gemini-2.0-flash" });

// Converts a File object to a Part object.
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

async function run() {
  // Provide a text prompt to include with the video
  const prompt = "What do you see?";

  const fileInputEl = document.querySelector("input[type=file]");
  const videoPart = await fileToGenerativePart(fileInputEl.files[0]);

  // To generate text output, call generateContent with the text and video
  const result = await model.generateContent([prompt, videoPart]);

  const response = result.response;
  const text = response.text();
  console.log(text);
}

run();

Dart

import 'package:firebase_vertexai/firebase_vertexai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
final model =
      FirebaseVertexAI.instance.generativeModel(model: 'gemini-2.0-flash');

// Provide a text prompt to include with the video
final prompt = TextPart("What's in the video?");

// Prepare video for input
final video = await File('video0.mp4').readAsBytes();

// Provide the video as `Data` with the appropriate mimetype
final videoPart = InlineDataPart('video/mp4', video);

// To generate text output, call generateContent with the text and images
final response = await model.generateContent([
  Content.multi([prompt, ...videoPart])
]);
print(response.text);

প্রতিক্রিয়া স্ট্রিম

এই নমুনাগুলি চেষ্টা করার আগে আপনি এই গাইডের বিভাগটি শুরু করার আগে আপনি শেষ করেছেন তা নিশ্চিত করুন।

মডেল প্রজন্মের পুরো ফলাফলের জন্য অপেক্ষা না করে আপনি দ্রুত মিথস্ক্রিয়া অর্জন করতে পারেন এবং পরিবর্তে আংশিক ফলাফলগুলি পরিচালনা করতে স্ট্রিমিং ব্যবহার করতে পারেন। প্রতিক্রিয়া প্রবাহিত করতে, generateContentStream কল করুন।

উদাহরণ দেখুন: পাঠ্য এবং একটি একক চিত্র থেকে উত্পন্ন পাঠ্য স্ট্রিম

সুইফট

import FirebaseVertexAI

// Initialize the Vertex AI service
let vertex = VertexAI.vertexAI()

// Create a `GenerativeModel` instance with a model that supports your use case
let model = vertex.generativeModel(modelName: "gemini-2.0-flash")

guard let image = UIImage(systemName: "bicycle") else { fatalError() }

// Provide a text prompt to include with the image
let prompt = "What's in this picture?"

// To stream generated text output, call generateContentStream and pass in the prompt
let contentStream = try model.generateContentStream(image, prompt)
for try await chunk in contentStream {
  if let text = chunk.text {
    print(text)
  }
}

Kotlin

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
val generativeModel = Firebase.vertexAI.generativeModel("gemini-2.0-flash")

// Loads an image from the app/res/drawable/ directory
val bitmap: Bitmap = BitmapFactory.decodeResource(resources, R.drawable.sparky)

// Provide a prompt that includes the image specified above and text
val prompt = content {
  image(bitmap)
  text("What developer tool is this mascot from?")
}

// To stream generated text output, call generateContentStream with the prompt
var fullResponse = ""
generativeModel.generateContentStream(prompt).collect { chunk ->
  print(chunk.text)
  fullResponse += chunk.text
}

Java

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
GenerativeModel gm = FirebaseVertexAI.getInstance()
        .generativeModel("gemini-2.0-flash");
GenerativeModelFutures model = GenerativeModelFutures.from(gm);

Bitmap bitmap = BitmapFactory.decodeResource(getResources(), R.drawable.sparky);

// Provide a prompt that includes the image specified above and text
Content prompt = new Content.Builder()
        .addImage(bitmap)
        .addText("What developer tool is this mascot from?")
        .build();

// To stream generated text output, call generateContentStream with the prompt
Publisher<GenerateContentResponse> streamingResponse = model.generateContentStream(prompt);

final String[] fullResponse = {""};

streamingResponse.subscribe(new Subscriber<GenerateContentResponse>() {
    @Override
    public void onNext(GenerateContentResponse generateContentResponse) {
        String chunk = generateContentResponse.getText();
        fullResponse[0] += chunk;
    }

    @Override
    public void onComplete() {
        System.out.println(fullResponse[0]);
    }

    @Override
    public void onError(Throwable t) {
        t.printStackTrace();
    }

    @Override
    public void onSubscribe(Subscription s) {
    }
});

Web

import { initializeApp } from "firebase/app";
import { getVertexAI, getGenerativeModel } from "firebase/vertexai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Vertex AI service
const vertexAI = getVertexAI(firebaseApp);

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(vertexAI, { model: "gemini-2.0-flash" });

// Converts a File object to a Part object.
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

async function run() {
  // Provide a text prompt to include with the image
  const prompt = "What do you see?";

  // Prepare image for input
  const fileInputEl = document.querySelector("input[type=file]");
  const imagePart = await fileToGenerativePart(fileInputEl.files[0]);

  // To stream generated text output, call generateContentStream with the text and image
  const result = await model.generateContentStream([prompt, imagePart]);

  for await (const chunk of result.stream) {
    const chunkText = chunk.text();
    console.log(chunkText);
  }
}

run();

Dart

import 'package:firebase_vertexai/firebase_vertexai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
final model =
      FirebaseVertexAI.instance.generativeModel(model: 'gemini-2.0-flash');

// Provide a text prompt to include with the image
final prompt = TextPart("What's in the picture?");
// Prepare images for input
final image = await File('image0.jpg').readAsBytes();
final imagePart = InlineDataPart('image/jpeg', image);

// To stream generated text output, call generateContentStream with the text and image
final response = await model.generateContentStream([
  Content.multi([prompt,imagePart])
]);
await for (final chunk in response) {
  print(chunk.text);
}

উদাহরণ দেখুন: পাঠ্য এবং একাধিক চিত্র থেকে উত্পন্ন পাঠ্য স্ট্রিম

সুইফট

import FirebaseVertexAI

// Initialize the Vertex AI service
let vertex = VertexAI.vertexAI()

// Create a `GenerativeModel` instance with a model that supports your use case
let model = vertex.generativeModel(modelName: "gemini-2.0-flash")

guard let image1 = UIImage(systemName: "car") else { fatalError() }
guard let image2 = UIImage(systemName: "car.2") else { fatalError() }

// Provide a text prompt to include with the images
let prompt = "What's different between these pictures?"

// To stream generated text output, call generateContentStream and pass in the prompt
let contentStream = try model.generateContentStream(image1, image2, prompt)
for try await chunk in contentStream {
  if let text = chunk.text {
    print(text)
  }
}

Kotlin

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
val generativeModel = Firebase.vertexAI.generativeModel("gemini-2.0-flash")

// Loads an image from the app/res/drawable/ directory
val bitmap1: Bitmap = BitmapFactory.decodeResource(resources, R.drawable.sparky)
val bitmap2: Bitmap = BitmapFactory.decodeResource(resources, R.drawable.sparky_eats_pizza)

// Provide a prompt that includes the images specified above and text
val prompt = content {
    image(bitmap1)
    image(bitmap2)
    text("What's different between these pictures?")
}

// To stream generated text output, call generateContentStream with the prompt
var fullResponse = ""
generativeModel.generateContentStream(prompt).collect { chunk ->
  print(chunk.text)
  fullResponse += chunk.text
}

Java

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
GenerativeModel gm = FirebaseVertexAI.getInstance()
        .generativeModel("gemini-2.0-flash");
GenerativeModelFutures model = GenerativeModelFutures.from(gm);

Bitmap bitmap1 = BitmapFactory.decodeResource(getResources(), R.drawable.sparky);
Bitmap bitmap2 = BitmapFactory.decodeResource(getResources(), R.drawable.sparky_eats_pizza);

// Provide a prompt that includes the images specified above and text
Content prompt = new Content.Builder()
    .addImage(bitmap1)
    .addImage(bitmap2)
    .addText("What's different between these pictures?")
    .build();

// To stream generated text output, call generateContentStream with the prompt
Publisher<GenerateContentResponse> streamingResponse = model.generateContentStream(prompt);

final String[] fullResponse = {""};

streamingResponse.subscribe(new Subscriber<GenerateContentResponse>() {
    @Override
    public void onNext(GenerateContentResponse generateContentResponse) {
        String chunk = generateContentResponse.getText();
        fullResponse[0] += chunk;
    }

    @Override
    public void onComplete() {
        System.out.println(fullResponse[0]);
    }

    @Override
    public void onError(Throwable t) {
        t.printStackTrace();
    }

    @Override
    public void onSubscribe(Subscription s) {
    }
});

Web

import { initializeApp } from "firebase/app";
import { getVertexAI, getGenerativeModel } from "firebase/vertexai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Vertex AI service
const vertexAI = getVertexAI(firebaseApp);

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(vertexAI, { model: "gemini-2.0-flash" });

// Converts a File object to a Part object.
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

async function run() {
  // Provide a text prompt to include with the images
  const prompt = "What's different between these pictures?";

  const fileInputEl = document.querySelector("input[type=file]");
  const imageParts = await Promise.all(
    [...fileInputEl.files].map(fileToGenerativePart)
  );

  // To stream generated text output, call generateContentStream with the text and images
  const result = await model.generateContentStream([prompt, ...imageParts]);

  for await (const chunk of result.stream) {
    const chunkText = chunk.text();
    console.log(chunkText);
  }
}

run();

Dart

import 'package:firebase_vertexai/firebase_vertexai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
final model =
      FirebaseVertexAI.instance.generativeModel(model: 'gemini-2.0-flash');

final (firstImage, secondImage) = await (
  File('image0.jpg').readAsBytes(),
  File('image1.jpg').readAsBytes()
).wait;
// Provide a text prompt to include with the images
final prompt = TextPart("What's different between these pictures?");
// Prepare images for input
final imageParts = [
  InlineDataPart('image/jpeg', firstImage),
  InlineDataPart('image/jpeg', secondImage),
];

// To stream generated text output, call generateContentStream with the text and images
final response = await model.generateContentStream([
  Content.multi([prompt, ...imageParts])
]);
await for (final chunk in response) {
  print(chunk.text);
}

উদাহরণ দেখুন: পাঠ্য এবং একটি ভিডিও থেকে উত্পন্ন পাঠ্য স্ট্রিম

সুইফট

import FirebaseVertexAI

// Initialize the Vertex AI service
let vertex = VertexAI.vertexAI()

// Create a `GenerativeModel` instance with a model that supports your use case
let model = vertex.generativeModel(modelName: "gemini-2.0-flash")

// Provide the video as `Data` with the appropriate MIME type
let video = InlineDataPart(data: try Data(contentsOf: videoURL), mimeType: "video/mp4")

// Provide a text prompt to include with the video
let prompt = "What is in the video?"

// To stream generated text output, call generateContentStream with the text and video
let contentStream = try model.generateContentStream(video, prompt)
for try await chunk in contentStream {
  if let text = chunk.text {
    print(text)
  }
}

Kotlin

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
val generativeModel = Firebase.vertexAI.generativeModel("gemini-2.0-flash")

val contentResolver = applicationContext.contentResolver
contentResolver.openInputStream(videoUri).use { stream ->
  stream?.let {
    val bytes = stream.readBytes()

    // Provide a prompt that includes the video specified above and text
    val prompt = content {
        inlineData(bytes, "video/mp4")
        text("What is in the video?")
    }

    // To stream generated text output, call generateContentStream with the prompt
    var fullResponse = ""
    generativeModel.generateContentStream(prompt).collect { chunk ->
        Log.d(TAG, chunk.text ?: "")
        fullResponse += chunk.text
    }
  }
}

Java

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
GenerativeModel gm = FirebaseVertexAI.getInstance()
        .generativeModel("gemini-2.0-flash");
GenerativeModelFutures model = GenerativeModelFutures.from(gm);

ContentResolver resolver = getApplicationContext().getContentResolver();
try (InputStream stream = resolver.openInputStream(videoUri)) {
    File videoFile = new File(new URI(videoUri.toString()));
    int videoSize = (int) videoFile.length();
    byte[] videoBytes = new byte[videoSize];
    if (stream != null) {
        stream.read(videoBytes, 0, videoBytes.length);
        stream.close();

        // Provide a prompt that includes the video specified above and text
        Content prompt = new Content.Builder()
                .addInlineData(videoBytes, "video/mp4")
                .addText("What is in the video?")
                .build();

        // To stream generated text output, call generateContentStream with the prompt
        Publisher<GenerateContentResponse> streamingResponse =
                model.generateContentStream(prompt);

        final String[] fullResponse = {""};

        streamingResponse.subscribe(new Subscriber<GenerateContentResponse>() {
            @Override
            public void onNext(GenerateContentResponse generateContentResponse) {
                String chunk = generateContentResponse.getText();
                fullResponse[0] += chunk;
            }

            @Override
            public void onComplete() {
                System.out.println(fullResponse[0]);
            }

            @Override
            public void onError(Throwable t) {
                t.printStackTrace();
            }

            @Override
            public void onSubscribe(Subscription s) {
            }
         });
    }
} catch (IOException e) {
    e.printStackTrace();
} catch (URISyntaxException e) {
    e.printStackTrace();
}

Web

import { initializeApp } from "firebase/app";
import { getVertexAI, getGenerativeModel } from "firebase/vertexai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Vertex AI service
const vertexAI = getVertexAI(firebaseApp);

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(vertexAI, { model: "gemini-2.0-flash" });

// Converts a File object to a Part object.
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

async function run() {
  // Provide a text prompt to include with the video
  const prompt = "What do you see?";

  const fileInputEl = document.querySelector("input[type=file]");
  const videoPart = await fileToGenerativePart(fileInputEl.files[0]);

  // To stream generated text output, call generateContentStream with the text and video
  const result = await model.generateContentStream([prompt, videoPart]);

  for await (const chunk of result.stream) {
    const chunkText = chunk.text();
    console.log(chunkText);
  }
}

run();

Dart

import 'package:firebase_vertexai/firebase_vertexai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
final model =
      FirebaseVertexAI.instance.generativeModel(model: 'gemini-2.0-flash');

// Provide a text prompt to include with the video
final prompt = TextPart("What's in the video?");

// Prepare video for input
final video = await File('video0.mp4').readAsBytes();

// Provide the video as `Data` with the appropriate mimetype
final videoPart = InlineDataPart('video/mp4', video);

// To stream generated text output, call generateContentStream with the text and image
final response = await model.generateContentStream([
  Content.multi([prompt,videoPart])
]);
await for (final chunk in response) {
  print(chunk.text);
}

ইনপুট ফাইলগুলির জন্য প্রয়োজনীয়তা এবং সুপারিশ

একটি অনুরোধে একটি ফাইল সরবরাহ করার জন্য বিভিন্ন বিকল্প
সমর্থিত ফাইল প্রকার
সমর্থিত মাইম প্রকারগুলি এবং কীভাবে সেগুলি নির্দিষ্ট করা যায়
ফাইল এবং মাল্টিমোডাল অনুরোধগুলির জন্য প্রয়োজনীয়তা এবং সেরা অনুশীলনগুলি

আপনি আর কি করতে পারেন?

মডেলটিতে দীর্ঘ প্রম্পট প্রেরণের আগে টোকেনগুলি কীভাবে গণনা করবেন তা শিখুন।
Cloud Storage for Firebase সেট আপ করুন যাতে আপনি আপনার মাল্টিমোডাল অনুরোধগুলিতে বড় ফাইলগুলি অন্তর্ভুক্ত করতে পারেন এবং প্রম্পটগুলিতে ফাইল সরবরাহ করার জন্য আরও পরিচালিত সমাধান পেতে পারেন। ফাইলগুলিতে চিত্র, পিডিএফএস, ভিডিও এবং অডিও অন্তর্ভুক্ত থাকতে পারে।
অননুমোদিত ক্লায়েন্টদের দ্বারা জেমিনি এপিআইকে অপব্যবহার থেকে রক্ষা করতে Firebase App Check সেট আপ করা সহ উত্পাদনের প্রস্তুতি সম্পর্কে চিন্তাভাবনা শুরু করুন। এছাড়াও, উত্পাদন চেকলিস্টটি পর্যালোচনা করতে ভুলবেন না।

অন্যান্য ক্ষমতা চেষ্টা করে দেখুন

মাল্টি-টার্ন কথোপকথন (চ্যাট) তৈরি করুন।
কেবল পাঠ্য-প্রম্পটগুলি থেকে পাঠ্য তৈরি করুন।
উভয় পাঠ্য এবং মাল্টিমোডাল প্রম্পট থেকে কাঠামোগত আউটপুট (জেএসনের মতো) উত্পন্ন করুন।
পাঠ্য অনুরোধগুলি থেকে চিত্র তৈরি করুন।
বাহ্যিক সিস্টেম এবং তথ্যের সাথে জেনারেটর মডেলগুলি সংযুক্ত করতে ফাংশন কলিং ব্যবহার করুন।

কীভাবে সামগ্রী প্রজন্ম নিয়ন্ত্রণ করবেন তা শিখুন

সর্বোত্তম অনুশীলন, কৌশল এবং উদাহরণ প্রম্পট সহ প্রম্পট ডিজাইনটি বুঝুন ।
তাপমাত্রা এবং সর্বোচ্চ আউটপুট টোকেন ( মিথুনের জন্য) বা দিক অনুপাত এবং ব্যক্তি প্রজন্মের ( ইমেজের জন্য) মতো মডেল পরামিতিগুলি কনফিগার করুন ।
ক্ষতিকারক হিসাবে বিবেচিত হতে পারে এমন প্রতিক্রিয়া পাওয়ার সম্ভাবনা সামঞ্জস্য করতে সুরক্ষা সেটিংস ব্যবহার করুন ।

সমর্থিত মডেলগুলি সম্পর্কে আরও জানুন

ফাইলটির mimeType । প্রতিটি ইনপুট ফাইলের সমর্থিত মাইম প্রকারগুলি সম্পর্কে জানুন।
ফাইল। আপনি হয় ফাইলটি ইনলাইন ডেটা হিসাবে সরবরাহ করতে পারেন (এই পৃষ্ঠায় দেখানো হয়েছে) বা এর ইউআরএল বা ইউআরআই ব্যবহার করে।

জেমিনি এপিআইয়ের সাথে কাজ করার জন্য অন্যান্য বিকল্পগুলি

জেমিনি এপিআইয়ের বিকল্প " গুগল এআই " সংস্করণ নিয়ে ally চ্ছিকভাবে পরীক্ষা করুন
গুগল এআই স্টুডিও এবং গুগল এআই ক্লায়েন্ট এসডিকেএস ব্যবহার করে বিনামূল্যে-চার্জ অ্যাক্সেস (সীমা এবং যেখানে পাওয়া যায়) পান। এই এসডিকে কেবল মোবাইল এবং ওয়েব অ্যাপ্লিকেশনগুলিতে প্রোটোটাইপিংয়ের জন্য ব্যবহার করা উচিত।
কোনও মিথুন এপিআই কীভাবে কাজ করে তার সাথে আপনি পরিচিত হওয়ার পরে, ফায়ারবেস এসডিকেএসে (এই ডকুমেন্টেশন) আমাদের ভার্টেক্স এআইতে স্থানান্তরিত করুন , যার মধ্যে মোবাইল এবং ওয়েব অ্যাপ্লিকেশনগুলির জন্য আরও অনেক অতিরিক্ত বৈশিষ্ট্য রয়েছে, যেমন অনুরোধে বড় মিডিয়া ফাইলগুলির জন্য Firebase App Check এবং সমর্থন ব্যবহার করে অপব্যবহার থেকে এপিআইকে রক্ষা করা।
Ally চ্ছিকভাবে জেমিনি এপিআইকে ভার্টেক্স এআই সার্ভার-সাইডে কল করুন (যেমন পাইথন, নোড.জেএস বা যান)
জেমিনি এপিআইয়ের Firebase Extensions সার্ভার-সাইড ভার্টেক্স এআই এসডিকে , Genkit বা ফায়ারবেস এক্সটেনশনগুলি ব্যবহার করুন।

আপনি শুরু করার আগে

নমুনা মিডিয়া ফাইল

চিত্র : https://storage.googleapis.com/cloud-samples-data/generative-ai/image/scones.jpg data/generative-ai/image/scons.jpg একটি মাইম ধরণের image/jpeg সহ। এই চিত্রটি দেখুন বা ডাউনলোড করুন।
পিডিএফ : https://storage.googleapis.com/cloud-samples-data/generative-ai/pdf/2403.05530.pdf একটি মাইম টাইপ application/pdf সহ। এই পিডিএফ দেখুন বা ডাউনলোড করুন।
ভিডিও : https://storage.googleapis.com/cloud-samples-data/video/animals.mp4 নমুনা-data/video/animals.mp4 একটি মাইম টাইপ video/mp4 সহ। এই ভিডিওটি দেখুন বা ডাউনলোড করুন।
অডিও : https://storage.googleapis.com/cloud-samples-data/generative-ai/audio/pixel.mp3 data/generative-ai/audio/pixel.mp3 একটি মাইম টাইপ audio/mp3 সহ। শুনুন বা এই অডিওটি ডাউনলোড করুন।

পাঠ্য এবং একটি একক চিত্র থেকে পাঠ্য উত্পন্ন করুন

সুইফট

import FirebaseVertexAI

// Initialize the Vertex AI service
let vertex = VertexAI.vertexAI()

// Create a `GenerativeModel` instance with a model that supports your use case
let model = vertex.generativeModel(modelName: "gemini-2.0-flash")

guard let image = UIImage(systemName: "bicycle") else { fatalError() }

// Provide a text prompt to include with the image
let prompt = "What's in this picture?"

// To generate text output, call generateContent and pass in the prompt
let response = try await model.generateContent(image, prompt)
print(response.text ?? "No text in response.")

Kotlin

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
val generativeModel = Firebase.vertexAI.generativeModel("gemini-2.0-flash")

// Loads an image from the app/res/drawable/ directory
val bitmap: Bitmap = BitmapFactory.decodeResource(resources, R.drawable.sparky)

// Provide a prompt that includes the image specified above and text
val prompt = content {
  image(bitmap)
  text("What developer tool is this mascot from?")
}

// To generate text output, call generateContent with the prompt
val response = generativeModel.generateContent(prompt)
print(response.text)

Note : The example above takes advantage of a simplified way to handle platform-native image types ( Bitmap ) in multi-modal prompts. These image types (regardless of their original format) are converted client-side to JPEG at 80% quality before being sent to the server. This means that when you provide images inline like in the example above, you don't need to specify the MIME type.

For more control over image formats and conversions, you may provide the images as an InlineDataPart and supply the specific MIME type. যেমন: content { inlineData(/* PNG as byte array */, "image/png") } .

Java

You can call generateContent() to generate text from a multimodal prompt request that includes text and a single image:

^{For Java, the methods in this SDK return a ListenableFuture .}

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
GenerativeModel gm = FirebaseVertexAI.getInstance()
        .generativeModel("gemini-2.0-flash");
GenerativeModelFutures model = GenerativeModelFutures.from(gm);

Bitmap bitmap = BitmapFactory.decodeResource(getResources(), R.drawable.sparky);

// Provide a prompt that includes the image specified above and text
Content content = new Content.Builder()
        .addImage(bitmap)
        .addText("What developer tool is this mascot from?")
        .build();

// To generate text output, call generateContent with the prompt
ListenableFuture<GenerateContentResponse> response = model.generateContent(content);
Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
    @Override
    public void onSuccess(GenerateContentResponse result) {
        String resultText = result.getText();
        System.out.println(resultText);
    }

    @Override
    public void onFailure(Throwable t) {
        t.printStackTrace();
    }
}, executor);

Web

You can call generateContent() to generate text from a multimodal prompt request that includes text and a single image:

import { initializeApp } from "firebase/app";
import { getVertexAI, getGenerativeModel } from "firebase/vertexai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Vertex AI service
const vertexAI = getVertexAI(firebaseApp);

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(vertexAI, { model: "gemini-2.0-flash" });

// Converts a File object to a Part object.
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

async function run() {
  // Provide a text prompt to include with the image
  const prompt = "What's different between these pictures?";

  const fileInputEl = document.querySelector("input[type=file]");
  const imagePart = await fileToGenerativePart(fileInputEl.files[0]);

  // To generate text output, call generateContent with the text and image
  const result = await model.generateContent([prompt, imagePart]);

  const response = result.response;
  const text = response.text();
  console.log(text);
}

run();

Dart

You can call generateContent() to generate text from a multimodal prompt request that includes text and a single image:

import 'package:firebase_vertexai/firebase_vertexai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
final model =
      FirebaseVertexAI.instance.generativeModel(model: 'gemini-2.0-flash');

// Provide a text prompt to include with the image
final prompt = TextPart("What's in the picture?");
// Prepare images for input
final image = await File('image0.jpg').readAsBytes();
final imagePart = InlineDataPart('image/jpeg', image);

// To generate text output, call generateContent with the text and image
final response = await model.generateContent([
  Content.multi([prompt,imagePart])
]);
print(response.text);

Learn how to choose a model and optionally a location appropriate for your use case and app.

Generate text from text and multiple images

Make sure that you've completed the Before you begin section of this guide before trying this sample.

You can call the Gemini API with multimodal prompts that include both text and multiple files (like images, as shown in this example).

Make sure to review the requirements and recommendations for input files .

সুইফট

You can call generateContent() to generate text from a multimodal prompt request that includes text and multiple images:

import FirebaseVertexAI

// Initialize the Vertex AI service
let vertex = VertexAI.vertexAI()

// Create a `GenerativeModel` instance with a model that supports your use case
let model = vertex.generativeModel(modelName: "gemini-2.0-flash")

guard let image1 = UIImage(systemName: "car") else { fatalError() }
guard let image2 = UIImage(systemName: "car.2") else { fatalError() }

// Provide a text prompt to include with the images
let prompt = "What's different between these pictures?"

// To generate text output, call generateContent and pass in the prompt
let response = try await model.generateContent(image1, image2, prompt)
print(response.text ?? "No text in response.")

Note : The example above takes advantage of a simplified way to handle platform-native image types ( UIImage , NSImage , CIImage , and CGImage ) in multi-modal prompts. These image types (regardless of their original format) are converted client-side to JPEG at 80% quality before being sent to the server. This means that when you provide images inline like in the example above, you don't need to specify the MIME type.

For more control over image formats and conversions, you may provide the images as an InlineDataPart and supply the specific MIME type. যেমন: InlineDataPart(data: Data(/* PNG Data */), mimeType: "image/png") .

Kotlin

You can call generateContent() to generate text from a multimodal prompt request that includes text and multiple images:

^{For Kotlin, the methods in this SDK are suspend functions and need to be called from a Coroutine scope .}

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
val generativeModel = Firebase.vertexAI.generativeModel("gemini-2.0-flash")

// Loads an image from the app/res/drawable/ directory
val bitmap1: Bitmap = BitmapFactory.decodeResource(resources, R.drawable.sparky)
val bitmap2: Bitmap = BitmapFactory.decodeResource(resources, R.drawable.sparky_eats_pizza)

// Provide a prompt that includes the images specified above and text
val prompt = content {
  image(bitmap1)
  image(bitmap2)
  text("What is different between these pictures?")
}

// To generate text output, call generateContent with the prompt
val response = generativeModel.generateContent(prompt)
print(response.text)

Java

You can call generateContent() to generate text from a multimodal prompt request that includes text and multiple images:

^{For Java, the methods in this SDK return a ListenableFuture .}

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
GenerativeModel gm = FirebaseVertexAI.getInstance()
        .generativeModel("gemini-2.0-flash");
GenerativeModelFutures model = GenerativeModelFutures.from(gm);

Bitmap bitmap1 = BitmapFactory.decodeResource(getResources(), R.drawable.sparky);
Bitmap bitmap2 = BitmapFactory.decodeResource(getResources(), R.drawable.sparky_eats_pizza);

// Provide a prompt that includes the images specified above and text
Content prompt = new Content.Builder()
    .addImage(bitmap1)
    .addImage(bitmap2)
    .addText("What's different between these pictures?")
    .build();

// To generate text output, call generateContent with the prompt
ListenableFuture<GenerateContentResponse> response = model.generateContent(prompt);
Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
    @Override
    public void onSuccess(GenerateContentResponse result) {
        String resultText = result.getText();
        System.out.println(resultText);
    }

    @Override
    public void onFailure(Throwable t) {
        t.printStackTrace();
    }
}, executor);

Web

You can call generateContent() to generate text from a multimodal prompt request that includes text and multiple images:

import { initializeApp } from "firebase/app";
import { getVertexAI, getGenerativeModel } from "firebase/vertexai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Vertex AI service
const vertexAI = getVertexAI(firebaseApp);

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(vertexAI, { model: "gemini-2.0-flash" });

// Converts a File object to a Part object.
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

async function run() {
  // Provide a text prompt to include with the images
  const prompt = "What's different between these pictures?";

  // Prepare images for input
  const fileInputEl = document.querySelector("input[type=file]");
  const imageParts = await Promise.all(
    [...fileInputEl.files].map(fileToGenerativePart)
  );

  // To generate text output, call generateContent with the text and images
  const result = await model.generateContent([prompt, ...imageParts]);

  const response = result.response;
  const text = response.text();
  console.log(text);
}

run();

Dart

You can call generateContent() to generate text from a multimodal prompt request that includes text and multiple images:

import 'package:firebase_vertexai/firebase_vertexai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
final model =
      FirebaseVertexAI.instance.generativeModel(model: 'gemini-2.0-flash');

final (firstImage, secondImage) = await (
  File('image0.jpg').readAsBytes(),
  File('image1.jpg').readAsBytes()
).wait;
// Provide a text prompt to include with the images
final prompt = TextPart("What's different between these pictures?");
// Prepare images for input
final imageParts = [
  InlineDataPart('image/jpeg', firstImage),
  InlineDataPart('image/jpeg', secondImage),
];

// To generate text output, call generateContent with the text and images
final response = await model.generateContent([
  Content.multi([prompt, ...imageParts])
]);
print(response.text);

Learn how to choose a model and optionally a location appropriate for your use case and app.

Generate text from text and a video

Make sure that you've completed the Before you begin section of this guide before trying this sample.

You can call the Gemini API with multimodal prompts that include both text and video file(s) (as shown in this example).

Make sure to review the requirements and recommendations for input files .

সুইফট

You can call generateContent() to generate text from a multimodal prompt request that includes text and a single video:

import FirebaseVertexAI

// Initialize the Vertex AI service
let vertex = VertexAI.vertexAI()

// Create a `GenerativeModel` instance with a model that supports your use case
let model = vertex.generativeModel(modelName: "gemini-2.0-flash")

// Provide the video as `Data` with the appropriate MIME type.
let video = InlineDataPart(data: try Data(contentsOf: videoURL), mimeType: "video/mp4")

// Provide a text prompt to include with the video
let prompt = "What is in the video?"

// To generate text output, call generateContent with the text and video
let response = try await model.generateContent(video, prompt)
print(response.text ?? "No text in response.")

Kotlin

You can call generateContent() to generate text from a multimodal prompt request that includes text and a single video:

^{For Kotlin, the methods in this SDK are suspend functions and need to be called from a Coroutine scope .}

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
val generativeModel = Firebase.vertexAI.generativeModel("gemini-2.0-flash")

val contentResolver = applicationContext.contentResolver
contentResolver.openInputStream(videoUri).use { stream ->
  stream?.let {
    val bytes = stream.readBytes()

    // Provide a prompt that includes the video specified above and text
    val prompt = content {
        inlineData(bytes, "video/mp4")
        text("What is in the video?")
    }

    // To generate text output, call generateContent with the prompt
    val response = generativeModel.generateContent(prompt)
    Log.d(TAG, response.text ?: "")
  }
}

Java

You can call generateContent() to generate text from a multimodal prompt request that includes text and a single video:

^{For Java, the methods in this SDK return a ListenableFuture .}

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
GenerativeModel gm = FirebaseVertexAI.getInstance()
        .generativeModel("gemini-2.0-flash");
GenerativeModelFutures model = GenerativeModelFutures.from(gm);

ContentResolver resolver = getApplicationContext().getContentResolver();
try (InputStream stream = resolver.openInputStream(videoUri)) {
    File videoFile = new File(new URI(videoUri.toString()));
    int videoSize = (int) videoFile.length();
    byte[] videoBytes = new byte[videoSize];
    if (stream != null) {
        stream.read(videoBytes, 0, videoBytes.length);
        stream.close();

        // Provide a prompt that includes the video specified above and text
        Content prompt = new Content.Builder()
                .addInlineData(videoBytes, "video/mp4")
                .addText("What is in the video?")
                .build();

        // To generate text output, call generateContent with the prompt
        ListenableFuture<GenerateContentResponse> response = model.generateContent(prompt);
        Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
            @Override
            public void onSuccess(GenerateContentResponse result) {
                String resultText = result.getText();
                System.out.println(resultText);
            }

            @Override
            public void onFailure(Throwable t) {
                t.printStackTrace();
            }
        }, executor);
    }
} catch (IOException e) {
    e.printStackTrace();
} catch (URISyntaxException e) {
    e.printStackTrace();
}

Web

You can call generateContent() to generate text from a multimodal prompt request that includes text and a single video:

import { initializeApp } from "firebase/app";
import { getVertexAI, getGenerativeModel } from "firebase/vertexai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Vertex AI service
const vertexAI = getVertexAI(firebaseApp);

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(vertexAI, { model: "gemini-2.0-flash" });

// Converts a File object to a Part object.
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

async function run() {
  // Provide a text prompt to include with the video
  const prompt = "What do you see?";

  const fileInputEl = document.querySelector("input[type=file]");
  const videoPart = await fileToGenerativePart(fileInputEl.files[0]);

  // To generate text output, call generateContent with the text and video
  const result = await model.generateContent([prompt, videoPart]);

  const response = result.response;
  const text = response.text();
  console.log(text);
}

run();

Dart

You can call generateContent() to generate text from a multimodal prompt request that includes text and a single video:

import 'package:firebase_vertexai/firebase_vertexai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
final model =
      FirebaseVertexAI.instance.generativeModel(model: 'gemini-2.0-flash');

// Provide a text prompt to include with the video
final prompt = TextPart("What's in the video?");

// Prepare video for input
final video = await File('video0.mp4').readAsBytes();

// Provide the video as `Data` with the appropriate mimetype
final videoPart = InlineDataPart('video/mp4', video);

// To generate text output, call generateContent with the text and images
final response = await model.generateContent([
  Content.multi([prompt, ...videoPart])
]);
print(response.text);

Learn how to choose a model and optionally a location appropriate for your use case and app.

Stream the response

Make sure that you've completed the Before you begin section of this guide before trying these samples.

You can achieve faster interactions by not waiting for the entire result from the model generation, and instead use streaming to handle partial results. To stream the response, call generateContentStream .

View example: Stream generated text from text and a single image

সুইফট

You can call generateContentStream() to stream generated text from a multimodal prompt request that includes text and a single image:

import FirebaseVertexAI

// Initialize the Vertex AI service
let vertex = VertexAI.vertexAI()

// Create a `GenerativeModel` instance with a model that supports your use case
let model = vertex.generativeModel(modelName: "gemini-2.0-flash")

guard let image = UIImage(systemName: "bicycle") else { fatalError() }

// Provide a text prompt to include with the image
let prompt = "What's in this picture?"

// To stream generated text output, call generateContentStream and pass in the prompt
let contentStream = try model.generateContentStream(image, prompt)
for try await chunk in contentStream {
  if let text = chunk.text {
    print(text)
  }
}

Kotlin

You can call generateContentStream() to stream generated text from a multimodal prompt request that includes text and a single image:

^{For Kotlin, the methods in this SDK are suspend functions and need to be called from a Coroutine scope .}

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
val generativeModel = Firebase.vertexAI.generativeModel("gemini-2.0-flash")

// Loads an image from the app/res/drawable/ directory
val bitmap: Bitmap = BitmapFactory.decodeResource(resources, R.drawable.sparky)

// Provide a prompt that includes the image specified above and text
val prompt = content {
  image(bitmap)
  text("What developer tool is this mascot from?")
}

// To stream generated text output, call generateContentStream with the prompt
var fullResponse = ""
generativeModel.generateContentStream(prompt).collect { chunk ->
  print(chunk.text)
  fullResponse += chunk.text
}

Java

You can call generateContentStream() to stream generated text from a multimodal prompt request that includes text and a single image:

^{For Java, the streaming methods in this SDK return a Publisher type from the Reactive Streams library .}

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
GenerativeModel gm = FirebaseVertexAI.getInstance()
        .generativeModel("gemini-2.0-flash");
GenerativeModelFutures model = GenerativeModelFutures.from(gm);

Bitmap bitmap = BitmapFactory.decodeResource(getResources(), R.drawable.sparky);

// Provide a prompt that includes the image specified above and text
Content prompt = new Content.Builder()
        .addImage(bitmap)
        .addText("What developer tool is this mascot from?")
        .build();

// To stream generated text output, call generateContentStream with the prompt
Publisher<GenerateContentResponse> streamingResponse = model.generateContentStream(prompt);

final String[] fullResponse = {""};

streamingResponse.subscribe(new Subscriber<GenerateContentResponse>() {
    @Override
    public void onNext(GenerateContentResponse generateContentResponse) {
        String chunk = generateContentResponse.getText();
        fullResponse[0] += chunk;
    }

    @Override
    public void onComplete() {
        System.out.println(fullResponse[0]);
    }

    @Override
    public void onError(Throwable t) {
        t.printStackTrace();
    }

    @Override
    public void onSubscribe(Subscription s) {
    }
});

Web

You can call generateContentStream() to stream generated text from a multimodal prompt request that includes text and a single image:

import { initializeApp } from "firebase/app";
import { getVertexAI, getGenerativeModel } from "firebase/vertexai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Vertex AI service
const vertexAI = getVertexAI(firebaseApp);

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(vertexAI, { model: "gemini-2.0-flash" });

// Converts a File object to a Part object.
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

async function run() {
  // Provide a text prompt to include with the image
  const prompt = "What do you see?";

  // Prepare image for input
  const fileInputEl = document.querySelector("input[type=file]");
  const imagePart = await fileToGenerativePart(fileInputEl.files[0]);

  // To stream generated text output, call generateContentStream with the text and image
  const result = await model.generateContentStream([prompt, imagePart]);

  for await (const chunk of result.stream) {
    const chunkText = chunk.text();
    console.log(chunkText);
  }
}

run();

Dart

You can call generateContentStream() to stream generated text from a multimodal prompt request that includes text and a single image:

import 'package:firebase_vertexai/firebase_vertexai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
final model =
      FirebaseVertexAI.instance.generativeModel(model: 'gemini-2.0-flash');

// Provide a text prompt to include with the image
final prompt = TextPart("What's in the picture?");
// Prepare images for input
final image = await File('image0.jpg').readAsBytes();
final imagePart = InlineDataPart('image/jpeg', image);

// To stream generated text output, call generateContentStream with the text and image
final response = await model.generateContentStream([
  Content.multi([prompt,imagePart])
]);
await for (final chunk in response) {
  print(chunk.text);
}

View example: Stream generated text from text and multiple images

সুইফট

You can call generateContentStream() to stream generated text from a multimodal prompt request that includes text and multiple images:

import FirebaseVertexAI

// Initialize the Vertex AI service
let vertex = VertexAI.vertexAI()

// Create a `GenerativeModel` instance with a model that supports your use case
let model = vertex.generativeModel(modelName: "gemini-2.0-flash")

guard let image1 = UIImage(systemName: "car") else { fatalError() }
guard let image2 = UIImage(systemName: "car.2") else { fatalError() }

// Provide a text prompt to include with the images
let prompt = "What's different between these pictures?"

// To stream generated text output, call generateContentStream and pass in the prompt
let contentStream = try model.generateContentStream(image1, image2, prompt)
for try await chunk in contentStream {
  if let text = chunk.text {
    print(text)
  }
}

Kotlin

You can call generateContentStream() to stream generated text from a multimodal prompt request that includes text and multiple images:

^{For Kotlin, the methods in this SDK are suspend functions and need to be called from a Coroutine scope .}

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
val generativeModel = Firebase.vertexAI.generativeModel("gemini-2.0-flash")

// Loads an image from the app/res/drawable/ directory
val bitmap1: Bitmap = BitmapFactory.decodeResource(resources, R.drawable.sparky)
val bitmap2: Bitmap = BitmapFactory.decodeResource(resources, R.drawable.sparky_eats_pizza)

// Provide a prompt that includes the images specified above and text
val prompt = content {
    image(bitmap1)
    image(bitmap2)
    text("What's different between these pictures?")
}

// To stream generated text output, call generateContentStream with the prompt
var fullResponse = ""
generativeModel.generateContentStream(prompt).collect { chunk ->
  print(chunk.text)
  fullResponse += chunk.text
}

Java

You can call generateContentStream() to stream generated text from a multimodal prompt request that includes text and multiple images:

^{For Java, the streaming methods in this SDK return a Publisher type from the Reactive Streams library .}

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
GenerativeModel gm = FirebaseVertexAI.getInstance()
        .generativeModel("gemini-2.0-flash");
GenerativeModelFutures model = GenerativeModelFutures.from(gm);

Bitmap bitmap1 = BitmapFactory.decodeResource(getResources(), R.drawable.sparky);
Bitmap bitmap2 = BitmapFactory.decodeResource(getResources(), R.drawable.sparky_eats_pizza);

// Provide a prompt that includes the images specified above and text
Content prompt = new Content.Builder()
    .addImage(bitmap1)
    .addImage(bitmap2)
    .addText("What's different between these pictures?")
    .build();

// To stream generated text output, call generateContentStream with the prompt
Publisher<GenerateContentResponse> streamingResponse = model.generateContentStream(prompt);

final String[] fullResponse = {""};

streamingResponse.subscribe(new Subscriber<GenerateContentResponse>() {
    @Override
    public void onNext(GenerateContentResponse generateContentResponse) {
        String chunk = generateContentResponse.getText();
        fullResponse[0] += chunk;
    }

    @Override
    public void onComplete() {
        System.out.println(fullResponse[0]);
    }

    @Override
    public void onError(Throwable t) {
        t.printStackTrace();
    }

    @Override
    public void onSubscribe(Subscription s) {
    }
});

Web

You can call generateContentStream() to stream generated text from a multimodal prompt request that includes text and multiple images:

import { initializeApp } from "firebase/app";
import { getVertexAI, getGenerativeModel } from "firebase/vertexai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Vertex AI service
const vertexAI = getVertexAI(firebaseApp);

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(vertexAI, { model: "gemini-2.0-flash" });

// Converts a File object to a Part object.
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

async function run() {
  // Provide a text prompt to include with the images
  const prompt = "What's different between these pictures?";

  const fileInputEl = document.querySelector("input[type=file]");
  const imageParts = await Promise.all(
    [...fileInputEl.files].map(fileToGenerativePart)
  );

  // To stream generated text output, call generateContentStream with the text and images
  const result = await model.generateContentStream([prompt, ...imageParts]);

  for await (const chunk of result.stream) {
    const chunkText = chunk.text();
    console.log(chunkText);
  }
}

run();

Dart

This example shows how to use generateContentStream to stream generated text from a multimodal prompt request that includes text and multiple images:

import 'package:firebase_vertexai/firebase_vertexai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
final model =
      FirebaseVertexAI.instance.generativeModel(model: 'gemini-2.0-flash');

final (firstImage, secondImage) = await (
  File('image0.jpg').readAsBytes(),
  File('image1.jpg').readAsBytes()
).wait;
// Provide a text prompt to include with the images
final prompt = TextPart("What's different between these pictures?");
// Prepare images for input
final imageParts = [
  InlineDataPart('image/jpeg', firstImage),
  InlineDataPart('image/jpeg', secondImage),
];

// To stream generated text output, call generateContentStream with the text and images
final response = await model.generateContentStream([
  Content.multi([prompt, ...imageParts])
]);
await for (final chunk in response) {
  print(chunk.text);
}

View example: Stream generated text from text and a video

সুইফট

You can call generateContentStream() to stream generated text from a multimodal prompt request that includes text and a single video:

import FirebaseVertexAI

// Initialize the Vertex AI service
let vertex = VertexAI.vertexAI()

// Create a `GenerativeModel` instance with a model that supports your use case
let model = vertex.generativeModel(modelName: "gemini-2.0-flash")

// Provide the video as `Data` with the appropriate MIME type
let video = InlineDataPart(data: try Data(contentsOf: videoURL), mimeType: "video/mp4")

// Provide a text prompt to include with the video
let prompt = "What is in the video?"

// To stream generated text output, call generateContentStream with the text and video
let contentStream = try model.generateContentStream(video, prompt)
for try await chunk in contentStream {
  if let text = chunk.text {
    print(text)
  }
}

Kotlin

You can call generateContentStream() to stream generated text from a multimodal prompt request that includes text and a single video:

^{For Kotlin, the methods in this SDK are suspend functions and need to be called from a Coroutine scope .}

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
val generativeModel = Firebase.vertexAI.generativeModel("gemini-2.0-flash")

val contentResolver = applicationContext.contentResolver
contentResolver.openInputStream(videoUri).use { stream ->
  stream?.let {
    val bytes = stream.readBytes()

    // Provide a prompt that includes the video specified above and text
    val prompt = content {
        inlineData(bytes, "video/mp4")
        text("What is in the video?")
    }

    // To stream generated text output, call generateContentStream with the prompt
    var fullResponse = ""
    generativeModel.generateContentStream(prompt).collect { chunk ->
        Log.d(TAG, chunk.text ?: "")
        fullResponse += chunk.text
    }
  }
}

Java

You can call generateContentStream() to stream generated text from a multimodal prompt request that includes text and a single video:

^{For Java, the streaming methods in this SDK return a Publisher type from the Reactive Streams library .}

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
GenerativeModel gm = FirebaseVertexAI.getInstance()
        .generativeModel("gemini-2.0-flash");
GenerativeModelFutures model = GenerativeModelFutures.from(gm);

ContentResolver resolver = getApplicationContext().getContentResolver();
try (InputStream stream = resolver.openInputStream(videoUri)) {
    File videoFile = new File(new URI(videoUri.toString()));
    int videoSize = (int) videoFile.length();
    byte[] videoBytes = new byte[videoSize];
    if (stream != null) {
        stream.read(videoBytes, 0, videoBytes.length);
        stream.close();

        // Provide a prompt that includes the video specified above and text
        Content prompt = new Content.Builder()
                .addInlineData(videoBytes, "video/mp4")
                .addText("What is in the video?")
                .build();

        // To stream generated text output, call generateContentStream with the prompt
        Publisher<GenerateContentResponse> streamingResponse =
                model.generateContentStream(prompt);

        final String[] fullResponse = {""};

        streamingResponse.subscribe(new Subscriber<GenerateContentResponse>() {
            @Override
            public void onNext(GenerateContentResponse generateContentResponse) {
                String chunk = generateContentResponse.getText();
                fullResponse[0] += chunk;
            }

            @Override
            public void onComplete() {
                System.out.println(fullResponse[0]);
            }

            @Override
            public void onError(Throwable t) {
                t.printStackTrace();
            }

            @Override
            public void onSubscribe(Subscription s) {
            }
         });
    }
} catch (IOException e) {
    e.printStackTrace();
} catch (URISyntaxException e) {
    e.printStackTrace();
}

Web

You can call generateContentStream() to stream generated text from a multimodal prompt request that includes text and a single video:

import { initializeApp } from "firebase/app";
import { getVertexAI, getGenerativeModel } from "firebase/vertexai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Vertex AI service
const vertexAI = getVertexAI(firebaseApp);

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(vertexAI, { model: "gemini-2.0-flash" });

// Converts a File object to a Part object.
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

async function run() {
  // Provide a text prompt to include with the video
  const prompt = "What do you see?";

  const fileInputEl = document.querySelector("input[type=file]");
  const videoPart = await fileToGenerativePart(fileInputEl.files[0]);

  // To stream generated text output, call generateContentStream with the text and video
  const result = await model.generateContentStream([prompt, videoPart]);

  for await (const chunk of result.stream) {
    const chunkText = chunk.text();
    console.log(chunkText);
  }
}

run();

Dart

You can call generateContentStream() to stream generated text from a multimodal prompt request that includes text and a single video:

import 'package:firebase_vertexai/firebase_vertexai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
final model =
      FirebaseVertexAI.instance.generativeModel(model: 'gemini-2.0-flash');

// Provide a text prompt to include with the video
final prompt = TextPart("What's in the video?");

// Prepare video for input
final video = await File('video0.mp4').readAsBytes();

// Provide the video as `Data` with the appropriate mimetype
final videoPart = InlineDataPart('video/mp4', video);

// To stream generated text output, call generateContentStream with the text and image
final response = await model.generateContentStream([
  Content.multi([prompt,videoPart])
]);
await for (final chunk in response) {
  print(chunk.text);
}

Requirements and recommendations for input files

See Supported input files and requirements for the Gemini API in Vertex AI to learn about the following:

Different options for providing a file in a request
সমর্থিত ফাইল প্রকার
Supported MIME types and how to specify them
Requirements and best practices for files and multimodal requests

আপনি আর কি করতে পারেন?

Learn how to count tokens before sending long prompts to the model.
Set up Cloud Storage for Firebase so that you can include large files in your multimodal requests and have a more managed solution for providing files in prompts. Files can include images, PDFs, video, and audio.
Start thinking about preparing for production, including setting up Firebase App Check to protect the Gemini API from abuse by unauthorized clients. Also, make sure to review the production checklist .

Try out other capabilities

Build multi-turn conversations (chat) .
Generate text from text-only prompts .
Generate structured output (like JSON) from both text and multimodal prompts.
Generate images from text prompts.
Use function calling to connect generative models to external systems and information.

Learn how to control content generation

Understand prompt design , including best practices, strategies, and example prompts.
Configure model parameters like temperature and maximum output tokens (for Gemini ) or aspect ratio and person generation (for Imagen ).
Use safety settings to adjust the likelihood of getting responses that may be considered harmful.

You can also experiment with prompts and model configurations using Vertex AI Studio .

Learn more about the supported models

Learn about the models available for various use cases and their quotas and pricing .

Give feedback about your experience with Vertex AI in Firebase