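Generative models break data down into units called tokens for processing. Each model has a maximum number of tokens that it can handle in a prompt and response.
This page shows how to use the Count Tokens API to estimate the token count and the billable character count of a request to a Gemini model. There is no API for estimating the number of tokens in a response.
Note that the Count Tokens API cannot be used with Imagen models.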
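What information is provided in the counts?
Note the following about counting tokens and billable characters:
Counting the total tokens
This count helps you make sure that your requests stay within the allowed context window.
The token count reflects the size of all files (for example, images) provided as part of the request input. It does not count the number of images or the number of seconds in a video.
For all Gemini models, a token is equivalent to about 4 characters, and 100 tokens are about 60-80 English words.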
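As a rough illustration of the rule of thumb above, the following sketch estimates a token count from a string's length and checks it against a context window. The helper names and the 8192-token limit are hypothetical and chosen only for illustration; the estimate is a local heuristic, not a substitute for the Count Tokens API.

```javascript
// Rule of thumb from this page: one token is roughly 4 characters.
const CHARS_PER_TOKEN = 4;

// Hypothetical helper: rough local estimate, not an API call.
// Note: String.length counts UTF-16 code units, which is only an approximation
// of "characters" for text outside the Basic Multilingual Plane.
function estimateTokens(text) {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Hypothetical helper: does the estimated prompt size fit in a given window?
function fitsContextWindow(text, maxInputTokens) {
  return estimateTokens(text) <= maxInputTokens;
}

const prompt = "Write a story about a magic backpack.";
console.log(`Estimated tokens: ${estimateTokens(prompt)}`);
// 8192 is an illustrative limit, not the limit of any particular model.
console.log(`Fits in 8192 tokens: ${fitsContextWindow(prompt, 8192)}`);
```

Because the ratio is approximate, a prompt close to the limit should still be checked with the Count Tokens API before it is sent.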
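Counting the total billable characters
This count helps you understand and control your costs, because for Vertex AI the number of characters is part of the pricing calculation.
The billable character count reflects the number of characters in the text provided as part of the request input.
For older Gemini models, tokens are not part of the pricing calculation; for Gemini 2.0 models, however, tokens are used in the pricing calculation. Learn more about token limits per model and pricing per model.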
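To make the cost relationship concrete, here is a minimal sketch that turns a billable character count into an estimated input cost. The function name and the price are hypothetical placeholders; take the actual rate, and whether your model is priced by characters or by tokens, from the pricing page for that model.

```javascript
// Hypothetical helper: estimated input cost for a character-priced model.
// pricePerThousandChars is a placeholder that the caller must supply.
function estimateInputCost(totalBillableCharacters, pricePerThousandChars) {
  if (totalBillableCharacters < 0 || pricePerThousandChars < 0) {
    throw new RangeError("Counts and prices must be non-negative.");
  }
  return (totalBillableCharacters / 1000) * pricePerThousandChars;
}

// Example with a made-up price of 0.5 currency units per 1000 characters.
const cost = estimateInputCost(2000, 0.5);
console.log(`Estimated input cost: ${cost}`);
```

In practice the first argument would come from the `totalBillableCharacters` value returned by the Count Tokens API.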
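Pricing and quota for counting tokens and billable characters
Using the CountTokens API incurs no charge and no quota restriction beyond the API's own maximum quota of 3000 requests per minute (RPM).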
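If an app calls the API frequently, a small client-side guard can help it stay under the 3000 RPM figure. The class below is a hypothetical sliding-window sketch, not part of any SDK, and it only tracks calls made through a single instance.

```javascript
// Hypothetical sliding-window limiter for a requests-per-minute budget.
class RpmLimiter {
  constructor(maxRequestsPerMinute, now = () => Date.now()) {
    this.max = maxRequestsPerMinute;
    this.now = now;        // injectable clock, useful for testing
    this.timestamps = [];  // times of requests made within the last minute
  }

  // Returns true and records the request if it is allowed right now.
  tryAcquire() {
    const cutoff = this.now() - 60_000;
    // Drop timestamps that have left the one-minute window.
    while (this.timestamps.length > 0 && this.timestamps[0] <= cutoff) {
      this.timestamps.shift();
    }
    if (this.timestamps.length >= this.max) {
      return false;
    }
    this.timestamps.push(this.now());
    return true;
  }
}

const limiter = new RpmLimiter(3000);
if (limiter.tryAcquire()) {
  console.log("OK to call countTokens now.");
}
```

When `tryAcquire()` returns false, the caller can delay the request or fall back to a local estimate.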
Code examples
Text-only input
Swift
let response = try await model.countTokens("Write a story about a magic backpack.")
print("Total Tokens: \(response.totalTokens)")
print("Total Billable Characters: \(response.totalBillableCharacters)")
Kotlin
val response = generativeModel.countTokens("Write a story about a magic backpack.")
println("Total Tokens: ${response.totalTokens}")
println("Total Billable Characters: ${response.totalBillableCharacters}")
Java
// Build the text-only prompt
Content prompt = new Content.Builder()
    .addText("Write a story about a magic backpack.")
    .build();
// Wrap the model so that it returns a ListenableFuture
GenerativeModelFutures modelFutures = GenerativeModelFutures.from(model);
ListenableFuture<CountTokensResponse> countTokensResponse =
    modelFutures.countTokens(prompt);
Futures.addCallback(countTokensResponse, new FutureCallback<CountTokensResponse>() {
    @Override
    public void onSuccess(CountTokensResponse response) {
        System.out.println("Total Tokens = " + response.getTotalTokens());
        System.out.println("Total Billable Characters = " +
            response.getTotalBillableCharacters());
    }
    @Override
    public void onFailure(Throwable t) {
        t.printStackTrace();
    }
}, executor);
Web
const { totalTokens, totalBillableCharacters } = await model.countTokens("Write a story about a magic backpack.");
console.log(`Total tokens: ${totalTokens}, total billable characters: ${totalBillableCharacters}`);
Dart
final tokenCount = await model.countTokens(Content.text("Write a story about a magic backpack."));
print('Token count: ${tokenCount.totalTokens}, billable characters: ${tokenCount.totalBillableCharacters}');
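Building on the text-only Web example, the following sketch counts tokens before sending a request and reports whether the prompt fits a caller-supplied limit. `checkPromptFits`, `maxInputTokens`, and the stub model are hypothetical; the only SDK call assumed is `model.countTokens`, used as in the Web example above.

```javascript
// Hypothetical helper: counts tokens for a prompt and compares with a limit.
async function checkPromptFits(model, prompt, maxInputTokens) {
  const { totalTokens, totalBillableCharacters } = await model.countTokens(prompt);
  return {
    totalTokens,
    totalBillableCharacters,
    fits: totalTokens <= maxInputTokens,
  };
}

// Stand-in for a real model so that the sketch runs on its own.
// The numbers it returns are fabricated from the prompt length
// and do not reflect how the real API counts.
const stubModel = {
  async countTokens(prompt) {
    return {
      totalTokens: Math.ceil(prompt.length / 4),
      totalBillableCharacters: prompt.length,
    };
  },
};

// 8192 is an illustrative limit, not the limit of any particular model.
checkPromptFits(stubModel, "Write a story about a magic backpack.", 8192)
  .then((result) => console.log(result));
```

With a real model instance in place of `stubModel`, the `fits` flag can gate the actual generation call.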
Multimodal input
Swift
let response = try await model.countTokens(image, "What's in this picture?")
print("Total Tokens: \(response.totalTokens)")
print("Total Billable Characters: \(response.totalBillableCharacters)")
Kotlin
val prompt = content {
    image(bitmap)
    text("What's in this picture?")
}
val response = generativeModel.countTokens(prompt)
println("Total Tokens: ${response.totalTokens}")
println("Total Billable Characters: ${response.totalBillableCharacters}")
Java
// Build a prompt that combines an image with text
Content prompt = new Content.Builder()
    .addImage(bitmap)
    .addText("What's in this picture?")
    .build();
// Wrap the model so that it returns a ListenableFuture
GenerativeModelFutures modelFutures = GenerativeModelFutures.from(model);
ListenableFuture<CountTokensResponse> countTokensResponse =
    modelFutures.countTokens(prompt);
Futures.addCallback(countTokensResponse, new FutureCallback<CountTokensResponse>() {
    @Override
    public void onSuccess(CountTokensResponse response) {
        System.out.println("Total Tokens = " + response.getTotalTokens());
        System.out.println("Total Billable Characters = " +
            response.getTotalBillableCharacters());
    }
    @Override
    public void onFailure(Throwable t) {
        t.printStackTrace();
    }
}, executor);
Web
const prompt = "What's in this picture?";
const imagePart = { inlineData: { mimeType: 'image/jpeg', data: imageAsBase64 }};
const { totalTokens, totalBillableCharacters } = await model.countTokens([prompt, imagePart]);
console.log(`Total tokens: ${totalTokens}, total billable characters: ${totalBillableCharacters}`);
Dart
final prompt = TextPart("What's in the picture?");
final tokenCount = await model.countTokens([
  Content.multi([prompt, imagePart])
]);
print('Token count: ${tokenCount.totalTokens}, billable characters: ${tokenCount.totalBillableCharacters}');
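The multimodal Web example above assumes that `imageAsBase64` already holds the image data as a base64 string. As a minimal sketch, one way to produce it from raw bytes is shown below; `bytesToBase64` and `toImagePart` are hypothetical helper names, and the part shape simply mirrors the `inlineData` object used in the Web example.

```javascript
// Hypothetical helper: encodes raw bytes (for example, a Uint8Array) as base64.
// Relies on the global btoa, available in browsers and recent Node.js versions.
function bytesToBase64(bytes) {
  let binary = "";
  for (const byte of bytes) {
    binary += String.fromCharCode(byte);
  }
  return btoa(binary);
}

// Hypothetical helper: wraps image bytes in the inlineData shape
// used by the Web example.
function toImagePart(bytes, mimeType) {
  return { inlineData: { mimeType, data: bytesToBase64(bytes) } };
}

// The first three bytes of a JPEG file, used here only as sample data.
const imagePart = toImagePart(new Uint8Array([0xff, 0xd8, 0xff]), "image/jpeg");
console.log(imagePart);
```

For large images, building the string byte by byte can be slow; reading the file as a data URL is a common alternative in browsers.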