GenerativeModelFutures

public abstract class GenerativeModelFutures


A Java-friendly wrapper for the GenerativeModel class, providing ListenableFuture for asynchronous operations.

Summary

Public methods

abstract @NonNull ListenableFuture<@FeatureStatus @NonNull Integer>
checkStatus()

Checks the current availability status of the content generation feature.

abstract @NonNull ListenableFuture<@NonNull Void>
clearCaches()

Clears all caches created by prefix caching.

abstract @NonNull ListenableFuture<@NonNull CountTokensResponse>
countTokens(@NonNull GenerateContentRequest request)

Counts the number of tokens in the request.

abstract @NonNull ListenableFuture<@NonNull Void>
download(@NonNull DownloadCallback callback)

Downloads the required model assets for the content generation feature if they are not already available.

static final @NonNull GenerativeModelFutures
from(@NonNull GenerativeModel generativeModel)
abstract @NonNull ListenableFuture<@NonNull GenerateContentResponse>
generateContent(@NonNull String prompt)

Runs a non-streaming inference call to the model for the provided prompt.

abstract @NonNull ListenableFuture<@NonNull GenerateContentResponse>
generateContent(@NonNull GenerateContentRequest request)

Runs a non-streaming inference call to the model for the provided GenerateContentRequest.

abstract @NonNull ListenableFuture<@NonNull GenerateContentResponse>
generateContent(
    @NonNull String prompt,
    @NonNull StreamingCallback callback
)

Runs a streaming inference call to the model for the provided prompt.

abstract @NonNull ListenableFuture<@NonNull GenerateContentResponse>
generateContent(
    @NonNull GenerateContentRequest request,
    @NonNull StreamingCallback callback
)

Runs a streaming inference call to the model for the provided GenerateContentRequest.

abstract @NonNull ListenableFuture<@NonNull String>
getBaseModelName()

Returns the name of the base model used by this generator instance.

abstract @NonNull GenerativeModel
getGenerativeModel()

Returns the underlying GenerativeModel instance that was used to create this object.

abstract @NonNull ListenableFuture<@NonNull Integer>
getTokenLimit()

Returns the total token limit for the API, including both input and output tokens.

abstract @NonNull ListenableFuture<@NonNull Void>
warmup()

Warms up the inference engine by loading the necessary models.

Public methods

checkStatus

public abstract @NonNull ListenableFuture<@FeatureStatus @NonNull Integer> checkStatus()

Checks the current availability status of the content generation feature.

clearCaches

public abstract @NonNull ListenableFuture<@NonNull Void> clearCaches()

Clears all caches created by prefix caching.

This method is experimental. When promptPrefix is provided in a GenerateContentRequest, the system caches its processing to reduce inference time for subsequent requests that share the same prefix; this method clears all such caches.

countTokens

public abstract @NonNull ListenableFuture<@NonNull CountTokensResponse> countTokens(@NonNull GenerateContentRequest request)

Counts the number of tokens in the request.

The count includes both input and output tokens. The result can be compared with getTokenLimit to check whether the request is within the token limit.

Parameters
@NonNull GenerateContentRequest request

a non-null GenerateContentRequest containing input content.

download

public abstract @NonNull ListenableFuture<@NonNull Void> download(@NonNull DownloadCallback callback)

Downloads the required model assets for the content generation feature if they are not already available.

from

public static final @NonNull GenerativeModelFutures from(@NonNull GenerativeModel generativeModel)
Parameters
@NonNull GenerativeModel generativeModel

the GenerativeModel instance to wrap.

Returns
@NonNull GenerativeModelFutures

a GenerativeModelFutures created around the provided GenerativeModel instance.
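Putting the lifecycle methods together, a typical setup might look like the sketch below. It assumes Guava's Futures and FutureCallback utilities and an app-supplied executor; the generativeModel and downloadCallback instances are placeholders whose construction is not shown here, and the status handling is illustrative only.

```java
// Sketch of a typical setup flow; generativeModel, downloadCallback, and
// executor are assumed to be created elsewhere in the app.
GenerativeModelFutures model = GenerativeModelFutures.from(generativeModel);

Futures.addCallback(model.checkStatus(), new FutureCallback<Integer>() {
    @Override
    public void onSuccess(Integer status) {
        // Depending on the reported status, download missing model
        // assets and warm up the engine before the first inference:
        // model.download(downloadCallback);
        // model.warmup();
    }

    @Override
    public void onFailure(Throwable t) {
        // The feature may be unavailable on this device.
    }
}, executor);
```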

generateContent

public abstract @NonNull ListenableFuture<@NonNull GenerateContentResponse> generateContent(@NonNull String prompt)

Runs a non-streaming inference call to the model for the provided prompt.

generateContent

public abstract @NonNull ListenableFuture<@NonNull GenerateContentResponse> generateContent(@NonNull GenerateContentRequest request)

Runs a non-streaming inference call to the model for the provided GenerateContentRequest.

generateContent

public abstract @NonNull ListenableFuture<@NonNull GenerateContentResponse> generateContent(
    @NonNull String prompt,
    @NonNull StreamingCallback callback
)

Runs a streaming inference call to the model for the provided prompt.

Parameters
@NonNull String prompt

the input prompt text.

@NonNull StreamingCallback callback

the callback to receive streaming results.
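As a hedged sketch, a streaming call delivers partial results through the callback while the returned future completes with the full response. The lambda form below assumes StreamingCallback is a single-method interface, which this page does not confirm; consult the StreamingCallback reference for the actual shape.

```java
// Hypothetical streaming usage; the callback parameter name and lambda
// form are assumptions, not confirmed by this reference page.
ListenableFuture<GenerateContentResponse> future = model.generateContent(
        "Summarize this article in two sentences.",
        partialText -> {
            // Append each streamed chunk to the UI as it arrives instead
            // of waiting for the complete response.
        });
```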

generateContent

public abstract @NonNull ListenableFuture<@NonNull GenerateContentResponse> generateContent(
    @NonNull GenerateContentRequest request,
    @NonNull StreamingCallback callback
)

Runs a streaming inference call to the model for the provided GenerateContentRequest.

Parameters
@NonNull GenerateContentRequest request

the request containing the prompt, which may include text and images.

@NonNull StreamingCallback callback

the callback to receive streaming results.

getBaseModelName

public abstract @NonNull ListenableFuture<@NonNull String> getBaseModelName()

Returns the name of the base model used by this generator instance.

getGenerativeModel

public abstract @NonNull GenerativeModel getGenerativeModel()

Returns the underlying GenerativeModel instance that was used to create this object.

getTokenLimit

public abstract @NonNull ListenableFuture<@NonNull Integer> getTokenLimit()

Returns the total token limit for the API, including both input and output tokens.

This limit can be used with countTokens to check whether a request is within limits before running inference. The input size returned by countTokens plus the output size specified by GenerateContentRequest.maxOutputTokens should be no larger than the limit returned by this method.

Returns
@NonNull ListenableFuture<@NonNull Integer>

a ListenableFuture resolving to the token limit.
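The constraint described above (input tokens plus maxOutputTokens must not exceed the limit) reduces to a simple arithmetic check. The helper below is hypothetical, not part of this API; it only encodes that inequality, with the three values assumed to come from countTokens, the request's maxOutputTokens, and getTokenLimit respectively.

```java
// Hypothetical helper encoding the budget check described above:
// input tokens (from countTokens) plus maxOutputTokens (from the
// request) must not exceed the limit from getTokenLimit().
static boolean fitsWithinLimit(int inputTokens, int maxOutputTokens, int tokenLimit) {
    return inputTokens + maxOutputTokens <= tokenLimit;
}

// Example: 3800 input tokens + 256 output tokens fits a 4096-token limit,
// but 4000 + 200 does not.
boolean ok = fitsWithinLimit(3800, 256, 4096);  // true
boolean tooBig = fitsWithinLimit(4000, 200, 4096);  // false
```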

warmup

public abstract @NonNull ListenableFuture<@NonNull Void> warmup()

Warms up the inference engine by loading the necessary models.