
SAP Cloud SDK for AI (for Java)

⚠️ This is a pre-alpha version of the AI SDK for Java. The APIs are subject to change ⚠️

About this project

Integrate chat completion into your business applications with SAP Cloud SDK for GenAI Hub. Leverage the Generative AI Hub of SAP AI Core to make use of templating, grounding, data masking, content filtering and more. Set up your SAP AI Core instance with SAP Cloud SDK for AI Core.

List of available and tested APIs

We maintain a list of currently available and tested AI Core APIs.

Documentation

AI Core Deployment

Prerequisites

  • An AI Core service instance in SAP BTP
  • A configuration created in AI Core
    • An example configuration from the AI Core /configuration endpoint
      {
        "createdAt": "2024-07-03T12:44:08Z",
        "executableId": "azure-openai",
        "id": "12345-123-123-123-123456abcdefg",
        "inputArtifactBindings": [],
        "name": "gpt-35-turbo",
        "parameterBindings": [
          {
            "key": "modelName",
            "value": "gpt-35-turbo"
          },
          {
            "key": "modelVersion",
            "value": "latest"
          }
        ],
        "scenarioId": "foundation-models"
      }
      
  • A Java project with a Maven pom.xml
    • Java 17 or higher
    • Maven 3.9 or higher
    • If Spring Boot is used, version 3 or higher
  • Set the AI Core credentials as an environment variable for local testing

Maven dependencies

Add the following dependencies to your pom.xml file:

<dependencies>
  <dependency>
    <groupId>com.sap.ai.sdk</groupId>
    <artifactId>core</artifactId>
    <version>${ai-sdk.version}</version>
  </dependency>
</dependencies>

See an example pom in our Spring Boot application

Create a Deployment

public AiDeploymentCreationResponse createDeployment() {

  final AiDeploymentCreationResponse deployment =
      new DeploymentApi(getClient())
          .deploymentCreate(
              "default",
              AiDeploymentCreationRequest.create()
                  .configurationId("12345-123-123-123-123456abcdefg"));

  String id = deployment.getId();
  AiExecutionStatus status = deployment.getStatus();

  return deployment;
}

See an example in our Spring Boot application

Delete a Deployment

public AiDeploymentDeletionResponse deleteDeployment(AiDeploymentCreationResponse deployment) {

  DeploymentApi client = new DeploymentApi(getClient());

  if (deployment.getStatus() == AiExecutionStatus.RUNNING) {
    // Only RUNNING deployments can be STOPPED
    client.deploymentModify(
        "default",
        deployment.getId(),
        AiDeploymentModificationRequest.create().targetStatus(AiDeploymentTargetStatus.STOPPED));
  }
  // Wait a few seconds for the deployment to stop
  // Only UNKNOWN and STOPPED deployments can be DELETED
  return client.deploymentDelete("default", deployment.getId());
}
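The comment above glosses over the wait between stopping and deleting. A small polling helper makes it explicit. The following is a generic sketch, not part of the SDK: the `fetchStatus` supplier is a hypothetical stand-in for an actual status lookup via `DeploymentApi`, and the attempt count and interval are arbitrary.

```java
import java.util.function.Supplier;

class DeploymentPoller {

  /**
   * Polls the given status supplier until it reports "STOPPED" or the maximum
   * number of attempts is reached. Returns true if the deployment stopped in time.
   */
  static boolean waitUntilStopped(
      Supplier<String> fetchStatus, int maxAttempts, long pollIntervalMillis)
      throws InterruptedException {
    for (int attempt = 0; attempt < maxAttempts; attempt++) {
      if ("STOPPED".equals(fetchStatus.get())) {
        return true;
      }
      // Not stopped yet; wait before asking again
      Thread.sleep(pollIntervalMillis);
    }
    return false;
  }
}
```

In an application, `fetchStatus` would wrap a status query against the deployment (e.g. reading the status of the deployment with the given id from `DeploymentApi`) before calling `deploymentDelete`.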

See an example in our Spring Boot application

OpenAI chat completion

Prerequisites

  • A deployed OpenAI model in AI Core.
    • How to deploy a model to AI Core
    • An example deployed model from the AI Core /deployments endpoint
      {
        "id": "d123456abcdefg",
        "deploymentUrl": "https://api.ai.region.aws.ml.hana.ondemand.com/v2/inference/deployments/d123456abcdefg",
        "configurationId": "12345-123-123-123-123456abcdefg",
        "configurationName": "gpt-35-turbo",
        "scenarioId": "foundation-models",
        "status": "RUNNING",
        "statusMessage": null,
        "targetStatus": "RUNNING",
        "lastOperation": "CREATE",
        "latestRunningConfigurationId": "12345-123-123-123-123456abcdefg",
        "ttl": null,
        "details": {
          "scaling": {
            "backendDetails": null,
            "backend_details": {
            }
          },
          "resources": {
            "backendDetails": null,
            "backend_details": {
              "model": {
                "name": "gpt-35-turbo",
                "version": "latest"
              }
            }
          }
        },
        "createdAt": "2024-07-03T12:44:22Z",
        "modifiedAt": "2024-07-16T12:44:19Z",
        "submissionTime": "2024-07-03T12:44:51Z",
        "startTime": "2024-07-03T12:45:56Z",
        "completionTime": null
      }
      
  • A Java project with a Maven pom.xml
    • Java 17 or higher
    • Maven 3.9 or higher
    • If Spring Boot is used, version 3 or higher
  • Set the AI Core credentials as an environment variable for local testing

Maven dependencies

Add the following dependencies to your pom.xml file:

<dependencies>
  <dependency>
    <groupId>com.sap.ai.sdk.foundationmodels</groupId>
    <artifactId>openai</artifactId>
    <version>${ai-sdk.version}</version>
  </dependency>
</dependencies>

See an example pom in our Spring Boot application

Simple chat completion

final OpenAiChatCompletionOutput result =
    OpenAiClient.forModel(GPT_35_TURBO)
        .withSystemPrompt("You are a helpful AI")
        .chatCompletion("Hello World! Why is this phrase so famous?");

final String resultMessage = result.getContent();

Message history

final var systemMessage =
    new OpenAiChatSystemMessage().setContent("You are a helpful assistant");
final var userMessage =
    new OpenAiChatUserMessage().addText("Hello World! Why is this phrase so famous?");
final var request =
    new OpenAiChatCompletionParameters().addMessages(systemMessage, userMessage);

final OpenAiChatCompletionOutput result =
    OpenAiClient.forModel(GPT_35_TURBO).chatCompletion(request);

final String resultMessage = result.getContent();

See an example in our Spring Boot application

Chat completion with a model not defined in OpenAiModel

final OpenAiChatCompletionOutput result =
    OpenAiClient.forModel(new OpenAiModel("model")).chatCompletion(request);

Stream chat completion

It's possible to pass a stream of chat completion delta elements, e.g. from the application backend to the frontend in real-time.

Stream the chat completion asynchronously

This is a blocking example for streaming and printing directly to the console:

String msg = "Can you give me the first 100 numbers of the Fibonacci sequence?";

OpenAiClient client = OpenAiClient.forModel(GPT_35_TURBO);

// try-with-resources on stream ensures the connection will be closed
try (Stream<String> stream = client.streamChatCompletion(msg)) {
    stream.forEach(deltaString -> {
        System.out.print(deltaString);
        System.out.flush();
    });
}

It's also possible to aggregate the total output.

The following example is non-blocking. Any asynchronous library can be used, e.g. the classic Thread API.

String msg = "Can you give me the first 100 numbers of the Fibonacci sequence?";

OpenAiChatCompletionParameters request =
    new OpenAiChatCompletionParameters()
        .addMessages(new OpenAiChatUserMessage().addText(msg));

OpenAiChatCompletionOutput totalOutput = new OpenAiChatCompletionOutput();
OpenAiClient client = OpenAiClient.forModel(GPT_35_TURBO);

// Perform the request before starting the thread to handle exceptions during request initialization
Stream<OpenAiChatCompletionDelta> stream = client.streamChatCompletionDeltas(request);

Thread thread = new Thread(() -> {
    // try-with-resources ensures the stream is closed
    try (stream) {
        stream.peek(totalOutput::addDelta).forEach(delta -> System.out.println(delta));
    }
});
thread.start(); // non-blocking

thread.join(); // blocking

// access aggregated information from total output, e.g.
Integer tokens = totalOutput.getUsage().getCompletionTokens();
System.out.println("Tokens: " + tokens);

Spring Boot example

Please find an example in our Spring Boot application. It shows the usage of Spring Boot's ResponseBodyEmitter to stream the chat completion delta messages to the frontend in real-time.
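A minimal sketch of that pattern is shown below. It assumes a Spring `@RestController` and the `OpenAiClient` from the examples above; the endpoint path, the prompt, and the bare `Thread` (instead of a managed executor) are illustrative choices, not SDK requirements.

```java
import java.io.IOException;
import java.util.stream.Stream;

import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.servlet.mvc.method.annotation.ResponseBodyEmitter;

@RestController
class StreamController {

  @GetMapping("/streamChatCompletion")
  ResponseBodyEmitter streamChatCompletion() {
    final var emitter = new ResponseBodyEmitter();
    // GPT_35_TURBO via static import, as in the examples above
    final var client = OpenAiClient.forModel(GPT_35_TURBO);
    new Thread(
            () -> {
              // try-with-resources closes the connection to AI Core when done
              try (Stream<String> stream = client.streamChatCompletion("Tell me a story")) {
                stream.forEach(
                    delta -> {
                      try {
                        emitter.send(delta); // push each delta to the frontend immediately
                      } catch (IOException e) {
                        emitter.completeWithError(e);
                      }
                    });
                emitter.complete();
              }
            })
        .start();
    // Return the emitter right away; the response body is filled as deltas arrive
    return emitter;
  }
}
```

For the complete, working version, refer to the linked Spring Boot sample application.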

Orchestration chat completion

Prerequisites

  • A deployed Orchestration service in AI Core.
    • Orchestration documentation
    • An example orchestration deployment from the AI Core /deployments endpoint
      {
        "id": "d123456abcdefg",
        "deploymentUrl": "https://api.ai.intprod-eu12.eu-central-1.aws.ml.hana.ondemand.com/v2/inference/deployments/d123456abcdefg",
        "configurationId": "12345-123-123-123-123456abcdefg",
        "configurationName": "orchestration",
        "scenarioId": "orchestration",
        "status": "RUNNING",
        "statusMessage": null,
        "targetStatus": "RUNNING",
        "lastOperation": "CREATE",
        "latestRunningConfigurationId": "12345-123-123-123-123456abcdefg",
        "ttl": null,
        "createdAt": "2024-08-05T16:17:29Z",
        "modifiedAt": "2024-08-06T06:32:50Z",
        "submissionTime": "2024-08-05T16:17:40Z",
        "startTime": "2024-08-05T16:18:41Z",
        "completionTime": null
      }
      
  • A Java project with a Maven pom.xml
    • Java 17 or higher
    • Maven 3.9 or higher
    • If Spring Boot is used, version 3 or higher
  • Set the AI Core credentials as an environment variable for local testing

Maven dependencies

Add the following dependencies to your pom.xml file:

<dependencies>
  <dependency>
    <groupId>com.sap.ai.sdk</groupId>
    <artifactId>orchestration</artifactId>
    <version>${ai-sdk.version}</version>
  </dependency>
</dependencies>

See an example pom in our Spring Boot application

Chat completion template

final var llmConfig = LLMModuleConfig.create().modelName("gpt-35-turbo").modelParams(Map.of());

final var inputParams =
    Map.of("input", "Reply with 'Orchestration Service is working!' in German");
final var template = ChatMessage.create().role("user").content("{{?input}}");
final var templatingConfig = TemplatingModuleConfig.create().template(template);

final var config =
    CompletionPostRequest.create()
        .orchestrationConfig(
            OrchestrationConfig.create()
                .moduleConfigurations(
                    ModuleConfigs.create()
                        .llmModuleConfig(llmConfig)
                        .templatingModuleConfig(templatingConfig)))
        .inputParams(inputParams);

final CompletionPostResponse result =
    new OrchestrationCompletionApi(getOrchestrationClient("default"))
        .orchestrationV1EndpointsCreate(config);

final String messageResult =
    result.getOrchestrationResult().getChoices().get(0).getMessage().getContent();

See an example in our Spring Boot application

Messages history

final var llmConfig = LLMModuleConfig.create().modelName("gpt-35-turbo").modelParams(Map.of());

List<ChatMessage> messagesHistory =
    List.of(
        ChatMessage.create().role("user").content("What is the capital of France?"),
        ChatMessage.create().role("assistant").content("The capital of France is Paris."));

final var message =
    ChatMessage.create().role("user").content("What is the typical food there?");
final var templatingConfig = TemplatingModuleConfig.create().template(message);

final var config =
    CompletionPostRequest.create()
        .orchestrationConfig(
            OrchestrationConfig.create()
                .moduleConfigurations(
                    ModuleConfigs.create()
                        .llmModuleConfig(llmConfig)
                        .templatingModuleConfig(templatingConfig)))
        .inputParams(Map.of())
        .messagesHistory(messagesHistory);

final CompletionPostResponse result =
    new OrchestrationCompletionApi(getOrchestrationClient("default"))
        .orchestrationV1EndpointsCreate(config);

final String messageResult =
    result.getOrchestrationResult().getChoices().get(0).getMessage().getContent();

See an example in our Spring Boot application

Chat completion filter

final var llmConfig = LLMModuleConfig.create().modelName("gpt-35-turbo").modelParams(Map.of());

final var inputParams =
    Map.of(
        "disclaimer",
        "```DISCLAIMER: The area surrounding the apartment is known for prostitutes and gang violence including armed conflicts, gun violence is frequent.");
final var template =
    ChatMessage.create()
        .role("user")
        .content(
            "Create a rental posting for subletting my apartment in the downtown area. Keep it short. Make sure to add the following disclaimer to the end. Do not change it! {{?disclaimer}}");
final var templatingConfig = TemplatingModuleConfig.create().template(template);

final var filterStrict = 
    FilterConfig.create()
        .type(FilterConfig.TypeEnum.AZURE_CONTENT_SAFETY)
        .config(
            AzureContentSafety.create()
                .hate(NUMBER_0)
                .selfHarm(NUMBER_0)
                .sexual(NUMBER_0)
                .violence(NUMBER_0));

final var filterLoose =
    FilterConfig.create()
        .type(FilterConfig.TypeEnum.AZURE_CONTENT_SAFETY)
        .config(
            AzureContentSafety.create()
                .hate(NUMBER_4)
                .selfHarm(NUMBER_4)
                .sexual(NUMBER_4)
                .violence(NUMBER_4));

final var filteringConfig =
    FilteringModuleConfig.create()
        // changing the input to filterLoose will allow the message to pass
        .input(FilteringConfig.create().filters(filterStrict))
        .output(FilteringConfig.create().filters(filterStrict));

final var config =
    CompletionPostRequest.create()
        .orchestrationConfig(
            OrchestrationConfig.create()
                .moduleConfigurations(
                    ModuleConfigs.create()
                        .llmModuleConfig(llmConfig)
                        .templatingModuleConfig(templatingConfig)
                        .filteringModuleConfig(filteringConfig)))
        .inputParams(inputParams);

final CompletionPostResponse result =
    new OrchestrationCompletionApi(getOrchestrationClient("default"))
        // this fails with Bad Request because the strict filter prohibits the input message
        .orchestrationV1EndpointsCreate(config);

final String messageResult =
    result.getOrchestrationResult().getChoices().get(0).getMessage().getContent();

See an example in our Spring Boot application

Set model parameters

Change your LLM module configuration to add model parameters:

var llmConfig =
    LLMModuleConfig.create()
        .modelName("gpt-35-turbo")
        .modelParams(
            Map.of(
                "max_tokens", 50,
                "temperature", 0.1,
                "frequency_penalty", 0,
                "presence_penalty", 0));

See an example in our unit test

Add a header to every request

To add a header to AI Core requests, use the following code:

ApiClient client = Core.getClient().addDefaultHeader("header-key", "header-value");
DeploymentApi api = new DeploymentApi(client);

For more customization, creating a HeaderProvider is also possible.
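A header provider could look roughly like the sketch below. This assumes the SAP Cloud SDK's `DestinationHeaderProvider` extension point, which is registered via the Java `ServiceLoader` mechanism (a `META-INF/services` entry); verify the exact package names and registration against the SAP Cloud SDK documentation before relying on it.

```java
import java.util.List;

import com.sap.cloud.sdk.cloudplatform.connectivity.DestinationHeaderProvider;
import com.sap.cloud.sdk.cloudplatform.connectivity.DestinationRequestContext;
import com.sap.cloud.sdk.cloudplatform.connectivity.Header;

// Register by listing this class in
// META-INF/services/com.sap.cloud.sdk.cloudplatform.connectivity.DestinationHeaderProvider
public class MyHeaderProvider implements DestinationHeaderProvider {

  @Override
  public List<Header> getHeaders(final DestinationRequestContext requestContext) {
    // Added to every outgoing request made via the destination
    return List.of(new Header("header-key", "header-value"));
  }
}
```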

Requirements and Setup for AI Core

For any AI Core service interaction, the SAP AI SDK requires credentials to be available at application runtime. By default, the credentials are extracted automatically from a service instance of type "aicore" bound to the application. Running the application locally without this service binding will throw an exception:

Could not find any matching service bindings for service identifier 'aicore'

There are multiple options to register the service binding:

  • Regular service binding in SAP BTP Cloud Foundry (resulting in VCAP_SERVICES env var entry).
  • Set an environment variable explicitly: AICORE_SERVICE_KEY
  • Define and use a Destination in BTP Destination Service.
  • (For CAP applications) use the hybrid testing approach (not recommended for production).
    • For example: cds bind --to aicore --exec mvn spring-boot:run
  • Leverage a "user-provided" service binding (not recommended for production).
  • Define and use a custom ServiceBinding or ServiceBindingAccessor declaration in the application (not recommended for production).

Regular service binding in SAP BTP Cloud Foundry

After application restart, there should be an "aicore" entry in the environment variable VCAP_SERVICES:

{
  "aicore": [
      {
        "clientid": "...",
        "clientsecret": "...",
        "url": "...",
        "identityzone": "...",
        "identityzoneid": "...",
        "appname": "...",
        "serviceurls": {
          "AI_API_URL": "..."
        }
      }
  ]
}

Set credentials as dedicated environment variable

  • Go into the SAP BTP Cockpit

  • Instances and Subscriptions -> Instances -> AI Core -> View Credentials -> Copy JSON

  • Set it as an environment variable AICORE_SERVICE_KEY in your IDE

    Or in your terminal:

export AICORE_SERVICE_KEY='{   "serviceurls": {     "AI_API_URL": ...'

Define and use a Destination

  • Lookup service-key credentials as explained in the previous step for AICORE_SERVICE_KEY.

  • Define a new destination in the SAP BTP Destination Service using the service-key credentials
    • (Destinations can be added on subaccount level and on service instance level.)
    • (The URL field requires an additional path segment: /v2)
    • Name: my-aicore
    • Type: HTTP
    • URL: [serviceurls.AI_API_URL]/v2
    • Proxy-Type: Internet
    • Authentication: Oauth2ClientCredentials
    • Client ID: [clientid]
    • Client Secret: [clientsecret]
    • Token Service URL Type: Dedicated
    • Token Service URL: [url]
  • At application runtime the following can be executed:

    Destination destination = DestinationAccessor.getDestination("my-aicore");
    ApiClient client = Core.getClient(destination);

Run the Spring Boot test application

cd sample-code/spring-app
mvn spring-boot:run

Deploy to Cloud Foundry

mvn clean package
cf push

Contribute

Set-up Formatting

  • Install the Google Java Format plugin in IntelliJ and follow these instructions.

Support, Feedback, Contributing

This project is open to feature requests, suggestions, bug reports, etc. via GitHub issues. Contribution and feedback are encouraged and always welcome. For more information about how to contribute, the project structure, and additional contribution information, see our Contribution Guidelines.

Security / Disclosure

If you find a bug that may be a security problem, please follow the instructions in our security policy on how to report it. Please do not create GitHub issues for security-related concerns.

Code of Conduct

We as members, contributors, and leaders pledge to make participation in our community a harassment-free experience for everyone. By participating in this project, you agree to abide by its Code of Conduct at all times.

Licensing

Copyright 2024 SAP SE or an SAP affiliate company and ai-sdk-java contributors. Please see our LICENSE for copyright and license information. Detailed information including third-party components and their licensing/copyright information is available via the REUSE tool.