
Set up your self-hosted model infrastructure

DETAILS:
Tier: For a limited time, Ultimate. On October 17, 2024, Ultimate with GitLab Duo Enterprise.
Offering: Self-managed
Status: Beta

FLAG: The availability of this feature is controlled by a feature flag. For more information, see the history.

When you self-host the model, the AI Gateway, and the GitLab instance, no calls are made to external architectures, ensuring maximum levels of security.

To set up your self-hosted model infrastructure:

  1. Install the large language model (LLM) serving infrastructure.
  2. Configure your GitLab instance.
  3. Install the GitLab AI Gateway.

Install large language model serving infrastructure

Install one of the following GitLab-approved large language models (LLMs):

| Model family | Model | Code completion | Code generation | GitLab Duo Chat |
|--------------|-------|-----------------|-----------------|-----------------|
| Mistral | Codestral 22B (see setup instructions) | {check-circle} Yes | {check-circle} Yes | {dotted-circle} No |
| Mistral | Mistral 7B | {dotted-circle} No | {check-circle} Yes | {check-circle} Yes |
| Mistral | Mixtral 8x22B | {dotted-circle} No | {check-circle} Yes | {check-circle} Yes |
| Mistral | Mixtral 8x7B | {dotted-circle} No | {check-circle} Yes | {check-circle} Yes |
| Mistral | Mistral 7B Text | {check-circle} Yes | {dotted-circle} No | {dotted-circle} No |
| Mistral | Mixtral 8x22B Text | {check-circle} Yes | {dotted-circle} No | {dotted-circle} No |
| Mistral | Mixtral 8x7B Text | {check-circle} Yes | {dotted-circle} No | {dotted-circle} No |
| Claude 3 | Claude 3.5 Sonnet | {dotted-circle} No | {check-circle} Yes | {check-circle} Yes |

The following models are under evaluation, and support is limited:

| Model family | Model | Code completion | Code generation | GitLab Duo Chat |
|--------------|-------|-----------------|-----------------|-----------------|
| CodeGemma | CodeGemma 2b | {check-circle} Yes | {dotted-circle} No | {dotted-circle} No |
| CodeGemma | CodeGemma 7b-it (Instruction) | {dotted-circle} No | {check-circle} Yes | {dotted-circle} No |
| CodeGemma | CodeGemma 7b-code (Code) | {check-circle} Yes | {dotted-circle} No | {dotted-circle} No |
| CodeLlama | Code-Llama 13b-code | {check-circle} Yes | {dotted-circle} No | {dotted-circle} No |
| CodeLlama | Code-Llama 13b | {dotted-circle} No | {check-circle} Yes | {dotted-circle} No |
| DeepSeekCoder | DeepSeek Coder 33b Instruct | {check-circle} Yes | {check-circle} Yes | {dotted-circle} No |
| DeepSeekCoder | DeepSeek Coder 33b Base | {check-circle} Yes | {dotted-circle} No | {dotted-circle} No |
| GPT | GPT-3.5-Turbo | {dotted-circle} No | {check-circle} Yes | {dotted-circle} No |
| GPT | GPT-4 | {dotted-circle} No | {check-circle} Yes | {dotted-circle} No |
| GPT | GPT-4 Turbo | {dotted-circle} No | {check-circle} Yes | {dotted-circle} No |
| GPT | GPT-4o | {dotted-circle} No | {check-circle} Yes | {dotted-circle} No |
| GPT | GPT-4o-mini | {dotted-circle} No | {check-circle} Yes | {dotted-circle} No |

Use a serving architecture

To host your models, you should use:

  • For non-cloud on-premise deployments, vLLM (a minimal example follows this list).
  • For cloud deployments, AWS Bedrock or Azure as the cloud provider.
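
For illustration, the following sketch serves one of the approved models with vLLM's OpenAI-compatible server. The model name, served model name, and port are assumptions; adapt them to your hardware and the model you chose:

# Install vLLM, then start its OpenAI-compatible API server.
# Model and port below are illustrative assumptions.
pip install vllm
python -m vllm.entrypoints.openai.api_server \
  --model mistralai/Mistral-7B-Instruct-v0.2 \
  --served-model-name mistral \
  --port 4000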

Configure your GitLab instance

Prerequisites:

  • Upgrade to the latest version of GitLab.
  • The GitLab instance must be able to access the AI Gateway.

To configure your GitLab instance:

  1. Where your GitLab instance is installed, open the /etc/gitlab/gitlab.rb file:

     sudo vim /etc/gitlab/gitlab.rb

  2. Add and save the following environment variables:

     gitlab_rails['env'] = {
       'GITLAB_LICENSE_MODE' => 'production',
       'CUSTOMER_PORTAL_URL' => 'https://customers.gitlab.com',
       'AI_GATEWAY_URL' => '<path_to_your_ai_gateway>:<port>'
     }

  3. Run reconfigure:

     sudo gitlab-ctl reconfigure
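
To verify that the instance can reach the AI Gateway, you can query the gateway's health endpoint. This is a sketch: the /monitoring/healthz path and port 5052 are assumptions based on the AI Gateway defaults, so substitute your own values:

# Prints the HTTP status code; 200 indicates the gateway is reachable.
curl --silent --output /dev/null --write-out "%{http_code}\n" \
  "http://<path_to_your_ai_gateway>:5052/monitoring/healthz"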

GitLab AI Gateway

Install the GitLab AI Gateway.
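
For reference, a Docker-based installation might look like the following sketch. The image path, tag, and published port are assumptions; consult the AI Gateway installation documentation for the current values:

# Run the AI Gateway container, pointing it at your GitLab instance.
docker run -p 5052:5052 \
 -e AIGW_GITLAB_URL=https://<your_gitlab_domain> \
 -e AIGW_GITLAB_API_URL=https://<your_gitlab_domain>/api/v4/ \
 registry.gitlab.com/gitlab-org/modelops/applied-ml/code-suggestions/ai-assist/model-gateway:<tag>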

Enable logging

Prerequisites:

  • You must be an administrator for your self-managed instance.

To enable logging and access the logs, enable the feature flag:

Feature.enable(:expanded_ai_logging)

Disabling the feature flag stops logs from being written.
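
As a sketch, assuming a Linux package installation, you can toggle the flag from the Rails console:

# Open a Rails console on the GitLab host
sudo gitlab-rails console

# Inside the console, enable or disable expanded AI logging:
Feature.enable(:expanded_ai_logging)
Feature.disable(:expanded_ai_logging)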

Logs in your GitLab installation

In your instance log directory, a file called llm.log is populated.
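
For example, to follow the file on a Linux package installation (the path is an assumption; adjust it for your installation method):

sudo tail -f /var/log/gitlab/gitlab-rails/llm.log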

For more information about log files and the log directory location, see the GitLab logging documentation.

Logs in your AI Gateway container

To specify the location of logs generated by AI Gateway, run:

docker run -e AIGW_GITLAB_URL=<your_gitlab_instance> \
 -e AIGW_GITLAB_API_URL=https://<your_gitlab_domain>/api/v4/ \
 -e AIGW_LOGGING__TO_FILE="aigateway.log" \
 -v <your_file_path>:"aigateway.log" \
 <image>

If you do not specify a file name, logs are streamed to standard output.

The output of the AI Gateway process can also be useful for debugging issues. To access it:

  • When using Docker:

    docker logs <container-id>
  • When using Kubernetes:

    kubectl logs <pod-name>

To ingest these logs into your logging solution, see your logging provider's documentation.

Logs in your inference service provider

GitLab does not manage logs generated by your inference service provider. Refer to your inference service provider's documentation for how to access and use their logs.

Cross-referencing logs between AI Gateway and GitLab

The property correlation_id is assigned to every request and is carried across different components that respond to a request. For more information, see the documentation on finding logs with a correlation ID.
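
For example, to follow a single request across components, search each log for the same correlation ID. The log path and the ID below are hypothetical:

# In the GitLab Rails logs (Linux package installation path assumed):
sudo grep '01ARZ3NDEKTSV4RRFFQ69G5FAV' /var/log/gitlab/gitlab-rails/llm.log

# In the AI Gateway container logs:
docker logs <container-id> | grep '01ARZ3NDEKTSV4RRFFQ69G5FAV'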

The correlation ID is not available in your model provider's logs.

Troubleshooting

First, run the debugging scripts to verify your self-hosted model setup.

For more information on other actions to take, see the troubleshooting documentation.