GitLab Duo Glossary

This is a list of terms that may have a general meaning but also may have a specific meaning at GitLab. If you encounter a piece of technical jargon related to AI that you think could benefit from being in this list, add it!

General terminology

Adapters

A variation on Fine Tuning. Instead of opening the model and adjusting the layer weights, new trained layers are added onto the model or hosted in an upstream standalone model. Also known as Adapter-based Models. By selectively fine-tuning these specific modules rather than the entire model, Adapters facilitate the customisation of pre-trained models for distinct tasks, requiring only a minimal increase in parameters. This method enables precise, task-specific adjustments of the model without altering its foundational structure.

AI catalog

The Workflow Catalog Group is focused on developing Workflow Catalog, a catalog of Agents, tools, and flows that can be created, curated, and shared across organizations, groups, and projects.

AI gateway

Standalone service used to give access to AI features to non-SaaS GitLab users. This logic will be moved to Cloud Connector when that service is ready. Eventually, the AI gateway will be used to host endpoints that proxy requests to AI providers, removing the need for the GitLab Rails monolith to integrate and communicate directly with third-party Large Language Models (LLMs). Design document.

AI gateway prompt

An encapsulation of prompt templates, model selection, and model parameters. As part of the AI gateway as the Sole Access Point for Monolith to Access Models effort we're migrating these components from the GitLab Rails monolith into the prompts package in the AI gateway.

AI gateway prompt registry

A component responsible for maintaining a list of AI gateway Prompts available to perform specific actions. Currently, we use a LocalPromptRegistry that reads definitions from YAML files in the AI gateway.
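As a rough illustration of the registry idea, the sketch below maps a prompt name to a definition that bundles a template, a model, and model parameters. It is not the AI gateway's implementation: `ToyPromptRegistry`, `PromptDefinition`, the field names, and the model identifier are all hypothetical, and a plain dictionary stands in for the parsed YAML files.

```python
from dataclasses import dataclass, field
from string import Template


@dataclass(frozen=True)
class PromptDefinition:
    """Hypothetical bundle of template, model choice, and model parameters."""
    template: str
    model: str
    params: dict = field(default_factory=dict)


class ToyPromptRegistry:
    """Hypothetical registry: looks up prompt definitions by name."""

    def __init__(self, definitions):
        # In this sketch, definitions is a dict that stands in for parsed YAML.
        self._prompts = {
            name: PromptDefinition(**raw) for name, raw in definitions.items()
        }

    def get(self, name):
        try:
            return self._prompts[name]
        except KeyError:
            raise KeyError(f"no prompt registered under {name!r}") from None

    def render(self, name, **variables):
        """Fill the template's $placeholders with the given variables."""
        return Template(self.get(name).template).substitute(**variables)


registry = ToyPromptRegistry({
    "explain_code": {
        "template": "Explain what this code does:\n$code",
        "model": "example-model",  # placeholder, not a real model name
        "params": {"temperature": 0.1},
    },
})
```

Calling `registry.render("explain_code", code="x = 1")` returns the filled-in template, while `registry.get("explain_code")` exposes the model and parameters that would accompany it.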

air-gapped model

A hosted model that is internal to an organisation's intranet only. In the context of GitLab AI features, this could be connected to an air-gapped GitLab instance.

Bring Your Own Model (BYOM)

A third-party model to be connected to one or more GitLab Duo features. Could be an off-the-shelf Open Source (OS) model, a fine-tuned model, or a closed source model. GitLab is planning to support specific, validated BYOMs for GitLab Duo features, but does not plan to support general BYOM use for GitLab Duo features.

Chat evaluation

Automated mechanism for determining the helpfulness and accuracy of GitLab Duo Chat's responses to various user questions. The MVC is an RSpec test run via GitLab CI that asks Chat a set of questions and then has two different third-party LLMs determine if the generated answer is accurate or not. MVC. Design doc for next iteration.
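The shape of such an evaluation can be sketched in a few lines of Python. This is only an illustration, not the RSpec test itself: `ask_chat` and the judges are stub callables standing in for Chat and the third-party LLMs, and requiring every judge to agree is an assumption, because the way the two verdicts are combined is not stated here.

```python
def evaluate_answers(questions, ask_chat, judges):
    """Ask each question, then let every judge vote on the answer.

    ask_chat: callable(question) -> answer (stands in for GitLab Duo Chat)
    judges: list of callable(question, answer) -> bool (stand in for LLM judges)
    """
    results = []
    for question in questions:
        answer = ask_chat(question)
        votes = [judge(question, answer) for judge in judges]
        results.append({
            "question": question,
            "answer": answer,
            "votes": votes,
            # Assumption: the answer counts as accurate only if all judges agree.
            "accurate": all(votes),
        })
    return results


# Stub implementations so the sketch runs without any network access.
def fake_chat(question):
    return "Use a .gitlab-ci.yml file." if "pipeline" in question else "I don't know."


def judge_a(question, answer):
    return "don't know" not in answer


def judge_b(question, answer):
    return len(answer.split()) >= 3


report = evaluate_answers(
    ["How do I define a pipeline?", "What is the meaning of life?"],
    fake_chat,
    [judge_a, judge_b],
)
```

Each entry in `report` keeps the individual votes, so disagreements between judges remain visible for review.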

Cloud Connector

Cloud Connector is a way to access services common to multiple GitLab deployments, instances, and cells. We use it as an umbrella term to refer to the set of technical solutions and APIs used to make such services available to all GitLab customers. For more information, see the Cloud Connector architecture.

closed source model

A private model fine-tuned or built from scratch by an organisation. These may be hosted as cloud services, for example ChatGPT.

consensus filtering

Consensus filtering is a method of LLM evaluation. An LLM judge is asked to rate and compare the output of multiple LLMs to sets of prompts. This is the method of evaluation being used for the Chat Evaluation MVC. Issue from Model Validation team.

context

Relevant information that surrounds a data point, an event, or a piece of information, which helps to clarify its meaning and implications. For GitLab Duo Chat, context is the attributes of the Issue or Epic being referenced in a user question.

custom model

Any implementation of a GitLab Duo feature using a self-hosted model, BYOM, fine-tuned model, RAG-enhanced model, or adapter-based model.

embeddings

In the context of machine learning and large language models, embeddings refer to a technique used to represent words, phrases, or even entire documents as dense numerical vectors in a continuous vector space. At GitLab, we use Vertex AI's Embeddings API to create a vector representation of GitLab documentation. These embeddings are stored in the vertex_gitlab_docs database table in the embeddings database. The embeddings search is done in Postgres using the vector extension. The vertex embeddings database is updated based on the latest version of GitLab documentation on a daily basis by running Llm::Embedding::GitlabDocumentation::CreateEmbeddingsRecordsWorker as a cronjob.
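To make the idea of searching over embeddings concrete, here is a minimal sketch that ranks stored vectors by cosine similarity to a query vector. The three-dimensional vectors and document keys are invented for illustration; real embeddings have far more dimensions, and in GitLab the search runs in Postgres rather than in application code like this.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest(query_vector, store, top_k=1):
    """Return the top_k document keys ranked by similarity to the query."""
    ranked = sorted(
        store,
        key=lambda key: cosine_similarity(query_vector, store[key]),
        reverse=True,
    )
    return ranked[:top_k]


# Hypothetical toy embeddings keyed by a made-up documentation topic.
DOCS = {
    "ci_pipelines": [0.9, 0.1, 0.0],
    "merge_requests": [0.1, 0.9, 0.1],
    "runners": [0.8, 0.2, 0.1],
}

closest = nearest([1.0, 0.0, 0.0], DOCS, top_k=2)
```

With these toy values, the query vector points mostly along the first axis, so the two documents whose vectors share that direction rank first.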

fine-tuning

Altering an existing model using a supervised learning process that utilizes a dataset of labeled examples to update the weights of the LLM, improving its output for specific tasks such as code completion or Chat.

foundational model

A general purpose LLM trained using a generic objective, typically next token prediction. These models are capable and flexible, and can be adjusted to solve many domain-specific tasks (through fine-tuning or prompt engineering). This means that these general purpose models are ideal to serve as the foundation of many downstream models. Examples of foundational models are: GPT-4o, Claude 3.7 Sonnet.

frozen model

An LLM which cannot be fine-tuned (also Frozen LLM).

GitLab Duo

AI-assisted features across the GitLab DevSecOps platform. These features aim to help increase velocity and solve key pain points across the software development lifecycle. See also the GitLab Duo features page.

GitLab-managed model

An LLM that is managed by GitLab. Currently all GitLab-managed models are hosted externally and accessed through the AI gateway. GitLab-owned API keys are used to access the models.

golden questions

A small subset of the types of questions we think a user should be able to ask GitLab Duo Chat. Used to generate data for Chat evaluation. Questions for Chat Beta.

ground truth

Data that is determined to be the true output for a given input, representing the reality that the AI model aims to learn and predict. Ground truth data are often human-annotated, but may also be produced from a trusted source such as an LLM that has known good output for a given use case.

local model

An LLM running on a user's workstation. More information.

LLM

A Large Language Model, or LLM, is a very large-scale neural network trained to understand and generate human-like text. For GitLab Duo features, GitLab is currently working with frozen models hosted at Google and Anthropic.

offline model

A model that runs without internet or intranet connection (for example, you are running a model on your laptop on a plane).

open-source model

Models that are published with their source code and weights and are available for modifications and re-distribution. Examples: Llama / Llama 2, BLOOM, Falcon, Mistral, Gemma.

Centralized Evaluation Framework

The "Centralized Evaluation Framework" is a Python library that provides a CLI for evaluating GitLab AI features. It enables data-driven improvements to LLM applications by facilitating hypothesis testing. Code.

prompt registry

Stored, versioned prompts used to interact with third-party AI Services. Design document proposal MR (closed).

prompt

Natural language instructions sent to an LLM to perform certain tasks. Prompt guidelines.

RAG (Retrieval Augmented Generation)

RAG provides contextual data to an LLM as part of a query to personalise results. RAG is used to inject additional context into a prompt to decrease hallucinations and improve the quality of outputs.

RAG pipeline

A mechanism used to take an input (such as a user question) into a system, retrieve any relevant data for that input, augment the input with additional context, and then synthesize the information to generate a coherent, contextually relevant answer. This design pattern is helpful in open-domain question answering with LLMs, which is why we use this design pattern for answering questions in GitLab Duo Chat.
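The retrieve, augment, and generate stages can be sketched as three small functions. This is a toy under stated assumptions: retrieval here is plain word overlap rather than an embeddings search, the documents are invented, and `generate` is a caller-supplied stand-in for the LLM call.

```python
def retrieve(question, documents, top_k=2):
    """Rank documents by how many question words they share (toy retriever)."""
    question_words = set(question.lower().split())
    scored = [
        (len(question_words & set(text.lower().split())), doc_id)
        for doc_id, text in documents.items()
    ]
    # Highest overlap first; ties are broken alphabetically by document id.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for score, doc_id in scored[:top_k] if score > 0]


def augment(question, documents, doc_ids):
    """Build a prompt that places retrieved text ahead of the question."""
    context = "\n".join(f"- {documents[doc_id]}" for doc_id in doc_ids)
    return f"Context:\n{context}\n\nQuestion: {question}"


def answer(question, documents, generate):
    """Retrieve, augment, then hand the prompt to the generate callable."""
    doc_ids = retrieve(question, documents)
    return generate(augment(question, documents, doc_ids))


DOCS = {
    "runners": "runners execute ci jobs",
    "issues": "issues track planned work",
}

# The lambda stands in for an LLM and simply echoes the prompt it receives.
prompt = answer("what do runners execute", DOCS, generate=lambda p: p)
```

Echoing the prompt makes it easy to inspect what the model would actually receive: only the matching document is included as context, followed by the original question.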

self-hosted model

An LLM hosted externally to GitLab by an organisation and interacting with GitLab AI features. See also the style guide reference.

similarity score

A mathematical method to determine the likeness between answers produced by an LLM and the reference ground truth answers. See also the Model Validation direction page.
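One simple example of such a score is Jaccard similarity over word sets, sketched below. This is only one possible measure, chosen for brevity; it is an assumption for illustration and not necessarily the metric used by the Model Validation team.

```python
def jaccard_similarity(answer, reference):
    """Share of distinct lowercase words common to both texts (0.0 to 1.0)."""
    answer_words = set(answer.lower().split())
    reference_words = set(reference.lower().split())
    if not answer_words and not reference_words:
        return 1.0  # two empty texts are treated as identical
    return len(answer_words & reference_words) / len(answer_words | reference_words)


# 4 shared words out of 5 distinct words across both texts.
score = jaccard_similarity(
    "merge requests propose changes",
    "merge requests propose code changes",
)
```

A score of 1.0 means both texts use exactly the same set of words, and 0.0 means they share none.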

tool

Logic that performs a specific LLM-related task; each tool has a description and its own prompt. How to add a new tool.

unit primitive

GitLab-specific term that refers to the fundamental logical feature that a permission or access scope can control. Examples: duo_chat and code_suggestions. These features are both currently part of the GitLab Duo Pro license but we are building the concept of a Unit Primitive around each Duo feature so that Duo features are easily composable into different groupings to accommodate potential future product packaging needs.

word-level metrics

Method for LLM evaluation that compares aspects of text at the granularity of individual words. Issue from Model Validation team.
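As an illustration of what a word-level comparison can look like, the sketch below computes word-level precision, recall, and F1 between a candidate answer and a reference. The specific metrics used at GitLab are not listed here, so treat this choice as an assumption.

```python
from collections import Counter


def word_overlap_scores(candidate, reference):
    """Word-level precision, recall, and F1 between two texts."""
    candidate_counts = Counter(candidate.lower().split())
    reference_counts = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count of each shared word.
    overlap = sum((candidate_counts & reference_counts).values())
    if overlap == 0:
        return {"precision": 0.0, "recall": 0.0, "f1": 0.0}
    precision = overlap / sum(candidate_counts.values())
    recall = overlap / sum(reference_counts.values())
    f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


scores = word_overlap_scores(
    "the pipeline failed on the test stage",
    "the pipeline failed during the test stage today",
)
```

Precision is the share of candidate words that also appear in the reference, recall is the share of reference words recovered by the candidate, and F1 is their harmonic mean.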

zero-shot agent

In the general world of AI, a learning model or system that can perform tasks without having seen any examples of that task during training. At GitLab, we use this term to refer specifically to a piece of our code that serves as a sort of LLM-powered air traffic controller for GitLab Duo Chat. The GitLab zero-shot agent has a system prompt that explains how an LLM should interpret user input from GitLab Duo Chat as well as a list of tool descriptions. Using this information, the agent determines which tool to use to answer a user's question. The agent may decide that no tools are required and answer the question directly. If a tool is used, the answer from the tool is fed back to the zero-shot agent to evaluate if the answer is sufficient or if an additional tool must be used to answer the question.

Code.

Zero-shot agent in action.
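The tool-selection loop described for the GitLab zero-shot agent can be sketched as follows. This is not GitLab's code: `decide` is a rule-based stand-in for the LLM call that reads the system prompt and tool descriptions, and the tool name, its output, and the step limit are hypothetical.

```python
def run_agent(question, tools, decide, max_steps=3):
    """Loop: ask the decision function for the next action until it answers.

    tools: dict mapping tool name -> callable(question) -> str
    decide: callable(question, tool_names, observations) -> (action, value),
            where action is "answer" or "use_tool". It stands in for the LLM.
    """
    observations = []
    for _ in range(max_steps):
        action, value = decide(question, list(tools), observations)
        if action == "answer":
            return value
        # Run the chosen tool and feed its result back into the next decision.
        observations.append((value, tools[value](question)))
    return "Unable to answer within the step limit."


# Hypothetical tool and a rule-based stand-in for the LLM's decision.
TOOLS = {"issue_reader": lambda question: "Issue 1 is about flaky tests."}


def fake_decide(question, tool_names, observations):
    if observations:
        # A tool has already run; treat its output as sufficient.
        return "answer", observations[-1][1]
    if "issue" in question.lower() and "issue_reader" in tool_names:
        return "use_tool", "issue_reader"
    return "answer", "No tool needed."
```

Calling `run_agent("Summarize this issue", TOOLS, fake_decide)` routes through the tool first, whereas a question that matches no tool is answered directly on the first step.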

GitLab Duo Agent Platform terminology

Core layer concepts (GitLab-specific)

Flow

A goal-oriented, structured graph that orchestrates agents and tools to deliver a single, economically-valuable outcome (e.g., create a code-review MR, triage issues).

Structure - Explicit phases: planning → execution → completion
Nodes - Each node is either an Agent (decision-maker) or a deterministic step (CRUD, Boolean decision)
Trigger & Terminator - Every flow has one or many defined start trigger(s) and a defined end state
Input - Each Flow must have an input. Inputs set the context for the Flow session and differentiate the outcomes of different flows. Inputs can be free text or entities (from GitLab or a third party)
Session - One execution of a flow; sessions carry user-specific goals and data

Analogy: competency / job description - the "what & when" of getting work done.

Agent

A specialized, LLM-powered decision-maker that owns a single node inside a flow. Agents can be defined independently and reused across multiple flows.

Prompt (System) - Sets the overall behavior, guardrails and persona for the agents
Prompt (Goal) - Receives the session-specific objective from the flow
Tools - May call only the tools granted by the flow node definition and the user/company definition of available tools
Agents / Flows - Agents can invoke other agents or Flows to achieve their goal if these were made available
Reasoning - Uses an LLM to decompose its goal into dynamic subtasks
Context awareness - Gains project / repo / issue data through tool calls

GitLab agents are specialists, not generalists, to maximize reliability and UX.

Tool

A discrete, deterministic capability an agent (or flow step) invokes to perform read/write actions. Tools can be used to perform these in GitLab or in 3rd party applications via MCP or other protocols.

Examples: read GitLab issues, clone a repository, commit & push changes, call a REST API. Tools expose data or side-effects; they themselves perform no reasoning.

Flow types

Current implementation

Sequence - The Flow executes agents that hand over their output to the next agent in a preset order
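A sequence flow can be pictured as a simple loop in which each agent's output becomes the next agent's input. The sketch below is a plain-Python illustration, not the LangGraph-based implementation; the agent names and their behavior are hypothetical, and real agents would call an LLM and tools.

```python
def run_sequence(agents, flow_input):
    """Run agents in a fixed order, passing each output to the next agent."""
    session_log = []
    payload = flow_input
    for name, agent in agents:
        payload = agent(payload)
        session_log.append((name, payload))
    return payload, session_log


# Hypothetical agents, reduced to ordinary functions for this sketch.
def planner(goal):
    return [f"step for: {goal}"]


def executor(plan):
    return [f"done {step}" for step in plan]


def summarizer(results):
    return f"{len(results)} step(s) completed"


result, log = run_sequence(
    [("planner", planner), ("executor", executor), ("summarizer", summarizer)],
    "triage issues",
)
```

The session log records what each agent handed over, which mirrors how a session carries data from one node to the next.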

Future implementations

Single Agent - A single agent executes the entire flow to completion, suitable for small, well-defined tasks with latency considerations
Multi Agent - A pool of agents works on a task so that each agent gets a chance to solve it, and/or a supervisor chooses the final solution. Can support different graph topologies

Supporting Terminology

Term	Definition
Node (Flow node)	A single step in the flow graph. GitLab currently supports Agent, Tool Executor, Agent Handover, Supervisor, and Terminator nodes.
Session	One instantiation of a flow with concrete user input and data context.
Task	A formal object representing a unit of work inside a run. At present only the Executor agent persists tasks, but the concept is extensible.
Trigger	An event that starts a flow run (e.g., slash command, schedule, issue label).
Agent handover	Node type that packages context from one agent and passes it to another.
Supervisor agent	An agent node that monitors other agents' progress and enforces run-level constraints (timeout, max tokens, etc.).
Subagent	Shorthand for an agent that operates under a Supervisor within the same run.
Autonomous agent	Historical term for an agent that can loop without human approval. In GitLab, autonomy level is governed by flow design, not by a separate agent type.
Framework	A platform for building multi-agent systems. GitLab Duo Agent Platform uses LangGraph, an extension to LangChain that natively models agent graphs.

Execution

Flows are executed in the following ways:

Local - The Flow is executed in relation to a project or a folder (future)
Remote - The Flow is executed by CI/CD runners in relation to a project, group (future), namespace (future)

Quick Reference Matrix

Layer	Human Analogy	Key Question Answered
Tool	Capability	"What concrete action can I perform?"
Agent	Skill / Specialist	"How do I use my tools to reach my goal?"
Flow	Competency / Job	"When and in what order should skills be applied to deliver value?"

AI Context Terminology

Advanced Context Resolver

Advanced context is a comprehensive set of code-related information extending beyond a single file, including open file tabs, imports, dependencies, cross-file symbols and definitions, and project-wide relevant code snippets.

Advanced context resolver is a system designed to gather the above advanced context. By providing advanced context, the resolver provides the LLM with a more holistic understanding of the project structure, enabling more accurate and context-aware code suggestions and generation.

AI Context Abstraction Layer

A Ruby gem that provides a unified interface for Retrieval Augmented Generation (RAG) across multiple vector databases within GitLab. The system abstracts away the differences between Elasticsearch, OpenSearch, and PostgreSQL with pgvector, enabling AI features to work regardless of the underlying storage solution.

Key components include collections that define data schemas and reference classes that handle serialization, migrations for schema management, and preprocessors for chunking and embedding generation. The layer supports automatic model migration between different LLMs without downtime, asynchronous processing through Redis-backed queues, and permission-aware search with automatic redaction.

This architecture prevents vendor lock-in and enables GitLab customers without Elasticsearch to access RAG-powered features through pgvector.

AI Context Policies

A user-defined and user-managed mechanism allowing precise control over the content that can be sent to LLMs as contextual information. GitLab has an architecture document that proposes a format for AI Context Policies.

Codebase as Chat Context

This refers to a repository that the user explicitly provides using the /include command. The user may narrow the scope by choosing a directory within a repository. This feature allows the user to ask questions about an entire repository, or a subset of that repository by selecting specific directories.

This is automatically enhanced by performing a semantic search of the user's question over the Code Embeddings of the included repository, with the search results then added to the context sent to the LLM. This gives the LLM information about the included repository or directory that is specifically targeted to the user's question, allowing the LLM to generate a more helpful response.

This architecture document proposes Codebase as Chat Context enhanced by semantic search over Code Embeddings.

In the future, the repository or directory context may also be enhanced by a Knowledge Graph search.

Code Embeddings

The Code Embeddings initiative aims to build vector embeddings representation of files in a repository. The file contents are chunked into logical segments, then embeddings are generated for the chunked content and stored in a vector store.
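The chunking step can be illustrated with a deliberately simple strategy: split a file at blank lines and pack neighboring blocks into chunks up to a size limit. The blank-line heuristic and the `max_chars` limit are assumptions made for this sketch; the chunking logic actually used for Code Embeddings is not described here.

```python
def chunk_source(text, max_chars=200):
    """Split text at blank lines, then pack blocks into chunks up to max_chars.

    A single block longer than max_chars is kept whole rather than split.
    """
    blocks = [block.strip("\n") for block in text.split("\n\n") if block.strip()]
    chunks, current = [], ""
    for block in blocks:
        candidate = f"{current}\n\n{block}" if current else block
        if current and len(candidate) > max_chars:
            # Adding this block would exceed the limit, so start a new chunk.
            chunks.append(current)
            current = block
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


SOURCE = "def a():\n    return 1\n\ndef b():\n    return 2\n\ndef c():\n    return 3\n"
chunks = chunk_source(SOURCE, max_chars=45)
```

Each resulting chunk would then be passed to an embedding model and stored alongside a reference to its file and position.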

With Code Embeddings, we can perform a semantic search over a given repository, with the search results then used as additional context for an LLM. (See Codebase as Chat Context for how Code Embeddings will be used in Duo Chat.)

GitLab Zoekt

A scalable exact code search service and file-based database system, with a flexible architecture supporting various AI context use cases beyond traditional search. It's built on top of the open-source code search engine Zoekt.

The system consists of a unified gitlab-zoekt binary that can operate in both indexer and webserver modes, managing index files on persistent storage for fast searches. Key features include bi-directional communication with GitLab and self-registering node architecture for easy scaling.

The system is designed to handle enterprise-scale deployments, with GitLab.com successfully operating over 48 TiB of indexed data.

Most likely, this distributed database system will be used to power Knowledge Graph. Also, we might leverage Exact Code Search to provide additional context and/or tools for GitLab Duo.

Knowledge Graph

The Knowledge Graph project aims to create a structured, queryable graph database from code repositories to power AI features and enhance developer productivity within GitLab.

Think of it like creating a detailed blueprint that shows which functions call other functions, how classes relate to each other, and where variables are used throughout the codebase. Instead of GitLab Duo having to read through thousands of files every time you ask it something, it can quickly navigate this pre-built map to give you better code suggestions, find related code snippets, or help debug issues. It gives Duo a much smarter way to understand your codebase so it can assist you more effectively with things like code reviews, refactoring, or finding where to make changes when you're working on a feature.
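To give a feel for the "which functions call other functions" part of that blueprint, the sketch below builds a tiny call graph for Python source using the standard `ast` module. It is only an illustration of the concept: the Knowledge Graph project's own parser, data model, and storage are not shown here, and the sample source is invented.

```python
import ast


def build_call_graph(source):
    """Map each top-level function name to the set of names it calls."""
    tree = ast.parse(source)
    graph = {}
    for node in tree.body:
        if isinstance(node, ast.FunctionDef):
            calls = set()
            for child in ast.walk(node):
                # Only direct calls by simple name, e.g. helper(); not obj.method().
                if isinstance(child, ast.Call) and isinstance(child.func, ast.Name):
                    calls.add(child.func.id)
            graph[node.name] = calls
    return graph


def callers_of(graph, target):
    """Answer 'where is this function used?' from the prebuilt graph."""
    return sorted(name for name, calls in graph.items() if target in calls)


SAMPLE = '''
def load(path):
    return open(path).read()

def parse(path):
    return load(path).split()

def main():
    print(parse("data.txt"))
'''

graph = build_call_graph(SAMPLE)
```

Once the graph exists, a question such as `callers_of(graph, "load")` is answered with a lookup instead of rereading every file.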

One Parser (GitLab Code Parser)

The GitLab Code Parser establishes a single, efficient, and reliable static code analysis library. This library will serve as the foundation for diverse code intelligence features across GitLab, from server-side indexing (Knowledge Graph, Embeddings) to client-side analysis (Language Server, Web IDE). Initially scoped to AI and Editor Features.

Supplementary User Context

Information, such as open tabs in their IDE, files, and folders, that the user provides from their local environment to extend the default AI Context. This is sometimes called "pinned context" internally. GitLab Duo Chat users can provide supplementary user context with the /include command (IDE only).