Repo Generation

Koji generates repositories based on tag content. For the most part, this means yum repos from the rpm content, but Koji can also generate maven repos if configured to do so.

The primary purpose of these repos is to facilitate Koji’s own build process. Most builds utilize a buildroot generated by the mock tool, which needs a yum repository to pull packages from.

Repository generation can be triggered in different ways and with different parameters, but every repository represents the contents of a tag at a specific point in time (i.e. an event).

On demand generation

When Koji needs a repo for a tag, it files a request via a hub call. Typically this is done in a build process, but requests can also be triggered automatically without a build if configured. They can also be triggered manually.

repo.request(tag, min_event=None, at_event=None, opts=None, priority=None, force=False)
description: Request a repo for a tag

    :param int|str tag: tag id or name
    :param int|str min_event: minimum event for the repo (optional)
    :param int at_event: specific event for the repo (optional)
    :param dict opts: custom repo options (optional)
    :param bool force: force request creation, even if a matching repo exists

    The special value min_event="last" uses the most recent event for the tag
    Otherwise min_event should be an integer

    use opts=None (the default) to get default options for the tag.
    If opts is given, it should be a dictionary of repo options. These will override
    the defaults.

Each repo request is for a single tag. The optional min_event parameter specifies how recent the repo needs to be. If not given, Koji chooses a suitably recent event. The optional opts specifies options for creating the repo. If not given, Koji uses the default options based on the tag.

When the hub responds to this call, it first checks to see if an existing repo satisfies the request. If so, then information for that repo is returned and no further action is taken. If there is no such repo yet, then Koji records the request and returns the request data. If an identical active request already exists, then Koji will return that.
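The decision order described above can be sketched with a small in-memory model. This is only an illustration, not the hub's code: the ToyHub class, its dictionary shapes, and the simple "create_event >= min_event" freshness test are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class ToyHub:
    """In-memory stand-in for the hub's request handling (illustrative only)."""
    repos: list = field(default_factory=list)     # dicts: tag, create_event, opts
    requests: list = field(default_factory=list)  # dicts: id, tag, min_event, opts
    _ids: count = field(default_factory=lambda: count(1))

    def request(self, tag, min_event, opts=None):
        opts = opts or {}
        # 1. An existing repo that is recent enough (simplified test) wins.
        for repo in self.repos:
            if (repo["tag"] == tag and repo["opts"] == opts
                    and repo["create_event"] >= min_event):
                return {"repo": repo, "request": None}
        # 2. Otherwise reuse an identical active request, if there is one.
        for req in self.requests:
            if (req["tag"], req["min_event"], req["opts"]) == (tag, min_event, opts):
                return {"repo": None, "request": req}
        # 3. Otherwise record a new request and return it.
        req = {"id": next(self._ids), "tag": tag, "min_event": min_event, "opts": opts}
        self.requests.append(req)
        return {"repo": None, "request": req}
```

Calling request() twice with the same arguments returns the same request entry, and once a sufficiently recent repo is present, later calls return that repo without queuing anything new.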

Build parameters

For some types of builds, the user can affect the parameters of the repo request.

For rpm builds, the --wait-repo option will cause the build to request a current repo. That is, the min_event for the request will be the most recent event that affected the tag. For example, if a previous build has just been tagged into the buildroot, then this option will ensure that the new build gets a repo containing the previous one.

It’s worth noting that rpm builds also accept --wait-build option(s) that will cause the build to wait for specific NVRs to be present in the repo. This option is not actually handled by the request mechanism. Instead, the build will wait for these NVRs to be tagged and then request a current repo.
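As a rough illustration, a request for a current repo corresponds to passing min_event="last" to repo.request. The helper name request_current_repo is hypothetical, and session is assumed to be a koji.ClientSession (or any object exposing the same repo.request call).

```python
def request_current_repo(session, tag, opts=None):
    """Ask the hub for a repo at least as new as the tag's most recent event.

    `session` is assumed to expose repo.request with the signature shown
    earlier in this document; this helper itself is hypothetical.
    """
    return session.repo.request(tag, min_event="last", opts=opts)
```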

Repository Options

There are a few options that govern how the repo is generated. At present these are:

src

whether to include srpms in the repos

debuginfo

whether to include debuginfo rpms

separate_src

whether to create a separate src repo

maven

whether to also create a maven repo

These options are normally determined by the tag that the repo is based on. Administrators can set repo.opts for a given tag to control these options.

Additionally, the following pattern-based hub options can be used:

SourceTags

Tags matching these glob patterns will have the src option set

DebuginfoTags

Tags matching these glob patterns will have the debuginfo option set

SeparateSourceTags

Tags matching these glob patterns will have the separate_src option set

For historical reasons, the maven option can also be controlled by setting the maven_support field for the tag. E.g. koji edit-tag --maven-support MYTAG

Note that the maven option is ignored if Maven support is disabled on the hub.

Manually requested repos can specify their own custom options.
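To make the interaction of these sources concrete, here is a minimal sketch of how the effective options for a tag might be resolved. It is not Koji's implementation. The helper name, the assumption that pattern lists are whitespace-separated, and the precedence of repo.opts over the hub patterns are all assumptions; the only ordering taken from the text is that custom request options override the defaults.

```python
from fnmatch import fnmatchcase

BASE_OPTS = {"src": False, "debuginfo": False, "separate_src": False, "maven": False}


def resolve_repo_opts(tag_name, hub_opts, tag_extra=None, custom=None):
    """Return the effective repo options for a tag (hypothetical helper)."""
    opts = dict(BASE_OPTS)
    # Hub-wide glob patterns; assumed here to be whitespace-separated.
    for hub_key, opt in (("SourceTags", "src"),
                         ("DebuginfoTags", "debuginfo"),
                         ("SeparateSourceTags", "separate_src")):
        patterns = hub_opts.get(hub_key, "").split()
        if any(fnmatchcase(tag_name, pat) for pat in patterns):
            opts[opt] = True
    # Per-tag repo.opts; assumed to take precedence over the patterns.
    opts.update((tag_extra or {}).get("repo.opts", {}))
    # Options supplied with a request override the tag defaults.
    opts.update(custom or {})
    return opts
```

For example, with hub_opts = {"DebuginfoTags": "f4*-build"}, the tag f40-build gets debuginfo=True unless its repo.opts or a custom request option turns it off again.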

Automatic generation

Automatic generation can be configured by setting repo.auto=True for a given tag. This requires administrative access. The system regularly requests repos for such tags.

From Requests to Repos

All repo requests go into a queue that Koji regularly checks. As long as there is sufficient capacity, Koji will create newRepo tasks for these requests.

The status of a request can be checked with the repo.checkRequest API call:

repo.checkRequest(req_id)
description: Report status of repo request

    :param int req_id: the request id
    :return: status dictionary

    The return dictionary will include 'request' and 'repo' fields

If the return includes a non-None repo field, then that repo satisfies the request. The request field will include task_id and task_state (may be None) to indicate progress.
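A client that needs to wait for a request could poll this call. The following is a minimal sketch under stated assumptions: wait_for_repo and its poll_interval, timeout, and sleep parameters are hypothetical, and session is assumed to be a koji.ClientSession or an object with the same repo.checkRequest call. Only the 'repo' field of the return value, as described above, is relied on.

```python
import time


def wait_for_repo(session, req_id, poll_interval=30, timeout=3600, sleep=time.sleep):
    """Poll repo.checkRequest until a repo satisfies the request.

    Returns the repo data from the status dictionary, or raises
    TimeoutError if no repo appears within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = session.repo.checkRequest(req_id)
        # A non-None repo field means the request has been satisfied.
        if status["repo"] is not None:
            return status["repo"]
        if time.monotonic() >= deadline:
            raise TimeoutError(f"repo request {req_id} not satisfied after {timeout}s")
        sleep(poll_interval)
```

The sleep argument exists only so the loop can be exercised without real delays; in normal use the default is enough.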

Repository Data

The hub stores key data about each repo in the database, and this data can be reported in numerous ways. One common way is the repoInfo call, which returns data about a single repository. E.g.

$ koji call repoInfo 2398
{'begin_event': 497152,
 'begin_ts': 1707888890.306149,
 'create_event': 497378,
 'create_ts': 1710216388.543129,
 'creation_time': '2024-03-12 00:06:28.541893-04:00',
 'creation_ts': 1710216388.541893,
 'custom_opts': None,
 'dist': False,
 'end_event': None,
 'end_ts': None,
 'id': 2398,
 'opts': {'debuginfo': False, 'separate_src': False, 'src': False},
 'state': 3,
 'state_time': '2024-03-17 17:03:49.820435-04:00',
 'state_ts': 1710709429.820435,
 'tag_id': 2,
 'tag_name': 'f24-build',
 'task_id': 13611,
 'task_state': 2}

Key fields

id

The integer id of the repo itself

tag_id

The integer id of the tag the repo was created from

tag_name

The name of the tag the repo was created from

state

The (integer) state of the repo. Corresponds to koji.REPO_STATES values

create_event

The event id (moment in koji history) that the repo was created from. I.e. the contents of the repo come from the contents of the tag at this event.

create_ts

This is the timestamp for the create_event.

creation_ts / creation_time

This is the time that the repo was created, which may be quite different than the time of the repo’s create_event. The creation_ts field is the numeric value and creation_time is a string representation of that.

state_ts / state_time

This is the time that the repo last changed state.

begin_event / end_event

These events define the range of validity for the repo. Individual events do not necessarily affect a given tag, so for each repo there is actually a range of events where it accurately represents the tag contents. The begin_event is the first event in the range. This will often be the same as the create_event, but might not be. The end_event is the first event after creation that changes the tag. This is often None when a repo is created. Koji will update this field as tags change.

begin_ts / end_ts

These are the numeric timestamps for the begin and end events.

opts

This is a dictionary of repo creation options

custom_opts

This dictionary indicates which options were overridden by the request

task_id

The numeric id of the task that created the repo

dist

A boolean flag. True for dist repos.
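The begin_event / end_event range can be read as a half-open interval: the repo matches the tag from begin_event up to, but not including, end_event. The helpers below are a small sketch of that reading, operating on a dictionary shaped like the repoInfo output above. The function names are hypothetical.

```python
from datetime import datetime, timezone


def repo_valid_at(repo, event_id):
    """True if the repo accurately represents its tag at event_id."""
    if event_id < repo["begin_event"]:
        return False
    end = repo["end_event"]
    # end_event is None while no later event has changed the tag.
    return end is None or event_id < end


def created_at(repo):
    """Return creation_ts as a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(repo["creation_ts"], tz=timezone.utc)


# Values taken from the repoInfo example above.
sample = {"id": 2398, "begin_event": 497152, "create_event": 497378,
          "end_event": None, "creation_ts": 1710216388.541893}
```

With this sample, repo_valid_at(sample, 497378) is True and repo_valid_at(sample, 497000) is False, since the latter event precedes begin_event.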

Repository Lifecycle

Generally, the lifecycle looks like:

INIT -> READY -> EXPIRED -> DELETED

Repositories begin in the INIT state when the newRepo task first initializes them. Repos in this state are incomplete and not ready to be used.

When Koji finishes creating a repo, it is moved to the READY state. Such repos are ready to be used. Their contents will remain unchanged until they are deleted. Note that this state does not mean the repo is current for its tag.

When a repo is no longer relevant, Koji will move it to the EXPIRED state. This means the repo is marked for deletion and should no longer be used.

Once a repo has been expired for a waiting period, Koji will move it to the DELETED state and remove its files from disk. The database entry will remain.

In cases of unusual errors, a repo might be moved to the PROBLEM state. Such repos should not be used and will eventually be deleted.
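The lifecycle above can be summarized in a few lines of code. This is a sketch only: the RepoState enum is a local stand-in whose integer values are assumed to match koji.REPO_STATES (prefer the library's own values in real code), and next_state and is_usable are hypothetical helpers.

```python
from enum import IntEnum


class RepoState(IntEnum):
    # Values assumed to match koji.REPO_STATES.
    INIT = 0
    READY = 1
    EXPIRED = 2
    DELETED = 3
    PROBLEM = 4


# The normal forward path; PROBLEM sits outside of it.
NORMAL_PATH = [RepoState.INIT, RepoState.READY, RepoState.EXPIRED, RepoState.DELETED]


def next_state(state):
    """Return the next state on the normal path, or None if there is none."""
    if state not in NORMAL_PATH:
        return None
    idx = NORMAL_PATH.index(state)
    return NORMAL_PATH[idx + 1] if idx + 1 < len(NORMAL_PATH) else None


def is_usable(state):
    """Only READY repos are complete and not yet marked for deletion."""
    return RepoState(state) is RepoState.READY
```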

Hub Configuration

There are several hub configuration options governing repo generation behavior:

MaxRepoTasks

The maximum number of newRepo tasks to run at one time. Default: 10

MaxRepoTasksMaven

The maximum number of newRepo tasks for maven tags to run at one time. Default: 2

RepoRetries

The number of times to retry a failed newRepo task per request. Default: 3

RequestCleanTime

The number of minutes to wait before clearing an inactive repo request. Default: 1440

AllowNewRepo

Whether to allow the legacy newRepo call. Default: True

RepoLag

This affects the default min_event value for normal repo requests. An event roughly this many seconds in the past is used. Default: 3600

RepoAutoLag

Same as RepoLag, but for automatic requests. Default: 7200

RepoLagWindow

This affects the granularity of the RepoLag and RepoAutoLag settings. Default: 600

RepoQueueUser

The user that should own the newRepo tasks generated by repo requests. Default: kojira

SourceTags

Tags matching these glob patterns will have the src option set. Default: ''

DebuginfoTags

Tags matching these glob patterns will have the debuginfo option set. Default: ''

SeparateSourceTags

Tags matching these glob patterns will have the separate_src option set. Default: ''
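One plausible reading of how RepoLag and RepoLagWindow combine is sketched below: step back by the lag, then snap down to a window boundary so that requests arriving close together target the same point in time. This is an interpretation for illustration, not the hub's actual code. The function name is hypothetical, and the later step of mapping the timestamp to an event is not shown.

```python
def default_min_timestamp(now, lag=3600, window=600):
    """Pick a timestamp about `lag` seconds back, snapped down to a window boundary.

    `now`, `lag` and `window` are in seconds. The defaults mirror the
    RepoLag and RepoLagWindow defaults listed above.
    """
    base = now - lag
    return base - (base % window)
```

Under this reading, two requests made 112 seconds apart (for example at 1710216388 and 1710216500) resolve to the same timestamp, 1710212400, so a single repo could serve both.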

Repository Layout

Koji’s repositories live under /mnt/koji/repos. From there, they are indexed by tag name and repo id. So, the full path to a given repository would look something like

/mnt/koji/repos/f40-build/6178041/

This directory will contain:

  • repo.json – data about the repo itself

  • groups – a directory containing comps data

  • <ARCH> – a directory for each tag arch containing a yum repo

The full path to an actual yum repo would be something like:

/mnt/koji/repos/f40-build/6178041/x86_64

This directory will contain:

  • pkglist – file listing the relative paths to the rpms for the repo

  • blocklist – file listing the blocked package names for the tag

  • rpmlist.jsonl – json data for the rpms in the repo

  • toplink – a relative symlink to the top of Koji’s directory tree (i.e. up to /mnt/koji)

  • repodata – yum repo data

By default, source rpms are omitted. This can be controlled by repository options. If the src option is True, then source rpms will be added to each arch repo separately, similar to noarch rpms. If the separate_src option is True, then a separate src repo is created.
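The layout described above can be expressed as a couple of small path helpers. This is a sketch that simply joins the components named in this section. The function names are hypothetical, and /mnt/koji is assumed as the top directory.

```python
import posixpath

# Entries this document lists for each per-arch repo directory.
ARCH_REPO_ENTRIES = ("pkglist", "blocklist", "rpmlist.jsonl", "toplink", "repodata")


def repo_dir(tag_name, repo_id, topdir="/mnt/koji"):
    """Path to the top of a repo: <topdir>/repos/<tag name>/<repo id>."""
    return posixpath.join(topdir, "repos", tag_name, str(repo_id))


def arch_repo_dir(tag_name, repo_id, arch, topdir="/mnt/koji"):
    """Path to the yum repo for one arch inside a repo directory."""
    return posixpath.join(repo_dir(tag_name, repo_id, topdir), arch)


def expected_entries(tag_name, repo_id, arch, topdir="/mnt/koji"):
    """Paths of the entries expected inside a per-arch repo directory."""
    base = arch_repo_dir(tag_name, repo_id, arch, topdir)
    return [posixpath.join(base, name) for name in ARCH_REPO_ENTRIES]
```

For instance, arch_repo_dir("f40-build", 6178041, "x86_64") gives /mnt/koji/repos/f40-build/6178041/x86_64, matching the example path above.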

Dist Repos

Dist repos are managed by a separate process. See Exporting repositories for more details.

Older Koji Versions

Prior to Koji 1.35, the triggering of repo generation was quite different. The kojira service monitored all build tags and triggered newRepo tasks whenever the tag content changed. The work queue was managed in kojira. For large systems, this could lead to significant regeneration backlogs.