Repo Generation
===============

Koji generates repositories based on tag content. For the most part, this means yum repos
from the rpm content, but Koji can also generate maven repos if configured to do so.

The *primary* purpose of these repos is to facilitate Koji's own build process.
Most builds utilize a buildroot generated by the ``mock`` tool, which needs a yum repository
to pull packages from.

Repositories can be triggered in different ways and with different parameters, but all
repositories represent the contents of a tag at a specific point in time (i.e. an event).


On demand generation
--------------------

When Koji needs a repo for a tag, it files a *request* via a hub call.
Typically this is done in a build process, but requests can also be triggered automatically
without a build if configured. They can also be triggered manually.

::

    repo.request(tag, min_event=None, at_event=None, opts=None, priority=None, force=False)
    description: Request a repo for a tag

        :param int|str tag: tag id or name
        :param int|str min_event: minimum event for the repo (optional)
        :param int at_event: specific event for the repo (optional)
        :param dict opts: custom repo options (optional)
        :param bool force: force request creation, even if a matching repo exists

        The special value min_event="last" uses the most recent event for the tag
        Otherwise min_event should be an integer

        use opts=None (the default) to get default options for the tag.
        If opts is given, it should be a dictionary of repo options. These will override
        the defaults.


Each repo request is for a single tag. The optional ``min_event`` parameter specifies how recent the
repo needs to be. If not given, Koji chooses a suitably recent event. The optional ``opts`` specifies
options for creating the repo. If not given, Koji uses the default options based on the tag.

When the hub responds to this call, it first checks to see if an existing repo satisfies the
request. If so, then information for that repo is returned and no further action is taken.
If there is no such repo yet, then Koji records the request and returns the request data.
If an identical active request already exists, then Koji will return that.
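
The decision order described above can be illustrated with a small in-memory model. This is only a sketch, not the hub's code: the ``ToyHub`` class, the shape of its return value, and the rule that a repo is "recent enough" when its ``create_event`` is at least ``min_event`` are all assumptions made for illustration.

```python
class ToyHub:
    """Toy in-memory stand-in for the hub's request handling (not Koji code)."""

    def __init__(self):
        self.repos = []      # dicts with: id, tag, create_event, opts
        self.requests = []   # dicts with: id, tag, min_event, opts, active
        self._next_id = 1

    def request(self, tag, min_event, opts=None):
        opts = opts or {}
        # 1. An existing repo that is recent enough satisfies the request
        #    (assumed rule: create_event >= min_event and same options).
        for repo in self.repos:
            if (repo["tag"] == tag and repo["opts"] == opts
                    and repo["create_event"] >= min_event):
                return {"repo": repo, "request": None, "duplicate": False}
        # 2. An identical active request is returned instead of a new one.
        for req in self.requests:
            if (req["active"] and req["tag"] == tag
                    and req["min_event"] == min_event and req["opts"] == opts):
                return {"repo": None, "request": req, "duplicate": True}
        # 3. Otherwise the request is recorded and its data returned.
        req = {"id": self._next_id, "tag": tag, "min_event": min_event,
               "opts": opts, "active": True}
        self._next_id += 1
        self.requests.append(req)
        return {"repo": None, "request": req, "duplicate": False}
```

Calling ``request`` twice with the same arguments yields the same request id the second time, and once a suitable repo exists the call returns that repo without recording anything new.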


Build parameters
----------------

For some types of builds, the user can affect the parameters of the repo request.

For rpm builds, the ``--wait-repo`` option will cause the build to request a *current* repo.
That is, the ``min_event`` for the request will be the most recent event that affected the tag.
For example, if a previous build has just been tagged into the buildroot, then this option will
ensure that the new build gets a repo containing the previous one.

It's worth noting that rpm builds also accept ``--wait-build`` option(s) that will cause the build
to wait for specific NVRs to be present in the repo. This option is not actually handled by the
request mechanism. Instead, the build will wait for these NVRs to be tagged and then request a
current repo.


Repository Options
------------------

There are a few options that govern how the repo is generated. At present these are:

src
    whether to include srpms in the repos

debuginfo
    whether to include debuginfo rpms

separate_src
    whether to create a separate src repo

maven
    whether to also create a maven repo

These options are normally determined by the tag that the repo is based on.
Administrators can set ``repo.opts`` for a given tag to control these options.

Additionally the following pattern based hub options can be used:

SourceTags
    Tags matching these glob patterns will have the src option set

DebuginfoTags
    Tags matching these glob patterns will have the debuginfo option set

SeparateSourceTags
    Tags matching these glob patterns will have the separate_src option set

For historical reasons, the ``maven`` option can also be controlled by setting the ``maven_support``
field for the tag. E.g. ``koji edit-tag --maven-support MYTAG``

Note that the ``maven`` option is ignored if Maven support is disabled on the hub.

Manually requested repos can specify their own custom options.
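
As a rough illustration of how these sources could combine, the sketch below resolves the options for a tag name. It is not the hub's implementation. The precedence (pattern-based hub options first, then the tag's ``repo.opts``, then custom options from the request), the all-``False`` defaults, and the treatment of each hub option as a space-separated list of glob patterns are assumptions. The function names are hypothetical.

```python
from fnmatch import fnmatchcase

# Assumed defaults: every option is off unless something enables it.
DEFAULT_OPTS = {"src": False, "debuginfo": False,
                "separate_src": False, "maven": False}

# Pattern-based hub option -> repo option it switches on.
PATTERN_OPTS = (
    ("SourceTags", "src"),
    ("DebuginfoTags", "debuginfo"),
    ("SeparateSourceTags", "separate_src"),
)


def matches_any(tag_name, patterns):
    """True if tag_name matches one of the space-separated glob patterns."""
    return any(fnmatchcase(tag_name, pat) for pat in patterns.split())


def resolve_opts(tag_name, hub_opts, tag_repo_opts=None, custom_opts=None):
    """Combine defaults, hub patterns, tag settings and request overrides."""
    opts = dict(DEFAULT_OPTS)
    for hub_key, opt in PATTERN_OPTS:
        if matches_any(tag_name, hub_opts.get(hub_key, "")):
            opts[opt] = True
    opts.update(tag_repo_opts or {})   # assumed: the tag's repo.opts wins over patterns
    opts.update(custom_opts or {})     # assumed: request options win over everything
    return opts
```

For example, with ``DebuginfoTags`` set to ``f4*-build``, the tag ``f40-build`` gets ``debuginfo`` enabled unless the tag or the request says otherwise.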


Automatic generation
--------------------

Automatic generation can be configured by setting ``repo.auto=True`` for a given tag.
This requires administrative access.
The system regularly requests repos for such tags.


From Requests to Repos
----------------------

All repo requests go into a queue that Koji regularly checks.
As long as there is sufficient capacity, Koji will create ``newRepo`` tasks for these
requests.

The status of a request can be checked with the ``repo.checkRequest`` API call:

::

    repo.checkRequest(req_id)
    description: Report status of repo request

        :param int req_id: the request id
        :return: status dictionary

        The return dictionary will include 'request' and 'repo' fields

If the return includes a non-None ``repo`` field, then that repo satisfies the request.
The ``request`` field will include ``task_id`` and ``task_state`` (may be None) to indicate
progress.
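
A client that needs the finished repo might poll this call until the ``repo`` field is set. The following is a minimal sketch of such a loop. ``wait_for_repo`` and its parameters are hypothetical, and the status call is passed in as a plain callable so that the example does not depend on a live hub.

```python
import time


def wait_for_repo(check_request, req_id, interval=5.0, timeout=600.0,
                  sleep=time.sleep):
    """Poll check_request(req_id) until it reports a repo.

    check_request is any callable returning a dict with 'request' and
    'repo' fields, as described above. Raises TimeoutError on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = check_request(req_id)
        if status.get("repo") is not None:
            # A non-None repo field means the request is satisfied.
            return status["repo"]
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"repo request {req_id} not fulfilled after {timeout}s")
        sleep(interval)
```

Injecting ``sleep`` keeps the loop easy to exercise in tests. In real use the callable would wrap the ``repo.checkRequest`` hub call.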



Repository Data
---------------

The hub stores key data about each repo in the database and this can be reported in numerous ways.
One common way is the ``repoInfo`` call, which returns data about a single repository. E.g.

::

    $ koji call repoInfo 2398
    {'begin_event': 497152,
     'begin_ts': 1707888890.306149,
     'create_event': 497378,
     'create_ts': 1710216388.543129,
     'creation_time': '2024-03-12 00:06:28.541893-04:00',
     'creation_ts': 1710216388.541893,
     'custom_opts': None,
     'dist': False,
     'end_event': None,
     'end_ts': None,
     'id': 2398,
     'opts': {'debuginfo': False, 'separate_src': False, 'src': False},
     'state': 3,
     'state_time': '2024-03-17 17:03:49.820435-04:00',
     'state_ts': 1710709429.820435,
     'tag_id': 2,
     'tag_name': 'f24-build',
     'task_id': 13611,
     'task_state': 2}

Key fields

.. glossary::
    id
        The integer id of the repo itself

    tag_id
        The integer id of the tag the repo was created from

    tag_name
        The name of the tag the repo was created from

    state
        The (integer) state of the repo. Corresponds to ``koji.REPO_STATES`` values

    create_event
        The event id (moment in koji history) that the repo was created from. I.e. the contents
        of the repo come from the contents of the tag at this event.

    create_ts
        This is the timestamp for the create_event.

    creation_ts / creation_time
        This is the time that the repo was created, which may be quite different than the time
        of the repo's create_event. The ``creation_ts`` field is the numeric value and
        ``creation_time`` is a string representation of that.

    state_ts / state_time
        This is the time that the repo last changed state.

    begin_event / end_event
        These events define the *range of validity* for the repo. Individual events do not
        necessarily affect a given tag, so for each repo there is actually a range of events
        where it accurately represents the tag contents.
        The ``begin_event`` is the first event in the range. This will often be the same as
        the create_event, but might not be.
        The ``end_event`` is the first event after creation that changes the tag. This is
        often None when a repo is created. Koji will update this field as tags change.

    begin_ts / end_ts
        These are the numeric timestamps for the begin and end events.

    opts
        This is a dictionary of repo creation options

    custom_opts
        This dictionary indicates which options were overridden by the request

    task_id
        The numeric id of the task that created the repo

    dist
        A boolean flag. True for dist repos.
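
The range of validity can be checked with a few lines of code. This sketch assumes the range is half-open, that is, it includes ``begin_event`` and excludes ``end_event``, which follows from ``end_event`` being the first event that changes the tag. The helper name ``repo_valid_at`` is hypothetical.

```python
def repo_valid_at(repo, event_id):
    """True if the repo accurately represents its tag at event_id.

    repo is a dict shaped like the repoInfo output, using only the
    'begin_event' and 'end_event' fields.
    """
    if event_id < repo["begin_event"]:
        return False
    end = repo["end_event"]
    # end_event of None means no later event has changed the tag yet.
    return end is None or event_id < end
```

With the sample data above (``begin_event`` 497152, ``end_event`` None), the repo is valid for event 497152 and every later event, but not for 497151.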


Repository Lifecycle
--------------------

Generally, the lifecycle looks like:

::

    INIT -> READY -> EXPIRED -> DELETED

Repositories begin in the ``INIT`` state when the ``newRepo`` task first initializes them.
Repos in this state are incomplete and not ready to be used.

When Koji finishes creating a repo, it is moved to the ``READY`` state. Such repos are ready
to be used. Their contents will remain unchanged until they are deleted.
Note that this state does not mean the repo is current for its tag.

When a repo is no longer relevant, Koji will move it to the ``EXPIRED`` state. This means the
repo is marked for deletion and should no longer be used.

Once a repo has been expired for a waiting period, Koji will move it to the ``DELETED`` state
and remove its files from disk. The database entry will remain.

In cases of unusual errors, a repo might be moved to the ``PROBLEM`` state. Such repos should
not be used and will eventually be deleted.
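
The lifecycle can be pictured as a small state machine. The sketch below is illustrative only. The integer values are assumed to mirror ``koji.REPO_STATES`` and should be checked against the installed library. The transition table, in particular which states may move to ``PROBLEM``, is an assumption rather than something the hub is known to enforce.

```python
# Assumed to mirror koji.REPO_STATES; verify against your Koji version.
REPO_STATES = {"INIT": 0, "READY": 1, "EXPIRED": 2, "DELETED": 3, "PROBLEM": 4}
STATE_NAMES = {value: name for name, value in REPO_STATES.items()}

# Hypothetical transition table based on the description above.
ALLOWED = {
    "INIT": {"READY", "PROBLEM"},
    "READY": {"EXPIRED", "PROBLEM"},
    "EXPIRED": {"DELETED", "PROBLEM"},
    "PROBLEM": {"DELETED"},
    "DELETED": set(),
}


def is_usable(state):
    """Only READY repos should be used; state is the integer from repoInfo."""
    return STATE_NAMES[state] == "READY"


def can_transition(old, new):
    """True if the sketch's table allows moving from state old to state new."""
    return new in ALLOWED[old]
```

Under this mapping, a ``state`` value of 3 in ``repoInfo`` output would correspond to ``DELETED``.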


Hub Configuration
-----------------

There are several hub configuration options governing repo generation behavior:

MaxRepoTasks
    The maximum number of ``newRepo`` tasks to run at one time. Default: ``10``

MaxRepoTasksMaven
    The maximum number of ``newRepo`` tasks for maven tags to run at one time. Default: ``2``

RepoRetries
    The number of times to retry a failed ``newRepo`` task per request. Default: ``3``

RequestCleanTime
    The number of minutes to wait before clearing an inactive repo request. Default: ``1440``

AllowNewRepo
    Whether to allow the legacy ``newRepo`` call. Default: ``True``

RepoLag
    This affects the default ``min_event`` value for normal repo requests.
    An event roughly this many seconds in the past is used.  Default: ``3600``

RepoAutoLag
    Same as RepoLag, but for automatic requests. Default: ``7200``

RepoLagWindow
    This affects the granularity of the ``RepoLag`` and ``RepoAutoLag`` settings. Default: ``600``

RepoQueueUser
    The user that should own the ``newRepo`` tasks generated by repo requests. Default: ``kojira``

SourceTags
    Tags matching these glob patterns will have the src option set. Default: ``''``

DebuginfoTags
    Tags matching these glob patterns will have the debuginfo option set. Default: ``''``

SeparateSourceTags
    Tags matching these glob patterns will have the separate_src option set. Default: ``''``
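
To give a feel for how ``RepoLag`` and ``RepoLagWindow`` might interact, the sketch below computes a lagged timestamp and rounds it down to a multiple of the window. The exact rounding rule the hub uses is not specified here, so rounding down is an assumption, and ``lagged_timestamp`` is a hypothetical helper. The hub would then pick an event near that time.

```python
def lagged_timestamp(now, lag=3600, window=600):
    """Return a time about `lag` seconds before `now`, in `window`-sized steps.

    now, lag and window are in seconds. Rounding down to a multiple of
    window is an assumed interpretation of "granularity".
    """
    target = now - lag
    return (target // window) * window
```

One effect of this kind of rounding is that requests arriving within the same window resolve to the same timestamp, so they can be satisfied by the same repo.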


Repository Layout
-----------------

Koji's repositories live under ``/mnt/koji/repos``. From there, they are indexed by tag name and repo id.
So, the full path to a given repository would look something like:

::

    /mnt/koji/repos/f40-build/6178041/

This directory will contain:

* ``repo.json`` -- data about the repo itself
* ``groups`` -- a directory containing comps data
* ``<ARCH>`` -- a directory for each tag arch containing a yum repo

The full path to an actual yum repo would be something like:

::

    /mnt/koji/repos/f40-build/6178041/x86_64

This directory will contain:

* ``pkglist`` -- file listing the relative paths to the rpms for the repo
* ``blocklist`` -- file listing the blocked package names for the tag
* ``rpmlist.jsonl`` -- json data for the rpms in the repo
* ``toplink`` -- a relative symlink to the top of Koji's directory tree (i.e. up to /mnt/koji)
* ``repodata`` -- yum repo data

By default, source rpms are omitted. This can be controlled by repository options.
If the ``src`` option is True, then source rpms will be added to each arch repo separately,
similar to noarch rpms.
If the ``separate_src`` option is True, then a separate ``src`` repo is created.
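
The layout above is regular enough to compute paths directly. The helpers below are a minimal sketch using only the standard library. Their names are hypothetical, and ``/mnt/koji`` is taken as the top directory as in the examples above. The last helper derives what a relative ``toplink`` target would look like for that layout.

```python
import posixpath


def repo_dir(tag_name, repo_id, topdir="/mnt/koji"):
    """Directory holding repo.json, groups and the per-arch repos."""
    return posixpath.join(topdir, "repos", tag_name, str(repo_id))


def arch_repo_dir(tag_name, repo_id, arch, topdir="/mnt/koji"):
    """Directory of the yum repo for one arch."""
    return posixpath.join(repo_dir(tag_name, repo_id, topdir), arch)


def toplink_target(tag_name, repo_id, arch, topdir="/mnt/koji"):
    """Relative path from an arch repo back up to the top directory."""
    return posixpath.relpath(topdir,
                             arch_repo_dir(tag_name, repo_id, arch, topdir))
```

For the example repo, ``arch_repo_dir("f40-build", 6178041, "x86_64")`` gives ``/mnt/koji/repos/f40-build/6178041/x86_64``, and the relative path back to the top is four levels up.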


Dist Repos
----------

Dist repos are managed by a separate process.
See :doc:`exporting_repositories` for more details.


Older Koji Versions
-------------------

Prior to Koji 1.35, the triggering of repo generation was quite different.
The kojira service monitored all build tags and triggered ``newRepo`` tasks
whenever the tag content changed. The work queue was managed in kojira.
For large systems, this could lead to significant regeneration backlogs.