Problem Package Format

This is the 2023-07-draft version of the Kattis problem package format.

Overview

This document describes the format of a Kattis problem package, used for distributing and sharing problems for algorithmic programming contests as well as educational use.

This document does not explicitly specify an access policy to the data in a problem package. Normally most data (such as test data) should be considered privileged and only to be available to those managing the problems/contest (from now on referred to as "judges"), unless it is indicated as meant to be shared with those attempting to solve the problem (from now on referred to as "teams").

General Requirements

  • The package must consist of a single directory containing files as described below. The directory name must consist solely of lowercase letters a–z and digits 0–9. Alternatively, the package can be a ZIP-compressed archive of such a directory with identical base name and extension .kpp or .zip.
  • All file names for files included in the package must match the regexp
    ^[a-zA-Z0-9][a-zA-Z0-9_.-]{0,253}[a-zA-Z0-9]$
    
    i.e., they must be of length at least 2, at most 255, consist solely of lower- or uppercase letters a–z, A–Z, digits 0–9, period, dash, or underscore, but must not begin or end with a period, dash, or underscore.
  • All directory names inside the package must match the regexp
    ^[a-zA-Z0-9]([a-zA-Z0-9_-]{0,253}[a-zA-Z0-9])?$
    
    i.e., they must be of length at least 1, at most 255, consist solely of lower- or uppercase letters a–z, A–Z, digits 0–9, dash, or underscore, but must not begin or end with a dash or underscore.

  • All text files for a problem must be UTF-8 encoded and not have a byte-order mark (BOM).
  • All text files must have Unix-style line endings (newline/LF byte only). Note that LF is line-ending and not line-separating in POSIX, which means that all non-empty text files must end with a newline.
  • Natural language (for example, in the problem statement filename) must be specified as a 2-letter ISO 639-1 code if one exists, and otherwise as a 3-letter code from ISO 639. Optionally, it may be suffixed with a hyphen and an ISO 3166-1 alpha-2 code, as defined in BCP 47, for example, pt-BR to indicate Brazilian Portuguese.
  • All floating-point numbers in any files that are parsed by contest system tools or the default output validator (including files in the problem package and submission output) must be given in decimal and may use scientific notation. More specifically, floating-point numbers must satisfy the following grammar, which accepts formatted floating-point output from most major programming languages (an equivalent regular expression is sketched after this list):
    sign         [+-]
    digit        [0123456789]
    expIndicator [Ee]
    significand  ( {digit}* "." {digit}+ | {digit}+ "." | {digit}+ )
    exponent     {expIndicator} {sign}? {digit}+
    float        {sign}? {significand} {exponent}?
    
  • The systems parsing these floating-point numbers may impose reasonable limits on the number of digits, but must support at least 30 digits before and after the decimal point. They must use an internal representation with at least 52 bits of mantissa precision and should make a "best effort" to parse floating-point numbers to their closest representable values.
  • The problem package may include symbolic links to other files in the problem package. Symlinks must not have targets outside the problem package directory tree.
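
As a non-normative illustration, the floating-point grammar above corresponds to the following regular expression, shown here as a small Python 3 helper (the helper name is arbitrary):

import re

# Equivalent to: sign? significand exponent?
_FLOAT_RE = re.compile(r"[+-]?([0-9]*\.[0-9]+|[0-9]+\.|[0-9]+)([Ee][+-]?[0-9]+)?")

def is_valid_float_token(token: str) -> bool:
    """Return True if the token matches the floating-point grammar."""
    return _FLOAT_RE.fullmatch(token) is not None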

Problem Package Structure Overview

The following table summarizes the elements of a problem package described in this specification:

File or Folder | Required? | Described In | Description
problem.yaml | Yes | Problem Metadata | Metadata about the problem (e.g., source, license, limits)
statement/ | Yes | Problem Statements | Problem statement files
attachments/ | No | Attachments | Files available to teams other than the problem statement and sample test data
solution/ | No | Solution Description | Written explanations of how to solve the problem
data/sample/ | No | Test Data | Sample test data
data/secret/ | Yes | Test Data | Secret test data
data/invalid_input/ | No | Invalid Test Cases | Invalid test case input for testing input validation
data/invalid_output/ | No | Invalid Test Cases | Invalid test case output for testing output validation
data/valid_output/ | No | Valid Output | Valid test case output for testing output validation
generators/ | No | Generators | Scripts and documentation about how test cases were automatically generated
include/ | No | Included Files | Files appended to all submitted solutions
submissions/ | Yes | Example Submissions | Correct and incorrect judge solutions of the problem
input_validators/ | Yes | Input Validators | Programs that verify the correctness of the test data inputs
static_validator/ | No | Static Validator | Custom program for judging solutions with source files as input
output_validator/ | No | Output Validator | Custom program for judging solutions
input_visualizer/ | No | Input Visualizer | Scripts and documentation about how test case illustrations were generated
output_visualizer/ | No | Output Visualizer | Program to generate images illustrating submission output

A minimal problem package must contain problem.yaml, a problem statement, a secret test case, an accepted judge solution, and an input validator.

Programs

There are a number of different kinds of programs that may be provided in the problem package: submissions, input validators, output validators, and output visualizers. All programs are always represented by a single file or directory. In other words, if a program consists of several files, these must be provided in a single directory. In the case that a program is a single file, it is treated as if a directory with the same name takes its place, which contains only that file. The name of the program, for the purpose of referring to it within the package, is the base name of the file or the name of the directory. There can't be two programs of the same kind with the same name.

Languages and Compilation

Submissions

The language of a submission program is determined by the language key in submissions.yaml if present; otherwise, by comparing the file extensions of the submission file(s) to those specified in the languages table. If a single language can't be determined, building fails. Included files, if any, must be copied into the submission folder before building the submission.

For languages where there could be several entry points, the entry point specified by the entrypoint key in submissions.yaml is used if present; otherwise, the default entry point in the languages table is used.

Other Programs

Other programs (validators and visualizers) provided as a directory may include one of two POSIX-compliant shell scripts, build and run. If at least one of these two files is included:

  1. First, if the build script is present, it must be executable and will be run. The working directory will be (a copy of) the program directory. The run file must exist in that directory and be executable after build is done.
  2. Then, the run file (which now exists, and is an executable binary or POSIX-compliant shell script) will be used as the validator or visualizer program.

Scripts may assume a POSIX-compliant shell and that a Python 3 interpreter, C compiler, and C++ compiler are available on the system search path, aliased to python3, cc, and c++ respectively. Problem packages with build or run scripts are strongly encouraged to include a README file in the program directory documenting any such additional dependencies.
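
For illustration, a minimal build script for a hypothetical validator written in C++ (the source file name is a placeholder) could look like this:

#!/bin/sh
# Compile the validator sources into an executable named "run",
# using the c++ compiler guaranteed to be on the search path.
c++ -O2 -o run validator.cpp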

Programs that do not include a build or run script must have one of the following forms:

  • a single Python 3 file;
  • a directory containing multiple Python 3 source files, two of which are __init__.py (defining a module) and __main__.py (which will be used as the program entry point);
  • a single C or C++ source file, or a directory containing one or more such files.

The language of files is inferred from their extension as listed in the languages table.

Working Directory

Each program must be run in a working directory with the following contents and nothing else:

  • For input validators: the files in the program directory of the input validator in question.
  • For submissions: the submitted files, any compiled binaries of the submitted files, any included files, and the contents of the .files directory of the test case being tested (if this directory exists).
  • For output validators, output visualizers, and static validators: the submitted files, any compiled binaries of the submitted files, as well as any included files.

Please note that in particular:

  • the working directory for submissions must not contain any of the test data files, except for the contents of the test case .files directory;
  • except for input validators, the files in a program's directory are not included in the working directory.

Problem Metadata

Metadata about the problem (e.g., source, license, limits) are provided in a YAML file named problem.yaml placed in the root directory of the package.

The keys are defined as below. Keys are optional unless explicitly stated. Any unknown keys should be treated as an error.

Key | Type | Required | Default
problem_format_version | String | Yes |
type | String or non-empty sequence of strings | No | pass-fail
name | String or map of strings | Yes |
uuid | String | Yes |
version | String | No |
credits | String or map with keys as defined below | No |
source | String, a sequence, or a map as defined below | No |
license | String | No | unknown
rights_owner | String | See below | See below
embargo_until | Date | No |
limits | Map with keys as defined below | No | See below
keywords | Sequence of strings | No |
languages | String or non-empty sequence of strings | No | all
allow_file_writing | Boolean | No | false
constants | Map of strings to int, float, or string | No |
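
Putting these keys together, a minimal problem.yaml could look like the following sketch (all values, including the uuid, are hypothetical):

problem_format_version: 2023-07-draft
name: Hello World!
uuid: 14e47bd0-0ab5-47a5-a3cd-e288d1fb0e5a
license: cc by-sa
credits:
  authors: Authy McAuth <authy@mcauth.example>
source: NWERC 2024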

Problem format version

Version of the Problem Package Format used for this package. If using this version of the Format, it must be the string 2023-07-draft. The string will be in the form <yyyy>-<mm> for a stable version, <yyyy>-<mm>-draft or draft for a draft version, or legacy or legacy-icpc for the version before the addition of problem_format_version. Documentation for version <version> is available at https://www.kattis.com/problem-package-format/spec/<version>.

Type

Type of problem. Must be either a single string or a non-empty sequence of strings, from the table below, with no repetition. Two values listed as incompatible must not both be in the sequence.

Value | Incompatible with | Comments
pass-fail | scoring | Default. Submissions are judged as either accepted or rejected (though the "rejected" judgement is more fine-grained and divided into results such as "Wrong Answer", "Time Limit Exceeded", etc.).
scoring | pass-fail | An accepted submission is additionally given a score, which is a non-negative numeric value (and the goal is to maximize this value).
multi-pass | submit-answer | A submission should run multiple times with inputs for the next pass generated by the output validator of the current pass.
interactive | submit-answer | The output validator is run interactively with the submission.
submit-answer | multi-pass, interactive | A submission consists of the answers to the test cases instead of source code for a program that produces the answers.
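
For example, an interactive problem where accepted submissions additionally receive a score could declare:

type: [scoring, interactive]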

Name

The name of the problem in each language for which a problem statement exists. The name field is a map with the language codes as keys and the problem names as values. If there is only one language and that language is English, the name field can simply be the problem name instead. The set of languages for which name is given must exactly match the set of languages for which a problem statement exists.

A deliberately complex example:

name:
  en: Hello World!
  pt-BR: Olá mundo!
  pt-PT: Oi mundo!
  fil: Kumusta mundo!

The simplest example, which implies that the only provided language is en:

name: Hello World!

UUID and version

The uuid is meant to track a problem, even if its package name and/or name changes. For example, it can be used to identify the existing problem to update in an online problem archive and not accidentally upload it as a new one. The intention is that a new uuid should be assigned if the problem significantly changes.

The version is meant for tracking (slightly) evolving versions of a problem, possibly during development, but also to track fixes to it. This can be used to check whether a problem uploaded to a contest system needs to be updated since it does not contain the latest fixes.

This specification currently does not attach any further semantic meaning to these fields.

Credits

Map specifying who should be credited for creating this problem. A person is specified as a string with the full name, optionally followed by an email address wrapped in <> (e.g., Full Name or Full Name <fullname@problem.example>). Each of the keys in this section is optional.

Key | Type | Comments
authors | Person or non-empty sequence of persons | The people who conceptualized the problem.
contributors | Person or non-empty sequence of persons | The people who developed the problem package, such as the statement, validators, and test data.
testers | Person or non-empty sequence of persons | The people who tested the problem package, for example, by providing a solution and reviewing the statement.
translators | Map of strings to persons or non-empty sequences of persons | The people who translated the statement to other languages. Each key must be a language code as described in General Requirements.
packagers | Person or non-empty sequence of persons | The people who created the problem package out of an existing problem.
acknowledgements | Person or non-empty sequence of persons | Extra acknowledgements or special thanks in addition to the previously mentioned.

A full example would be

credits:
  authors: Authy McAuth <authy@mcauth.example>
  contributors:
  - Authy McAuth <authy@mcauth.example>
  - Additional Contributor <extra@contrib.example>
  testers:
  - Tester One
  - Tester Two
  - Tester Three
  translators:
    da: Mads Jensen <mads@mads.example>
    eo: Ludoviko Lazaro Zamenhofo
  acknowledgements:
    - Inspirational Speaker 1
    - Inspirational Speaker 2
  packagers:
    - Package Creatorson

which demonstrates all the available credit types.

Credits are sometimes omitted when authors instead choose to only give source credit, but both may be specified. If a string is provided instead of a map for credits, such as

credits: Authy McAuth <authy@mcauth.example>

it is treated as if only a single author is being specified, so it is equivalent to

credits:
    authors: Authy McAuth <authy@mcauth.example>

to support a less verbose credits section.

Source

The source key contains one or more source entries that this problem originates from. Each entry consists of either a map with keys name and url, where name is required, but url is optional, or alternatively a string with value equivalent to that of the name key. If there is only a single source entry, it can be specified directly as the value of source; otherwise source contains a list with all entries.

The name should typically contain the name (and year) of the problem set (such as a contest or a course) where the problem was first used or for which it was created, and the key url should map to a link to the event's page.

The following are valid examples:

source:
  name: NWERC 2024
  url: https://2024.nwerc.example/contest

which without url can be shortened to

source: NWERC 2024

A more extensive example:

source:
  - name: NWERC 2024
    url: https://2024.nwerc.example/contest
  - SWERC 2024
  - name: SEERC 2024

License

License under which the problem may be used. Must be one of the values below.

Value | Comments | Link
unknown | The default value. In practice means that the problem can not be used. |
public domain | There are no known copyrights on the problem, anywhere in the world. | http://creativecommons.org/about/pdm
cc0 | CC0, "no rights reserved", version 1 or later. | https://creativecommons.org/publicdomain/zero/1.0/
cc by | CC attribution license, version 4 or later. | http://creativecommons.org/licenses/by/4.0/
cc by-sa | CC attribution, share alike license, version 4 or later. | http://creativecommons.org/licenses/by-sa/4.0/
educational | May be freely used for educational purposes. |
permission | Used with permission. The rights owner must be contacted for every additional use. |

Rights Owner

A rights owner is needed if the license is anything other than unknown or public domain. If rights_owner is provided, this is the rights owner. Otherwise, if one or several authors are specified in credits, that group or individual is the rights owner. Otherwise, if a source is specified, the legal entity owning the rights associated with that source is the rights owner.

Problem Publication Embargo

The embargo_until key, if present, declares that the problem package should not be made publicly available (in problem archives, online judges, etc.) until a certain date and time. The value of this key must be a calendar date, or date and time of day in UTC, in ISO-8601 extended format (YYYY-MM-DD or YYYY-MM-DDThh:mm:ssZ). The time defaults to the start of the day in UTC.
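
For example, a problem reserved for a hypothetical contest starting at 10:00 UTC on 1 June 2025 could specify:

embargo_until: 2025-06-01T10:00:00Z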

Limits

Time, memory, and other limits to be imposed on submissions. A map with the following keys:

Key | Comments | Default | Typical system default
time_multipliers | optional map as defined below | see below |
time_limit | optional float > 0, in seconds | see below |
time_resolution | optional float > 0, in seconds | 1.0 |
memory | optional int > 0, in MiB | system default | 2048
output | optional int > 0, in MiB | system default | 8
code | optional int > 0, in KiB | system default | 128
compilation_time | optional int > 0, in seconds | system default | 60
compilation_memory | optional int > 0, in MiB | system default | 2048
validation_time | optional int > 0, in seconds | system default | 60
validation_memory | optional int > 0, in MiB | system default | 2048
validation_output | optional int > 0, in MiB | system default | 8
validation_passes | optional int >= 2, only for multi-pass | 2 |

For most keys, the system default will be used if nothing is specified. This default can vary between systems, but you should assume that it is reasonable. Only specify a limit when the problem needs a specific value, and in that case specify it even if it coincides with the typical system default.
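
For example, a problem with unusually large output but modest memory needs might specify (the values are illustrative only):

limits:
  memory: 1024
  output: 64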

Problem Timing

time_multipliers is a map with the following keys:

Key | Comments | Default
ac_to_time_limit | optional float >= 1 | 2.0
time_limit_to_tle | optional float >= 1 | 1.5

The value of time_limit is an integer or floating-point problem time limit in seconds. The time multipliers specify safety margins relative to the slowest accepted submission, T_ac, and fastest time_limit_exceeded submission, T_tle. The time_limit must satisfy T_ac * ac_to_time_limit <= time_limit and time_limit * time_limit_to_tle <= T_tle. In these calculations, T_tle is treated as infinity if the problem does not provide at least one time_limit_exceeded submission.

If no time_limit is provided, the default value is the smallest integer multiple of time_resolution that satisfies the above inequalities. It is an error if no such multiple exists. The time_resolution key is ignored if the problem provides an explicit time limit (and in particular, the time limit is not required to be a multiple of the resolution). Since time multipliers are more future-proof than absolute time limits, avoid specifying time_limit whenever practical.
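
As an illustration with hypothetical timings: suppose ac_to_time_limit is 2.0, time_limit_to_tle is 1.5, time_resolution is 1.0, the slowest accepted submission runs in 1.3 seconds, and the fastest time_limit_exceeded submission runs in 6.0 seconds. Then any explicit time_limit must lie between 1.3 * 2.0 = 2.6 and 6.0 / 1.5 = 4.0 seconds, and if no time_limit is given the default is 3.0 seconds, the smallest multiple of the resolution in that range.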

Judge systems should make a best effort to respect the problem time limit, and should warn when importing a problem whose time limit is specified with precision greater than can be resolved by system timers.

Keywords

List of keywords describing the problem.

Languages

List of one or more programming language codes from the languages table, or the string all. If the value is all, the problem may be solved using any supported programming language; otherwise, only the listed languages may be used.

File endings in parentheses are not used for determining the language.

Allow File Writing

Flag for configuring whether submissions may create, edit, and delete files in their working directory. A value of true means submissions can read and write files, while the default value of false means submissions can only read files.

Constants

Global constant values used by the problem, specified as a map of names to values. Names must match the following regex: [a-zA-Z_][a-zA-Z0-9_]*. Constant sequences are tokens of the form {{name}}, where name is one of the names defined in constants. A tag {{xyz}} whose name is not defined in constants is left unmodified, but tooling may warn about it.

All constant sequences in the following files will be replaced by the value of the corresponding constant:

  • Markdown problem statements
  • input and output validators
  • included code
  • example submissions
  • test_group.yaml

Note that constants are also available in LaTeX problem statements via the dedicated command \constant{name}.

Constant sequences are not replaced in test data files or in problem.yaml itself.
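
For example, a problem with a single bound used throughout the package could declare, in problem.yaml:

constants:
  max_n: 100000

Every occurrence of {{max_n}} in, e.g., a Markdown statement, an input validator, or an example submission would then be replaced by 100000, and a LaTeX statement could refer to the same value as \constant{max_n}.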

Problem Statements

The problem statement of the problem is provided in the directory statement/.

This directory must contain one file per language, for at least one language, named problem.<language>.<filetype>, that contains the problem text itself, including input and output specifications. Here, <language> is a language code as described in General Requirements. Filetype can be either .tex for LaTeX files, .md for Markdown, or .pdf for PDF.

Please note that many kinds of transformations on the problem statements, such as conversion to HTML or styling to fit in a single document containing many problems, will not be possible for PDF problem statements, so this format should be avoided if at all possible.

Auxiliary files needed by the problem statement files must all be in statement/. problem.<language>.<filetype> should reference auxiliary files as if the working directory is statement/. All statement types support the image file formats .png, .jpg, .jpeg. LaTeX statements also support .pdf. Markdown statements also support .svg.

Sample Data

  • For problem statements provided in LaTeX or Markdown: the statement file must contain only the problem description and input/output specifications and no sample data. It is the judge system's responsibility to append the sample data.
  • For problem statements provided as PDFs: the judge system will display the PDF verbatim; therefore any sample data must be included in the PDF. The judge system is not required to reconcile sample data embedded in PDFs with the sample test data group nor to validate it in any other way.

LaTeX Environment and Supported Subset

Problem statements provided in LaTeX must consist only of the problem statement body (i.e., the content that would be placed within a document environment). It is the judging system's responsibility to wrap this text in an appropriate LaTeX class.

The LaTeX class shall provide the convenience environments Input, Output, and Interaction for delineating sections of the problem statement. It shall also provide the following commands:

  • \problemname{name}, which should typically be the first line of the problem statement and places the problem name in the problem statement header. The argument name is optional: if it is omitted, the name value from problem.yaml matching the problem statement's language is used. If it is given, it overrides that value and must be a LaTeX-formatted version of it; this is useful when the problem name contains math formulas or other text that should be typeset specially.
  • \illustration{width}{filename}{caption}, a convenience command for adding a figure to the problem statement. width is a floating-point argument specifying the width of the figure as a fraction of the total width of the problem statement; filename is the image to display; and caption is the text to include below the figure. The illustration should be set flush right, with text flowing around it (as in a wrapfigure).
  • \nextsample tells the judge system to include the next sample test case here. It is an error to use \nextsample when there are no remaining sample test cases.
  • \remainingsamples, tells the judge system to include all sample test cases that have not previously been included by \nextsample. It is allowed to use \remainingsamples even if there are no remaining sample test cases, which will simply include nothing.
  • \constant{name} evaluates to the value of the corresponding constant, see constants.

Arbitrary LaTeX is not guaranteed to render correctly by HTML-based judging systems. However, judging systems must make a best effort to correctly render at minimum the following LaTeX subset when displaying a LaTeX problem statement:

  • All MathJax-supported TeX commands within inline ($ $) and display ($$ $$) math mode.
  • The following text-mode environments: itemize, enumerate, lstlisting, verbatim, quote, center, tabular, figure, wrapfigure (from the wrapfig package).
  • \item within list environments and \hline, \cline, \multirow, \multicol within tables.
  • The following typesetting constructs: smart quotes (' ', << >>, `` ''), dashes (--, ---), non-breaking space (~), ellipses (\ldots and \textellipsis), and \noindent.
  • The following font weight and size modifiers: \bf, \textbf, \it, \textit, \t, \tt, \texttt, \emph, \underline, \sout, \textsc, \tiny, \scriptsize, \small, \normalsize, \large, \Large, \LARGE, \huge, \Huge.
  • \includegraphics from the package graphicx, including the Polygon-style workaround for scaling the image using \def \htmlPixelsInCm.
  • The miscellaneous commands \url, \href, \section, \subsection, and \epigraph.

Markdown Environment and Supported Features

Problem statements in Markdown must not include the problem name, as the judging system will automatically prepend it. Statements must also not contain scripting or reference external resources for content, such as images. Due to security concerns, it is strongly recommended to pass the compiled statement through a sanitizer.

Markdown statements may use .svg files. Any .svg files must not contain scripting or references to external resources.

The judging system shall provide the following commands:

  • {{nextsample}} tells the judge system to include the next sample test case here. It is an error to use {{nextsample}} when there are no remaining sample test cases.
  • {{remainingsamples}} tells the judge system to include all sample test cases that have not previously been included by {{nextsample}}. It is allowed to use {{remainingsamples}} even if there are no remaining sample test cases, which will simply include nothing.
  • {{name}} evaluates to the value of the corresponding constant, see constants.

The judging system shall support the Markdown flavor described by CommonMark. However, as many implementations are not fully compliant, full compliance with CommonMark is not required. Still, a reasonable effort shall be made to ensure that CommonMark-compliant statements render correctly.

Additionally, the following extensions shall be supported:

Attachments

Public, i.e., non-secret, files to be made available in addition to the problem statement and sample test data are provided in the directory attachments/.

Solution description

A description of how the problem is intended to be solved is provided in the directory solution/.

This directory must contain one file per language, for at least one language, named solution.<language>.<filetype>. Language is given the same way as for problem statements. Optionally, the language code can be left out; the default is then English (en). The set of languages used can be different from what was used for the problem statement. Filetype can be either .tex for LaTeX files, .md for Markdown, or .pdf for PDF.

Auxiliary files needed exclusively by the solution description files should all be in solution/. solution.<language>.<filetype> should reference auxiliary files as if the working directory is solution/. Additionally, all images in statement/ can also be referenced as if the working directory is statement/. Note that if a file with the same name exists in both statement/ and solution/, only the one in solution/ can be referenced.

Exactly how the solution description is used is up to the user or tooling.

Test data

The test data are provided in subdirectories of data/: the sample data in data/sample/ and the secret data in data/secret/.

All files and directories associated with a single test case have the same base name with varying extensions. Here, base name is defined to be the relative path from the data directory to the test case input file, without extensions. For example, the files secret/test.in and secret/test.ans are associated with the same test case that has the base name secret/test. The existence of the .in file declares the existence of the test case. If the test case exists, then an associated .ans file must exist while the others are optional. If the test case does not exist, then the other files must not exist. Note that a test case must not be named */test_group, since test_group.yaml would then be configuration for both the test case and test group. The table below summarizes the supported test data:

Extension | Described In | Summary
.in | Input | Input piped to standard input
.ans | Output Validator | Input to the output validator
.files | Input | Input available via file I/O
.yaml | Test Case Configuration | Additional configuration of the test case
.png, .jpg, .jpeg, .svg | Illustrations | Illustration of the test case

Judge systems may assume that the result of running a program on a test case is deterministic. For any two test cases, if their .in files, the contents of their .files directories, and the args sequences in their .yaml files are equivalent, then the inputs of the two test cases are equivalent. This means that for any two test cases, if their inputs, output validator arguments, and .ans files are equivalent, then the test cases are equivalent. The assumption of determinism means that a judge system may choose to reuse the result of a previous run, or to re-run the equivalent test case.

Input

Each test case can supply input via standard input, command-line arguments, and/or the file system. These options are not exclusive. For a test case with base name test, the file test.in is piped to the submission as standard input. The submission will be run with the args sequence defined in the test.yaml file as command-line arguments. Note that usually the submission's entry point, whether it be a binary or an interpreted file, will be the absolute first command line argument. However, there exist languages, such as Java, where there is no initial command line argument representing the entry point.

The directory test.files, if it exists, contains input files available to the submission via file I/O. All files in this directory must be copied into the submission's working directory after compiling, but before executing the submission, possibly overwriting the compiled submission file or included data in the case of name conflicts.

Illustrations

An illustration provides a visualization of the associated test case, meant for the judges. At most one illustration file may be provided per test case. The file must share the base name of the associated test case. The supported file extensions are .png, .jpg, .jpeg, and .svg.

Test Case Configuration

One YAML file with additional configuration may be provided per test case. The file must share the base name of the associated test case.

The allowed keys are defined as follows. Keys are optional unless explicitly stated. Any unknown keys should be treated as an error.

Key | Type | Default
args | Sequence of strings | Inherited from test_group.yaml, which defaults to empty sequence
output_validator_args | Sequence of strings | Inherited from test_group.yaml, which defaults to empty sequence
input_validator_args | Sequence of strings or map of strings to sequences of strings | Inherited from test_group.yaml, which defaults to empty sequence
full_feedback | Boolean | Inherited from test_group.yaml, which defaults to false in secret and true in sample
hint | String |
description | String |

For each test case:

  • args defines arguments passed to the submission for this test case.
  • output_validator_args defines arguments passed to the output validator for the test case.
  • input_validator_args defines arguments passed to each input validator for the test case. If a sequence of strings, then those are the arguments that will be passed to each input validator for this test case. If a map, then each key is the name of an input validator and the value is the arguments to pass to that input validator for the test case. Validators not present in the map are run without any arguments.
  • When full_feedback is true, somebody whose submission didn't pass the test case should be shown:
    • the given input,
    • the produced output (stdout),
    • any error messages (stderr),
    • the illustration created by the output visualizer (if applicable),
    • the expected output.
  • A hint provides feedback for solving a test case, e.g., to somebody whose submission didn't pass it.
  • A description conveys the purpose of a test case. It is an explanation of what aspect or edge case of the solution the input file is meant to test.
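
For example, a hypothetical test case data/secret/04-huge.in could be accompanied by data/secret/04-huge.yaml containing (the description and hint texts are made up; the output validator arguments assume the default output validator):

output_validator_args: [float_tolerance, "1e-6"]
full_feedback: true
description: Maximum n with all coordinates equal, stressing the tie-breaking logic.
hint: Watch out for 32-bit integer overflow.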

Test Data Groups

The test data for the problem can be organized into a tree-like structure rooted in the data folder. Each node of this tree is represented by a directory and referred to as a test data group. At the top level, the test data is divided into exactly two groups: sample and secret. The secret group may be further split into subgroups (each a subdirectory of secret). Each test data group (other than the root data group) may contain zero or more test cases (i.e., input-answer files). The sample directory may be omitted if a problem has no sample test cases. The secret directory must exist and the secret test group, or one of its descendant subgroups, must contain at least one test case.

Test cases and groups will be used in lexicographical order on file base name. If a specific order is desired, a numbered prefix such as 00, 01, 02, 03, and so on, can be used. A subgroup must not have the same name as a test case in the same directory. For example, if the file data/secret/huge.in exists then the directory data/secret/huge/ must not, and vice versa.

In each test data group, a YAML file test_group.yaml may be placed to specify how the result of the test data group should be computed. Some of the keys and their associated values are inherited from the test_group.yaml in the closest ancestor group, from the test case up to the root data directory, that has one. Keys that are not inherited take their default values unless explicitly defined in the group's test_group.yaml file. If there is no test_group.yaml file in the root data group, one is implicitly added with the default values.

The format of test_group.yaml is as follows:

Key | Type | Default | Inheritance | Comments
scoring | Map | See Verdict/Score Aggregation | Not inherited | Description of how the results of the group's test cases and subgroups should be aggregated. This key is only permitted for the secret group and its subgroups.
input_validator_args | Sequence of strings or map of strings to sequences of strings | empty sequence | Inherited | See Test Case Configuration.
output_validator_args | Sequence of strings | empty sequence | Inherited | See Test Case Configuration.
static_validation | Map or boolean | false | Not applicable | Configuration of the static validation test case. See Static Validator.
full_feedback | Boolean | false in secret, true in sample | Inherited | See Test Case Configuration.
hint | String | | Inherited | See Test Case Configuration.
description | String | | Inherited | See Test Case Configuration.
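
A sketch of a data/secret/test_group.yaml overriding a few of these keys (the argument values assume the default output validator):

output_validator_args: [case_sensitive, space_change_sensitive]
full_feedback: true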

Invalid Test Cases

The data directory may contain directories with test cases that must be rejected by validation. Their goal is to ensure the integrity and quality of the test data and validation programs.

Invalid Input

The files under invalid_input are invalid inputs. Unlike in sample and secret, there are no .ans files. Each tc.in under invalid_input must be rejected by at least one input validator.

Invalid Output

The test cases in invalid_output describe invalid outputs for non-interactive problems. They consist of three files: the input file tc.in, which must contain valid input; the answer file tc.ans; and the output file tc.out, which must fail output validation for the given input and answer.

In particular, for any test case in invalid_output/, for example invalid_output/tc:

<output_validator_program> tc.in tc.ans dir [arguments] < tc.ans # MUST PASS
<output_validator_program> tc.in tc.ans dir [arguments] < tc.out # MUST FAIL

The directory invalid_output must be organized into a tree-like structure similar to secret and may contain arguments in test_group.yaml files that are passed to the validators.

Valid Output

The data directory may contain a directory of test cases that must pass validation. Their goal is to ensure the integrity and quality of the validation programs. The test cases in valid_output describe valid outputs for non-interactive problems. They consist of three files: the input file tc.in, which must contain valid input; the answer file tc.ans; and the output file tc.out, which must pass output validation for the given input and answer.

In particular, for any test case in valid_output/, for example valid_output/tc:

<output_validator_program> tc.in tc.ans dir [arguments] < tc.ans # MUST PASS
<output_validator_program> tc.in tc.ans dir [arguments] < tc.out # MUST PASS

The directory valid_output must be organized into a tree-like structure similar to secret and may contain arguments in test_group.yaml files that are passed to the validators.

Samples

Sample test cases can be used in three places:

  • As test cases for team submissions (with feedback possibly provided to the teams).
  • As sample input and output displayed in the problem statement.
  • As sample input and output files available for download, or otherwise made available.

By default the sample data for all three cases is taken from the .in and .ans file pairs under data/sample. Some problems require (slightly) different data in each of these cases. We allow customizing which data is used for each purpose with the additional extensions .statement and .download.

Samples For Judging Team Submissions

The data/sample directory contains test cases similar to those in data/secret. Every submission is run on these test cases (but sample test cases do not contribute to the problem score for scoring problems).

data/sample must not contain test groups. It may be missing (for problems with no samples) or empty.

Samples Shown in the Problem Statement

By default, the .in and .ans pairs from data/sample are shown in the problem statement. If a .out file exists, it is shown instead of the .ans file in the problem statement. This behavior can be customized by creating files with the extensions .in.statement and .ans.statement. If one of these files exists, its contents replace those of the file with the same name minus the .statement extension for the purposes of the problem statement. Note that it is an error to provide both a .out and a .ans.statement file.

Interactive Problems

Interactive problems require a custom output validator, which interacts with the submission. The validator gets access to the .in and .ans files for each test case and communicates with the submission by reading from standard in and writing to standard out. Standard in of the output validator corresponds to standard out of the submission, and standard out of the output validator corresponds to standard in of the submission. Therefore, the output validator can control what information from the .in and .ans files is provided to the submission.

For interactive problems, displaying two files typically does not meaningfully capture how users are expected to interact with the output validator. Therefore, it is advised to instead provide samples for the problem statement in the form of a .interaction file. This file contains an interaction log: lines starting with < contain output from the output validator, and lines starting with > contain output from the submission.
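
For example, for a hypothetical number-guessing problem where the output validator first announces an upper bound and then answers each guess, a .interaction file could look like this:

<1000
>500
<higher
>750
<lower
>625
<correct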

Note that if you are using a .interaction file you must not provide a .in.statement, .ans.statement, or .out file.

Multi-Pass Problems

Multi-pass problems require a custom output validator, which interacts with the submission; see multi-pass.

For multi-pass problems, displaying two files typically does not meaningfully capture how users are expected to interact with the output validator. Therefore, it is advised to instead provide samples for the problem statement in the form of a .interaction file. This file contains lines starting with < and >, as for interactive problems. Passes are separated by a line containing --- (three dashes). When the problem is not interactive, simply start each pass with a number of lines starting with <, containing the sample input, followed by some lines starting with >, containing the sample output.

Note that if you are using a .interaction file you must not provide a .in.statement, .ans.statement, or .out file.

Samples Available for Download

By default, the .in, .ans, and .files files in data/sample are available for download. Note that the contents of .in.statement replace those of .in, and that the contents of .out or .ans.statement replace those of .ans for the download. This behavior can be further customized by providing files with the extension .in.download or .ans.download. If one of these files exists, its contents replace those of the file with the same name minus the .download extension for the problem download. Additionally, any other file or directory with the extension .download is also made available for download (without the .download extension).

If you want to make other files – like testing tools – available for download, you can use attachments.

Validation

All data/sample/*.in files are input-validated. For non-interactive and non-multi-pass problems, the .out files must pass the output validator. For non-interactive and non-multi-pass problems, the .ans files must pass the output validator if they are not overridden in any way, i.e., if they are shown in the statement. All other files are not validated in any way.

Note that .ans.statement and .out can both be used to change what is shown in the statement. However, since only the .out files are validated, it is advised to use .out files if possible.

Validation can be customized by specifying input_validator_args and output_validator_args in data/sample/test_group.yaml.

Generators

If any generator scripts were used to automate writing test cases, it is recommended to include the generator source code in the directory generators/ along with invocation instructions in a file such as generators/README.txt. This information can be useful as a debugging aid and for archival completeness: judge systems are not responsible for executing the provided generators and all test data written by the generators must be included in the problem package.

Included Files

Files that should be included with all submissions are provided in one non-empty directory per supported language. Files that should be included for all languages are placed in the non-empty directory include/default/. Files that should be included for a specific language, overriding the default, are provided in the non-empty directory include/<language>/, where <language> is a language code as given in the languages table.

The files from the directory matching the language of the submission should be copied alongside the submission files before compiling, but after checking whether the submission exceeds the code limit, overwriting files from the submission in the case of a name collision. The language must be one of the allowed submission languages as specified by languages in problem.yaml. If any of the included files is supposed to be the main file (i.e., a driver), that file must have the language-dependent name as given in the table referred to above.

Example Submissions

Correct and incorrect solutions (file or directory programs) to the problem are provided in direct subdirectories of submissions/. By default, the possible subdirectories are those in the table below, but they can be customized and more can be added; see Default directories. Submission programs (either a single file or a directory of files) must be placed in a direct subdirectory of submissions/, e.g., submissions/accepted/.

Directory | Requirement | Comment
accepted | Accepted as a correct solution for all test cases. At least one is required. | Used to lower bound the time limit.
rejected | At least one case is not accepted. |
wrong_answer | At least one case is wrong answer, and all cases are either wrong answer or accepted. | Used to lower bound the time limit.
time_limit_exceeded | Too slow on at least one case, and all cases are either too slow or accepted. | Used to upper bound the time limit.
run_time_error | Crashes for at least one case, and all cases either crash or are accepted. | Used to lower bound the time limit.
brute_force | Never gives the wrong answer, but not accepted because of a run time error or timeout. |

Every file or directory in these directories represents a separate solution. It is mandatory to provide at least one accepted solution.

Metadata about the example submissions is provided in a YAML file submissions/submissions.yaml. The top-level keys in submissions.yaml are glob patterns matching files or directories under submissions/. For example, accepted and accepted/* match all submissions in the submissions/accepted/ directory. See also Glob patterns.

Each glob pattern maps to a map with keys as defined below, specifying metadata for all submissions that are matched by the glob pattern.

Key | Type | Default | Comment
language | String | As determined by file endings given in the language list |
entrypoint | String | As specified in the language list |
authors | Person or sequence of persons | | Author(s) of submission(s).
model_solution | Bool | false | Suggested model solution, suitable to be published.
permitted | Sequence of strings | [AC, WA, TLE, RTE] | All test cases must have a verdict in this subset of AC, WA, TLE, RTE.
required | Sequence of strings | [AC, WA, TLE, RTE] | At least one test case must have a verdict in this subset of AC, WA, TLE, RTE.
score | Float or list of two floats | | The score of the submission equals the given number, or is in the given inclusive range. Only for scoring problems.
message | String | Empty string | This must appear as a substring in at least one judgemessage.txt.
use_for_time_limit | Bool or string (lower/upper) | See below. | Controls whether this submission is used to determine the time limit.

Every submission matched by the glob pattern must satisfy:

  • all test cases must have only verdicts present in permitted;
  • at least one test case must have a verdict in required;
  • if given, the score must be in the given inclusive range or equal the given score;
  • if given, the message string must be included as a case-sensitive substring in the judgemessage.txt for at least one test case.

The tooling should check the constraints for consistency, such as that two disjoint permitted sets are never applied to a single (submission, testcase) pair.
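
For example, a submissions.yaml could attach metadata to individual submissions like this (the file names, authors, and entry point are hypothetical):

accepted/alice.py:
  authors: Alice Author <alice@author.example>
  model_solution: true
accepted/Bob.java:
  authors: Bob Builder
  entrypoint: Bob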

Groups

The permitted, required, score, message, and use_for_time_limit requirements can also be given for only a subset of test cases, by adding them under a key with the name of a test group (relative to data/). In this case, the permitted, required, message, and use_for_time_limit keys only apply to the set of test cases (recursively) in the given group.

The score key puts a constraint on the aggregated score of a given test group, not on the set of test cases the group contains.

For example, the configuration below tests that the submission solves all cases in group1, but times out on at least one case in group2.

solves_group_1.py:
  sample:
    permitted: [AC]
  secret/group1:
    permitted: [AC]
  secret/group2:
    permitted: [AC, TLE]
    required: [TLE]

Glob patterns

Glob patterns can be used to apply restrictions to a subset of submissions. It is also possible to use glob patterns to put restrictions on a subset of test cases and test groups, for example, when test groups are not used:

time_limit_exceeded/solves_easy_cases.py:
  sample:
    permitted: [AC]
  secret/*-easy:
    permitted: [AC]
  secret/*-hard:
    permitted: [AC, TLE]
    required: [TLE]

This means that the submission must solve all samples and all easy cases, but must time out on at least one of the hard cases.

Submission glob patterns are matched against all paths to files and directories of submissions inside and relative to the submissions/ directory. Test case glob patterns are matched against all paths of test groups and test cases relative to data/, excluding the trailing .in. Wildcards (*) only match within a file name (i.e., do not match /). A test case is matched by the glob pattern if either itself or any of its parent test groups is matched by it, and similarly a submission is matched if either itself or a parent directory is matched.

Using ** to match any number of directories and [xyz] to match only a subset of characters is not supported. Brace expansion is supported for both submissions and test cases. Thus, one can write {simple,complex}.py or author.{py,cpp} to match multiple files.

Default directories

By default, the following requirements are defined:

# All cases must be accepted.
accepted:
  permitted: [AC]
# At least one case is not accepted.
rejected:
  required: [RTE, TLE, WA]
# All cases AC or WA, at least one WA.
wrong_answer:
  permitted: [AC, WA]
  required: [WA]
# All cases AC or TLE, at least one TLE.
time_limit_exceeded:
  permitted: [AC, TLE]
  required: [TLE]
# All cases AC or RTE, at least one RTE.
run_time_error:
  permitted: [AC, RTE]
  required: [RTE]
# Must not WA, but fail at least once.
# Note that by default these are not used for determining the time limit.
brute_force:
  permitted: [AC, RTE, TLE]
  required: [RTE, TLE]

The defaults can be overwritten in the submissions.yaml file by simply specifying the name of the directory. Keys that are not specified are inherited from the default configuration above. This is supported for backwards compatibility and is not recommended for normal usage.

time_limit_exceeded:
  permitted: [AC, WA, TLE]
  required: [TLE]

Note that the glob time_limit_exceeded/* would impose an additional requirement, instead of replacing the original requirement.

Time limit inference

Any submission that must satisfy a required: TLE requirement, i.e., must TLE on at least one test case, is used to provide an upper bound on the time limit. Precisely, the time limit must be at most T / time_limit_to_tle, where T is the slowest runtime over the set of test cases to which the rule applies. Note that this excludes submissions that, e.g., have required: [TLE, RTE].

Any submission that is not permitted to get TLE at all (on some subset of cases), i.e., must satisfy a permitted: rule that does not contain TLE, is used to provide a lower bound on the time limit. Precisely, the time limit must be at least T * ac_to_time_limit, where T is the slowest runtime over the set of test cases to which the rule applies.

To opt a (set of) submission(s) out of influencing the time limit, set use_for_time_limit: false alongside the permitted: and/or required: keys that satisfy the constraints above; this can also be done per glob, for an optional subset of test cases. Note that this means that if you want to exclude a submission completely, then you must add use_for_time_limit: false to every glob that matches that submission and would otherwise include it for determining the time limit.

To explicitly opt a (set of) submission(s) in to being used for determining the time limit, use use_for_time_limit: lower and use_for_time_limit: upper. The first is equivalent to a permitted: [AC, WA, RTE] constraint, and the second to a required: [TLE] constraint. The system may warn when this makes other constraints redundant and should report an error when it is inconsistent with other constraints.

It is required that at least one submission is used to lower bound the time limit.

Input Validators

Input validators, verifying the correctness of the input files, are provided in input_validators/. Input validators can be specified as VIVA files (with file ending .viva), Checktestdata files (with file ending .ctd), or as a program (as specified above).

All input validators provided will be run on every input file. Validation fails if any validator fails.

Invocation

An input validator program must be an application (executable or interpreted) capable of being invoked with a command line call.

All input validators provided will be run on every test data file using the arguments specified for the test data group they are part of. Validation fails if any validator fails.

When invoked, the input validator will get the input file on stdin.

The validator should be possible to use as follows on the command line:

<input_validator_program> [arguments] < inputfile

Here, arguments is the input_validator_args.

Output

The input validator may output debug information on stdout and stderr. This information may be displayed to the user upon invocation of the validator.

Exit codes

The input validator must exit with code 42 on successful validation. Any other exit code means that the input file could not be confirmed as valid.
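
As a non-normative sketch, a single-file Python 3 input validator for a hypothetical problem whose input is one line containing an integer n with 1 <= n <= 1000 could look like this:

#!/usr/bin/env python3
import re
import sys

def main() -> None:
    data = sys.stdin.buffer.read().decode("utf-8")
    # The input must be exactly one line: an integer without leading zeros,
    # terminated by a single newline.
    match = re.fullmatch(r"(0|[1-9][0-9]*)\n", data)
    if match is None:
        print("input is not a single integer line", file=sys.stderr)
        sys.exit(1)
    n = int(match.group(1))
    if not 1 <= n <= 1000:
        print(f"n = {n} is out of range", file=sys.stderr)
        sys.exit(1)
    sys.exit(42)  # exit code 42 signals successful validation

if __name__ == "__main__":
    main()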

Dependencies

The validator must not read any files outside those defined in the Invocation section. Its result must depend only on these files and the arguments.

Static Validator

Overview

A static validator is a program that is given the submission files as input and can analyze the contents to accept or reject the submission. Optionally, the static validator may assign a score to the submission for each validation test case. By default there is no static validator. A static validator may be provided under the static_validator directory, similar to a custom output validator.

Static Validation Test Cases

Each test group may define a static validation test case. It is an error to define static validation test cases without providing a static validator. A static validation test case is defined within a group's test_group.yaml file by specifying the key static_validation. If a map is specified, its allowed keys are:

  • args, which maps to a string representing the additional arguments passed to the static validator in this group's static validation test case;
  • score, the maximum score of the static validation test case (see Scoring Problems for details).

The static_validation key can also have the value false, meaning there is no static validation, or true, meaning that static validation is enabled with no additional arguments and an unspecified maximum score (to be determined by maximum score inference).

It is an error to provide a static validator for submit-answer type problems, or to specify a score in a test group with pass-fail aggregation.

Invocation

When invoked, the static validator will be passed at least three command line parameters.

The validator should be possible to use as follows on the command line:

<static_validator_program> language entry_point feedback_dir [additional_arguments]

The meanings of the parameters listed above are:

  • language: a string specifying the code of the language of the submission as shown in the languages table. A static validator must handle all of the programming languages specified in the languages key of problem.yaml.

  • entry_point: a string specifying the entry point, that is a filename, class name, or some other identifier, which the static validator should know how to use depending on the language of the submission.

  • feedback_dir: a string which specifies the name of a "feedback directory" in which the validator can produce "feedback files" in order to report additional information on the validation of the submission. The feedback_dir must end with a path separator (typically '/' or '\' depending on the operating system), so that simply appending a filename to feedback_dir gives the path to a file in the feedback directory.

  • additional_arguments: in case the static validation test case specifies additional args, these are passed as additional arguments to the validator on the command line.

The static validator follows the semantics of an output validator for reporting a judgment.

Output Validator

Overview

Output validators are programs used to check that the output of a submission on a test case is correct. A trivial output validator could check that the submission output is equal to the answer file. The default validator does essentially this, and supports some other commonly useful options.

For problems that require more complex checks, you can create a custom output validator and provide it as a program (as specified above) in the directory output_validator/. If no custom output validator is specified, the default validator is used.

The subsections below explain how a (default or custom) output validator must be invoked and how it must report a judgement and optionally report additional feedback.

Default Output Validator Specification

The default output validator is essentially a beefed-up diff that can be used in the common case where the output validator needs only compare the output of a submitted program against a trusted judge reference solution. The default output validator supports the following command-line arguments:

| Argument | Description |
| --- | --- |
| case_sensitive | indicates that comparisons should be case-sensitive (see below for details). |
| space_change_sensitive | indicates that differences in type or amount of whitespace should be rejected (see below for details). |
| float_relative_tolerance ε | indicates that floating-point tokens should be accepted if they are within relative error ≤ ε (see below for details). |
| float_absolute_tolerance ε | indicates that floating-point tokens should be accepted if they are within absolute error ≤ ε (see below for details). |
| float_tolerance ε | short-hand for applying ε as both relative and absolute tolerance. |

Output Parsing

The default output validator parses the submission output and answer files as strings of single-byte characters and tokenizes each file by splitting on sequences of one or more consecutive whitespace characters. Whitespace characters are: space (0x20), form feed (0x0c), line feed (0x0a), carriage return (0x0d), horizontal tab (0x09), and vertical tab (0x0b). In its default mode, the default output validator then ignores the whitespace and compares the submission output and answer files token by token. If the two files disagree in the number of tokens, the validator rejects the submission. The validator may also reject any submission output that is unreasonably large (including due to containing unreasonable amounts of whitespace).
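
The tokenization described above can be sketched as follows in Python (an illustration of the semantics, not a reference implementation):

```
import re

# The six whitespace characters recognized by the default output validator:
# space, horizontal tab, line feed, carriage return, vertical tab, form feed.
WS = re.compile(rb"[ \t\n\r\x0b\x0c]+")

def tokenize(data: bytes) -> list:
    """Split file contents (treated as single-byte characters) into tokens."""
    return [tok for tok in WS.split(data) if tok]

def token_counts_match(team_output: bytes, answer: bytes) -> bool:
    return len(tokenize(team_output)) == len(tokenize(answer))
```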

Comparing Floating-Point Tokens

If a floating-point tolerance has been set, the default output validator will attempt to parse each answer file token as a floating-point number (see general requirements for details). For each success, the token is compared to the submission output token using the following floating-point semantics. (If no floating-point tolerance has been set, floating-point tokens are compared as strings.)

If the submission output token cannot be parsed as floating-point, the validator rejects the submission as incorrect. Otherwise, if s is the submission output floating-point value and a is the answer file value:

  • if an absolute tolerance ε has been set, the token is accepted if |s-a| ≤ ε;
  • if a relative tolerance ε has been set, the token is accepted if |s-a| ≤ ε|a|;
  • when supplying both a relative and an absolute tolerance, the token is accepted if it is within either of the two tolerances.

It is an error to provide any of the float_relative_tolerance, float_absolute_tolerance, or float_tolerance arguments more than once, or to provide a float_tolerance alongside float_relative_tolerance and/or float_absolute_tolerance.

Note that if a floating-point tolerance has been set, the default output validator will parse exact integers as floating-point and apply the above semantics to them. For problems containing a mix of integer and floating-point output, a custom output validator must be used if exact comparison of the integer tokens is required.
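
The per-token acceptance rule can be sketched as the following Python function, assuming at least one tolerance has been set (case handling and the token-count check are omitted):

```
def float_token_accepted(team_tok, answer_tok, rel_tol=None, abs_tol=None):
    """Tolerance semantics described above for a single pair of tokens."""
    try:
        a = float(answer_tok)
    except ValueError:
        # Answer token is not floating point: compare as strings instead.
        return team_tok == answer_tok
    try:
        s = float(team_tok)
    except ValueError:
        return False                 # team token must parse as floating point
    ok_abs = abs_tol is not None and abs(s - a) <= abs_tol
    ok_rel = rel_tol is not None and abs(s - a) <= rel_tol * abs(a)
    return ok_abs or ok_rel
```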

Comparing String Tokens

If case_sensitive is specified, the two tokens must match exactly, i.e. consist of exactly the same byte sequence.

Otherwise, the submitted output token is accepted if it matches the answer file token up to case. The default output validator treats the uppercase ASCII letters A–Z as equivalent to their lowercase counterparts.

Whitespace Sensitivity

By default, whitespace in the submission output is only used for tokenization and is otherwise ignored. As a consequence, the default output validator is lenient with regard to leading and trailing whitespace, and it treats any sequence of one or more whitespace characters in between tokens as equivalent to any other such sequence.

If the space_change_sensitive argument is set, the default output validator will instead reject any submission output whose whitespace sequences (including leading and trailing) differ from those in the answer file in type or amount.

Invocation

The output validator must be invoked and must support being invoked as:

<output_validator_program> input_file answer_file feedback_dir [additional_arguments] < team_output [ > team_input ]

The meanings of the parameters listed above are:

  • input_file: a string specifying the name of the input data file that was used to test the program whose results are being validated.

  • answer_file: a string specifying the name of an arbitrary "answer file" which acts as input to the validator program. The answer file may, but is not necessarily required to, contain the "correct answer" for the problem. For example, it might contain the output that was produced by a judge's solution for the problem when run with input_file as input. Alternatively, the "answer file" might contain information, in arbitrary format, which instructs the validator in some way about how to accomplish its task.

  • feedback_dir: a string which specifies the name of a "feedback directory" in which the validator can produce "feedback files" in order to report additional information on the validation of the output file. The feedback_dir must end with a path separator (typically '/' or '\' depending on the operating system), so that simply appending a filename to feedback_dir gives the path to a file in the feedback directory.

  • additional_arguments: in case output_validator_args are specified for the test case, these are passed as additional arguments to the validator on the command line.

  • team_output: the output produced by the program being validated is given on the validator's standard input.

  • team_input: when running the validator in interactive mode everything written on the validator's standard output is given to the program being validated. Please note that when running interactively the program will only receive the output produced by the validator and will not have direct access to the input file.

The two files named by input_file and answer_file must exist (though they are allowed to be empty) and the validator program must be allowed to open them for reading. The directory named by feedback_dir must also exist and the validator program must be allowed to create and write to new and existing files there.

Reporting a judgement

A validator program must report its judgement by exiting with specific exit codes:

  • If the output is a correct output for the input file (i.e., the submission that produced the output is to be Accepted), the validator must exit with exit code 42.
  • If the output is incorrect (i.e., the submission that produced the output is to be judged as Wrong Answer), the validator must exit with exit code 43.

Any other exit code, including 0, indicates that the validator did not operate properly, and the judging system invoking the validator must take measures to report this to contest personnel. The purpose of these somewhat exotic exit codes is to avoid conflicts with the exit codes that result when the validator crashes. For instance, if the validator is written in Java, any unhandled exception results in the program crashing with an exit code of 1, making it unsuitable to assign a judgement meaning to this exit code.
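
Putting the invocation and exit-code conventions together, a bare-bones custom output validator might look like the following Python sketch; the token-equality check stands in for whatever problem-specific check is actually required.

```
#!/usr/bin/env python3
# Bare-bones custom output validator sketch. Invoked as:
#   <output_validator_program> input_file answer_file feedback_dir [args] < team_output
import sys

ACCEPTED, WRONG_ANSWER = 42, 43

def main():
    input_file, answer_file, feedback_dir = sys.argv[1], sys.argv[2], sys.argv[3]
    # additional_arguments, if any, are available in sys.argv[4:]

    with open(answer_file) as f:
        answer_tokens = f.read().split()
    team_tokens = sys.stdin.read().split()

    if team_tokens == answer_tokens:
        sys.exit(ACCEPTED)

    # Optional feedback for the judges; feedback_dir already ends with a separator.
    with open(feedback_dir + "judgemessage.txt", "w") as msg:
        msg.write("token mismatch between team output and answer file\n")
    sys.exit(WRONG_ANSWER)

if __name__ == "__main__":
    main()
```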

Reporting Additional Feedback

The purpose of the feedback directory is to allow the validator program to report more information to the judging system than just the accept/reject verdict. Using the feedback directory is optional for a validator program, so if one just wants to write a bare-bones minimal validator, it can be ignored.

The validator is free to create different files in the feedback directory, in order to provide different kinds of information to the judging system, in a simple but organized way. The following files have special meaning and are described below:

  • nextpass.in may be present in a multi-pass problem to indicate another pass follows
  • score.txt may be present in a scoring problem
  • score_multiplier.txt may be present in a scoring problem
  • judgemessage.txt may contain feedback for the judges
  • teammessage.txt may contain feedback for the team
  • judgeimage.<ext> may contain graphical feedback for the judges
  • teamimage.<ext> may contain graphical feedback for the team

The contents of judgemessage.txt give a message that is presented to a judge reviewing the current submission (typically used to help the judge verify why the submission was judged as incorrect, by specifying exactly what was wrong with its output). Other examples of files that may be useful in some contexts (though not in the ICPC) are a score.txt file, giving the submission a score based on factors other than correctness, or a teammessage.txt file, giving a message to the team that submitted the solution, providing additional feedback on the submission.

A judging system that implements this format must support the judgemessage.txt file described above (i.e., the content of the judgemessage.txt file, if produced by the validator, must be provided by the judging system to a human judge examining the submission). Support for the other files is optional.

The validator may create one or more image files in the feedback directory with the name teamimage.<ext> and/or judgeimage.<ext>, where <ext> is one of: png, jpg, jpeg, or svg. The output visualizer may modify or create these files as well, and the output validator may create files in the feedback directory containing metadata that helps the visualizer in this task. The intent is for the teamimage to be displayed to teams and for the judgeimage to be used as a debugging aid by judges, but the judging system may display or ignore these files as it sees fit.

Note that a validator may choose to ignore the feedback directory entirely. In particular, the judging system must not assume that the validator program creates any files there at all.

Multi-pass validation

A multi-pass validator can be used for problems where the submission should be run multiple times sequentially, each time using new input generated by the output validator during the previous invocation of the submission.

The time and memory limits apply to each invocation separately.

To signal that the submission should be run again, the output validator must exit with code 42 and write the new input to the file nextpass.in in the feedback directory. Judging stops if no nextpass.in was created or if the output validator exited with any other code. Note that nextpass.in will be removed before the next pass.

It is a judge error to create the nextpass.in file and exit with an exit code other than 42. It is a judge error to run more passes than specified by the limits.validation_passes value in problem.yaml.

All other files inside the feedback directory are guaranteed to persist between passes. In particular, the validator should only append text to judgemessage.txt, so that it provides combined feedback for all passes.
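
A hedged sketch of this convention: a Python helper that accepts the current pass and, when given input for another pass, writes it to nextpass.in before exiting with code 42. How the validator decides whether another pass is needed is problem specific.

```
import sys

def finish_pass(feedback_dir, next_input=None):
    """Accept the current pass; request another pass iff next_input is given."""
    if next_input is not None:
        with open(feedback_dir + "nextpass.in", "w") as f:
            f.write(next_input)
    # Append rather than overwrite, so feedback from earlier passes is kept.
    with open(feedback_dir + "judgemessage.txt", "a") as msg:
        msg.write("pass finished\n")
    sys.exit(42)
```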

Examples

An example of a judgemessage.txt file:

Team failed at test case 14.
Team output: "31", Judge answer: "30".
Team failed at test case 18.
Team output: "hovercraft", Judge answer: "7".
Summary: 2 test cases failed.

An example of a teammessage.txt file:

Almost all test cases failed — are you even trying to solve the problem?

Validator standard error

A validator program is allowed to write any kind of debug information to its standard error pipe. This information may be displayed to the user upon invocation of the validator.

Input Visualizer

If a tool was used to automate creating test case illustration annotations, it is recommended to include the input visualizer source code in the directory input_visualizer/ along with invocation instructions in a file such as input_visualizer/README.txt.

Output Visualizer

An output visualizer is an optional program that is run after every invocation of the output validator in order to generate images illustrating the submission output. A visualizer program must be an application (executable or interpreted) capable of being invoked with a command line call. It is invoked using the same arguments as the output validator. It must be provided as a program (as specified above) in the directory output_visualizer/.

All files written to the feedback directory by the output validator are accessible to the visualizer. The visualizer may overwrite or create image files in the feedback directory with the name teamimage.<ext> or judgeimage.<ext>, where <ext> is one of: png, jpg, jpeg, or svg. It must not write to score.txt, teammessage.txt, or any other file in the feedback directory other than those of the form teamimage.<ext> or judgeimage.<ext>.

Compile or run-time errors in the visualizer are not judge errors. The return value and any data written by the visualizer to standard error or standard output are ignored.

Verdict/Score Aggregation

Pass-Fail Problems

For pass-fail problems, a submission's verdict is Accepted if and only if every test case in the sample group and in the secret group and its subgroups is accepted.

Scoring Problems

For scoring problems, submissions are given a non-negative score instead of a verdict. The goal of each submission is to maximize this score. Only the secret group and its subgroups are scored.

Given a submission, scores are determined for test cases, test groups, and the submission itself (which is the score of the secret group). The scoring behavior is configured for each test data group by the following arguments in the scoring dictionary of its test_group.yaml:

| Key | Type | Description |
| --- | --- | --- |
| score | String | The maximum possible score of the test data group. Must be a non-negative integer or unbounded. |
| aggregation | pass-fail, sum, or min | How the score of the test data group is determined based on the scores of the subgroups and test cases. See below. |
| require_pass | String or sequence of strings | Other test cases or groups whose test cases a submission must pass in order to receive a score for this test group. See below. |

The default value of aggregation is sum for the secret group and pass-fail for its subgroups.
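
For example, a subgroup of secret could be configured in its test_group.yaml as follows (the values are hypothetical):

```
scoring:
  score: 40
  aggregation: min
```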

Maximum Score Inference

The secret group, and every subgroup and test case in a group with sum or min aggregation, have a maximum possible score. The secret group's score may be any positive integer or unbounded. Subgroups of secret may only have unbounded maximum score if secret is unbounded. The default value of score for the secret group is 100.

The default score for subgroups and test cases of parent groups with sum or min aggregation is inferred from the score value of the parent group and its children:

| Parent Group Maximum Score | Aggregation Type | Default Maximum Score of Test Case / Subgroup |
| --- | --- | --- |
| unbounded | sum or min | unbounded |
| bounded value M | sum | (M - S)/(A + T) |
| bounded value M | min | M |

where T is the number of non-static-validation test cases in the group, A is the number of subgroups and static validation test cases without a provided score, and S is the sum of the maximum scores of the group's remaining subgroups and static validation test cases. This formula distributes a group's leftover maximum points evenly among its test cases and its subgroups and static validation test cases with unspecified maximum score. It is a judge error if S > M for a group with bounded maximum score and sum aggregation.
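
For example, with hypothetical numbers: a group with sum aggregation and a bounded maximum score of M = 100 that contains one subgroup with a specified maximum score of 40 (so S = 40), one subgroup without a specified score (A = 1), and T = 2 test cases distributes the leftover 100 - 40 = 60 points evenly, so the unspecified subgroup and each of the two test cases default to a maximum score of 60 / (1 + 2) = 20.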

Scoring Test Cases

Only test cases in test case groups with sum or min aggregation receive a score.

The score of a failed test case is always 0.

A custom output validator or static validator may produce a score.txt or score_multiplier.txt file for an accepted test case:

  • for test cases with bounded maximum score, score_multiplier.txt, if produced, must contain a single floating-point number in the range [0,1]. The score of the test case is this number multiplied by the test case maximum score.
  • for test cases with bounded maximum score, score.txt, if produced, must contain a single non-negative floating-point number. The score of the test case is that number.
  • for test cases with bounded maximum score, if no score_multiplier.txt or score.txt is produced, the test case score is its maximum score.
  • for test cases with unbounded maximum score, score.txt must be produced and must contain a non-negative floating-point number. The score of the test case is that number.

It is a judge error if:

  • an output or static validator accepts a test case in an unbounded group and does not produce a score.txt;
  • an output or static validator does not accept a test case, but does produce a score.txt or a score_multiplier.txt;
  • an output or static validator produces a score_multiplier.txt for a test case with unbounded maximum score;
  • an output or static validator produces both a score.txt and a score_multiplier.txt for a test case;
  • an output or static validator produces a score.txt or score_multiplier.txt for a test case in a group with pass-fail aggregation;
  • an output or static validator produces a score.txt or score_multiplier.txt with invalid contents.
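
The rules above amount to the following computation for an accepted test case (a Python sketch; an unbounded maximum score is represented here as None):

```
def accepted_case_score(max_score, score_txt=None, score_multiplier_txt=None):
    """Score of an accepted test case; max_score is None when unbounded.
    score_txt / score_multiplier_txt hold the parsed file contents, or None
    if the corresponding file was not produced."""
    if max_score is None:
        # Unbounded: score.txt is mandatory, score_multiplier.txt is a judge error.
        assert score_txt is not None and score_multiplier_txt is None
        return score_txt
    if score_multiplier_txt is not None:       # must lie in [0, 1]
        return score_multiplier_txt * max_score
    if score_txt is not None:
        return score_txt
    return max_score                           # neither file produced
```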

Scoring Test Groups

The score of a test group is determined by its subgroups and test cases. If it has no subgroups or test cases, then its score is 0. Otherwise, the score depends on the aggregation mode, which is either pass-fail, sum, or min.

  • If a group uses pass-fail aggregation, the group must have bounded maximum score and all subgroups must also use pass-fail aggregation. If the submission receives an accept verdict for all test cases in the group and its subgroups, the score of the group is equal to its maximum possible score. Otherwise the group score is 0.
  • If a group uses sum aggregation, the group score is the sum of the scores of its test cases and subgroups.
  • If a group uses min aggregation, then the group score is the minimum of these scores.

The submission score is the score of the secret group.

It is a judge error if the score of any group or subgroup exceeds its maximum score.
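
A sketch of the aggregation rules in Python, assuming test case scores and verdicts have already been computed and each group is represented as a nested structure (the data model is hypothetical):

```
from dataclasses import dataclass, field

@dataclass
class Group:
    aggregation: str                  # "pass-fail", "sum", or "min"
    max_score: float                  # pass-fail groups must have a bounded maximum
    case_scores: list = field(default_factory=list)  # this group's own test cases (0 for failed)
    all_cases_accepted: bool = True   # covers the group's and its subgroups' test cases
    subgroups: list = field(default_factory=list)

def group_score(g: Group) -> float:
    parts = g.case_scores + [group_score(sub) for sub in g.subgroups]
    if not parts:
        return 0.0                    # no subgroups or test cases
    if g.aggregation == "pass-fail":
        return g.max_score if g.all_cases_accepted else 0.0
    if g.aggregation == "sum":
        return sum(parts)
    return min(parts)                 # "min" aggregation
```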

Required Dependent Groups

A group may specify that it should only be scored if a submission is accepted for another test case or all test cases in another test data group and their dependencies. Otherwise, none of the group's test cases are judged and the group score is 0.

The paths of these required test cases or groups, relative to the data/ folder, are listed under the require_pass key. The path of a group, relative to the data/ folder, must come later lexicographically than the paths of all test cases and groups it depends on.

Each required group must be either sample or a subgroup of secret with pass-fail aggregation.
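
For example (group names are hypothetical), a group data/secret/group2 could require that all test cases of secret/group1 pass before it is scored; note that secret/group2 sorts lexicographically after secret/group1, as required:

```
scoring:
  require_pass: secret/group1
```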