Problem Package Format
This is the `2023-07-draft` version of the Kattis problem package format.
Overview
This document describes the format of a Kattis problem package, used for distributing and sharing problems for algorithmic programming contests as well as educational use.
General Requirements
- The package must consist of a single directory containing files as described below. The directory name must consist solely of lowercase letters a–z and digits 0–9. Alternatively, the package can be a ZIP-compressed archive of such a directory with identical base name and extension `.kpp` or `.zip`.
- All file names for files included in the package must match the regexp `^[a-zA-Z0-9][a-zA-Z0-9_.-]{0,253}[a-zA-Z0-9]$`, i.e., they must be of length at least 2, at most 255, consist solely of lower- or uppercase letters a–z, A–Z, digits 0–9, period, dash, or underscore, but must not begin or end with a period, dash, or underscore.
- All text files for a problem must be UTF-8 encoded and not have a byte-order mark (BOM).
- All text files must have Unix-style line endings (newline/LF byte only). Note that LF is line-ending and not line-separating in POSIX, which means that all non-empty text files must end with a newline.
- Natural language (for example, in the problem statement filename) must be specified as a 2-letter ISO 639-1 code if it exists, otherwise as a 3-letter code from ISO 639. Optionally, it may be suffixed with a hyphen and an ISO 3166-1 alpha-2 code, as defined in BCP 47, for example, `pt-BR` to indicate Brazilian Portuguese.
- All floating-point numbers (in any files, including in the problem package or submission output, that are parsed by contest system tools or the default output validator) must be given in decimal and may use scientific notation. More specifically, floating-point numbers must satisfy the following grammar, which accepts formatted floating-point output from most major programming languages:
      sign          [+-]
      digit         [0123456789]
      expIndicator  [Ee]
      significand   ( {digit}* "." {digit}+ | {digit}+ "." | {digit}+ )
      exponent      {expIndicator} {sign}? {digit}+
      float         {sign}? {significand} {exponent}?
- The systems parsing these floating-point numbers may impose reasonable limits on the number of digits, but must support at least 30 digits before and after the decimal point. They must use an internal representation with at least 52 bits of mantissa precision and should make a "best effort" to parse floating-point numbers to their closest representable values.
- The problem package may include symbolic links to other files in the problem package. Symlinks must not have targets outside the problem package directory tree.
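The file-name regexp and the floating-point grammar above translate directly into regular expressions. The following is a minimal sketch in Python, not part of the specification; the function names `is_valid_filename` and `is_valid_float` are hypothetical, and the sketch checks syntax only (it does not enforce digit-count limits or precision).

```python
import re

# File-name rule, copied from the General Requirements list.
FILENAME_RE = re.compile(r"^[a-zA-Z0-9][a-zA-Z0-9_.-]{0,253}[a-zA-Z0-9]$")

# Floating-point grammar, transcribed rule by rule.
SIGN = r"[+-]"
DIGIT = r"[0-9]"
SIGNIFICAND = rf"(?:{DIGIT}*\.{DIGIT}+|{DIGIT}+\.|{DIGIT}+)"
EXPONENT = rf"[Ee]{SIGN}?{DIGIT}+"
FLOAT_RE = re.compile(rf"{SIGN}?{SIGNIFICAND}(?:{EXPONENT})?")


def is_valid_filename(name: str) -> bool:
    """True if `name` satisfies the package file-name rule."""
    return FILENAME_RE.fullmatch(name) is not None


def is_valid_float(token: str) -> bool:
    """True if `token` is derivable from the `float` grammar rule."""
    return FLOAT_RE.fullmatch(token) is not None
```

Under this sketch, `problem.yaml` is an acceptable file name while `.hidden` is not, and `+1.5E-3`, `.5`, and `5.` are accepted as floating-point tokens while `1e` and `inf` are rejected.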
Problem Package Structure Overview
The following table summarizes the elements of a problem package described in this specification:
File or Folder | Required? | Described In | Description |
---|---|---|---|
problem.yaml | Yes | Problem Metadata | Metadata about the problem (e.g., source, license, limits) |
statement/ | Yes | Problem Statements | Problem statement files |
attachments/ | No | Attachments | Files available to problem-solvers other than the problem statement and sample test data |
solution/ | No | Solution Description | Written explanations of how to solve the problem |
data/sample/ | No | Test Data | Sample test data |
data/secret/ | Yes | Test Data | Secret test data |
data/invalid_input/ | No | Invalid Test Cases | Invalid test case input for testing input validation |
data/invalid_output/ | No | Invalid Test Cases | Invalid test case output for testing output validation |
generators/ | No | Generators | Scripts and documentation about how test cases were automatically generated |
include/ | No | Included Files | Files appended to all submitted solutions |
submissions/ | Yes | Example Submissions | Correct and incorrect judge solutions of the problem |
input_validators/ | Yes | Input Validators | Programs that verify the correctness of the test data inputs |
input_visualizer/ | No | Input Visualizer | Scripts and documentation about how test case illustrations were generated |
output_validator/ | No | Output Validator | Custom program for judging solutions |
output_visualizer/ | No | Output Visualizer | Program to generate images illustrating submission output |
static_validator/ | No | Static Validator | Custom program for judging solutions with source files as input |
A minimal problem package must contain `problem.yaml`, a problem statement, a secret test case, an accepted judge solution, and an input validator.
Programs
There are a number of different kinds of programs that may be provided in the problem package: submissions, input validators, output validators, and output visualizers. All programs are always represented by a single file or directory. In other words, if a program consists of several files, these must be provided in a single directory. In the case that a program is a single file, it is treated as if a directory with the same name takes its place, which contains only that file. The name of the program, for the purpose of referring to it within the package, is the base name of the file or the name of the directory. There can't be two programs of the same kind with the same name.
Submissions
The language of a submission program is determined by the `language` key in `submissions.yaml` if present; otherwise, by comparing the file extensions of the submission file(s) to those specified in the languages table. If a single language can't be determined, building fails. Included files, if any, must be copied into the submission folder before building the submission.
For languages where there could be several entry points, the entry point specified by the `entrypoint` key in `submissions.yaml` is used if present; otherwise, the default entry point in the languages table is used.
Each submission must be run with a working directory that contains (a copy of) the submitted files, any compiled binaries and compilation byproducts, as well as any included files, and nothing else. In particular the working directory must not contain any of the test data files.
Other Programs
Other programs (validators and visualizers) provided as a directory may include one or both of two POSIX-compliant shell scripts, `build` and `run`. If at least one of these two files is included:
- First, if the `build` script is present, it must be executable and will be run. The working directory will be (a copy of) the program directory. The `run` file must exist in that directory and be executable after `build` is done.
- Then, the `run` file (which now exists, and is an executable binary or POSIX-compliant shell script) will be used as the validator or visualizer program.
Scripts may assume that a C and C++ compiler are available on the system search path, aliased to `cc` and `cpp` respectively.
Programs that do not include a `build` or `run` script must have one of the following forms:
- a single Python 3 file.
- a directory containing multiple Python 3 source files, two of which are `__init__.py` (defining a module) and `__main__.py` (which will be used as the program entry point).
- a single C or C++ source file, or a directory containing one or more such files.
The language of files is inferred from their extension as listed in the languages table.
Each input validator must be run with a working directory that contains the files in the program directory of the input validator in question.
Each output validator, output visualizer, and static validator must be run with a working directory that contains the submitted files and any compiled binaries of the submission being validated. (Note in particular that this working directory does not contain the files in the program directory of the output validator, output visualizer or static validator.)
Problem Metadata
Metadata about the problem (e.g., source, license, limits) are provided in a YAML file named `problem.yaml` placed in the root directory of the package.
The keys are defined as below. Keys are optional unless explicitly stated. Any unknown keys should be treated as an error.
Key | Type | Required | Default |
---|---|---|---|
problem_format_version | String | Yes | |
type | String or sequence of strings | No | pass-fail |
name | String or map of strings | Yes | |
uuid | String | Yes | |
version | String | No | |
credits | String or map with keys as defined below | No | |
source | String, a sequence, or a map as defined below | No | |
license | String | No | unknown |
rights_owner | String | See below | See below |
limits | Map with keys as defined below | No | See below |
keywords | Sequence of strings | No | |
languages | String or sequence of strings | No | all |
constants | Map of strings to int, float, or string | No |
Problem format version
Version of the Problem Package Format used for this package. If using this version of the Format, it must be the string `2023-07-draft`. The string will be in the form `<yyyy>-<mm>` for a stable version, `<yyyy>-<mm>-draft` or `draft` for a draft version, or `legacy` or `legacy-icpc` for the version before the addition of `problem_format_version`. Documentation for version `<version>` is available at `https://www.kattis.com/problem-package-format/spec/<version>`.
Type
Type of problem. Must be either a single string or a sequence of strings, from the table below, with no repetition. Two values listed as incompatible must not both be in the sequence.
Value | Incompatible with | Comments |
---|---|---|
pass-fail | scoring | Default. Submissions are judged as either accepted or rejected (though the "rejected" judgement is more fine-grained and divided into results such as "Wrong Answer", "Time Limit Exceeded", etc). |
scoring | pass-fail | An accepted submission is additionally given a score, which is a non-negative numeric value (and the goal is to maximize this value). |
multi-pass | submit-answer | A submission should run multiple times with inputs for the next pass generated by the output validator of the current pass. |
interactive | submit-answer | The output validator is run interactively with the submission. |
submit-answer | multi-pass , interactive | A submission consists of the answers to the test cases instead of source code for a program that produces the answers. |
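The rules for `type` (known values, no repetition, no incompatible pair) can be checked mechanically. The sketch below is one possible way to do so; the function name `check_problem_type` is hypothetical, and how an empty sequence should be treated is not stated above, so the sketch does not special-case it.

```python
# Values and incompatibilities transcribed from the table above.
KNOWN_TYPES = {"pass-fail", "scoring", "multi-pass", "interactive", "submit-answer"}
INCOMPATIBLE = [
    {"pass-fail", "scoring"},
    {"multi-pass", "submit-answer"},
    {"interactive", "submit-answer"},
]


def check_problem_type(value="pass-fail"):
    """Return the types as a list, or raise ValueError if the value is invalid."""
    # A single string is treated like a one-element sequence.
    types = [value] if isinstance(value, str) else list(value)
    unknown = [t for t in types if t not in KNOWN_TYPES]
    if unknown:
        raise ValueError(f"unknown type(s): {unknown}")
    if len(set(types)) != len(types):
        raise ValueError("types must not be repeated")
    for pair in INCOMPATIBLE:
        if pair <= set(types):
            raise ValueError(f"incompatible types: {sorted(pair)}")
    return types
```

For example, `["scoring", "interactive"]` passes this check, while `["pass-fail", "scoring"]` is rejected as an incompatible pair.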
Name
The name of the problem in each language for which a problem statement exists. If there are statements in more than one language, the `name` field must be a map with the language codes as keys and the problem names as values. The set of languages for which `name` is given must exactly match the set of languages for which a problem statement exists.
A deliberately complex and contrived example:
name:
  en: Hello World!
  pt-BR: Olá mundo!
  pt-PT: Oi mundo!
  fil: Kumusta mundo!
If only a single problem statement exists, this may be a string with the name of the problem in that language (but a map with a single key is allowed).
So the following example implies that only an English language problem statement exists:
name: Hello World!
UUID and version
The `uuid` is meant to track a problem, even if its package name and/or `name` changes. For example, it can be used to identify the existing problem to update in an online problem archive and not accidentally upload it as a new one. The intention is that a new `uuid` should be assigned if the problem significantly changes.
The `version` is meant for tracking (slightly) evolving versions of a problem, possibly during development, but also to track fixes to it. This can be used to check whether a problem uploaded to a contest system needs to be updated since it does not contain the latest fixes.
This specification currently does not attach any further semantic meaning to these fields.
Credits
Map specifying who should get credits for creating this problem. A person is specified as a string with the full name, optionally followed by an email wrapped in `<>` (e.g., `Full Name` or `Full Name <fullname@problem.example>`).
Key | Type | Comments |
---|---|---|
authors | Person or sequence of persons | The people who conceptualized the problem. |
contributors | Person or sequence of persons | The people who developed the problem package, such as the statement, validators, and test data. |
testers | Person or sequence of persons | The people who tested the problem package, for example, by providing a solution and reviewing the statement. |
translators | Map of strings to sequences of persons | The people who translated the statement to other languages. Each key must be a language code as described in General Requirements. |
packagers | Person or sequence of persons | The people who created the problem package out of an existing problem. |
acknowledgements | Person or sequence of persons | Extra acknowledgements or special thanks in addition to the previously mentioned. |
A full example would be
credits:
  authors: Authy McAuth <authy@mcauth.example>
  contributors:
    - Authy McAuth <authy@mcauth.example>
    - Additional Contributor <extra@contrib.example>
  testers:
    - Tester One
    - Tester Two
    - Tester Three
  translators:
    da: Mads Jensen <mads@mads.example>
    eo: Ludoviko Lazaro Zamenhofo
  acknowledgements:
    - Inspirational Speaker 1
    - Inspirational Speaker 2
  packagers:
    - Package Creatorson
which demonstrates all the available credit types.
Credits are sometimes omitted when authors instead choose to only give source credit, but both may be specified. If a string is provided instead of a map for credits, such as
credits: Authy McAuth <authy@mcauth.example>
it is treated as if only a single author is being specified, so it is equivalent to
credits:
  authors: Authy McAuth <authy@mcauth.example>
to support a less verbose credits section.
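A tool reading `credits` might first normalize the string shorthand into the map form and then split each person into name and email. The following is a small sketch under that assumption; `parse_person` and `normalize_credits` are hypothetical helper names, and the person pattern is only one reasonable reading of "full name, optionally followed by an email wrapped in `<>`".

```python
import re

# "Full Name" optionally followed by "<email>".
PERSON_RE = re.compile(r"^(?P<name>[^<>]+?)(?:\s*<(?P<email>[^<>]+)>)?$")


def parse_person(text):
    """Split a person string into (name, email); email is None if absent."""
    match = PERSON_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a valid person: {text!r}")
    return match.group("name"), match.group("email")


def normalize_credits(credits):
    """Expand the string shorthand into the equivalent map with an `authors` key."""
    if isinstance(credits, str):
        return {"authors": credits}
    return dict(credits)
```

With this sketch, `normalize_credits("Authy McAuth <authy@mcauth.example>")` yields a map whose `authors` value is the original string, and `parse_person` on that value returns the name and the email separately.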
Source
The `source` key contains one or more source entries that this problem originates from. Each entry consists of either a map with keys `name` and `url`, where `name` is required but `url` is optional, or alternatively a string with value equivalent to that of the `name` key. If there is only a single source entry, it can be specified directly as the value of `source`; otherwise `source` contains a list with all entries.
The `name` should typically contain the name (and year) of the problem set (such as a contest or a course) where the problem was first used or for which it was created, and the key `url` should map to a link to the event's page.
The following are valid examples:
source:
  name: NWERC 2024
  url: https://2024.nwerc.example/contest
which without `url` can be shortened to
source: NWERC 2024
A more extensive example:
source:
  - name: NWERC 2024
    url: https://2024.nwerc.example/contest
  - SWERC 2024
  - name: SEERC 2024
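Because `source` accepts a string, a map, or a list mixing both, consumers may find it convenient to normalize it into a list of maps. The sketch below assumes the YAML has already been parsed into Python values; `normalize_source` is a hypothetical name, not something defined by this specification.

```python
def normalize_source(source):
    """Return a list of {"name": ..., optional "url": ...} maps."""
    # A single entry may be given directly instead of as a list.
    entries = source if isinstance(source, list) else [source]
    normalized = []
    for entry in entries:
        if isinstance(entry, str):
            # A bare string is equivalent to a map with only `name`.
            normalized.append({"name": entry})
        elif isinstance(entry, dict) and "name" in entry and set(entry) <= {"name", "url"}:
            normalized.append(dict(entry))
        else:
            raise ValueError(f"invalid source entry: {entry!r}")
    return normalized
```

Applied to the extensive example above, this returns three maps, of which only the first carries a `url`.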
License
License under which the problem may be used. Must be one of the values below.
Value | Comments | Link |
---|---|---|
unknown | The default value. In practice means that the problem cannot be used. | |
public domain | There are no known copyrights on the problem, anywhere in the world. | http://creativecommons.org/about/pdm |
cc0 | CC0, "no rights reserved", version 1 or later. | https://creativecommons.org/publicdomain/zero/1.0/ |
cc by | CC attribution license, version 4 or later. | http://creativecommons.org/licenses/by/4.0/ |
cc by-sa | CC attribution, share alike license, version 4 or later. | http://creativecommons.org/licenses/by-sa/4.0/ |
educational | May be freely used for educational purposes. | |
permission | Used with permission. The rights owner must be contacted for every additional use. |
`rights_owner` is the owner of the copyright of the problem. License values other than `unknown` or `public domain` require `rights_owner` to have a value. `rights_owner` defaults to `credits.authors`, if present, otherwise to the value of `source`.
Limits
Time, memory, and other limits to be imposed on submissions. A map with the following keys:
Key | Comments | Default | Typical system default |
---|---|---|---|
time_multipliers | optional | see below | |
time_limit | optional float, in seconds | see below | |
time_resolution | optional float, in seconds | 1.0 | |
memory | optional, in MiB | system default | 2048 |
output | optional, in MiB | system default | 8 |
code | optional, in KiB | system default | 128 |
compilation_time | optional, in seconds | system default | 60 |
compilation_memory | optional, in MiB | system default | 2048 |
validation_time | optional, in seconds | system default | 60 |
validation_memory | optional, in MiB | system default | 2048 |
validation_output | optional, in MiB | system default | 8 |
validation_passes | optional | 2 |
For most keys, the system default will be used if nothing is specified. This can vary, but you should assume that it's reasonable. Only specify limits when the problem needs a specific limit, but do specify limits even if the "typical system default" is what is needed.
Problem Timing
`time_multipliers` is a map with the following keys:
Key | Comments | Default |
---|---|---|
ac_to_time_limit | float | 2.0 |
time_limit_to_tle | float | 1.5 |
The value of `time_limit` is an integer or floating-point problem time limit in seconds. The time multipliers specify safety margins relative to the slowest accepted submission, `T_ac`, and fastest time_limit_exceeded submission, `T_tle`. The `time_limit` must satisfy `T_ac * ac_to_time_limit <= time_limit` and `time_limit * time_limit_to_tle <= T_tle`. In these calculations, `T_tle` is treated as infinity if the problem does not provide at least one time_limit_exceeded submission.
If no `time_limit` is provided, the default value is the smallest integer multiple of `time_resolution` that satisfies the above inequalities. It is an error if no such multiple exists. The `time_resolution` key is ignored if the problem provides an explicit time limit (and in particular, the time limit is not required to be a multiple of the resolution). Since time multipliers are more future-proof than absolute time limits, avoid specifying `time_limit` whenever practical.
Judge systems should make a best effort to respect the problem time limit, and should warn when importing a problem whose time limit is specified with precision greater than can be resolved by system timers.
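As an illustration of the default time limit rule, the sketch below computes the smallest multiple of `time_resolution` that satisfies both inequalities. The function name `default_time_limit` is hypothetical, the sketch assumes `T_ac` is positive, and it uses plain floats; a real implementation may prefer exact arithmetic to avoid rounding surprises.

```python
import math


def default_time_limit(t_ac, t_tle=math.inf, ac_to_time_limit=2.0,
                       time_limit_to_tle=1.5, time_resolution=1.0):
    """Smallest multiple of time_resolution meeting both safety margins."""
    lower = t_ac * ac_to_time_limit      # time_limit must be at least this
    upper = t_tle / time_limit_to_tle    # and at most this
    multiples = math.ceil(lower / time_resolution)
    limit = multiples * time_resolution
    if limit > upper:
        raise ValueError("no multiple of time_resolution satisfies both inequalities")
    return limit
```

With the default multipliers, a slowest accepted submission of 0.8 seconds and a fastest time_limit_exceeded submission of 5.0 seconds give a lower bound of 1.6 and an upper bound of about 3.33, so the default limit is 2.0 seconds. With 1.2 and 3.0 seconds instead, the bounds are 2.4 and 2.0, no multiple of 1.0 fits, and the sketch raises an error.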
Keywords
List of keywords describing the problem. Keywords should not contain spaces.
Languages
List of programming language codes from the languages table or the string `all`. If the value is not `all`, the problem may only be solved using the listed programming languages.
File endings in parentheses are not used for determining language.
Constants
Global constant values used by the problem, specified by a map of names to values. Names must match the following regex: `[a-zA-Z_][a-zA-Z0-9_]*`. Constant sequences are tokens (regex words) of the form `{{name}}`, where `name` is one of the names defined in `constants`. Tags `{{xyz}}` containing a name that is not defined are not modified but may be warned for.
All constant sequences in the following files will be replaced by the value of the corresponding constant:
- problem statements
- input and output validators
- included code
- example submissions
- `testdata.yaml`
Constant sequences are not replaced in test data files or in `problem.yaml` itself.
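The replacement rule can be sketched with a single regular expression substitution. This is an illustrative sketch only: `substitute_constants` is a hypothetical helper, values are converted with `str`, and undefined names are collected so that a tool could warn about them.

```python
import re

# {{name}} where name matches the regex required for constant names.
TAG_RE = re.compile(r"\{\{([a-zA-Z_][a-zA-Z0-9_]*)\}\}")


def substitute_constants(text, constants):
    """Return (text with defined constants replaced, list of undefined names)."""
    undefined = []

    def replace(match):
        name = match.group(1)
        if name in constants:
            return str(constants[name])
        # Undefined tags are left unmodified; record them for a warning.
        undefined.append(name)
        return match.group(0)

    return TAG_RE.sub(replace, text), undefined
```

For instance, with `constants` equal to `{"max_n": 1000}`, the text `1 <= n <= {{max_n}} and {{xyz}}` becomes `1 <= n <= 1000 and {{xyz}}`, and `xyz` is reported as undefined.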
Problem Statements
The problem statement of the problem is provided in the directory `statement/`.
This directory must contain one file per language, for at least one language, named `problem.<language>.<filetype>`, that contains the problem text itself, including input and output specifications. Here, `<language>` is a language code as described in General Requirements. Optionally, the language code can be left out; the default is then English (`en`). Filetype can be one of `.tex` for LaTeX files, `.md` for Markdown, or `.pdf` for PDF.
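A tool listing the available statements might parse these file names as follows. The sketch is an assumption-laden illustration: `parse_statement_filename` is a hypothetical name, and the language pattern only checks the shape of a code (two or three lowercase letters, optionally followed by a hyphen and two uppercase letters), not whether it is a registered ISO code.

```python
import re

STATEMENT_RE = re.compile(
    r"problem"
    r"(?:\.(?P<language>[a-z]{2,3}(?:-[A-Z]{2})?))?"  # optional language code
    r"\.(?P<filetype>tex|md|pdf)"
)


def parse_statement_filename(filename):
    """Return (language, filetype), or None if this is not a statement file."""
    match = STATEMENT_RE.fullmatch(filename)
    if match is None:
        return None
    # The language code may be left out; the default is then English.
    return match.group("language") or "en", match.group("filetype")
```

Under this sketch, `problem.pt-BR.md` is read as a Brazilian Portuguese Markdown statement and `problem.tex` as an English LaTeX statement, while an auxiliary file such as `figure.png` is not treated as a statement.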
Please note that many kinds of transformations on the problem statements, such as conversion to HTML or styling to fit in a single document containing many problems will not be possible for PDF problem statements, so using this format should be avoided if at all possible.
Auxiliary files needed by the problem statement files must all be in `statement/`. `problem.<language>.<filetype>` should reference auxiliary files as if the working directory is `statement/`. Image file formats supported are `.png`, `.jpg`, `.jpeg`, and `.pdf`.
Sample Data
- For problem statements provided in LaTeX or Markdown: the statement file must contain only the problem description and input/output specifications and no sample data. It is the judge system's responsibility to append the sample data.
- For problem statements provided as PDFs: the judge system will display the PDF verbatim; therefore any sample data must be included in the PDF. The judge system is not required to reconcile sample data embedded in PDFs with the `sample` test data group nor to validate it in any other way.
LaTeX Environment and Supported Subset
Problem statements provided in LaTeX must consist only of the problem statement body (i.e., the content that would be placed within a `document` environment). It is the judging system's responsibility to wrap this text in an appropriate LaTeX class.
The LaTeX class shall provide the convenience environments `Input`, `Output`, and `Interaction` for delineating sections of the problem statement. It shall also provide the following commands:
- `\problemname`, which must be the first line of the problem statement and places the `name` value matching the problem statement's language from `problem.yaml` into the problem statement header. In some cases, the problem name might contain math formulas or other text that should be typeset specially. In this case, the `\problemname{name}` command should be used instead and overrides the name in the header with `name`, a LaTeX-formatted version of the problem name.
- `\illustration{width}{filename}{caption}`, a convenience command for adding a figure to the problem statement. `width` is a floating-point argument specifying the width of the figure as a fraction of the total width of the problem statement; `filename` is the image to display, and `caption`, the text to include below the figure. The illustration should be flushed right with text flowing around it (as in a `wrapfigure`).
- `\nextsample` tells the judge system to include the next sample test case here. It is an error to use `\nextsample` when there are no remaining sample test cases.
- `\remainingsamples` tells the judge system to include all sample test cases that have not previously been included by `\nextsample`. It is allowed to use `\remainingsamples` even if there are no remaining sample test cases, which will simply include nothing.
Arbitrary LaTeX is not guaranteed to render correctly by HTML-based judging systems. However, judging systems must make a best effort to correctly render at minimum the following LaTeX subset when displaying a LaTeX problem statement:
- All MathJax-supported TeX commands within inline (`$ $`) and display (`$$ $$`) math mode.
- The following text-mode environments: `itemize`, `enumerate`, `lstlisting`, `verbatim`, `quote`, `center`, `tabular`, `figure`, `wrapfigure` (from the `wrapfig` package).
- `\item` within list environments and `\hline`, `\cline`, `\multirow`, `\multicol` within tables.
- The following typesetting constructs: smart quotes (' ', << >>, `` ''), dashes (`--`, `---`), non-breaking space (`~`), ellipses (`\ldots` and `\textellipsis`), and `\noindent`.
- The following font weight and size modifiers: `\bf`, `\textbf`, `\it`, `\textit`, `\t`, `\tt`, `\texttt`, `\emph`, `\underline`, `\sout`, `\textsc`, `\tiny`, `\scriptsize`, `\small`, `\normalsize`, `\large`, `\Large`, `\LARGE`, `\huge`, `\Huge`.
- `\includegraphics` from the package `graphicx`, including the Polygon-style workaround for scaling the image using `\def \htmlPixelsInCm`.
- The miscellaneous commands `\url`, `\href`, `\section`, `\subsection`, and `\epigraph`.
Attachments
Public, i.e., non-secret, files to be made available in addition to the problem statement and sample test data are provided in the directory `attachments/`.
Solution description
A description of how the problem is intended to be solved is provided in the directory `solution/`.
This directory must contain one file per language, for at least one language, named `solution.<language>.<filetype>`. Language is given the same way as for problem statements. Optionally, the language code can be left out; the default is then English (`en`). The set of languages used can be different from what was used for the problem statement. Filetype can be one of `.tex` for LaTeX files, `.md` for Markdown, or `.pdf` for PDF.
Auxiliary files needed by the solution description files must all be in `solution/`. `solution.<language>.<filetype>` should reference auxiliary files as if the working directory is `solution/`.
Exactly how the solution description is used is up to the user or tooling.
Test data
The test data are provided in subdirectories of `data/`. The sample data are in `data/sample/` and the secret data are in `data/secret/`.
All files and directories associated with a single test case have the same base name with varying extensions. Here, base name is defined to be the relative path from the `data` directory to the test case input file, without extensions. For example, the files `secret/test.in` and `secret/test.ans` are associated with the same test case that has the base name `secret/test`. The existence of the `.in` file declares the existence of the test case. If the test case exists, then an associated `.ans` file must exist while the others are optional. If the test case does not exist, then the other files must not exist. The table below summarizes the supported test data:
Extension | Described In | Summary |
---|---|---|
.in | Input | Input piped to standard input |
.ans | Output Validator | Input to the Output Validator |
.files | Input | Input available via file I/O |
.yaml | Configuration | Additional configuration of the test case |
.png , .jpg , .jpeg , .svg | Illustrations | Illustration of the test case |
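The base-name rules can be checked from a plain listing of paths relative to `data`. The sketch below is a rough illustration for the `sample` and `secret` groups only (the invalid test case directories follow different rules about `.ans` files); `find_test_cases` is a hypothetical name, and `testdata.yaml` files are skipped because they configure a group rather than a test case.

```python
from pathlib import PurePosixPath


def find_test_cases(paths):
    """Return (sorted base names of test cases, list of rule violations)."""
    extensions_by_base = {}
    for path in map(PurePosixPath, paths):
        if path.name == "testdata.yaml":
            continue  # group configuration, not test case data
        base = str(path.with_suffix(""))
        extensions_by_base.setdefault(base, set()).add(path.suffix)

    errors = []
    for base, extensions in sorted(extensions_by_base.items()):
        if ".in" not in extensions:
            errors.append(f"{base}: files without a .in file")
        elif ".ans" not in extensions:
            errors.append(f"{base}: missing .ans file")
    cases = sorted(b for b, e in extensions_by_base.items() if ".in" in e)
    return cases, errors
```

Given `secret/test.in` and `secret/test.ans`, the sketch reports a single test case with base name `secret/test` and no violations; a stray `secret/x.ans` without a matching `.in` file is reported as an error.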
Judge systems may assume that the result of running a program on a test case is deterministic. For any two test cases, if the contents of their `.in` file and `.files` directory are equivalent, as well as the `args` sequence in the `.yaml` file, then the input of the two test cases is equivalent. This means that for any two test cases, if their input, output validator arguments and the contents of their `.ans` files are equivalent, then the test cases are equivalent. The assumption of determinism means that a judge system could choose to reuse the result of a previous run, or to re-run the equivalent test case.
Input
Each test case can supply input via standard input, command-line arguments, and/or the file system. These options are not exclusive. For a test case with base name `test`, the file `test.in` is piped to the submission as standard input. The submission will be run with the `args` sequence defined in the `test.yaml` file as command-line arguments. Note that usually the submission's entry point, whether it be a binary or an interpreted file, will be the absolute first command line argument. However, there exist languages, such as Java, where there is no initial command line argument representing the entry point.
The directory `test.files`, if it exists, contains privileged data files available to the submission via file I/O. All files in this directory must be copied into the submission's working directory after compiling, but before executing the submission, possibly overwriting the compiled submission file or included data in the case of name conflicts.
Illustrations
One illustration file may be provided per test case. The file must share the base name of the associated test case. Illustration files are meant to be privileged information. The supported file extensions are `.png`, `.jpg`, `.jpeg`, and `.svg`.
An illustration provides a visualization of the associated test case. Note that at most one image file may exist for each test case.
Test Case Configuration
One YAML file with additional configuration may be provided per test case. The file must share the base name of the associated test case.
The allowed keys are defined as follows. Keys are optional unless explicitly stated. Any unknown keys should be treated as an error.
Key | Type | Default |
---|---|---|
args | Sequence of strings | Inherited from testdata.yaml , which defaults to empty sequence |
output_validator_args | Sequence of strings | Inherited from testdata.yaml , which defaults to empty sequence |
input_validator_args | Sequence of strings or map of strings to sequences of strings | Inherited from testdata.yaml , which defaults to empty sequence |
full_feedback | Boolean | Inherited from testdata.yaml , which defaults to false in secret and true in sample |
hint | String | |
description | String |
For each test case:
- `args` defines arguments passed to the submission for this test case.
- `output_validator_args` defines arguments passed to the output validator for the test case.
- `input_validator_args` defines arguments passed to each input validator for the test case. If a sequence of strings, then those are the arguments that will be passed to each input validator for this test case. If a map, then each key is the name of the input validator and the value is the arguments to pass to that input validator for the test case. Validators not present in the map are run without any arguments.
- When `full_feedback` is `true`, somebody whose submission didn't pass the test case should be shown:
  - the given input,
  - the produced output (stdout),
  - any error messages (stderr),
  - the illustration created by the output visualizer (if applicable),
  - the expected output.
- A hint provides feedback for solving a test case to, e.g., somebody whose submission didn't pass.
- A description conveys the purpose of a test case. It is an explanation of what aspect or edge case of the solution the input file is meant to test.
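The two accepted shapes of `input_validator_args` can be resolved per validator as in the sketch below. The helper name `args_for_validator` and the validator names used in the usage note are hypothetical.

```python
def args_for_validator(input_validator_args, validator_name):
    """Return the argument list to pass to the named input validator."""
    if isinstance(input_validator_args, dict):
        # Map form: validators not present in the map get no arguments.
        return list(input_validator_args.get(validator_name, []))
    # Sequence form: every input validator receives the same arguments.
    return list(input_validator_args)
```

For example, with the map `{"grammar": ["--max", "10"]}`, a validator named `grammar` receives `--max 10`, while a validator named `bounds` is run without arguments.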
Test Data Groups
The test data for the problem can be organized into a tree-like structure rooted in the `data` folder. Each node of this tree is represented by a directory and referred to as a test data group. At the top level, the test data is divided into exactly two groups: `sample` and `secret`. The `secret` group may be further split into subgroups (each a subdirectory of `secret`). Each test data group (other than the root `data` group) may contain zero or more test cases (i.e., input-answer files). The `sample` directory may be omitted if a problem has no sample test cases. The `secret` directory must exist and the `secret` test group, or one of its descendant subgroups, must contain at least one test case.
Test cases and groups will be used in lexicographical order on file base name. If a specific order is desired, a numbered prefix such as `00`, `01`, `02`, `03`, and so on, can be used.
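The reason for zero-padded prefixes is that lexicographical order compares names character by character, so `10` sorts before `2`. The short sketch below illustrates this, assuming plain code-point comparison as performed by Python's `sorted`; the function name `judging_order` and the test case names are made up for the example.

```python
def judging_order(base_names):
    """Order base names lexicographically (by code point)."""
    return sorted(base_names)


# Without padding, the intended numeric order is lost.
print(judging_order(["2-medium", "10-large", "1-small"]))
# ['1-small', '10-large', '2-medium']

# With a two-digit prefix, lexicographical and numeric order agree.
print(judging_order(["02-medium", "10-large", "01-small"]))
# ['01-small', '02-medium', '10-large']
```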
In each test data group, a YAML file `testdata.yaml` may be placed to specify how the result of the test data group should be computed. Some of the keys and their associated values will be inherited from the `testdata.yaml` in the closest ancestor group from the test case to the root `data` directory that has one. Others must be explicitly defined in the group's `testdata.yaml` file; otherwise they are set to the default values. If there is no `testdata.yaml` file in the root `data` group, one is implicitly added with the default values.
The format of `testdata.yaml` is as follows:
Key | Type | Default | Inheritance | Comments |
---|---|---|---|---|
scoring | Map | See Verdict/Score Aggregation | Not inherited | Description of how the results of the group test cases and subgroups should be aggregated. This key is only permitted for the secret group and its subgroups. |
input_validator_args | Sequence of strings or map of strings to sequences of strings | empty sequence | Inherited | See Test Case Configuration. |
output_validator_args | Sequence of strings | empty sequence | Inherited | See Test Case Configuration. |
static_validation | Map or boolean | false | Not applicable | Configuration of static validation test data node. See Static Validator |
full_feedback | Boolean | false in secret , true in sample | Inherited | See Test Case Configuration. |
hint | String | | Inherited | See Test Case Configuration. |
description | String | | Inherited | See Test Case Configuration. |
Invalid Test Cases
The `data` directory may contain directories of test cases that must be rejected by validation. Their goal is to ensure the integrity and quality of the test data and validation programs.
Invalid Input
The files under `invalid_input` are invalid inputs. Unlike in `sample` and `secret`, there are no `.ans` files. Each `tc.in` under `invalid_input` must be rejected by at least one input validator.
Invalid Output
The test cases in `invalid_output` describe invalid outputs for non-interactive problems. They consist of three files. The input file `tc.in` must pass input validation. The output file `tc.out` must fail output validation for the given answer file `tc.ans`.
In particular, for an existing feedback directory `dir`,
<output_validator_program> tc.in tc.ans dir [arguments] < tc.ans # MUST PASS
<output_validator_program> tc.in tc.ans dir [arguments] < tc.out # MUST FAIL
The directory invalid
can be organized into a tree-like structure similar to secret
and contain arguments in testdata.yaml
files that are passed to the validators.
Samples
Sample test cases can be used in three places:
- All submissions are run on the samples, and possibly feedback is provided to teams.
- The samples are shown in the problem statement.
- The samples are available for download.
By default the sample data for all three cases is taken from the .in
and .ans
file pairs under data/sample
. Some problems require (slightly) different data in each of these cases. We allow customizing this in subdirectories data/sample/statement
and data/sample/download
. In this case it is recommended to symlink identical files from these subdirectories to those in data/sample
.
Samples used by the judge
The data/sample
directory contains test cases similar to those in data/secret
. Every submission is run against these.
data/sample
must not contain test groups. Also, it does not have to contain any test cases, for example when the samples shown in the PDF are not actually valid cases.
Samples shown in the problem statement
By default, the .in
and .ans
pairs from data/sample
are shown in the problem statement. This can be customized by creating the data/sample/statement
directory. If this directory exists, its contents replace those of data/sample
for purposes of the problem statement. This directory is required for interactive problems.
For each test case, data/sample/statement
must contain one of the following for default or custom validation (non-interactive) problems:
- An .in and .ans file, which are both shown.
- An .in, .ans, and .out file, of which the .in and .out files are shown, and the .ans file is used for validation.
Interactive Problems
For interactive problems, data/sample/statement
must contain one of:
- An .in and .out file, which are both shown, neither of which is validated.
- An .interaction file that contains lines starting with < and >, containing an interaction log with output from the output validator starting with < and output from the submission starting with >.
  - An .interaction file may be specified without a corresponding test case; any such interaction log will be displayed as usual.
If you want to provide files related to interactive problems (such as testing tools or input files), you can use data/sample/download.
Samples available for download
When data/sample/download
exists, the files in there are available for download. If this directory does not exist, the contents of data/sample/statement
are available for download, with any .out
files renamed to .ans
(replacing existing .ans
files). When that also does not exist, the test cases in data/sample
are available for download. data/sample/download
is required for interactive problems.
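The fallback order for downloadable samples can be sketched as below. The function name is hypothetical, and the sketch assumes a directory "exists" when the file list contains at least one path under it.

```python
def downloadable_samples(files):
    """Return {download name: source path} for a list of package file paths."""
    files = list(files)

    def under(prefix):
        return [f for f in files if f.startswith(prefix)]

    # 1. data/sample/download, if present, is offered as-is.
    prefix = "data/sample/download/"
    if under(prefix):
        return {f[len(prefix):]: f for f in under(prefix)}

    # 2. Otherwise data/sample/statement, with .out files renamed to .ans.
    prefix = "data/sample/statement/"
    if under(prefix):
        result = {f[len(prefix):]: f for f in under(prefix) if not f.endswith(".out")}
        for f in under(prefix):
            if f.endswith(".out"):
                name = f[len(prefix):-len(".out")] + ".ans"
                result[name] = f  # replaces an existing .ans entry
        return result

    # 3. Otherwise the .in/.ans test cases directly in data/sample.
    prefix = "data/sample/"
    return {
        f[len(prefix):]: f
        for f in under(prefix)
        if "/" not in f[len(prefix):] and f.endswith((".in", ".ans"))
    }
```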
The data/sample/download
directory may contain anything, including, e.g., a testing tool for interactive problems, and is not validated. Testing tools may warn when test cases that are listed in data/sample
or data/sample/statement
do not appear in data/sample/download
, or when data/sample/download
contains test cases that do not appear in one of the two other locations.
Validation
All data/sample/*.in
files are input-validated.
When data/sample/statement/
does not exist, data/sample/*.ans
is output validated (i.e., must be accepted).
When data/sample/statement/
does exist, and the problem is not interactive, all data/sample/statement/*.in
are input validated, and data/sample/statement/*.out
(with fallback to data/sample/statement/*.ans
) are output validated (i.e., must be accepted).
Files in data/sample/download
are never validated, although tooling may warn when there are inconsistencies between download
, statement
, and data/sample/
itself.
Validation can be customized by specifying input_validator_args
and output_validator_args
in data/sample/testdata.yaml
.
Generators
If any generator scripts were used to automate writing test cases, it is recommended to include the generator source code in the directory generators/
along with invocation instructions in a file such as generators/README.txt
. This information can be useful as a debugging aid and for archival completeness: judge systems are not responsible for executing the provided generators and all test data written by the generators must be included in the problem package.
Included Files
Files that should be included with all submissions are provided in one non-empty directory per supported language. Files that should be included for all languages are placed in the non-empty directory include/default/
. Files that should be included for a specific language, overriding the default, are provided in the non-empty directory include/<language>/
, where <language>
is a language code as given in the languages table.
The files should be copied from a language directory based on the language of the submission, to the submission files before compiling, but after checking whether the submission exceeds the code limit, overwriting files from the submission in the case of name collision. Language must be one of the allowed submission languages as specified by languages
in problem.yaml
. If any of the included files are supposed to be the main file (i.e., a driver), that file must have the language-dependent name as given in the table referred to above.
Example Submissions
Correct and incorrect solutions (file or directory programs) to the problem are provided in direct subdirectories of submissions/
. By default, the possible subdirectories are as in the table below, but they can be customized, and more can be added; see Default directories. Submission programs (either a single file or a directory of files) must be placed in a subdirectory of submissions
, e.g., submissions/accepted/
. No extra levels of subdirectories are allowed.
Directory | Requirement | Comment |
---|---|---|
accepted | Accepted as a correct solution for all test cases. | At least one is required. Used to lower bound the time limit. |
rejected | At least one case is not accepted. | |
wrong_answer | At least one case is wrong answer, and all cases are either wrong answer or accepted. | Used to lower bound the time limit. |
time_limit_exceeded | Too slow on at least one case, and all cases are either too slow or accepted. | Used to upper bound the time limit. |
run_time_error | Crashes for at least one case, and all cases either crash or are accepted. | Used to lower bound the time limit. |
brute_force | Never gives a wrong answer, but is not accepted because of a run-time error or timeout. | |
Every file or directory in these directories represents a separate solution. It is mandatory to provide at least one accepted solution.
Metadata about the example submissions is provided in a YAML file submissions/submissions.yaml
. The top-level keys in submissions.yaml
are glob patterns matching files or directories under submissions/
. For example, accepted
and accepted/*
match all submissions in the submissions/accepted/
directory. See also Glob patterns.
Each glob pattern maps to a map with keys as defined below, specifying metadata for all submissions that are matched by the glob pattern.
Key | Type | Default | Comment |
---|---|---|---|
language | String | As determined by file endings given in the language list | |
entrypoint | String | As specified in the language list | |
authors | Person or sequence of persons | | Author(s) of submission(s). |
permitted | Sequence of strings | [AC, WA, TLE, RTE] | All test cases must have a verdict in this subset of AC , WA , TLE , RTE . |
required | Sequence of strings | [AC, WA, TLE, RTE] | At least one test case must have a verdict in this subset of AC , WA , TLE , RTE . |
score | Float or list of two floats | | The score of the submission equals the given number, or is in the given range. Only for scoring problems. |
message | String | Empty string | This must appear as a substring in at least one judgemessage.txt . |
use_for_time_limit | Bool or string (lower/upper) | See below. | Controls whether this submission is used to determine the time limit. |
Every submission matched by the glob pattern must satisfy:
- all test cases must have only verdicts present in permitted;
- at least one test case must have a verdict in required;
- if given, the score must be in the given range or equal the given score;
- if given, the message string must be included as a case-sensitive substring in the judgemessage.txt for at least one test case.
The tooling should check the constraints for consistency, such as that two disjoint permitted
sets are never applied to a single (submission, testcase)
pair.
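The permitted and required checks could be sketched as follows. The function name and the list-of-problems return value are illustrative choices, not part of the format; the default sets are taken from the table above.

```python
ALL_VERDICTS = ["AC", "WA", "TLE", "RTE"]  # defaults for permitted and required


def check_expectations(verdicts, permitted=None, required=None):
    """Return a list of violated expectations for one submission.

    `verdicts` holds one verdict per test case the rule applies to.
    """
    permitted = ALL_VERDICTS if permitted is None else permitted
    required = ALL_VERDICTS if required is None else required
    problems = []

    # Every test case verdict must be in the permitted set.
    not_permitted = sorted(set(verdicts) - set(permitted))
    if not_permitted:
        problems.append(f"verdicts not permitted: {not_permitted}")

    # At least one test case must have a verdict from the required set.
    if not set(verdicts) & set(required):
        problems.append(f"none of the required verdicts {required} occurred")

    return problems
```

For instance, a time_limit_exceeded submission that is accepted on every test case satisfies permitted but violates required.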
Groups
The permitted
, required
, score
, message
, and use_for_time_limit
requirements can also be given for only a subset of test cases, by adding them under a key with the name of a test group (relative to data/
). In this case, the permitted
, required
, message
, and use_for_time_limit
keys only apply to the set of test cases (recursively) in the given group.
The score
key puts a constraint on the aggregated score of a given test group, not on the set of test cases the group contains.
For example, the configuration below tests that the submission solves all cases in group1
, but times out on at least one case in group2
.
solves_group_1.py:
sample:
permitted: [AC]
secret/group1:
permitted: [AC]
secret/group2:
permitted: [AC, TLE]
required: [TLE]
Glob patterns
Glob patterns can be used to apply restrictions to a subset of submissions. It is also possible to use glob patterns to put restrictions on a subset of test cases, for example, when test groups are not used:
time_limit_exceeded/solves_easy_cases.py:
sample:
permitted: [AC]
secret/*-easy:
permitted: [AC]
secret/*-hard:
permitted: [AC, TLE]
required: [TLE]
This means that the submission must solve all samples and all easy cases, but must time out on at least one of the hard cases.
Submission glob patterns are matched against all paths to files and directories of submissions inside and relative to the submissions/
directory. Test case glob patterns are matched against all paths of test groups and test cases relative to data/
, excluding the trailing .in
. Wildcards (*
) only match within a file name (i.e., do not match /
). A test case is matched by the glob pattern if either itself or any of its parent test groups is matched by it, and similarly a submission is matched if either itself or a parent directory is matched.
Using **
to match any number of directories and [xyz]
to match only a subset of characters is not supported. Brace expansion is supported for both submissions and test cases. Thus, one can write {simple,complex}.py
or author.{py,cpp}
to match multiple files.
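A minimal matcher following these rules might look like the sketch below. All function names are hypothetical. It assumes the paths have already been made relative to submissions/ or data/, with the trailing .in removed for test cases.

```python
import re
from pathlib import PurePosixPath


def expand_braces(pattern):
    """Expand {a,b} alternatives into a list of plain patterns."""
    match = re.search(r"\{([^{}]*)\}", pattern)
    if match is None:
        return [pattern]
    head, tail = pattern[:match.start()], pattern[match.end():]
    return [p for alt in match.group(1).split(",")
            for p in expand_braces(head + alt + tail)]


def glob_to_regex(pattern):
    """Translate a pattern so that * matches any characters except '/'."""
    return re.compile("[^/]*".join(re.escape(piece) for piece in pattern.split("*")))


def matches(pattern, path):
    """True if the pattern matches the path itself or one of its parents."""
    start = PurePosixPath(path)
    names = [str(p) for p in [start, *start.parents] if str(p) != "."]
    return any(glob_to_regex(p).fullmatch(name)
               for p in expand_braces(pattern)
               for name in names)
```

With this sketch, `accepted` matches `accepted/sol.py` through its parent directory, while `secret/*-easy` does not match `secret/sub/01-easy` because `*` stops at `/`.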
Default directories
By default, the following requirements are defined:
# All cases must be accepted.
accepted:
permitted: [AC]
# At least one case is not accepted.
rejected:
required: [RTE, TLE, WA]
# All cases AC or WA, at least one WA.
wrong_answer:
permitted: [AC, WA]
required: [WA]
# All cases AC or TLE, at least one TLE.
time_limit_exceeded:
permitted: [AC, TLE]
required: [TLE]
# All cases AC or RTE, at least one RTE.
run_time_error:
permitted: [AC, RTE]
required: [RTE]
# Must not WA, but fail at least once.
# Note that by default these are not used for determining the time limit.
brute_force:
permitted: [AC, RTE, TLE]
required: [RTE, TLE]
The defaults can be overwritten in the submissions.yaml
file by simply specifying the name of the directory. Keys that are not specified are inherited from the default configuration above. This is supported for backwards compatibility and is not recommended for normal usage.
time_limit_exceeded:
permitted: [AC, WA, TLE]
required: [TLE]
Note that the glob time_limit_exceeded/*
would impose an additional requirement, instead of replacing the original requirement.
Timelimit inference
Any submission that must satisfy a required: TLE
requirement, i.e., must TLE
on at least one test case, is used to provide an upper bound on the time limit. Precisely, the time limit must be at most T / time_limit_to_tle
, where T
is the slowest runtime over the set of test cases to which the rule applies. Note that this excludes submissions that, e.g., have required: [TLE, RTE]
.
Any submission that is not permitted to get TLE
at all (on some subset of cases), i.e., must satisfy a permitted:
rule that does not contain TLE
, is used to provide a lower bound on the time limit. Precisely, the time limit must be at least T * ac_to_time_limit
, where T
is the slowest runtime over the set of test cases to which the rule applies.
To opt a (set of) submission(s) out of influencing the time limit, set use_for_time_limit: false
alongside the permitted:
or required:
key that satisfies the constraints above. To opt a glob of submission(s), optionally restricted to a subset of test cases, out of influencing the time limit, set use_for_time_limit: false
alongside the permitted:
and/or required:
keys. Note that this means that if you want to exclude a submission completely, then you must add use_for_time_limit: false
to every glob that matches that submission and would otherwise include it for determining the time limit.
To explicitly opt in a (set of) submission(s) to be used for determining the time limit, use use_for_time_limit: lower
and use_for_time_limit: upper
. The first is equivalent to a permitted: [AC, WA, RTE]
constraint, and the second to a required: [TLE]
constraint. The system may warn when this makes other constraints redundant and should error when it is inconsistent with other constraints.
It is required that at least one submission is used to lower bound the time limit.
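The two bounds can be combined as in the following sketch. The function name is hypothetical, and the multiplier values in the usage note are made up for illustration. Each runtime is the slowest runtime T of one submission over the test cases its rule applies to.

```python
def time_limit_bounds(lower_runtimes, upper_runtimes, ac_to_time_limit, time_limit_to_tle):
    """Return (lower, upper) bounds on the time limit in seconds.

    lower_runtimes: T for each submission that must not get TLE.
    upper_runtimes: T for each submission that must get TLE.
    `upper` is None when no submission provides an upper bound.
    """
    if not lower_runtimes:
        raise ValueError("at least one submission must lower bound the time limit")
    # The time limit must be at least T * ac_to_time_limit for every such T.
    lower = max(t * ac_to_time_limit for t in lower_runtimes)
    # The time limit must be at most T / time_limit_to_tle for every such T.
    upper = min((t / time_limit_to_tle for t in upper_runtimes), default=None)
    return lower, upper
```

For example, `time_limit_bounds([0.5, 0.8], [6.0, 9.0], 2.0, 1.5)` yields a lower bound of about 1.6 seconds and an upper bound of 4.0 seconds, so any time limit in that interval satisfies both kinds of submission.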
Input Validators
Input Validators, verifying the correctness of the input files, are provided in input_validators/
. Input validators can be specified as VIVA-files (with file ending .viva
), Checktestdata-files (with file ending .ctd
), or as a program. Programs must either be written in C++ or Python 3, or must provide a build
or run
script as specified above.
All input validators provided will be run on every input file. Validation fails if any validator fails.
Invocation
An input validator program must be an application (executable or interpreted) capable of being invoked with a command line call.
All input validators provided will be run on every test data file using the arguments specified for the test data group they are part of. Validation fails if any validator fails.
When invoked, the input validator will get the input file on stdin.
The validator should be possible to use as follows on the command line:
<input_validator_program> [arguments] < inputfile
Here, arguments
is the input_validator_args
.
Output
The input validator may output debug information on stdout and stderr. This information may be displayed to the user upon invocation of the validator.
Exit codes
The input validator must exit with code 42 on successful validation. Any other exit code means that the input file could not be confirmed as valid.
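A minimal input validator in Python might be structured as below. The input format (a single line holding an integer n with 1 ≤ n ≤ 1000) is a made-up example, and exit code 43 for rejection is a choice of this sketch; the text above only fixes 42 for success.

```python
import re
import sys


def is_valid(text):
    """Hypothetical format: one line holding an integer n with 1 <= n <= 1000."""
    # Strict matching: no leading zeros, no extra spaces, exactly one newline.
    match = re.fullmatch(r"(0|[1-9][0-9]*)\n", text)
    return match is not None and 1 <= int(match.group(1)) <= 1000


def main():
    # The input file arrives on stdin; exit code 42 signals a valid input.
    # Any other code (43 here, by choice) means it was not confirmed valid.
    sys.exit(42 if is_valid(sys.stdin.read()) else 43)
```

A real validator script would call `main()` when executed; the call is left out here so the checking function can be tried on its own.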
Dependencies
The validator must not read any files outside those defined in the Invocation section. Its result must depend only on these files and the arguments.
Input Visualizer
If a tool was used to automate creating test case illustration annotations, it is recommended to include the input visualizer source code in the directory input_visualizer/
along with invocation instructions in a file such as input_visualizer/README.txt
.
Output Validator
Overview
An output validator is a program that is given the output of a submitted program, together with the corresponding input file, and an answer file for the input, and then decides whether the output provided is a correct output for the given input file.
A validator program must be an application (executable or interpreted) capable of being invoked with a command line call. The details of this invocation are described below. The validator program has two ways of reporting back the results of validating:
- The validator must give a judgement (see Reporting a judgement).
- The validator may give additional feedback, e.g., an explanation of the judgement to humans (see Reporting Additional Feedback).
A custom output validator is used if the problem requires more complicated output validation than what is provided by the default diff variant described below. It must be provided as the directory output_validator/
. It must either be written in C++ or Python 3, or must provide a build
or run
script as specified above. It must adhere to the output validator specification described below. If no custom validator is provided, the default output validator will be used.
The output validator will be run on the output for every test data file using the arguments specified for the test data group.
Default Output Validator Specification
The default output validator is essentially a beefed-up diff that can be used in the common case where the output validator needs only compare the output of a submitted program against a trusted judge reference solution. In its default mode, it tokenizes the output and answer files and compares them token by token. It supports the following command-line arguments to control how tokens are compared.
Arguments | Description |
---|---|
case_sensitive | indicates that comparisons should be case-sensitive. |
space_change_sensitive | indicates that changes in the amount of whitespace should be rejected (the default is that all sequences of 1 or more whitespace characters are equivalent). |
float_relative_tolerance ε | indicates that floating-point tokens should be accepted if they are within relative error ≤ ε (see below for details). |
float_absolute_tolerance ε | indicates that floating-point tokens should be accepted if they are within absolute error ≤ ε (see below for details). |
float_tolerance ε | short-hand for applying ε as both relative and absolute tolerance. |
When supplying both a relative and an absolute tolerance, the semantics are that a token is accepted if it is within either of the two tolerances. When a floating-point tolerance has been set, any valid formatting of floating-point numbers is accepted for floating-point tokens. So, for instance, if a token in the answer file says 0.0314
, a token of 3.14000000e-2
in the output file would be accepted.
It is an error to provide any of the float_relative_tolerance
, float_absolute_tolerance
, or float_tolerance
arguments more than once, or to provide a float_tolerance
alongside float_relative_tolerance
and/or float_absolute_tolerance
.
If no floating-point tolerance has been set, floating-point tokens are treated just like any other token and must match exactly.
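The "either tolerance" rule can be sketched as follows. The function name is hypothetical. The sketch assumes that relative error is measured against the answer-file token, and it relies on Python's `float()`, which accepts some spellings beyond the grammar in this document (for example `inf`).

```python
def floats_match(answer, output, rel_tol=None, abs_tol=None):
    """Compare two floating-point tokens under optional tolerances.

    Assumption: relative error is taken relative to the answer token.
    The token is accepted if it is within either configured tolerance.
    """
    a, o = float(answer), float(output)
    diff = abs(a - o)
    if abs_tol is not None and diff <= abs_tol:
        return True
    if rel_tol is not None and diff <= rel_tol * abs(a):
        return True
    return False
```

With this sketch, `floats_match("0.0314", "3.14000000e-2", abs_tol=1e-9)` is accepted because both spellings parse to the same value.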
Invocation
When invoked, the output validator will be passed at least three command line parameters and the output stream to validate on stdin.
The validator should be possible to use as follows on the command line:
<output_validator_program> input answer_file feedback_dir [additional_arguments] < team_output [ > team_input ]
The meanings of the parameters listed above are:
- input: a string specifying the name of the input data file that was used to test the program whose results are being validated.
- answer_file: a string specifying the name of an arbitrary "answer file" which acts as input to the validator program. The answer file may, but is not necessarily required to, contain the "correct answer" for the problem. For example, it might contain the output that was produced by a judge's solution for the problem when run with input file as input. Alternatively, the "answer file" might contain information, in arbitrary format, which instructs the validator in some way about how to accomplish its task.
- feedback_dir: a string which specifies the name of a "feedback directory" in which the validator can produce "feedback files" in order to report additional information on the validation of the output file. The feedback_dir must end with a path separator (typically '/' or '\' depending on operating system), so that simply appending a filename to feedback_dir gives the path to a file in the feedback directory.
- additional_arguments: in case output_validator_args are specified for the test case, these are passed as additional arguments to the validator on the command line.
- team_output: the output produced by the program being validated is given on the validator's standard input pipe.
- team_input: when running the validator in interactive mode everything written on the validator's standard output pipe is given to the program being validated. Please note that when running interactively the program will only receive the output produced by the validator and will not have direct access to the input file.
The two files pointed to by input and answer_file must exist (though they are allowed to be empty) and the validator program must be allowed to open them for reading. The directory pointed to by feedback_dir must also exist.
Reporting a judgement
A validator program is required to report its judgement by exiting with specific exit codes:
- If the output is a correct output for the input file (i.e., the submission that produced the output is to be Accepted), the validator exits with exit code 42.
- If the output is incorrect (i.e., the submission that produced the output is to be judged as Wrong Answer), the validator exits with exit code 43.
Any other exit code (including 0!) indicates that the validator did not operate properly, and the judging system invoking the validator must take measures to report this to contest personnel. The purpose of these somewhat exotic exit codes is to avoid conflict with other exit codes that result when the validator crashes. For instance, if the validator is written in Java, any unhandled exception results in the program crashing with an exit code of 1, making it unsuitable to assign a judgement meaning to this exit code.
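A bare-bones custom output validator could follow the shape sketched below. The token comparison is only a placeholder for problem-specific checking, and the function names are hypothetical. The exit codes and the command line layout are those described above.

```python
import os
import sys
import tempfile

EXIT_ACCEPTED = 42      # output is correct
EXIT_WRONG_ANSWER = 43  # output is incorrect


def judge(answer_text, team_text, feedback_dir):
    """Compare whitespace-separated tokens and return the exit code to use."""
    expected, actual = answer_text.split(), team_text.split()
    if expected == actual:
        return EXIT_ACCEPTED
    # Optional feedback for human judges; appending keeps earlier messages.
    with open(os.path.join(feedback_dir, "judgemessage.txt"), "a") as f:
        f.write(f"Expected {expected!r}, got {actual!r}.\n")
    return EXIT_WRONG_ANSWER


def main(argv):
    # argv: program, input, answer_file, feedback_dir, [additional_arguments]
    answer_path, feedback_dir = argv[2], argv[3]
    with open(answer_path) as f:
        answer_text = f.read()
    # The team output arrives on standard input.
    sys.exit(judge(answer_text, sys.stdin.read(), feedback_dir))


# Small self-check that exercises judge() without a judging system.
with tempfile.TemporaryDirectory() as feedback_dir:
    accepted_code = judge("30\n", "30\n", feedback_dir)
    rejected_code = judge("30\n", "31\n", feedback_dir)
    with open(os.path.join(feedback_dir, "judgemessage.txt")) as f:
        message = f.read()
```

A real validator script would end with a call such as `main(sys.argv)`; it is omitted here so that the self-check runs without command line arguments.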
Reporting Additional Feedback
The purpose of the feedback directory is to allow the validator program to report more information to the judging system than just the accept/reject verdict. Using the feedback directory is optional for a validator program, so if one just wants to write a bare-bones minimal validator, it can be ignored.
The validator is free to create different files in the feedback directory, in order to provide different kinds of information to the judging system, in a simple but organized way. For instance, there may be a judgemessage.txt
file, the contents of which gives a message that is presented to a judge reviewing the current submission (typically used to help the judge verify why the submission was judged as incorrect, by specifying exactly what was wrong with its output). Other examples of files that may be useful in some contexts (though not in the ICPC) are a score.txt
file, giving the submission a score based on other factors than correctness, or a teammessage.txt
file, giving a message to the team that submitted the solution, providing additional feedback on the submission.
A judging system that implements this format must support the judgemessage.txt
file described above (i.e., the content of the judgemessage.txt
file, if produced by the validator, must be provided by the judging system to a human judge examining the submission). Having the judging system support other files is optional.
The validator may create one or more image files in the feedback directory with the name teamimage.ext
and/or judgeimage.ext
, where ext
is one of: png
, jpg
, jpeg
, or svg
. The output visualizer may modify or create these files as well, and the output validator may create files in the feedback directory containing metadata that helps the visualizer in this task. The intent is for the teamimage
to be displayed to contestants and for judgeimage
to be privileged information used as a debugging aid by contest judges, but the judge system may display or ignore these files as it sees fit.
Note that a validator may choose to ignore the feedback directory entirely. In particular, the judging system must not assume that the validator program creates any files there at all.
Multi-pass validation
A multi-pass validator can be used for problems that should run the submission multiple times sequentially, using a new input generated by the output validator during the previous invocation of the submission.
The time and memory limit apply for each invocation separately.
To signal that the submission should be run again, the output validator must exit with code 42 and output the new input in the file nextpass.in
in the feedback directory. Judging stops if no nextpass.in
was created or the output validator exited with any other code. Note that the nextpass.in
will be removed before the next pass.
It is a judge error to create the nextpass.in
file and exit with any other code than 42. It is a judge error to run more passes than specified by the limits.validation_passes
value in problem.yaml
.
All other files inside the feedback directory are guaranteed to persist between passes. In particular, the validator should only append text to the judgemessage.txt
to provide combined feedback for all passes.
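The control flow of multi-pass judging can be modelled roughly as below. This is a simplified simulation, not how any particular judging system is implemented: the submission and validator are plain Python callables, all names are hypothetical, and every exit code other than 42 is treated as a rejection.

```python
import os
import tempfile


def run_passes(first_input, run_submission, run_validator, max_passes, feedback_dir):
    """Return True if the submission is accepted, False if it is rejected."""
    nextpass = os.path.join(feedback_dir, "nextpass.in")
    current_input = first_input
    for _ in range(max_passes):
        output = run_submission(current_input)
        code = run_validator(current_input, output, feedback_dir)
        wants_next_pass = os.path.exists(nextpass)
        if code != 42:
            if wants_next_pass:
                raise RuntimeError("judge error: nextpass.in with exit code != 42")
            return False  # simplification: any other exit code is a rejection
        if not wants_next_pass:
            return True  # accepted, and no further pass was requested
        with open(nextpass) as f:
            current_input = f.read()
        os.remove(nextpass)  # nextpass.in is removed before the next pass
    raise RuntimeError("judge error: more passes requested than validation_passes")


def doubling_submission(text):
    # Toy submission: doubles the number it reads.
    return f"{int(text) * 2}\n"


def demo_validator(input_text, output, feedback_dir):
    # Toy validator: checks the doubling and requests passes until 8 is reached.
    if int(output) != 2 * int(input_text):
        return 43
    if int(output) < 8:
        with open(os.path.join(feedback_dir, "nextpass.in"), "w") as f:
            f.write(output)
    return 42


with tempfile.TemporaryDirectory() as feedback_dir:
    accepted = run_passes("1\n", doubling_submission, demo_validator, 3, feedback_dir)
```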
Samples for multi-pass problems must be provided in a .interaction
file, like for interactive problems. Passes are separated by a line containing ---
(three dashes). When the problem is not interactive, simply start each pass by a number of lines starting with <
, containing the sample input, followed by some lines starting with >
, containing the sample answer.
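Reading such a file might be sketched as follows. The function name is hypothetical. Whether a space follows the `<` or `>` marker is not specified in this section, so the sketch keeps the remainder of each line verbatim.

```python
def parse_interaction(text):
    """Split an .interaction log into passes of (marker, rest-of-line) pairs."""
    passes = [[]]
    for line in text.splitlines():
        if line == "---":
            passes.append([])  # a line of three dashes starts the next pass
        elif line[:1] in ("<", ">"):
            passes[-1].append((line[0], line[1:]))
        else:
            raise ValueError(f"unexpected line in interaction log: {line!r}")
    return passes


# Hypothetical two-pass sample: each pass shows one input and one answer line.
sample_passes = parse_interaction("<3\n>6\n---\n<6\n>12\n")
```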
Examples
An example of a judgemessage.txt
file:
Team failed at test case 14.
Team output: "31", Judge answer: "30".
Team failed at test case 18.
Team output: "hovercraft", Judge answer: "7".
Summary: 2 test cases failed.
An example of a teammessage.txt
file:
Almost all test cases failed — are you even trying to solve the problem?
Validator standard error
A validator program is allowed to write any kind of debug information to its standard error pipe. This information may be displayed to the user upon invocation of the validator.
Output Visualizer
An output visualizer is an optional program that is run after every invocation of the output validator in order to generate images illustrating the submission output. A visualizer program must be an application (executable or interpreted) capable of being invoked with a command line call. It is invoked using the same arguments as the output validator. It must be provided as the directory output_visualizer/
. It must either be written in C++ or Python 3, or must provide a build
or run
script as specified above.
All files written to the feedback directory by the output validator are accessible to the visualizer. The visualizer may overwrite or create image files in the feedback directory with the name teamimage.ext
or judgeimage.ext
, where ext
is one of: png
, jpg
, jpeg
, or svg
. It must not write to score.txt
, teammessage.txt
, or any other files in the feedback directory other than those of the form teamimage.ext
or judgeimage.ext
.
Compile or run-time errors in the visualizer are not judge errors. The return value and any data written by the visualizer to standard error or standard output are ignored.
Verdict/Score Aggregation
Pass-Fail Problems
For pass-fail problems, the verdict of a submission is accepted if and only if every test case in both the sample
group and the secret
group and its subgroups is accepted.
Scoring Problems
In scoring problems, submissions are given a non-negative score instead of a verdict. The goal of each submission is to maximize this score. Only the secret
group and its subgroups are scored.
Given a submission, scores are determined for test cases, test groups, and the submission itself (which is the score of the secret
group). The scoring behavior is configured for each test data group by the following arguments in the scoring
dictionary of its testdata.yaml
:
Key | Type | Description |
---|---|---|
score | String | The maximum possible score of the test data group. Must be a non-negative integer or unbounded . |
aggregation | pass-fail , sum , or min | How the score of the test data group is determined based on the scores of the subgroups and test cases. See below. |
require-pass | String or sequence of strings | Other test cases or groups whose test cases a submission must AC in order to receive a score for this test group. See below. |
The default value of aggregation
is sum
for the secret
group and pass-fail
for its subgroups.
Maximum Score Inference
The secret
group, its subgroups, and every test case in these groups have a maximum possible score. The secret
group's score may be any positive integer or unbounded
. Subgroups of secret
may only have unbounded
maximum score if secret
is unbounded. The default value of score
for the secret
group is 100.
The default score
for other test data groups is inferred from the score
value of its parent and siblings, as is the maximum score of each test case in the group:
Group Maximum Score | Aggregation Type | Maximum Score of Test Case / Subgroup |
---|---|---|
unbounded | any | unbounded |
bounded value M | sum or pass-fail | (M - S)/(A + T) |
bounded value M | min | M - S |
where the group has T
test cases, A
subgroups without a provided score
, and whose other subgroups have maximum scores that sum to S
. It is a judge error if S > M
. This formula evenly distributes a group's leftover maximum points to its test cases and subgroups with unspecified maximum score.
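The table and formula can be turned into a small helper such as the one below. The function name and argument layout are illustrative, with `None` standing for a subgroup without a provided score.

```python
def inferred_max_score(group_max, aggregation, num_test_cases, subgroup_scores):
    """Maximum score inferred for each test case or unscored subgroup.

    group_max: a number M, or the string "unbounded".
    subgroup_scores: one entry per subgroup; None means no score was provided.
    """
    if group_max == "unbounded":
        return "unbounded"
    S = sum(s for s in subgroup_scores if s is not None)  # provided scores
    A = sum(1 for s in subgroup_scores if s is None)      # unscored subgroups
    T = num_test_cases
    if S > group_max:
        raise ValueError("judge error: S > M")
    if aggregation == "min":
        return group_max - S
    if A + T == 0:
        return None  # nothing left to distribute points to
    return (group_max - S) / (A + T)
```

For a group with M = 100, two test cases, one subgroup scored 40 and two unscored subgroups, sum aggregation gives (100 - 40)/(2 + 2) = 15 per test case or unscored subgroup.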
Scoring Test Cases
The score of a failed test case is always 0. By default, the score of an accepted test case is its maximum score, computed as described above. A custom output validator may produce a score.txt
file for a test case:
- For test cases in a group with bounded maximum score, score.txt must contain a single floating-point number in the range [0,1]. The score of the test case is this number multiplied by the test case maximum score.
- For test cases in unbounded groups, score.txt must contain a non-negative floating-point number. The score of the test case is that number.
It is a judge error if an output validator accepts a test case in an unbounded group and does not produce a score.txt
. It is also a judge error if an output validator produces a score.txt
for a test case in a group with pass-fail
aggregation.
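These rules might be expressed as in the sketch below. The function name is hypothetical, `score_txt` stands for the contents of score.txt (or `None` if the validator did not write one), and the pass-fail aggregation check is left out for brevity.

```python
def score_of_test_case(accepted, max_score, score_txt=None):
    """Score of one test case; max_score is a number or "unbounded"."""
    if not accepted:
        return 0.0  # a failed test case always scores 0
    if max_score == "unbounded":
        if score_txt is None:
            raise ValueError("judge error: accepted without score.txt in unbounded group")
        value = float(score_txt)
        if value < 0:
            raise ValueError("judge error: negative score")
        return value
    if score_txt is None:
        return float(max_score)  # default: the full maximum score
    fraction = float(score_txt)
    if not 0 <= fraction <= 1:
        raise ValueError("judge error: score.txt outside [0,1]")
    return fraction * max_score
```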
Scoring Test Groups
The score of a test group is determined by its subgroups and test cases. If it has no subgroups or test cases, then its score is 0. Otherwise, the score depends on the aggregation mode, which is either pass-fail
, sum
, or min
. If a group uses pass-fail
aggregation, the group must have bounded maximum score and all subgroups must also use pass-fail aggregation. If the submission receives an accept verdict for all test cases in the group and its subgroups, the score of the group is equal to its maximum possible score. Otherwise the group score is 0. If a group uses sum
aggregation, the group score is the sum of the scores of its test cases and subgroups. If a group uses min
aggregation, then the group score is the minimum of these scores.
The submission score is the score of the secret
group.
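The three aggregation modes can be sketched recursively. The dictionary layout used for groups here is a hypothetical in-memory representation, not part of the package format.

```python
def all_accepted(group):
    """True if every test case in the group and its subgroups was accepted."""
    return (all(ok for ok, _ in group["cases"])
            and all(all_accepted(g) for g in group["groups"]))


def group_score(group):
    """Score of a group given per-case (accepted, score) pairs and subgroups."""
    if not group["cases"] and not group["groups"]:
        return 0
    if group["aggregation"] == "pass-fail":
        return group["max_score"] if all_accepted(group) else 0
    scores = [score for _, score in group["cases"]]
    scores += [group_score(g) for g in group["groups"]]
    return sum(scores) if group["aggregation"] == "sum" else min(scores)


# Hypothetical layout: secret uses sum over two pass-fail subgroups.
group1 = {"aggregation": "pass-fail", "max_score": 40,
          "cases": [(True, 20), (True, 20)], "groups": []}
group2 = {"aggregation": "pass-fail", "max_score": 60,
          "cases": [(True, 30), (False, 0)], "groups": []}
secret = {"aggregation": "sum", "max_score": 100,
          "cases": [], "groups": [group1, group2]}
submission_score = group_score(secret)
```

Here the first subgroup contributes its full 40 points and the second contributes 0 because one of its test cases failed.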
Required Dependent Groups
A group may specify that it should only be scored if a submission is accepted for another test case or all test cases in another test data group and their dependencies. Otherwise, none of the group's test cases are judged and the group score is 0.
The paths of these required test cases or groups, relative to the data/ folder, are listed under the require-pass key. The path of a group, relative to the data/ folder, must come later lexicographically than the paths of all test cases and groups that it requires.
Each required group must be either sample or a subgroup of secret with pass-fail aggregation.
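The ordering constraint on require-pass paths can be checked mechanically. The sketch below is an illustration only: the function name is hypothetical, and it assumes that "lexicographically" means plain code-point comparison of the path strings, which this section does not spell out.

```python
from typing import List


def misordered_requirements(group_path: str, require_pass: List[str]) -> List[str]:
    """Return the required paths that do not sort strictly before group_path.

    Both group_path and the entries of require_pass are paths relative to
    the data/ folder. An empty result means the ordering rule is satisfied.
    """
    return [path for path in require_pass if not path < group_path]
```

With the hypothetical paths secret/group2 requiring sample and secret/group1, the result is empty; a group secret/group1 that requires secret/group2 would be reported as misordered.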
Static Validator
Overview
A static validator is a program that is given the submission files as input and can analyze the contents to accept or reject the submission. Optionally, the static validator may assign a score to the submission. By default there is no static validator. A static validator may be provided under the static_validator directory, similar to a custom output validator.
Static Validation Test Cases
Each test group may define a static validation test case. It is an error to define static validation test cases without providing a static validator. A static validation test case is defined within a group's testdata.yaml file by specifying the key static_validation. Its value is either a boolean or a map:
- false means there is no static validation.
- true means that static validation is enabled with no score defined and no arguments.
- A map may have two keys: args and, in the case of scoring test groups, score. The key args maps to a string which represents the additional arguments passed to the static validator in this group's static validation test case. The key score maps to a float which represents both the maximum score achievable for the static validation test case and the default score assigned in case the static validator accepts the submission for that test case. The static validator can override this score by outputting a value to score.txt in the feedback directory, the same as an output validator.
Aggregation is then applied to the static validation test case in the same manner as other test cases. It is an error to assign a score in a pass-fail test group. It is an error to not assign a score in a min or sum test group. It is also an error to provide a static validator for submit-answer type problems.
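One way to read the rules for the static_validation key is sketched below. This is an interpretation, not a definitive implementation: the function name is hypothetical, the value is assumed to have already been loaded from testdata.yaml into a Python bool or dict, and a missing args key is assumed to mean no additional arguments.

```python
from typing import Any, Optional, Tuple


def parse_static_validation(
    value: Any, aggregation: str
) -> Optional[Tuple[str, Optional[float]]]:
    """Return None if static validation is disabled, else (args, score)."""
    if value is False:
        return None  # no static validation for this group
    if value is True:
        args, score = "", None  # enabled, no arguments, no score
    elif isinstance(value, dict):
        args = value.get("args", "")  # assumption: absent args means none
        raw = value.get("score")
        score = None if raw is None else float(raw)
    else:
        raise ValueError("static_validation must be a boolean or a map")
    if aggregation == "pass-fail" and score is not None:
        raise ValueError("a score must not be assigned in a pass-fail group")
    if aggregation in ("min", "sum") and score is None:
        raise ValueError("a score must be assigned in a min or sum group")
    return args, score
```

Under this reading, the value true is only usable in pass-fail groups, since min and sum groups require a score.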
Invocation
When invoked, the static validator is passed at least three command-line parameters.
It should be possible to invoke the validator as follows on the command line:
<static_validator_program> language entry_point feedback_dir [additional_arguments]
The meanings of the parameters listed above are:
- language: a string specifying the code of the language of the submission as shown in the languages table.
- entry_point: a string specifying the entry point, that is, a filename, class name, or some other identifier, which the static validator should know how to use depending on the language of the submission.
- feedback_dir: a string which specifies the name of a "feedback directory" in which the validator can produce "feedback files" in order to report additional information on the validation of the submission. The feedback_dir must end with a path separator (typically '/' or '\' depending on operating system), so that simply appending a filename to feedback_dir gives the path to a file in the feedback directory.
- additional_arguments: in case the static validation test case specifies additional args, these are passed as additional arguments to the validator on the command line.
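The argument handling described above might look as follows in a static validator written in Python. This is a sketch under stated assumptions: it performs no real analysis of the submission, the exit codes 42 (accept) and 43 (reject) are assumed to follow the output validator convention and are not stated in this section, and judgemessage.txt is assumed to be an acceptable feedback file name.

```python
from typing import List, Tuple

EXIT_ACCEPT = 42  # assumption: same convention as output validators
EXIT_REJECT = 43  # assumption: same convention as output validators


def parse_args(argv: List[str]) -> Tuple[str, str, str, List[str]]:
    """Split argv into language, entry_point, feedback_dir, and extra arguments."""
    if len(argv) < 4:
        raise ValueError("expected: language entry_point feedback_dir [additional_arguments]")
    language, entry_point, feedback_dir = argv[1:4]
    return language, entry_point, feedback_dir, argv[4:]


def main(argv: List[str]) -> int:
    language, entry_point, feedback_dir, extra = parse_args(argv)
    # feedback_dir ends with a path separator, so plain concatenation
    # gives the path to a file inside the feedback directory.
    with open(feedback_dir + "judgemessage.txt", "w", encoding="utf-8") as f:
        f.write(f"language={language} entry_point={entry_point} extra={extra}\n")
    # A real validator would inspect the submission files here and
    # return EXIT_REJECT when the submission violates its rules.
    return EXIT_ACCEPT

# A real validator would finish with: import sys; sys.exit(main(sys.argv))
```

Concatenating feedback_dir and the file name directly, rather than joining path components, relies on the trailing path separator that the format requires.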