Artifact Evaluation
Reproducibility of experimental results is crucial to foster an atmosphere of trustworthy, open, and reusable research. To improve and reward reproducibility, QEST+FORMATS 2024 includes a dedicated Artifact Evaluation (AE). An artifact is any additional material (software, data sets, machine-checkable proofs, etc.) that supports the claims made in the paper and, in the ideal case, makes them fully replicable. In the case of a tool, a typical artifact consists of the binary or source code of the tool, its documentation, the input files (e.g., models analyzed or input data) used for the tool evaluation in the paper, and a configuration file or document describing the parameters used to obtain the results.
Submission of an artifact is mandatory for tool papers, and optional – but encouraged – for research papers if it can support the results presented in the paper. Artifacts will be reviewed concurrently with the corresponding papers. The results of the artifact evaluation will be taken into consideration in the paper reviewing discussion. However, the primary goal of the artifact evaluation is to give positive feedback to authors and to encourage and reward replicable research.
Benefits for Authors: By providing an artifact supporting experimental claims, authors increase the confidence of readers in their contribution. Accepted papers with a successfully evaluated artifact will receive a badge to be included on the paper’s title page. Finally, artifacts that significantly exceed expectations may receive an Outstanding Artifact Award.
Important Dates
All dates are AoE
- Artifact submission deadline: April 22, 2024
- Review phase I: April 22-26, 2024
- Author response period: April 26 – May 1, 2024
- Review phase II: May 1-18, 2024
- Author notification (both paper and artifact): June 4, 2024 (updated from May 31, 2024)
Phases are explained below.
Evaluation Criteria
The goal of this initiative is to encourage research that is openly accessible and remains reproducible in the future (time-proof). The Artifact Evaluation Committee (AEC) will assign a score to the artifact based on the notion of reproducibility as detailed in the ACM badging policy. The AEC will focus on both “Functional” and “Available” aspects by evaluating:
- consistency with the results in the paper, and their replicability (are the observations consistent with the paper?),
- completeness (which proportion of the experiments in the paper can be replicated?),
- quality of documentation and ease of use (can non-experts produce and interpret the results?),
- availability (can the artifact be publicly accessed?),
- future-proofness (is it reasonable to assume that the results can still be reproduced in five years’ time?).
For example, artifacts that need to download third-party material from a private sharing link (e.g., a Dropbox link or a private webpage) will not be considered future-proof.
Evaluation Process
The artifact evaluation is single blind. This in particular means that the artifact does not need to be anonymized.
The evaluation consists of two phases: the smoke-test phase (Phase I) and the full-review phase (Phase II), which proceed as follows.
- Phase I: Reviewers will download artifacts, read the instructions, and attempt to run some minimal experiments. They will not try to verify any claims of the paper, but will merely check for technical issues. Any issues that arise will be communicated to the authors, and authors may update their submission to fix these problems.
- Phase II: Submissions are closed for modification by the authors and the actual reviewing process begins. In case of unexpected technical problems, authors might be contacted by the AEC chair.
Submission
An artifact submission consists of
- an abstract summarizing the artifact and its relation to the paper (in particular, which experiments can be reproduced, and why the others cannot be reproduced, if any),
- additional information, including
- the platform on which the artifact has been prepared and tested,
- how much time the overall evaluation takes (roughly),
- special hardware requirements (if any),
- whether network access is required by the artifact, and
- any further information deemed relevant,
- the .pdf of the submitted paper, and
- a link to the actual artifact to be reviewed.
Submissions shall be created through EasyChair.
The Artifact Itself
In the spirit of reproducibility and future-proofness, some requirements are imposed on the actual artifact. In case any of these points cannot be implemented, e.g., due to the use of licensed software that cannot be distributed, please contact the AEC chairs as soon as possible to discuss specific arrangements.
The artifact must contain the following.
- A README file, describing in clear and simple steps how to install and use the artifact, and how to replicate the results in the paper.
- If applicable, the README file should provide a “toy example” to easily check the setup during Phase I.
- In case network access is required by the artifact, an explanation of when and why it is required should be provided.
- A LICENCE file, which at the very least allows the AEC to download and execute the artifact.
- The concrete binaries as either a Docker image (preferred) or a VM image, containing everything that is needed to run the artifact.
  - For Docker: Include the complete image saved with docker save (potentially compressed with, e.g., gzip); a command sketch is given below.
  - For VM: Use VirtualBox and save the VM as an Open Virtual Appliance (OVA) file.
  Including instructions and sources to build the tool is strongly encouraged, but does not replace providing the complete image.
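For illustration only, the packaging steps might look roughly as follows; the image name, VM name, and file names are placeholders, not prescribed by the AEC.

    # Hypothetical packaging sketch; "mytool:v1.0" and all file names are placeholders.
    # Export the complete Docker image and compress it:
    docker save mytool:v1.0 | gzip > mytool-v1.0.tar.gz
    # Reviewers can restore it with:
    gunzip -c mytool-v1.0.tar.gz | docker load
    # For a VirtualBox VM, export the machine as an OVA file instead:
    VBoxManage export "MyToolVM" --output mytool-v1.0.ova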
The artifact must be made available through archive-quality storage (Zenodo, Figshare, etc.) that provides a citable DOI.
The artifact must not require any financial cost to be evaluated (e.g., by requiring execution on a paid cloud service).
It is recommended (but not required)
- to provide the logs and other files used to obtain the results of the paper,
- to include push button scripts to simplify the artifact evaluation process,
- that the artifact is self-contained and does not require network access,
- to run independent tests before the submission, and
- for Docker-based submissions, to include concrete bind mounts (e.g., -v $(pwd)/result:/results:rw) in the README instructions to simplify extracting all results (a sketch of such an invocation follows this list).
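As a sketch only, with the image name, script, and paths purely hypothetical, such a run command could look like this:

    # Hypothetical invocation; "mytool:v1.0" and "run_all.sh" are placeholders.
    # Everything the container writes to /results appears in ./result on the host.
    mkdir -p result
    docker run --rm -v "$(pwd)/result:/results:rw" mytool:v1.0 ./run_all.sh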
In general, it should be as simple as possible for reviewers to conclude reproducibility.
Sources and Reusability
Authors are also encouraged to include all sources, dependencies, and instructions needed to modify and/or (re-)build the artifact (e.g., through tarballs and a Dockerfile). This may, of course, rely on network access. We recommend striving to be as self-contained as possible. In particular, the artifact should contain as many of its dependencies as reasonably possible, and any downloads should only refer to stable sources (e.g., use a standard Debian Docker image as base) and precise versions (e.g., a concrete commit or tag in a GitHub repository or a Docker image version instead of an unspecific “latest”). This maximizes the chances that the tool can still be modified and built in several years.
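A minimal Dockerfile sketch along these lines, with all names, versions, and the commit hash purely hypothetical, might look as follows:

    # Hypothetical Dockerfile sketch: pin the base image and sources to exact versions.
    FROM debian:12.5
    RUN apt-get update && apt-get install -y --no-install-recommends \
        git build-essential && rm -rf /var/lib/apt/lists/*
    # Check out a concrete commit instead of an unspecific "latest".
    RUN git clone https://github.com/example/mytool.git /opt/mytool \
     && cd /opt/mytool \
     && git checkout 1a2b3c4d5e6f \
     && make
    WORKDIR /opt/mytool

Pinning the base image tag and the commit hash keeps the build reproducible even if upstream repositories move on.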
Badges
As indicated, papers with a successfully evaluated artifact will place a badge on their title page in the camera-ready version. Note that the paper must clearly indicate the exact artifact version used for the evaluation, i.e., the DOI used for the submission. Of course, the paper may additionally link to the most recent version and/or source code repositories.
Sample LaTeX code to place the badge will be provided.
Outstanding Artifact Award
AEC members will nominate artifacts that significantly exceed expectations for an Outstanding Artifact Award. The AEC chairs will consider these nominations and may grant the award to no artifact, to one, or to several. Awardees will receive a certificate during the social event.