Skip to content

Contributing

GitHub Merge Requests

Contributions to OFRAK should be made as a Pull Request on Github.

Pre-commit

OFRAK uses pre-commit to run automated tools on the code base before every commit.

Install pre-commit with the following commands:

pip3 install --user pre-commit
pre-commit install

Now each git commit will be preceded with a run of all the tools. If some of them fail, the commit will not proceed.

You can manually trigger a run of the tools on the current state of your code with:

pre-commit run --all-files

See the file .pre-commit-config.yaml for more details.

Python

PEP 8

OFRAK code should comply with PEP 8, with the following exceptions and elaborations:

  • The character line limit is 100
  • PEP 8 compliance is enfoced by black
  • PEP 8 compliance is not enforced on autogenerated files

Black

Black is an "uncompromising Python code formatter" that is PEP 8 compliant. See its documentation for information on how to install and run black manually. Black reads configuration information from the pyproject.toml file. The following is an example of a pyproject.toml file:

[tool.black]
line-length = 100

Of note is the fact that a .git-blame-ignore-revs file can be used to instruct git to ignore commits. This feature, which is supported in Git versions since 2.23, is useful for ignoring commits used to adhere to black. This feature can be configured on the command line using git config blame.ignoreRevsFile .git-blame-ignore-revs. See the black documentation for more details.

Type Annotations and MyPy

Whenever appropriate, OFRAK code should strive to use type annotations (type hints). Type annotations make code easier to read for humans; they help speed up development and refactoring since they are recognized by IDEs such as PyCharm; and they allow the use of static analysis tools such as mypy to find errors and bugs. Detailed information about how to properly use type annotations can be found in PEP 484.

MyPy is a static type checker for Python.

Raising Exceptions

When raising exceptions, the exception should always be instantiated: raise NotImplementedError(), not raise NotImplementedError. This standard is intentionally stricter than the Python documentation for raise, which states that a raise statement can take either a class or instances as its argument.

It is worth reviewing the above-referenced documentation for the raise statement and how it can be used with a from statement for exception chaining.

Cryptography

Any cryptographic operations should be performed using the pyca/cryptography library.

Docstrings

Public-facing classes, methods, and functions should contain docstrings that describe their usage.

We use mkdocstrings to generate code documentation. The following conventions are followed to keep this generated code documentation readable:

Functions and Methods

Public-facing functions and methods should have docstrings. Thus, each interface or abstract class should have a docstring on each of its methods. This docstring should not be duplicated in implementations, unless the implementation has unique behavior which could be of concern to the user (perhaps a significant side effect, or additional errors are raised).

As mentioned above, docstrings should use Sphinx conventions, with the following modifications:

  • When type information is already present via type annotations, the :type: lines should be removed from the docstring
  • The description should start on the next line after the triple quotes, instead of on the same line
  • Each error that can be raised should be documented and described:
    • Describe all the errors raised within the body of the function itself
    • Use your judgement for (uncaught) errors that are raised deeper in the call graph below this function. It is not necessary to list all of them. Particular consideration should be given to errors which are raised as a direct result of arguments to the top level function (i.e. if those arguments are passed to another function which will raise an error if its own arguments are outside a certain range).
  • Do describe the return value
  • Leave one blank line between the function description, params description, raises description, and return description.
  • Where possible the descriptions for params, raises, and returns should be kept to a single sentence and should not end in a period
    • If multiple sentences are needed to describe one of these, consider adding to the function's general description instead, as long as each description is still clear enough on its own
    • When following this convention, the descriptions for raises and return should read as a normal sentence, e.g.: :return: True if the method succeeds, False otherwise.
  • When referring to named code artifacts, use a single back-tick (`) around that name

The following is an example of a docstring that complies with this specification:

def my_method(self, arg1: int, arg2: float) -> bool
    """
    Performs some task, and returns the result.

    :param arg1: an integer value indexing (something)
    :param arg2: a scalar value applied to (something)

    :raises NotFoundError: if the data needed to run the method are not present
    :raises ValueError: if `arg2` is less than 0 or greater than 1

    :return: True if the method succeeds, False otherwise
    """

Classes

Classes should have a docstring below the class definition describing the class. An exception to this are dataclasses, since their fields should usually be self-explanatory.

The class's constructor should not have a docstring. If the arguments to the constructor are not obvious or intuitive, they should be described in the class-level docstring:

class MyClass:
    """
    Encapsulates some really cool data and behavior.

    :param arg1: Part of the cool data.
    :param arg2: A cool greeting to preface the data. Defaults to 'Hello'
    """

    def __init__(self, arg1: str, arg2: str = "Hello"):
        ...

Adding new packages to OFRAK build

The build script build_image.py expects a config file similar to ofrak-core-dev.yml. Each of the packages listed under packages_paths in the YAML files should correspond to a directory containing two files: Makefile and Dockerstub. They may also contain a Dockerstage file for multi-stage builds.

Imagine we are adding a new package with the following structure:

ofrak_package_x
 |--Dockerstub
 |--Makefile
 |--setup.py
 |--ofrak_package_x_python_module
     |...
 |--ofrak_package_x_python_module_test
     |...

Makefile

At a minimum, an OFRAK package Makefile should contain the following targets:

  • install: a phony target that installs the package
  • develop: a phony target that installs the package in editable mode (for development)
  • inspect: a phony target that runs static code analysis on the package
  • test: a phony target that:
  • has a dependency on inspect
  • runs the packages tests, recording "term-missing" coverage information
  • uses fun-coverage to assert that the package has 100% function coverage

An example of such a Makefile for ofrak_package_x is:

PYTHON=python3
PIP=pip3

.PHONY: install
install:
    $(PIP) install .

.PHONY: develop
develop:
    $(PIP) install -e .[test]

.PHONY: inspect
inspect:
    mypy

.PHONY: test
test: inspect
    $(PYTHON) -m pytest -n auto --cov=ofrak_package_x_python_module --cov-report=term-missing --cov-fail-under=100 ofrak_package_x_python_module_test
    fun-coverage --cov-fail-under=100

Dockerstub & Dockerstage

Dockerstub should read as a normal Dockerfile, only without a base image specified at the top. This file should contain all of the steps necessary to install this package in a Docker image. During build, all packages' Dockerstubs will be concatenated, so specifying a base image is unnecessary. Also, any specified entrypoint may be overridden.

The build relies on the following assumptions:

  • Dockerstub, Dockerstage, and Makefile should not use any relative paths which go into the parent directory of ofrak_package_x because at build time that parent directory will not be the same.
  • All rules in Makefile should assume the working directory is ofrak_package_x (but at a different path as explained above)
  • Dockerstub and Dockerstage should be written assuming the build context is the parent directory of ofrak_package_x. Do not assume anything is present in the build context besides the contents of ofrak_package_x and what Makefile adds to ofrak_package_x in the dependencies rule.

Static Code Analysis

The repository has several automated static code anaysis workflows.

  1. Pre-commit is used, both as a pre-commit hook and as a CI/CD workflow, to ensure repository-wide:
    1. PEP8 compliance (this is done using black).
    2. No Python modules contain unused imports.
    3. Files are either empty or end with a newline.
  2. Each package contains an inspect target responsible for static code analysis for that package.
    1. As part of this, MyPy is used to perform static type checking.