Open Source Labeling Best Practices

A proper labeling scheme is one of the best ways to keep an open source project on GitHub organized. Labels serve as a helpful visual indicator, improve discoverability, and enable automated analytics and reporting.

Many projects don't reap the benefits of proper labeling because it's difficult to come up with a comprehensive labeling scheme and even harder to enforce one.

In this blog post, we'll look at a few concepts and examples that should help.

Label Categories

A high-quality labeling scheme starts with high-quality label categories.

A label category is a way to group labels together. On platforms like GitHub that don't support hierarchical labels, the label category is typically the prefix of the label.

For example, a web app might have an area category indicating an area of the application with individual labels such as area:frontend, area:backend, area:infra, area:docs, etc.

Label categories make it easy to quickly understand the impact and scope of an issue or pull request. Note, however, that not every label needs a category (e.g. good first issue), and not every issue needs a label from each label category within a project.
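
The prefix convention is easy to handle in tooling. Below is a minimal Python sketch, assuming a colon separator as in the area: examples above; the function names split_label and group_by_category are hypothetical and not part of any GitHub API. Labels without a prefix, such as good first issue, are kept with a category of None.

```python
from typing import Dict, List, Optional, Tuple


def split_label(label: str, separator: str = ":") -> Tuple[Optional[str], str]:
    """Split a label into (category, name).

    The category is None when the label has no prefix (assumed convention).
    """
    category, found, name = label.partition(separator)
    if not found or not category.strip() or not name.strip():
        # No separator, or one side is empty: treat as an uncategorized label.
        return None, label
    return category.strip(), name.strip()


def group_by_category(labels: List[str]) -> Dict[Optional[str], List[str]]:
    """Group label names under their category prefix."""
    grouped: Dict[Optional[str], List[str]] = {}
    for label in labels:
        category, name = split_label(label)
        grouped.setdefault(category, []).append(name)
    return grouped
```

Projects that use a different separator (for example a dash) can pass it through the separator parameter.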

Example: Element-Web

Element (https://github.com/element-hq) is an open-source, decentralized collaboration platform built on the Matrix protocol.

Element has a large community of contributors, and their repositories give an example of what a mature labeling scheme looks like for large, active projects.

The labels below are pulled directly from Element's Issue Labeling wiki, and are considered the "core" part of the labeling scheme they strive to apply to every issue:

    • Type (every issue is assigned a type):
      • T-Defect: Bugs, crashes, hangs, vulnerabilities, or other problems
      • T-Enhancement: New features, changes in functionality, improvements
      • T-Task: Refactoring, enabling / disabling functionality, other tasks
      • T-Other: Questions, user support, anything else
    • Severity (only issues labeled T-Defect are also assigned a severity):
      • S-Critical: Prevents work, causes data loss and/or has no workaround
      • S-Major: Severely degrades major functionality or product features
      • S-Minor: Impairs non-critical functionality, suitable workarounds exist
      • S-Tolerable: Low/no impact on users
    • Occurrence (all issues labeled T-Defect are also assigned a prevalence):
      • O-Frequent: Affects or can be seen by most users regularly or impacts most users
      • O-Occasional: Affects or can be seen by some users regularly or most users rarely
      • O-Uncommon: Most users are unlikely to come across this, or it involves an unexpected workflow. This label may also be used for other types of issues.
    • Area
      • Most issues are assigned one or several "areas"
      • Uses one of the many A- prefixed labels,
        • e.g. A-Composer or A-Spaces
      • Each area label maps to a group of features or portion of the UI surface in the app
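
The rules in the list above (every issue gets a type, and defects also get a severity and an occurrence) can be checked mechanically. The following is a rough Python sketch of such a check, not Element's actual tooling; the function name check_element_labels and the wording of the messages are assumptions made for illustration.

```python
from typing import List


def check_element_labels(labels: List[str]) -> List[str]:
    """Return a list of problems found in one issue's labels (empty if none)."""
    types = [label for label in labels if label.startswith("T-")]
    severities = [label for label in labels if label.startswith("S-")]
    occurrences = [label for label in labels if label.startswith("O-")]

    problems = []
    if not types:
        problems.append("missing a T- (type) label")
    if "T-Defect" in types:
        # Defects are expected to carry both a severity and an occurrence.
        if not severities:
            problems.append("T-Defect issues need an S- (severity) label")
        if not occurrences:
            problems.append("T-Defect issues need an O- (occurrence) label")
    elif severities:
        # Severity is described as applying only to defects.
        problems.append("S- labels are only for T-Defect issues")
    return problems
```

Area labels are not checked here because the scheme says most, not all, issues are assigned an area.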

Example: Creative Commons

Creative Commons is a non-profit that offers licenses and legal tools for freely sharing work. The organization maintains several open-source libraries and has a well-defined labeling guide for those projects.

Here are their main label categories as outlined in their Repository Labels guide:

    • Priority: derived from a combination of urgency and importance
    • Status: determines whether the issue is ready for work or not
    • Goal: the end result achieved when the issue is resolved (fix, enhancement)
    • Aspect: the side of the project that the issue deals with
    • Skill: the technical skills a person needs to work on the issue
    • Talk: issues with interaction labels do not entail any work to be done on the repo (e.g. Q&A / discussion)
    • Friendliness: how open the issue is to contributions from the community (e.g. good first issue, help wanted)

Case Study: Continue.dev

Continue.dev is an open-source AI code assistant that allows users to connect any models and any context to create custom autocomplete and chat experiences inside the IDE.

At Dosu, we had the pleasure of working with the Continue team to come up with a new labeling scheme for their open source repository. Previously, they had used GitHub's default labels with a couple of additions but wanted something more robust before onboarding Dosu for auto-labeling.

In the end, we decided on 4 label categories for the repository:

  • area: Indicates the area of the application the issue concerns
  • kind: Indicates the type of issue being opened
  • ide: Indicates which of the two supported IDEs the issue is relevant to
  • priority: Indicates how important the issue is

Looking at Continue's issues today, you can see that the labeling scheme we came up with is still in active use.

Our Recommendation

Having worked with a number of different projects regarding labeling schemes, we've come up with the following general guidelines.

"Always-Include" Label Categories

There are a few categories that are relevant for almost any repository:

  • Type / Kind
    • What kind of issue is this?
    • Some of these labels are included in the GitHub default set
    • Labels:
      • type:bug / kind:bug
      • type:question / kind:question
      • type:enhancement / kind:enhancement
  • Priority / Severity
    • How important is this issue?
    • Labels:
      • P0 / severity:critical
      • P1 / severity:major
      • P2 / severity:minor
      • P3 / severity:tolerable
  • Area
    • What part of the product does this affect?
    • Labels:
      • Project dependent
      • Examples: area:backend, area:documentation
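
To turn these categories into actual labels, it can help to define the scheme once as data and generate the labels from it. The sketch below builds one dictionary per label with the name, color, and description fields that GitHub's REST API accepts when creating a label. The colors and descriptions are placeholder assumptions, the area entries are examples only, and the code deliberately stops short of sending any requests.

```python
from typing import Dict, List

# Hypothetical scheme definition: one color per category, one description per label.
SCHEME = {
    "type": {
        "color": "d73a4a",
        "labels": {
            "bug": "Something is not working",
            "question": "Further information is requested",
            "enhancement": "New feature or improvement",
        },
    },
    "severity": {
        "color": "fbca04",
        "labels": {
            "critical": "Prevents work or has no workaround",
            "major": "Severely degrades major functionality",
            "minor": "Impairs non-critical functionality",
            "tolerable": "Low or no impact on users",
        },
    },
    "area": {
        "color": "0e8a16",
        "labels": {
            "backend": "Affects the backend",
            "documentation": "Affects the documentation",
        },
    },
}


def build_label_payloads(scheme: Dict[str, dict]) -> List[Dict[str, str]]:
    """Flatten the scheme into one payload dict per prefixed label."""
    payloads = []
    for category, spec in scheme.items():
        for name, description in spec["labels"].items():
            payloads.append(
                {
                    "name": f"{category}:{name}",
                    "color": spec["color"],  # hex color without a leading '#'
                    "description": description,
                }
            )
    return payloads
```

Each resulting dictionary could then be serialized to JSON and sent to the repository's labels endpoint by whatever client the project already uses.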

Supplemental Label Categories

Additional label categories can help narrow scope even further. Some examples are:

  • Platform
    • What environment is this issue relevant to?
    • Examples: platform:ubuntu, platform:cloud
  • Integration
    • What integration is this issue relevant to?
    • Examples: integration:github, integration:linear
  • Team
    • What team is this issue relevant to?

Happy Labeling

Labels are an important aspect of maintaining any GitHub repository, whether open source or internal.

Getting started is the hardest part, but hopefully after reading this it's a little bit easier.

Once you get started, the next challenge is enforcing your labeling scheme over time. If you want to enforce it automatically, check out Dosu's free auto-labeling feature.
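
For a sense of what manual enforcement involves, here is a small Python sketch of an audit that flags issues missing a label from a required category. It assumes a project chooses to require the type, severity, and area categories on every issue, which is a stricter policy than many projects need. The input format (a list of dictionaries with number and labels keys) is hypothetical and simplified compared to what the GitHub API returns.

```python
from typing import Dict, Iterable, List, Sequence

# Assumed policy: categories every issue must have. Adjust per project.
REQUIRED_CATEGORIES = ("type", "severity", "area")


def missing_categories(
    labels: Iterable[str], required: Sequence[str] = REQUIRED_CATEGORIES
) -> List[str]:
    """Return the required categories that have no label on this issue."""
    present = {label.split(":", 1)[0] for label in labels if ":" in label}
    return [category for category in required if category not in present]


def audit(issues: Iterable[dict]) -> Dict[int, List[str]]:
    """Map issue number to its missing categories, skipping compliant issues."""
    report = {}
    for issue in issues:
        missing = missing_categories(issue["labels"])
        if missing:
            report[issue["number"]] = missing
    return report
```

Running a check like this on a schedule gives a list of issues to triage, though it only detects gaps and does not choose the right label.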

Happy labeling!