diff --git a/documentation/content/en/articles/committers-guide/_index.adoc b/documentation/content/en/articles/committers-guide/_index.adoc index afaea189cb..ebaad17afd 100644 --- a/documentation/content/en/articles/committers-guide/_index.adoc +++ b/documentation/content/en/articles/committers-guide/_index.adoc @@ -1,3880 +1,3880 @@ --- title: Committer's Guide authors: - author: The FreeBSD Documentation Project copyright: 1999-2022 The FreeBSD Documentation Project description: Introductory information for FreeBSD committers trademarks: ["freebsd", "coverity", "git", "github", "gitlab", "ibm", "intel", "general"] weight: 25 tags: ["FreeBSD Committer's Guide", "Guide", "Community"] --- = Committer's Guide :doctype: article :toc: macro :toclevels: 1 :icons: font :sectnums: :sectnumlevels: 6 :source-highlighter: rouge :experimental: :images-path: articles/committers-guide/ ifdef::env-beastie[] ifdef::backend-html5[] include::shared/authors.adoc[] include::shared/mirrors.adoc[] include::shared/releases.adoc[] include::shared/attributes/attributes-{{% lang %}}.adoc[] include::shared/{{% lang %}}/teams.adoc[] include::shared/{{% lang %}}/mailing-lists.adoc[] include::shared/{{% lang %}}/urls.adoc[] :imagesdir: ../../../images/{images-path} endif::[] ifdef::backend-pdf,backend-epub3[] include::../../../../shared/asciidoctor.adoc[] endif::[] endif::[] ifndef::env-beastie[] include::../../../../../shared/asciidoctor.adoc[] endif::[] [.abstract-title] Abstract This document provides information for the FreeBSD committer community. All new committers should read this document before they start, and existing committers are strongly encouraged to review it from time to time. Almost all FreeBSD developers have commit rights to one or more repositories. However, a few developers do not, and some of the information here applies to them as well. (For instance, some people only have rights to work with the Problem Report database.) 
Please see crossref:committers-guide[non-committers, Issues Specific to Developers Who Are Not Committers] for more information.

This document may also be of interest to members of the FreeBSD community who want to learn more about how the project works.

'''

toc::[]

[[admin]]
== Administrative Details

[.informaltable]
[cols="1,1", frame="none"]
|===

|_Login Methods_
|man:ssh[1], protocol 2 only

|_Main Shell Host_
|`freefall.FreeBSD.org`

|_Reference Machines_
|`ref*.FreeBSD.org`, `universe*.FreeBSD.org` (see also link:https://www.FreeBSD.org/internal/machines/[FreeBSD Project Hosts])

|_SMTP Host_
|`smtp.FreeBSD.org:587` (see also crossref:committers-guide[smtp-setup, SMTP Access Setup]).

|`_src/_` Git Repository
|`ssh://git@gitrepo.FreeBSD.org/src.git`

|`_doc/_` Git Repository
|`ssh://git@gitrepo.FreeBSD.org/doc.git`

|`_ports/_` Git Repository
|`ssh://git@gitrepo.FreeBSD.org/ports.git`

|_Internal Mailing Lists_
|developers (technically called all-developers), doc-developers, doc-committers, ports-developers, ports-committers, src-developers, src-committers. (Each project repository has its own -developers and -committers mailing lists. Archives for these lists can be found in the files [.filename]#/local/mail/repository-name-developers-archive# and [.filename]#/local/mail/repository-name-committers-archive# on `freefall.FreeBSD.org`.)

|_Core Team monthly reports_
|[.filename]#/home/core/public/reports# on the `FreeBSD.org` cluster.

|_Ports Management Team monthly reports_
|[.filename]#/home/portmgr/public/monthly-reports# on the `FreeBSD.org` cluster.

|_Noteworthy `src/` Git Branches:_
|`stable/n` (`n`-STABLE), `main` (-CURRENT)
|===

man:ssh[1] is required to connect to the project hosts.
For more information, see crossref:committers-guide[ssh.guide, SSH Quick-Start Guide].
Useful links: * link:https://www.FreeBSD.org/internal/[FreeBSD Project Internal Pages] * link:https://www.FreeBSD.org/internal/machines/[FreeBSD Project Hosts] * link:https://www.FreeBSD.org/administration/[FreeBSD Project Administrative Groups] [[pgpkeys]] == OpenPGP Keys for FreeBSD Cryptographic keys conforming to the OpenPGP (__Pretty Good Privacy__) standard are used by the FreeBSD project to authenticate committers. Messages carrying important information like public SSH keys can be signed with the OpenPGP key to prove that they are really from the committer. See https://nostarch.com/releases/pgp_release.pdf[PGP & GPG: Email for the Practical Paranoid by Michael Lucas] and https://en.wikipedia.org/wiki/Pretty_Good_Privacy[] for more information. [[pgpkeys-creating]] === Creating a Key Existing keys can be used, but should be checked with [.filename]#documentation/tools/checkkey.sh# first. In this case, make sure the key has a FreeBSD user ID. For those who do not yet have an OpenPGP key, or need a new key to meet FreeBSD security requirements, here we show how to generate one. [[pgpkeys-create-steps]] [.procedure] ==== . Install [.filename]#security/gnupg#. Enter these lines in [.filename]#~/.gnupg/gpg.conf# to set minimum acceptable defaults for signing and new key preferences (see the link:https://www.gnupg.org/documentation/manuals/gnupg/GPG-Options.html[GnuPG options documentation] for more details): + [.programlisting] .... # Sorted list of preferred algorithms for signing (strongest to weakest). personal-digest-preferences SHA512 SHA384 SHA256 SHA224 # Default preferences for new keys default-preference-list SHA512 SHA384 SHA256 SHA224 AES256 CAMELLIA256 AES192 CAMELLIA192 AES CAMELLIA128 CAST5 BZIP2 ZLIB ZIP Uncompressed .... . Generate a key: + [source,shell] .... % gpg --full-gen-key gpg (GnuPG) 2.1.8; Copyright (C) 2015 Free Software Foundation, Inc. This is free software: you are free to change and redistribute it. 
There is NO WARRANTY, to the extent permitted by law.

Warning: using insecure memory!
Please select what kind of key you want:
   (1) RSA and RSA (default)
   (2) DSA and Elgamal
   (3) DSA (sign only)
   (4) RSA (sign only)
Your selection? 1
RSA keys may be between 1024 and 4096 bits long.
What keysize do you want? (2048) 2048 <.>
Requested keysize is 2048 bits
Please specify how long the key should be valid.
         0 = key does not expire
      <n>  = key expires in n days
      <n>w = key expires in n weeks
      <n>m = key expires in n months
      <n>y = key expires in n years
Key is valid for? (0) 3y <.>
Key expires at Wed Nov  4 17:20:20 2015 MST
Is this correct? (y/N) y

GnuPG needs to construct a user ID to identify your key.

Real name: Chucky Daemon <.>
Email address: notreal@example.com
Comment:
You selected this USER-ID:
"Chucky Daemon <notreal@example.com>"

Change (N)ame, (C)omment, (E)mail or (O)kay/(Q)uit? o
You need a Passphrase to protect your secret key.
....

<.> 2048-bit keys with a three-year expiration provide adequate protection at present (2022-10).
<.> A three year key lifespan is short enough to obsolete keys weakened by advancing computer power, but long enough to reduce key management problems.
<.> Use your real name here, preferably matching that shown on government-issued ID to make it easier for others to verify your identity.
Text that may help others identify you can be entered in the `Comment` section.
+
After the email address is entered, a passphrase is requested.
Methods of creating a secure passphrase are contentious.
Rather than suggest a single way, here are some links to sites that describe various methods: https://world.std.com/~reinhold/diceware.html[], https://www.iusmentis.com/security/passphrasefaq/[], https://xkcd.com/936/[], https://en.wikipedia.org/wiki/Passphrase[].
====

Protect the private key and passphrase.
If either the private key or passphrase may have been compromised or disclosed, immediately notify mailto:accounts@FreeBSD.org[accounts@FreeBSD.org] and revoke the key.
Committing the new key is shown in crossref:committers-guide[commit-steps, Steps for New Committers]. [[kerberos-ldap]] == Kerberos and LDAP web Password for FreeBSD Cluster The FreeBSD cluster requires a Kerberos password to access certain services. The Kerberos password also serves as the LDAP web password, since LDAP is proxying to Kerberos in the cluster. Some of the services which require this include: * https://bugs.freebsd.org/bugzilla[Bugzilla] To create a new Kerberos account in the FreeBSD cluster, or to reset a Kerberos password for an existing account using a random password generator: [source,shell] .... % ssh kpasswd.freebsd.org .... [NOTE] ==== This must be done from a machine outside of the FreeBSD.org cluster. ==== A Kerberos password can also be set manually by logging into `freefall.FreeBSD.org` and running: [source,shell] .... % kpasswd .... [NOTE] ==== Unless the Kerberos-authenticated services of the FreeBSD.org cluster have been used previously, `Client unknown` will be shown. This error means that the `ssh kpasswd.freebsd.org` method shown above must be used first to initialize the Kerberos account. ==== [[committer.types]] == Commit Bit Types The FreeBSD repository has a number of components which, when combined, support the basic operating system source, documentation, third party application ports infrastructure, and various maintained utilities. When FreeBSD commit bits are allocated, the areas of the tree where the bit may be used are specified. Generally, the areas associated with a bit reflect who authorized the allocation of the commit bit. Additional areas of authority may be added at a later date: when this occurs, the committer should follow normal commit bit allocation procedures for that area of the tree, seeking approval from the appropriate entity and possibly getting a mentor for that area for some period of time. 

[.informaltable]
[cols="1,1,1", frame="none"]
|===
|__Committer Type__ |__Responsible__ |__Tree Components__

|src
|srcmgr@
|src/

|doc
|doceng@
|doc/, ports/, src/ documentation

|ports
|portmgr@
|ports/
|===

Commit bits allocated prior to the development of the notion of areas of authority may be appropriate for use in many parts of the tree.
However, common sense dictates that a committer who has not previously worked in an area of the tree seek review prior to committing, seek approval from the appropriate responsible party, and/or work with a mentor.
Since the rules regarding code maintenance differ by area of the tree, this is as much for the benefit of the committer working in a less familiar area as it is for others working on the tree.

Committers are encouraged to seek review for their work as part of the normal development process, regardless of the area of the tree where the work is occurring.

=== Policy for Committer Activity in Other Trees

* All committers may modify [.filename]#src/share/misc/committers-*.dot#, [.filename]#src/usr.bin/calendar/calendars/calendar.freebsd#, and [.filename]#ports/astro/xearth/files#.
* doc committers may commit documentation changes to [.filename]#src# files, such as manual pages, READMEs, fortune databases, calendar files, and comment fixes without approval from a src committer, subject to the normal care and tending of commits.
* Any committer may make changes to any other tree with an "Approved by" from a non-mentored committer with the appropriate bit. Mentored committers can provide a "Reviewed by" but not an "Approved by".
* Committers can acquire an additional bit by the usual process of finding a mentor who will propose them to srcmgr, doceng, or portmgr, as appropriate. When approved, they will be added to 'access' and the normal mentoring period will ensue, during which their commits will continue to need an "Approved by" for some period.
[[doc-blanket-approval]] ==== Documentation Implicit (Blanket) Approval Some types of fixes have "blanket approval" from the {doceng}, allowing any committer to fix those categories of problems on any part of the doc tree. These fixes do not need approval or review from a doc committer if the author doesn't have a doc commit bit. Blanket approval applies to these types of fixes: * Typos * Trivial fixes + Punctuation, URLs, dates, paths and file names with outdated or incorrect information, and other common mistakes that may confound the readers. Over the years, some implicit approvals were granted in the doc tree. This list shows the most common cases: * Changes in [.filename]#documentation/content/en/books/porters-handbook/versions/_index.adoc# + extref:{porters-handbook}versions/[__FreeBSD_version Values (Porter's Handbook)], mainly used for src committers. * Changes in [.filename]#doc/shared/contrib-additional.adoc# + extref:{contributors}[Additional FreeBSD Contributors, contrib-additional] maintenance. * All link:#commit-steps[Steps for New Committers], doc related * Security advisories; Errata Notices; Releases; + Used by {security-officer} and {re}. * Changes in [.filename]#website/content/en/donations/donors.adoc# + Used by {donations}. Before any commit, a build test is necessary; see the 'Overview' and 'The FreeBSD Documentation Build Process' sections of the extref:{fdp-primer}[FreeBSD Documentation Project Primer for New Contributors] for more details. [[git-primer]] == Git Primer [[git-basics]] === Git basics When one searches for "Git Primer" a number of good ones come up. Daniel Miessler's link:https://danielmiessler.com/study/git/[A git primer] and Willie Willus' link:https://gist.github.com/williewillus/068e9a8543de3a7ef80adb2938657b6b[Git - Quick Primer] are both good overviews. The Git book is also complete, but much longer https://git-scm.com/book/en/v2. 
There is also this website https://dangitgit.com/ for common traps and pitfalls of Git, in case you need guidance to fix things up.
Finally, an introduction link:https://eagain.net/articles/git-for-computer-scientists/[targeted at computer scientists] has proven helpful to some in explaining the Git world view.

This document will assume that you've read through it and will try not to belabor the basics (though it will cover them briefly).

[[git-mini-primer]]
=== Git Mini Primer

This primer is less ambitiously scoped than the old Subversion Primer, but should cover the basics.

==== Scope

If you want to download FreeBSD, compile it from sources, and generally keep up to date that way, this primer is for you.
It covers getting the sources, updating the sources, and bisecting, and touches briefly on how to cope with a few local changes.
It covers the basics, and tries to give good pointers to more in-depth treatment for when the reader finds the basics insufficient.
Other sections of this guide cover more advanced topics related to contributing to the project.

The goal of this section is to highlight those bits of Git needed to track sources.
It assumes a basic understanding of Git.
There are many primers for Git on the web, but the https://git-scm.com/book/en/v2[Git Book] provides one of the better treatments.

[[git-mini-primer-getting-started]]
==== Getting Started For Developers

This section describes the read-write access for committers to push the commits from developers or contributors.

[[git-mini-daily-use]]
===== Daily use

[NOTE]
====
In the examples below, replace `${repo}` with the name of the desired FreeBSD repository: `doc`, `ports`, or `src`.
====

* Clone the repository:
+
[source,shell]
....
% git clone -o freebsd --config remote.freebsd.fetch='+refs/notes/*:refs/notes/*' https://git.freebsd.org/${repo}.git
....
+
Then you should have the official mirrors as your remote:
+
[source,shell]
....
% git remote -v
freebsd  https://git.freebsd.org/${repo}.git (fetch)
freebsd  https://git.freebsd.org/${repo}.git (push)
....

* Configure the FreeBSD committer data:
+
The commit hook in repo.freebsd.org checks that the "Commit" field matches the committer's information in FreeBSD.org.
The easiest way to get the suggested config is to run the `/usr/local/bin/gen-gitconfig.sh` script on freefall:
+
[source,shell]
....
% gen-gitconfig.sh
[...]
% git config user.name (your name in gecos)
% git config user.email (your login)@FreeBSD.org
....

* Set the push URL:
+
[source,shell]
....
% git remote set-url --push freebsd git@gitrepo.freebsd.org:${repo}.git
....
+
You should then have separate fetch and push URLs, which is the most efficient setup:
+
[source,shell]
....
% git remote -v
freebsd  https://git.freebsd.org/${repo}.git (fetch)
freebsd  git@gitrepo.freebsd.org:${repo}.git (push)
....
+
Again, note that `gitrepo.freebsd.org` has been canonicalized to `repo.freebsd.org`.

* Install the commit message template hook:
+
For the doc repository:
+
[source,shell]
....
% cd .git/hooks
% ln -s ../../.hooks/prepare-commit-msg
....
+
For the ports repository:
+
[source,shell]
....
% git config --add core.hooksPath .hooks
....
+
For the src repository:
+
[source,shell]
....
% cd .git/hooks
% ln -s ../../tools/tools/git/hooks/prepare-commit-msg
....

[[admin-branch]]
===== "admin" branch

The `access` and `mentors` files are stored in an orphan branch, `internal/admin`, in each repository.

The following example shows how to check out the `internal/admin` branch to a local branch named `admin`:

[source,shell]
....
% git config --add remote.freebsd.fetch '+refs/internal/*:refs/internal/*'
% git fetch
% git checkout -b admin internal/admin
....

Alternatively, you can add a worktree for the `admin` branch:

[source,shell]
....
% git worktree add -b admin ../${repo}-admin internal/admin
....
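
The extra fetch refspec is what makes `internal/admin` visible locally, because a plain clone only fetches branches and tags.
The following is a minimal local sketch of that mechanism, not the project's procedure: a throwaway repository named `upstream` stands in for the FreeBSD repository, and every path, name, and commit in it is hypothetical.

```shell
#!/bin/sh
# Sketch only: "upstream" is a hypothetical stand-in for a FreeBSD repository.
set -e
tmp=$(mktemp -d)
cd "$tmp"

# Create a stand-in upstream that has a ref outside refs/heads.
git init -q upstream
cd upstream
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "admin files"
git update-ref refs/internal/admin HEAD
cd ..

# A plain clone does not bring refs/internal/* along.
git clone -q -o freebsd upstream work
cd work

# Add the refspec, fetch, and create the local "admin" branch.
git config --add remote.freebsd.fetch '+refs/internal/*:refs/internal/*'
git fetch -q freebsd
git checkout -q -b admin internal/admin

# Show which branch is now checked out.
git rev-parse --abbrev-ref HEAD
```

Git resolves the short name `internal/admin` to `refs/internal/admin`, which is why the checkout command does not need the full ref name.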

To browse the `internal/admin` branch on the web: `https://cgit.freebsd.org/${repo}/log/?h=internal/admin`

For pushing, specify the full refspec:

[source,shell]
....
% git push freebsd HEAD:refs/internal/admin
....

==== Keeping Current With The FreeBSD src Tree
[[keeping_current]]
First step: cloning a tree.
This downloads the entire tree.
There are two ways to download.
Most people will want to do a deep clone of the repository.
However, there are times when you may wish to do a shallow clone.

===== Branch Names

FreeBSD-CURRENT uses the `main` branch.

`main` is the default branch.

For FreeBSD-STABLE, branch names include `stable/12` and `stable/13`.

For FreeBSD-RELEASE, release engineering branch names include `releng/12.4` and `releng/13.2`.

https://www.freebsd.org/releng/[] shows:

* `main` and `stable/⋯` branches open
* `releng/⋯` branches, each of which is frozen when a release is tagged.

Examples:

* tag https://cgit.freebsd.org/src/tag/?h=release/13.1.0[release/13.1.0] on the https://cgit.freebsd.org/src/log/?h=releng/13.1[releng/13.1] branch
* tag https://cgit.freebsd.org/src/tag/?h=release/13.2.0[release/13.2.0] on the https://cgit.freebsd.org/src/log/?h=releng/13.2[releng/13.2] branch.

===== Repositories

Please see the crossref:committers-guide[admin,Administrative Details] for the latest information on where to get FreeBSD sources.
`$URL` below can be obtained from that page.

Note: The project doesn't use submodules as they are a poor fit for our workflows and development model.
How we track changes in third-party applications is discussed elsewhere and generally of little concern to the casual user.

===== Deep Clone

A deep clone pulls in the entire tree, as well as all the history and branches.
It is the easiest to do.
It also allows you to use Git's worktree feature to have all your active branches checked out into separate directories but with only one copy of the repository.

[source,shell]
....
% git clone -o freebsd $URL -b branch [<directory>]
....

This creates a deep clone.
`branch` should be one of the branches listed in the previous section.
If no `branch` is given, the default (`main`) is used.
If no `<directory>` is given, the name of the new directory matches the name of the repo ([.filename]#doc#, [.filename]#ports# or [.filename]#src#).

You will want a deep clone if you are interested in the history, plan on making local changes, or plan on working on more than one branch.
It is the easiest to keep up to date as well.
If you are interested in the history, but are working with only one branch and are short on space, you can also use `--single-branch` to download only that branch (though some merge commits will not reference the merged-from branch, which may matter to users who are interested in detailed versions of history).

===== Shallow Clone

A shallow clone copies just the most current code, but none or little of the history.
This can be useful when you need to build a specific revision of FreeBSD, or when you are just starting out and plan to track the tree more fully.
You can also use it to limit history to only so many revisions.
However, see below for a significant limitation of this approach.

[source,shell]
....
% git clone -o freebsd -b branch --depth 1 $URL [dir]
....

This clones the repository, but only has the most recent version in the repository.
The rest of the history is not downloaded.
Should you change your mind later, you can do `git fetch --unshallow` to get the old history.

[WARNING]
====
When you make a shallow clone, you will lose the commit count in your uname output.
This can make it more difficult to determine if your system needs to be updated when a security advisory is issued.
====

===== Building

Once you've downloaded the sources, building is done as described in the handbook, e.g.:

[source,shell]
....
% cd src
% make buildworld
% make buildkernel
% make installkernel
% make installworld
....

so that won't be covered in depth here.
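
Because a shallow clone hides the commit count, it can help to check what kind of clone you have before building.
The following is a minimal local sketch under stated assumptions: a three-commit throwaway repository named `upstream` stands in for the FreeBSD repository, and a `file://` URL is used because `--depth` has no effect when cloning from a plain local path.

```shell
#!/bin/sh
# Sketch only: "upstream" is a hypothetical three-commit stand-in repository.
set -e
tmp=$(mktemp -d)
cd "$tmp"

git init -q upstream
cd upstream
git config user.name demo
git config user.email demo@example.com
for n in 1 2 3; do
    git commit -q --allow-empty -m "commit $n"
done
git branch -M main
cd ..

# Shallow clone: only the newest commit is downloaded.
git clone -q -o freebsd -b main --depth 1 "file://$tmp/upstream" shallow
cd shallow
git rev-parse --is-shallow-repository   # prints "true"
git rev-list --count HEAD               # prints "1"

# Fetch the remaining history; the commit count becomes meaningful again.
git fetch -q --unshallow freebsd
git rev-parse --is-shallow-repository   # prints "false"
git rev-list --count HEAD               # prints "3"
```

`git rev-parse --is-shallow-repository` needs a reasonably recent Git; on older versions, the presence of a `.git/shallow` file gives the same information.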

If you want to build a custom kernel, extref:{handbook}kernelconfig[the kernel config section, kernelconfig] of the FreeBSD Handbook recommends creating a file MYKERNEL under sys/${ARCH}/conf with your changes against GENERIC.
To have MYKERNEL disregarded by Git, it can be added to .git/info/exclude.

===== Updating

Both types of trees are updated with the same commands.
This pulls in all the revisions since your last update.

[source,shell]
....
% git pull --ff-only
....

will update the tree.
In Git, a 'fast forward' merge is one that only needs to set a new branch pointer and doesn't need to re-create the commits.
By always doing a fast forward merge/pull, you'll ensure that you have an exact copy of the FreeBSD tree.
This will be important if you want to maintain local patches.

See below for how to manage local changes.
The simplest is to use `--autostash` on the `git pull` command, but more sophisticated options are available.

==== Selecting a Specific Version

In Git, `git checkout` checks out both branches and specific versions.
Git's versions are the long hashes rather than a sequential number.

When you check out a specific version, just specify the hash you want on the command line (the `git log` command can help you decide which hash you might want):

[source,shell]
....
% git checkout 08b8197a74
....

and you have that checked out.
You will be greeted with a message similar to the following:

[source,shell]
....
Note: checking out '08b8197a742a96964d2924391bf9fdfeb788865d'.

You are in a 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b <new-branch-name>

HEAD is now at 08b8197a742a hook gpiokeys.4 to the build
....

where the last line is generated from the hash you are checking out and the first line of the commit message from that revision.
The hash can be abbreviated to the shortest unique length.
Git itself is inconsistent about how many digits it displays.

==== Bisecting

Sometimes, things go wrong.
The last version worked, but the one you just updated to does not.
A developer may ask you to bisect the problem to track down which commit caused the regression.

Git makes bisecting changes easy with a powerful `git bisect` command.
Here's a brief outline of how to use it.
For more information, see https://www.metaltoad.com/blog/beginners-guide-git-bisect-process-elimination or https://git-scm.com/docs/git-bisect.
The git-bisect manual page is good at describing what can go wrong, what to do when versions won't build, when you want to use terms other than 'good' and 'bad', etc, none of which will be covered here.

`git bisect start --first-parent` will start the bisection process.
Next, you need to give it a range of commits to search.
`git bisect good XXXXXX` will tell it the working version and `git bisect bad XXXXX` will tell it the bad version.
The bad version will almost always be HEAD (a special tag for what you have checked out).
The good version will be the last one you checked out.
The `--first-parent` argument is necessary so that subsequent `git bisect` commands do not try to check out a vendor branch which lacks the full FreeBSD source tree.

[TIP]
====
If you want to know the last version you checked out, you should use `git reflog`:

[source,shell]
....
5ef0bd68b515 (HEAD -> main, freebsd/main, freebsd/HEAD) HEAD@{0}: pull --ff-only: Fast-forward
a8163e165c5b (upstream/main) HEAD@{1}: checkout: moving from b6fb97efb682994f59b21fe4efb3fcfc0e5b9eeb to main
...
....

shows me moving the working tree to the `main` branch (a816...) and then updating from upstream (to 5ef0...).
In this case, bad would be HEAD (or 5ef0bd68b515) and good would be a8163e165c5b.
As you can see from the output, HEAD@{1} also often works, but isn't foolproof if you have done other things to your Git tree after updating, but before you discover the need to bisect.
====

Set the 'good' version first, then set the bad (though the order doesn't matter).
When you set the bad version, it will give you some statistics on the process:

[source,shell]
....
% git bisect start --first-parent
% git bisect good a8163e165c5b
% git bisect bad HEAD
Bisecting: 1722 revisions left to test after this (roughly 11 steps)
[c427b3158fd8225f6afc09e7e6f62326f9e4de7e] Fixup r361997 by balancing parens.  Duh.
....

You would then build/install that version.
If it's good you'd type `git bisect good` otherwise `git bisect bad`.
If the version doesn't compile, type `git bisect skip`.
You will get a similar message to the above after each step.
When you are done, report the bad version to the developer (or fix the bug yourself and send a patch).
`git bisect reset` will end the process and return you back to where you started (usually tip of `main`).
Again, the git-bisect manual (linked above) is a good resource for when things go wrong or for unusual cases.

[[git-gpg-signing]]
==== Signing the commits, tags, and pushes, with GnuPG

Git knows how to sign commits, tags, and pushes.
When you sign a Git commit or a tag, you can prove that the code you submitted came from you and wasn't altered while you were transferring it.
You also can prove that you submitted the code and not someone else.

More in-depth documentation on signing commits and tags can be found in the https://git-scm.com/book/en/v2/Git-Tools-Signing-Your-Work[Git Tools - Signing Your Work] chapter of the Git book.

The rationale behind signing pushes can be found in the https://github.com/git/git/commit/a85b377d0419a9dfaca8af2320cc33b051cbed04[commit that introduced the feature].

The best way is to simply tell Git you always want to sign commits, tags, and pushes.
You can do this by setting a few configuration variables:

[source,shell]
....
% git config --add user.signingKey LONG-KEY-ID
% git config --add commit.gpgSign true
% git config --add tag.gpgSign true
% git config --add push.gpgSign if-asked
....

// push.gpgSign should probably be set to `yes` once we enable it, or be set with --global, so that it is enabled for all repositories.

[NOTE]
======
To avoid possible collisions, make sure you give a long key id to Git.
You can get the long id with: `gpg --list-secret-keys --keyid-format LONG`.
======

[TIP]
======
To use specific subkeys, and not have GnuPG resolve the subkey to a primary key, attach `!` to the key.
For example, to encrypt for the subkey `DEADBEEF`, use `DEADBEEF!`.
======

===== Verifying signatures

Commit signatures can be verified by running either `git verify-commit <commit hash>`, or `git log --show-signature`.

Tag signatures can be verified with `git verify-tag <tag name>`, or `git tag -v <tag name>`.

////
Commented out for now until we decide what to do.

Git pushes are a bit different, they live in a special ref in the repository.
TODO: write how to verify them
////

==== Ports Considerations

The ports tree operates the same way.
The branch names are different and the repositories are in different locations.

The cgit repository web interface for use with web browsers is at https://cgit.FreeBSD.org/ports/ .
The production Git repository is at https://git.FreeBSD.org/ports.git and at ssh://anongit@git.FreeBSD.org/ports.git (or `anongit@git.FreeBSD.org:ports.git`).

There is also a mirror on GitHub, see extref:{handbook}mirrors[External mirrors, mirrors] for an overview.
The _latest_ branch is `main`.
The _quarterly_ branches are named `yyyyQn` for year 'yyyy' and quarter 'n'.

[[port-commit-message-formats]]
===== Commit message formats

A hook is available in the ports repository to help you write up your commit messages in https://cgit.freebsd.org/ports/tree/.hooks/prepare-commit-msg[.hooks/prepare-commit-msg].
It can be enabled by running ``git config --add core.hooksPath .hooks``.

The main point is that a commit message should be formatted in the following way:

....
category/port: Summary.

Description of why the changes were made.

PR:	    12345
....

[IMPORTANT]
====
The first line is the subject of the commit; it names the port that was changed and summarizes the commit.
It should contain 50 characters or fewer.

A blank line should separate it from the rest of the commit message.

The rest of the commit message should be wrapped at the 72-character boundary.

Another blank line should be added if there are any metadata fields, so that they are easily distinguishable from the commit message.
====

==== Managing Local Changes

This section addresses tracking local changes.
If you have no local changes you can skip this section.

One item that is important for all of them: all changes are local until pushed.
Unlike Subversion, Git uses a distributed model.
For users, for most things, there is very little difference.
However, if you have local changes, you can use the same tool to manage them as you use to pull in changes from FreeBSD.
All changes that you have not pushed are local and can easily be modified (`git rebase`, discussed below, does this).

===== Keeping local changes

The simplest way to keep local changes (especially trivial ones) is to use `git stash`.
In its simplest form, you use `git stash` to record the changes (which pushes them onto the stash stack).
Most people use this to save changes before updating the tree as described above.
They then use `git stash apply` to re-apply them to the tree.
The stash is a stack of changes that can be examined with `git stash list`.
The git-stash manual page (https://git-scm.com/docs/git-stash) has all the details.

This method is suitable when you have tiny tweaks to the tree.
When you have anything non-trivial, you'll likely be better off keeping a local branch and rebasing.
Stashing is also integrated with the `git pull` command: just add `--autostash` to the command line.

===== Keeping a local branch
[[keeping_a_local_branch]]
It is much easier to keep a local branch with Git than Subversion.
In Subversion you need to merge the commit, and resolve the conflicts.
This is manageable, but can lead to a convoluted history that's hard to upstream should that ever be necessary, or hard to replicate if you need to do so.
Git also allows one to merge, along with the same problems.
That's one way to manage the branch, but it's the least flexible.

In addition to merging, Git supports the concept of 'rebasing' which avoids these issues.
The `git rebase` command replays all the commits of a branch at a newer location on the parent branch.
We will cover the most common scenarios that arise using it.

====== Create a branch

Let's say you want to make a change to FreeBSD's ls command to never, ever do color.
There are many reasons to do this, but this example will use that as a baseline.
The FreeBSD ls command changes from time to time, and you'll need to cope with those changes.
Fortunately, with `git rebase` this is usually automatic.

[source,shell]
....
% cd src
% git checkout main
% git checkout -b no-color-ls
% cd bin/ls
% vi ls.c     # hack the changes in
% git diff    # check the changes
diff --git a/bin/ls/ls.c b/bin/ls/ls.c
index 7378268867ef..cfc3f4342531 100644
--- a/bin/ls/ls.c
+++ b/bin/ls/ls.c
@@ -66,6 +66,7 @@ __FBSDID("$FreeBSD$");
 #include <stdlib.h>
 #include <string.h>
 #include <unistd.h>
+#undef COLORLS
 #ifdef COLORLS
 #include <termcap.h>
 #include <signal.h>
% # these look good, make the commit...
% git commit ls.c
....

The commit will pop you into an editor to describe what you've done.
Once you enter that, you have your own **local** branch in the Git repo.
Build and install it like you normally would, following the directions in the handbook.
Git differs from other version control systems in that you have to tell it explicitly which files to commit.
I have opted to do it on the commit command line, but you can also do it with `git add`, which many of the more in-depth tutorials cover. ====== Time to update When it is time to bring in a new version, it is almost the same as without the branches. You would update like you would above, but there is one extra command before you update, and one after. The following assumes you are starting with an unmodified tree. It is important to start rebasing operations with a clean tree (Git requires this). [source,shell] .... % git checkout main % git pull --ff-only % git rebase -i main no-color-ls .... This will bring up an editor that lists all the commits in it. For this example, do not change it at all. This is typically what you are doing while updating the baseline (though you also use the Git rebase command to curate the commits you have in the branch). Once you are done with the above, you have moved the commits to ls.c forward from the old version of FreeBSD to the newer one. Sometimes there are merge conflicts. That is OK. Do not panic. Instead, handle them the same as any other merge conflicts. To keep it simple, I will just describe a common issue that may arise. A pointer to a complete treatment can be found at the end of this section. Let's say the includes change upstream in a radical shift to terminfo as well as a name change for the option. When you update, you might see something like this: [source,shell] .... Auto-merging bin/ls/ls.c CONFLICT (content): Merge conflict in bin/ls/ls.c error: could not apply 646e0f9cda11... no color ls Resolve all conflicts manually, mark them as resolved with "git add/rm ", then run "git rebase --continue". You can instead skip this commit: run "git rebase --skip". To abort and get back to the state before "git rebase", run "git rebase --abort". Could not apply 646e0f9cda11... no color ls .... which looks scary.
If you bring up an editor, you will see it is a typical 3-way merge conflict resolution that you may be familiar with from other source code systems (the rest of ls.c has been omitted): [source,shell] .... <<<<<<< HEAD #ifdef COLORLS_NEW #include ======= #undef COLORLS #ifdef COLORLS #include >>>>>>> 646e0f9cda11... no color ls .... The new code is first, and your code is second. The right fix here is to just add `#undef COLORLS_NEW` before the `#ifdef` and then delete the old changes: [source,shell] .... #undef COLORLS_NEW #ifdef COLORLS_NEW #include .... Then save the file. The rebase was interrupted, so you have to complete it: [source,shell] .... % git add ls.c % git rebase --continue .... which tells Git that ls.c has been fixed and to continue the rebase operation. Since there was a conflict, you will get kicked into the editor to update the commit message if necessary. If the commit message is still accurate, just exit the editor. If you get stuck during the rebase, do not panic. `git rebase --abort` will take you back to a clean slate. It is important, though, to start with an unmodified tree. An aside: The above-mentioned `git reflog` comes in handy here, as it will have a list of all the (intermediate) commits that you can view or inspect or cherry-pick. For more on this topic, https://www.freecodecamp.org/news/the-ultimate-guide-to-git-merge-and-git-rebase/ provides a rather extensive treatment. It is a good resource for issues that arise occasionally but are too obscure for this guide. ===== Switching to a Different FreeBSD Branch If you wish to shift from stable/12 to the current branch and you have a deep clone, the following will suffice: [source,shell] .... % git checkout main % # build and install here... .... If you have a local branch, though, there are one or two caveats. First, rebase will rewrite history, so you will likely want to do something to save it. Second, jumping branches tends to cause more conflicts.
If we pretend the example above was relative to stable/12, then to move to `main`, I'd suggest the following: [source,shell] .... % git checkout no-color-ls % git checkout -b no-color-ls-stable-12 # create another name for this branch % git rebase -i stable/12 no-color-ls --onto main .... The above first checks out no-color-ls. It then creates a new name for it (no-color-ls-stable-12) in case you need to get back to it. Finally, it rebases onto the `main` branch. This will find all the commits to the current no-color-ls branch (back to where it meets up with the stable/12 branch) and then it will replay them onto the `main` branch, creating a new no-color-ls branch there (which is why I had you create a placeholder name). [[mfc-with-git]] === MFC (Merge From Current) Procedures ==== Summary The MFC workflow can be summarized as `git cherry-pick -x` plus `git commit --amend` to adjust the commit message. For multiple commits, use `git rebase -i` to squash them together and edit the commit message. ==== Single commit MFC [source,shell] .... % git checkout stable/X % git cherry-pick -x $HASH --edit .... When the commit being MFCed is a merge commit, for example a vendor import, you need to specify one parent for cherry-pick purposes. Normally, that would be the "first parent" of the branch you are cherry-picking from, so: [source,shell] .... % git checkout stable/X % git cherry-pick -x $HASH -m 1 --edit .... If things go wrong, you'll either need to abort the cherry-pick with `git cherry-pick --abort` or fix it up and do a `git cherry-pick --continue`. Once the cherry-pick is finished, push with `git push`. If you get an error due to losing the commit race, use `git pull --rebase` and try to push again. ==== MFC to RELENG branch MFCs to branches that require approval require a bit more care. The process is the same for either a typical merge or an exceptional direct commit. * Merge or direct commit to the appropriate `stable/X` branch first before merging to the `releng/X.Y` branch.
* Use the hash that's in the `stable/X` branch for the MFC to `releng/X.Y` branch. * Leave both "cherry picked from" lines in the commit message. * Be sure to add the `Approved by:` line when you are in the editor. [source,shell] .... % git checkout releng/13.0 % git cherry-pick -x $HASH --edit .... -If you forget to to add the `Approved by:` line, you can do a `git commit --amend` to edit the commit message before you push the change. +If you forget to add the `Approved by:` line, you can do a `git commit --amend` to edit the commit message before you push the change. ==== Multiple commit MFC [source,shell] .... % git checkout -b tmp-branch stable/X % for h in $HASH_LIST; do git cherry-pick -x $h; done % git rebase -i stable/X # mark each of the commits after the first as 'squash' # Update the commit message to reflect all elements of commit, if necessary. # Be sure to retain the "cherry picked from" lines. % git push freebsd HEAD:stable/X .... If the push fails due to losing the commit race, rebase and try again: [source,shell] .... % git checkout stable/X % git pull % git checkout tmp-branch % git rebase stable/X % git push freebsd HEAD:stable/X .... Once the MFC is complete, you can delete the temporary branch: [source,shell] .... % git checkout stable/X % git branch -d tmp-branch .... ==== MFC a vendor import Vendor imports are the only thing in the tree that creates a merge commit in the `main` branch. Cherry picking merge commits into stable/XX presents an additional difficulty because there are two parents for a merge commit. Generally, you'll want the first parent's diff since that's the diff to `main` (though there may be some exceptions). [source,shell] .... % git cherry-pick -x -m 1 $HASH .... is typically what you want. This will tell cherry-pick to apply the correct diff. There are some, hopefully, rare cases where it's possible that the `main` branch was merged backwards by the conversion script. 
Should that be the case (and we've not found any yet), you'd change the above to `-m 2` to pick up the proper parent. Just do: [source,shell] .... % git cherry-pick --abort % git cherry-pick -x -m 2 $HASH .... to do that. The `--abort` will clean up the failed first attempt. ==== Redoing a MFC If you do an MFC and it goes horribly wrong and you want to start over, then the easiest way is to use `git reset --hard` like so: [source,shell] .... % git reset --hard freebsd/stable/12 .... though if you have some revs you want to keep, and others you don't, using `git rebase -i` is better. ==== Considerations when MFCing When committing source commits to stable and releng branches, we have the following goals: * Clearly mark direct commits as distinct from commits that land a change from another branch. * Avoid introducing known breakage into stable and releng branches. * Allow developers to determine which changes have or have not been landed from one branch to another. With Subversion, we used the following practices to achieve these goals: * Using `MFC` and `MFS` tags to mark commits that merged changes from another branch. * Squashing fixup commits into the main commit when merging a change. * Recording mergeinfo so that `svn mergeinfo --show-revs` worked. With Git, we will need to use different strategies to achieve the same goals. This document aims to define best practices when merging source commits using Git that achieve these goals. In general, we aim to use Git's native support to achieve these goals rather than enforcing practices built on Subversion's model. One general note: due to technical differences with Git, we will not be using Git "merge commits" (created via `git merge`) in stable or releng branches.
Instead, when this document refers to "merge commits", it means a commit originally made to `main` that is replicated or "landed" to a stable branch, or a commit from a stable branch that is replicated to a releng branch with some variation of `git cherry-pick`. ==== Finding Eligible Hashes to MFC Git provides some built-in support for this via the `git cherry` and `git log --cherry` commands. These commands compare the raw diffs of commits (but not other metadata such as log messages) to determine if two commits are identical. This works well when each commit from `main` is landed as a single commit to a stable branch, but it falls over if multiple commits from `main` are squashed together as a single commit to a stable branch. The project makes extensive use of `git cherry-pick -x` with all lines preserved to work around these difficulties and is working on automated tooling to take advantage of this. ==== Commit message standards ===== Marking MFCs The project has adopted the following practice for marking MFCs: * Use the `-x` flag with `git cherry-pick`. This adds a line to the commit message that includes the hash of the original commit when merging. Since it is added by Git directly, committers do not have to manually edit the commit log when merging. When merging multiple commits, keep all the "cherry picked from" lines. ===== Trim Metadata? One area that was not clearly documented with Subversion (or even CVS) is how to format metadata in log messages for MFC commits. Should it include the metadata from the original commit unchanged, or should it be altered to reflect information about the MFC commit itself? Historical practice has varied, though some of the variance is by field. For example, MFCs that are relevant to a PR generally include the PR field in the MFC so that MFC commits are included in the bug tracker's audit trail. Other fields are less clear. 
For example, Phabricator shows the diff of the last commit tagged to a review, so including Phabricator URLs replaces the main commit with the landed commits. The list of reviewers is also not clear. If a reviewer has approved a change to `main`, does that mean they have approved the MFC commit? Is that true if it's identical code only, or with merely trivial rework? It's clearly not true for more extensive reworks. Even for identical code what if the commit doesn't conflict but introduces an ABI change? A reviewer may have ok'd a commit for `main` due to the ABI breakage but may not approve of merging the same commit as-is. One will have to use one's best judgment until clear guidelines can be agreed upon. For MFCs regulated by re@, new metadata fields are added, such as the Approved by tag for approved commits. This new metadata will have to be added via `git commit --amend` or similar after the original commit has been reviewed and approved. We may also want to reserve some metadata fields in MFC commits such as Phabricator URLs for use by re@ in the future. Preserving existing metadata provides a very simple workflow. Developers use `git cherry-pick -x` without having to edit the log message. If instead we choose to adjust metadata in MFCs, developers will have to edit log messages explicitly via the use of `git cherry-pick --edit` or `git commit --amend`. However, as compared to svn, at least the existing commit message can be pre-populated and metadata fields can be added or removed without having to re-enter the entire commit message. The bottom line is that developers will likely need to curate their commit message for MFCs that are non-trivial. [[vendor-import-git]] === Vendor Imports with Git This section describes the vendor import procedure with Git in detail. ==== Branch naming convention All vendor branches and tags start with `vendor/`. These branches and tags are visible by default. 
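As a minimal sketch of this naming convention, the script below builds a throwaway repository and lists only the refs under `vendor/`. The temporary directory, the placeholder identity, and the unrelated `no-color-ls` branch are assumptions made for illustration and are not part of the project's tooling. In a real clone, the remote-tracking branches would more likely be listed with a pattern such as `freebsd/vendor/*`.

```shell
# Throwaway repository standing in for a FreeBSD clone; nothing here touches a real tree.
repo=$(mktemp -d)
# Placeholder identity so the commit and the annotated tag can be created anywhere.
export GIT_AUTHOR_NAME="Example Committer" GIT_AUTHOR_EMAIL="committer@example.org"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

git init -q "$repo"
git -C "$repo" commit -q --allow-empty -m "Initial commit"

# One vendor branch and one annotated vendor tag, both using the vendor/ prefix.
git -C "$repo" branch vendor/NetBSD/mtree
git -C "$repo" tag -a -m "Vendor import of NetBSD's mtree" vendor/NetBSD/mtree/20201211
# An unrelated local branch, to show that the pattern filters it out.
git -C "$repo" branch no-color-ls

# List only the refs in the vendor/ namespace.
vendor_branches=$(git -C "$repo" branch --list 'vendor/*' --format='%(refname:short)')
vendor_tags=$(git -C "$repo" tag --list 'vendor/*')

echo "vendor branches: $vendor_branches"
echo "vendor tags:     $vendor_tags"
```

Because branches and tags live in separate ref namespaces, the tag `vendor/NetBSD/mtree/20201211` can coexist with the branch `vendor/NetBSD/mtree`.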
[NOTE] ==== This chapter follows the convention that the `freebsd` origin is the origin name for the official FreeBSD Git repository. If you use a different convention, replace `freebsd` with the name you use instead in the examples below. ==== We will explore an example for updating NetBSD's mtree that is in our tree. The vendor branch for this is `vendor/NetBSD/mtree`. ==== Updating an old vendor import The vendor trees usually have only the subset of the third-party software that is appropriate to FreeBSD. These trees are usually tiny in comparison to the FreeBSD tree. Git worktrees are thus quite small and fast and the preferred method to use. Make sure that whatever directory you choose below (the `../mtree`) does not currently exist. [source,shell] .... % git worktree add ../mtree vendor/NetBSD/mtree .... ==== Update the Sources in the Vendor Branch Prepare a full, clean tree of the vendor sources. Import everything but merge only what is needed. This example assumes the NetBSD source is checked out from their GitHub mirror in `~/git/NetBSD`. Note that "upstream" might have added or removed files, so we want to make sure deletions are propagated as well. package:net/rsync[] is commonly installed, so I'll use that. [source,shell] .... % cd ../mtree % rsync -va --del --exclude=".git" ~/git/NetBSD/usr.sbin/mtree/ . % git add -A % git status ... % git diff --staged ... % git commit -m "Vendor import of NetBSD's mtree at 2020-12-11" [vendor/NetBSD/mtree 8e7aa25fcf1] Vendor import of NetBSD's mtree at 2020-12-11 7 files changed, 114 insertions(+), 82 deletions(-) % git tag -a vendor/NetBSD/mtree/20201211 .... It is critical to verify that the source code you are importing comes from a trustworthy source. Many open-source projects use cryptographic signatures to sign code changes, git tags, and/or source code tarballs. 
Always verify these signatures, and use isolation mechanisms such as jails or chroot, in combination with a dedicated, non-privileged user account that is different from the one you regularly use (see the Updating the FreeBSD source tree section below for more details), until you are confident that the source code you are importing looks safe. Following the upstream development and occasionally reviewing the upstream code changes can greatly help in improving code quality and benefit everyone involved. It is also a good idea to examine the git diff results before importing them into the vendor area. Always run the `git diff` and `git status` commands and examine the results carefully. When in doubt, it is useful to do a `git annotate` on the vendor branch or the upstream git repository to see who made a change and why. In the example above we used `-m` to illustrate, but you should compose a proper message in an editor (using a commit message template). It is also important to create an annotated tag using `git tag -a`; otherwise the push will be rejected. Only annotated tags are allowed to be pushed. The annotated tag gives you a chance to enter a commit message. Enter the version you are importing, along with any salient new features or fixes in that version. ==== Updating the FreeBSD Copy At this point you can push the import on the vendor branch to our repo. [source,shell] .... % git push --follow-tags freebsd vendor/NetBSD/mtree .... `--follow-tags` tells `git push` to also push tags associated with the locally committed revision. ==== Updating the FreeBSD source tree Now you need to update the mtree in FreeBSD. The sources live in `contrib/mtree` since it is upstream software. From time to time, we may have to make changes to the contributed code to better satisfy FreeBSD's needs.
Whenever possible, please try to contribute the local changes back to the upstream projects; this helps them to better support FreeBSD, and also saves you time on future conflict resolution when importing updates. [source,shell] .... % cd ../src % git subtree merge -P contrib/mtree vendor/NetBSD/mtree .... This generates a subtree merge commit of `contrib/mtree` against the local `vendor/NetBSD/mtree` branch. Examine the diff from the merge result and the contents of the upstream branch. If the merge reduced our local changes to trivial differences such as blank-line or indentation changes, try amending the local changes to reduce the diff against upstream, or try to contribute the remaining changes back to the upstream project. If there were conflicts, you would need to fix them before committing. Include details about the changes being merged in the merge commit message. Some open-source software includes a `configure` script that generates files used to define how the code is built; usually, these generated files like `config.h` should be updated as part of the import process. When doing this, always keep in mind that these scripts are executable code running under the current user's credentials. This process should always be run in an isolated environment, ideally inside a jail that does not have network access, and with an unprivileged account; or, at minimum, a dedicated account that is different from the user account you normally use for everyday purposes or for pushing to the FreeBSD source code repository. This minimizes the risk of encountering bugs that can cause data loss or, in worse cases, maliciously planted code. Using an isolated jail also prevents the configure scripts from detecting locally installed software packages, which may lead to unexpected results. When testing your changes, run them in a chroot or jailed environment, or even within a virtual machine first, especially for kernel or library modifications.
This approach helps prevent adverse interactions with your working environment. It can be particularly beneficial for changes to libraries that many base system components use, among others. ==== Rebasing your change against latest FreeBSD source tree Because the current policy recommends against using merges, if the upstream FreeBSD `main` moves forward before you get a chance to push, you would have to redo the merge. Regular `git rebase` or `git pull --rebase` doesn't know how to rebase a merge commit **as a merge commit**, so instead of that you would have to recreate the commit. The following steps should be taken to easily recreate the merge commit as if `git rebase --rebase-merges` worked properly: * cd to the top of the repo * Create a side branch `XXX` with the **contents** of the merged tree. * Update this side branch `XXX` to be merged and up-to-date with FreeBSD's `main` branch. ** In the worst case scenario, you would still have to resolve merge conflicts, if there were any, but this should be really rare. ** Resolve conflicts, and collapse multiple commits down to 1 if need be (without conflicts, there's no collapse needed) * Check out `main` * Create a branch `YYY` (allows for easier unwinding if things go wrong) * Re-do the subtree merge * Instead of resolving any conflicts from the subtree merge, checkout the contents of XXX on top of it. ** The trailing `.` is important, as is being at the top level of the repo. ** Rather than switching branches to XXX, it splats the contents of XXX on top of the repo * Commit the results with the prior commit message (the example assumes there's only one merge on the XXX branch). * Make sure the branches are the same. * Do whatever review you need, including having others check it out if you think that's needed. * Push the commit; if you 'lost the race' again, just redo these steps (see below for a recipe) * Delete the branches once the commit is upstream. They are throwaway.
The commands one would use, following the above example of mtree, would be like so (the `#` starts a comment to help link commands to descriptions above): [source,shell] .... % cd ../src # CD to top of tree % git checkout -b XXX # create new throw-away XXX branch for merge % git fetch freebsd # Get changes from upstream % git merge freebsd/main # Merge the changes and resolve conflicts % git checkout -b YYY freebsd/main # Create new throw-away YYY branch for redo % git subtree merge -P contrib/mtree vendor/NetBSD/mtree # Redo subtree merge % git checkout XXX . # XXX branch has the conflict resolution % git commit -c XXX~1 # -c reuses the commit message from commit before rebase % git diff XXX YYY # Should be empty % git show YYY # Should only have changes you want, and be a merge commit from vendor branch .... Note: if things go wrong with the commit, you can reset the `YYY` branch by reissuing the checkout command that created it with `-B` to start over: [source,shell] .... % git checkout -B YYY freebsd/main # Create new throw-away YYY branch if starting over is just going to be easier .... ==== Pushing the changes Once you think you have a set of changes that are good, you can push them to a fork on GitHub or GitLab for others to review. One nice thing about Git is that it allows you to publish rough drafts of your work for others to review. While Phabricator is good for content review, publishing the updated vendor branch and merge commits lets others check the details as they will eventually appear in the repository. After review, when you are sure it is a good change, you can push it to the FreeBSD repo: [source,shell] .... % git push freebsd YYY:main # put the commit on upstream's 'main' branch % git branch -D XXX # Throw away the throw-away branches. % git branch -D YYY .... Note: I used `XXX` and `YYY` to make it obvious they are terrible names and should not leave your machine.
If you use such names for other work, then you'll need to pick different names, or risk losing the other work. There is nothing magic about these names. Upstream will not allow you to push them, but nevertheless, please pay attention to the exact commands above. Some commands use syntax that differs only slightly from typical uses, and that different behavior is critical to this recipe working. ==== How to redo things if need be If you've tried to do the push in the previous section and it fails, then you should do the following to 'redo' things. This sequence keeps the commit with the commit message always at XXX~1 to make committing easier. [source,shell] .... % git checkout -B XXX YYY # recreate that throw-away-branch XXX and switch to it % git merge freebsd/main # Merge the changes and resolve conflicts % git checkout -B YYY freebsd/main # Recreate new throw-away YYY branch for redo % git subtree merge -P contrib/mtree vendor/NetBSD/mtree # Redo subtree merge % git checkout XXX . # XXX branch has the conflict resolution % git commit -c XXX~1 # -c reuses the commit message from commit before rebase .... Then go check it out as above and push as above when ready. === Creating a new vendor branch There are a number of ways to create a new vendor branch. The recommended way is to create a new repository and then merge that with FreeBSD. Suppose one is importing release 3.1415 of `glorbnitz` into the FreeBSD tree. For the sake of simplicity, we will not trim this release. It is a simple user command that puts the nitz device into different magical glorb states and is small enough that trimming will not save much. ==== Create the repo [source,shell] .... % cd /some/where % mkdir glorbnitz % cd glorbnitz % git init % git checkout -b vendor/glorbnitz .... At this point, you have a new repo, where all new commits will go on the `vendor/glorbnitz` branch.
Git experts can also do this right in their FreeBSD clone, using `git checkout --orphan vendor/glorbnitz` if they are more comfortable with that. ==== Copy the sources in Since this is a new import, you can just cp the sources in, or use tar or even rsync as shown above. And we will add everything, assuming no dot files. [source,shell] .... % cp -r ~/glorbnitz/* . % git add * .... At this point, you should have a pristine copy of glorbnitz ready to commit. [source,shell] .... % git commit -m "Import GlorbNitz frobnosticator revision 3.1415" .... As above, I used `-m` for simplicity, but you should likely create a commit message that explains what a Glorb is and why you'd use a Nitz to get it. Not everybody will know, so for your actual commit, you should follow the crossref:committers-guide[commit-log-message,commit log message] section instead of emulating the brief style used here. ==== Now import it into our repository Now you need to import the branch into our repository. [source,shell] .... % cd /path/to/freebsd/repo/src % git remote add glorbnitz /some/where/glorbnitz % git fetch glorbnitz vendor/glorbnitz .... Note that the vendor/glorbnitz branch is now in the repo. At this point the `/some/where/glorbnitz` can be deleted, if you like. It was only a means to an end. // perhaps the real treasure was the friends it made along the way... ==== Tag and push Steps from here on out are much the same as they are in the case of updating a vendor branch, though without the step that updates the vendor branch. [source,shell] .... % git worktree add ../glorbnitz vendor/glorbnitz % cd ../glorbnitz % git tag --annotate vendor/glorbnitz/3.1415 # Make sure the commit is good with "git show" % git push --follow-tags freebsd vendor/glorbnitz .... By 'good' we mean: . All the right files are present . None of the wrong files are present . The vendor branch points at something sensible . The tag looks good, and is annotated .
The commit message for the tag has a quick summary of what's new since the last tag ==== Time to finally merge it into the base tree [source,shell] .... % cd ../src % git subtree add -P contrib/glorbnitz vendor/glorbnitz # Make sure the commit is good with "git show" % git commit --amend # one last sanity check on commit message % git push freebsd .... Here 'good' means: . All the right files, and none of the wrong ones, were merged into contrib/glorbnitz. . No other changes are in the tree. . The commit message looks crossref:committers-guide[commit-log-message,good]. It should contain a summary of what's changed since the last merge to the FreeBSD `main` branch and any caveats. . `RELNOTES` and `UPDATING` should be updated if there is anything of note, such as user-visible changes, important upgrade concerns, etc. [NOTE] ==== This hasn't connected `glorbnitz` to the build yet. How to do that is specific to the software being imported and is beyond the scope of this tutorial. ==== ===== Keeping current So, time passes. It's time now to update the tree for the latest changes upstream. When you checkout `main` make sure that you have no diffs. It's a lot easier to commit those to a branch (or use `git stash`) before doing the following. If you are used to `git pull`, we strongly recommend using the `--ff-only` option, and further setting it as the default option. Alternatively, `git pull --rebase` is useful if you have changes staged in the `main` branch. [source,shell] .... % git config --global pull.ff only .... You may need to omit `--global` if you want this setting to apply to only this repository. [source,shell] .... % cd freebsd-src % git checkout main % git pull (--ff-only|--rebase) .... A common trap is that the combined command `git pull` will try to perform a merge, which sometimes creates a merge commit that didn't exist before. This can be harder to recover from. The longer form is also recommended. [source,shell] ....
% cd freebsd-src % git checkout main % git fetch freebsd % git merge --ff-only freebsd/main .... These commands reset your tree to the `main` branch, and then update it from where you pulled the tree from originally. It's important to switch to `main` before doing this so it moves forward. Now, it's time to move the changes forward: [source,shell] .... % git rebase -i main working .... This will bring up an interactive screen to change the defaults. For now, just exit the editor. Everything should just apply. If not, then you'll need to resolve the diffs. https://docs.github.com/en/free-pro-team@latest/github/using-git/resolving-merge-conflicts-after-a-git-rebase[This github document] can help you navigate this process. [[git-push-upstream]] ===== Time to push changes upstream First, ensure that the push URL is properly configured for the upstream repository. [source,shell] .... % git remote set-url --push freebsd ssh://git@gitrepo.freebsd.org/src.git .... Then, verify that user name and email are configured right. We require that they exactly match the passwd entry in FreeBSD cluster. Use [source,shell] .... freefall% gen-gitconfig.sh .... on freefall.freebsd.org to get a recipe that you can use directly, assuming /usr/local/bin is in the PATH. The below command merges the `working` branch into the upstream `main` branch. It's important that you curate your changes to be just like you want them in the FreeBSD source repo before doing this. This syntax pushes the `working` branch to `main`, moving the `main` branch forward. You will only be able to do this if this results in a linear change to `main` (e.g. no merges). [source,shell] .... % git push freebsd working:main .... If your push is rejected due to losing a commit race, rebase your branch before trying again: [source,shell] .... % git checkout working % git fetch freebsd % git rebase freebsd/main % git push freebsd working:main .... 
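The commit race described above can be reproduced locally. The script below is a sketch under stated assumptions: the bare repository under a temporary directory stands in for the FreeBSD repository, and the `seed`, `mine`, and `theirs` clones, the file names, and the placeholder identity are hypothetical. It shows the first push being rejected as a non-fast-forward update, and the push succeeding after `git fetch` and `git rebase`.

```shell
# Simulate losing a commit race using throwaway repositories in a temporary directory.
work=$(mktemp -d)
export GIT_AUTHOR_NAME="Example Committer" GIT_AUTHOR_EMAIL="committer@example.org"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

# A seed repository with one commit on main, and a bare copy that plays the upstream.
git init -q "$work/seed"
git -C "$work/seed" symbolic-ref HEAD refs/heads/main
echo base > "$work/seed/base.txt"
git -C "$work/seed" add base.txt
git -C "$work/seed" commit -q -m "base"
git clone -q --bare "$work/seed" "$work/upstream.git"

# Two committers clone upstream, both naming the remote 'freebsd'.
git clone -q -o freebsd "$work/upstream.git" "$work/mine"
git clone -q -o freebsd "$work/upstream.git" "$work/theirs"

# Our change lives on the 'working' branch.
git -C "$work/mine" checkout -q -b working
echo mine > "$work/mine/mine.txt"
git -C "$work/mine" add mine.txt
git -C "$work/mine" commit -q -m "my change"

# Someone else pushes to main first.
echo theirs > "$work/theirs/theirs.txt"
git -C "$work/theirs" add theirs.txt
git -C "$work/theirs" commit -q -m "their change"
git -C "$work/theirs" push -q freebsd main

# Our push is now rejected because it would not be a fast-forward.
if git -C "$work/mine" push -q freebsd working:main 2>/dev/null; then
    first_push=accepted
else
    first_push=rejected
fi

# Fetch, rebase 'working' onto the new upstream main, and push again.
git -C "$work/mine" fetch -q freebsd
git -C "$work/mine" rebase -q freebsd/main
if git -C "$work/mine" push -q freebsd working:main; then
    second_push=accepted
else
    second_push=rejected
fi

upstream_commits=$(git -C "$work/upstream.git" rev-list --count main)
echo "first push: $first_push, second push: $second_push, commits upstream: $upstream_commits"
```

After the rebase, the history of `main` in the stand-in upstream is linear: the base commit, the other committer's change, and then ours.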
[[git-push-upstream-alt]] ===== Time to push changes upstream (alternative) Some people find it easier to merge their changes to their local `main` before pushing to the remote repository. Also, `git arc stage` moves changes from a branch to the local `main` when you need to do a subset of a branch. The instructions are similar to the prior section: [source,shell] .... % git checkout main % git merge --ff-only working % git push freebsd .... If you lose the race, then try again with [source,shell] .... % git pull --rebase % git push freebsd .... These commands will fetch the most recent `freebsd/main` and then rebase the local `main` changes on top of that, which is what you want when you lose the commit race. Note: merging vendor branch commits will not work with this technique. ===== Finding the Subversion Revision You'll need to make sure that you've fetched the notes (see the crossref:committers-guide[git-mini-daily-use, Daily use] for details). Once you have these, notes will show up in the `git log` output like so: [source,shell] .... % git log .... If you have a specific version in mind, you can use this construct: [source,shell] .... % git log --grep revision=XXXX .... to find the specific revision. The hex number after 'commit' is the hash you can use to refer to this commit. [[git-faq]] === Git FAQ This section provides a number of targeted answers to questions that are likely to come up often for users and developers. [NOTE] ==== We use the common convention of having the origin for the FreeBSD repository being 'freebsd' rather than the default 'origin' to allow people to use that for their own development and to minimize "whoops" pushes to the wrong repository. ==== ==== Users ===== How do I track -current and -stable with only one copy of the repository? **Q:** Although disk space is not a huge issue, it's more efficient to use only one copy of the repository. With SVN mirroring, I could check out multiple trees from the same repository.
How do I do this with Git? **A:** You can use Git worktrees. There's a number of ways to do this, but the simplest way is to use a clone to track -current, and a worktree to track stable releases. While using a 'bare repository' has been put forward as a way to cope, it's more complicated and will not be documented here. First, you need to clone the FreeBSD repository, shown here cloning into `freebsd-current` to reduce confusion. $URL is whatever mirror works best for you: [source,shell] .... % git clone -o freebsd --config remote.freebsd.fetch='+refs/notes/*:refs/notes/*' $URL freebsd-current .... then once that's cloned, you can simply create a worktree from it: [source,shell] .... % cd freebsd-current % git worktree add ../freebsd-stable-12 stable/12 .... this will checkout `stable/12` into a directory named `freebsd-stable-12` that's a peer to the `freebsd-current` directory. Once created, it's updated very similarly to how you might expect: [source,shell] .... % cd freebsd-current % git checkout main % git pull --ff-only # changes from upstream now local and current tree updated % cd ../freebsd-stable-12 % git merge --ff-only freebsd/stable/12 # now your stable/12 is up to date too .... I recommend using `--ff-only` because it's safer and you avoid accidentally getting into a 'merge nightmare' where you have an extra change in your tree, forcing a complicated merge rather than a simple one. Here's https://adventurist.me/posts/00296[a good writeup] that goes into more detail. ==== Developers ===== Ooops! I committed to `main`, instead of another branch. **Q:** From time to time, I goof up and mistakenly commit to the `main` branch. What do I do? **A:** First, don't panic. Second, don't push. In fact, you can fix almost anything if you haven't pushed. All the answers in this section assume no push has happened. The following answer assumes you committed to `main` and want to create a branch called `issue`: [source,shell] .... 
% git checkout -b issue # Create the 'issue' branch % git checkout -B main freebsd/main # Reset main to upstream % git checkout issue # Back to where you were .... ===== Ooops! I committed something to the wrong branch! **Q:** I was working on a feature on the `wilma` branch, but accidentally committed a change relevant to the `fred` branch in `wilma`. What do I do? **A:** The answer is similar to the previous one, but with cherry picking. This assumes there's only one commit on wilma, but will generalize to more complicated situations. It also assumes that it's the last commit on wilma (hence using wilma in the `git cherry-pick` command), but that too can be generalized. [source,shell] .... # We're on branch wilma % git checkout fred # move to fred branch % git cherry-pick wilma # copy the misplaced commit % git checkout wilma # go back to wilma branch % git reset --hard HEAD^ # move what wilma refers to back 1 commit .... If it is not the last commit, you can cherry-pick that one change from wilma onto fred, then use `git rebase -i` to remove the change from wilma. [source,shell] .... # We're on branch wilma % git checkout fred # move to fred branch % git cherry-pick HASH_OF_CHANGE # copy the misplaced commit % git rebase -i main wilma # drop the cherry-picked change .... **Q:** But what if I want to commit a few changes to `main`, but keep the rest in `wilma` for some reason? **A:** The same technique above also works if you want to 'land' parts of the branch you are working on into `main` before the rest of the branch is ready (say you noticed an unrelated typo, or fixed an incidental bug). You can cherry pick those changes into `main`, then push to the parent repository. Once you've done that, cleanup couldn't be simpler: just `git rebase -i`. Git will notice you've done this and skip the common changes automatically (even if you had to change the commit message or tweak the commit slightly).
There's no need to switch back to wilma to adjust it: just rebase! **Q:** I want to split off some changes from branch `wilma` into branch `fred`. **A:** The more general answer would be the same as the previous. You'd checkout/create the `fred` branch, cherry pick the changes you want from `wilma` one at a time, then rebase `wilma` to remove those changes you cherry picked. `git rebase -i main wilma` will toss you into an editor, where you remove the `pick` lines that correspond to the commits you copied to `fred`. If all goes well, and there are no conflicts, you're done. If not, you'll need to resolve the conflicts as you go. The other way to do this would be to checkout `wilma` and then create the branch `fred` to point to the same point in the tree. You can then `git rebase -i` both these branches, selecting the changes you want in `fred` or `wilma` by retaining the `pick` lines, and deleting the rest from the editor. Some people would create a tag/branch called `pre-split` before starting in case something goes wrong in the split. You can undo it with the following sequence: [source,shell] .... % git checkout pre-split # Go back % git branch -D fred # delete the fred branch % git checkout -B wilma # reset the wilma branch % git branch -d pre-split # Pretend it didn't happen .... The last step is optional. If you are going to try again to split, you'd omit it. **Q:** But I did things as I read along and didn't see your advice at the end to create a branch, and now `fred` and `wilma` are all screwed up. How do I find what `wilma` was before I started? I don't know how many times I moved things around. **A:** All is not lost. You can figure it out, so long as it hasn't been too long, or too many commits (hundreds). So I created a wilma branch and committed a couple of things to it, then decided I wanted to split it into fred and wilma. Nothing weird happened when I did that, but let's say it did.
The way to look at what you've done is with the `git reflog`: [source,shell] .... % git reflog 6ff9c25 (HEAD -> wilma) HEAD@{0}: rebase -i (finish): returning to refs/heads/wilma 6ff9c25 (HEAD -> wilma) HEAD@{1}: rebase -i (start): checkout main 869cbd3 HEAD@{2}: rebase -i (start): checkout wilma a6a5094 (fred) HEAD@{3}: rebase -i (finish): returning to refs/heads/fred a6a5094 (fred) HEAD@{4}: rebase -i (pick): Encourage contributions 1ccd109 (freebsd/main, main) HEAD@{5}: rebase -i (start): checkout main 869cbd3 HEAD@{6}: rebase -i (start): checkout fred 869cbd3 HEAD@{7}: checkout: moving from wilma to fred 869cbd3 HEAD@{8}: commit: Encourage contributions ... % .... Here we see the changes I've made. You can use it to figure out where things went wrong. I'll just point out a few things here. The first one is that HEAD@{X} is a 'commitish' thing, so you can use that as an argument to a command. Although if that command commits anything to the repository, the X numbers change. You can also use the hash (first column). Next, 'Encourage contributions' was the last commit I made to `wilma` before I decided to split things up. You can also see the same hash is there when I created the `fred` branch to do that. I started by rebasing `fred` and you see the 'start', each step, and the 'finish' for that process. While we don't need it here, you can figure out exactly what happened. Fortunately, to fix this, you can follow the prior answer's steps, but with the hash `869cbd3` instead of `pre-split`. While that seems a bit verbose, it's easy to remember since you're doing one thing at a time. You can also stack: [source,shell] .... % git checkout -B wilma 869cbd3 % git branch -D fred .... and you are ready to try again. The `checkout -B` with the hash combines checking out and creating a branch for it. The `-B` instead of `-b` forces the movement of a pre-existing branch. Either way works, which is what's great (and awful) about Git. 
One reason I tend to use `git checkout -B xxxx hash` instead of checking out the hash, and then creating / moving the branch is purely to avoid the slightly distressing message about detached heads: [source,shell] .... % git checkout 869cbd3 M faq.md Note: checking out '869cbd3'. You are in 'detached HEAD' state. You can look around, make experimental changes and commit them, and you can discard any commits you make in this state without impacting any branches by performing another checkout. If you want to create a new branch to retain commits you create, you may do so (now or later) by using -b with the checkout command again. Example: git checkout -b <new-branch-name> HEAD is now at 869cbd3 Encourage contributions % git checkout -B wilma .... This produces the same effect, but I have to read a lot more and severed heads aren't an image I like to contemplate. ===== Ooops! I did a `git pull` and it created a merge commit, what do I do? **Q:** I was on autopilot and did a `git pull` for my development tree and that created a merge commit on `main`. How do I recover? **A:** This can happen when you invoke the pull with your development branch checked out. Many developers use `git pull --rebase` to avoid this situation. Right after the pull, you will have the new merge commit checked out. Git supports a `HEAD^#` syntax to examine the parents of a merge commit: [source,shell] .... git log --oneline HEAD^1 # Look at the first parent's commits git log --oneline HEAD^2 # Look at the second parent's commits .... From those logs, you can easily identify which commit is your development work. Then you simply reset your branch to the corresponding `HEAD^#`: [source,shell] .... git reset --hard HEAD^1 .... In addition, a `git pull --rebase` at this stage will rebase your changes on `main` onto the latest `freebsd/main`. **Q:** But I also need to fix my `main` branch. How do I do that? **A:** Git keeps track of the remote repository branches in a `freebsd/` namespace.
To fix your `main` branch, just make it point to the remote's `main`: [source,shell] .... git branch -f main freebsd/main .... There's nothing magical about branches in Git: they are just labels on a graph that are automatically moved forward by making commits. So the above works because you're just moving a label. There's no metadata about the branch that needs to be preserved due to this. ===== Mixing and matching branches **Q:** So I have two branches `worker` and `async` that I'd like to combine into one branch called `feature` while maintaining the commits in both. **A:** This is a job for cherry pick. [source,shell] .... % git checkout worker % git checkout -b feature # create a new branch % git cherry-pick main..async # bring in the changes .... You now have a new branch called `feature`. This branch combines commits from both branches. You can further curate it with `git rebase`. **Q:** I have a branch called `driver` and I'd like to break it up into `kernel` and `userland` so I can evolve them separately and commit each branch as it becomes ready. **A:** This takes a little bit of prep work, but `git rebase` will do the heavy lifting here. [source,shell] .... % git checkout driver # Checkout the driver % git checkout -b kernel # Create kernel branch % git checkout -b userland # Create userland branch .... Now you have two identical branches. So, it's time to separate out the commits. We'll assume first that all the commits in `driver` go into either the `kernel` or the `userland` branch, but not both. [source,shell] .... % git rebase -i main kernel .... and just include the changes you want (with a 'p' or 'pick' line) and just delete the commits you don't (this sounds scary, but if worse comes to worse, you can throw this all away and start over with the `driver` branch since you've not yet moved it). [source,shell] .... % git rebase -i main userland .... and do the same thing you did with the `kernel` branch. **Q:** Oh great! 
I followed the above and forgot a commit in the `kernel` branch. How do I recover? **A:** You can use the `driver` branch to find the hash of the commit that is missing and cherry pick it. [source,shell] .... % git checkout kernel % git log driver % git cherry-pick $HASH .... **Q:** OK. I have the same situation as the above, but my commits are all mixed up. I need parts of one commit to go to one branch and the rest to go to the other. In fact, I have several. Your rebase method to select sounds tricky. **A:** In this situation, you'd be better off to curate the original branch to separate out the commits, and then use the above method to split the branch. So let's assume that there's just one commit with a clean tree. You can either use `git rebase` with an `edit` line, or you can use this with the commit on the tip. The steps are the same either way. The first thing we need to do is to back up one commit while leaving the changes uncommitted in the tree: [source,shell] .... % git reset HEAD^ .... Note: Do not, repeat do not, add `--hard` here since that also removes the changes from your tree. Now, if you are lucky, the change needing to be split up falls entirely along file lines. In that case you can just do the usual `git add` for the files in each group, then do a `git commit`. Note: when you do this, you'll lose the commit message when you do the reset, so if you need it for some reason, you should save a copy (though `git log $HASH` can recover it). If you are not lucky, you'll need to split apart files. There's another tool to do that which you can apply one file at a time. [source,shell] .... git add -p foo/bar.c .... will step through the diffs, prompting you, one at a time, whether to include or exclude the hunk. Once you're done, `git commit` and you'll have the remainder in your tree.
You can run it multiple times as well, and even over multiple files (though I find it easier to do one file at a time and use the `git rebase -i` to fold the related commits together). ===== Joining the FreeBSD GitHub organization. **Q:** How do I join the FreeBSD GitHub organization? **A:** Please see https://wiki.freebsd.org/GitHub#Joining_the_Organisation[our GitHub Wiki Info] page for details. Briefly, all FreeBSD committers may join. Those who are not committers who request joining will be considered on a case-by-case basis. ==== Cloning and Mirroring **Q:** I'd like to mirror the entire Git repository, how do I do that? **A:** If all you want to do is mirror, then [source,shell] .... % git clone --mirror $URL .... will do the trick. However, there are two disadvantages to this if you want to use it for anything other than a mirror you'll reclone. First, this is a 'bare repository' which has the repository database, but no checked out worktree. This is great for mirroring, but terrible for day to day work. There's a number of ways around this with `git worktree`: [source,shell] .... % git clone --mirror https://git.freebsd.org/ports.git ports.git % cd ports.git % git worktree add ../ports main % git worktree add ../quarterly branches/2020Q4 % cd ../ports .... But if you aren't using your mirror for further local clones, then it's a poor match. The second disadvantage is that Git normally rewrites the refs (branch name, tags, etc) from upstream so that your local refs can evolve independently of upstream. This means that you'll lose changes if you are committing to this repository on anything other than private project branches. **Q:** So what can I do instead? **A:** Well, you can stuff all of the upstream repository's refs into a private namespace in your local repository. Git clones everything via a 'refspec' and the default refspec is: [source,shell] .... fetch = +refs/heads/*:refs/remotes/freebsd/* .... which says just fetch the branch refs.
However, the FreeBSD repository has a number of other things in it. To see those, you can add explicit refspecs for each ref namespace, or you can fetch everything. To set up your repository to do that: [source,shell] .... git config --add remote.freebsd.fetch '+refs/*:refs/freebsd/*' .... which will put everything in the upstream repository into your local repository's `refs/freebsd/` namespace. Please note that this also grabs all the unconverted vendor branches and the number of refs associated with them is quite large. You'll need to refer to these 'refs' with their full name because they aren't in any of Git's regular namespaces. [source,shell] .... git log refs/freebsd/vendor/zlib/1.2.10 .... would look at the log for the vendor branch for zlib starting at 1.2.10. === Collaborating with others One of the keys to good software development on a project as large as FreeBSD is the ability to collaborate with others before you push your changes to the tree. The FreeBSD project's Git repositories do not, yet, allow user-created branches to be pushed to the repository, and therefore if you wish to share your changes with others you must use another mechanism, such as a hosted GitLab or GitHub, to share changes in a user-generated branch. The following instructions show how to set up a user-generated branch, based on the FreeBSD `main` branch, and push it to GitHub. Before you begin, make sure that your local Git repo is up to date and has the correct origins set crossref:committers-guide[keeping_current,as shown above]. [source,shell] .... % git remote -v freebsd https://git.freebsd.org/src.git (fetch) freebsd ssh://git@gitrepo.freebsd.org/src.git (push) .... The first step is to create a fork of https://github.com/freebsd/freebsd-src[FreeBSD] on GitHub following these https://docs.github.com/en/github/getting-started-with-github/fork-a-repo[guidelines]. The destination of the fork should be your own, personal, GitHub account (gvnn3 in my case).
Now add a remote on your local system that points to your fork: [source,shell] .... % git remote add github git@github.com:gvnn3/freebsd-src.git % git remote -v github git@github.com:gvnn3/freebsd-src.git (fetch) github git@github.com:gvnn3/freebsd-src.git (push) freebsd https://git.freebsd.org/src.git (fetch) freebsd ssh://git@gitrepo.freebsd.org/src.git (push) .... With this in place you can create a branch crossref:committers-guide[keeping_a_local_branch,as shown above]. [source,shell] .... % git checkout -b gnn-pr2001-fix .... Make whatever modifications you wish in your branch. Build, test, and once you're ready to collaborate with others it's time to push your changes into your hosted branch. Before you can push you'll have to set the appropriate upstream, as Git will tell you the first time you try to push to your +github+ remote: [source,shell] .... % git push github fatal: The current branch gnn-pr2001-fix has no upstream branch. To push the current branch and set the remote as upstream, use git push --set-upstream github gnn-pr2001-fix .... Setting the push as +git+ advises allows it to succeed: [source,shell] .... % git push --set-upstream github gnn-feature Enumerating objects: 20486, done. Counting objects: 100% (20486/20486), done. Delta compression using up to 8 threads Compressing objects: 100% (12202/12202), done. Writing objects: 100% (20180/20180), 56.25 MiB | 13.15 MiB/s, done. Total 20180 (delta 11316), reused 12972 (delta 7770), pack-reused 0 remote: Resolving deltas: 100% (11316/11316), completed with 247 local objects. remote: remote: Create a pull request for 'gnn-feature' on GitHub by visiting: remote: https://github.com/gvnn3/freebsd-src/pull/new/gnn-feature remote: To github.com:gvnn3/freebsd-src.git * [new branch] gnn-feature -> gnn-feature Branch 'gnn-feature' set up to track remote branch 'gnn-feature' from 'github'. .... Subsequent changes to the same branch will push correctly by default: [source,shell] .... 
% git push Enumerating objects: 4, done. Counting objects: 100% (4/4), done. Delta compression using up to 8 threads Compressing objects: 100% (2/2), done. Writing objects: 100% (3/3), 314 bytes | 1024 bytes/s, done. Total 3 (delta 1), reused 1 (delta 0), pack-reused 0 remote: Resolving deltas: 100% (1/1), completed with 1 local object. To github.com:gvnn3/freebsd-src.git 9e5243d7b659..cf6aeb8d7dda gnn-feature -> gnn-feature .... At this point your work is now in your branch on +GitHub+ and you can share the link with other collaborators. [[github-pull-land]] === Landing a GitHub pull request This section documents how to land a GitHub pull request that's submitted against the FreeBSD Git mirrors at GitHub. While this is not an official way to submit patches at this time, sometimes good fixes come in this way and it is easiest just to bring them into a committer's tree and have them pushed into FreeBSD's tree from there. Similar steps can be used to pull branches from other repositories and land those. When committing pull requests from others, one should take extra care to examine all the changes to ensure they are exactly as represented. Before beginning, make sure that the local Git repo is up to date and has the correct origins set crossref:committers-guide[keeping_current,as shown above]. In addition, make sure to have the following origins: [source,shell] .... % git remote -v freebsd https://git.freebsd.org/src.git (fetch) freebsd ssh://git@gitrepo.freebsd.org/src.git (push) github https://github.com/freebsd/freebsd-src (fetch) github https://github.com/freebsd/freebsd-src (push) .... Often pull requests are simple: requests that contain only a single commit. In this case, a streamlined approach may be used, though the approach in the prior section will also work. Here, a branch is created, the change is cherry picked, the commit message adjusted, and sanity-checked before being pushed. The branch `staging` is used in this example but it can be any name.
This technique works for any number of commits in the pull request, especially when the changes apply cleanly to the FreeBSD tree. However, when there are multiple commits, especially when minor adjustments are needed, `git rebase -i` works better than `git cherry-pick`. Briefly, these commands create a branch, cherry-pick the changes from the pull request, test them, adjust the commit messages, and fast-forward merge the result back to `main`. The PR number is `$PR` below. When adjusting the message, add `Pull Request: https://github.com/freebsd/freebsd-src/pull/$PR`. All pull requests committed to the FreeBSD repository should be reviewed by at least one person. This need not be the person committing it, but in that case the person committing it should trust the other reviewers' competence to review the commit. Committers that do a code review of pull requests before pushing them into the repo should add a `Reviewed by:` line to the commit, because in this case it is not implicit. Add anybody that reviews and approves the commit on GitHub to `Reviewed by:` as well. As always, care should be taken to ensure the change does what it is supposed to, and that no malicious code is present. [NOTE] ====== In addition, please check to make sure that the pull request author name is not anonymous. GitHub's web editing interface generates names like: [source,shell] .... Author: github-user <38923459+github-user@users.noreply.github.com> .... A polite request to the author for a better name and/or email should be made. Extra care should be taken to ensure no style issues or malicious code are introduced. ====== [source,shell] .... % git fetch github pull/$PR/head:staging % git rebase -i main staging # to move the staging branch forward, adjust commit message here % git checkout main % git pull --ff-only # to get the latest if time has passed % git merge --ff-only staging % git push freebsd --push-option=confirm-author ....
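
The author check described in the note above can be partly automated. This is a minimal sketch that assumes the `main` and `staging` branch names from this section. `noreply_authors` is a hypothetical helper name, written for a POSIX `sh`. The pattern only matches GitHub's `users.noreply.github.com` addresses, so other placeholder identities still need a manual look.

```shell
# Hypothetical helper: list commits in a revision range whose author email
# is a GitHub-generated noreply address. Exits non-zero when none are found.
noreply_authors() {
    range=$1
    git log --format='%h %an <%ae>' "$range" |
        grep 'users\.noreply\.github\.com'
}
```

Running `noreply_authors main..staging` after the fetch prints any commits whose author should be asked for a better name or email before the change is pushed.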
[.procedure] ==== For complicated pull requests that have multiple commits with conflicts, use the following outline. . checkout the pull request `git checkout github/pull/XXX` . create a branch to rebase `git checkout -b staging` . rebase the `staging` branch to the latest `main` with `git rebase -i main staging` . resolve conflicts and do whatever testing is needed . fast forward the `staging` branch into `main` as above . final sanity check of changes to make sure all is well . push to FreeBSD's Git repository. This will also work when bringing branches developed elsewhere into the local tree for committing. ==== Once finished with the pull request, close it using GitHub's web interface. It is worth noting that if your `github` origin uses `https://`, the only step you'll need a GitHub account for is closing the pull request. [[vcs-history]] == Version Control History The project has moved to crossref:committers-guide[git-primer,git]. The FreeBSD source repository switched from CVS to Subversion on May 31st, 2008. The first real SVN commit is __r179447__. The source repository switched from Subversion to Git on December 23rd, 2020. The last real svn commit is __r368820__. The first real git commit hash is __5ef5f51d2bef80b0ede9b10ad5b0e9440b60518c__. The FreeBSD `doc/www` repository switched from CVS to Subversion on May 19th, 2012. The first real SVN commit is __r38821__. The documentation repository switched from Subversion to Git on December 8th, 2020. The last SVN commit is __r54737__. The first real git commit hash is __3be01a475855e7511ad755b2defd2e0da5d58bbe__. The FreeBSD `ports` repository switched from CVS to Subversion on July 14th, 2012. The first real SVN commit is __r300894__. The ports repository switched from Subversion to Git on April 6, 2021. The last SVN commit is __r569609__. The first real git commit hash is __ed8d3eda309dd863fb66e04bccaa513eee255cbf__.
[[conventions]] == Setup, Conventions, and Traditions There are a number of things to do as a new developer. The first set of steps is specific to committers only. These steps must be done by a mentor for those who are not committers. [[conventions-committers]] === For New Committers Those who have been given commit rights to the FreeBSD repositories must follow these steps. * Get mentor approval before committing each of these changes! * All [.filename]#src# commits go to FreeBSD-CURRENT first before being merged to FreeBSD-STABLE. The FreeBSD-STABLE branch must maintain ABI and API compatibility with earlier versions of that branch. Do not merge changes that break this compatibility. [[commit-steps]] [.procedure] ==== *Steps for New Committers* . Add an Author Entity + [.filename]#doc/shared/authors.adoc# - Add an author entity. Later steps depend on this entity, and missing this step will cause the [.filename]#doc/# build to fail. This is a relatively easy task, but remains a good first test of version control skills. . Update the List of Developers and Contributors + [.filename]#doc/shared/contrib-committers.adoc# - Add an entry, which will then appear in the "Developers" section of the extref:{contributors}[Contributors List, staff-committers]. Entries are sorted by last name. + [.filename]#doc/shared/contrib-additional.adoc# - _Remove_ the entry. Entries are sorted by first name. . Add a News Item + [.filename]#doc/website/data/en/news/news.toml# - Add an entry. Look for the other entries that announce new committers and follow the format. Use the date from the commit bit approval email. . Add a PGP Key + `{des}` has written a shell script ([.filename]#doc/documentation/tools/addkey.sh#) to make this easier. See the https://cgit.freebsd.org/doc/plain/documentation/static/pgpkeys/README[README] file for more information. + Use [.filename]#doc/documentation/tools/checkkey.sh# to verify that keys meet minimal best-practices standards. 
+ After adding and checking a key, add both updated files to source control and then commit them. Entries in this file are sorted by last name. + [NOTE] ====== It is very important to have a current PGP/GnuPG key in the repository. The key may be required for positive identification of a committer. For example, the `{admins}` might need it for account recovery. A complete keyring of `FreeBSD.org` users is available for download from link:https://docs.FreeBSD.org/pgpkeys/pgpkeys.txt[https://docs.FreeBSD.org/pgpkeys/pgpkeys.txt]. ====== . Update Mentor and Mentee Information + [.filename]#src/share/misc/committers-_repository_.dot# - Add an entry to the current committers section, where _repository_ is `doc`, `ports`, or `src`, depending on the commit privileges granted. + Add an entry for each additional mentor/mentee relationship in the bottom section. . Update git mailmap file + [.filename]#src/.mailmap#, [.filename]#doc/.mailmap#, and [.filename]#ports/.mailmap# - Add an entry for commits you created prior to becoming a FreeBSD committer. + Mapping to your FreeBSD address allows us to track external committers who may be ready for a commit bit more easily. You can also use this to correct old names, misspelled names, etc. in the default `git log` output. . Generate a Kerberos Password + See crossref:committers-guide[kerberos-ldap, Kerberos and LDAP web Password for FreeBSD Cluster] to generate or set a Kerberos account for use with other FreeBSD services like the link:https://bugs.freebsd.org/bugzilla/[bug-tracking database] (you get a bug-tracking account as part of that step). . Optional: Enable Wiki Account + link:https://wiki.freebsd.org[FreeBSD Wiki] Account - A wiki account allows sharing projects and ideas. Those who do not yet have an account can follow instructions on the link:https://wiki.freebsd.org/Wiki/About[Wiki/About page] to obtain one. Contact mailto:wiki-admin@FreeBSD.org[wiki-admin@FreeBSD.org] if you need help with your Wiki account. . 
Optional: Update Wiki Information + Wiki Information - After gaining access to the wiki, some people add entries to the https://wiki.freebsd.org/HowWeGotHere[How We Got Here], https://wiki.freebsd.org/IRC/Nicknames[IRC Nicks], https://wiki.freebsd.org/Community/Dogs[Dogs of FreeBSD], and/or https://wiki.freebsd.org/Community/Cats[Cats of FreeBSD] pages. . Optional: Update Ports with Personal Information + [.filename]#ports/astro/xearth/files/freebsd.committers.markers# and [.filename]#src/usr.bin/calendar/calendars/calendar.freebsd# - Some people add entries for themselves to these files to show where they are located or the date of their birthday. . Optional: Prevent Duplicate Mailings + Subscribers to {dev-commits-doc-all}, {dev-commits-ports-all} or {dev-commits-src-all} might wish to unsubscribe to avoid receiving duplicate copies of commit messages and followups. ==== [[conventions-everyone]] === For Everyone [[conventions-everyone-steps]] [.procedure] ==== . Introduce yourself to the other developers, otherwise no one will have any idea who you are or what you are working on. The introduction need not be a comprehensive biography, just write a paragraph or two about who you are, what you plan to be working on as a developer in FreeBSD, and who will be your mentor. Email this to the {developers-name} and you will be on your way! . Log into `freefall.FreeBSD.org` and create a [.filename]#/var/forward/user# (where _user_ is your username) file containing the e-mail address where you want mail addressed to _yourusername_@FreeBSD.org to be forwarded. This includes all of the commit messages as well as any other mail addressed to the {committers-name} and the {developers-name}. Really large mailboxes which have taken up permanent residence on `freefall` may get truncated without warning if space needs to be freed, so forward it or save it elsewhere.
+ [NOTE] ====== If your e-mail system uses SPF with strict rules, you should exclude `mx2.FreeBSD.org` from SPF checks. ====== + Due to the severe load dealing with SPAM places on the central mail servers that do the mailing list processing, the front-end server does do some basic checks and will drop some messages based on these checks. At the moment proper DNS information for the connecting host is the only check in place but that may change. Some people blame these checks for bouncing valid email. To have these checks turned off for your email, create a file named [.filename]#~/.spam_lover# on `freefall.FreeBSD.org`. + [NOTE] ====== Those who are developers but not committers will not be subscribed to the committers or developers mailing lists. The subscriptions are derived from the access rights. ====== ==== [[smtp-setup]] ==== SMTP Access Setup For those willing to send e-mail messages through the FreeBSD.org infrastructure, follow the instructions below: [.procedure] ==== . Point your mail client at `smtp.FreeBSD.org:587`. . Enable STARTTLS. . Ensure your `From:` address is set to `_yourusername_@FreeBSD.org`. . For authentication, you can use your FreeBSD Kerberos username and password (see crossref:committers-guide[kerberos-ldap, Kerberos and LDAP web Password for FreeBSD Cluster]). The `_yourusername_/mail` principal is preferred, as it is only valid for authenticating to mail resources. + [NOTE] ====== Do not include `@FreeBSD.org` when entering in your username. ====== + .Additional Notes [NOTE] ====== * Will only accept mail from `_yourusername_@FreeBSD.org`. If you are authenticated as one user, you are not permitted to send mail from another. * A header will be appended with the SASL username: (`Authenticated sender: _username_`). * Host has various rate limits in place to cut down on brute force attempts. 
====== ==== [[smtp-setup-local-mta]] ===== Using a Local MTA to Forward Emails to the FreeBSD.org SMTP Service It is also possible to use a local MTA to forward locally sent emails to the FreeBSD.org SMTP servers. [[smtp-setup-local-postfix]] .Using Postfix [example] ==== To tell a local Postfix instance that anything from `_yourusername_@FreeBSD.org` should be forwarded to the FreeBSD.org servers, add this to your [.filename]#main.cf#: [.programlisting] .... sender_dependent_relayhost_maps = hash:/usr/local/etc/postfix/relayhost_maps smtp_sasl_auth_enable = yes smtp_sasl_security_options = noanonymous smtp_sasl_password_maps = hash:/usr/local/etc/postfix/sasl_passwd smtp_use_tls = yes .... Create [.filename]#/usr/local/etc/postfix/relayhost_maps# with the following content: [.programlisting] .... yourusername@FreeBSD.org [smtp.freebsd.org]:587 .... Create [.filename]#/usr/local/etc/postfix/sasl_passwd# with the following content: [.programlisting] .... [smtp.freebsd.org]:587 yourusername:yourpassword .... If the email server is used by other people, you may want to prevent them from sending e-mails from your address. To achieve this, add this to your [.filename]#main.cf#: [.programlisting] .... smtpd_sender_login_maps = hash:/usr/local/etc/postfix/sender_login_maps smtpd_sender_restrictions = reject_known_sender_login_mismatch .... Create [.filename]#/usr/local/etc/postfix/sender_login_maps# with the following content: [.programlisting] .... yourusername@FreeBSD.org yourlocalusername .... Where _yourlocalusername_ is the SASL username used to connect to the local instance of Postfix. ==== [[smtp-setup-local-opensmtpd]] .Using OpenSMTPD [example] ==== To tell a local OpenSMTPD instance that anything from `_yourusername_@FreeBSD.org` should be forwarded to the FreeBSD.org servers, add this to your [.filename]#smtpd.conf#: [.programlisting] .... 
action "freebsd" relay host smtp+tls://freebsd@smtp.freebsd.org:587 auth match from any auth yourlocalusername mail-from "_yourusername_@freebsd.org" for any action "freebsd" .... Where _yourlocalusername_ is the SASL username used to connect to the local instance of OpenSMTPD. Create [.filename]#/usr/local/etc/mail/secrets# with the following content: [.programlisting] .... freebsd yourusername:yourpassword .... ==== [[smtp-setup-local-exim]] .Using Exim [example] ==== To direct a local Exim instance to forward all mail from `_example_@FreeBSD.org` to FreeBSD.org servers, add this to Exim [.filename]#configuration#: [.programlisting] .... Routers section: (at the top of the list): freebsd_send: driver = manualroute domains = !+local_domains transport = freebsd_smtp route_data = ${lookup {${lc:$sender_address}} lsearch {/usr/local/etc/exim/freebsd_send}} Transport Section: freebsd_smtp: driver = smtp tls_certificate= tls_privatekey= tls_require_ciphers = EECDH+ECDSA+AESGCM:EECDH+aRSA+AESGCM:EECDH+ECDSA+SHA384:EECDH+ECDSA+SHA256:EECDH+aRSA+SHA384:EECDH+aRSA+SHA256:EECDH+AESGCM:EECDH:EDH+AESGCM:EDH+aRSA:HIGH:!MEDIUM:!LOW:!aNULL:!eNULL:!LOW:!RC4:!MD5:!EXP:!PSK:!SRP:!DSS dkim_domain = dkim_selector = dkim_private_key= dnssec_request_domains = * hosts_require_auth = smtp.freebsd.org Authenticators: freebsd_plain: driver = plaintext public_name = PLAIN client_send = ^example/mail^examplePassword client_condition = ${if eq{$host}{smtp.freebsd.org}} .... Create [.filename]#/usr/local/etc/exim/freebsd_send# with the following content: [.programlisting] .... example@freebsd.org:smtp.freebsd.org::587 .... ==== [[mentors]] === Mentors All new developers have a mentor assigned to them for the first few months. A mentor is responsible for teaching the mentee the rules and conventions of the project and guiding their first steps in the developer community. The mentor is also personally responsible for the mentee's actions during this initial period. 
For committers: do not commit anything without first getting mentor approval. Document that approval with an `Approved by:` line in the commit message. When the mentor decides that a mentee has learned the ropes and is ready to commit on their own, the mentor announces it with a commit to [.filename]#mentors#. This file is in the [.filename]#admin# orphan branch of each repository. Detailed information on how to access these branches can be found in crossref:committers-guide[admin-branch, "admin" branch]. [[pre-commit-review]] == Pre-Commit Review Code review is one way to increase the quality of software. The following guidelines apply to commits to the `main` (-CURRENT) branch of the `src` repository. Other branches and the `ports` and `docs` trees have their own review policies, but these guidelines generally apply to commits requiring review: * All non-trivial changes should be reviewed before they are committed to the repository. * Reviews may be conducted by email, in Bugzilla, in Phabricator, or by another mechanism. Where possible, reviews should be public. * The developer responsible for a code change is also responsible for making all necessary review-related changes. * Code review can be an iterative process, which continues until the patch is ready to be committed. Specifically, once a patch is sent out for review, it should receive an explicit "looks good" before it is committed. So long as it is explicit, this can take whatever form makes sense for the review method. * Timeouts are not a substitute for review. Sometimes code reviews will take longer than you would hope for, especially for larger features. Accepted ways to speed up review times for your patches are: * Review other people's patches. If you help out, everybody will be more willing to do the same for you; goodwill is our currency. * Ping the patch. If it is urgent, provide reasons why it is important to you to get this patch landed and ping it every couple of days. 
If it is not urgent, the common courtesy ping rate is one week. Remember that you are asking for valuable time from other professional developers. * Ask for help on mailing lists, IRC, etc. Others may be able to either help you directly, or suggest a reviewer. * Split your patch into multiple smaller patches that build on each other. The smaller your patch, the higher the probability that somebody will take a quick look at it. + When making large changes, it is helpful to keep this in mind from the beginning of the effort as breaking large changes into smaller ones is often difficult after the fact. Developers should participate in code reviews as both reviewers and reviewees. If someone is kind enough to review your code, you should return the favor for someone else. Note that while anyone is welcome to review and give feedback on a patch, only an appropriate subject-matter expert can approve a change. This will usually be a committer who works with the code in question on a regular basis. In some cases, no subject-matter expert may be available. In those cases, a review by an experienced developer is sufficient when coupled with appropriate testing. [[commit-log-message]] == Commit Log Messages This section contains some suggestions and traditions for how commit logs are formatted. === Why are commit messages important? When you commit a change in Git, Subversion, or another version control system (VCS), you're prompted to write some text describing the commit -- a commit message. How important is this commit message? Should you spend some significant effort writing it? Does it really matter if you write simply `fixed a bug`? Most projects have more than one developer and last for some length of time. Commit messages are a very important method of communicating with other developers, in the present and for the future. FreeBSD has hundreds of active developers and hundreds of thousands of commits spanning decades of history. 
Over that time the developer community has learned how valuable good commit messages are; sometimes these are hard-learned lessons. Commit messages serve at least three purposes: * Communicating with other developers + FreeBSD commits generate email to various mailing lists. These include the commit message along with a copy of the patch itself. Commit messages are also viewed through commands like `git log`. These serve to make other developers aware of changes that are ongoing; another developer may want to test the change, may have an interest in the topic and will want to review in more detail, or may have their own projects underway that would benefit from interaction. * Making changes discoverable + In a large project with a long history it may be difficult to find changes of interest when investigating an issue or change in behaviour. Verbose, detailed commit messages allow searches for changes that might be relevant. For example, `git log --since 1year --grep 'USB timeout'`. * Providing historical documentation + Commit messages serve to document changes for future developers, perhaps years or decades later. This future developer may even be you, the original author. A change that seems obvious today may be far less obvious later on. The `git blame` command annotates each line of a source file with the change (hash and subject line) that brought it in. Having established the importance, here are elements of a good FreeBSD commit message: === Start with a subject line Commit messages should start with a single-line subject that briefly summarizes the change. The subject should, by itself, allow the reader to quickly determine if the change is of interest or not. === Keep subject lines short The subject line should be as short as possible while still retaining the required information. This is to make browsing the Git log more efficient, and so that `git log --oneline` can display the short hash and subject on a single 80-column line.
A good rule of thumb is to stay below 67 characters, and aim for about 50 or fewer if possible. === Prefix the subject line with a component, if applicable If the change relates to a specific component the subject line may be prefixed with that component name and a colon (:). If applicable, try to use the same prefix used in previous commits to the same files. ✓ `foo: Add -k option to keep temporary data` Include the prefix in the 67-character limit suggested above, so that `git log --oneline` avoids wrapping. === Capitalize the first letter of the subject Capitalize the first letter of the subject itself. The prefix, if any, is not capitalized unless necessary (e.g., `USB:` is capitalized). === Do not end the subject line with punctuation Do not end with a period or other punctuation. In this regard the subject line is like a newspaper headline. === Separate the subject and body with a blank line Separate the body from the subject with a blank line. Some trivial commits do not require a body, and will have only a subject. ✓ `ls: Fix typo in usage text` === Limit messages to 72 columns `git log` and `git format-patch` indent the commit message by four spaces. Wrapping at 72 columns provides a matching margin on the right edge. Limiting messages to 72 characters also keeps the commit message in formatted patches below RFC 2822's suggested email line length limit of 78 characters. This limit works well with a variety of tools that may render commit messages; line wrapping might be inconsistent with longer line length. === Use the present tense, imperative mood This facilitates short subject lines and provides consistency, including with automatically generated commit messages (e.g., as generated by git revert). This is important when reading a list of commit subjects. Think of the subject as finishing the sentence "when applied, this change will ...". 
✓ `foo: Implement the -k (keep) option` + ✗ `foo: Implemented the -k option` + ✗ `This change implements the -k option in foo` + ✗ `-k option added` === Focus on what and why, not how Explain what the change accomplishes and why it is being done, rather than how. Do not assume that the reader is familiar with the issue. Explain the background and motivation for the change. Include benchmark data if you have it. If there are limitations or incomplete aspects of the change, describe them in the commit message. === Consider whether parts of the commit message could be code comments instead Sometimes while writing a commit message you may find yourself writing a sentence or two explaining some tricky or confusing aspect of the change. When this happens consider whether it would be valuable to have that explanation as a comment in the code itself. === Write commit messages for your future self While writing the commit message for a change you have all of the context in mind - what prompted the change, alternate approaches that were considered and rejected, limitations of the change, and so on. Imagine yourself revisiting the change a year or two in the future, and write the commit message in a way that would provide that necessary context. === Commit messages should stand alone You may include references to mailing list postings, benchmark result web sites, or code review links. However, the commit message should contain all of the relevant information in case these references are no longer available in the future. Similarly, a commit may refer to a previous commit, for example in the case of a bug fix or revert. In addition to the commit identifier (revision or hash), include the subject line from the referenced commit (or another suitable brief reference). With each VCS migration (from CVS to Subversion to Git) revision identifiers from previous systems may become difficult to follow. 
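The subject-line conventions described earlier in this section are regular enough to check mechanically. The following is a minimal sketch rather than an official project tool: the function name `check_subject` is hypothetical, and it only tests the 67-character rule of thumb, the absence of trailing punctuation, and the capitalized first letter after an optional `component:` prefix. It does not attempt to judge tense or mood.

```shell
#!/bin/sh
# Sketch only: check_subject is a hypothetical helper, not a project tool.
# It prints "ok" or a short complaint for one commit subject line.
check_subject() {
    subject=$1

    # Rule of thumb from this guide: no more than 67 characters.
    if [ "${#subject}" -gt 67 ]; then
        echo "too long"
        return 1
    fi

    # The subject must not end with a period or similar punctuation.
    case $subject in
        *[?!.,:] | *";")
            echo "trailing punctuation"
            return 1
            ;;
    esac

    # Treat the text before the first ": " as a component prefix,
    # but only when that text contains no spaces.
    rest=$subject
    prefix=${subject%%: *}
    if [ "$prefix" != "$subject" ]; then
        case $prefix in
            *" "*) ;;
            *) rest=${subject#*: } ;;
        esac
    fi

    # The first letter of the subject itself should be capitalized.
    case $rest in
        [[:lower:]]*)
            echo "not capitalized"
            return 1
            ;;
    esac

    echo "ok"
}
```

One possible use is `check_subject "$(git log -1 --format=%s)"`, which checks the subject of the most recent commit before it is pushed.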
=== Include appropriate metadata in a footer As well as including an informative message with each commit, some additional information may be needed. This information consists of one or more lines containing the key word or phrase, a colon, tabs for formatting, and then the additional information. For key words where multiple values make sense (e.g., `PR:` with a comma-separated list of PRs), it is permitted to use the same keyword multiple times to avoid ambiguity or improve readability. The key words or phrases are: [.informaltable] [cols="20%,80%", frame="none"] |=== |`PR:` |The problem report (if any) which is affected (typically, by being closed) by this commit. Multiple PRs may be specified on one line, separated by commas or spaces. |`Reported by:` |The name and e-mail address of the person that reported the issue; for developers, just the username on the FreeBSD cluster. Typically used when there is no PR, for example if the issue was reported on a mailing list. |`Submitted by:` + (discouraged) |Name of an author who submitted a change without providing a full valid patch, especially without a valid email. Submitted patches should have the author set by using `git commit --author` with a full name and valid email. Before the migration to git allowed separate author and committer fields, this was used for contributed patches. |`Reviewed by:` | The name and e-mail address of the person or people that reviewed the change; for developers, just the username on the FreeBSD cluster. If a patch was submitted to a mailing list for review, and the review was favorable, then just include the list name. If the reviewer is not a member of the project, provide the name, email, and if ports an external role like maintainer: Reviewed by a developer: [source,shell] .... Reviewed by: username .... Reviewed by a ports maintainer that is not a developer: [source,shell] .... Reviewed by: Full Name (maintainer) .... 
|`Tested by:` |The name and e-mail address of the person or people that tested the change; for developers, just the username on the FreeBSD cluster. |`Discussed with:` |The name and e-mail address of the person or people that contributed to the patch by providing meaningful feedback; for developers, just the username on the FreeBSD cluster. Typically used to credit those who did not explicitly review, test, or approve the change, but nevertheless contributed to the discussion surrounding the change, which led to improvements and a better understanding of its impact on the FreeBSD project. |`Approved by:` a| The name and e-mail address of the person or people that approved the change; for developers, just the username on the FreeBSD cluster. There are several cases where approval is customary: * while a new committer is under mentorship * commits to an area of the tree covered by the LOCKS file (src) * during a release cycle * committing to a repo where you do not hold a commit bit (e.g. src committer committing to docs) * committing to a port maintained by someone else While under mentorship, get mentor approval before the commit. Enter the mentor's username in this field, and note that they are a mentor: [source,shell] .... Approved by: username-of-mentor (mentor) .... If a team approved these commits then include the team name followed by the username of the approver in parentheses. For example: [source,shell] .... Approved by: re (username) .... |`Obtained from:` |The name of the project (if any) from which the code was obtained. Do not use this line for the name of an individual person. |`Fixes:` |The Git short hash and the title line of a commit that is fixed by this change as returned by `git log -n1 --format='%h ("%s")' GIT-COMMIT-HASH`. We include the commit title so that the referenced commit can be located even in the case that a future VCS migration invalidates hash references. 
|`MFC after:` |To receive an e-mail reminder to MFC at a later date, specify the number of days, weeks, or months after which an MFC is planned. |`MFC to:` |If the commit should be merged to a subset of stable branches, specify the branch names. |`MFH:` |If the commit is to be merged into a ports quarterly branch, specify the quarterly branch. For example, `2021Q2`. |`Relnotes:` |If the change is a candidate for inclusion in the release notes for the next release from the branch, set to `yes`. Candidates are user-visible changes, new features, compatibility breaks, etc. If you forget to set this line, or want to provide more details, add an entry to the `RELNOTES` file in the root of the src tree. The `RELNOTES` file is used to generate release notes for the next release. Do not use the `Relnotes:` line to describe the change: its only valid value is `yes`. |`Security:` |If the change is related to a security vulnerability or security exposure, include one or more references or a description of the issue. If possible, include a VuXML URL or a CVE ID. |`Event:` |The description for the event where this commit was made. If this is a recurring event, add the year or even the month to it. For example, this could be `FooBSDcon 2019`. The idea behind this line is to give recognition to conferences, gatherings, and other types of meetups and to show that these are useful to have. Please do not use the `Sponsored by:` line for this as that is meant for organizations sponsoring certain features or developers working on them. |`Sponsored by:` |Sponsoring organizations for this change, if any. Separate multiple organizations with commas. If only a portion of the work was sponsored, or different amounts of sponsorship were provided to different authors, please give appropriate credit in parentheses after each sponsor name.
For example, `Example.com (alice, code refactoring), Wormulon (bob), Momcorp (cindy)` shows that Alice was sponsored by Example.com to do code refactoring, while Wormulon sponsored Bob's work and Momcorp sponsored Cindy's work. Other authors were either not sponsored or chose not to list sponsorship. |`Pull Request:` |This change was submitted as a pull request or merge request against one of FreeBSD's public read-only Git repositories. It should include the entire URL to the pull request, as these often act as code reviews for the code. For example: `https://github.com/freebsd/freebsd-src/pull/745` | `Closes:` | This change concludes the patch series discussed at the specified GitHub pull request, and closes that request. It should include the entire URL to the pull request, as these often act as code reviews for the code. For example: `https://github.com/freebsd/freebsd-src/pull/745` |`Co-authored-by:` |The name and email address of an additional author of the commit. GitHub has a detailed description of the Co-authored-by trailer at https://docs.github.com/en/pull-requests/committing-changes-to-your-project/creating-and-editing-commits/creating-a-commit-with-multiple-authors. |`Signed-off-by:` |ID certifies compliance with https://developercertificate.org/ |`Differential Revision:` |The full URL of the Phabricator review. This line __must be the last line__. For example: `https://reviews.freebsd.org/D1708`. |=== .Commit Log for a Commit Based on a PR [example] ==== The commit is based on a patch from a PR submitted by John Smith. The commit message "PR" field is filled in. [.programlisting] .... ... PR: 12345 .... The committer sets the author of the patch with `git commit --author "John Smith "`. ==== .Commit Log for a Commit Needing Review [example] ==== The virtual memory system is being changed. Patches were posted to the appropriate mailing list (in this case, `freebsd-arch`) and the changes have been approved. [.programlisting] .... ...
Reviewed by: -arch .... ==== .Commit Log for a Commit Needing Approval [example] ==== Commit a port, after working with the listed MAINTAINER, who said to go ahead and commit. [.programlisting] .... ... Approved by: abc (maintainer) .... Where _abc_ is the account name of the person who approved. ==== .Commit Log for a Commit Bringing in Code from OpenBSD [example] ==== Committing some code based on work done in the OpenBSD project. [.programlisting] .... ... Obtained from: OpenBSD .... ==== .Commit Log for a Change to FreeBSD-CURRENT with a Planned Commit to FreeBSD-STABLE to Follow at a Later Date. [example] ==== Committing some code which will be merged from FreeBSD-CURRENT into the FreeBSD-STABLE branch after two weeks. [.programlisting] .... ... MFC after: 2 weeks .... Where _2_ is the number of days, weeks, or months after which an MFC is planned. The _weeks_ option may be `day`, `days`, `week`, `weeks`, `month`, `months`. ==== It is often necessary to combine these. Consider the situation where a user has submitted a PR containing code from the NetBSD project. Looking at the PR, the developer sees it is not an area of the tree they normally work in, so they have the change reviewed by the `arch` mailing list. Since the change is complex, the developer opts to MFC after one month to allow adequate testing. The extra information to include in the commit would look something like .Example Combined Commit Log [example] ==== [.programlisting] .... PR: 54321 Reviewed by: -arch Obtained from: NetBSD MFC after: 1 month Relnotes: yes .... ==== [[pref-license]] == Preferred License for New Files The FreeBSD Project's full license policy can be found at link:https://www.FreeBSD.org/internal/software-license/[https://www.FreeBSD.org/internal/software-license]. The rest of this section is intended to help you get started. As a rule, when in doubt, ask. It is much easier to give advice than to fix the source tree. 
The FreeBSD Project suggests and uses this text as the preferred license scheme: [.programlisting] .... /* * SPDX-License-Identifier: BSD-2-Clause * * Copyright (c) [year] [your name] * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * [id for your version control system, if any] */ .... The FreeBSD project strongly discourages the so-called "advertising clause" in new code. Due to the large number of contributors to the FreeBSD project, complying with this clause for many commercial vendors has become difficult. If you have code in the tree with the advertising clause, please consider removing it. In fact, please consider using the above license for your code. The FreeBSD project discourages completely new licenses and variations on the standard licenses. 
New licenses require the approval of {core-email} to reside in the `src` repository. The more licenses that are used in the tree, the more problems this causes for those wishing to use the code, typically through unintended consequences of a poorly worded license. Project policy dictates that code under some non-BSD licenses must be placed only in specific sections of the repository, and in some cases, compilation must be conditional or even disabled by default. For example, the GENERIC kernel must be compiled only from code under licenses identical to or substantially similar to the BSD license. Software licensed under the GPL, APSL, CDDL, and similar licenses must not be compiled into GENERIC. Developers are reminded that in open source, getting "open" right is just as important as getting "source" right, as improper handling of intellectual property has serious consequences. Any questions or concerns should immediately be brought to the attention of the core team. [[tracking.license.grants]] == Keeping Track of Licenses Granted to the FreeBSD Project The repositories contain various software and data for which the FreeBSD project has been granted a special license. A case in point is the Terminus fonts for use with man:vt[4]. Here the author, Dimitar Zhekov, has allowed us to use the "Terminus BSD Console" font under a 2-clause BSD license rather than the regular Open Font License he normally uses. It is clearly sensible to keep a record of any such license grants. To that end, the {core-email} has decided to keep an archive of them. Whenever the FreeBSD project is granted a special license we require the {core-email} to be notified. Developers involved in arranging such a license grant should send details to the {core-email}, including: * Contact details for people or organizations granting the special license. * What files, directories etc.
in the repositories are covered by the license grant, including the revision numbers where any specially licensed material was committed. * The date from which the license comes into effect. Unless otherwise agreed, this will be the date the license was issued by the authors of the software in question. * The license text. * A note of any restrictions, limitations or exceptions that apply specifically to FreeBSD's usage of the licensed material. * Any other relevant information. Once the {core-email} is satisfied that all the necessary details have been gathered and are correct, the secretary will send a PGP-signed acknowledgment of receipt including the license details. This receipt will be persistently archived and serve as our permanent record of the license grant. The license archive should contain only details of license grants; this is not the place for any discussions around licensing or other subjects. Access to data within the license archive will be available on request to the {core-email}. [[spdx.tags]] == SPDX Tags in the tree The project uses https://spdx.dev[SPDX] tags in its source base. At present, these tags are intended to help automated tools reconstruct license requirements mechanically. All _SPDX-License-Identifier_ tags in the tree should be considered to be informative. All files in the FreeBSD source tree with these tags also have a copy of the license which governs use of that file. In the event of a discrepancy, the verbatim license is controlling. The project tries to follow the https://spdx.github.io/spdx-spec/v2.2.2/[SPDX Specification, Version 2.2]. Valid algebraic expressions and how to mark source files are described in https://spdx.github.io/spdx-spec/v2.2.2/SPDX-license-expressions/[Annex D] and https://spdx.github.io/spdx-spec/v2.2.2/using-SPDX-short-identifiers-in-source-files/[Annex E], respectively. The project draws identifiers from SPDX's list of valid https://spdx.org/licenses/[short license identifiers].
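Because the tag is plain text, ordinary text utilities are enough to read it. The following is a minimal sketch, assuming the identifier is the last thing on its line; `spdx_id` is a hypothetical helper name, not something the project provides.

```shell
#!/bin/sh
# Sketch only: spdx_id is a hypothetical helper, not a project tool.
# Print the first SPDX-License-Identifier value found in the file $1.
# Prints nothing when the file carries no tag.
spdx_id() {
    sed -n 's/.*SPDX-License-Identifier:[[:space:]]*//p' "$1" | head -n 1
}
```

Since the tags are informative only, output from a helper like this is a starting point for tooling and does not replace reading the verbatim license text in the file.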
The project uses only the _SPDX-License-Identifier_ tag. As of March 2021, approximately 25,000 out of 90,000 files in the tree have been marked. [[developer.relations]] == Developer Relations When working directly on your own code or on code which is already well established as your responsibility, there is probably little need to check with other committers before jumping in with a commit. When working on a bug in an area of the system which is clearly orphaned (and there are a few such areas, to our shame), the same applies. When modifying parts of the system which are maintained, formally or informally, consider asking for a review just as a developer would have before becoming a committer. For ports, contact the listed `MAINTAINER` in the [.filename]#Makefile#. To determine if an area of the tree is maintained, check the [.filename]#MAINTAINERS# file at the root of the tree. If nobody is listed, scan the revision history to see who has committed changes in the past. To list the names and email addresses of all commit authors for a given file in the last 2 years and the number of commits each has authored, ordered by descending number of commits, use: [source,shell] ---- % git -C /path/to/repo shortlog -sne --since="2 years" -- relative/path/to/file ---- If queries go unanswered or the committer otherwise indicates a lack of interest in the area affected, go ahead and commit the change. [IMPORTANT] ==== Avoid sending private emails to maintainers. Other people might be interested in the conversation, not just the final output. ==== If there is any doubt about a commit for any reason at all, have it reviewed before committing. Better to have it flamed then and there rather than when it is part of the repository. If a commit does result in controversy erupting, it may be advisable to consider backing the change out again until the matter is settled. Remember, with a version control system we can always change it back. Do not impugn the intentions of others.
If they see a different solution to a problem, or even a different problem, it is probably not because they are stupid, because they have questionable parentage, or because they are trying to destroy hard work, personal image, or FreeBSD, but basically because they have a different outlook on the world. Different is good. Disagree honestly. Argue your position from its merits, be honest about any shortcomings it may have, and consider their solution, or even their vision of the problem, with an open mind. Accept correction. We are all fallible. When you have made a mistake, apologize and get on with life. Do not beat yourself up, and certainly do not beat up others for your mistake. Do not waste time on embarrassment or recrimination; just fix the problem and move on. Ask for help. Seek out (and give) peer reviews. One of the ways open source software is supposed to excel is in the number of eyeballs applied to it; this does not apply if nobody will review code. [[if-in-doubt]] == If in Doubt... When unsure about something, whether it be a technical issue or a project convention, be sure to ask. If you stay silent you will never make progress. If it relates to a technical issue, ask on the public mailing lists. Avoid the temptation to email the individual person that knows the answer. This way everyone will be able to learn from the question and the answer. For project-specific or administrative questions ask, in order: * Your mentor or former mentor. * An experienced committer on IRC, email, etc. * Any team with a "hat", as they can give you a definitive answer. * If still not sure, ask on {developers-name}. Once your question is answered, if no one pointed you to documentation that spelled out the answer to your question, document it, as others will have the same question. [[bugzilla]] == Bugzilla The FreeBSD Project utilizes Bugzilla for tracking bugs and change requests. If you commit a fix or suggestion found in the PR database, be sure to close the PR.
It is also considered nice if you take time to close any other PRs associated with your commits.

Committers with non-``FreeBSD.org`` Bugzilla accounts can have the old account merged with the `FreeBSD.org` account by following these steps:

[.procedure]
====
. Log in using your old account.
. Open a new bug. Choose `Services` as the Product, and `Bug Tracker` as the Component. In the bug description, list the accounts you wish to be merged.
. Log in using your `FreeBSD.org` account and post a comment to the newly opened bug to confirm ownership. See crossref:committers-guide[kerberos-ldap, Kerberos and LDAP web Password for FreeBSD Cluster] for more details on how to generate or set a password for your `FreeBSD.org` account.
. If there are more than two accounts to merge, post comments from each of them.
====

You can find out more about Bugzilla at:

* extref:{pr-guidelines}[FreeBSD Problem Report Handling Guidelines]
* link:https://www.FreeBSD.org/support/[https://www.FreeBSD.org/support]

[[phabricator]]
== Phabricator

The FreeBSD Project utilizes https://reviews.freebsd.org[Phabricator] for code review requests. See the https://wiki.freebsd.org/Phabricator[Phabricator wiki page] for details.

Please use the `git arc` command provided by `devel/freebsd-git-arc` (install the port or package, then type `git help arc` for documentation) to create and update Phabricator reviews. This will make it easier for others to review and test your patches.

Committers with non-``FreeBSD.org`` Phabricator accounts can have the old account renamed to the ``FreeBSD.org`` account by following these steps:

[.procedure]
====
. Change your Phabricator account email to your `FreeBSD.org` email.
. Open a new bug on our bug tracker using your `FreeBSD.org` account; see crossref:committers-guide[bugzilla, Bugzilla] for more information. Choose `Services` as the Product, and `Code Review` as the Component.
In the bug description, request that your Phabricator account be renamed, and provide a link to your Phabricator user. For example, `https://reviews.freebsd.org/p/bob_example.com/`
====

[IMPORTANT]
====
Phabricator accounts cannot be merged; please do not open a new account.
====

[[people]]
== Who's Who

Besides the repository meisters, there are other FreeBSD project members and teams whom you will probably get to know in your role as a committer. Briefly, and by no means all-inclusively, these are:

`{doceng}`:: doceng is the group responsible for the documentation build infrastructure, approving new documentation committers, and ensuring that the FreeBSD website and documentation on the FTP site are up to date with respect to the Git tree. It is not a conflict resolution body. The vast majority of documentation-related discussion takes place on the {freebsd-doc}. More details regarding the doceng team can be found in its https://www.FreeBSD.org/internal/doceng/[charter]. Committers interested in contributing to the documentation should familiarize themselves with the extref:{fdp-primer}[Documentation Project Primer].

`{re-members}`:: These are the members of the `{re}`. This team is responsible for setting release deadlines and controlling the release process. During code freezes, the release engineers have final authority on all changes to the system for whichever branch is pending release status. If there is something you want merged from FreeBSD-CURRENT to FreeBSD-STABLE (whatever values those may have at any given time), these are the people to talk to about it.

`{so}`:: `{so-name}` is the link:https://www.FreeBSD.org/security/[FreeBSD Security Officer] and oversees the `{security-officer}`.

{committers-name}:: {dev-src-all}, {dev-ports-all} and {dev-doc-all} are the mailing lists that the version control system uses to send commit messages to. _Never_ send email directly to these lists.
Only send replies to these lists when they are short and are directly related to a commit.

{developers-name}:: All committers are subscribed to -developers. This list was created to be a forum for committers' "community" issues. Examples are Core voting, announcements, etc.
+
The {developers-name} is for the exclusive use of FreeBSD committers. To develop FreeBSD, committers must have the ability to openly discuss matters that will be resolved before they are publicly announced. Frank discussions of work in progress are not suitable for open publication and may harm FreeBSD.
+
All FreeBSD committers are expected not to publish or forward messages from the {developers-name} outside the list membership without permission of all of the authors. Violators will be removed from the {developers-name}, resulting in a suspension of commit privileges. Repeated or flagrant violations may result in permanent revocation of commit privileges.
+
This list is _not_ intended as a place for code reviews or for any technical discussion. In fact, using it as such hurts the FreeBSD Project, as it gives a sense of a closed list where general decisions affecting all of the FreeBSD-using community are made without being "open". Last, but not least, __never, never ever, email the {developers-name} and CC:/BCC: another FreeBSD list__. Never, ever email another FreeBSD email list and CC:/BCC: the {developers-name}. Doing so can greatly diminish the benefits of this list.

[[ssh.guide]]
== SSH Quick-Start Guide

[.procedure]
====
. If you do not wish to type your password in every time you use man:ssh[1], and you use keys to authenticate, man:ssh-agent[1] is there for your convenience. If you want to use man:ssh-agent[1], make sure that you run it before running other applications. X users, for example, usually do this from their [.filename]#.xsession# or [.filename]#.xinitrc#. See man:ssh-agent[1] for details.
. Generate a key pair using man:ssh-keygen[1].
The key pair will wind up in your [.filename]#$HOME/.ssh/# directory.
+
[IMPORTANT]
======
Only ECDSA, Ed25519 or RSA keys are supported.
======
. Send your public key ([.filename]#$HOME/.ssh/id_ecdsa.pub#, [.filename]#$HOME/.ssh/id_ed25519.pub#, or [.filename]#$HOME/.ssh/id_rsa.pub#) to the person setting you up as a committer so it can be put into [.filename]#yourlogin# in [.filename]#/etc/ssh-keys/# on `freefall`.
====

Now man:ssh-add[1] can be used for authentication once per session. It prompts for the private key's pass phrase, and then stores it in the authentication agent (man:ssh-agent[1]). Use `ssh-add -d` to remove keys stored in the agent.

Test with a simple remote command: `ssh freefall.FreeBSD.org ls /usr`.

For more information, see package:security/openssh-portable[], man:ssh[1], man:ssh-add[1], man:ssh-agent[1], man:ssh-keygen[1], and man:scp[1].

For information on adding, changing, or removing man:ssh[1] keys, see https://wiki.freebsd.org/clusteradm/ssh-keys[this article].

[[coverity]]
== Coverity(R) Availability for FreeBSD Committers

All FreeBSD developers can obtain access to Coverity analysis results of all FreeBSD Project software. Anyone interested in obtaining access to the analysis results of the automated Coverity runs can sign up at http://scan.coverity.com/[Coverity Scan].

The FreeBSD wiki includes a mini-guide for developers who are interested in working with the Coverity(R) analysis reports: https://wiki.freebsd.org/CoverityPrevent[https://wiki.freebsd.org/CoverityPrevent]. Please note that this mini-guide is only readable by FreeBSD developers, so if you cannot access this page, you will have to ask someone to add you to the appropriate Wiki access list.

Finally, all FreeBSD developers who are going to use Coverity(R) are always encouraged to ask for more details and usage information, by posting any questions to the mailing list of the FreeBSD developers.
[[rules]]
== The FreeBSD Committers' Big List of Rules

Everyone involved with the FreeBSD project is expected to abide by the _Code of Conduct_ available from link:https://www.FreeBSD.org/internal/code-of-conduct/[https://www.FreeBSD.org/internal/code-of-conduct]. As committers, you form the public face of the project, and how you behave has a vital impact on the public perception of it. This guide expands on the parts of the _Code of Conduct_ specific to committers.

. Respect other committers.
. Respect other contributors.
. Discuss any significant change _before_ committing.
. Respect existing maintainers (if listed in the `MAINTAINER` field in [.filename]#Makefile# or in [.filename]#MAINTAINER# in the top-level directory).
. Any disputed change must be backed out pending resolution of the dispute if requested by a maintainer. Security related changes may override a maintainer's wishes at the Security Officer's discretion.
. Changes go to FreeBSD-CURRENT before FreeBSD-STABLE unless specifically permitted by the release engineer or unless they are not applicable to FreeBSD-CURRENT. Any non-trivial or non-urgent change which is applicable should also be allowed to sit in FreeBSD-CURRENT for at least 3 days before merging so that it can be given sufficient testing. The release engineer has the same authority over the FreeBSD-STABLE branch as outlined for the maintainer in rule #5.
. Do not fight in public with other committers; it looks bad.
. Respect all code freezes and read the `committers` and `developers` mailing lists in a timely manner so you know when a code freeze is in effect.
. When in doubt on any procedure, ask first!
. Test your changes before committing them.
. Do not commit to contributed software without _explicit_ approval from the respective maintainers.

As noted, breaking some of these rules can be grounds for suspension or, upon repeated offense, permanent removal of commit privileges.
Individual members of core have the power to temporarily suspend commit privileges until core as a whole has the chance to review the issue. In case of an "emergency" (a committer doing damage to the repository), a temporary suspension may also be done by the repository meisters. Only a 2/3 majority of core has the authority to suspend commit privileges for longer than a week or to remove them permanently. This rule does not exist to set core up as a bunch of cruel dictators who can dispose of committers as casually as empty soda cans, but to give the project a kind of safety fuse. If someone is out of control, it is important to be able to deal with this immediately rather than be paralyzed by debate. In all cases, a committer whose privileges are suspended or revoked is entitled to a "hearing" by core, the total duration of the suspension being determined at that time. A committer whose privileges are suspended may also request a review of the decision after 30 days and every 30 days thereafter (unless the total suspension period is less than 30 days). A committer whose privileges have been revoked entirely may request a review after a period of 6 months has elapsed. This review policy is _strictly informal_ and, in all cases, core reserves the right to either act on or disregard requests for review if they feel their original decision to be the right one.

In all other aspects of project operation, core is a subset of committers and is bound by the __same rules__. Just because someone is in core does not mean that they have special dispensation to step outside any of the lines painted here; core's "special powers" only kick in when it acts as a group, not on an individual basis. As individuals, the core team members are all committers first and core second.

=== Details

[[respect]]
. Respect other committers.
+
This means that you need to treat other committers as the peer-group developers that they are.
Despite our occasional attempts to prove the contrary, one does not get to be a committer by being stupid, and nothing rankles more than being treated that way by one of your peers. Whether we always feel respect for one another or not (and everyone has off days), we still have to _treat_ other committers with respect at all times, on public forums and in private email.
+
Being able to work together long term is this project's greatest asset, one far more important than any set of changes to the code, and turning arguments about code into issues that affect our long-term ability to work harmoniously together is just not worth the trade-off by any conceivable stretch of the imagination.
+
To comply with this rule, do not send email when you are angry or otherwise behave in a manner which is likely to strike others as needlessly confrontational. First calm down, then think about how to communicate in the most effective fashion for convincing the other persons that your side of the argument is correct; do not just blow off some steam so you can feel better in the short term at the cost of a long-term flame war. Not only is this very bad "energy economics", but repeated displays of public aggression which impair our ability to work well together will be dealt with severely by the project leadership and may result in suspension or termination of your commit privileges. The project leadership will take into account both public and private communications brought before it. It will not seek the disclosure of private communications, but it will take them into account if they are volunteered by the committers involved in the complaint.
+
All of this is never an option which the project's leadership enjoys in the slightest, but unity comes first. No amount of code or good advice is worth trading that away.
. Respect other contributors.
+
You were not always a committer. At one time you were a contributor. Remember that at all times.
Remember what it was like trying to get help and attention. Do not forget that your work as a contributor was very important to you. Remember what it was like. Do not discourage, belittle, or demean contributors. Treat them with respect. They are our committers in waiting. They are every bit as important to the project as committers. Their contributions are as valid and as important as your own. After all, you made many contributions before you became a committer. Always remember that.
+
Consider the points raised under crossref:committers-guide[respect,Respect other committers] and apply them also to contributors.
. Discuss any significant change _before_ committing.
+
The repository is not where changes are initially submitted for correctness or argued over; that happens first in the mailing lists or by use of the Phabricator service. The commit will only happen once something resembling consensus has been reached. This does not mean that permission is required before correcting every obvious syntax error or manual page misspelling, just that it is good to develop a feel for when a proposed change is not quite such a no-brainer and requires some feedback first. People really do not mind sweeping changes if the result is something clearly better than what they had before; they just do not like being _surprised_ by those changes. The very best way of making sure that things are on the right track is to have code reviewed by one or more other committers.
+
When in doubt, ask for review!
. Respect existing maintainers if listed.
+
Many parts of FreeBSD are not "owned" in the sense that any specific individual will jump up and yell if you commit a change to "their" area, but it still pays to check first.
One convention we use is to put a maintainer line in the [.filename]#Makefile# for any package or subtree which is being actively maintained by one or more people; see extref:{developers-handbook}policies[Source Tree Guidelines and Policies, policies] for documentation on this. Where sections of code have several maintainers, commits to affected areas by one maintainer need to be reviewed by at least one other maintainer. In cases where the "maintainer-ship" of something is not clear, look at the repository logs for the files in question and see if someone has been working recently or predominantly in that area.
. Any disputed change must be backed out pending resolution of the dispute if requested by a maintainer. Security related changes may override a maintainer's wishes at the Security Officer's discretion.
+
This may be hard to swallow in times of conflict (when each side is convinced that they are in the right, of course) but a version control system makes it unnecessary to have an ongoing dispute raging when it is far easier to simply reverse the disputed change, get everyone calmed down again and then try to figure out what is the best way to proceed. If the change turns out to be the best thing after all, it can be easily brought back. If it turns out not to be, then the users did not have to live with the bogus change in the tree while everyone was busily debating its merits. People _very_ rarely call for back-outs in the repository since discussion generally exposes bad or controversial changes before the commit even happens, but on such rare occasions the back-out should be done without argument so that we can get immediately on to the topic of figuring out whether it was bogus or not.
. Changes go to FreeBSD-CURRENT before FreeBSD-STABLE unless specifically permitted by the release engineer or unless they are not applicable to FreeBSD-CURRENT.
Any non-trivial or non-urgent change which is applicable should also be allowed to sit in FreeBSD-CURRENT for at least 3 days before merging so that it can be given sufficient testing. The release engineer has the same authority over the FreeBSD-STABLE branch as outlined in rule #5.
+
This is another "do not argue about it" issue since it is the release engineer who is ultimately responsible (and gets beaten up) if a change turns out to be bad. Please respect this and give the release engineer your full cooperation when it comes to the FreeBSD-STABLE branch. The management of FreeBSD-STABLE may frequently seem to be overly conservative to the casual observer, but also bear in mind the fact that conservatism is supposed to be the hallmark of FreeBSD-STABLE and different rules apply there than in FreeBSD-CURRENT. There is also really no point in having FreeBSD-CURRENT be a testing ground if changes are merged over to FreeBSD-STABLE immediately. Changes need a chance to be tested by the FreeBSD-CURRENT developers, so allow some time to elapse before merging unless the FreeBSD-STABLE fix is critical, time sensitive or so obvious as to make further testing unnecessary (spelling fixes to manual pages, obvious bug/typo fixes, etc.). In other words, apply common sense.
+
Changes to the security branches (for example, `releng/9.3`) must be approved by a member of the `{security-officer}`, or in some cases, by a member of the `{re}`.
. Do not fight in public with other committers; it looks bad.
+
This project has a public image to uphold and that image is very important to all of us, especially if we are to continue to attract new members. There will be occasions when, despite everyone's very best attempts at self-control, tempers are lost and angry words are exchanged. The best thing that can be done in such cases is to minimize the effects of this until everyone has cooled back down.
Do not air angry words in public and do not forward private correspondence or other private communications to public mailing lists, mail aliases, instant messaging channels or social media sites. What people say one-to-one is often much less sugar-coated than what they would say in public, and such communications therefore have no place there - they only serve to inflame an already bad situation. If the person sending a flame-o-gram at least had the grace to send it privately, then have the grace to keep it private yourself. If you feel you are being unfairly treated by another developer, and it is causing you anguish, bring the matter up with core rather than taking it public. Core will do its best to play peacemaker and get things back to sanity. In cases where the dispute involves a change to the codebase and the participants do not appear to be reaching an amicable agreement, core may appoint a mutually-agreeable third party to resolve the dispute. All parties involved must then agree to be bound by the decision reached by this third party.
. Respect all code freezes and read the `committers` and `developers` mailing lists in a timely manner so you know when a code freeze is in effect.
+
Committing unapproved changes during a code freeze is a really big mistake and committers are expected to keep up-to-date on what is going on before jumping in after a long absence and committing 10 megabytes worth of accumulated stuff. People who abuse this on a regular basis will have their commit privileges suspended until they get back from the FreeBSD Happy Reeducation Camp we run in Greenland.
. When in doubt on any procedure, ask first!
+
Many mistakes are made because someone is in a hurry and just assumes they know the right way of doing something. If you have not done it before, chances are good that you do not actually know the way we do things and really need to ask first or you are going to completely embarrass yourself in public.
There is no shame in asking "how in the heck do I do this?" We already know you are an intelligent person; otherwise, you would not be a committer.
. Test your changes before committing them.
+
If your changes are to the kernel, make sure you can still compile both GENERIC and LINT. If your changes are anywhere else, make sure you can still compile userspace via `make buildworld`. If your changes are to a branch, make sure your testing occurs with a machine which is running that code. If you have a change which also may break another architecture, be sure to test on all supported architectures. Please ensure your change works for crossref:committers-guide[compilers,supported toolchains]. Please refer to the https://www.FreeBSD.org/internal/[FreeBSD Internal Page] for a list of available resources. As other architectures are added to the FreeBSD supported platforms list, the appropriate shared testing resources will be made available.
. Do not commit to contributed software without _explicit_ approval from the respective maintainers.
+
Contributed software is anything under the [.filename]#src/contrib#, [.filename]#src/crypto#, or [.filename]#src/sys/contrib# trees.
+
The trees mentioned above are for contributed software usually imported onto a vendor branch. Committing something there may cause unnecessary headaches when importing newer versions of the software. As a general rule, consider sending patches upstream to the vendor. Patches may be committed to FreeBSD first with permission of the maintainer.
+
Reasons for modifying upstream software range from wanting strict control over a tightly coupled dependency to lack of portability in the canonical repository's distribution of their code. Regardless of the reason, any effort to minimize the maintenance burden of the fork is helpful to fellow maintainers. Avoid committing trivial or cosmetic changes to files since it makes every merge thereafter more difficult: such patches need to be manually re-verified on every import.
+
If a particular piece of software lacks a maintainer, you are encouraged to take up ownership. If you are unsure of the current maintainership, email {freebsd-arch} and ask.

=== Policy on Multiple Architectures

In an effort to make it easier to keep FreeBSD portable across the platforms we support, core has developed this mandate:

[.blockquote]
Major design work (including major API and ABI changes) must prove itself on at least one Tier 1 platform before it may be committed to the source tree.

Developers should also be aware of our Tier Policy for the long term support of hardware architectures. The rules here are intended to provide guidance during the development process, and are distinct from the requirements for features and architectures listed in that section. The Tier rules for feature support on architectures at release-time are more strict than the rules for changes during the development process.

[[compilers]]
=== Policy on Multiple Compilers

The FreeBSD base system builds with both Clang and GCC. The project does this in a careful and controlled way to maximize benefits from this extra work, while keeping the extra work to a minimum. Supporting both Clang and GCC improves the flexibility our users have. These compilers have different strengths and weaknesses, and supporting both allows users to pick the best one for their needs. Clang and GCC support similar dialects of C and C++, necessitating a relatively small amount of conditional code. The project gains increased code coverage and improves the code quality by using features from both compilers. The project is able to build in more user environments and leverage more CI environments by supporting this range, increasing convenience for users and giving them more tools to test with. By carefully constraining the range of versions supported to modern versions of these compilers, the project avoids unduly increasing the testing matrix.
Older and obscure compilers, as well as older dialects of the languages, have extremely limited support that allows user programs to build with them, but without constraining the base system to being built with them. The exact balance continues to evolve to ensure the benefits of extra work remain greater than the burdens it imposes. The project used to support really old Intel compilers or old GCC versions, but we traded supporting those obsolete compilers for a carefully selected range of modern compilers. This section documents where we use different compilers, and the expectations around that.

The FreeBSD base system includes an in-tree Clang compiler. Due to being in the tree, this compiler is the most supported compiler. All changes must compile with it, prior to commit. Complete testing, as appropriate for the change, should be done with this compiler.

The FreeBSD base system also supports various versions of Clang and GCC as out-of-tree compilers. For large or risky changes, committers should do a test build with a supported version of GCC. Out-of-tree compilers are available as packages. GCC compilers are available as `${TARGET_ARCH}-gcc${VERSION}` packages, such as package:devel/freebsd-gcc14@aarch64[aarch64-gcc14]. Clang compilers are available as `llvm${VERSION}` packages, such as package:devel/llvm18[llvm18]. The project runs automated CI jobs to build everything with these compilers. Committers are expected to fix the jobs they break with their changes. Committers may test builds of userspace or individual kernels by setting `CROSS_TOOLCHAIN` to the package name, for example `CROSS_TOOLCHAIN=aarch64-gcc14` or `CROSS_TOOLCHAIN=llvm18`. For universe or tinderbox builds, `USE_GCC_TOOLCHAINS=gcc${VERSION}` builds all architectures using the appropriate GCC compiler packages. For universe or tinderbox builds using an out-of-tree Clang, pass `CROSS_TOOLCHAIN=llvm${VERSION}`.
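
As a rough sketch of the test builds described above, the following script prints the corresponding `make` invocations. It assumes it is run from the top of a `src` checkout with the `aarch64-gcc14` and `llvm18` packages installed and any prerequisite toolchain or world build already done. The `run` helper, the `DRY_RUN` variable, and the particular targets and versions are illustrative choices for this example, not project tooling.

```shell
#!/bin/sh
# Sketch only: by default this prints the build commands instead of running them.
# Set DRY_RUN=0 to execute them from the top of a FreeBSD src checkout.
# The run helper and DRY_RUN are hypothetical conveniences for this example.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Userspace build for aarch64 with the out-of-tree GCC 14 package.
run make buildworld TARGET=arm64 TARGET_ARCH=aarch64 CROSS_TOOLCHAIN=aarch64-gcc14

# Single kernel build with an out-of-tree Clang 18.
run make buildkernel KERNCONF=GENERIC CROSS_TOOLCHAIN=llvm18

# Universe build using the GCC 14 package matching each architecture.
run make universe USE_GCC_TOOLCHAINS=gcc14

# Universe build using an out-of-tree Clang 18.
run make universe CROSS_TOOLCHAIN=llvm18
```

Reviewing the printed commands before setting `DRY_RUN=0` avoids starting a lengthy universe build by accident.
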
Note that while all architectures in the base system can be compiled by Clang, only a few architectures can be fully built by GCC.

The FreeBSD project also has some CI pipelines on GitHub. For pull requests on GitHub and some branches pushed to GitHub forks, a number of cross compilation jobs run. These test FreeBSD building using versions of Clang that lag the in-tree compiler by one or more major versions.

The FreeBSD project is also upgrading compilers. Both Clang and GCC are fast moving targets. Some work to change things in the tree, for example removing the old-style K&R function declarations and definitions, will land in the tree prior to the compiler landing. Committers should try to be mindful about this and be receptive to looking into problems with their code or changes with these new compilers. Also, just after a new compiler version hits the tree, people may need to compile things with the old version if an undetected regression is suspected.

In addition to the compiler, LLVM's LLD and GNU's binutils are used indirectly by the compiler. Committers should be mindful of variations in assembler syntax and features of the linkers and ensure both variants work. These components will be tested as part of FreeBSD's CI jobs for Clang or GCC.

The FreeBSD project provides headers and libraries that allow other compilers to be used to build software not in the base system. These headers have support for making the environment as strict as the standard, supporting prior dialects of ANSI C back to C89, and other edge cases our large ports collection has uncovered. This support constrains retirement of older standards in places like header files, but does not constrain updating the base system to newer dialects. Nor does it require the base system to compile with these older standards as a whole. Breaking this support will cause packages in the ports collection to fail, so should be avoided where possible, and promptly fixed when it is easy to do so.
The FreeBSD build system currently accommodates these different environments. As new warnings are added to compilers, the project tries to fix them. However, sometimes these warnings require extensive rework, so they are suppressed in some way by using make variables that evaluate to the proper thing depending on the compiler version. Developers should be mindful of this, and ensure any compiler-specific flags are properly conditionalized.

==== Current Compiler Versions

The versions of supported compilers for a given branch such as `main` or `stable/X` vary over time. The authoritative source for supported compiler versions is the set of automated CI jobs tested in GitHub's cross-build actions and Jenkins.

[.tblbasic]
[cols="12*",options="header",]
|===
|Branch | In-tree Compiler |llvm12 | llvm13 | llvm14 | llvm15 | llvm18 |amd64-gcc12 | amd64-gcc13 | amd64-gcc14 | amd64-gcc15 |riscv64-gcc15
|main | llvm 19 | | | | Y | Y | Y | Y | Y | Y | Y
|stable/15 | llvm 19 | | | Y | | Y | Y | Y | Y | |
|stable/14 | llvm 19 | Y | Y | Y | | | Y | | Y | |
|stable/13 | llvm 19 | Y | Y | Y | | | Y | | Y | |
|===

GCC toolchains are tested for amd64 and riscv64 via CI jobs in Jenkins. LLVM toolchains are tested for aarch64 and amd64 in GitHub's cross-build actions.

=== Other Suggestions

When committing documentation changes, use a spell checker before committing. For all XML docs, verify that the formatting directives are correct by running `make lint` and package:textproc/igor[].

For manual pages, run package:sysutils/manck[] and package:textproc/igor[] over the manual page to verify all of the cross references and file references are correct and that the man page has all of the appropriate `MLINKS` installed.

Do not mix style fixes with new functionality. A style fix is any change which does not modify the functionality of the code. Mixing the changes obfuscates the functionality change when asking for differences between revisions, which can hide any new bugs.
Do not include whitespace changes with content changes in commits to [.filename]#doc/#. The extra clutter in the diffs makes the translators' job much more difficult. Instead, make any style or whitespace changes in separate commits that are clearly labeled as such in the commit message.

=== Deprecating Features

When it is necessary to remove functionality from software in the base system, follow these guidelines whenever possible:

. Mention is made in the manual page and possibly the release notes that the option, utility, or interface is deprecated. Use of the deprecated feature generates a warning.
. The option, utility, or interface is preserved until the next major (point zero) release.
. The option, utility, or interface is removed and no longer documented. It is now obsolete. It is also generally a good idea to note its removal in the release notes.

=== Privacy and Confidentiality

. Most FreeBSD business is done in public.
+
FreeBSD is an _open_ project, which means that not only can anyone use the source code, but also that most of the development process is open to public scrutiny.
. Certain sensitive matters must remain private or held under embargo.
+
There unfortunately cannot be complete transparency. As a FreeBSD developer you will have a certain degree of privileged access to information. Consequently you are expected to respect certain requirements for confidentiality. Sometimes the need for confidentiality comes from external collaborators or has a specific time limit. Mostly though, it is a matter of not releasing private communications.
. The Security Officer has sole control over the release of security advisories.
+
Where there are security problems that affect many different operating systems, FreeBSD frequently depends on early access to be able to prepare advisories for coordinated release. Unless FreeBSD developers can be trusted to maintain security, such early access will not be made available.
The Security Officer is responsible for controlling pre-release access to information about vulnerabilities, and for timing the release of all advisories. He may request help under condition of confidentiality from any developer with relevant knowledge to prepare security fixes. . Communications with Core are kept confidential for as long as necessary. + Communications to core will initially be treated as confidential. Eventually however, most of Core's business will be summarized into the monthly or quarterly core reports. Care will be taken to avoid publicising any sensitive details. Records of some particularly sensitive subjects may not be reported on at all and will be retained only in Core's private archives. . Non-disclosure Agreements may be required for access to certain commercially sensitive data. + Access to certain commercially sensitive data may only be available under a Non-Disclosure Agreement. The FreeBSD Foundation legal staff must be consulted before any binding agreements are entered into. . Private communications must not be made public without permission. + Beyond the specific requirements above there is a general expectation not to publish private communications between developers without the consent of all parties involved. Ask permission before forwarding a message onto a public mailing list, or posting it to a forum or website that can be accessed by other than the original correspondents. . Communications on project-only or restricted access channels must be kept private. + Similarly to personal communications, certain internal communications channels, including FreeBSD Committer only mailing lists and restricted access IRC channels are considered private communications. Permission is required to publish material from these sources. . Core may approve publication. 
+ Where it is impractical to obtain permission due to the number of correspondents or where permission to publish is unreasonably withheld, Core may approve release of such private matters that merit more general publication. [[archs]] == Support for Multiple Architectures FreeBSD is a highly portable operating system intended to function on many different types of hardware architectures. Maintaining clean separation of Machine Dependent (MD) and Machine Independent (MI) code, as well as minimizing MD code, is an important part of our strategy to remain agile with regards to current hardware trends. Each new hardware architecture supported by FreeBSD adds substantially to the cost of code maintenance, toolchain support, and release engineering. It also dramatically increases the cost of effective testing of kernel changes. As such, there is strong motivation to differentiate between classes of support for various architectures while remaining strong in a few key architectures that are seen as the FreeBSD "target audience". === Statement of General Intent The FreeBSD Project targets "production quality commercial off-the-shelf (COTS) workstation, server, and high-end embedded systems". By retaining a focus on a narrow set of architectures of interest in these environments, the FreeBSD Project is able to maintain high levels of quality, stability, and performance, as well as minimize the load on various support teams on the project, such as the ports team, documentation team, security officer, and release engineering teams. Diversity in hardware support broadens the options for FreeBSD consumers by offering new features and usage opportunities, but these benefits must always be carefully considered in terms of the real-world maintenance cost associated with additional platform support. The FreeBSD Project differentiates platform targets into four tiers. 
Each tier includes a list of guarantees consumers may rely on as well as obligations by the Project and developers to fulfill those guarantees. These lists define the minimum guarantees for each tier. The Project and developers may provide additional levels of support beyond the minimum guarantees for a given tier, but such additional support is not guaranteed. Each platform target is assigned to a specific tier for each stable branch. As a result, a platform target might be assigned to different tiers on concurrent stable branches. === Platform Targets Support for a hardware platform consists of two components: kernel support and userland Application Binary Interfaces (ABIs). Kernel platform support includes things needed to run a FreeBSD kernel on a hardware platform such as machine-dependent virtual memory management and device drivers. A userland ABI specifies an interface for user processes to interact with a FreeBSD kernel and base system libraries. A userland ABI includes system call interfaces, the layout and semantics of public data structures, and the layout and semantics of arguments passed to subroutines. Some components of an ABI may be defined by specifications such as the layout of C++ exception objects or calling conventions for C functions. A FreeBSD kernel also uses an ABI (sometimes referred to as the Kernel Binary Interface (KBI)) which includes the semantics and layouts of public data structures and the layout and semantics of arguments to public functions within the kernel itself. A FreeBSD kernel may support multiple userland ABIs. For example, FreeBSD's amd64 kernel supports FreeBSD amd64 and i386 userland ABIs as well as Linux x86_64 and i386 userland ABIs. A FreeBSD kernel should support a "native" ABI as the default ABI. The native "ABI" generally shares certain properties with the kernel ABI such as the C calling convention, sizes of basic types, etc. Tiers are defined for both kernels and userland ABIs. 
In the common case, a platform's kernel and FreeBSD ABIs are assigned to the same tier. ==== Tier 1: Fully-Supported Architectures Tier 1 platforms are the most mature FreeBSD platforms. They are supported by the security officer, release engineering, and Ports Management Team. Tier 1 architectures are expected to be Production Quality with respect to all aspects of the FreeBSD operating system, including installation and development environments. The FreeBSD Project provides the following guarantees to consumers of Tier 1 platforms: * Official FreeBSD release images will be provided by the release engineering team. * Binary updates and source patches for Security Advisories and Errata Notices will be provided for supported releases. * Source patches for Security Advisories will be provided for supported branches. * Binary updates and source patches for cross-platform Security Advisories will typically be provided at the time of the announcement. * Changes to userland ABIs will generally include compatibility shims to ensure correct operation of binaries compiled against any stable branch where the platform is Tier 1. These shims might not be enabled in the default install. If compatibility shims are not provided for an ABI change, the lack of shims will be clearly documented in the release notes. * Changes to certain portions of the kernel ABI will include compatibility shims to ensure correct operation of kernel modules compiled against the oldest supported release on the branch. Note that not all parts of the kernel ABI are protected. * Official binary packages for third party software will be provided by the ports team. For embedded architectures, these packages may be cross-built from a different architecture. * Most relevant ports should either build or have the appropriate filters to prevent inappropriate ones from building. * New features which are not inherently platform-specific will be fully functional on all Tier 1 architectures. 
* Features and compatibility shims used by binaries compiled against older stable branches may be removed in newer major versions. Such removals will be clearly documented in the release notes. * Tier 1 platforms should be fully documented. Basic operations will be documented in the FreeBSD Handbook. * Tier 1 platforms will be included in the source tree. * Tier 1 platforms should be self-hosting either via the in-tree toolchain or an external toolchain. If an external toolchain is required, official binary packages for an external toolchain will be provided. To maintain maturity of Tier 1 platforms, the FreeBSD Project will maintain the following resources to support development: * Build and test automation support either in the FreeBSD.org cluster or some other location easily available for all developers. Embedded platforms may substitute an emulator available in the FreeBSD.org cluster for actual hardware. * Inclusion in the `make universe` and `make tinderbox` targets. * Dedicated hardware in one of the FreeBSD clusters for package building (either natively or via qemu-user). Collectively, developers are required to provide the following to maintain the Tier 1 status of a platform: * Changes to the source tree should not knowingly break the build of a Tier 1 platform. * Tier 1 architectures must have a mature, healthy ecosystem of users and active developers. * Developers should be able to build packages on commonly available, non-embedded Tier 1 systems. This can mean either native builds if non-embedded systems are commonly available for the platform in question, or it can mean cross-builds hosted on some other Tier 1 architecture. * Changes cannot break the userland ABI. If an ABI change is required, ABI compatibility for existing binaries should be provided via use of symbol versioning or shared library version bumps. * Changes merged to stable branches cannot break the protected portions of the kernel ABI. 
If a kernel ABI change is required, the change should be modified to preserve functionality of existing kernel modules. ==== Tier 2: Developmental and Niche Architectures Tier 2 platforms are functional, but less mature FreeBSD platforms. They are not supported by the security officer, release engineering, and Ports Management Team. Tier 2 platforms may be Tier 1 platform candidates that are still under active development. Architectures reaching end of life may also be moved from Tier 1 status to Tier 2 status as the availability of resources to continue to maintain the system in a Production Quality state diminishes. Well-supported niche architectures may also be Tier 2. The FreeBSD Project provides the following guarantees to consumers of Tier 2 platforms: * The ports infrastructure should include basic support for Tier 2 architectures sufficient to support building ports and packages. This includes support for basic packages such as ports-mgmt/pkg, but there is no guarantee that arbitrary ports will be buildable or functional. * New features which are not inherently platform-specific should be feasible on all Tier 2 architectures if not implemented. * Tier 2 platforms will be included in the source tree. * Tier 2 platforms should be self-hosting either via the in-tree toolchain or an external toolchain. If an external toolchain is required, official binary packages for an external toolchain will be provided. * Tier 2 platforms should provide functional kernels and userlands even if an official release distribution is not provided. To maintain maturity of Tier 2 platforms, the FreeBSD Project will maintain the following resources to support development: * Inclusion in the `make universe` and `make tinderbox` targets. Collectively, developers are required to provide the following to maintain the Tier 2 status of a platform: * Changes to the source tree should not knowingly break the build of a Tier 2 platform. 
* Tier 2 architectures must have an active ecosystem of users and developers. * While changes are permitted to break the userland ABI, the ABI should not be broken gratuitously. Significant userland ABI changes should be restricted to major versions. * New features that are not yet implemented on Tier 2 architectures should provide a means of disabling them on those architectures. ==== Tier 3: Experimental Architectures Tier 3 platforms have at least partial FreeBSD support. They are _not_ supported by the security officer, release engineering, and Ports Management Team. Tier 3 platforms are architectures in the early stages of development, for non-mainstream hardware platforms, or which are considered legacy systems unlikely to see broad future use. Initial support for Tier 3 platforms may exist in a separate repository rather than the main source repository. The FreeBSD Project provides no guarantees to consumers of Tier 3 platforms and is not committed to maintaining resources to support development. Tier 3 platforms may not always be buildable, nor are any kernel or userland ABIs considered stable. ==== Unsupported Architectures Other platforms are not supported in any form by the project. The project previously described these as Tier 4 systems. After a platform transitions to unsupported, all support for the platform is removed from the source, ports and documentation trees. Note that ports support should remain as long as the platform is supported in a branch supported by ports. === Policy on Changing the Tier of an Architecture Systems may only be moved from one tier to another by approval of the FreeBSD Core Team, which shall make that decision in collaboration with the Security Officer, Release Engineering, and ports management teams. For a platform to be promoted to a higher tier, any missing support guarantees must be satisfied before the promotion is completed. 
[[ports]]
== Ports Specific FAQ

[[ports-qa-adding]]
=== Adding a New Port

[[ports-qa-add-new]]
==== How do I add a new port?

Adding a port to the tree is relatively simple.
Once the port is ready to be added, as explained later crossref:committers-guide[ports-qa-add-new-extra,here], you need to add the port's directory entry in the category's [.filename]#Makefile#.
In this [.filename]#Makefile#, ports are listed in alphabetical order and added to the `SUBDIR` variable, like this:

[.programlisting]
....
SUBDIR += newport
....

Once the port and its category's Makefile are ready, the new port can be committed:

[source,shell]
....
% git add category/Makefile category/newport
% git commit
% git push
....

[TIP]
====
Do not forget to crossref:committers-guide[port-commit-message-formats,set up git hooks for the ports tree as explained here]; a specific hook has been developed to verify the category's [.filename]#Makefile#.
====

[[ports-qa-add-new-extra]]
==== Any other things I need to know when I add a new port?

Check the port, preferably to make sure it compiles and packages correctly.

The extref:{porters-handbook}testing[Porters Handbook's Testing Chapter] contains more detailed instructions.
See the extref:{porters-handbook}testing[Portclippy / Portfmt, testing-portclippy] and the extref:{porters-handbook}testing[poudriere, testing-poudriere] sections.

You do not necessarily have to eliminate all warnings, but make sure you have fixed the simple ones.

If the port came from a submitter who has not contributed to the Project before, add that person's name to the extref:{contributors}[Additional Contributors, contrib-additional] section of the FreeBSD Contributors List.

Close the PR if the port came in as a PR.
To close a PR, change the state to `Issue Resolved` and the resolution to `Fixed`.
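Returning to the category [.filename]#Makefile# step from the previous answer, keeping the `SUBDIR` list in alphabetical order can be sketched in shell. This is an illustration on a scratch file, not the ports tree's own tooling or its git hook: the file `Makefile.example` and the port names `alpha`, `mango`, `zebra`, and `newport` are made up, and plain byte-wise ordering is assumed.

```shell
#!/bin/sh
# Illustrative sketch: add "SUBDIR += newport" to a category Makefile
# while keeping the SUBDIR entries in alphabetical order.
set -e

work=$(mktemp -d)
makefile="$work/Makefile.example"   # hypothetical scratch file
new_port="newport"                  # hypothetical port name

cat > "$makefile" <<'EOF'
COMMENT = Example category

SUBDIR += alpha
SUBDIR += mango
SUBDIR += zebra

.include <bsd.port.subdir.mk>
EOF

# Print the new entry just before the first existing entry that sorts
# after it, or after the last SUBDIR line if none does.
awk -v new="$new_port" '
    /^SUBDIR [+]= / {
        if (!done && $3 > new) { print "SUBDIR += " new; done = 1 }
        print
        in_list = 1
        next
    }
    in_list && !done { print "SUBDIR += " new; done = 1 }
    { print }
    END { if (!done) print "SUBDIR += " new }
' "$makefile" > "$makefile.new"

# Confirm the SUBDIR lines are still sorted.
if grep '^SUBDIR += ' "$makefile.new" | LC_ALL=C sort -c; then
    order_status=sorted
else
    order_status=unsorted
fi
echo "SUBDIR order: $order_status"
```

In a real checkout the edit is usually made by hand; the hook mentioned in the tip above remains the authoritative check on the category's [.filename]#Makefile#.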
[NOTE]
====
If for some reason using extref:{porters-handbook}testing[poudriere, testing-poudriere] to test the new port is not possible, the bare minimum of testing includes this sequence:

[source,shell]
....
# make install
# make package
# make deinstall
# pkg add package you built above
# make deinstall
# make reinstall
# make package
....

Note that poudriere is the reference for package building; if the port does not build in poudriere, it will be removed.
====

[[ports-qa-removing]]
=== Removing an Existing Port

[[ports-qa-remove-one]]
==== How do I remove an existing port?

First, please read the section about repository copies.
Before you remove the port, you have to verify there are no other ports depending on it.

* Make sure there is no dependency on the port in the ports collection:
** The port's PKGNAME appears in exactly one line in a recent INDEX file.
** No other port contains any reference to the port's directory or PKGNAME in its Makefile.
+
[TIP]
====
When using Git, consider using man:git-grep[1]; it is much faster than `grep -r`.
====
+
* Then, remove the port:
+
[.procedure]
====
* Remove the port's files and directory with `git rm`.
* Remove the `SUBDIR` listing of the port in the parent directory [.filename]#Makefile#.
* Add an entry to [.filename]#ports/MOVED#.
* Remove the port from [.filename]#ports/LEGAL# if it is there.
====

Alternatively, you can use the rmport script, from [.filename]#ports/Tools/scripts#.
This script was written by {vd}.
When sending questions about this script to the {freebsd-ports}, please also CC {crees}, the current maintainer.

[[ports-qa-move-port]]
=== How do I move a port to a new location?

[.procedure]
====
. Perform a thorough check of the ports collection for any dependencies on the old port location/name, and update them. Running `grep` on [.filename]#INDEX# is not enough because some ports have dependencies enabled by compile-time options. A full man:git-grep[1] of the ports collection is recommended.
. Remove the `SUBDIR` entry from the old category Makefile and add a `SUBDIR` entry to the new category Makefile.
. Add an entry to [.filename]#ports/MOVED#.
. Search for entries in XML files inside [.filename]#ports/security/vuxml# and adjust them accordingly. In particular, check for previous packages with the new name whose version could include the new port.
. Move the port with `git mv`.
. Commit the changes.
====

[[ports-qa-copy-port]]
=== How do I copy a port to a new location?

[.procedure]
====
. Copy the port with `cp -R old-cat/old-port new-cat/new-port`.
. Add the new port to the [.filename]#new-cat/Makefile#.
. Make the needed changes in [.filename]#new-cat/new-port#.
. Commit the changes.
====

[[ports-qa-freeze]]
=== Ports Freeze

[[ports-qa-freeze-what]]
==== What is a “ports freeze”?

A “ports freeze” was a restricted state the ports tree was put in before a release.
It was used to ensure a higher quality for the packages shipped with a release.
It usually lasted a couple of weeks.
During that time, build problems were fixed, and the release packages were built.
This practice is no longer used, as the packages for the releases are built from the current stable, quarterly branch.

For more information on how to merge commits to the quarterly branch, see crossref:committers-guide[ports-qa-misc-request-mfh, What is the procedure to request authorization for merging a commit to the quarterly branch?].

[[ports-qa-quarterly]]
=== Quarterly Branches

[[ports-qa-misc-request-mfh]]
==== What is the procedure to request authorization for merging a commit to the quarterly branch?

As of November 30, 2020, there is no need to seek explicit approval to commit to the quarterly branch.

[[ports-qa-misc-commit-mfh]]
==== What is the procedure for merging commits to the quarterly branch?

Merging commits to the quarterly branch (a process we call MFH for historical reasons) is very similar to MFC'ing a commit in the src repository, so basically:

[source,shell]
....
% git checkout 2021Q2
% git cherry-pick -x $HASH
(verify everything is OK, for example by doing a build test)
% git push
....

where `$HASH` is the hash of the commit you want to copy over to the quarterly branch.
The `-x` parameter ensures the hash `$HASH` of the `main` branch is included in the new commit message of the quarterly branch.

[[ports-qa-new-category]]
=== Creating a New Category

[[ports-qa-new-category-how]]
==== What is the procedure for creating a new category?

Please see extref:{porters-handbook}makefiles[Proposing a New Category, proposing-categories] in the Porter's Handbook.
Once that procedure has been followed and the PR has been assigned to the {portmgr}, it is their decision whether or not to approve it.
If they do, it is their responsibility to:

[.procedure]
====
. Perform any needed moves. (This only applies to physical categories.)
. Update the `VALID_CATEGORIES` definition in [.filename]#ports/Mk/bsd.port.mk#.
. Assign the PR back to you.
====

[[ports-qa-new-category-physical]]
==== What do I need to do to implement a new physical category?

[.procedure]
====
. Update each moved port's [.filename]#Makefile#. Do not connect the new category to the build yet.
+
To do this, you will need to:
+
[.procedure]
======
. Change the port's `CATEGORIES` (this was the point of the exercise, remember?). The new category is listed first. This will help to ensure that the PKGORIGIN is correct.
. Run a `make describe`. Since the top-level `make index` that you will be running in a few steps is an iteration of `make describe` over the entire ports hierarchy, catching any errors here will save you having to re-run that step later on.
. If you want to be really thorough, now might be a good time to run man:portlint[1].
======
+
. Check that the ``PKGORIGIN``s are correct. The ports system uses each port's `CATEGORIES` entry to create its `PKGORIGIN`, which is used to connect installed packages to the port directory they were built from. If this entry is wrong, common port tools like man:pkg-version[8] and man:portupgrade[1] fail.
+
To do this, use the [.filename]#chkorigin.sh# tool: `env PORTSDIR=/path/to/ports sh -e /path/to/ports/Tools/scripts/chkorigin.sh`. This will check every port in the ports tree, even those not connected to the build, so you can run it directly after the move operation. Hint: do not forget to look at the ``PKGORIGIN``s of any slave ports of the ports you just moved!
. On your own local system, test the proposed changes: first, comment out the SUBDIR entries in the old ports' categories' [.filename]##Makefile##s; then enable building the new category in [.filename]#ports/Makefile#. Run `make checksubdirs` in the affected category directories to check the SUBDIR entries. Next, in the [.filename]#ports/# directory, run `make index`. This can take over 40 minutes on even modern systems; however, it is a necessary step to prevent problems for other people.
. Once this is done, you can commit the updated [.filename]#ports/Makefile# to connect the new category to the build and also commit the [.filename]#Makefile# changes for the old category or categories.
. Add appropriate entries to [.filename]#ports/MOVED#.
. Update the documentation by modifying:
** the extref:{porters-handbook}makefiles[list of categories, porting-categories] in the Porter's Handbook
+
. Only once all the above have been done, and no one is any longer reporting problems with the new ports, should the old ports be deleted from their previous locations in the repository.
====

==== What do I need to do to implement a new virtual category?

This is much simpler than a physical category.
Only a few modifications are needed:

* the extref:{porters-handbook}makefiles[list of categories, porting-categories] in the Porter's Handbook

[[ports-qa-misc-questions]]
=== Miscellaneous Questions

[[ports-qa-misc-blanket-approval]]
==== Are there changes that can be committed without asking the maintainer for approval?
Blanket approval for most ports applies to these types of fixes:

* Most infrastructure changes to a port (that is, modernizing, but not changing the functionality). For example, the blanket covers converting to new `USES` macros, enabling verbose builds, and switching to new ports system syntaxes.
* Trivial and _tested_ build and runtime fixes.
* Documentation or metadata changes to ports, like [.filename]#pkg-descr# or `COMMENT`.

[IMPORTANT]
====
Exceptions to this are anything maintained by the {portmgr} or the {security-officer}.
No unauthorized commits may ever be made to ports maintained by those groups.
====

[[ports-qa-misc-correctly-building]]
==== How do I know if my port is building correctly or not?

The packages are built multiple times each week.
If a port fails, the maintainer will receive an email from `pkg-fallout@FreeBSD.org`.

Reports for all the package builds (official, experimental, and non-regression) are aggregated at link:https://pkg-status.FreeBSD.org[pkg-status.FreeBSD.org].

[[ports-qa-misc-INDEX]]
==== I added a new port. Do I need to add it to the [.filename]#INDEX#?

No.
The file can either be generated by running `make index`, or a pre-generated version can be downloaded with `make fetchindex`.

[[ports-qa-misc-no-touch]]
==== Are there any other files I am not allowed to touch?

Any file directly under [.filename]#ports/#, or any file under a subdirectory that starts with an uppercase letter ([.filename]#Mk/#, [.filename]#Tools/#, etc.).
In particular, the {portmgr} is very protective of [.filename]#ports/Mk/bsd.port*.mk#, so do not commit changes to those files unless you want to face their wrath.

[[ports-qa-misc-updated-distfile]]
==== What is the proper procedure for updating the checksum for a port distfile when the file changes without a version change?
When the checksum for a distribution file is updated due to the author updating the file without changing the port revision, the commit message includes a summary of the relevant diffs between the original and new distfile to ensure that the distfile has not been corrupted or maliciously altered. If the current version of the port has been in the ports tree for a while, a copy of the old distfile will usually be available on the ftp servers; otherwise the author or maintainer should be contacted to find out why the distfile has changed. [[ports-exp-run]] ==== How can an experimental test build of the ports tree (exp-run) be requested? An exp-run must be completed before patches with a significant ports impact are committed. The patch can be against the ports tree or the base system. Full package builds will be done with the patches provided by the submitter, and the submitter is required to fix detected problems _(fallout)_ before commit. [.procedure] ==== . Go to the link:https://bugs.freebsd.org/submit[Bugzilla new PR page]. . Select the product your patch is about. . Fill in the bug report as normal. Remember to attach the patch. . If at the top it says “Show Advanced Fields” click on it. It will now say “Hide Advanced Fields”. Many new fields will be available. If it already says “Hide Advanced Fields”, no need to do anything. . In the “Flags” section, set the “exp-run” one to `?`. As for all other fields, hovering the mouse over any field shows more details. . Submit. Wait for the build to run. . {portmgr} will reply with a possible fallout. . Depending on the fallout: ** If there is no fallout, the procedure stops here, and the change can be committed, pending any other approval required. ... If there is fallout, it _must_ be fixed, either by fixing the ports directly in the ports tree, or adding to the submitted patch. ... When this is done, go back to step 6 saying the fallout was fixed and wait for the exp-run to be run again. 
Repeat as long as there are broken ports. ==== [[non-committers]] == Issues Specific to Developers Who Are Not Committers A few people who have access to the FreeBSD machines do not have commit bits. Almost all of this document will apply to these developers as well (except things specific to commits and the mailing list memberships that go with them). In particular, we recommend that you read: * crossref:committers-guide[admin, Administrative Details] * crossref:committers-guide[conventions-everyone, For Everyone] + [NOTE] ==== Get your mentor to add you to the "Additional Contributors" ([.filename]#doc/shared/contrib-additional.adoc#), if you are not already listed there. ==== * crossref:committers-guide[developer.relations, Developer Relations] * crossref:committers-guide[ssh.guide, SSH Quick-Start Guide] * crossref:committers-guide[rules, The FreeBSD Committers' Big List of Rules] [[google-analytics]] == Information About Google Analytics As of December 12, 2012, Google Analytics was enabled on the FreeBSD Project website to collect anonymized usage statistics regarding usage of the site. [NOTE] ==== As of March 3, 2022, Google Analytics was removed from the FreeBSD Project. ==== [[misc]] == Miscellaneous Questions === How do I access people.FreeBSD.org to put up personal or project information? `people.FreeBSD.org` is the same as `freefall.FreeBSD.org`. Just create a [.filename]#public_html# directory. Anything you place in that directory will automatically be visible under https://people.FreeBSD.org/[https://people.FreeBSD.org/]. === Where are the mailing list archives stored? The mailing lists are archived under [.filename]#/local/mail# on `freefall.FreeBSD.org`. === I would like to mentor a new committer. What process do I need to follow? See the https://www.freebsd.org/internal/new-account/[New Account Creation Procedure] document on the internal pages. 
[[benefits]]
== Benefits and Perks for FreeBSD Committers

[[benefits-recognition]]
=== Recognition

Recognition as a competent software engineer is the longest-lasting value.
In addition, getting a chance to work with some of the best people that every engineer would dream of meeting is a great perk!

[[benefits-freebsdmall]]
=== FreeBSD Mall

FreeBSD committers can get a free 4-CD or DVD set at conferences from http://www.freebsdmall.com[FreeBSD Mall, Inc.].

[[benefits-gandi]]
=== `Gandi.net`

https://gandi.net[Gandi] provides website hosting, cloud computing, domain registration, and X.509 certificate services.

Gandi offers an E-rate discount to all FreeBSD developers.
To streamline the process of getting the discount, first set up a Gandi account, fill in the billing information, and select the currency.
Then send an email to mailto:non-profit@gandi.net[non-profit@gandi.net] using your `@freebsd.org` mail address, and indicate your Gandi handle.

[[benefits-rsync]]
=== `rsync.net`

https://rsync.net[rsync.net] provides cloud storage for offsite backup that is optimized for UNIX users.
Their service runs entirely on FreeBSD and ZFS.

rsync.net offers a free-forever 500 GB account to FreeBSD developers.
Simply sign up at https://www.rsync.net/freebsd.html[https://www.rsync.net/freebsd.html] using your `@freebsd.org` address to receive this free account.
diff --git a/documentation/content/en/articles/explaining-bsd/_index.adoc b/documentation/content/en/articles/explaining-bsd/_index.adoc index 82c9222170..23fb72ec2e 100644 --- a/documentation/content/en/articles/explaining-bsd/_index.adoc +++ b/documentation/content/en/articles/explaining-bsd/_index.adoc @@ -1,233 +1,233 @@ --- title: Explaining BSD authors: - author: Greg Lehey email: grog@FreeBSD.org description: Brief explanation about BSD -trademarks: ["freebsd", "amd", "apple", "git", intel", "linux", "opengroup", "sun", "unix", "general"] +trademarks: ["freebsd", "amd", "apple", "git", "intel", "linux", "opengroup", "sun", "unix", "general"] tags: ["Explaining BSD", "BSD", "FreeBSD", "operating system"] --- = Explaining BSD :doctype: article :toc: macro :toclevels: 1 :icons: font :sectnums: :sectnumlevels: 6 :source-highlighter: rouge :experimental: :images-path: articles/explaining-bsd/ ifdef::env-beastie[] ifdef::backend-html5[] include::shared/authors.adoc[] include::shared/mirrors.adoc[] include::shared/releases.adoc[] include::shared/attributes/attributes-{{% lang %}}.adoc[] include::shared/{{% lang %}}/teams.adoc[] include::shared/{{% lang %}}/mailing-lists.adoc[] include::shared/{{% lang %}}/urls.adoc[] :imagesdir: ../../../images/{images-path} endif::[] ifdef::backend-pdf,backend-epub3[] include::../../../../shared/asciidoctor.adoc[] endif::[] endif::[] ifndef::env-beastie[] include::../../../../../shared/asciidoctor.adoc[] endif::[] [.abstract-title] Abstract In the open source world, the word "Linux" is almost synonymous with "Operating System", but it is not the only open source UNIX(R) operating system. So what is the secret? Why is BSD not better known? This white paper addresses these and other questions. Throughout this paper, differences between BSD and Linux will be noted __like this__. ''' toc::[] [[what-is-bsd]] == What is BSD? BSD stands for "Berkeley Software Distribution". 
It is the name of distributions of source code from the University of California, Berkeley, which were originally extensions to AT&T's Research UNIX(R) operating system.
Several open source operating system projects are based on a release of this source code known as 4.4BSD-Lite.
In addition, they comprise a number of packages from other Open Source projects, including notably the GNU project.
The overall operating system comprises:

* The BSD kernel, which handles process scheduling, memory management, symmetric multi-processing (SMP), device drivers, etc.
* The C library, the base API for the system.
+
__The BSD C library is based on code from Berkeley, not the GNU project.__
* Utilities such as shells, file utilities, compilers and linkers.
+
__Some of the utilities are derived from the GNU project, others are not.__
* The X Window system, which handles graphical display.
+
The X Window system used in most versions of BSD is maintained by the http://www.X.org/[X.Org project].
FreeBSD allows the user to choose from a variety of desktop environments, such as Gnome, KDE, or Xfce; and lightweight window managers like Openbox, Fluxbox, or Awesome.
* Many other programs and utilities.

[[what-a-real-unix]]
== What, a real UNIX(R)?

The BSD operating systems are not clones, but open source derivatives of AT&T's Research UNIX(R) operating system, which is also the ancestor of the modern UNIX(R) System V.
This may surprise you.
How could that happen when AT&T has never released its code as open source?

It is true that AT&T UNIX(R) is not open source, and in a copyright sense BSD is very definitely _not_ UNIX(R), but on the other hand, AT&T has imported sources from other projects, notably the Computer Systems Research Group (CSRG) of the University of California in Berkeley, CA.
Starting in 1976, the CSRG began releasing tapes of their software, calling them _Berkeley Software Distribution_ or __BSD__.
Initial BSD releases consisted mainly of user programs, but that changed dramatically when the CSRG landed a contract with the Defense Advanced Research Projects Agency (DARPA) to upgrade the communications protocols on their network, ARPANET. The new protocols were known as the __Internet Protocols__, later _TCP/IP_ after the most important protocols. The first widely distributed implementation was part of 4.2BSD, in 1982. In the course of the 1980s, a number of new workstation companies sprang up. Many preferred to license UNIX(R) rather than developing operating systems for themselves. In particular, Sun Microsystems licensed UNIX(R) and implemented a version of 4.2BSD, which they called SunOS(TM). When AT&T themselves were allowed to sell UNIX(R) commercially, they started with a somewhat bare-bones implementation called System III, to be quickly followed by System V. The System V code base did not include networking, so all implementations included additional software from the BSD, including the TCP/IP software, but also utilities such as the _csh_ shell and the _vi_ editor. Collectively, these enhancements were known as the __Berkeley Extensions__. The BSD tapes contained AT&T source code and thus required a UNIX(R) source license. By 1990, the CSRG's funding was running out, and it faced closure. Some members of the group decided to release the BSD code, which was Open Source, without the AT&T proprietary code. This finally happened with the __Networking Tape 2__, usually known as __Net/2__. Net/2 was not a complete operating system: about 20% of the kernel code was missing. One of the CSRG members, William F. Jolitz, wrote the remaining code and released it in early 1992 as __386BSD__. At the same time, another group of ex-CSRG members formed a commercial company called http://www.bsdi.com/[Berkeley Software Design Inc.] and released a beta version of an operating system called http://www.bsdi.com/[BSD/386], which was based on the same sources. 
The name of the operating system was later changed to BSD/OS. 386BSD never became a stable operating system. Instead, two other projects split off from it in 1993: http://www.NetBSD.org/[NetBSD] and link:https://www.FreeBSD.org/[FreeBSD]. The two projects originally diverged due to differences in patience waiting for improvements to 386BSD: the NetBSD people started early in the year, and the first version of FreeBSD was not ready until the end of the year. In the meantime, the code base had diverged sufficiently to make it difficult to merge. In addition, the projects had different aims, as we will see below. In 1996, http://www.OpenBSD.org/[OpenBSD] split off from NetBSD, and in 2003, http://www.dragonflybsd.org/[DragonFlyBSD] split off from FreeBSD. [[why-is-bsd-not-better-known]] == Why is BSD not better known? For a number of reasons, BSD is relatively unknown: . The BSD developers are often more interested in polishing their code than marketing it. . Much of Linux's popularity is due to factors external to the Linux projects, such as the press, and to companies formed to provide Linux services. Until recently, the open source BSDs had no such proponents. . In 1992, AT&T sued http://www.bsdi.com/[BSDI], the vendor of BSD/386, alleging that the product contained AT&T-copyrighted code. The case was settled out of court in 1994, but the spectre of the litigation continues to haunt people. In March 2000 an article published on the web claimed that the court case had been "recently settled". + One detail that the lawsuit did clarify is the naming: in the 1980s, BSD was known as "BSD UNIX(R)". With the elimination of the last vestige of AT&T code from BSD, it also lost the right to the name UNIX(R). Thus you will see references in book titles to "the 4.3BSD UNIX(R) operating system" and "the 4.4BSD operating system". [[comparing-bsd-and-linux]] == Comparing BSD and Linux So what is really the difference between, say, Debian Linux and FreeBSD? 
For the average user, the difference is surprisingly small: Both are UNIX(R) like operating systems. Both are developed by non-commercial projects (this does not apply to many other Linux distributions, of course). In the following section, we will look at BSD and compare it to Linux. The description applies most closely to FreeBSD, which accounts for an estimated 80% of the BSD installations, but the differences from NetBSD, OpenBSD and DragonFlyBSD are small. === Who owns BSD? No one person or corporation owns BSD. It is created and distributed by a community of highly technical and committed contributors all over the world. Some of the components of BSD are Open Source projects in their own right and managed by different project maintainers. === How is BSD developed and updated? The BSD kernels are developed and updated following the Open Source development model. Each project maintains a publicly accessible _source tree_ which contains all source files for the project, including documentation and other incidental files. Users can obtain a complete copy of any version. A large number of developers worldwide contribute to improvements to BSD. They are divided into three kinds: * _Contributors_ write code or documentation. They are not permitted to commit (add code) directly to the source tree. For their code to be included in the system, it must be reviewed and checked in by a registered developer, known as a __committer__. * _Committers_ are developers with write access to the source tree. To become a committer, an individual must show ability in the area in which they are active. + It is at the individual committer's discretion whether they should obtain authority before committing changes to the source tree. In general, an experienced committer may make changes which are obviously correct without obtaining consensus. For example, a documentation project committer may correct typographical or grammatical errors without review. 
On the other hand, developers making far-reaching or complicated changes are expected to submit their changes for review before committing them. In extreme cases, a core team member with a function such as Principal Architect may order that changes be removed from the tree, a process known as _backing out_. All committers receive mail describing each individual commit, so it is not possible to commit secretly. * The _Core team_. FreeBSD and NetBSD each have a core team which manages the project. The core teams developed in the course of the projects, and their role is not always well-defined. It is not necessary to be a developer to be a core team member, though it is normal. The rules for the core team vary from one project to the other, but in general they have more say in the direction of the project than non-core team members have. This arrangement differs from Linux in a number of ways: . No one person controls the content of the system. In practice, this difference is overrated, since the Principal Architect can require that code be backed out, and even in the Linux project several people are permitted to make changes. . On the other hand, there _is_ a central repository, a single place where you can find the entire operating system sources, including all older versions. . BSD projects maintain the entire "Operating System", not only the kernel. This distinction is only marginally useful: neither BSD nor Linux is useful without applications. The applications used under BSD are frequently the same as the applications used under Linux. . As a result of the formalized maintenance of a single Git source tree, BSD development is clear, and it is possible to access any version of the system by release number or by date. Git also allows incremental updates to the system: for example, the FreeBSD repository is updated about 100 times a day. Most of these changes are small. === BSD releases FreeBSD, NetBSD and OpenBSD provide the system in three different "releases". 
As with Linux, releases are assigned a number such as 1.4.1 or 3.5. In addition, the version number has a suffix indicating its purpose: . The development version of the system is called _CURRENT_. FreeBSD assigns a number to CURRENT, for example FreeBSD 5.0-CURRENT. NetBSD uses a slightly different naming scheme and appends a single-letter suffix which indicates changes in the internal interfaces, for example NetBSD 1.4.3G. OpenBSD does not assign a number ("OpenBSD-current"). All new development on the system goes into this branch. . At regular intervals, between two and four times a year, the projects bring out a _RELEASE_ version of the system, which is available on CD-ROM and for free download from FTP sites, for example OpenBSD 2.6-RELEASE or NetBSD 1.4-RELEASE. The RELEASE version is intended for end users and is the normal version of the system. NetBSD also provides _patch releases_ with a third digit, for example NetBSD 1.4.2. . As bugs are found in a RELEASE version, they are fixed, and the fixes are added to the Git tree. In FreeBSD, the resultant version is called the _STABLE_ version, while in NetBSD and OpenBSD it continues to be called the RELEASE version. Smaller new features can also be added to this branch after a period of test in the CURRENT branch. Security and other important bug fixes are also applied to all supported RELEASE versions. _By contrast, Linux maintains two separate code trees: the stable version and the development version. Stable versions have an even minor version number, such as 2.0, 2.2 or 2.4. Development versions have an odd minor version number, such as 2.1, 2.3 or 2.5. In each case, the number is followed by a further number designating the exact release. In addition, each vendor adds their own userland programs and utilities, so the name of the distribution is also important. 
Each distribution vendor also assigns version numbers to the distribution, so a complete description might be something like "TurboLinux 6.0 with kernel 2.2.14"_

=== What versions of BSD are available?

In contrast to the numerous Linux distributions, there are only four major open source BSDs.
Each BSD project maintains its own source tree and its own kernel.
In practice, though, there appear to be fewer divergences between the userland code of the projects than there are in Linux.

It is difficult to categorize the goals of each project: the differences are very subjective.
Basically,

* FreeBSD aims for high performance and ease of use by end users, and is a favourite of web content providers.
It runs on a link:https://www.FreeBSD.org/platforms/[number of platforms] and has significantly more users than the other projects.
* NetBSD aims for maximum portability: "of course it runs NetBSD".
It runs on machines from palmtops to large servers, and has even been used on NASA space missions.
It is a particularly good choice for running on old non-Intel(R) hardware.
* OpenBSD aims for security and code purity: it uses a combination of the open source concept and rigorous code reviews to create a system which is demonstrably correct, making it the choice of security-conscious organizations such as banks, stock exchanges and US Government departments.
Like NetBSD, it runs on a number of platforms.
* DragonFlyBSD aims for high performance and scalability under everything from a single-node UP system to a massively clustered system.
DragonFlyBSD has several long-range technical goals, but the focus lies on providing an SMP-capable infrastructure that is easy to understand, maintain and develop for.

There are also two additional BSD UNIX(R) operating systems which are not open source, BSD/OS and Apple's Mac OS(R) X:

* BSD/OS was the oldest of the 4.4BSD derivatives.
It was not open source, though source code licenses were available at relatively low cost.
It resembled FreeBSD in many ways. Two years after the acquisition of BSDi by Wind River Systems, BSD/OS failed to survive as an independent product. Support and source code may still be available from Wind River, but all new development is focused on the VxWorks embedded operating system. * http://www.apple.com/macosx/server/[Mac OS(R) X] is the latest version of the operating system for Apple(R)'s Mac(R) line. The BSD core of this operating system, http://developer.apple.com/darwin/[Darwin], is available as a fully functional open source operating system for x86 and PPC computers. The Aqua/Quartz graphics system and many other proprietary aspects of Mac OS(R) X remain closed-source, however. Several Darwin developers are also FreeBSD committers, and vice-versa. === How does the BSD license differ from the GNU Public license? Linux is available under the http://www.fsf.org/copyleft/gpl.html[GNU General Public License] (GPL), which is designed to eliminate closed source software. In particular, any derivative work of a product released under the GPL must also be supplied with source code if requested. By contrast, the http://www.opensource.org/licenses/bsd-license.html[BSD license] is less restrictive: binary-only distributions are allowed. This is particularly attractive for embedded applications. === What else should I know? Since fewer applications are available for BSD than Linux, the BSD developers created a Linux compatibility package, which allows Linux programs to run under BSD. The package includes both kernel modifications, to correctly perform Linux system calls, and Linux compatibility files such as the C library. There is no noticeable difference in execution speed between a Linux application running on a Linux machine and a Linux application running on a BSD machine of the same speed. The "all from one supplier" nature of BSD means that upgrades are much easier to handle than is frequently the case with Linux. 
BSD handles library version upgrades by providing compatibility modules for earlier library versions, so it is possible to run binaries which are several years old with no problems. === Which should I use, BSD or Linux? What does this all mean in practice? Who should use BSD, who should use Linux? This is a very difficult question to answer. Here are some guidelines: * "If it ain't broke, don't fix it": If you already use an open source operating system, and you are happy with it, there is probably no good reason to change. * BSD systems, in particular FreeBSD, can have notably higher performance than Linux. But this is not across the board. In many cases, there is little or no difference in performance. In some cases, Linux may perform better than FreeBSD. * In general, BSD systems have a better reputation for reliability, mainly as a result of the more mature code base. * BSD projects have a better reputation for the quality and completeness of their documentation. The various documentation projects aim to provide actively updated documentation, in many languages, and covering all aspects of the system. * The BSD license may be more attractive than the GPL. * BSD can execute most Linux binaries, while Linux can not execute BSD binaries. Many BSD implementations can also execute binaries from other UNIX(R) like systems. As a result, BSD may present an easier migration route from other systems than Linux would. === Who provides support, service, and training for BSD? http://www.ixsystems.com/[iXsystems, Inc.] provides support contracts for FreeBSD. In addition, each of the projects has a list of consultants for hire: link:https://www.FreeBSD.org/commercial/consult_bycat/[FreeBSD], http://www.netbsd.org/gallery/consultants.html[NetBSD], and http://www.openbsd.org/support.html[OpenBSD]. 
diff --git a/documentation/content/en/books/handbook/jails/_index.adoc b/documentation/content/en/books/handbook/jails/_index.adoc index 10ecf79d45..3b949cef5a 100644 --- a/documentation/content/en/books/handbook/jails/_index.adoc +++ b/documentation/content/en/books/handbook/jails/_index.adoc @@ -1,1251 +1,1251 @@ --- title: Chapter 17. Jails and Containers part: Part III. System Administration prev: books/handbook/security next: books/handbook/mac description: Jails improve on the concept of the traditional chroot environment in several ways tags: ["jails", "creating", "managing", "updating"] showBookMenu: true weight: 21 params: path: "/books/handbook/jails/" --- [[jails]] = Jails and Containers :doctype: book :toc: macro :toclevels: 1 :icons: font :sectnums: :sectnumlevels: 6 :sectnumoffset: 17 :partnums: :source-highlighter: rouge :experimental: :images-path: books/handbook/jails/ ifdef::env-beastie[] ifdef::backend-html5[] :imagesdir: ../../../../images/{images-path} endif::[] ifndef::book[] include::shared/authors.adoc[] include::shared/mirrors.adoc[] include::shared/releases.adoc[] include::shared/attributes/attributes-{{% lang %}}.adoc[] include::shared/{{% lang %}}/teams.adoc[] include::shared/{{% lang %}}/mailing-lists.adoc[] include::shared/{{% lang %}}/urls.adoc[] toc::[] endif::[] ifdef::backend-pdf,backend-epub3[] include::../../../../../shared/asciidoctor.adoc[] endif::[] endif::[] ifndef::env-beastie[] toc::[] include::../../../../../shared/asciidoctor.adoc[] endif::[] [[jails-synopsis]] == Synopsis Since system administration is a difficult task, many tools have been developed to make life easier for the administrator. These tools often enhance the way systems are installed, configured, and maintained. One of the tools which can be used to enhance the security of a FreeBSD system is _jails_. Jails have been available since FreeBSD 4.X and continue to be enhanced in their usefulness, performance, reliability, and security. 
Jails build upon the man:chroot[2] concept, which is used to change the root directory of a set of processes. This creates a safe environment, separate from the rest of the system. Processes created in the chrooted environment can not access files or resources outside of it. For that reason, compromising a service running in a chrooted environment should not allow the attacker to compromise the entire system. However, a chroot has several limitations. It is suited to easy tasks which do not require much flexibility or complex, advanced features. Over time, many ways have been found to escape from a chrooted environment, making it a less than ideal solution for securing services. Jails improve on the concept of the traditional chroot environment in several ways. In a traditional chroot environment, processes are only limited in the part of the file system they can access. The rest of the system resources, system users, running processes, and the networking subsystem are shared by the chrooted processes and the processes of the host system. Jails expand this model by virtualizing access to the file system, the set of users, and the networking subsystem. More fine-grained controls are available for tuning the access of a jailed environment. Jails can be considered as a type of operating system-level virtualization. This chapter covers: * What a jail is and what purpose it may serve in FreeBSD installations. * The different types of jail. * The different ways to configure the network for a jail. * The jail configuration file. * How to create the different types of jail. * How to start, stop, and restart a jail. * The basics of jail administration, both from inside and outside the jail. * How to upgrade the different types of jail. -* A incomplete list of the different FreeBSD jail managers. +* An incomplete list of the different FreeBSD jail managers. 
[[jail-types]] == Jail Types Some administrators divide jails into different types, although the underlying technology is the same. Each administrator will have to assess what type of jail to create in each case depending on the problem they have to solve. Below can be found a list of the different types, their characteristics, and considerations for use. [[thick-jails]] === Thick Jails A thick jail is a traditional form of FreeBSD Jail. In a thick jail, a complete copy of the base system is replicated within the jail's environment. This means that the jail has its own separate instance of the FreeBSD base system, including libraries, executables, and configuration files. The jail can be thought of as an almost complete standalone FreeBSD installation, but running within the confines of the host system. This isolation ensures that the processes within the jail are kept separate from those on the host and other jails. Advantages of Thick Jails: * High degree of isolation: Processes within the jail are isolated from the host system and other jails. * Independence: Thick jails can have different versions of libraries, configurations, and software than the host system or other jails. * Security: Since the jail contains its own base system, vulnerabilities or issues affecting the jail environment will not directly impact the host or other jails. Disadvantages of Thick Jails: * Resource overhead: Because each jail maintains its own separate base system, thick jails consume more resources compared to thin jails. * Maintenance: Each jail requires its own maintenance and updates for its base system components. [[thin-jails]] === Thin Jails A thin jail shares the base system using OpenZFS snapshots or NullFS mounts from a template. Only a minimal subset of base system is duplicated for each thin jail, resulting in less resource consumption compared to a thick jail. However, this also means that thin jails have less isolation and independence compared to thick jails. 
Changes in shared components could potentially affect multiple thin jails simultaneously. In summary, a FreeBSD Thin Jail is a type of FreeBSD Jail that replicates a substantial portion, but not all, of the base system within the isolated environment. Advantages of Thin Jails: * Resource Efficiency: Thin jails are more resource-efficient compared to thick jails. Since they share most of the base system, they consume less disk space and memory. This makes it possible to run more jails on the same hardware without consuming excessive resources. * Faster Deployment: Creating and launching thin jails is generally faster compared to thick jails. This can be particularly advantageous when rapidly deploying multiple instances. * Unified Maintenance: Since thin jails share the majority of their base system with the host system, updates and maintenance of common base system components (such as libraries and binaries) only need to be done once on the host. This simplifies the maintenance process compared to maintaining an individual base system for each thick jail. * Shared Resources: Thin jails can more easily share common resources such as libraries and binaries with the host system. This can potentially lead to more efficient disk caching and improved performance for applications within the jail. Disadvantages of Thin Jails: * Reduced Isolation: The primary disadvantage of thin jails is that they offer less isolation compared to thick jails. Since they share a significant portion of the template's base system, vulnerabilities or issues affecting shared components could potentially impact multiple jails simultaneously. * Security Concerns: The reduced isolation in thin jails could pose security risks, as a compromise in one jail might have a greater potential to affect other jails or the host system. * Dependency Conflicts: If multiple thin jails require different versions of the same libraries or software, managing dependencies can become complex. 
In some cases, this might require additional effort to ensure compatibility.
* Compatibility Challenges: Applications within a thin jail might encounter compatibility issues if they assume a certain base system environment that differs from the shared components provided by the template.

[[service-jails]]
=== Service Jails

A service jail shares the complete filesystem tree directly with the host (the jail root path is [.filename]#/#) and as such can access and modify any file on the host, and shares the same user accounts with the host.
By default, a service jail has no access to the network or other resources which are restricted in jails, but it can be configured to re-use the network of the host and to remove some of the jail restrictions.
The use case for service jails is automatic confinement of services/daemons inside a jail with minimal configuration, and without any knowledge of the files needed by such a service/daemon.
Service jails have been available since FreeBSD 15.

Advantages of Service Jails:

* Zero Administration: A service that is service jail ready needs only one configuration line in [.filename]#/etc/rc.conf#; a service that is not service jail ready needs two.
* Resource Efficiency: Service jails are more resource-efficient than thin jails, as they do not need any additional disk space or network resources.
* Faster Deployment: Creating and launching service jails is generally faster than creating thin jails if only distinct services/daemons are to be jailed and no parallel instances of the same service/daemon are needed.
* Shared Resources: Service jails share all resources such as libraries and binaries with the host system.
This can potentially lead to more efficient disk caching and improved performance for applications within the jail.
* Process Isolation: A service jail isolates a particular service, which can not see processes that are not children of the service jail, even if they run within the same user account.
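
The configuration lines mentioned under "Zero Administration" could look like the following sketch.
It assumes the man:rc.subr[8] convention of `<name>_svcj` and `<name>_svcj_options` variables, uses man:syslogd[8] only as an illustrative service, and invents a service called `mydaemon` for the two-line case; all of these are assumptions to be checked against crossref:jails[service-jails-config, Configuring service jails].
The lines are written to a scratch file (the hypothetical `rc_conf` variable) instead of the real [.filename]#/etc/rc.conf#.

```shell
#!/bin/sh
# Sketch only: rc_conf is a scratch stand-in for /etc/rc.conf.
# On a real host, sysrc(8) would normally be used to set these variables.
rc_conf=$(mktemp)

# Assumed variable names following the <name>_svcj convention.
# "mydaemon" is a hypothetical service that is not service jail ready,
# so it gets a second line granting basic network access (assumed option).
cat > "${rc_conf}" <<'EOF'
syslogd_svcj="YES"
mydaemon_svcj="YES"
mydaemon_svcj_options="net_basic"
EOF

# Show the resulting configuration lines.
cat "${rc_conf}"
```

Keeping the sketch on a scratch file makes it safe to try on any system; the affected service would still have to be restarted on a real host before the change takes effect.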
Disadvantages of Service Jails:

* Reduced Isolation: The primary disadvantage of service jails is that, unlike thick or thin jails, they offer no filesystem isolation.
* Security Concerns: The reduced isolation in service jails could pose security risks, as a compromise in one jail might have a greater potential to affect everything on the host system.

Most of the jail configuration discussed below is not needed for service jails.
Even so, understanding those configuration possibilities is recommended in order to understand how jails work.
The details of what is needed to configure a service jail are in crossref:jails[service-jails-config, Configuring service jails].

[[vnet-jails]]
=== VNET Jails

A FreeBSD VNET jail is a virtualized environment that allows for the isolation and control of network resources for processes running within it.
It provides a high level of network segmentation and security by creating a separate network stack for processes within the jail, ensuring that network traffic within the jail is isolated from the host system and other jails.

In essence, FreeBSD VNET jails add a network configuration mechanism.
This means a VNET jail can be created as a Thick or Thin Jail.

[[linux-jails]]
=== Linux Jails

A FreeBSD Linux Jail is a feature in the FreeBSD operating system that enables the use of Linux binaries and applications within a FreeBSD jail.
This functionality is achieved by incorporating a compatibility layer that allows certain Linux system calls and libraries to be translated and executed on the FreeBSD kernel.
The purpose of a Linux Jail is to facilitate the execution of Linux software on a FreeBSD system without needing a separate Linux virtual machine or environment.

[[host-configuration]]
== Host Configuration

Before creating any jail on the host system it is necessary to perform certain configuration and obtain some information from the host system.
It will be necessary to configure the man:jail[8] utility, create the necessary directories to configure and install jails, obtain information from the host's network, and check whether the host uses OpenZFS or UFS as its file system.

[TIP]
====
The FreeBSD version running in the jail can not be newer than the version running in the host.
====

[[host-configuration-jail-utility]]
=== Jail Utility

The man:jail[8] utility manages jails.

To start jails when the system boots, run the following commands:

[source,shell]
....
# sysrc jail_enable="YES"
# sysrc jail_parallel_start="YES"
....

[TIP]
====
With `jail_parallel_start`, all configured jails will be started in the background.
====

[[jails-networking]]
=== Networking

Networking for FreeBSD jails can be configured in several different ways:

Host Networking Mode (IP Sharing)::
In host networking mode, a jail shares the same networking stack as the host system.
When a jail is created in host networking mode it uses the same network interface and IP address.
This means that the jail does not have a separate IP address, and its network traffic is associated with the host's IP.

Virtual Networks (VNET)::
Virtual Networks are a feature of FreeBSD jails that offer more advanced and flexible networking solutions than a basic networking mode like host networking.
VNET allows the creation of isolated network stacks for each jail, providing them with their own separate IP addresses, routing tables, and network interfaces.
This offers a higher level of network isolation and allows jails to function as if they are running on separate virtual machines.

The netgraph system::
man:netgraph[4] is a versatile kernel framework for creating custom network configurations.
It can be used to define how network traffic flows between jails and the host system and between different jails.

[[host-configuration-directories]]
=== Setting Up the Jail Directory Tree

There is no specific place to put the files for the jails.
Some administrators use [.filename]#/jail#, others [.filename]#/usr/jail#, and still others [.filename]#/usr/local/jails#.
In this chapter [.filename]#/usr/local/jails# will be used.

Apart from [.filename]#/usr/local/jails# other directories will be created:

* [.filename]#media# will contain the compressed files of the downloaded userlands.
* [.filename]#templates# will contain the templates when using Thin Jails.
* [.filename]#containers# will contain the jails.

When using OpenZFS, execute the following commands to create datasets for these directories:

[source,shell]
....
# zfs create -o mountpoint=/usr/local/jails zroot/jails
# zfs create zroot/jails/media
# zfs create zroot/jails/templates
# zfs create zroot/jails/containers
....

[TIP]
====
In this case, `zroot` was used for the parent dataset, but other datasets could have been used.
====

When using UFS, execute the following commands to create the directories:

[source,shell]
....
# mkdir /usr/local/jails/
# mkdir /usr/local/jails/media
# mkdir /usr/local/jails/templates
# mkdir /usr/local/jails/containers
....

[[jail-configuration-files]]
=== Jail Configuration Files

There are two ways to configure jails.
The first one is to add an entry for each jail to the file [.filename]#/etc/jail.conf#.
The other option is to create a file for each jail in the directory [.filename]#/etc/jail.conf.d/#.

In case a host system has few jails, an entry for each jail can be added in the file [.filename]#/etc/jail.conf#.
If the host system has many jails, it is a good idea to have one configuration file for each jail in the [.filename]#/etc/jail.conf.d/# directory.

The files in [.filename]#/etc/jail.conf.d/# must have `.conf` as their extension and have to be included in [.filename]#/etc/jail.conf#:

[.programlisting]
....
.include "/etc/jail.conf.d/*.conf";
....

A typical jail entry would look like this:

[.programlisting]
....
jailname { <.>
  # STARTUP/LOGGING
  exec.start = "/bin/sh /etc/rc"; <.>
  exec.stop = "/bin/sh /etc/rc.shutdown"; <.>
  exec.consolelog = "/var/log/jail_console_${name}.log"; <.>

  # PERMISSIONS
  allow.raw_sockets; <.>
  exec.clean; <.>
  mount.devfs; <.>

  # HOSTNAME/PATH
  host.hostname = "${name}"; <.>
  path = "/usr/local/jails/containers/${name}"; <.>

  # NETWORK
  ip4.addr = 192.168.1.151; <.>
  ip6.addr = ::ffff:c0a8:197; <.>
  interface = em0; <.>
}
....

<.> `jailname` - Name of the jail.
<.> `exec.start` - Command(s) to run in the jail environment when a jail is created. A typical command to run is "/bin/sh /etc/rc".
<.> `exec.stop` - Command(s) to run in the jail environment before a jail is removed. A typical command to run is "/bin/sh /etc/rc.shutdown".
<.> `exec.consolelog` - A file to direct command output (stdout and stderr) to.
<.> `allow.raw_sockets` - Allow creating raw sockets inside the jail. Setting this parameter allows utilities like man:ping[8] and man:traceroute[8] to operate inside the jail.
<.> `exec.clean` - Run commands in a clean environment.
<.> `mount.devfs` - Mount a man:devfs[5] filesystem on the chrooted [.filename]#/dev# directory, and apply the ruleset in the `devfs_ruleset` parameter to restrict the devices visible inside the jail.
<.> `host.hostname` - The hostname of the jail.
<.> `path` - The directory which is to be the root of the jail. Any commands that are run inside the jail, either by man:jail[8] or from man:jexec[8], are run from this directory.
<.> `ip4.addr` - IPv4 address. There are two configuration possibilities for IPv4. The first is to establish an IP or a list of IPs as has been done in the example. The other is to use `ip4` instead and set the `inherit` value to inherit the host's IP address.
<.> `ip6.addr` - IPv6 address. There are two configuration possibilities for IPv6. The first is to establish an IP or a list of IPs as has been done in the example.
The other is to use `ip6` instead and set the `inherit` value to inherit the host's IP address. <.> `interface` - A network interface to add the jail's IP addresses. Usually the host interface. More information about configuration variables can be found in man:jail[8] and man:jail.conf[5]. [[classic-jail]] == Classic Jail (Thick Jail) These jails resemble a real FreeBSD system. They can be managed more or less like a normal host system and updated independently. [[creating-classic-jail]] === Creating a Classic Jail In principle, a jail only needs a hostname, a root directory, an IP address, and a userland. The userland for the jail can be obtained from the official FreeBSD download servers. Execute the following command to download the userland: [source,shell,subs=attributes] .... # fetch https://download.freebsd.org/ftp/releases/amd64/amd64/{rel-latest}-RELEASE/base.txz -o /usr/local/jails/media/{rel-latest}-RELEASE-base.txz .... Once the download is complete, it will be necessary to extract the contents into the jail directory. Execute the following commands to extract the userland into the jail's directory: [source,shell,subs=attributes] .... # mkdir -p /usr/local/jails/containers/classic # tar -xf /usr/local/jails/media/{rel-latest}-RELEASE-base.txz -C /usr/local/jails/containers/classic --unlink .... With the userland extracted in the jail directory, it will be necessary to copy the timezone and DNS server files: [source,shell] .... # cp /etc/resolv.conf /usr/local/jails/containers/classic/etc/resolv.conf # cp /etc/localtime /usr/local/jails/containers/classic/etc/localtime .... With the files copied, the next thing to do is update to the latest patch level by executing the following command: [source,shell] .... # freebsd-update -b /usr/local/jails/containers/classic/ fetch install .... The last step is to configure the jail. 
It will be necessary to add an entry to the configuration file [.filename]#/etc/jail.conf# or in [.filename]#jail.conf.d# with the parameters of the jail. An example would be the following: [.programlisting] .... classic { # STARTUP/LOGGING exec.start = "/bin/sh /etc/rc"; exec.stop = "/bin/sh /etc/rc.shutdown"; exec.consolelog = "/var/log/jail_console_${name}.log"; # PERMISSIONS allow.raw_sockets; exec.clean; mount.devfs; # HOSTNAME/PATH host.hostname = "${name}"; path = "/usr/local/jails/containers/${name}"; # NETWORK ip4.addr = 192.168.1.151; interface = em0; } .... Execute the following command to start the jail: [source,shell] .... # service jail start classic .... More information on how to manage jails can be found in the section crossref:jails[jail-management, Jail Management]. [[thin-jail]] == Thin Jails Although Thin Jails use the same technology as Thick Jails, the creation procedure is different. Thin jails can be created using OpenZFS snapshots or using templates and NullFS. The use of OpenZFS snapshots and templates using NullFS have certain advantages over classic jails, such as being able to create them faster from snapshots or being able to update multiple jails using NullFS. [[creating-thin-jail-openzfs-snapshots]] === Creating a Thin Jail Using OpenZFS Snapshots Due to the good integration between FreeBSD and OpenZFS it is very easy to create new Thin Jails using OpenZFS Snapshots. To create a Thin Jail using OpenZFS Snapshots the first step is to create the jail directory tree by following the instructions in crossref:jails[host-configuration-directories, "Setting up the Jail Directory Tree"]. Next, create a template. Templates will only be used to create new jails. For this reason they are created in "read-only" mode so that jails are created with an immutable base. To create the dataset for the template, execute the following command: [source,shell,subs=attributes] .... # zfs create -p zroot/jails/templates/{rel-latest}-RELEASE .... 
Then execute the following command to download the userland:

[source,shell,subs=attributes]
....
# fetch https://download.freebsd.org/ftp/releases/amd64/amd64/{rel-latest}-RELEASE/base.txz -o /usr/local/jails/media/{rel-latest}-RELEASE-base.txz
....

Once the download is complete, it will be necessary to extract the contents into the template directory by executing the following command:

[source,shell,subs=attributes]
....
# tar -xf /usr/local/jails/media/{rel-latest}-RELEASE-base.txz -C /usr/local/jails/templates/{rel-latest}-RELEASE --unlink
....

With the userland extracted in the templates directory, it will be necessary to copy the timezone and DNS server files to the template directory by executing the following commands:

[source,shell,subs=attributes]
....
# cp /etc/resolv.conf /usr/local/jails/templates/{rel-latest}-RELEASE/etc/resolv.conf
# cp /etc/localtime /usr/local/jails/templates/{rel-latest}-RELEASE/etc/localtime
....

The next thing to do is update to the latest patch level by executing the following command:

[source,shell,subs=attributes]
....
# freebsd-update -b /usr/local/jails/templates/{rel-latest}-RELEASE/ fetch install
....

Once the update is finished, the template is ready.

To create an OpenZFS Snapshot from the template, execute the following command:

[source,shell,subs=attributes]
....
# zfs snapshot zroot/jails/templates/{rel-latest}-RELEASE@base
....

Once the OpenZFS Snapshot has been created, any number of jails can be created using the OpenZFS clone function.

To create a Thin Jail named `thinjail`, execute the following command:

[source,shell,subs=attributes]
....
# zfs clone zroot/jails/templates/{rel-latest}-RELEASE@base zroot/jails/containers/thinjail
....

The last step is to configure the jail.
It will be necessary to add an entry to the configuration file [.filename]#/etc/jail.conf# or in [.filename]#jail.conf.d# with the parameters of the jail.

An example would be the following:

[.programlisting]
....
thinjail {
  # STARTUP/LOGGING
  exec.start = "/bin/sh /etc/rc";
  exec.stop = "/bin/sh /etc/rc.shutdown";
  exec.consolelog = "/var/log/jail_console_${name}.log";

  # PERMISSIONS
  allow.raw_sockets;
  exec.clean;
  mount.devfs;

  # HOSTNAME/PATH
  host.hostname = "${name}";
  path = "/usr/local/jails/containers/${name}";

  # NETWORK
  ip4 = inherit;
  interface = em0;
}
....

Execute the following command to start the jail:

[source,shell]
....
# service jail start thinjail
....

More information on how to manage jails can be found in the section crossref:jails[jail-management, Jail Management].

[[creating-thin-jail-nullfs]]
=== Creating a Thin Jail Using NullFS

A jail can be created with reduced duplication of system files by using the Thin Jail technique and using NullFS to selectively share specific directories from the host system into the jail.

The first step is to create the dataset to save the template.
Execute the following command if using OpenZFS:

[source,shell,subs=attributes]
....
# zfs create -p zroot/jails/templates/{rel-latest}-RELEASE-base
....

Or this one if using UFS:

[source,shell,subs=attributes]
....
# mkdir /usr/local/jails/templates/{rel-latest}-RELEASE-base
....

Then execute the following command to download the userland:

[source,shell,subs=attributes]
....
# fetch https://download.freebsd.org/ftp/releases/amd64/amd64/{rel-latest}-RELEASE/base.txz -o /usr/local/jails/media/{rel-latest}-RELEASE-base.txz
....

Once the download is complete, it will be necessary to extract the contents into the template directory by executing the following command:

[source,shell,subs=attributes]
....
# tar -xf /usr/local/jails/media/{rel-latest}-RELEASE-base.txz -C /usr/local/jails/templates/{rel-latest}-RELEASE-base --unlink
....

Once the userland is extracted in the templates directory, it will be necessary to copy the timezone and DNS server files to the template directory by executing the following commands:

[source,shell,subs=attributes]
....
# cp /etc/resolv.conf /usr/local/jails/templates/{rel-latest}-RELEASE-base/etc/resolv.conf # cp /etc/localtime /usr/local/jails/templates/{rel-latest}-RELEASE-base/etc/localtime .... With the files moved to the template, the next thing to do is update to the latest patch level by executing the following command: [source,shell,subs=attributes] .... # freebsd-update -b /usr/local/jails/templates/{rel-latest}-RELEASE-base/ fetch install .... In addition to the base template, it is also necessary to create a directory where the `skeleton` will be located. Some directories will be copied from the template to the `skeleton`. Execute the following command to create the dataset for the `skeleton` in case of using OpenZFS: [source,shell,subs=attributes] .... # zfs create -p zroot/jails/templates/{rel-latest}-RELEASE-skeleton .... Or this one in case of using UFS: [source,shell,subs=attributes] .... # mkdir /usr/local/jails/templates/{rel-latest}-RELEASE-skeleton .... Then create the `skeleton` directories. The `skeleton` directories will hold the local directories of the jails. Execute the following commands to create the directories: [source,shell,subs=attributes] .... # mkdir -p /usr/local/jails/templates/{rel-latest}-RELEASE-skeleton/home # mkdir -p /usr/local/jails/templates/{rel-latest}-RELEASE-skeleton/usr # mv /usr/local/jails/templates/{rel-latest}-RELEASE-base/etc /usr/local/jails/templates/{rel-latest}-RELEASE-skeleton/etc # mv /usr/local/jails/templates/{rel-latest}-RELEASE-base/usr/local /usr/local/jails/templates/{rel-latest}-RELEASE-skeleton/usr/local # mv /usr/local/jails/templates/{rel-latest}-RELEASE-base/tmp /usr/local/jails/templates/{rel-latest}-RELEASE-skeleton/tmp # mv /usr/local/jails/templates/{rel-latest}-RELEASE-base/var /usr/local/jails/templates/{rel-latest}-RELEASE-skeleton/var # mv /usr/local/jails/templates/{rel-latest}-RELEASE-base/root /usr/local/jails/templates/{rel-latest}-RELEASE-skeleton/root .... 
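The split performed by the commands above can be rehearsed in a scratch directory before touching a real template.
The following is only a sketch under stated assumptions: the `base` and `skeleton` directory names stand in for the real template paths, and the empty directories stand in for an extracted userland.

```shell
#!/bin/sh
# Sketch: rehearse the base/skeleton split in a temporary directory.
# "base" and "skeleton" are stand-ins (assumptions) for the template paths;
# nothing outside the temporary directory is touched.
scratch=$(mktemp -d)
trap 'rm -rf "$scratch"' EXIT    # remove the scratch directory when done

base="$scratch/base"
skel="$scratch/skeleton"

# Stand-in for an extracted userland.
mkdir -p "$base/etc" "$base/usr/local" "$base/tmp" "$base/var" "$base/root" "$base/bin"

# Same moves as in the procedure: per-jail, writable directories go to the skeleton.
mkdir -p "$skel/home" "$skel/usr"
mv "$base/etc" "$skel/etc"
mv "$base/usr/local" "$skel/usr/local"
mv "$base/tmp" "$skel/tmp"
mv "$base/var" "$skel/var"
mv "$base/root" "$skel/root"

# Show what is left in each tree.
echo "base:     $(ls "$base" | tr '\n' ' ')"
echo "skeleton: $(ls "$skel" | tr '\n' ' ')"
```

After the moves, only the shared, read-only parts remain in `base`, while everything a jail needs to write to lives in `skeleton`.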
The next step is to create the symlinks to the `skeleton` by executing the following commands:

[source,shell,subs=attributes]
....
# cd /usr/local/jails/templates/{rel-latest}-RELEASE-base/
# mkdir skeleton
# ln -s skeleton/etc etc
# ln -s skeleton/home home
# ln -s skeleton/root root
# ln -s ../skeleton/usr/local usr/local
# ln -s skeleton/tmp tmp
# ln -s skeleton/var var
....

With the `skeleton` ready, it will be necessary to copy the data to the jail directory.

In case of using OpenZFS, OpenZFS snapshots can be used to easily create as many jails as necessary by executing the following commands:

[source,shell,subs=attributes]
....
# zfs snapshot zroot/jails/templates/{rel-latest}-RELEASE-skeleton@base
# zfs clone zroot/jails/templates/{rel-latest}-RELEASE-skeleton@base zroot/jails/containers/thinjail
....

In case of using UFS, the man:cp[1] program can be used by executing the following command:

[source,shell,subs=attributes]
....
# cp -R /usr/local/jails/templates/{rel-latest}-RELEASE-skeleton /usr/local/jails/containers/thinjail
....

Then create the directory in which the base template and the skeleton will be mounted:

[source,shell]
....
# mkdir -p /usr/local/jails/thinjail-nullfs-base
....

Add a jail entry in [.filename]#/etc/jail.conf# or a file in [.filename]#jail.conf.d# as follows:

[.programlisting]
....
thinjail {
  # STARTUP/LOGGING
  exec.start = "/bin/sh /etc/rc";
  exec.stop = "/bin/sh /etc/rc.shutdown";
  exec.consolelog = "/var/log/jail_console_${name}.log";

  # PERMISSIONS
  allow.raw_sockets;
  exec.clean;
  mount.devfs;

  # HOSTNAME/PATH
  host.hostname = "${name}";
  path = "/usr/local/jails/${name}-nullfs-base";

  # NETWORK
  ip4.addr = 192.168.1.153;
  interface = em0;

  # MOUNT
  mount.fstab = "/usr/local/jails/${name}-nullfs-base.fstab";
}
....

Then create the [.filename]#/usr/local/jails/thinjail-nullfs-base.fstab# file as follows:

[.programlisting,subs=attributes]
....
/usr/local/jails/templates/{rel-latest}-RELEASE-base  /usr/local/jails/thinjail-nullfs-base/         nullfs  ro  0 0
/usr/local/jails/containers/thinjail                  /usr/local/jails/thinjail-nullfs-base/skeleton nullfs  rw  0 0
....

Execute the following command to start the jail:

[source,shell]
....
# service jail start thinjail
....

[[creating-vnet-jail]]
=== Creating a VNET Jail

FreeBSD VNET Jails have their own distinct networking stack, including interfaces, IP addresses, routing tables, and firewall rules.

The first step to create a VNET jail is to create the man:bridge[4] by executing the following command:

[source,shell]
....
# ifconfig bridge create
....

The output should be similar to the following:

[.programlisting]
....
bridge0
....

With the `bridge` created, it will be necessary to attach the `em0` interface to it and bring both of them up by executing the following commands:

[source,shell]
....
# ifconfig bridge0 addm em0 up
# ifconfig em0 up
....

To make this setting persist across reboots, add the following lines to [.filename]#/etc/rc.conf#:

[.programlisting]
....
defaultrouter="192.168.1.1"
cloned_interfaces="bridge0"
ifconfig_bridge0="inet 192.168.1.150/24 addm em0 up"
ifconfig_em0="up"
....

For more information on bridging, see crossref:advanced-networking[network-bridging, Network Bridging].

The next step is to create the jail as indicated above.

Either the crossref:jails[classic-jail, Classic Jail (Thick Jail)] procedure or the crossref:jails[thin-jail, Thin Jails] procedure can be used.
The only thing that will change is the configuration in the [.filename]#/etc/jail.conf# file.

The path [.filename]#/usr/local/jails/containers/vnet# will be used as an example for the created jail.

The following is an example configuration for a VNET jail:

[.programlisting]
....
vnet { # STARTUP/LOGGING exec.consolelog = "/var/log/jail_console_${name}.log"; # PERMISSIONS allow.raw_sockets; exec.clean; mount.devfs; devfs_ruleset = 5; # PATH/HOSTNAME path = "/usr/local/jails/containers/${name}"; host.hostname = "${name}"; # VNET/VIMAGE vnet; vnet.interface = "${epair}b"; # NETWORKS/INTERFACES $id = "154"; <.> $ip = "192.168.1.${id}/24"; $gateway = "192.168.1.1"; $bridge = "bridge0"; <.> $epair = "epair${id}"; # ADD TO bridge INTERFACE exec.prestart = "/sbin/ifconfig ${epair} create up"; exec.prestart += "/sbin/ifconfig ${epair}a up descr jail:${name}"; exec.prestart += "/sbin/ifconfig ${bridge} addm ${epair}a up"; exec.start += "/sbin/ifconfig ${epair}b ${ip} up"; exec.start += "/sbin/route add default ${gateway}"; exec.start += "/bin/sh /etc/rc"; exec.stop = "/bin/sh /etc/rc.shutdown"; exec.poststop = "/sbin/ifconfig ${bridge} deletem ${epair}a"; exec.poststop += "/sbin/ifconfig ${epair}a destroy"; } .... <.> Represents the IP of the Jail, it must be *unique*. <.> Refers to the bridge created previously. [[creating-linux-jail]] === Creating a Linux Jail FreeBSD can run Linux inside a jail using crossref:linuxemu[linuxemu,Linux Binary Compatibility] and man:debootstrap[8]. Jails do not have a kernel. They run on the host's kernel. Therefore it is necessary to enable Linux Binary Compatibility in the host system. To enable the Linux ABI at boot time, execute the following command: [source,shell] .... # sysrc linux_enable="YES" .... Once enabled, it can be started without rebooting by executing the following command: [source,shell] .... # service linux start .... The next step will be to create a jail as indicated above, for example in crossref:jails[creating-thin-jail-openzfs-snapshots, Creating a Thin Jail Using OpenZFS Snapshots], but *without* performing the configuration. FreeBSD Linux jails require a specific configuration that will be detailed below. 
Once the jail has been created as explained above, execute the following command to perform the required configuration for the jail and start it:

[source,shell]
....
# jail -cm \
    name=ubuntu \
    host.hostname="ubuntu.example.com" \
    path="/usr/local/jails/containers/ubuntu" \
    interface="em0" \
    ip4.addr="192.168.1.150" \
    exec.start="/bin/sh /etc/rc" \
    exec.stop="/bin/sh /etc/rc.shutdown" \
    mount.devfs \
    devfs_ruleset=4 \
    allow.mount \
    allow.mount.devfs \
    allow.mount.fdescfs \
    allow.mount.procfs \
    allow.mount.linprocfs \
    allow.mount.linsysfs \
    allow.mount.tmpfs \
    enforce_statfs=1
....

To prepare the Linux userland, it will be necessary to install package:sysutils/debootstrap[] inside the jail.
Execute the following command to access the FreeBSD Linux jail:

[source,shell]
....
# jexec -u root ubuntu
....

Inside the jail, execute the following commands to install package:sysutils/debootstrap[] and prepare the Ubuntu environment:

[source,shell]
....
# pkg install debootstrap
# debootstrap jammy /compat/ubuntu
....

When the process has finished and the message `Base system installed successfully` is displayed on the console, it will be necessary to stop the jail from the host system by executing the following command:

[source,shell]
....
# service jail onestop ubuntu
....

Then add an entry in [.filename]#/etc/jail.conf# for the Linux jail:

[.programlisting]
....
ubuntu {
  # STARTUP/LOGGING
  exec.start = "/bin/sh /etc/rc";
  exec.stop = "/bin/sh /etc/rc.shutdown";
  exec.consolelog = "/var/log/jail_console_${name}.log";

  # PERMISSIONS
  allow.raw_sockets;
  exec.clean;
  mount.devfs;
  devfs_ruleset = 4;

  # HOSTNAME/PATH
  host.hostname = "${name}";
  path = "/usr/local/jails/containers/${name}";

  # NETWORK
  ip4.addr = 192.168.1.155;
  interface = em0;

  # MOUNT
  mount += "devfs $path/compat/ubuntu/dev devfs rw 0 0";
  mount += "tmpfs $path/compat/ubuntu/dev/shm tmpfs rw,size=1g,mode=1777 0 0";
  mount += "fdescfs $path/compat/ubuntu/dev/fd fdescfs rw,linrdlnk 0 0";
  mount += "linprocfs $path/compat/ubuntu/proc linprocfs rw 0 0";
  mount += "linsysfs $path/compat/ubuntu/sys linsysfs rw 0 0";
  mount += "/tmp $path/compat/ubuntu/tmp nullfs rw 0 0";
  mount += "/home $path/compat/ubuntu/home nullfs rw 0 0";
}
....

Then the jail can be started as usual with the following command:

[source,shell]
....
# service jail start ubuntu
....

The Ubuntu environment can be accessed using the following command:

[source,shell]
....
# jexec ubuntu chroot /compat/ubuntu /bin/bash
....

More information can be found in the chapter crossref:linuxemu[linuxemu,Linux Binary Compatibility].

[[service-jails-config]]
=== Configuring Service Jails

A service jail is configured completely via [.filename]#/etc/rc.conf# or man:sysrc[8].
The base system services are service jail ready.
They contain a config line which enables networking or lifts other restrictions of jails.
Base system services which do not make sense to run inside jails are configured to not be started as a service jail, even if enabled in [.filename]#/etc/rc.conf#.
Some examples of such services are those which want to mount or unmount something in the start or stop method, or which only configure something like a route or a firewall.

Third party services may or may not be service jail ready.
To check if a service is service jail ready, the following command can be used:

[source,shell]
....
# grep _svcj_options /path/to/rc.d/servicename .... If there is no output, the service is not service jail ready, or does not need any additional privileges like for example, network access. If the service is not service jail ready, and needs network access, it can be made ready by adding the necessary config to [.filename]#/etc/rc.conf#: [source,shell] .... # sysrc servicename_svcj_options=net_basic .... For all possible `_svcj_options` see the man:rc.conf[5] man-page. To enable a service jail for a given service, the service needs to be stopped and the `servicename_svcj` variable needs to be set to YES. To put man:syslogd[8] into a service jail, use the following sequence of commands: [source,shell] .... # service syslogd stop # sysrc syslogd_svcj=YES # service syslogd start .... If the `servicename_svcj` variable is changed, the service needs to be stopped before it is changed. If it is not stopped, the rc framework will not detect the correct state of the service and will not be able to do what is requested. Service jails are managed only via man:rc.conf[5]/man:sysrc[8] and the man:service[8] command. The jail utilities, like man:jls[8] as described in crossref:jails[jail-management,Jail Management] can be used to investigate the operation, but the man:jail[8] command is not supposed to be used to manage them. [[jail-management]] == Jail Management Once the jail is created, there are a number of operations that can be performed, like starting, rebooting or deleting the jail, installing software in it, etc. In this section the different actions that can be done with jails from the host will be described. [[list-running-jails]] === List Running Jails To list the jails that are running on the host system, the command man:jls[8] can be used: [source,shell] .... # jls .... The output should be similar to the following: .... JID IP Address Hostname Path 1 192.168.250.70 classic /usr/local/jails/containers/classic .... 
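The columnar output above can also be post-processed with standard tools, for example to feed jail names and paths into a script.
The following is a minimal sketch rather than a feature of man:jls[8]: `list_jail_paths` is a hypothetical helper name, the sample input is copied from the output above, and paths containing spaces are assumed not to occur.

```shell
#!/bin/sh
# Sketch: extract the hostname and path columns from jls(8)-style text.
# list_jail_paths is a hypothetical helper; it reads the default
# four-column layout on standard input.
list_jail_paths() {
    # Skip the header row, then print the last two fields (hostname, path).
    awk 'NR > 1 { print $(NF-1), $NF }'
}

# Sample input copied from the output above.
# On a real host, the output of jls can be piped into the function instead.
list_jail_paths <<'EOF'
   JID  IP Address      Hostname                      Path
     1  192.168.250.70  classic                       /usr/local/jails/containers/classic
EOF
```

For the sample input this prints `classic /usr/local/jails/containers/classic`.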
man:jls[8] supports the `--libxo` argument, which through the man:libxo[3] library allows other types of formats to be displayed, such as `JSON`, `HTML`, etc. For example, execute the following command to get the `JSON` output: [source,shell] .... # jls --libxo=json .... The output should be similar to the following: .... {"__version": "2", "jail-information": {"jail": [{"jid":1,"ipv4":"192.168.250.70","hostname":"classic","path":"/usr/local/jails/containers/classic"}]}} .... [[start-jail]] === Start, Restart, and Stop a Jail man:service[8] is used to start, reboot, or stop a jail on the host. For example, to start a jail, run the following command: [source,shell] .... # service jail start jailname .... Change the `start` argument to `restart` or `stop` to perform other actions on the jail. [[destroy-jail]] === Destroy a Jail Destroying a jail is not as simple as stopping the jail using man:service[8] and removing the jail directory and [.filename]#/etc/jail.conf# entry. FreeBSD takes system security very seriously. For this reason there are certain files that not even the root user can delete. This functionality is known as File Flags. The first step is to stop the desired jail executing the following command: [source,shell] .... # service jail stop jailname .... The second step is to remove these flags with man:chflags[1] by executing the following command, in which `classic` is the name of the jail to remove: [source,shell] .... # chflags -R 0 /usr/local/jails/containers/classic .... The third step is to delete the directory where the jail was: [source,shell] .... # rm -rf /usr/local/jails/containers/classic .... Finally, it will be necessary to remove the jail entry in [.filename]#/etc/jail.conf# or in [.filename]#jail.conf.d#. [[handle-packages-jail]] === Handle Packages in a Jail The man:pkg[8] tool supports the `-j` argument in order to handle packages installed inside the jail. 
For example, to install package:www/nginx-lite[] in the jail, execute the following command *from the host*:

[source,shell]
....
# pkg -j classic install nginx-lite
....

For more information on working with packages in FreeBSD, see crossref:ports[ports,"Installing Applications: Packages and Ports"].

[[access-jail]]
=== Access a Jail

While it has been stated above that it is best to manage jails from the host system, a jail can be entered by running man:jexec[8] from the host:

[source,shell]
....
# jexec -u root jailname
....

When gaining access to the jail, the message configured in man:motd[5] will be displayed.

[[execute-commands-jail]]
=== Execute Commands in a Jail

To execute a command in a jail from the host system, man:jexec[8] can be used.

For example, to stop a service that is running inside a jail, execute the following command:

[source,shell]
....
# jexec -l jailname service nginx stop
....

[[jail-upgrading]]
== Jail Upgrading

Upgrading FreeBSD Jails ensures that the isolated environments remain secure, up-to-date, and in line with the latest features and improvements available in the FreeBSD ecosystem.

[[jails-updating]]
=== Upgrading a Classic Jail or a Thin Jail using OpenZFS Snapshots

Jails *must be updated from the host* operating system.
The default behavior in FreeBSD is to disallow the use of man:chflags[1] in a jail.
This will prevent the update of some files, so updating from within the jail will fail.

To update the jail to the latest patch release of the version of FreeBSD it is running, execute the following commands on the host:

[source,shell]
....
# freebsd-update -j classic fetch install
# service jail restart classic
....

To upgrade the jail to a new major or minor version, first upgrade the host system as described in crossref:cutting-edge[freebsdupdate-upgrade,"Performing Major and Minor Version Upgrades"].
Once the host has been upgraded and rebooted, the jail can then be upgraded.
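Because the release running in a jail cannot be newer than the release running on the host, it may be worth comparing the two before starting an upgrade.
The following is a minimal sketch, not part of man:freebsd-update[8]: `release_not_newer` is a hypothetical helper name, the two release strings are sample values, and only the major and minor numbers are compared.

```shell
#!/bin/sh
# Hypothetical helper: succeed when release $1 is not newer than release $2.
# Only the MAJOR.MINOR part is compared; patch levels (-pN) are ignored.
release_not_newer() {
    a=${1%%-*}                      # "13.2-RELEASE-p4" -> "13.2"
    b=${2%%-*}
    a_major=${a%%.*}; a_minor=${a#*.}
    b_major=${b%%.*}; b_minor=${b#*.}
    [ "$a_major" -lt "$b_major" ] ||
        { [ "$a_major" -eq "$b_major" ] && [ "$a_minor" -le "$b_minor" ]; }
}

# Sample values (assumptions); on a real host the second one could be taken
# from the output of freebsd-version(1).
target="13.2-RELEASE"
host="13.2-RELEASE-p4"

if release_not_newer "$target" "$host"; then
    echo "ok: jail target $target is not newer than host $host"
else
    echo "refusing: jail target $target is newer than host $host"
fi
```

Comparing only the major and minor numbers keeps the check simple, since patch levels do not affect whether a jail release is supported by the host.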
[TIP] ==== In case of upgrade from one version to another, it is easier to create a new jail than to upgrade completely. ==== For example to upgrade from 13.1-RELEASE to 13.2-RELEASE, execute the following commands on the host: [source,shell] .... # freebsd-update -j classic -r 13.2-RELEASE upgrade # freebsd-update -j classic install # service jail restart classic # freebsd-update -j classic install # service jail restart classic .... [NOTE] ==== It is necessary to execute the `install` step two times. The first one upgrades the kernel, and the second one upgrades the rest of the components. ==== Then, if it was a major version upgrade, reinstall all installed packages and restart the jail again. This is required because the ABI version changes when upgrading between major versions of FreeBSD. From the host: [source,shell] .... # pkg -j jailname upgrade -f # service jail restart jailname .... [[upgrading-thin-jail]] === Upgrading a Thin Jail Using NullFS Since Thin Jails that use NullFS share the majority of system directories, they are very easy to update. It is enough to update the template. This allows updating multiple jails at the same time. To update the template to the latest patch release of the version of FreeBSD it is running, execute the following commands on the host: [source,shell] .... # freebsd-update -b /usr/local/jails/templates/13.1-RELEASE-base/ fetch install # service jail restart .... To upgrade the template to a new major or minor version, first upgrade the host system as described in crossref:cutting-edge[freebsdupdate-upgrade,"Performing Major and Minor Version Upgrades"]. Once the host has been upgraded and rebooted, the template can then be upgraded. For example, to upgrade from 13.1-RELEASE to 13.2-RELEASE, execute the following commands on the host: [source,shell] .... 
# freebsd-update -b /usr/local/jails/templates/13.1-RELEASE-base/ -r 13.2-RELEASE upgrade
# freebsd-update -b /usr/local/jails/templates/13.1-RELEASE-base/ install
# service jail restart
# freebsd-update -b /usr/local/jails/templates/13.1-RELEASE-base/ install
# service jail restart
....

[[jail-resource-limits]]
== Jail Resource Limits

Controlling the resources that a jail uses from the host system is a task to be taken into account by the system administrator.

Use man:rctl[8] to manage the resources that a jail can use from the host system.

[TIP]
====
The `kern.racct.enable` tunable must be enabled in [.filename]#/boot/loader.conf#.
====

The syntax to limit the resources of a jail is as follows:

[.programlisting]
....
rctl -a jail:<jailname>:resource:action=amount/percentage
....

For example, to limit the maximum RAM that a jail can access, run the following command:

[source,shell]
....
# rctl -a jail:classic:memoryuse:deny=2G
....

To make the limitation persistent across reboots of the host system, it will be necessary to add the rule to the [.filename]#/etc/rctl.conf# file as follows:

[.programlisting]
....
jail:classic:memoryuse:deny=2G/jail
....

More information on resource limits can be found in the security chapter in the crossref:security[security-resourcelimits,"Resource Limits section"].

[[jail-managers-and-containers]]
== Jail Managers and Containers

As previously explained, each type of FreeBSD Jail can be created and configured manually, but FreeBSD also has third-party utilities to make configuration and administration easier.
Below is an incomplete list of the different FreeBSD Jail managers: .Jail Managers [options="header", cols="1,1,1,1"] |=== | Name | License | Package | Documentation | BastilleBSD | BSD-3 | package:sysutils/bastille[] | link:https://bastille.readthedocs.io/en/latest/[Documentation] | pot | BSD-3 | package:sysutils/pot[] | link:https://pot.pizzamig.dev/[Documentation] | cbsd | BSD-2 | package:sysutils/cbsd[] | link:https://github.com/cbsd/cbsd[Documentation] | AppJail | BSD-3 | package:sysutils/appjail[], for devel package:sysutils/appjail-devel[] | link:https://github.com/DtxdF/AppJail#getting-started[Documentation] | iocage | BSD-2 | package:sysutils/iocage[] | link:https://freebsd.github.io/iocage/[Documentation] | ezjail | link:https://erdgeist.org/beerware.html[Beer Ware] | package:sysutils/ezjail[] | link:https://erdgeist.org/arts/software/ezjail/[Documentation] |=== diff --git a/documentation/content/en/books/handbook/zfs/_index.adoc b/documentation/content/en/books/handbook/zfs/_index.adoc index 22d41f6468..587b6510e2 100644 --- a/documentation/content/en/books/handbook/zfs/_index.adoc +++ b/documentation/content/en/books/handbook/zfs/_index.adoc @@ -1,3034 +1,3034 @@ --- title: Chapter 22. The Z File System (ZFS) part: Part III. 
System Administration
prev: books/handbook/geom
next: books/handbook/filesystems
description: ZFS is an advanced file system designed to solve major problems found in previous storage subsystem software
tags: ["ZFS", "filesystem", "administration", "zpool", "features", "terminology", "RAID-Z"]
showBookMenu: true
weight: 26
params:
  path: "/books/handbook/zfs/"
---

[[zfs]]
= The Z File System (ZFS)
:doctype: book
:toc: macro
:toclevels: 1
:icons: font
:sectnums:
:sectnumlevels: 6
:sectnumoffset: 22
:partnums:
:source-highlighter: rouge
:experimental:
:images-path: books/handbook/zfs/

ifdef::env-beastie[]
ifdef::backend-html5[]
:imagesdir: ../../../../images/{images-path}
endif::[]
ifndef::book[]
include::shared/authors.adoc[]
include::shared/mirrors.adoc[]
include::shared/releases.adoc[]
include::shared/attributes/attributes-{{% lang %}}.adoc[]
include::shared/{{% lang %}}/teams.adoc[]
include::shared/{{% lang %}}/mailing-lists.adoc[]
include::shared/{{% lang %}}/urls.adoc[]
toc::[]
endif::[]
ifdef::backend-pdf,backend-epub3[]
include::../../../../../shared/asciidoctor.adoc[]
endif::[]
endif::[]

ifndef::env-beastie[]
toc::[]
include::../../../../../shared/asciidoctor.adoc[]
endif::[]

ZFS is an advanced file system designed to solve major problems found in previous storage subsystem software.

Originally developed at Sun(TM), ongoing open source ZFS development has moved to the http://open-zfs.org[OpenZFS Project].

ZFS has three major design goals:

* Data integrity: All data includes a crossref:zfs[zfs-term-checksum,checksum] of the data. ZFS calculates checksums and writes them along with the data. When reading that data later, ZFS recalculates the checksums. If the checksums do not match, which means one or more data errors have been detected, ZFS will attempt to automatically correct the errors when ditto-, mirror-, or parity-blocks are available.
* Pooled storage: physical storage devices are added to a pool, and storage space is allocated from that shared pool.
Space is available to all file systems and volumes, and increases by adding new storage devices to the pool.
* Performance: caching mechanisms provide increased performance.
crossref:zfs[zfs-term-arc,ARC] is an advanced memory-based read cache.
ZFS provides a second level disk-based read cache with crossref:zfs[zfs-term-l2arc,L2ARC], and a disk-based synchronous write cache named crossref:zfs[zfs-term-zil,ZIL].

A complete list of features and terminology is in crossref:zfs[zfs-term, ZFS Features and Terminology].

[[zfs-differences]]
== What Makes ZFS Different

More than a file system, ZFS is fundamentally different from traditional file systems.
Combining the traditionally separate roles of volume manager and file system provides ZFS with unique advantages.
The file system is now aware of the underlying structure of the disks.
A traditional file system could exist on only one disk at a time.
With two disks, creating two separate file systems was necessary.
A traditional hardware RAID configuration avoided this problem by presenting the operating system with a single logical disk made up of the space provided by the physical disks, on top of which the operating system placed a file system.
Even with software RAID solutions like those provided by GEOM, the UFS file system living on top of the RAID believes it is dealing with a single device.
ZFS' combination of the volume manager and the file system solves this and allows the creation of file systems that all share a pool of available storage.
One big advantage of ZFS' awareness of the physical disk layout is that existing file systems grow automatically when adding extra disks to the pool.
This new space then becomes available to the file systems.
ZFS can also apply different properties to each file system.
This makes it useful to create separate file systems and datasets instead of a single monolithic file system.
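As a rough illustration of per-dataset properties, the sketch below prints the commands that would create one dataset per workload.
The pool name `mypool`, the dataset names, and the chosen property values are hypothetical, and `plan` is a helper defined here that only echoes its arguments, so nothing is changed on the system.

```shell
#!/bin/sh
# Dry-run helper: print each command instead of executing it.
plan() { printf '+ %s\n' "$*"; }

# Hypothetical layout: one dataset per workload, each with its own property.
plan zfs create -o compression=lz4 mypool/logs
plan zfs create -o atime=off mypool/backups
plan zfs create -o quota=10G mypool/home
```

Removing the `plan` prefix would run the commands for real, which assumes a pool named `mypool` already exists.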
[[zfs-quickstart]] == Quick Start Guide FreeBSD can mount ZFS pools and datasets during system initialization. To enable it, add this line to [.filename]#/etc/rc.conf#: [.programlisting] .... zfs_enable="YES" .... Then start the service: [source,shell] .... # service zfs start .... The examples in this section assume three SCSI disks with the device names [.filename]#da0#, [.filename]#da1#, and [.filename]#da2#. Users of SATA hardware should instead use [.filename]#ada# device names. [[zfs-quickstart-single-disk-pool]] === Single Disk Pool To create a simple, non-redundant pool using a single disk device: [source,shell] .... # zpool create example /dev/da0 .... To view the new pool, review the output of `df`: [source,shell] .... # df Filesystem 1K-blocks Used Avail Capacity Mounted on /dev/ad0s1a 2026030 235230 1628718 13% / devfs 1 1 0 100% /dev /dev/ad0s1d 54098308 1032846 48737598 2% /usr example 17547136 0 17547136 0% /example .... This output shows creating and mounting of the `example` pool, and that is now accessible as a file system. Create files for users to browse: [source,shell] .... # cd /example # ls # touch testfile # ls -al total 4 drwxr-xr-x 2 root wheel 3 Aug 29 23:15 . drwxr-xr-x 21 root wheel 512 Aug 29 23:12 .. -rw-r--r-- 1 root wheel 0 Aug 29 23:15 testfile .... This pool is not using any advanced ZFS features and properties yet. To create a dataset on this pool with compression enabled: [source,shell] .... # zfs create example/compressed # zfs set compression=gzip example/compressed .... The `example/compressed` dataset is now a ZFS compressed file system. Try copying some large files to [.filename]#/example/compressed#. Disable compression with: [source,shell] .... # zfs set compression=off example/compressed .... To unmount a file system, use `zfs umount` and then verify with `df`: [source,shell] .... 
# zfs umount example/compressed # df Filesystem 1K-blocks Used Avail Capacity Mounted on /dev/ad0s1a 2026030 235232 1628716 13% / devfs 1 1 0 100% /dev /dev/ad0s1d 54098308 1032864 48737580 2% /usr example 17547008 0 17547008 0% /example .... To re-mount the file system to make it accessible again, use `zfs mount` and verify with `df`: [source,shell] .... # zfs mount example/compressed # df Filesystem 1K-blocks Used Avail Capacity Mounted on /dev/ad0s1a 2026030 235234 1628714 13% / devfs 1 1 0 100% /dev /dev/ad0s1d 54098308 1032864 48737580 2% /usr example 17547008 0 17547008 0% /example example/compressed 17547008 0 17547008 0% /example/compressed .... Running `mount` shows the pool and file systems: [source,shell] .... # mount /dev/ad0s1a on / (ufs, local) devfs on /dev (devfs, local) /dev/ad0s1d on /usr (ufs, local, soft-updates) example on /example (zfs, local) example/compressed on /example/compressed (zfs, local) .... Use ZFS datasets like any file system after creation. Set other available features on a per-dataset basis when needed. The example below creates a new file system called `data`. It assumes the file system contains important files and configures it to store two copies of each data block. [source,shell] .... # zfs create example/data # zfs set copies=2 example/data .... Use `df` to see the data and space usage: [source,shell] .... # df Filesystem 1K-blocks Used Avail Capacity Mounted on /dev/ad0s1a 2026030 235234 1628714 13% / devfs 1 1 0 100% /dev /dev/ad0s1d 54098308 1032864 48737580 2% /usr example 17547008 0 17547008 0% /example example/compressed 17547008 0 17547008 0% /example/compressed example/data 17547008 0 17547008 0% /example/data .... Notice that all file systems in the pool have the same available space. Using `df` in these examples shows that the file systems use the space they need and all draw from the same pool. ZFS gets rid of concepts such as volumes and partitions, and allows several file systems to share the same pool. 
To destroy the file systems and then the pool once they are no longer needed:

[source,shell]
....
# zfs destroy example/compressed
# zfs destroy example/data
# zpool destroy example
....

[[zfs-quickstart-raid-z]]
=== RAID-Z

Disks fail.
One way to avoid data loss from disk failure is to use RAID.
ZFS supports this feature in its pool design.
RAID-Z pools require three or more disks but provide more usable space than mirrored pools.

This example creates a RAID-Z pool, specifying the disks to add to the pool:

[source,shell]
....
# zpool create storage raidz da0 da1 da2
....

[NOTE]
====
Sun(TM) recommends that the number of devices used in a RAID-Z configuration be between three and nine.
For environments requiring a single pool consisting of 10 disks or more, consider breaking it up into smaller RAID-Z groups.
If only two disks are available, ZFS mirroring provides redundancy if required.
Refer to man:zpool[8] for more details.
====

The previous example created the `storage` zpool.
This example makes a new file system called `home` in that pool:

[source,shell]
....
# zfs create storage/home
....

Enable compression and store an extra copy of directories and files:

[source,shell]
....
# zfs set copies=2 storage/home
# zfs set compression=gzip storage/home
....

To make this the new home directory for users, copy the user data to this directory and create the appropriate symbolic links:

[source,shell]
....
# cp -rp /home/* /storage/home
# rm -rf /home /usr/home
# ln -s /storage/home /home
# ln -s /storage/home /usr/home
....

User data is now stored on the freshly created [.filename]#/storage/home#.
Test by adding a new user and logging in as that user.

Create a file system snapshot to roll back to later:

[source,shell]
....
# zfs snapshot storage/home@08-30-08
....

ZFS creates snapshots of a dataset, not a single directory or file.
The `@` character is a delimiter between the file system or volume name and the snapshot name.
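A small helper can build snapshot names so that scripted snapshots follow one convention.
The sketch below is only one possible approach: `snapshot_name` is a hypothetical function, and the ISO 8601 date label is an assumption chosen because it sorts chronologically, unlike the `08-30-08` label used above.
It prints the `zfs snapshot` command rather than running it.

```shell
#!/bin/sh
# snapshot_name DATASET [LABEL]: print DATASET@LABEL, defaulting LABEL to
# today's date in ISO 8601 form so that names sort chronologically.
snapshot_name() {
    dataset=$1
    label=${2:-$(date +%Y-%m-%d)}
    printf '%s@%s\n' "$dataset" "$label"
}

# Print the command rather than run it, so the sketch is safe to try anywhere.
echo "zfs snapshot $(snapshot_name storage/home 2008-08-30)"
```

Calling `snapshot_name storage/home` without a label uses the current date, which suits a daily job.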
After accidentally deleting an important directory, back up the file system, then roll back to an earlier snapshot in which the directory still exists:

[source,shell]
....
# zfs rollback storage/home@08-30-08
....

To list all available snapshots, run `ls` in the file system's [.filename]#.zfs/snapshot# directory.
For example, to see the snapshot taken:

[source,shell]
....
# ls /storage/home/.zfs/snapshot
....

Write a script to take regular snapshots of user data.
Over time, snapshots can use up a lot of disk space.
Remove the previous snapshot using the command:

[source,shell]
....
# zfs destroy storage/home@08-30-08
....

After testing, make [.filename]#/storage/home# the real [.filename]#/home# with this command:

[source,shell]
....
# zfs set mountpoint=/home storage/home
....

Run `df` and `mount` to confirm that the system now treats the file system as the real [.filename]#/home#:

[source,shell]
....
# mount
/dev/ad0s1a on / (ufs, local)
devfs on /dev (devfs, local)
/dev/ad0s1d on /usr (ufs, local, soft-updates)
storage on /storage (zfs, local)
storage/home on /home (zfs, local)
# df
Filesystem   1K-blocks    Used    Avail Capacity  Mounted on
/dev/ad0s1a    2026030  235240  1628708    13%    /
devfs                1       1        0   100%    /dev
/dev/ad0s1d   54098308 1032826 48737618     2%    /usr
storage       26320512       0 26320512     0%    /storage
storage/home  26320512       0 26320512     0%    /home
....

This completes the RAID-Z configuration.
Add daily status updates about the created file systems to the nightly man:periodic[8] runs by adding this line to [.filename]#/etc/periodic.conf#:

[.programlisting]
....
daily_status_zfs_enable="YES"
....

[[zfs-quickstart-recovering-raid-z]]
=== Recovering RAID-Z

Every software RAID has a method of monitoring its `state`.
View the status of RAID-Z devices using:

[source,shell]
....
# zpool status -x
....

If all pools are crossref:zfs[zfs-term-online,Online] and everything is normal, the message shows:

[source,shell]
....
all pools are healthy
....
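A monitoring script can key off this message.
The sketch below shows one way to do so, assuming the all-clear text is exactly the line shown above; `pool_health` is a hypothetical helper name, and the two `printf` calls stand in for real `zpool status -x` output.

```shell
#!/bin/sh
# pool_health: read `zpool status -x` output on stdin and print "healthy" when
# it is exactly the one-line all-clear message, "attention" otherwise.
pool_health() {
    if [ "$(cat)" = "all pools are healthy" ]; then
        echo healthy
    else
        echo attention
    fi
}

# On a live system the input would come from: zpool status -x | pool_health
printf 'all pools are healthy\n' | pool_health
printf '  pool: storage\n state: DEGRADED\n' | pool_health
```

Treating anything other than the exact message as a problem is a deliberately cautious choice, so unexpected output also raises attention.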
If there is a problem, perhaps a disk being in the crossref:zfs[zfs-term-offline,Offline] state, the pool state will look like this: [source,shell] .... pool: storage state: DEGRADED status: One or more devices has been taken offline by the administrator. Sufficient replicas exist for the pool to continue functioning in a degraded state. action: Online the device using 'zpool online' or replace the device with 'zpool replace'. scrub: none requested config: NAME STATE READ WRITE CKSUM storage DEGRADED 0 0 0 raidz1 DEGRADED 0 0 0 da0 ONLINE 0 0 0 da1 OFFLINE 0 0 0 da2 ONLINE 0 0 0 errors: No known data errors .... "OFFLINE" shows the administrator took [.filename]#da1# offline using: [source,shell] .... # zpool offline storage da1 .... Power down the computer now and replace [.filename]#da1#. Power up the computer and return [.filename]#da1# to the pool: [source,shell] .... # zpool replace storage da1 .... Next, check the status again, this time without `-x` to display all pools: [source,shell] .... # zpool status storage pool: storage state: ONLINE scrub: resilver completed with 0 errors on Sat Aug 30 19:44:11 2008 config: NAME STATE READ WRITE CKSUM storage ONLINE 0 0 0 raidz1 ONLINE 0 0 0 da0 ONLINE 0 0 0 da1 ONLINE 0 0 0 da2 ONLINE 0 0 0 errors: No known data errors .... In this example, everything is normal. [[zfs-quickstart-data-verification]] === Data Verification ZFS uses checksums to verify the integrity of stored data. Creating file systems automatically enables them. [WARNING] ==== Disabling Checksums is possible but _not_ recommended! Checksums take little storage space and provide data integrity. Most ZFS features will not work properly with checksums disabled. Disabling these checksums will not increase performance noticeably. ==== Verifying the data checksums (called _scrubbing_) ensures integrity of the `storage` pool with: [source,shell] .... # zpool scrub storage .... The duration of a scrub depends on the amount of data stored. 
Larger amounts of data will take proportionally longer to verify.
Since scrubbing is I/O intensive, ZFS allows only a single scrub to run at a time.
After scrubbing completes, view the status with `zpool status`:

[source,shell]
....
# zpool status storage
 pool: storage
 state: ONLINE
 scrub: scrub completed with 0 errors on Sat Jan 26 19:57:37 2013
config:

        NAME        STATE     READ WRITE CKSUM
        storage     ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            da0     ONLINE       0     0     0
            da1     ONLINE       0     0     0
            da2     ONLINE       0     0     0

errors: No known data errors
....

The completion date of the last scrub, shown in the output, helps decide when to start another.
Routine scrubs help protect data from silent corruption and ensure the integrity of the pool.

Refer to man:zfs[8] and man:zpool[8] for other ZFS options.

[[zfs-zpool]]
== `zpool` Administration

ZFS administration uses two main utilities.
The `zpool` utility controls the operation of the pool and allows adding, removing, replacing, and managing disks.
The crossref:zfs[zfs-zfs,`zfs`] utility allows creating, destroying, and managing datasets, both crossref:zfs[zfs-term-filesystem,file systems] and crossref:zfs[zfs-term-volume,volumes].

[[zfs-zpool-create]]
=== Creating and Destroying Storage Pools

Creating a ZFS storage pool requires permanent decisions, as the pool structure cannot change after creation.
The most important decision is which types of vdevs to group the physical disks into.
See the list of crossref:zfs[zfs-term-vdev,vdev types] for details about the possible options.
After creating the pool, most vdev types do not allow adding disks to the vdev.
The exceptions are mirrors, which allow adding new disks to the vdev, and stripes, which upgrade to mirrors by attaching a new disk to the vdev.
Although adding new vdevs expands a pool, the pool layout cannot change after pool creation.
To change the layout, back up the data, destroy the pool, and recreate it.

Create a simple mirror pool:

[source,shell]
....
# zpool create mypool mirror /dev/ada1 /dev/ada2 # zpool status pool: mypool state: ONLINE scan: none requested config: NAME STATE READ WRITE CKSUM mypool ONLINE 0 0 0 mirror-0 ONLINE 0 0 0 ada1 ONLINE 0 0 0 ada2 ONLINE 0 0 0 errors: No known data errors .... To create more than one vdev with a single command, specify groups of disks separated by the vdev type keyword, `mirror` in this example: [source,shell] .... # zpool create mypool mirror /dev/ada1 /dev/ada2 mirror /dev/ada3 /dev/ada4 # zpool status pool: mypool state: ONLINE scan: none requested config: NAME STATE READ WRITE CKSUM mypool ONLINE 0 0 0 mirror-0 ONLINE 0 0 0 ada1 ONLINE 0 0 0 ada2 ONLINE 0 0 0 mirror-1 ONLINE 0 0 0 ada3 ONLINE 0 0 0 ada4 ONLINE 0 0 0 errors: No known data errors .... Pools can also use partitions rather than whole disks. Putting ZFS in a separate partition allows the same disk to have other partitions for other purposes. In particular, it allows adding partitions with bootcode and file systems needed for booting. This allows booting from disks that are also members of a pool. ZFS adds no performance penalty on FreeBSD when using a partition rather than a whole disk. Using partitions also allows the administrator to _under-provision_ the disks, using less than the full capacity. If a future replacement disk of the same nominal size as the original actually has a slightly smaller capacity, the smaller partition will still fit, using the replacement disk. Create a crossref:zfs[zfs-term-vdev-raidz,RAID-Z2] pool using partitions: [source,shell] .... # zpool create mypool raidz2 /dev/ada0p3 /dev/ada1p3 /dev/ada2p3 /dev/ada3p3 /dev/ada4p3 /dev/ada5p3 # zpool status pool: mypool state: ONLINE scan: none requested config: NAME STATE READ WRITE CKSUM mypool ONLINE 0 0 0 raidz2-0 ONLINE 0 0 0 ada0p3 ONLINE 0 0 0 ada1p3 ONLINE 0 0 0 ada2p3 ONLINE 0 0 0 ada3p3 ONLINE 0 0 0 ada4p3 ONLINE 0 0 0 ada5p3 ONLINE 0 0 0 errors: No known data errors .... 
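The under-provisioning idea can be sketched as a small size calculation.
Everything specific here is an assumption made for illustration: the helper name `partition_size_mb`, the 2% reserve, the 1,000,000 MB disk, the label `zfs0`, and the device `ada0`.
The script only prints a man:gpart[8] command; it does not partition anything.

```shell
#!/bin/sh
# partition_size_mb DISK_MB RESERVE_PCT: size in MB of a partition that leaves
# RESERVE_PCT percent of the disk unused (integer arithmetic, rounded down).
partition_size_mb() {
    echo $(( $1 * (100 - $2) / 100 ))
}

size=$(partition_size_mb 1000000 2)
# Print, rather than run, the matching gpart command for a hypothetical disk.
echo "gpart add -t freebsd-zfs -s ${size}M -l zfs0 ada0"
```

A replacement disk that is slightly smaller than its nominal size can still hold a partition of this reduced size.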
Destroy a pool that is no longer needed to reuse the disks. Destroying a pool requires unmounting the file systems in that pool first. If any dataset is in use, the unmount operation fails without destroying the pool. Force the pool destruction with `-f`. This can cause undefined behavior in applications which had open files on those datasets. [[zfs-zpool-attach]] === Adding and Removing Devices Two ways exist for adding disks to a pool: attaching a disk to an existing vdev with `zpool attach`, or adding vdevs to the pool with `zpool add`. Some crossref:zfs[zfs-term-vdev,vdev types] allow adding disks to the vdev after creation. A pool created with a single disk lacks redundancy. It can detect corruption but can not repair it, because there is no other copy of the data. The crossref:zfs[zfs-term-copies,copies] property may be able to recover from a small failure such as a bad sector, but does not provide the same level of protection as mirroring or RAID-Z. Starting with a pool consisting of a single disk vdev, use `zpool attach` to add a new disk to the vdev, creating a mirror. Also use `zpool attach` to add new disks to a mirror group, increasing redundancy and read performance. When partitioning the disks used for the pool, replicate the layout of the first disk on to the second. Use `gpart backup` and `gpart restore` to make this process easier. Upgrade the single disk (stripe) vdev [.filename]#ada0p3# to a mirror by attaching [.filename]#ada1p3#: [source,shell] .... # zpool status pool: mypool state: ONLINE scan: none requested config: NAME STATE READ WRITE CKSUM mypool ONLINE 0 0 0 ada0p3 ONLINE 0 0 0 errors: No known data errors # zpool attach mypool ada0p3 ada1p3 Make sure to wait until resilvering finishes before rebooting. If you boot from pool 'mypool', you may need to update boot code on newly attached disk _ada1p3_. 
Assuming you use GPT partitioning and _da0_ is your new boot disk you may use the following command: gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 da0 # gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada1 bootcode written to ada1 # zpool status pool: mypool state: ONLINE status: One or more devices is currently being resilvered. The pool will continue to function, possibly in a degraded state. action: Wait for the resilver to complete. scan: resilver in progress since Fri May 30 08:19:19 2014 527M scanned out of 781M at 47.9M/s, 0h0m to go 527M resilvered, 67.53% done config: NAME STATE READ WRITE CKSUM mypool ONLINE 0 0 0 mirror-0 ONLINE 0 0 0 ada0p3 ONLINE 0 0 0 ada1p3 ONLINE 0 0 0 (resilvering) errors: No known data errors # zpool status pool: mypool state: ONLINE scan: resilvered 781M in 0h0m with 0 errors on Fri May 30 08:15:58 2014 config: NAME STATE READ WRITE CKSUM mypool ONLINE 0 0 0 mirror-0 ONLINE 0 0 0 ada0p3 ONLINE 0 0 0 ada1p3 ONLINE 0 0 0 errors: No known data errors .... When adding disks to the existing vdev is not an option, as for RAID-Z, an alternative method is to add another vdev to the pool. Adding vdevs provides higher performance by distributing writes across the vdevs. Each vdev provides its own redundancy. Mixing vdev types like `mirror` and `RAID-Z` is possible but discouraged. Adding a non-redundant vdev to a pool containing mirror or RAID-Z vdevs risks the data on the entire pool. Distributing writes means a failure of the non-redundant disk will result in the loss of a fraction of every block written to the pool. ZFS stripes data across each of the vdevs. For example, with two mirror vdevs, this is effectively a RAID 10 that stripes writes across two sets of mirrors. ZFS allocates space so that each vdev reaches 100% full at the same time. Having vdevs with different amounts of free space will lower performance, as more data writes go to the less full vdev. 
When attaching new devices to a boot pool, remember to update the bootcode.

Add a second mirror group ([.filename]#ada2p3# and [.filename]#ada3p3#) to the pool that holds the existing mirror:

[source,shell]
....
# zpool status
  pool: mypool
 state: ONLINE
  scan: resilvered 781M in 0h0m with 0 errors on Fri May 30 08:19:35 2014
config:

        NAME        STATE     READ WRITE CKSUM
        mypool      ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            ada0p3  ONLINE       0     0     0
            ada1p3  ONLINE       0     0     0

errors: No known data errors
# zpool add mypool mirror ada2p3 ada3p3
# gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada2
bootcode written to ada2
# gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada3
bootcode written to ada3
# zpool status
  pool: mypool
 state: ONLINE
  scan: scrub repaired 0 in 0h0m with 0 errors on Fri May 30 08:29:51 2014
config:

        NAME        STATE     READ WRITE CKSUM
        mypool      ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            ada0p3  ONLINE       0     0     0
            ada1p3  ONLINE       0     0     0
          mirror-1  ONLINE       0     0     0
            ada2p3  ONLINE       0     0     0
            ada3p3  ONLINE       0     0     0

errors: No known data errors
....

Removing vdevs from a pool is impossible, and removing disks from a mirror is possible only if there is enough remaining redundancy.
If a single disk remains in a mirror group, that group ceases to be a mirror and becomes a stripe, risking the entire pool if that remaining disk fails.

Remove a disk from a three-way mirror group:

[source,shell]
....
# zpool status
  pool: mypool
 state: ONLINE
  scan: scrub repaired 0 in 0h0m with 0 errors on Fri May 30 08:29:51 2014
config:

        NAME        STATE     READ WRITE CKSUM
        mypool      ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            ada0p3  ONLINE       0     0     0
            ada1p3  ONLINE       0     0     0
            ada2p3  ONLINE       0     0     0

errors: No known data errors
# zpool detach mypool ada2p3
# zpool status
  pool: mypool
 state: ONLINE
  scan: scrub repaired 0 in 0h0m with 0 errors on Fri May 30 08:29:51 2014
config:

        NAME        STATE     READ WRITE CKSUM
        mypool      ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            ada0p3  ONLINE       0     0     0
            ada1p3  ONLINE       0     0     0

errors: No known data errors
....
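Before detaching a disk, a script can check how many members a mirror has.
The sketch below is a minimal example under stated assumptions: `count_mirror_members` is a hypothetical helper, it relies on the indentation of `zpool status` output to find the lines below a vdev, and it counts every deeper line, so it would overcount during a replace operation.
The sample text stands in for live `zpool status` output.

```shell
#!/bin/sh
# count_mirror_members VDEV: read `zpool status` text on stdin and print the
# number of device lines indented below the named vdev.
count_mirror_members() {
    awk -v vdev="$1" '
        {
            # Depth is the amount of leading whitespace on the line.
            rest = $0
            sub(/^[ \t]+/, "", rest)
            depth = length($0) - length(rest)
        }
        $1 == vdev { inside = 1; vdepth = depth; next }
        inside && NF > 0 && depth > vdepth { n++; next }
        inside { inside = 0 }
        END { print n + 0 }
    '
}

# Sample input modelled on the three-way mirror shown above.
sample='        NAME        STATE     READ WRITE CKSUM
        mypool      ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            ada0p3  ONLINE       0     0     0
            ada1p3  ONLINE       0     0     0
            ada2p3  ONLINE       0     0     0

errors: No known data errors'

members=$(printf '%s\n' "$sample" | count_mirror_members mirror-0)
if [ "$members" -gt 2 ]; then
    echo "mirror-0 has $members disks: detaching one keeps redundancy"
else
    echo "mirror-0 has $members disks: detaching one would leave a stripe"
fi
```

On a live system, replacing the sample with `zpool status mypool | count_mirror_members mirror-0` would apply the same check to the real pool.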
[[zfs-zpool-status]] === Checking the Status of a Pool Pool status is important. If a drive goes offline or ZFS detects a read, write, or checksum error, the corresponding error count increases. The `status` output shows the configuration and status of each device in the pool and the status of the entire pool. Actions to take and details about the last crossref:zfs[zfs-zpool-scrub,`scrub`] are also shown. [source,shell] .... # zpool status pool: mypool state: ONLINE scan: scrub repaired 0 in 2h25m with 0 errors on Sat Sep 14 04:25:50 2013 config: NAME STATE READ WRITE CKSUM mypool ONLINE 0 0 0 raidz2-0 ONLINE 0 0 0 ada0p3 ONLINE 0 0 0 ada1p3 ONLINE 0 0 0 ada2p3 ONLINE 0 0 0 ada3p3 ONLINE 0 0 0 ada4p3 ONLINE 0 0 0 ada5p3 ONLINE 0 0 0 errors: No known data errors .... [[zfs-zpool-clear]] === Clearing Errors When detecting an error, ZFS increases the read, write, or checksum error counts. Clear the error message and reset the counts with `zpool clear _mypool_`. Clearing the error state can be important for automated scripts that alert the administrator when the pool encounters an error. Without clearing old errors, the scripts may fail to report further errors. [[zfs-zpool-replace]] === Replacing a Functioning Device It may be desirable to replace one disk with a different disk. When replacing a working disk, the process keeps the old disk online during the replacement. The pool never enters a crossref:zfs[zfs-term-degraded,degraded] state, reducing the risk of data loss. Running `zpool replace` copies the data from the old disk to the new one. After the operation completes, ZFS disconnects the old disk from the vdev. If the new disk is larger than the old disk, it may be possible to grow the zpool, using the new space. See crossref:zfs[zfs-zpool-online,Growing a Pool]. Replace a functioning device in the pool: [source,shell] .... 
# zpool status pool: mypool state: ONLINE scan: none requested config: NAME STATE READ WRITE CKSUM mypool ONLINE 0 0 0 mirror-0 ONLINE 0 0 0 ada0p3 ONLINE 0 0 0 ada1p3 ONLINE 0 0 0 errors: No known data errors # zpool replace mypool ada1p3 ada2p3 Make sure to wait until resilvering finishes before rebooting. When booting from the pool 'zroot', update the boot code on the newly attached disk 'ada2p3'. Assuming GPT partitioning is used and [.filename]#da0# is the new boot disk, use the following command: gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 da0 # gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada2 # zpool status pool: mypool state: ONLINE status: One or more devices is currently being resilvered. The pool will continue to function, possibly in a degraded state. action: Wait for the resilver to complete. scan: resilver in progress since Mon Jun 2 14:21:35 2014 604M scanned out of 781M at 46.5M/s, 0h0m to go 604M resilvered, 77.39% done config: NAME STATE READ WRITE CKSUM mypool ONLINE 0 0 0 mirror-0 ONLINE 0 0 0 ada0p3 ONLINE 0 0 0 replacing-1 ONLINE 0 0 0 ada1p3 ONLINE 0 0 0 ada2p3 ONLINE 0 0 0 (resilvering) errors: No known data errors # zpool status pool: mypool state: ONLINE scan: resilvered 781M in 0h0m with 0 errors on Mon Jun 2 14:21:52 2014 config: NAME STATE READ WRITE CKSUM mypool ONLINE 0 0 0 mirror-0 ONLINE 0 0 0 ada0p3 ONLINE 0 0 0 ada2p3 ONLINE 0 0 0 errors: No known data errors .... [[zfs-zpool-resilver]] === Dealing with Failed Devices When a disk in a pool fails, the vdev to which the disk belongs enters the crossref:zfs[zfs-term-degraded,degraded] state. The data is still available, but with reduced performance because ZFS computes missing data from the available redundancy. To restore the vdev to a fully functional state, replace the failed physical device. ZFS is then instructed to begin the crossref:zfs[zfs-term-resilver,resilver] operation. 
ZFS recomputes data on the failed device from available redundancy and writes it to the replacement device. After completion, the vdev returns to crossref:zfs[zfs-term-online,online] status. If the vdev does not have any redundancy, or if devices have failed and there is not enough redundancy to compensate, the pool enters the crossref:zfs[zfs-term-faulted,faulted] state. Unless enough devices can reconnect the pool becomes inoperative requiring a data restore from backups. When replacing a failed disk, the name of the failed disk changes to the GUID of the new disk. A new device name parameter for `zpool replace` is not required if the replacement device has the same device name. Replace a failed disk using `zpool replace`: [source,shell] .... # zpool status pool: mypool state: DEGRADED status: One or more devices could not be opened. Sufficient replicas exist for the pool to continue functioning in a degraded state. action: Attach the missing device and online it using 'zpool online'. see: http://illumos.org/msg/ZFS-8000-2Q scan: none requested config: NAME STATE READ WRITE CKSUM mypool DEGRADED 0 0 0 mirror-0 DEGRADED 0 0 0 ada0p3 ONLINE 0 0 0 316502962686821739 UNAVAIL 0 0 0 was /dev/ada1p3 errors: No known data errors # zpool replace mypool 316502962686821739 ada2p3 # zpool status pool: mypool state: DEGRADED status: One or more devices is currently being resilvered. The pool will continue to function, possibly in a degraded state. action: Wait for the resilver to complete. 
scan: resilver in progress since Mon Jun 2 14:52:21 2014 641M scanned out of 781M at 49.3M/s, 0h0m to go 640M resilvered, 82.04% done config: NAME STATE READ WRITE CKSUM mypool DEGRADED 0 0 0 mirror-0 DEGRADED 0 0 0 ada0p3 ONLINE 0 0 0 replacing-1 UNAVAIL 0 0 0 15732067398082357289 UNAVAIL 0 0 0 was /dev/ada1p3/old ada2p3 ONLINE 0 0 0 (resilvering) errors: No known data errors # zpool status pool: mypool state: ONLINE scan: resilvered 781M in 0h0m with 0 errors on Mon Jun 2 14:52:38 2014 config: NAME STATE READ WRITE CKSUM mypool ONLINE 0 0 0 mirror-0 ONLINE 0 0 0 ada0p3 ONLINE 0 0 0 ada2p3 ONLINE 0 0 0 errors: No known data errors .... [[zfs-zpool-scrub]] === Scrubbing a Pool Routinely crossref:zfs[zfs-term-scrub,scrub] pools, ideally at least once every month. The `scrub` operation is disk-intensive and will reduce performance while running. Avoid high-demand periods when scheduling `scrub` or use crossref:zfs[zfs-advanced-tuning-scrub_delay,`vfs.zfs.scrub_delay`] to adjust the relative priority of the `scrub` to keep it from slowing down other workloads. [source,shell] .... # zpool scrub mypool # zpool status pool: mypool state: ONLINE scan: scrub in progress since Wed Feb 19 20:52:54 2014 116G scanned out of 8.60T at 649M/s, 3h48m to go 0 repaired, 1.32% done config: NAME STATE READ WRITE CKSUM mypool ONLINE 0 0 0 raidz2-0 ONLINE 0 0 0 ada0p3 ONLINE 0 0 0 ada1p3 ONLINE 0 0 0 ada2p3 ONLINE 0 0 0 ada3p3 ONLINE 0 0 0 ada4p3 ONLINE 0 0 0 ada5p3 ONLINE 0 0 0 errors: No known data errors .... To cancel a scrub operation if needed, run `zpool scrub -s _mypool_`. [[zfs-zpool-selfheal]] === Self-Healing The checksums stored with data blocks enable the file system to _self-heal_. This feature will automatically repair data whose checksum does not match the one recorded on another device that is part of the storage pool. For example, a mirror configuration with two disks where one drive is starting to malfunction and cannot properly store the data any more. 
This is worse when the data was not accessed for a long time, as with long-term archive storage.
Traditional file systems need to run commands that check and repair the data, like man:fsck[8].
These commands take time, and in severe cases, an administrator has to decide which repair operation to perform.
When ZFS detects a data block with a mismatched checksum, it tries to read the data from the mirror disk.
If that disk can provide the correct data, ZFS will give that to the application and correct the data on the disk with the wrong checksum.
This happens without any interaction from a system administrator during normal pool operation.

The next example shows this self-healing behavior by creating a mirrored pool of disks [.filename]#/dev/ada0# and [.filename]#/dev/ada1#.

[source,shell]
....
# zpool create healer mirror /dev/ada0 /dev/ada1
# zpool status healer
  pool: healer
 state: ONLINE
  scan: none requested
config:

    NAME        STATE     READ WRITE CKSUM
    healer      ONLINE       0     0     0
      mirror-0  ONLINE       0     0     0
       ada0     ONLINE       0     0     0
       ada1     ONLINE       0     0     0

errors: No known data errors
# zpool list
NAME     SIZE  ALLOC   FREE   CKPOINT  EXPANDSZ   FRAG   CAP  DEDUP  HEALTH  ALTROOT
healer   960M  92.5K   960M         -         -     0%    0%  1.00x  ONLINE  -
....

Copy some important data to the pool to protect from data errors using the self-healing feature and create a checksum of the pool for later comparison.

[source,shell]
....
# cp /some/important/data /healer
# zpool list
NAME     SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
healer   960M  67.7M   892M     7%  1.00x  ONLINE  -
# sha1 /healer > checksum.txt
# cat checksum.txt
SHA1 (/healer) = 2753eff56d77d9a536ece6694bf0a82740344d1f
....

Simulate data corruption by writing random data to the beginning of one of the disks in the mirror.
To keep ZFS from healing the data when detected, export the pool before the corruption and import it again afterwards.

[WARNING]
====
This is a dangerous operation that can destroy vital data, shown here for demonstration only.
*Do not try* it during normal operation of a storage pool.
Nor should this intentional corruption example be run on any disk that holds a non-ZFS file system on another partition.
Do not use any disk device names other than the ones that are part of the pool.
Ensure proper backups of the pool exist and test them before running the command!
====

[source,shell]
....
# zpool export healer
# dd if=/dev/random of=/dev/ada1 bs=1m count=200
200+0 records in
200+0 records out
209715200 bytes transferred in 62.992162 secs (3329227 bytes/sec)
# zpool import healer
....

The pool status shows that one device has experienced an error.
Note that applications reading data from the pool did not receive any incorrect data.
ZFS provided data from the [.filename]#ada0# device with the correct checksums.
To find the device with the wrong checksum, look for one whose `CKSUM` column contains a nonzero value.

[source,shell]
....
# zpool status healer
    pool: healer
   state: ONLINE
  status: One or more devices has experienced an unrecoverable error.  An
          attempt was made to correct the error.  Applications are unaffected.
  action: Determine if the device needs to be replaced, and clear the errors
          using 'zpool clear' or replace the device with 'zpool replace'.
     see: http://illumos.org/msg/ZFS-8000-4J
    scan: none requested
  config:

      NAME        STATE     READ WRITE CKSUM
      healer      ONLINE       0     0     0
        mirror-0  ONLINE       0     0     0
         ada0     ONLINE       0     0     0
         ada1     ONLINE       0     0     1

errors: No known data errors
....

ZFS detected the error and handled it by using the redundancy present in the unaffected [.filename]#ada0# mirror disk.
A checksum comparison with the original one will reveal whether the pool is consistent again.

[source,shell]
....
# sha1 /healer >> checksum.txt
# cat checksum.txt
SHA1 (/healer) = 2753eff56d77d9a536ece6694bf0a82740344d1f
SHA1 (/healer) = 2753eff56d77d9a536ece6694bf0a82740344d1f
....

The checksums generated before and after the intentional tampering with the pool data still match.
This shows how ZFS is capable of detecting and correcting any errors automatically when the checksums differ. Note this is possible with enough redundancy present in the pool. A pool consisting of a single device has no self-healing capabilities. That is also the reason why checksums are so important in ZFS; do not disable them for any reason. ZFS requires no man:fsck[8] or similar file system consistency check program to detect and correct this, and keeps the pool available while there is a problem. A scrub operation is now required to overwrite the corrupted data on [.filename]#ada1#. [source,shell] .... # zpool scrub healer # zpool status healer pool: healer state: ONLINE status: One or more devices has experienced an unrecoverable error. An attempt was made to correct the error. Applications are unaffected. action: Determine if the device needs to be replaced, and clear the errors using 'zpool clear' or replace the device with 'zpool replace'. see: http://illumos.org/msg/ZFS-8000-4J scan: scrub in progress since Mon Dec 10 12:23:30 2012 10.4M scanned out of 67.0M at 267K/s, 0h3m to go 9.63M repaired, 15.56% done config: NAME STATE READ WRITE CKSUM healer ONLINE 0 0 0 mirror-0 ONLINE 0 0 0 ada0 ONLINE 0 0 0 ada1 ONLINE 0 0 627 (repairing) errors: No known data errors .... The scrub operation reads data from [.filename]#ada0# and rewrites any data with a wrong checksum on [.filename]#ada1#, shown by the `(repairing)` output from `zpool status`. After the operation is complete, the pool status changes to: [source,shell] .... # zpool status healer pool: healer state: ONLINE status: One or more devices has experienced an unrecoverable error. An attempt was made to correct the error. Applications are unaffected. action: Determine if the device needs to be replaced, and clear the errors using 'zpool clear' or replace the device with 'zpool replace'. 
see: http://illumos.org/msg/ZFS-8000-4J scan: scrub repaired 66.5M in 0h2m with 0 errors on Mon Dec 10 12:26:25 2012 config: NAME STATE READ WRITE CKSUM healer ONLINE 0 0 0 mirror-0 ONLINE 0 0 0 ada0 ONLINE 0 0 0 ada1 ONLINE 0 0 2.72K errors: No known data errors .... After the scrubbing operation completes with all the data synchronized from [.filename]#ada0# to [.filename]#ada1#, crossref:zfs[zfs-zpool-clear,clear] the error messages from the pool status by running `zpool clear`. [source,shell] .... # zpool clear healer # zpool status healer pool: healer state: ONLINE scan: scrub repaired 66.5M in 0h2m with 0 errors on Mon Dec 10 12:26:25 2012 config: NAME STATE READ WRITE CKSUM healer ONLINE 0 0 0 mirror-0 ONLINE 0 0 0 ada0 ONLINE 0 0 0 ada1 ONLINE 0 0 0 errors: No known data errors .... The pool is now back to a fully working state, with all error counts now zero. [[zfs-zpool-online]] === Growing a Pool The smallest device in each vdev limits the usable size of a redundant pool. To grow the pool, replace the smallest device with a larger one. After completing a crossref:zfs[zfs-zpool-replace,replace] or crossref:zfs[zfs-term-resilver,resilver] operation, the pool can grow to use the capacity of the new device. For example, consider a mirror of a 1 TB drive and a 2 TB drive. The usable space is 1 TB. When replacing the 1 TB drive with another 2 TB drive, the resilvering process copies the existing data onto the new drive. As both of the devices now have 2 TB capacity, the mirror's available space grows to 2 TB. Start expansion by using `zpool online -e` on each device. After expanding all devices, the extra space becomes available to the pool. [[zfs-zpool-import]] === Importing and Exporting Pools _Export_ pools before moving them to another system. ZFS unmounts all datasets, marking each device as exported but still locked to prevent use by other disk subsystems.
This allows pools to be _imported_ on other machines, other operating systems that support ZFS, and even different hardware architectures (with some caveats, see man:zpool[8]). When a dataset has open files, use `zpool export -f` to force exporting the pool. Use this with caution. The datasets are forcibly unmounted, potentially resulting in unexpected behavior by the applications which had open files on those datasets. Export a pool that is not in use: [source,shell] .... # zpool export mypool .... Importing a pool automatically mounts the datasets. If this is undesired behavior, use `zpool import -N` to prevent it. `zpool import -o` sets temporary properties for this specific import. `zpool import -o altroot=` allows importing a pool with a base mount point instead of the root of the file system. If the pool was last used on a different system and was not properly exported, force the import using `zpool import -f`. `zpool import -a` imports all pools that do not appear to be in use by another system. List all available pools for import: [source,shell] .... # zpool import pool: mypool id: 9930174748043525076 state: ONLINE action: The pool can be imported using its name or numeric identifier. config: mypool ONLINE ada2p3 ONLINE .... Import the pool with an alternative root directory: [source,shell] .... # zpool import -o altroot=/mnt mypool # zfs list NAME USED AVAIL REFER MOUNTPOINT mypool 110K 47.0G 31K /mnt/mypool .... [[zfs-zpool-upgrade]] === Upgrading a Storage Pool After upgrading FreeBSD, or if importing a pool from a system using an older version, manually upgrade the pool to the latest ZFS version to support newer features. Consider whether the pool may ever need importing on an older system before upgrading. Upgrading is a one-way process. Upgrading older pools is possible, but downgrading pools with newer features is not. Upgrade a v28 pool to support `Feature Flags`: [source,shell] ....
# zpool status pool: mypool state: ONLINE status: The pool is formatted using a legacy on-disk format. The pool can still be used, but some features are unavailable. action: Upgrade the pool using 'zpool upgrade'. Once this is done, the pool will no longer be accessible on software that does not support feat flags. scan: none requested config: NAME STATE READ WRITE CKSUM mypool ONLINE 0 0 0 mirror-0 ONLINE 0 0 0 ada0 ONLINE 0 0 0 ada1 ONLINE 0 0 0 errors: No known data errors # zpool upgrade This system supports ZFS pool feature flags. The following pools are formatted with legacy version numbers and are upgraded to use feature flags. After being upgraded, these pools will no longer be accessible by software that does not support feature flags. VER POOL --- ------------ 28 mypool Use 'zpool upgrade -v' for a list of available legacy versions. Every feature flags pool has all supported features enabled. # zpool upgrade mypool This system supports ZFS pool feature flags. Successfully upgraded 'mypool' from version 28 to feature flags. Enabled the following features on 'mypool': async_destroy empty_bpobj lz4_compress multi_vdev_crash_dump .... The newer features of ZFS will not be available until `zpool upgrade` has completed. Use `zpool upgrade -v` to see what new features the upgrade provides, as well as which features are already supported. Upgrade a pool to support new feature flags: [source,shell] .... # zpool status pool: mypool state: ONLINE status: Some supported features are not enabled on the pool. The pool can still be used, but some features are unavailable. action: Enable all features using 'zpool upgrade'. Once this is done, the pool may no longer be accessible by software that does not support the features. See zpool-features(7) for details. 
scan: none requested config: NAME STATE READ WRITE CKSUM mypool ONLINE 0 0 0 mirror-0 ONLINE 0 0 0 ada0 ONLINE 0 0 0 ada1 ONLINE 0 0 0 errors: No known data errors # zpool upgrade This system supports ZFS pool feature flags. All pools are formatted using feature flags. Some supported features are not enabled on the following pools. Once a feature is enabled the pool may become incompatible with software that does not support the feature. See zpool-features(7) for details. POOL FEATURE --------------- zstore multi_vdev_crash_dump spacemap_histogram enabled_txg hole_birth extensible_dataset bookmarks filesystem_limits # zpool upgrade mypool This system supports ZFS pool feature flags. Enabled the following features on 'mypool': spacemap_histogram enabled_txg hole_birth extensible_dataset bookmarks filesystem_limits .... [WARNING] ==== Update the boot code on systems that boot from a pool to support the new pool version. Use `gpart bootcode` on the partition that contains the boot code. Two types of bootcode are available, depending on the way the system boots: GPT (the most common option) and EFI (for more modern systems). For legacy boot using GPT, use the following command: [source,shell] .... # gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada1 .... For systems using EFI to boot, execute the following command: [source,shell] .... # gpart bootcode -p /boot/boot1.efi -i 1 ada1 .... Apply the bootcode to all bootable disks in the pool. See man:gpart[8] for more information. ==== [[zfs-zpool-history]] === Displaying Recorded Pool History ZFS records commands that change the pool, including creating datasets, changing properties, or replacing a disk. Reviewing history about a pool's creation is useful, as is checking which user performed a specific action and when. History is not kept in a log file, but is part of the pool itself. The command to review this history is aptly named `zpool history`: [source,shell] ....
# zpool history History for 'tank': 2013-02-26.23:02:35 zpool create tank mirror /dev/ada0 /dev/ada1 2013-02-27.18:50:58 zfs set atime=off tank 2013-02-27.18:51:09 zfs set checksum=fletcher4 tank 2013-02-27.18:51:18 zfs create tank/backup .... The output shows the `zpool` and `zfs` commands that altered the pool in some way, each with a timestamp. Commands like `zfs list` are not included. When no pool name is specified, ZFS displays the history of all pools. `zpool history` can show even more information when providing the options `-i` or `-l`. `-i` displays user-initiated events as well as internally logged ZFS events. [source,shell] .... # zpool history -i History for 'tank': 2013-02-26.23:02:35 [internal pool create txg:5] pool spa 28; zfs spa 28; zpl 5;uts 9.1-RELEASE 901000 amd64 2013-02-27.18:50:53 [internal property set txg:50] atime=0 dataset = 21 2013-02-27.18:50:58 zfs set atime=off tank 2013-02-27.18:51:04 [internal property set txg:53] checksum=7 dataset = 21 2013-02-27.18:51:09 zfs set checksum=fletcher4 tank 2013-02-27.18:51:13 [internal create txg:55] dataset = 39 2013-02-27.18:51:18 zfs create tank/backup .... Show more details by adding `-l`. This prints history records in a long format, including information like the name of the user who issued the command and the hostname on which the change happened. [source,shell] .... # zpool history -l History for 'tank': 2013-02-26.23:02:35 zpool create tank mirror /dev/ada0 /dev/ada1 [user 0 (root) on :global] 2013-02-27.18:50:58 zfs set atime=off tank [user 0 (root) on myzfsbox:global] 2013-02-27.18:51:09 zfs set checksum=fletcher4 tank [user 0 (root) on myzfsbox:global] 2013-02-27.18:51:18 zfs create tank/backup [user 0 (root) on myzfsbox:global] .... The output shows that the `root` user created the mirrored pool with disks [.filename]#/dev/ada0# and [.filename]#/dev/ada1#. The hostname `myzfsbox` is also shown in the commands after the pool's creation.
The hostname display becomes important when exporting the pool from one system and importing on another. It's possible to distinguish the commands issued on the other system by the hostname recorded for each command. Combine both options to `zpool history` to give the most detailed information possible for any given pool. Pool history provides valuable information when tracking down the actions performed or when needing more detailed output for debugging. [[zfs-zpool-iostat]] === Performance Monitoring A built-in monitoring system can display pool I/O statistics in real time. It shows the amount of free and used space on the pool, read and write operations performed per second, and I/O bandwidth used. By default, ZFS monitors and displays all pools in the system. Provide a pool name to limit monitoring to that pool. A basic example: [source,shell] .... # zpool iostat capacity operations bandwidth pool alloc free read write read write ---------- ----- ----- ----- ----- ----- ----- data 288G 1.53T 2 11 11.3K 57.1K .... To continuously see I/O activity, specify a number as the last parameter, indicating an interval in seconds to wait between updates. The next statistic line prints after each interval. Press kbd:[Ctrl+C] to stop this continuous monitoring. Give a second number on the command line after the interval to specify the total number of statistics to display. Display even more detailed I/O statistics with `-v`. Each device in the pool appears with a statistics line. This is useful for seeing read and write operations performed on each device, and can help determine if any individual device is slowing down the pool. This example shows a mirrored pool with two devices: [source,shell] .... 
# zpool iostat -v capacity operations bandwidth pool alloc free read write read write ----------------------- ----- ----- ----- ----- ----- ----- data 288G 1.53T 2 12 9.23K 61.5K mirror 288G 1.53T 2 12 9.23K 61.5K ada1 - - 0 4 5.61K 61.7K ada2 - - 1 4 5.04K 61.7K ----------------------- ----- ----- ----- ----- ----- ----- .... [[zfs-zpool-split]] === Splitting a Storage Pool ZFS can split a pool consisting of one or more mirror vdevs into two pools. Unless otherwise specified, ZFS detaches the last member of each mirror and creates a new pool containing the same data. Be sure to make a dry run of the operation with `-n` first. This displays the details of the requested operation without actually performing it. This helps confirm that the operation will do what the user intends. [[zfs-zfs]] == `zfs` Administration The `zfs` utility can create, destroy, and manage all existing ZFS datasets within a pool. To manage the pool itself, use crossref:zfs[zfs-zpool,`zpool`]. [[zfs-zfs-create]] === Creating and Destroying Datasets Unlike traditional disks and volume managers, space in ZFS is _not_ preallocated. With traditional file systems, after partitioning and assigning the space, there is no way to add a new file system without adding a new disk. With ZFS, creating new file systems is possible at any time. Each crossref:zfs[zfs-term-dataset,_dataset_] has properties including features like compression, deduplication, caching, and quotas, as well as other useful properties like readonly, case sensitivity, network file sharing, and a mount point. Nesting datasets within each other is possible and child datasets will inherit properties from their ancestors. Each dataset can be administered, crossref:zfs[zfs-zfs-allow,delegated], crossref:zfs[zfs-zfs-send,replicated], crossref:zfs[zfs-zfs-snapshot,snapshotted], crossref:zfs[zfs-zfs-jail,jailed], and destroyed as a unit. Creating a separate dataset for each different type or set of files has advantages.
The drawbacks to having a large number of datasets are that some commands like `zfs list` will be slower, and that mounting of hundreds or even thousands of datasets will slow the FreeBSD boot process. Create a new dataset and enable crossref:zfs[zfs-term-compression-lz4,LZ4 compression] on it: [source,shell] .... # zfs list NAME USED AVAIL REFER MOUNTPOINT mypool 781M 93.2G 144K none mypool/ROOT 777M 93.2G 144K none mypool/ROOT/default 777M 93.2G 777M / mypool/tmp 176K 93.2G 176K /tmp mypool/usr 616K 93.2G 144K /usr mypool/usr/home 184K 93.2G 184K /usr/home mypool/usr/ports 144K 93.2G 144K /usr/ports mypool/usr/src 144K 93.2G 144K /usr/src mypool/var 1.20M 93.2G 608K /var mypool/var/crash 148K 93.2G 148K /var/crash mypool/var/log 178K 93.2G 178K /var/log mypool/var/mail 144K 93.2G 144K /var/mail mypool/var/tmp 152K 93.2G 152K /var/tmp # zfs create -o compress=lz4 mypool/usr/mydataset # zfs list NAME USED AVAIL REFER MOUNTPOINT mypool 781M 93.2G 144K none mypool/ROOT 777M 93.2G 144K none mypool/ROOT/default 777M 93.2G 777M / mypool/tmp 176K 93.2G 176K /tmp mypool/usr 704K 93.2G 144K /usr mypool/usr/home 184K 93.2G 184K /usr/home mypool/usr/mydataset 87.5K 93.2G 87.5K /usr/mydataset mypool/usr/ports 144K 93.2G 144K /usr/ports mypool/usr/src 144K 93.2G 144K /usr/src mypool/var 1.20M 93.2G 610K /var mypool/var/crash 148K 93.2G 148K /var/crash mypool/var/log 178K 93.2G 178K /var/log mypool/var/mail 144K 93.2G 144K /var/mail mypool/var/tmp 152K 93.2G 152K /var/tmp .... Destroying a dataset is much quicker than deleting the files on the dataset, as it does not involve scanning the files and updating the corresponding metadata. Destroy the created dataset: [source,shell] .... 
# zfs list NAME USED AVAIL REFER MOUNTPOINT mypool 880M 93.1G 144K none mypool/ROOT 777M 93.1G 144K none mypool/ROOT/default 777M 93.1G 777M / mypool/tmp 176K 93.1G 176K /tmp mypool/usr 101M 93.1G 144K /usr mypool/usr/home 184K 93.1G 184K /usr/home mypool/usr/mydataset 100M 93.1G 100M /usr/mydataset mypool/usr/ports 144K 93.1G 144K /usr/ports mypool/usr/src 144K 93.1G 144K /usr/src mypool/var 1.20M 93.1G 610K /var mypool/var/crash 148K 93.1G 148K /var/crash mypool/var/log 178K 93.1G 178K /var/log mypool/var/mail 144K 93.1G 144K /var/mail mypool/var/tmp 152K 93.1G 152K /var/tmp # zfs destroy mypool/usr/mydataset # zfs list NAME USED AVAIL REFER MOUNTPOINT mypool 781M 93.2G 144K none mypool/ROOT 777M 93.2G 144K none mypool/ROOT/default 777M 93.2G 777M / mypool/tmp 176K 93.2G 176K /tmp mypool/usr 616K 93.2G 144K /usr mypool/usr/home 184K 93.2G 184K /usr/home mypool/usr/ports 144K 93.2G 144K /usr/ports mypool/usr/src 144K 93.2G 144K /usr/src mypool/var 1.21M 93.2G 612K /var mypool/var/crash 148K 93.2G 148K /var/crash mypool/var/log 178K 93.2G 178K /var/log mypool/var/mail 144K 93.2G 144K /var/mail mypool/var/tmp 152K 93.2G 152K /var/tmp .... In modern versions of ZFS, `zfs destroy` is asynchronous, and the free space might take minutes to appear in the pool. Use `zpool get freeing _poolname_` to see the `freeing` property, which shows how much space is still waiting to be reclaimed in the background. If there are child datasets, like crossref:zfs[zfs-term-snapshot,snapshots] or other datasets, destroying the parent is impossible. To destroy a dataset and its children recursively, use `-r`. Use `-n -v` to list the datasets and snapshots this operation would destroy, without actually destroying anything. Space reclaimed by destroying snapshots is also shown. [[zfs-zfs-volume]] === Creating and Destroying Volumes A volume is a special dataset type.
Rather than mounting as a file system, expose it as a block device under [.filename]#/dev/zvol/poolname/dataset#. This allows using the volume for other file systems, to back the disks of a virtual machine, or to make it available to other network hosts using protocols like iSCSI or HAST. Format a volume with any file system or without a file system to store raw data. To the user, a volume appears to be a regular disk. Putting ordinary file systems on these _zvols_ provides features that ordinary disks or file systems do not have. For example, using the compression property on a 250 MB volume allows creation of a compressed FAT file system. [source,shell] .... # zfs create -V 250m -o compression=on tank/fat32 # zfs list tank NAME USED AVAIL REFER MOUNTPOINT tank 258M 670M 31K /tank # newfs_msdos -F32 /dev/zvol/tank/fat32 # mount -t msdosfs /dev/zvol/tank/fat32 /mnt # df -h /mnt | grep fat32 Filesystem Size Used Avail Capacity Mounted on /dev/zvol/tank/fat32 249M 24k 249M 0% /mnt # mount | grep fat32 /dev/zvol/tank/fat32 on /mnt (msdosfs, local) .... Destroying a volume is much the same as destroying a regular file system dataset. The operation is nearly instantaneous, but it may take minutes to reclaim the free space in the background. [[zfs-zfs-rename]] === Renaming a Dataset To change the name of a dataset, use `zfs rename`. To change the parent of a dataset, use this command as well. Renaming a dataset to have a different parent dataset will change the value of those properties inherited from the parent dataset. Renaming a dataset unmounts then remounts it in the new location (inherited from the new parent dataset). To prevent this behavior, use `-u`. Rename a dataset and move it to be under a different parent dataset: [source,shell] .... 
# zfs list NAME USED AVAIL REFER MOUNTPOINT mypool 780M 93.2G 144K none mypool/ROOT 777M 93.2G 144K none mypool/ROOT/default 777M 93.2G 777M / mypool/tmp 176K 93.2G 176K /tmp mypool/usr 704K 93.2G 144K /usr mypool/usr/home 184K 93.2G 184K /usr/home mypool/usr/mydataset 87.5K 93.2G 87.5K /usr/mydataset mypool/usr/ports 144K 93.2G 144K /usr/ports mypool/usr/src 144K 93.2G 144K /usr/src mypool/var 1.21M 93.2G 614K /var mypool/var/crash 148K 93.2G 148K /var/crash mypool/var/log 178K 93.2G 178K /var/log mypool/var/mail 144K 93.2G 144K /var/mail mypool/var/tmp 152K 93.2G 152K /var/tmp # zfs rename mypool/usr/mydataset mypool/var/newname # zfs list NAME USED AVAIL REFER MOUNTPOINT mypool 780M 93.2G 144K none mypool/ROOT 777M 93.2G 144K none mypool/ROOT/default 777M 93.2G 777M / mypool/tmp 176K 93.2G 176K /tmp mypool/usr 616K 93.2G 144K /usr mypool/usr/home 184K 93.2G 184K /usr/home mypool/usr/ports 144K 93.2G 144K /usr/ports mypool/usr/src 144K 93.2G 144K /usr/src mypool/var 1.29M 93.2G 614K /var mypool/var/crash 148K 93.2G 148K /var/crash mypool/var/log 178K 93.2G 178K /var/log mypool/var/mail 144K 93.2G 144K /var/mail mypool/var/newname 87.5K 93.2G 87.5K /var/newname mypool/var/tmp 152K 93.2G 152K /var/tmp .... Renaming snapshots uses the same command. Due to the nature of snapshots, rename cannot change their parent dataset. To rename a recursive snapshot, specify `-r`; this will also rename all snapshots with the same name in child datasets. [source,shell] .... # zfs list -t snapshot NAME USED AVAIL REFER MOUNTPOINT mypool/var/newname@first_snapshot 0 - 87.5K - # zfs rename mypool/var/newname@first_snapshot new_snapshot_name # zfs list -t snapshot NAME USED AVAIL REFER MOUNTPOINT mypool/var/newname@new_snapshot_name 0 - 87.5K - .... [[zfs-zfs-set]] === Setting Dataset Properties Each ZFS dataset has properties that control its behavior. Most properties are automatically inherited from the parent dataset, but can be overridden locally. 
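The `SOURCE` column of `zfs get` shows where each value comes from, which makes inherited and locally overridden values easy to tell apart. As a minimal sketch, the helper below wraps `zfs get -r -o name,property,value,source` to list one property for a dataset and all of its children. The function name `show_property_source` is hypothetical, and the dataset name in the usage comment is only an assumption taken from the earlier examples.

```shell
#!/bin/sh
# Hypothetical helper, not part of ZFS: list one property for a dataset
# and all of its children, including where each value comes from.
#   -r  recurses into child datasets
#   -o  selects the columns to print
show_property_source() {
    property=$1
    dataset=$2
    zfs get -r -o name,property,value,source "$property" "$dataset"
}

# Example use (assumes a pool named mypool exists):
#   show_property_source compression mypool/usr
```

A source of `local` marks a value set directly on that dataset, while `inherited from` names the ancestor that supplies the value.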
Set a property on a dataset with `zfs set _property=value dataset_`. Most properties have a limited set of valid values; `zfs get` displays each possible property and its valid values. Using `zfs inherit` reverts most properties to their inherited values. User-defined properties are also possible. They become part of the dataset configuration and provide further information about the dataset or its contents. To distinguish these custom properties from the ones supplied as part of ZFS, use a colon (`:`) to create a custom namespace for the property. [source,shell] .... # zfs set custom:costcenter=1234 tank # zfs get custom:costcenter tank NAME PROPERTY VALUE SOURCE tank custom:costcenter 1234 local .... To remove a custom property, use `zfs inherit` with `-r`. If the custom property is not defined in any of the parent datasets, this option removes it (but the pool's history still records the change). [source,shell] .... # zfs inherit -r custom:costcenter tank # zfs get custom:costcenter tank NAME PROPERTY VALUE SOURCE tank custom:costcenter - - # zfs get all tank | grep custom:costcenter # .... [[zfs-zfs-set-share]] ==== Getting and Setting Share Properties Two commonly used and useful dataset properties are the NFS and SMB share options. Setting these defines if and how ZFS shares datasets on the network. At present, FreeBSD supports setting NFS sharing only. To get the current status of a share, enter: [source,shell] .... # zfs get sharenfs mypool/usr/home NAME PROPERTY VALUE SOURCE mypool/usr/home sharenfs on local # zfs get sharesmb mypool/usr/home NAME PROPERTY VALUE SOURCE mypool/usr/home sharesmb off local .... To enable sharing of a dataset, enter: [source,shell] .... # zfs set sharenfs=on mypool/usr/home .... Set other options for sharing datasets through NFS, such as `-alldirs`, `-maproot` and `-network`. To set options on a dataset shared through NFS, enter: [source,shell] ....
# zfs set sharenfs="-alldirs,-maproot=root,-network=192.168.1.0/24" mypool/usr/home .... [[zfs-zfs-snapshot]] === Managing Snapshots crossref:zfs[zfs-term-snapshot,Snapshots] are one of the most powerful features of ZFS. A snapshot provides a read-only, point-in-time copy of the dataset. With Copy-On-Write (COW), ZFS creates snapshots quickly by preserving older versions of the data on disk. If no snapshots exist, ZFS reclaims space for future use when data is rewritten or deleted. Snapshots preserve disk space by recording just the differences between the current dataset and a previous version. Snapshots are allowed only on whole datasets, not on individual files or directories. A snapshot from a dataset duplicates everything contained in it. This includes the file system properties, files, directories, permissions, and so on. Snapshots use no extra space when first created, but consume space as the blocks they reference change. Recursive snapshots taken with `-r` create snapshots with the same name on the dataset and its children, providing a consistent moment-in-time snapshot of the file systems. This can be important when an application has files on multiple datasets that are related or depend upon each other. Without snapshots, a backup would have copies of the files from different points in time. Snapshots in ZFS provide a variety of features that even other file systems with snapshot functionality lack. A typical example of snapshot use is as a quick way of backing up the current state of the file system when performing a risky action like a software installation or a system upgrade. If the action fails, rolling back to the snapshot returns the system to the state it was in when the snapshot was created. If the upgrade was successful, delete the snapshot to free up space. Without snapshots, a failed upgrade often requires restoring backups, which is tedious, time-consuming, and may require downtime during which the system is unusable.
Rolling back to snapshots is fast, even while the system is running in normal operation, with little or no downtime. The time savings are enormous with multi-terabyte storage systems considering the time required to copy the data from backup. Snapshots are not a replacement for a complete backup of a pool, but offer a quick and easy way to store a dataset copy at a specific time. [[zfs-zfs-snapshot-creation]] ==== Creating Snapshots To create snapshots, use `zfs snapshot _dataset_@_snapshotname_`. Adding `-r` creates a snapshot recursively, with the same name on all child datasets. Create a recursive snapshot of the entire pool: [source,shell] .... # zfs list -t all NAME USED AVAIL REFER MOUNTPOINT mypool 780M 93.2G 144K none mypool/ROOT 777M 93.2G 144K none mypool/ROOT/default 777M 93.2G 777M / mypool/tmp 176K 93.2G 176K /tmp mypool/usr 616K 93.2G 144K /usr mypool/usr/home 184K 93.2G 184K /usr/home mypool/usr/ports 144K 93.2G 144K /usr/ports mypool/usr/src 144K 93.2G 144K /usr/src mypool/var 1.29M 93.2G 616K /var mypool/var/crash 148K 93.2G 148K /var/crash mypool/var/log 178K 93.2G 178K /var/log mypool/var/mail 144K 93.2G 144K /var/mail mypool/var/newname 87.5K 93.2G 87.5K /var/newname mypool/var/newname@new_snapshot_name 0 - 87.5K - mypool/var/tmp 152K 93.2G 152K /var/tmp # zfs snapshot -r mypool@my_recursive_snapshot # zfs list -t snapshot NAME USED AVAIL REFER MOUNTPOINT mypool@my_recursive_snapshot 0 - 144K - mypool/ROOT@my_recursive_snapshot 0 - 144K - mypool/ROOT/default@my_recursive_snapshot 0 - 777M - mypool/tmp@my_recursive_snapshot 0 - 176K - mypool/usr@my_recursive_snapshot 0 - 144K - mypool/usr/home@my_recursive_snapshot 0 - 184K - mypool/usr/ports@my_recursive_snapshot 0 - 144K - mypool/usr/src@my_recursive_snapshot 0 - 144K - mypool/var@my_recursive_snapshot 0 - 616K - mypool/var/crash@my_recursive_snapshot 0 - 148K - mypool/var/log@my_recursive_snapshot 0 - 178K - mypool/var/mail@my_recursive_snapshot 0 - 144K - mypool/var/newname@new_snapshot_name 
0 - 87.5K - mypool/var/newname@my_recursive_snapshot 0 - 87.5K - mypool/var/tmp@my_recursive_snapshot 0 - 152K - .... Snapshots are not shown by a normal `zfs list` operation. To list snapshots, append `-t snapshot` to `zfs list`. `-t all` displays both file systems and snapshots. Snapshots are not mounted directly, showing no path in the `MOUNTPOINT` column. ZFS does not mention available disk space in the `AVAIL` column, as snapshots are read-only after their creation. Compare the snapshot to the original dataset: [source,shell] .... # zfs list -rt all mypool/usr/home NAME USED AVAIL REFER MOUNTPOINT mypool/usr/home 184K 93.2G 184K /usr/home mypool/usr/home@my_recursive_snapshot 0 - 184K - .... Displaying both the dataset and the snapshot together reveals how snapshots work in crossref:zfs[zfs-term-cow,COW] fashion. They save the changes (_delta_) made and not the complete file system contents all over again. This means that snapshots take little space when making changes. Observe space usage even more by copying a file to the dataset, then creating a second snapshot: [source,shell] .... # cp /etc/passwd /var/tmp # zfs snapshot mypool/var/tmp@after_cp # zfs list -rt all mypool/var/tmp NAME USED AVAIL REFER MOUNTPOINT mypool/var/tmp 206K 93.2G 118K /var/tmp mypool/var/tmp@my_recursive_snapshot 88K - 152K - mypool/var/tmp@after_cp 0 - 118K - .... The second snapshot contains the changes to the dataset after the copy operation. This yields enormous space savings. Notice that the size of the snapshot `_mypool/var/tmp@my_recursive_snapshot_` also changed in the `USED` column to show the changes between itself and the snapshot taken afterwards. [[zfs-zfs-snapshot-diff]] ==== Comparing Snapshots ZFS provides a built-in command to compare the differences in content between two snapshots. This is helpful with a lot of snapshots taken over time when the user wants to see how the file system has changed over time. 
For example, `zfs diff` lets a user find the latest snapshot that still contains a file deleted by accident. Doing this for the two snapshots created in the previous section yields this output: [source,shell] .... # zfs list -rt all mypool/var/tmp NAME USED AVAIL REFER MOUNTPOINT mypool/var/tmp 206K 93.2G 118K /var/tmp mypool/var/tmp@my_recursive_snapshot 88K - 152K - mypool/var/tmp@after_cp 0 - 118K - # zfs diff mypool/var/tmp@my_recursive_snapshot M /var/tmp/ + /var/tmp/passwd .... The command lists the changes between the specified snapshot (in this case `_mypool/var/tmp@my_recursive_snapshot_`) and the live file system. The first column shows the change type: [.informaltable] [cols="20%,80%"] |=== |+ |Adding the path or file. |- |Deleting the path or file. |M |Modifying the path or file. |R |Renaming the path or file. |=== Comparing the output with the table, it becomes clear that ZFS added [.filename]#passwd# after creating the snapshot `_mypool/var/tmp@my_recursive_snapshot_`. This also resulted in a modification to the parent directory mounted at `_/var/tmp_`. Comparing two snapshots is helpful when using the ZFS replication feature to transfer a dataset to a different host for backup purposes. Compare two snapshots by providing the full dataset name and snapshot name of both datasets: [source,shell] .... # cp /var/tmp/passwd /var/tmp/passwd.copy # zfs snapshot mypool/var/tmp@diff_snapshot # zfs diff mypool/var/tmp@my_recursive_snapshot mypool/var/tmp@diff_snapshot M /var/tmp/ + /var/tmp/passwd + /var/tmp/passwd.copy # zfs diff mypool/var/tmp@my_recursive_snapshot mypool/var/tmp@after_cp M /var/tmp/ + /var/tmp/passwd .... A backup administrator can compare two snapshots received from the sending host and determine the actual changes in the dataset. See the crossref:zfs[zfs-zfs-send,Replication] section for more information. [[zfs-zfs-snapshot-rollback]] ==== Snapshot Rollback When at least one snapshot is available, roll back to it at any time. 
Most often this is the case when the current state of the dataset is no longer valid or an older version is preferred.
Scenarios such as local development tests gone wrong, botched system updates hampering the system functionality, or the need to restore deleted files or directories are all too common occurrences.
To roll back a snapshot, use `zfs rollback _snapshotname_`.
If a lot of changes are present, the operation will take a long time.
During that time, the dataset always remains in a consistent state, much like a database that conforms to ACID principles while performing a rollback.
This happens while the dataset is live and accessible, without requiring downtime.
Once the snapshot has been rolled back, the dataset has the same state as it had when the snapshot was originally taken.
Rolling back to a snapshot discards all other data in that dataset not part of the snapshot.
Taking a snapshot of the current state of the dataset before rolling back to a previous one is a good idea when requiring some data later.
This way, the user can roll back and forth between snapshots without losing data that is still valuable.

In the first example, roll back a snapshot because a careless `rm` operation removed more data than intended.

[source,shell]
....
# zfs list -rt all mypool/var/tmp
NAME                                   USED  AVAIL  REFER  MOUNTPOINT
mypool/var/tmp                         262K  93.2G   120K  /var/tmp
mypool/var/tmp@my_recursive_snapshot    88K      -   152K  -
mypool/var/tmp@after_cp               53.5K      -   118K  -
mypool/var/tmp@diff_snapshot              0      -   120K  -
# ls /var/tmp
passwd          passwd.copy     vi.recover
# rm /var/tmp/passwd*
# ls /var/tmp
vi.recover
....

At this point, the user notices the removal of extra files and wants them back.
ZFS provides an easy way to get them back using rollbacks, provided that snapshots of important data are taken on a regular basis.
To get the files back and start over from the last snapshot, issue the command:

[source,shell]
....
# zfs rollback mypool/var/tmp@diff_snapshot
# ls /var/tmp
passwd          passwd.copy     vi.recover
....

The rollback operation restored the dataset to the state of the last snapshot.
Rolling back to a snapshot taken much earlier with other snapshots taken afterwards is also possible.
When trying to do this, ZFS will issue this warning:

[source,shell]
....
# zfs list -rt snapshot mypool/var/tmp
NAME                                   USED  AVAIL  REFER  MOUNTPOINT
mypool/var/tmp@my_recursive_snapshot    88K      -   152K  -
mypool/var/tmp@after_cp               53.5K      -   118K  -
mypool/var/tmp@diff_snapshot              0      -   120K  -
# zfs rollback mypool/var/tmp@my_recursive_snapshot
cannot rollback to 'mypool/var/tmp@my_recursive_snapshot': more recent snapshots exist
use '-r' to force deletion of the following snapshots:
mypool/var/tmp@after_cp
mypool/var/tmp@diff_snapshot
....

This warning means that snapshots exist between the current state of the dataset and the snapshot to which the user wants to roll back.
To complete the rollback, delete these snapshots.
ZFS cannot track all the changes between different states of the dataset, because snapshots are read-only.
ZFS will not delete the affected snapshots unless the user specifies `-r` to confirm that this is the desired action.
If that is the intention, and understanding the consequences of losing all intermediate snapshots, issue the command:

[source,shell]
....
# zfs rollback -r mypool/var/tmp@my_recursive_snapshot
# zfs list -rt snapshot mypool/var/tmp
NAME                                   USED  AVAIL  REFER  MOUNTPOINT
mypool/var/tmp@my_recursive_snapshot     8K      -   152K  -
# ls /var/tmp
vi.recover
....

The output from `zfs list -t snapshot` confirms the removal of the intermediate snapshots as a result of `zfs rollback -r`.

[[zfs-zfs-snapshot-snapdir]]
==== Restoring Individual Files from Snapshots

Snapshots live in a hidden directory under the parent dataset: [.filename]#.zfs/snapshot/snapshotname#.
By default, these directories will not show even when executing a standard `ls -a`.
Although the directory does not show, access it like any normal directory.
The property named `snapdir` controls whether these hidden directories show up in a directory listing.
Setting the property to `visible` allows them to appear in the output of `ls` and other commands that deal with directory contents.

[source,shell]
....
# zfs get snapdir mypool/var/tmp
NAME            PROPERTY  VALUE    SOURCE
mypool/var/tmp  snapdir   hidden   default
# ls -a /var/tmp
.               ..              passwd          vi.recover
# zfs set snapdir=visible mypool/var/tmp
# ls -a /var/tmp
.               ..              .zfs            passwd          vi.recover
....

Restore individual files to a previous state by copying them from the snapshot back to the parent dataset.
The directory structure below [.filename]#.zfs/snapshot# has a directory named like the snapshots taken earlier to make it easier to identify them.
The next example shows how to restore a file from the hidden [.filename]#.zfs# directory by copying it from the snapshot containing the latest version of the file:

[source,shell]
....
# rm /var/tmp/passwd
# ls -a /var/tmp
.               ..              .zfs            vi.recover
# ls /var/tmp/.zfs/snapshot
after_cp                my_recursive_snapshot
# ls /var/tmp/.zfs/snapshot/after_cp
passwd          vi.recover
# cp /var/tmp/.zfs/snapshot/after_cp/passwd /var/tmp
....

Even if the `snapdir` property is set to `hidden`, running `ls .zfs/snapshot` will still list the contents of that directory.
The administrator decides whether to display these directories.
This is a per-dataset setting.
Copying files or directories from this hidden [.filename]#.zfs/snapshot# is simple enough.
Trying it the other way around results in this error:

[source,shell]
....
# cp /etc/rc.conf /var/tmp/.zfs/snapshot/after_cp/
cp: /var/tmp/.zfs/snapshot/after_cp/rc.conf: Read-only file system
....

The error reminds the user that snapshots are read-only and cannot change after creation.
Copying files into and removing them from snapshot directories are both disallowed because that would change the state of the dataset they represent.
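
The search for a snapshot that still holds a particular file can be scripted with ordinary file system tools.
The following is a minimal sketch and not part of the base system: `snapshots_with_file` is a hypothetical helper name, and it assumes the snapshot directory of the dataset is reachable at [.filename]#.zfs/snapshot# under the mount point.
It prints the name of every snapshot that contains the given path.

```shell
# Hypothetical helper: list the snapshots of a mounted dataset that still
# contain a given file. Usage: snapshots_with_file /var/tmp passwd
snapshots_with_file() {
    mnt=$1      # mount point of the dataset, for example /var/tmp
    rel=$2      # path of the file relative to the mount point
    for dir in "$mnt"/.zfs/snapshot/*/; do
        # Skip the unexpanded pattern when no snapshots exist.
        [ -d "$dir" ] || continue
        if [ -e "$dir$rel" ]; then
            basename "$dir"
        fi
    done
}
```

Copy the file back from one of the listed snapshot directories with `cp`, as in the example above.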

Snapshots consume space based on how much the parent file system has changed since the time of the snapshot.
The `written` property of a snapshot tracks the space the snapshot uses.

To destroy snapshots and reclaim the space, use `zfs destroy _dataset_@_snapshot_`.
Adding `-r` recursively removes all snapshots with the same name under the parent dataset.
Adding `-n -v` to the command displays a list of the snapshots to be deleted and an estimate of the space it would reclaim without performing the actual destroy operation.

[[zfs-zfs-clones]]
=== Managing Clones

A clone is a copy of a snapshot treated more like a regular dataset.
Unlike a snapshot, a clone is writeable and mountable, and has its own properties.
After creating a clone using `zfs clone`, destroying the originating snapshot is impossible.
To reverse the child/parent relationship between the clone and the snapshot, use `zfs promote`.
Promoting a clone makes the snapshot become a child of the clone, rather than of the original parent dataset.
This will change how ZFS accounts for the space, but not actually change the amount of space consumed.
Mounting the clone anywhere within the ZFS file system hierarchy is possible, not only below the original location of the snapshot.

To show the clone feature use this example dataset:

[source,shell]
....
# zfs list -rt all camino/home/joe
NAME                    USED  AVAIL  REFER  MOUNTPOINT
camino/home/joe         108K   1.3G    87K  /usr/home/joe
camino/home/joe@plans    21K      -  85.5K  -
camino/home/joe@backup    0K      -    87K  -
....

A typical use for clones is to experiment with a specific dataset while keeping the snapshot around to fall back to in case something goes wrong.
Since snapshots cannot change, create a read/write clone of a snapshot.
After achieving the desired result in the clone, promote the clone to a dataset and remove the old file system.
Removing the parent dataset is not strictly necessary, as the clone and dataset can coexist without problems.

[source,shell]
....
# zfs clone camino/home/joe@backup camino/home/joenew
# ls /usr/home/joe*
/usr/home/joe:
backup.txz     plans.txt

/usr/home/joenew:
backup.txz     plans.txt
# df -h /usr/home
Filesystem          Size    Used   Avail Capacity  Mounted on
usr/home/joe        1.3G     31k    1.3G     0%    /usr/home/joe
usr/home/joenew     1.3G     31k    1.3G     0%    /usr/home/joenew
....

Creating a clone makes it an exact copy of the state the dataset was in when taking the snapshot.
Changing the clone independently from its originating dataset is possible now.
The connection between the two is the snapshot.
ZFS records this connection in the property `origin`.
Promoting the clone with `zfs promote` makes the clone an independent dataset.
This removes the value of the `origin` property and disconnects the newly independent dataset from the snapshot.
This example shows it:

[source,shell]
....
# zfs get origin camino/home/joenew
NAME                  PROPERTY  VALUE                     SOURCE
camino/home/joenew    origin    camino/home/joe@backup    -
# zfs promote camino/home/joenew
# zfs get origin camino/home/joenew
NAME                  PROPERTY  VALUE   SOURCE
camino/home/joenew    origin    -       -
....

After making some changes to the promoted clone, such as copying [.filename]#loader.conf# to it, the old dataset becomes obsolete in this case.
Instead, the promoted clone can replace it.
To do this, `zfs destroy` the old dataset first and then `zfs rename` the clone to the old dataset name (or to an entirely different name).

[source,shell]
....
# cp /boot/defaults/loader.conf /usr/home/joenew
# zfs destroy -f camino/home/joe
# zfs rename camino/home/joenew camino/home/joe
# ls /usr/home/joe
backup.txz     loader.conf     plans.txt
# df -h /usr/home
Filesystem          Size    Used   Avail Capacity  Mounted on
usr/home/joe        1.3G    128k    1.3G     0%    /usr/home/joe
....

The cloned snapshot is now an ordinary dataset.
It contains all the data from the original snapshot plus the files added to it like [.filename]#loader.conf#.
Clones provide useful features to ZFS users in different scenarios.
For example, provide jails as snapshots containing different sets of installed applications.
Users can clone these snapshots and add their own applications as they see fit.
Once satisfied with the changes, promote the clones to full datasets and provide them to end users to work with like they would with a real dataset.
This saves time and administrative overhead when providing these jails.

[[zfs-zfs-send]]
=== Replication

Keeping data on a single pool in one location exposes it to risks like theft and natural or human disasters.
Making regular backups of the entire pool is vital.
ZFS provides a built-in serialization feature that can send a stream representation of the data to standard output.
Using this feature, storing this data on another pool connected to the local system is possible, as is sending it over a network to another system.
Snapshots are the basis for this replication (see the section on crossref:zfs[zfs-zfs-snapshot,ZFS snapshots]).
The commands used for replicating data are `zfs send` and `zfs receive`.

These examples show ZFS replication with these two pools:

[source,shell]
....
# zpool list
NAME    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG   CAP  DEDUP  HEALTH  ALTROOT
backup  960M    77K   896M        -         -     0%    0%  1.00x  ONLINE  -
mypool  984M  43.7M   940M        -         -     0%    4%  1.00x  ONLINE  -
....

The pool named _mypool_ is the primary pool where writing and reading data happens on a regular basis.
A second standby pool, _backup_, is kept in case the primary pool becomes unavailable.
Note that this fail-over is not done automatically by ZFS, but must be manually done by a system administrator when needed.
Use a snapshot to provide a consistent file system version to replicate.
After creating a snapshot of _mypool_, copy it to the _backup_ pool by replicating snapshots.
This does not include changes made since the most recent snapshot.

[source,shell]
....
# zfs snapshot mypool@backup1
# zfs list -t snapshot
NAME                    USED  AVAIL  REFER  MOUNTPOINT
mypool@backup1             0      -  43.6M  -
....

Now that a snapshot exists, use `zfs send` to create a stream representing the contents of the snapshot.
Store this stream as a file or receive it on another pool.
`zfs send` writes the stream to standard output, so redirect it to a file or a pipe; otherwise this error appears:

[source,shell]
....
# zfs send mypool@backup1
Error: Stream can not be written to a terminal.
You must redirect standard output.
....

To back up a dataset with `zfs send`, redirect to a file located on the mounted backup pool.
Ensure that the pool has enough free space to accommodate the size of the sent snapshot, which means the data contained in the snapshot, not the changes from the previous snapshot.

[source,shell]
....
# zfs send mypool@backup1 > /backup/backup1
# zpool list
NAME    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG   CAP  DEDUP  HEALTH  ALTROOT
backup  960M  63.7M   896M        -         -     0%    6%  1.00x  ONLINE  -
mypool  984M  43.7M   940M        -         -     0%    4%  1.00x  ONLINE  -
....

The `zfs send` transferred all the data in the snapshot called _backup1_ to the pool named _backup_.
To create and send these snapshots automatically, use a man:cron[8] job.

Instead of storing the backups as archive files, ZFS can receive them as a live file system, allowing direct access to the backed up data.
To get to the actual data contained in those streams, use `zfs receive` to transform the streams back into files and directories.
The example below combines `zfs send` and `zfs receive` using a pipe to copy the data from one pool to another.
Use the data directly on the receiving pool after the transfer is complete.
It is only possible to replicate a dataset to an empty dataset.

[source,shell]
....
# zfs snapshot mypool@replica1
# zfs send -v mypool@replica1 | zfs receive backup/mypool
send from @ to mypool@replica1 estimated size is 50.1M
total estimated size is 50.1M
TIME        SENT   SNAPSHOT

# zpool list
NAME    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG   CAP  DEDUP  HEALTH  ALTROOT
backup  960M  63.7M   896M        -         -     0%    6%  1.00x  ONLINE  -
mypool  984M  43.7M   940M        -         -     0%    4%  1.00x  ONLINE  -
....
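
The text above suggests a man:cron[8] job for creating and sending snapshots automatically.
A minimal sketch of a script such a job could call follows.
The function name `backup_pool_to_file`, the `auto-YYYYMMDD` snapshot naming scheme, and the target directory are assumptions made for illustration, not an established convention.

```shell
# Hypothetical helper for a cron job: snapshot a dataset under a dated name
# and store the full stream as a file. Usage: backup_pool_to_file mypool /backup
backup_pool_to_file() {
    dataset=$1                      # dataset to back up, for example mypool
    destdir=$2                      # directory on the mounted backup pool
    snap="$dataset@auto-$(date +%Y%m%d)"
    # Stop if the snapshot cannot be created (for example, it already exists).
    zfs snapshot "$snap" || return 1
    # Dataset names may contain slashes; replace them for the file name.
    file="$destdir/$(printf '%s' "$snap" | tr '/@' '__')"
    zfs send "$snap" > "$file"
}
```

Each run stores a full stream, so the backup pool needs room for the complete contents of every snapshot sent this way.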

[[zfs-send-incremental]]
==== Incremental Backups

`zfs send` can also determine the difference between two snapshots and send individual differences between the two.
This saves disk space and transfer time.
For example:

[source,shell]
....
# zfs snapshot mypool@replica2
# zfs list -t snapshot
NAME                    USED  AVAIL  REFER  MOUNTPOINT
mypool@replica1         5.72M      -  43.6M  -
mypool@replica2             0      -  44.1M  -
# zpool list
NAME    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG   CAP  DEDUP  HEALTH  ALTROOT
backup  960M  61.7M   898M        -         -     0%    6%  1.00x  ONLINE  -
mypool  960M  50.2M   910M        -         -     0%    5%  1.00x  ONLINE  -
....

Create a second snapshot called _replica2_.
This second snapshot contains changes made to the file system between now and the previous snapshot, _replica1_.
Using `zfs send -i` and indicating the pair of snapshots generates an incremental replica stream containing the changed data.
This succeeds if the initial snapshot already exists on the receiving side.

[source,shell]
....
# zfs send -v -i mypool@replica1 mypool@replica2 | zfs receive backup/mypool
send from @replica1 to mypool@replica2 estimated size is 5.02M
total estimated size is 5.02M
TIME        SENT   SNAPSHOT

# zpool list
NAME    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG   CAP  DEDUP  HEALTH  ALTROOT
backup  960M  80.8M   879M        -         -     0%    8%  1.00x  ONLINE  -
mypool  960M  50.2M   910M        -         -     0%    5%  1.00x  ONLINE  -

# zfs list
NAME                         USED  AVAIL  REFER  MOUNTPOINT
backup                      55.4M   240G   152K  /backup
backup/mypool               55.3M   240G  55.2M  /backup/mypool
mypool                      55.6M  11.6G  55.0M  /mypool

# zfs list -t snapshot
NAME                                         USED  AVAIL  REFER  MOUNTPOINT
backup/mypool@replica1                       104K      -  50.2M  -
backup/mypool@replica2                          0      -  55.2M  -
mypool@replica1                             29.9K      -  50.0M  -
mypool@replica2                                 0      -  55.0M  -
....

The incremental stream replicated the changed data rather than the entirety of _replica1_.
Sending the differences alone took much less time to transfer and saved disk space by not copying the whole pool each time.
This is useful when replicating over a slow network or one charging per transferred byte.
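
The incremental transfer shown above generalizes to a small wrapper.
This is only a sketch: `send_increment` is a hypothetical name, its arguments are placeholders, and it assumes the earlier snapshot already exists on both the sending and the receiving side, as the text requires.

```shell
# Hypothetical wrapper around an incremental send.
# Usage: send_increment mypool replica1 replica2 backup/mypool
send_increment() {
    src=$1     # source dataset
    from=$2    # snapshot already present on the receiving side
    to=$3      # newer snapshot to bring the receiver up to
    dst=$4     # receiving dataset
    # Send only the changes between the two snapshots and apply them.
    zfs send -i "$src@$from" "$src@$to" | zfs receive "$dst"
}
```

The exit status of the pipeline is that of `zfs receive`, so a script calling this function can test it to decide whether the older snapshot is safe to destroy.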

A new file system, _backup/mypool_, is available with the files and data from the pool _mypool_.
Specifying `-p` copies the dataset properties including compression settings, quotas, and mount points.
Specifying `-R` copies all child datasets of the dataset along with their properties.
Automate sending and receiving to create regular backups on the second pool.

[[zfs-send-ssh]]
==== Sending Encrypted Backups over SSH

Sending streams over the network is a good way to keep a remote backup, but it does come with a drawback.
Data sent over the network link is not encrypted, allowing anyone to intercept and transform the streams back into data without the knowledge of the sending user.
This is undesirable when sending the streams over the internet to a remote host.
Use SSH to securely encrypt data sent over a network connection.
Since ZFS requires redirecting the stream from standard output, piping it through SSH is easy.
To keep the contents of the file system encrypted in transit and on the remote system, consider using https://wiki.freebsd.org/PEFS[PEFS].

Change some settings and take security precautions first.
This describes the necessary steps required for the `zfs send` operation; for more information on SSH, see crossref:security[openssh,"OpenSSH"].

Change the configuration as follows:

* Passwordless SSH access between sending and receiving host using SSH keys
* ZFS requires the privileges of the `root` user to send and receive streams. This requires logging in to the receiving system as `root`.
* Security reasons prevent `root` from logging in by default.
* Use the crossref:zfs[zfs-zfs-allow,ZFS Delegation] system to allow a non-`root` user on each system to perform the respective send and receive operations. On the sending system:

[source,shell]
....
# zfs allow -u someuser send,snapshot mypool
....

* To mount the pool, the unprivileged user must own the directory, and regular users need permission to mount file systems.
On the receiving system:

[source,shell]
....
# sysctl vfs.usermount=1
vfs.usermount: 0 -> 1
# echo vfs.usermount=1 >> /etc/sysctl.conf
# zfs create recvpool/backup
# zfs allow -u someuser create,mount,receive recvpool/backup
# chown someuser /recvpool/backup
....

The unprivileged user can now receive and mount datasets, and can replicate the _home_ dataset to the remote system:

[source,shell]
....
% zfs snapshot -r mypool/home@monday
% zfs send -R mypool/home@monday | ssh someuser@backuphost zfs recv -dvu recvpool/backup
....

Create a recursive snapshot called _monday_ of the file system dataset _home_ on the pool _mypool_.
Then `zfs send -R` includes the dataset, all child datasets, snapshots, clones, and settings in the stream.
Pipe the output through SSH to the waiting `zfs receive` on the remote host _backuphost_.
Using an IP address or fully qualified domain name is good practice.
The receiving machine writes the data to the _backup_ dataset on the _recvpool_ pool.
Adding `-d` to `zfs recv` discards the pool name from the sent dataset path and appends the remainder to the receiving dataset, so _mypool/home_ arrives as _recvpool/backup/home_.
`-u` causes the file systems to not mount on the receiving side.
Using `-v` shows more details about the transfer, including the elapsed time and the amount of data transferred.

[[zfs-zfs-quota]]
=== Dataset, User, and Group Quotas

Use crossref:zfs[zfs-term-quota,Dataset quotas] to restrict the amount of space consumed by a particular dataset.
crossref:zfs[zfs-term-refquota,Reference Quotas] work in much the same way, but count the space used by the dataset itself, excluding snapshots and child datasets.
Similarly, use crossref:zfs[zfs-term-userquota,user] and crossref:zfs[zfs-term-groupquota,group] quotas to prevent users or groups from using up all the space in the pool or dataset.

The following examples assume that the users already exist in the system.
Before adding a user to the system, make sure to create their home dataset first and set the `mountpoint` to `/home/_bob_`.
Then, create the user and make the home directory point to the dataset's `mountpoint` location.
This will properly set owner and group permissions without shadowing any pre-existing home directory paths that might exist.

To enforce a dataset quota of 10 GB for [.filename]#storage/home/bob#:

[source,shell]
....
# zfs set quota=10G storage/home/bob
....

To enforce a reference quota of 10 GB for [.filename]#storage/home/bob#:

[source,shell]
....
# zfs set refquota=10G storage/home/bob
....

To remove a quota of 10 GB for [.filename]#storage/home/bob#:

[source,shell]
....
# zfs set quota=none storage/home/bob
....

The general format is `userquota@_user_=_size_`, and the user's name must be in one of these formats:

* POSIX compatible name such as _joe_.
* POSIX numeric ID such as _789_.
* SID name such as _joe.bloggs@example.com_.
* SID numeric ID such as _S-1-123-456-789_.

For example, to enforce a user quota of 50 GB for the user named _joe_ on [.filename]#storage/home/bob#:

[source,shell]
....
# zfs set userquota@joe=50G storage/home/bob
....

To remove any quota:

[source,shell]
....
# zfs set userquota@joe=none storage/home/bob
....

[NOTE]
====
User quota properties are not displayed by `zfs get all`.
Non-`root` users cannot see other users' quotas unless granted the `userquota` privilege.
Users with this privilege are able to view and set everyone's quota.
====

The general format for setting a group quota is: `groupquota@_group_=_size_`.

To set the quota for the group _firstgroup_ on [.filename]#storage/home/bob# to 50 GB, use:

[source,shell]
....
# zfs set groupquota@firstgroup=50G storage/home/bob
....

To remove the quota for the group _firstgroup_, or to make sure that one is not set, instead use:

[source,shell]
....
# zfs set groupquota@firstgroup=none storage/home/bob
....

As with the user quota property, non-`root` users can only see the quotas associated with the groups to which they belong.
A user with the `groupquota` privilege or `root` can view and set all quotas for all groups.

To display the amount of space used by each user on a file system or snapshot along with any quotas, use `zfs userspace`.
For group information, use `zfs groupspace`.
For more information about supported options or how to display specific options alone, refer to man:zfs[1].

Privileged users and `root` can list the quota for [.filename]#storage/home/bob# using:

[source,shell]
....
# zfs get quota storage/home/bob
....

[[zfs-zfs-reservation]]
=== Reservations

crossref:zfs[zfs-term-reservation,Reservations] guarantee an always-available amount of space on a dataset.
The reserved space will not be available to any other dataset.
This useful feature ensures that free space is available for an important dataset or log files.

The general format of the `reservation` property is `reservation=_size_`, so to set a reservation of 10 GB on [.filename]#storage/home/bob#, use:

[source,shell]
....
# zfs set reservation=10G storage/home/bob
....

To clear any reservation:

[source,shell]
....
# zfs set reservation=none storage/home/bob
....

The same principle applies to the `refreservation` property for setting a crossref:zfs[zfs-term-refreservation,Reference Reservation], with the general format `refreservation=_size_`.

These commands show any reservations or refreservations that exist on [.filename]#storage/home/bob#:

[source,shell]
....
# zfs get reservation storage/home/bob
# zfs get refreservation storage/home/bob
....

[[zfs-zfs-compression]]
=== Compression

ZFS provides transparent compression.
Compressing data written at the block level saves space and also increases disk throughput.
If data compresses at a ratio of 1.25:1, the compressed data writes to the disk at the same rate as the uncompressed version, resulting in an effective write speed of 125%.
Compression can also be a great alternative to crossref:zfs[zfs-zfs-deduplication,Deduplication] because it does not require extra memory.

ZFS offers different compression algorithms, each with different trade-offs.
The introduction of LZ4 compression in ZFS v5000 enables compressing the entire pool without the large performance trade-off of other algorithms.
The biggest advantage to LZ4 is the _early abort_ feature.
If LZ4 does not achieve at least 12.5% compression in the header part of the data, ZFS writes the block uncompressed to avoid wasting CPU cycles trying to compress data that is either already compressed or uncompressible.
For details about the different compression algorithms available in ZFS, see the crossref:zfs[zfs-term-compression,Compression] entry in the terminology section.

The administrator can see the effectiveness of compression using dataset properties.

[source,shell]
....
# zfs get used,compressratio,compression,logicalused mypool/compressed_dataset
NAME                       PROPERTY       VALUE  SOURCE
mypool/compressed_dataset  used           449G   -
mypool/compressed_dataset  compressratio  1.11x  -
mypool/compressed_dataset  compression    lz4    local
mypool/compressed_dataset  logicalused    496G   -
....

The dataset is using 449 GB of space (the `used` property).
Without compression, it would have taken 496 GB of space (the `logicalused` property).
This results in a 1.11:1 compression ratio.

Compression can have an unexpected side effect when combined with crossref:zfs[zfs-term-userquota,User Quotas].
User quotas restrict how much actual space a user consumes on a dataset _after compression_.
If a user has a quota of 10 GB, and writes 10 GB of compressible data, they will still be able to store more data.
If they later update a file, say a database, with more or less compressible data, the amount of space available to them will change.
This can result in the odd situation where a user did not increase the actual amount of data (the `logicalused` property), but the change in compression caused them to reach their quota limit.

Compression can have a similar unexpected interaction with backups.
Quotas are often used to limit data storage to ensure there is enough backup space available.
Since quotas do not consider compression, ZFS may write more data than would fit with uncompressed backups.
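
The relationship between `used`, `logicalused`, and `compressratio` described earlier in this section can be checked by hand.
The sketch below recomputes the ratio and the space saved from two values given in the same unit.
`compress_stats` is a hypothetical helper, and because the example output shows rounded figures, its result can differ slightly from the `compressratio` that ZFS reports.

```shell
# Hypothetical helper: derive compression figures from `logicalused` and
# `used`, both given in the same unit. Usage: compress_stats 496 449
compress_stats() {
    awk -v logical="$1" -v used="$2" 'BEGIN {
        printf "ratio %.2fx, saved %d (%.1f%%)\n",
            logical / used, logical - used, 100 * (logical - used) / logical
    }'
}
```

With the rounded values from the example, `compress_stats 496 449` reports a ratio of about 1.10x and 47 GB saved.
The same arithmetic explains the quota effect: a user limited to 10 GB of `used` space can hold more than 10 GB of logical data as long as that data compresses.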

[[zfs-zfs-compression-zstd]]
=== Zstandard Compression

OpenZFS 2.0 added a new compression algorithm.
-Zstandard (Zstd) offers higher compression ratios than the default LZ4 while offering much greater speeds than the alternative, gzip. OpenZFS 2.0 is available starting with FreeBSD 12.1-RELEASE via package:sysutils/openzfs[] and has been the default in since FreeBSD 13.0-RELEASE.
+Zstandard (Zstd) offers higher compression ratios than the default LZ4 while offering much greater speeds than the alternative, gzip. OpenZFS 2.0 is available starting with FreeBSD 12.1-RELEASE via package:sysutils/openzfs[] and has been the default since FreeBSD 13.0-RELEASE.

Zstd provides a large selection of compression levels, providing fine-grained control over performance versus compression ratio.
One of the main advantages of Zstd is that the decompression speed is independent of the compression level.
For data written once but read often, Zstd allows the use of the highest compression levels without a read performance penalty.

Even with frequent data updates, enabling compression often provides higher performance.
One of the biggest advantages comes from the compressed ARC feature.
ZFS's Adaptive Replacement Cache (ARC) caches the compressed version of the data in RAM, decompressing it each time.
This allows the same amount of RAM to store more data and metadata, increasing the cache hit ratio.

ZFS offers 19 levels of Zstd compression, each offering incrementally more space savings in exchange for slower compression.
The default level is `zstd-3` and offers greater compression than LZ4 without being much slower.
Levels above 10 require large amounts of memory to compress each block and systems with less than 16 GB of RAM should not use them.
ZFS also supports a selection of the Zstd_fast_ levels, which get correspondingly faster but achieve lower compression ratios.
ZFS supports `zstd-fast-1` through `zstd-fast-10`, `zstd-fast-20` through `zstd-fast-100` in increments of 10, and `zstd-fast-500` and `zstd-fast-1000` which provide minimal compression, but offer high performance.

If ZFS is not able to get the required memory to compress a block with Zstd, it will fall back to storing the block uncompressed.
This is unlikely to happen except at the highest levels of Zstd on memory constrained systems.
ZFS counts how often this has occurred since loading the ZFS module with `kstat.zfs.misc.zstd.compress_alloc_fail`.

[[zfs-zfs-deduplication]]
=== Deduplication

When enabled, crossref:zfs[zfs-term-deduplication,deduplication] uses the checksum of each block to detect duplicate blocks.
When a new block is a duplicate of an existing block, ZFS writes a new reference to the existing data instead of the whole duplicate block.
Tremendous space savings are possible if the data contains a lot of duplicated files or repeated information.
Warning: deduplication requires a large amount of memory, and enabling compression instead provides most of the space savings without the extra cost.

To activate deduplication, set the `dedup` property on the target pool:

[source,shell]
....
# zfs set dedup=on pool
....

Deduplicating only affects new data written to the pool.
Merely activating this option will not deduplicate data already written to the pool.
A pool with a freshly activated deduplication property will look like this example:

[source,shell]
....
# zpool list
NAME  SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG   CAP  DEDUP  HEALTH  ALTROOT
pool 2.84G  2.19M  2.83G        -         -     0%    0%  1.00x  ONLINE  -
....

The `DEDUP` column shows the actual rate of deduplication for the pool.
A value of `1.00x` shows that no data has been deduplicated yet.
The next example copies some system binaries three times into different directories on the deduplicated pool created above.

[source,shell]
....
# for d in dir1 dir2 dir3; do
> mkdir $d && cp -R /usr/bin $d &
> done
....

To observe deduplicating of redundant data, use:

[source,shell]
....
# zpool list
NAME  SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG   CAP  DEDUP  HEALTH  ALTROOT
pool 2.84G  20.9M  2.82G        -         -     0%    0%  3.00x  ONLINE  -
....

The `DEDUP` column shows a factor of `3.00x`.
Detecting and deduplicating copies of the data uses a third of the space.
The potential for space savings can be enormous, but comes at the cost of having enough memory to keep track of the deduplicated blocks.

Deduplication is not beneficial when the data in a pool is not redundant.
ZFS can show potential space savings by simulating deduplication on an existing pool:

[source,shell]
....
# zdb -S pool
Simulated DDT histogram:

bucket              allocated                       referenced
______   ______________________________   ______________________________
refcnt   blocks   LSIZE   PSIZE   DSIZE   blocks   LSIZE   PSIZE   DSIZE
------   ------   -----   -----   -----   ------   -----   -----   -----
     1    2.58M    289G    264G    264G    2.58M    289G    264G    264G
     2     206K   12.6G   10.4G   10.4G     430K   26.4G   21.6G   21.6G
     4    37.6K    692M    276M    276M     170K   3.04G   1.26G   1.26G
     8    2.18K   45.2M   19.4M   19.4M    20.0K    425M    176M    176M
    16      174   2.83M   1.20M   1.20M    3.33K   48.4M   20.4M   20.4M
    32       40   2.17M    222K    222K    1.70K   97.2M   9.91M   9.91M
    64        9     56K   10.5K   10.5K      865   4.96M    948K    948K
   128        2   9.50K      2K      2K      419   2.11M    438K    438K
   256        5   61.5K     12K     12K    1.90K   23.0M   4.47M   4.47M
    1K        2      1K      1K      1K    2.98K   1.49M   1.49M   1.49M
 Total    2.82M    303G    275G    275G    3.20M    319G    287G    287G

dedup = 1.05, compress = 1.11, copies = 1.00, dedup * compress / copies = 1.16
....

After `zdb -S` finishes analyzing the pool, it shows the space reduction ratio that activating deduplication would achieve.
In this case, `1.16` is a poor space saving ratio mainly provided by compression.
With a `dedup` factor of only `1.05`, activating deduplication on this pool would save little space, and is not worth the amount of memory required to enable deduplication.
Using the formula _ratio = dedup * compress / copies_, system administrators can plan the storage allocation, deciding whether the workload will contain enough duplicate blocks to justify the memory requirements.
If the data is reasonably compressible, the space savings may be good.
Good practice is to enable compression first as compression also provides greatly increased performance.
Enable deduplication in cases where savings are considerable and with enough available memory for the crossref:zfs[zfs-term-deduplication,DDT].

[[zfs-zfs-jail]]
=== ZFS and Jails

Use `zfs jail` and the corresponding `jailed` property to delegate a ZFS dataset to a crossref:jails[jails,Jail].
`zfs jail _jailid_` attaches a dataset to the specified jail, and `zfs unjail` detaches it.
To control the dataset from within a jail, set the `jailed` property.
ZFS forbids mounting a jailed dataset on the host because it may have mount points that would compromise the security of the host.

[[zfs-zfs-allow]]
== Delegated Administration

A comprehensive permission delegation system allows unprivileged users to perform ZFS administration functions.
For example, if each user's home directory is a dataset, users need permission to create and destroy snapshots of their home directories.
A user performing backups can get permission to use replication features.
ZFS allows a usage statistics script to run with access to only the space usage data for all users.
Delegating the ability to delegate permissions is also possible.
Permission delegation is possible for each subcommand and most properties.

[[zfs-zfs-allow-create]]
=== Delegating Dataset Creation

`zfs allow _someuser_ create _mydataset_` gives the specified user permission to create child datasets under the selected parent dataset.
A caveat: creating a new dataset involves mounting it.
That requires setting the FreeBSD `vfs.usermount` man:sysctl[8] to `1` to allow non-`root` users to mount a file system.
Another restriction aimed at preventing abuse: non-`root` users must own the mountpoint where the file system is mounted.

[[zfs-zfs-allow-allow]]
=== Delegating Permission Delegation

`zfs allow _someuser_ allow _mydataset_` gives the specified user the ability to assign any permission they have on the target dataset, or its children, to other users.
If a user has the `snapshot` permission and the `allow` permission, that user can then grant the `snapshot` permission to other users.

[[zfs-advanced]]
== Advanced Topics

[[zfs-advanced-tuning]]
=== Tuning

Adjust tunables to make ZFS perform best for different workloads.

* [[zfs-advanced-tuning-arc_max]] `_vfs.zfs.arc.max_` starting with 13.x (`vfs.zfs.arc_max` for 12.x) - Upper limit on the size of the crossref:zfs[zfs-term-arc,ARC].
The default is all RAM but 1 GB, or 5/8 of all RAM, whichever is more.
Use a lower value if the system runs any other daemons or processes that may require memory.
Adjust this value at runtime with man:sysctl[8] and set it in [.filename]#/boot/loader.conf# or [.filename]#/etc/sysctl.conf#.
* [[zfs-advanced-tuning-arc_meta_limit]] `_vfs.zfs.arc.meta_limit_` starting with 13.x (`vfs.zfs.arc_meta_limit` for 12.x) - Limit the amount of the crossref:zfs[zfs-term-arc,ARC] used to store metadata.
The default is one fourth of `vfs.zfs.arc.max`.
Increasing this value will improve performance if the workload involves operations on a large number of files and directories, or frequent metadata operations, at the cost of less file data fitting in the crossref:zfs[zfs-term-arc,ARC].
Adjust this value at runtime with man:sysctl[8] and set it in [.filename]#/boot/loader.conf# or [.filename]#/etc/sysctl.conf#.
* [[zfs-advanced-tuning-arc_min]] `_vfs.zfs.arc.min_` starting with 13.x (`vfs.zfs.arc_min` for 12.x) - Lower limit on the size of the crossref:zfs[zfs-term-arc,ARC].
The default is one half of `vfs.zfs.arc.meta_limit`.
Adjust this value to prevent other applications from forcing out the entire crossref:zfs[zfs-term-arc,ARC].
Adjust this value at runtime with man:sysctl[8] and set it in [.filename]#/boot/loader.conf# or [.filename]#/etc/sysctl.conf#.
* [[zfs-advanced-tuning-vdev-cache-size]] `_vfs.zfs.vdev.cache.size_` - A preallocated amount of memory reserved as a cache for each device in the pool.
The total amount of memory used will be this value multiplied by the number of devices.
Set this value at boot time in [.filename]#/boot/loader.conf#.
* [[zfs-advanced-tuning-min-auto-ashift]] `_vfs.zfs.min_auto_ashift_` - Lowest `ashift` (sector size) used automatically at pool creation time.
The value is the exponent of a power of two.
The default value of `9` represents `2^9 = 512`, a sector size of 512 bytes.
To avoid _write amplification_ and get the best performance, set this value to the largest sector size used by a device in the pool.
+
Many drives have 4 KB sectors.
Using the default `ashift` of `9` with these drives results in write amplification on these devices.
Data contained in a single 4 KB write is instead written in eight 512-byte writes.
ZFS tries to read the native sector size from all devices when creating a pool, but many drives with 4 KB sectors report that their sectors are 512 bytes for compatibility.
Setting `vfs.zfs.min_auto_ashift` to `12` (`2^12 = 4096`) before creating a pool forces ZFS to use 4 KB blocks for best performance on these drives.
+
Forcing 4 KB blocks is also useful on pools with planned disk upgrades.
Future disks are likely to use 4 KB sectors, and `ashift` values cannot change after creating a pool.
+
In some specific cases, the smaller 512-byte block size might be preferable.
When used with 512-byte disks for databases or as storage for virtual machines, less data is transferred during small random reads.
This can provide better performance when using a smaller ZFS record size.
* [[zfs-advanced-tuning-prefetch_disable]] `_vfs.zfs.prefetch.disable_` - Disable prefetch.
A value of `0` enables and `1` disables it.
The default is `0`, unless the system has less than 4 GB of RAM.
Prefetch works by reading larger blocks than requested into the crossref:zfs[zfs-term-arc,ARC] in the hope that the data will be needed soon.
If the workload has a large number of random reads, disabling prefetch may actually improve performance by reducing unnecessary reads.
Adjust this value at any time with man:sysctl[8].
* [[zfs-advanced-tuning-vdev-trim_on_init]] `_vfs.zfs.vdev.trim_on_init_` - Control whether new devices added to the pool have the `TRIM` command run on them.
This ensures the best performance and longevity for SSDs, but takes extra time.
If the device has already been secure erased, disabling this setting will make the addition of the new device faster.
Adjust this value at any time with man:sysctl[8].
* [[zfs-advanced-tuning-vdev-max_pending]] `_vfs.zfs.vdev.max_pending_` - Limit the number of pending I/O requests per device.
A higher value will keep the device command queue full and may give higher throughput.
A lower value will reduce latency.
Adjust this value at any time with man:sysctl[8].
* [[zfs-advanced-tuning-top_maxinflight]] `_vfs.zfs.top_maxinflight_` - Upper limit on the number of outstanding I/Os per top-level crossref:zfs[zfs-term-vdev,vdev].
Limits the depth of the command queue to prevent high latency.
The limit is per top-level vdev, meaning the limit applies to each crossref:zfs[zfs-term-vdev-mirror,mirror], crossref:zfs[zfs-term-vdev-raidz,RAID-Z], or other vdev independently.
Adjust this value at any time with man:sysctl[8].
* [[zfs-advanced-tuning-l2arc_write_max]] `_vfs.zfs.l2arc_write_max_` - Limit the amount of data written to the crossref:zfs[zfs-term-l2arc,L2ARC] per second.
This tunable extends the longevity of SSDs by limiting the amount of data written to the device.
Adjust this value at any time with man:sysctl[8].
* [[zfs-advanced-tuning-l2arc_write_boost]] `_vfs.zfs.l2arc_write_boost_` - Adds the value of this tunable to crossref:zfs[zfs-advanced-tuning-l2arc_write_max,`vfs.zfs.l2arc_write_max`] and increases the write speed to the SSD until the first block is evicted from the crossref:zfs[zfs-term-l2arc,L2ARC].
This "Turbo Warmup Phase" reduces the performance loss from an empty crossref:zfs[zfs-term-l2arc,L2ARC] after a reboot.
Adjust this value at any time with man:sysctl[8].
* [[zfs-advanced-tuning-scrub_delay]] `_vfs.zfs.scrub_delay_` - Number of ticks to delay between each I/O during a crossref:zfs[zfs-term-scrub,`scrub`].
To ensure that a `scrub` does not interfere with the normal operation of the pool, if any other I/O is happening, the `scrub` will delay between each command.
This value controls the limit on the total IOPS (I/Os Per Second) generated by the `scrub`.
The granularity of the setting is determined by the value of `kern.hz`, which defaults to 1000 ticks per second.
Changing this setting results in a different effective IOPS limit.
The default value is `4`, resulting in a limit of: 1000 ticks/sec / 4 = 250 IOPS.
Using a value of _20_ would give a limit of: 1000 ticks/sec / 20 = 50 IOPS.
Recent activity on the pool limits the speed of `scrub`, as determined by crossref:zfs[zfs-advanced-tuning-scan_idle,`vfs.zfs.scan_idle`].
Adjust this value at any time with man:sysctl[8].
* [[zfs-advanced-tuning-resilver_delay]] `_vfs.zfs.resilver_delay_` - Number of ticks of delay inserted between each I/O during a crossref:zfs[zfs-term-resilver,resilver].
To ensure that a resilver does not interfere with the normal operation of the pool, if any other I/O is happening, the resilver will delay between each command.
This value controls the limit on the total IOPS (I/Os Per Second) generated by the resilver.
ZFS determines the granularity of the setting by the value of `kern.hz`, which defaults to 1000 ticks per second.
Changing this setting results in a different effective IOPS limit.
The default value is 2, resulting in a limit of: 1000 ticks/sec / 2 = 500 IOPS.
Returning the pool to an crossref:zfs[zfs-term-online,Online] state may be more important if another device failing could crossref:zfs[zfs-term-faulted,Fault] the pool, causing data loss.
A value of 0 will give the resilver operation the same priority as other operations, speeding the healing process.
Other recent activity on the pool limits the speed of resilver, as determined by crossref:zfs[zfs-advanced-tuning-scan_idle,`vfs.zfs.scan_idle`].
Adjust this value at any time with man:sysctl[8].
* [[zfs-advanced-tuning-scan_idle]] `_vfs.zfs.scan_idle_` - Number of milliseconds since the last operation before considering the pool idle.
ZFS disables the rate limiting for crossref:zfs[zfs-term-scrub,`scrub`] and crossref:zfs[zfs-term-resilver,resilver] when the pool is idle.
Adjust this value at any time with man:sysctl[8].
* [[zfs-advanced-tuning-txg-timeout]] `_vfs.zfs.txg.timeout_` - Upper limit on the number of seconds between crossref:zfs[zfs-term-txg,transaction group]s.
The current transaction group writes to the pool and a fresh transaction group starts if this amount of time has elapsed since the previous transaction group.
A transaction group may trigger earlier if enough data is written.
The default value is 5 seconds.
A larger value may improve read performance by delaying asynchronous writes, but this may cause uneven performance when writing the transaction group.
Adjust this value at any time with man:sysctl[8].

[[zfs-advanced-i386]]
=== ZFS on i386

Some of the features provided by ZFS are memory intensive, and may require tuning for the best efficiency on systems with limited RAM.

==== Memory

The total system memory should be at least one gigabyte.
The amount of recommended RAM depends upon the size of the pool and which features ZFS uses.
A general rule of thumb is 1 GB of RAM for every 1 TB of storage.
If using the deduplication feature, a general rule of thumb is 5 GB of RAM per TB of storage to deduplicate.
While some users use ZFS with less RAM, systems under heavy load may panic due to memory exhaustion.
ZFS may require further tuning for systems with less than the recommended amount of RAM.

==== Kernel Configuration

Due to the address space limitations of the i386(TM) platform, ZFS users on the i386(TM) architecture must add this option to a custom kernel configuration file, rebuild the kernel, and reboot:

[.programlisting]
....
options        KVA_PAGES=512
....

This expands the kernel address space, allowing the `vm.kvm_size` tunable to push beyond the imposed limit of 1 GB, or the limit of 2 GB for PAE.
To find the most suitable value for this option, divide the desired address space in megabytes by four.
In this example, that is `512` for 2 GB.

==== Loader Tunables

Increase the [.filename]#kmem# address space on all FreeBSD architectures.
A test system with 1 GB of physical memory benefitted from adding these options to [.filename]#/boot/loader.conf# and then restarting:

[.programlisting]
....
vm.kmem_size="330M"
vm.kmem_size_max="330M"
vfs.zfs.arc.max="40M"
vfs.zfs.vdev.cache.size="5M"
....

For a more detailed list of recommendations for ZFS-related tuning, see https://wiki.freebsd.org/ZFSTuningGuide[].

[[zfs-links]]
== Further Resources

* https://openzfs.org/[OpenZFS]
* https://wiki.freebsd.org/ZFSTuningGuide[FreeBSD Wiki - ZFS Tuning]
* https://calomel.org/zfs_raid_speed_capacity.html[Calomel Blog - ZFS Raidz Performance, Capacity and Integrity]

[[zfs-term]]
== ZFS Features and Terminology

ZFS is more than a file system and is fundamentally different from one.
ZFS combines the roles of file system and volume manager, allowing new storage devices to be added to a live system and making the new space available at once on the existing file systems in that pool.
By combining the traditionally separate roles, ZFS is able to overcome previous limitations that prevented RAID groups from being able to grow.
A _vdev_ is a top-level device in a pool and can be a simple disk or a RAID transformation such as a mirror or RAID-Z array.
ZFS file systems (called _datasets_) each have access to the combined free space of the entire pool.
Used blocks from the pool decrease the space available to each file system.
This approach avoids the common pitfall with extensive partitioning where free space becomes fragmented across the partitions.

[.informaltable]
[cols="10%,90%"]
|===

|[[zfs-term-pool]]pool
|A storage _pool_ is the most basic building block of ZFS.
A pool consists of one or more vdevs, the underlying devices that store the data.
A pool is then used to create one or more file systems (datasets) or block devices (volumes).
These datasets and volumes share the pool of remaining free space.
Each pool is uniquely identified by a name and a GUID.
The ZFS version number on the pool determines the features available.

|[[zfs-term-vdev]]vdev Types
a|A pool consists of one or more vdevs, each of which is a single disk or, in the case of a RAID transformation, a group of disks.
When using multiple vdevs, ZFS spreads data across the vdevs to increase performance and maximize usable space.
All vdevs must be at least 128 MB in size.

* [[zfs-term-vdev-disk]] _Disk_ - The most basic vdev type is a standard block device.
This can be an entire disk (such as [.filename]#/dev/ada0# or [.filename]#/dev/da0#) or a partition ([.filename]#/dev/ada0p3#).
On FreeBSD, there is no performance penalty for using a partition rather than the entire disk.
This differs from recommendations made by the Solaris documentation.
+
[CAUTION]
====
Using an entire disk as part of a bootable pool is strongly discouraged, as this may render the pool unbootable.
Likewise, you should not use an entire disk as part of a mirror or RAID-Z vdev.
Reliably determining the size of an unpartitioned disk at boot time is impossible and there is no place to put boot code.
====
* [[zfs-term-vdev-file]] _File_ - Regular files may make up ZFS pools, which is useful for testing and experimentation.
Use the full path to the file as the device path in `zpool create`.
* [[zfs-term-vdev-mirror]] _Mirror_ - When creating a mirror, specify the `mirror` keyword followed by the list of member devices for the mirror.
A mirror consists of two or more devices, writing all data to all member devices.
A mirror vdev will hold as much data as its smallest member.
A mirror vdev can withstand the failure of all but one of its members without losing any data.
+
[NOTE]
====
To upgrade a regular single disk vdev to a mirror vdev at any time, use `zpool crossref:zfs[zfs-zpool-attach,attach]`.
====
* [[zfs-term-vdev-raidz]] _RAID-Z_ - ZFS uses RAID-Z, a variation on standard RAID-5 that offers better distribution of parity and eliminates the "RAID-5 write hole" in which the data and parity information become inconsistent after an unexpected restart.
ZFS supports three levels of RAID-Z which provide varying levels of redundancy in exchange for decreasing levels of usable storage.
The levels are named RAID-Z1 through RAID-Z3 after the number of parity devices in the array, which is also the number of disks that can fail before the pool stops being operational.
+
In a RAID-Z1 configuration with four disks, each 1 TB, usable storage is 3 TB and the pool will still be able to operate in degraded mode with one faulted disk.
If another disk goes offline before the faulted disk is replaced and resilvered, all pool data is lost.
+
In a RAID-Z3 configuration with eight disks of 1 TB, the volume will provide 5 TB of usable space and still be able to operate with three faulted disks.
Sun(TM) recommends no more than nine disks in a single vdev.
If more disks make up the configuration, the recommendation is to divide them into separate vdevs and stripe the pool data across them.
+
A configuration of two RAID-Z2 vdevs consisting of 8 disks each would create something like a RAID-60 array.
A RAID-Z group's storage capacity is about the size of the smallest disk multiplied by the number of non-parity disks.
Four 1 TB disks in RAID-Z1 have an effective size of about 3 TB, and an array of eight 1 TB disks in RAID-Z3 will yield 5 TB of usable space.
* [[zfs-term-vdev-spare]] _Spare_ - ZFS has a special pseudo-vdev type for keeping track of available hot spares.
Note that installed hot spares are not deployed automatically; manually configure them to replace the failed device using `zpool replace`.
* [[zfs-term-vdev-log]] _Log_ - ZFS log devices, also known as ZFS Intent Log (crossref:zfs[zfs-term-zil,ZIL]) devices, move the intent log from the regular pool devices to a dedicated device, typically an SSD.
Having a dedicated log device improves the performance of applications with a high volume of synchronous writes like databases.
Mirroring of log devices is possible, but RAID-Z is not supported.
If using multiple log devices, writes will be load-balanced across them.
* [[zfs-term-vdev-cache]] _Cache_ - Adding a cache vdev to a pool will add the storage of the cache to the crossref:zfs[zfs-term-l2arc,L2ARC].
Mirroring cache devices is impossible.
Since a cache device stores only additional copies of existing data, there is no risk of data loss.

|[[zfs-term-txg]] Transaction Group (TXG)
|Transaction groups are the way ZFS groups block changes together and writes them to the pool.
Transaction groups are the atomic unit that ZFS uses to ensure consistency.
ZFS assigns each transaction group a unique 64-bit consecutive identifier.
There can be up to three active transaction groups at a time, one in each of these three states:

* _Open_ - A new transaction group begins in the open state and accepts new writes.
There is always a transaction group in the open state, but the transaction group may refuse new writes if it has reached a limit.
Once the open transaction group has reached a limit, or the crossref:zfs[zfs-advanced-tuning-txg-timeout,`vfs.zfs.txg.timeout`] has been reached, the transaction group advances to the next state.
* _Quiescing_ - A short state that allows any pending operations to finish without blocking the creation of a new open transaction group.
Once all the transactions in the group have completed, the transaction group advances to the final state.
* _Syncing_ - Write all the data in the transaction group to stable storage.
This process will in turn change other data, such as metadata and space maps, that ZFS will also write to stable storage.
The process of syncing involves several passes.
The first and biggest pass writes all the changed data blocks; next comes the metadata, which may take several passes to complete.
Since allocating space for the data blocks generates new metadata, the syncing state cannot finish until a pass completes that does not use any new space.
The syncing state is also where _synctasks_ complete.
Synctasks are administrative operations, such as creating or destroying snapshots and datasets, that modify the uberblock.
Once the sync state completes, the transaction group in the quiescing state advances to the syncing state.

All administrative functions, such as crossref:zfs[zfs-term-snapshot,`snapshot`], are written as part of the transaction group.
ZFS adds a created synctask to the open transaction group, and that group advances as fast as possible to the syncing state to reduce the latency of administrative commands.

|[[zfs-term-arc]]Adaptive Replacement Cache (ARC)
|ZFS uses an Adaptive Replacement Cache (ARC), rather than a more traditional Least Recently Used (LRU) cache.
An LRU cache is a simple list of items in the cache, sorted by how recently each object was used, adding new items to the head of the list.
When the cache is full, evicting items from the tail of the list makes room for more active objects.
An ARC consists of four lists: the Most Recently Used (MRU) and Most Frequently Used (MFU) objects, plus a ghost list for each.
These ghost lists track evicted objects to prevent adding them back to the cache.
This increases the cache hit ratio by avoiding objects that have a history of occasional use.
Another advantage of using both an MRU and MFU is that scanning an entire file system would evict all data from an MRU or LRU cache in favor of this freshly accessed content.
With ZFS, there is also an MFU that tracks the most frequently used objects, and the cache of the most commonly accessed blocks remains.

|[[zfs-term-l2arc]]L2ARC
|L2ARC is the second level of the ZFS caching system.
RAM stores the primary ARC.
Since the amount of available RAM is often limited, ZFS can also use crossref:zfs[zfs-term-vdev-cache,cache vdevs].
Solid State Disks (SSDs) are often used as these cache devices due to their higher speed and lower latency compared to traditional spinning disks.
L2ARC is entirely optional, but having one will increase read speeds for files that are cached on the SSD, because they no longer have to be read from the regular disks.
L2ARC can also speed up crossref:zfs[zfs-term-deduplication,deduplication] because a deduplication table (DDT) that does not fit in RAM but does fit in the L2ARC will be much faster than a DDT that must be read from disk.
Limits on the rate of data added to the cache devices prevent prematurely wearing out SSDs with extra writes.
Until the cache is full (the first block evicted to make room), writes to the L2ARC are limited to the sum of the write limit and the boost limit, and afterwards to the write limit alone.
A pair of man:sysctl[8] values control these rate limits.
crossref:zfs[zfs-advanced-tuning-l2arc_write_max,`vfs.zfs.l2arc_write_max`] controls the number of bytes written to the cache per second, while crossref:zfs[zfs-advanced-tuning-l2arc_write_boost,`vfs.zfs.l2arc_write_boost`] adds to this limit during the "Turbo Warmup Phase" (Write Boost).

|[[zfs-term-zil]]ZIL
|ZIL accelerates synchronous transactions by using storage devices like SSDs that are faster than those used in the main storage pool.
When an application requests a synchronous write (a guarantee that the data is stored to disk rather than merely cached for later writes), writing the data to the faster ZIL storage then later flushing it out to the regular disks greatly reduces latency and improves performance.
Only synchronous workloads like databases will profit from a ZIL.
Regular asynchronous writes such as copying files will not use the ZIL at all.

|[[zfs-term-cow]]Copy-On-Write
|Unlike a traditional file system, ZFS writes new data to a different block rather than overwriting the old data in place.
When this write completes, the metadata updates to point to the new location.
When a shorn write (a system crash or power loss in the middle of writing a file) occurs, the entire original contents of the file are still available and ZFS discards the incomplete write.
This also means that ZFS does not require a man:fsck[8] after an unexpected shutdown.

|[[zfs-term-dataset]]Dataset
|_Dataset_ is the generic term for a ZFS file system, volume, snapshot or clone.
Each dataset has a unique name in the format _poolname/path@snapshot_.
The root of the pool is a dataset as well.
Child datasets have hierarchical names like directories.
For example, _mypool/home_, the home dataset, is a child of _mypool_ and inherits properties from it.
Expand this further by creating _mypool/home/user_.
This grandchild dataset will inherit properties from the parent and grandparent.
Set properties on a child to override the defaults inherited from the parent and grandparent.
Administration of datasets and their children can be crossref:zfs[zfs-zfs-allow,delegated].

|[[zfs-term-filesystem]]File system
|A ZFS dataset is most often used as a file system.
Like most other file systems, a ZFS file system mounts somewhere in the system's directory hierarchy and contains files and directories of its own with permissions, flags, and other metadata.

|[[zfs-term-volume]]Volume
|ZFS can also create volumes, which appear as disk devices.
Volumes have many of the same features as datasets, including copy-on-write, snapshots, clones, and checksumming.
Volumes can be useful for running other file system formats on top of ZFS, such as UFS, for virtualization, or for exporting iSCSI extents.

|[[zfs-term-snapshot]]Snapshot
|The crossref:zfs[zfs-term-cow,copy-on-write] (COW) design of ZFS allows for nearly instantaneous, consistent snapshots with arbitrary names.
After taking a snapshot of a dataset, or a recursive snapshot of a parent dataset that will include all child datasets, new data goes to new blocks, but without reclaiming the old blocks as free space.
The snapshot contains the original file system version, and the live file system contains any changes made since taking the snapshot.
Taking the snapshot itself uses no additional space.
New data written to the live file system uses new blocks to store this data.
The snapshot grows as blocks are no longer used in the live file system, but only in the snapshot.
Mounting these snapshots read-only allows recovering previous file versions.
A crossref:zfs[zfs-zfs-snapshot,rollback] of a live file system to a specific snapshot is possible, undoing any changes that took place after taking the snapshot.
Each block in the pool has a reference counter which keeps track of how many snapshots, clones, datasets, or volumes use that block.
As files and snapshots get deleted, the reference count decreases, and ZFS reclaims the space once nothing references a block any longer.
Marking a snapshot with a crossref:zfs[zfs-zfs-snapshot,hold] makes any attempt to destroy it return an `EBUSY` error.
Each snapshot can have several holds, each with a unique name.
The crossref:zfs[zfs-zfs-snapshot,release] command removes the hold so the snapshot can be deleted.
Snapshots, cloning, and rolling back work on volumes, but independently mounting does not.

|[[zfs-term-clone]]Clone
|Cloning a snapshot is also possible.
A clone is a writable version of a snapshot, allowing the file system to fork as a new dataset.
As with a snapshot, a clone initially consumes no new space.
As new data written to a clone uses new blocks, the size of the clone grows.
When blocks are overwritten in the cloned file system or volume, the reference count on the previous block decreases.
Removing the snapshot upon which a clone is based is impossible because the clone depends on it.
The snapshot is the parent, and the clone is the child.
Clones can be _promoted_, reversing this dependency and making the clone the parent and the previous parent the child.
This operation requires no new space.
Since the amount of space used by the parent and child reverses, it may affect existing quotas and reservations.

|[[zfs-term-checksum]]Checksum
|Every block is also checksummed.
The checksum algorithm used is a per-dataset property, see crossref:zfs[zfs-zfs-set,`set`].
The checksum of each block is transparently validated when read, allowing ZFS to detect silent corruption.
If the data read does not match the expected checksum, ZFS will attempt to recover the data from any available redundancy, like mirrors or RAID-Z.
Trigger a validation of all checksums with crossref:zfs[zfs-term-scrub,`scrub`].
Checksum algorithms include:

* `fletcher2`
* `fletcher4`
* `sha256`

The `fletcher` algorithms are faster, but `sha256` is a strong cryptographic hash and has a much lower chance of collisions at the cost of some performance.
Deactivating checksums is possible, but strongly discouraged.

|[[zfs-term-compression]]Compression
|Each dataset has a compression property, which defaults to off.
Set this property to an available compression algorithm.
This causes compression of all new data written to the dataset.
Beyond a reduction in space used, read and write throughput often increases because fewer blocks need reading or writing.

[[zfs-term-compression-lz4]]
* _LZ4_ - Added in ZFS pool version 5000 (feature flags), LZ4 is now the recommended compression algorithm.
LZ4 works about 50% faster than LZJB when operating on compressible data, and is over three times faster when operating on uncompressible data.
LZ4 also decompresses about 80% faster than LZJB.
On modern CPUs, LZ4 can often compress at over 500 MB/s, and decompress at over 1.5 GB/s (per single CPU core).

[[zfs-term-compression-lzjb]]
* _LZJB_ - The default compression algorithm.
Created by Jeff Bonwick (one of the original creators of ZFS).
LZJB offers good compression with less CPU overhead compared to GZIP.
In the future, the default compression algorithm will change to LZ4.

[[zfs-term-compression-gzip]]
* _GZIP_ - A popular stream compression algorithm available in ZFS.
One of the main advantages of using GZIP is its configurable level of compression.
When setting the `compress` property, the administrator can choose the level of compression, ranging from `gzip1`, the lowest level of compression, to `gzip9`, the highest level of compression.
This gives the administrator control over how much CPU time to trade for saved disk space.

[[zfs-term-compression-zle]]
* _ZLE_ - Zero Length Encoding is a special compression algorithm that only compresses continuous runs of zeros.
This compression algorithm is useful when the dataset contains large blocks of zeros.

|[[zfs-term-copies]]Copies
|When set to a value greater than 1, the `copies` property instructs ZFS to maintain multiple copies of each block in the crossref:zfs[zfs-term-filesystem,file system] or crossref:zfs[zfs-term-volume,volume].
Setting this property on important datasets provides added redundancy from which to recover a block that does not match its checksum.
In pools without redundancy, the copies feature is the only form of redundancy.
The copies feature can recover from a single bad sector or other forms of minor corruption, but it does not protect the pool from the loss of an entire disk.

|[[zfs-term-deduplication]]Deduplication
|Checksums make it possible to detect duplicate blocks when writing data.
With deduplication, the reference count of an existing, identical block increases, saving storage space.
ZFS keeps a deduplication table (DDT) in memory to detect duplicate blocks.
The table contains a list of unique checksums, the location of those blocks, and a reference count.
When writing new data, ZFS calculates checksums and compares them to the list.
When it finds a match, it uses the existing block.
Using the SHA256 checksum algorithm with deduplication provides a secure cryptographic hash.
Deduplication is tunable.
If `dedup` is `on`, then a matching checksum is taken to mean that the data is identical.
When `dedup` is set to `verify`, ZFS performs a byte-for-byte check on the data to ensure that the blocks are actually identical.
If the data is not identical, ZFS will note the hash collision and store the two blocks separately.
As the DDT must store the hash of each unique block, it consumes a large amount of memory.
A general rule of thumb is 5-6 GB of RAM per 1 TB of deduplicated data.
In situations where it is not practical to have enough RAM to keep the entire DDT in memory, performance will suffer greatly as the DDT must be read from disk before writing each new block.
Deduplication can use L2ARC to store the DDT, providing a middle ground between fast system memory and slower disks.
Consider using compression instead, which often provides nearly as much space savings without the increased memory.

|[[zfs-term-scrub]]Scrub
|Instead of a consistency check like man:fsck[8], ZFS has `scrub`.
`scrub` reads all data blocks stored on the pool and verifies their checksums against the known good checksums stored in the metadata.
A periodic check of all the data stored on the pool ensures the recovery of any corrupted blocks before they are needed.
A scrub is not required after an unclean shutdown, but good practice is to run one at least once every three months.
ZFS verifies the checksum of each block during normal use, but a scrub makes certain to check even infrequently used blocks for silent corruption.
This improves data security in archival storage situations.
Adjust the relative priority of `scrub` with crossref:zfs[zfs-advanced-tuning-scrub_delay,`vfs.zfs.scrub_delay`] to prevent the scrub from degrading the performance of other workloads on the pool.

|[[zfs-term-quota]]Dataset Quota
a|ZFS provides fast and accurate dataset, user, and group space accounting as well as quotas and space reservations.
This gives the administrator fine grained control over space allocation and allows reserving space for critical file systems.

ZFS supports different types of quotas: the dataset quota, the crossref:zfs[zfs-term-refquota,reference quota (refquota)], the crossref:zfs[zfs-term-userquota,user quota], and the crossref:zfs[zfs-term-groupquota,group quota].

Quotas limit the total size of a dataset and its descendants, including snapshots of the dataset, child datasets, and the snapshots of those datasets.

[NOTE]
====
Volumes do not support quotas, as the `volsize` property acts as an implicit quota.
====

|[[zfs-term-refquota]]Reference Quota
|A reference quota limits the amount of space a dataset can consume by enforcing a hard limit.
This hard limit includes only the space referenced by the dataset itself and does not include space used by descendants, such as file systems or snapshots.

|[[zfs-term-userquota]]User Quota
|User quotas are useful to limit the amount of space used by the specified user.
|[[zfs-term-groupquota]]Group Quota
|The group quota limits the amount of space that a specified group can consume.
|[[zfs-term-reservation]]Dataset Reservation
|The `reservation` property makes it possible to guarantee an amount of space for a specific dataset and its descendants. This means that setting a 10 GB reservation on [.filename]#storage/home/bob# reserves at least 10 GB of space for this dataset and prevents other datasets from using up all the free space. Unlike a crossref:zfs[zfs-term-refreservation,`refreservation`], space used by snapshots and descendants is counted against the reservation. Reservations of any sort are useful in situations such as planning and testing the suitability of disk space allocation in a new system, or ensuring that enough space is available on file systems for audit logs or system recovery procedures and files.
|[[zfs-term-refreservation]]Reference Reservation
|The `refreservation` property makes it possible to guarantee an amount of space for the use of a specific dataset _excluding_ its descendants. This means that if a 10 GB `refreservation` is set on [.filename]#storage/home/bob# and another dataset tries to use the free space, at least 10 GB of space remains reserved for this dataset. In contrast to a regular crossref:zfs[zfs-term-reservation,reservation], space used by snapshots and descendant datasets is not counted against the reservation. For example, when taking a snapshot of [.filename]#storage/home/bob#, enough disk space other than the `refreservation` amount must exist for the operation to succeed. Descendants of the main dataset are not counted in the `refreservation` amount and so do not encroach on the space set.
|[[zfs-term-resilver]]Resilver
|When replacing a failed disk, ZFS must fill the new disk with the lost data. _Resilvering_ is the process of using the parity information distributed across the remaining drives to calculate and write the missing data to the new drive.
|[[zfs-term-online]]Online
|A pool or vdev in the `Online` state has its member devices connected and fully operational. Individual devices in the `Online` state are functioning.
|[[zfs-term-offline]]Offline
|The administrator puts individual devices in an `Offline` state if enough redundancy exists to avoid putting the pool or vdev into a crossref:zfs[zfs-term-faulted,Faulted] state. An administrator may choose to offline a disk in preparation for replacing it, or to make it easier to identify.
|[[zfs-term-degraded]]Degraded
|A pool or vdev in the `Degraded` state has one or more disks that disappeared or failed. The pool is still usable, but if other devices fail, the pool may become unrecoverable. Reconnecting the missing devices or replacing the failed disks will return the pool to an crossref:zfs[zfs-term-online,Online] state after the reconnected or new device has completed the crossref:zfs[zfs-term-resilver,Resilver] process.
|[[zfs-term-faulted]]Faulted
|A pool or vdev in the `Faulted` state is no longer operational. Accessing the data is no longer possible. A pool or vdev enters the `Faulted` state when the number of missing or failed devices exceeds the level of redundancy in the vdev. If the missing devices can be reconnected, the pool will return to an crossref:zfs[zfs-term-online,Online] state. If there is not enough redundancy to compensate for the number of failed disks, the pool contents are lost and must be restored from backups.
|===
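
The difference between `dedup=on` and `dedup=verify` described in the Deduplication entry can be illustrated with a small model. The following is a minimal sketch in Python, not the ZFS implementation: the class `ToyDedupTable`, its methods, and the pluggable `checksum` argument are hypothetical names introduced only for this illustration. The sketch keeps a table of checksums, stored blocks, and reference counts, and compares block contents only when `verify` is enabled.

```python
import hashlib


def sha256(block: bytes) -> bytes:
    """Default checksum: the SHA-256 digest of the block."""
    return hashlib.sha256(block).digest()


class ToyDedupTable:
    """Illustrative model of a deduplication table (hypothetical, not ZFS code)."""

    def __init__(self, verify=False, checksum=sha256):
        self.verify = verify        # models dedup=verify when True, dedup=on when False
        self.checksum = checksum    # pluggable only so a collision can be demonstrated
        self.table = {}             # checksum -> list of [block, reference count]
        self.collisions = 0

    def write(self, block: bytes) -> bool:
        """Record one block write; return True if an existing block was reused."""
        candidates = self.table.setdefault(self.checksum(block), [])
        for entry in candidates:
            # Without verify, a matching checksum is trusted.
            # With verify, the bytes must also be identical.
            if not self.verify or entry[0] == block:
                entry[1] += 1
                return True
        if candidates:
            # Same checksum but different data: note the collision.
            self.collisions += 1
        candidates.append([block, 1])   # store the block separately
        return False

    def unique_blocks(self) -> int:
        """Number of blocks actually stored."""
        return sum(len(entries) for entries in self.table.values())
```

In this model, writing the same block twice stores it once and raises its reference count to two. Passing a deliberately weak `checksum` function (for example, one that returns only the block length) shows why `verify` matters: without it, two different blocks with the same checksum are wrongly treated as identical, while with it the collision is counted and both blocks are stored.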